LinuxQuestions.org
Old 08-07-2010, 04:57 AM   #1
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
Make x number of changes matching pattern with sed


So I tried searching on Google but found it difficult to say exactly what I was looking for.

Task - Capitalise the first letter of the first x words on each line.

eg. Original line - one.two.three.four
Revised line - One.Two.three.four (here only requiring 2 changes)

Test data:
Code:
wire.in.the.blood.s04e01.ws.pdtv.xvid-river.avi
wire.in.the.blood.s04e02.ws.pdtv.xvid-river.avi
wire.in.the.blood.s04e03.ws.pdtv.xvid-river.avi
wire.in.the.blood.s04e04.ws.pdtv.xvid-river.avi
Current code:
Code:
sed -r 's@(\b[a-z])@\u\1@' file
So this will capitalise the first letter of the first word on each line.

Adding a 'g' at the end will capitalise the first letter of every word that starts on a word boundary.

Adding a '4g' will process every match from the fourth one to the end of the line.
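
To make the three variants concrete, here is a small sketch run against one line of the test data. It assumes GNU sed, since \b and \u are GNU extensions. The variable names and the LC_ALL=C setting (used so that [a-z] means exactly the ASCII lowercase letters) are additions for illustration, not part of the original command.

```shell
# Assumes GNU sed; \b and \u are GNU extensions.
line='wire.in.the.blood.s04e01.ws.pdtv.xvid-river.avi'

# No flag: only the first match on the line is changed.
first=$(printf '%s\n' "$line" | LC_ALL=C sed -r 's@(\b[a-z])@\u\1@')
# -> Wire.in.the.blood.s04e01.ws.pdtv.xvid-river.avi

# g: every match on the line is changed.
all=$(printf '%s\n' "$line" | LC_ALL=C sed -r 's@(\b[a-z])@\u\1@g')
# -> Wire.In.The.Blood.S04e01.Ws.Pdtv.Xvid-River.Avi

# 4g: the fourth match and every later match are changed.
from4=$(printf '%s\n' "$line" | LC_ALL=C sed -r 's@(\b[a-z])@\u\1@4g')
# -> wire.in.the.Blood.S04e01.Ws.Pdtv.Xvid-River.Avi

printf '%s\n' "$first" "$all" "$from4"
```

Note that "river" counts as its own match because the hyphen in "xvid-river" is a word boundary.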

My task is to process only the words up to the fifth on each line, giving:
Code:
Wire.In.The.Blood.S04e01.ws.pdtv.xvid-river.avi
I know it will be simple, but I am buggered if I can get it.
 
Old 08-07-2010, 05:23 AM   #2
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

Possible solution:

sed 's/\([a-z][a-z]*\)\.\([a-z][a-z]*\)\.\([a-z][a-z]*\)\.\([a-z][a-z]*\)\./\u\1.\u\2.\u\3.\u\4./' infile

As you can see, this is not dynamic: you need to add or remove \([a-z][a-z]*\)\. and \u\X to make it hit 3 or 5 words.

Anyway, hope this helps.

BTW: I only use a-z; if numbers are present as well, you need to add these.
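
The add/remove step can be scripted. The following is only a sketch: cap_words_static is a hypothetical helper name, it assumes GNU sed (for \u), it adds 0-9 to the character class as suggested in the BTW above, and it only works for 1 to 9 words because sed has no back-reference beyond \9.

```shell
# Hypothetical helper: build the static expression for the first N words.
# Assumes GNU sed (\u) and 1 <= N <= 9 (sed only has \1..\9).
cap_words_static() {
  n=$1
  pat=''
  rep=''
  i=1
  while [ "$i" -le "$n" ]; do
    # One group per word: \([a-z0-9][a-z0-9]*\)\.
    pat="${pat}\\([a-z0-9][a-z0-9]*\\)\\."
    # Matching replacement piece: \u\1. then \u\2. and so on
    rep="${rep}\\u\\${i}."
    i=$((i + 1))
  done
  LC_ALL=C sed "s/${pat}/${rep}/"
}

printf 'one.two.three.four\n' | cap_words_static 2
```

As in the hand-written version, every matched word must be followed by a dot, so asking for as many words as the line contains will not match.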
 
Old 08-07-2010, 06:12 AM   #3
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Original Poster
Rep: Reputation: 3191
Hi druuna

Thanks for the reply. I guess what I am trying to get at is this: '4g' changes everything from the fourth match onwards, so is there something similar in sed's arsenal that changes everything up to the fourth match?
 
Old 08-07-2010, 09:26 AM   #4
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

To my knowledge there is no range option in sed to do this.

If you need a dynamic solution, maybe this will help:
Code:
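#!/usr/bin/perl
use strict ;
use warnings ;

print "File to use : " ;
chomp( my $file = <STDIN> ) ;

print "Amount of words to initial uppercase : " ;
chomp( my $amount = <STDIN> ) ;

open( my $fh, '<', $file ) or die "Cannot open: $file : $!\n" ;
while ( my $line = <$fh> ) {
   # Split on dots; the last word keeps the trailing newline
   my @words = split( /\.+/, $line ) ;
   # Uppercase the first letter of the first $amount words,
   # stopping early on lines that have fewer words
   for ( my $x = 0 ; $x < $amount && $x < @words ; $x++ ) {
      $words[$x] = ucfirst( $words[$x] ) ;
   }
   print join( '.', @words ) ;
}
close( $fh ) ;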
Example run:
Code:
$ cat testfile 
wire.in.the.blood.s04e01.ws.pdtv.xvid-river.avi
wire.in.the.blood.s04e02.ws.pdtv.xvid-river.avi
wire.in.the.blood.s04e03.ws.pdtv.xvid-river.avi
wire.in.the.blood.s04e04.ws.pdtv.xvid-river.avi

$ ./dynamic.to.upper.pl 
File to use : testfile
Amount of words to initial uppercase : 4
Wire.In.The.Blood.s04e01.ws.pdtv.xvid-river.avi
Wire.In.The.Blood.s04e02.ws.pdtv.xvid-river.avi
Wire.In.The.Blood.s04e03.ws.pdtv.xvid-river.avi
Wire.In.The.Blood.s04e04.ws.pdtv.xvid-river.avi

$ ./dynamic.to.upper.pl 
File to use : testfile
Amount of words to initial uppercase : 2
Wire.In.the.blood.s04e01.ws.pdtv.xvid-river.avi
Wire.In.the.blood.s04e02.ws.pdtv.xvid-river.avi
Wire.In.the.blood.s04e03.ws.pdtv.xvid-river.avi
Wire.In.the.blood.s04e04.ws.pdtv.xvid-river.avi
Hope this helps.
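
Since the s command has no "first N matches only" flag, one possible sed-only workaround is to repeat a non-global substitution N times. This is a sketch of a different technique from the ones posted in this thread: cap_upto is a hypothetical helper name, and GNU sed is assumed for \b and \u. Each pass capitalises the first word start that is still lowercase, so words that already begin with a capital are skipped rather than counted.

```shell
# Hypothetical helper: repeat a non-global s command N times.
# Assumes GNU sed (\b and \u are GNU extensions).
cap_upto() {
  n=$1
  script=''
  i=0
  while [ "$i" -lt "$n" ]; do
    # Each copy capitalises the first remaining lowercase word start.
    script="${script}s/\\b[a-z]/\\u&/;"
    i=$((i + 1))
  done
  LC_ALL=C sed -r "$script"
}

printf 'wire.in.the.blood.s04e01.ws.pdtv.xvid-river.avi\n' | cap_upto 4
```

On all-lowercase input such as the test data this gives the same result as counting the first N words.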
 
Old 08-07-2010, 10:21 AM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Original Poster
Rep: Reputation: 3191
Thanks again druuna ... always a pleasure to see an alternative. I ended up sticking a couple of seds together:
Code:
sed -r 's@(\b[a-z])@\u\1@g;s@(\b[a-zA-Z])@\l\1@5g' file
I generally despise repetition, but it works ... I will leave this open a little longer in case a sed guru has another suggestion.
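
For reference, the two-step command can be wrapped so the count is a parameter. This is a sketch only: cap_first_n is a hypothetical name and GNU sed is assumed (\b, \u, \l and the combined number-plus-g flag). The first step capitalises every word start, and the second step lowercases again from match n+1 onwards, which is why the original uses 5g to keep four capitals.

```shell
# Hypothetical wrapper around the two-step command above.
# Assumes GNU sed (\b, \u, \l and the combined Ng flag).
cap_first_n() {
  n=$1
  # Step 1: capitalise every word start.
  # Step 2: lowercase again from match number n+1 onwards.
  LC_ALL=C sed -r "s@(\\b[a-z])@\\u\\1@g;s@(\\b[a-zA-Z])@\\l\\1@$((n + 1))g"
}

printf 'wire.in.the.blood.s04e01.ws.pdtv.xvid-river.avi\n' | cap_first_n 4
```

One side effect to be aware of: any word after the first n that was already capitalised in the input is lowercased too, so 'one.two.Three' with n=1 comes out as 'One.two.three'. Getting the five capitals shown in post #1 would need n=5, that is 6g.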
 
Old 08-07-2010, 10:27 AM   #6
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

To be honest: I like your solution (post #5) better than the perl one _and_ the sed one I posted.

But it is always nice to play with perl and have an alternative.

Last edited by druuna; 08-07-2010 at 01:08 PM. Reason: Fixed typo.
 
Old 08-08-2010, 06:50 AM   #7
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
As always, if you have structured data with fields and field delimiters, use awk.

Code:
# awk -F"." '{for(i=1;i<=5;i++) $i=toupper( substr($i,1,1)  ) substr($i,2) }1' OFS="."  file
Wire.In.The.Blood.S04e01.ws.pdtv.xvid-river.avi
Wire.In.The.Blood.S04e02.ws.pdtv.xvid-river.avi
Wire.In.The.Blood.S04e03.ws.pdtv.xvid-river.avi
Wire.In.The.Blood.S04e04.ws.pdtv.xvid-river.avi
No need for messy regex using sed.
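
The awk one-liner can also take the count as a variable. A sketch, where cap_fields is a hypothetical wrapper name and n is passed in with -v. The loop is capped at NF because assigning to a field beyond NF creates empty fields, which would append extra dots to lines with fewer than n words.

```shell
# Hypothetical wrapper: $1 is the number of leading fields to capitalise.
cap_fields() {
  awk -F'.' -v OFS='.' -v n="$1" '{
    # Stop at NF so short lines do not gain empty trailing fields.
    max = (n < NF) ? n : NF
    for (i = 1; i <= max; i++)
      $i = toupper(substr($i, 1, 1)) substr($i, 2)
  } 1'
}

printf 'wire.in.the.blood.s04e01.ws.pdtv.xvid-river.avi\n' | cap_fields 5
```

Unlike the \b-based sed commands, this treats "xvid-river" as a single field, because only the dot is a delimiter.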
 
Old 08-08-2010, 06:58 AM   #8
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Original Poster
Rep: Reputation: 3191
Hi ghost

I agree and did originally use awk, but my original regex is not complicated; in fact, it is kinda neat.
My main query was whether there is a sedism I am missing that would allow you to make up to a number of changes, as opposed to from a number onwards, which is allowed using '4g' at the end.

Thanks for your valued input as always.
 
  

