LinuxQuestions.org
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
10-20-2010, 10:37 AM   #1
Vryali
LQ Newbie
 
Registered: Oct 2010
Location: Columbia, SC
Distribution: Arch Linux
Posts: 11

Rep: Reputation: Disabled
SED assistance: add character in the middle of a string.


I resolved the problem after trying in vain to make sed work as I wanted it to, but my problem was this:

I have a folder of MP3s, where some have <artist>- <song> and others have <artist> - <song>. I wanted to make all of them say <artist> - <song>

The command that worked for me was:
for i in *[a-Z]-*; do mv "$i" "`echo $i | sed 's/-/ -/g'`"; done

but if I -hadn't- used a shell glob to filter for the names that needed fixing, how would I have done this with sed alone? I got close with:

ls | sed -e 's/\(.*\)[a-Z]-\(.*\)/\1 - \2/'

but with the above I lose the [a-Z] character that I used to match with (and I need to keep that). My basic question is: what did I need to change in the above command to make it work as expected within sed? (Again, I've since solved it; I'm just trying to understand sed better.)

Thanks in advance for your time and responses =)
 
10-20-2010, 11:34 AM   #2
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

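[a-Z] is not a legal range, although it is sometimes accepted, depending on the locale's collation order. Sed may reject it outright (in the C locale, for example, 'a' sorts after 'Z', so the range runs backwards).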
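A minimal sketch of a locale-independent alternative, not taken from the posts: the POSIX class [[:alpha:]] stands in for [a-Z]. The function name fix_letter_dash is made up for illustration.

```shell
# Hypothetical helper: insert the missing space when a letter is
# directly followed by "- ". [[:alpha:]] replaces the [a-Z] range.
fix_letter_dash() {
    sed 's/\([[:alpha:]]\)- /\1 - /'
}

printf '%s\n' 'Artist- Song.mp3' | fix_letter_dash
# prints: Artist - Song.mp3
```

Because the letter sits inside \( \), it is put back by \1 instead of being dropped.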

You do mention that you need/want to keep the [a-Z] part, which would limit your options considerably!!

Why not something like this: ls | sed 's/[[:blank:]]*-[[:blank:]]*/ - /g'

This looks for a dash (-) that has zero or more blanks before and after it and changes this into a space, a dash, and a space (globally). I intentionally used [:blank:] to include all blanks (spaces and tabs) and not just a space.
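To see what that expression does, here is a small sketch run on made-up sample names (the names and the function name normalize_dashes are only for illustration):

```shell
# Wrap the expression in a function so it can be reused on any input.
normalize_dashes() {
    sed 's/[[:blank:]]*-[[:blank:]]*/ - /g'
}

printf '%s\n' 'Artist- Song.mp3' 'Artist - Song.mp3' 'Artist -Song.mp3' | normalize_dashes
# each of the three lines comes out as: Artist - Song.mp3

printf '%s\n' 'Artist- Hyphenated-Song.mp3' | normalize_dashes
# prints: Artist - Hyphenated - Song.mp3
```

The last case shows the trade-off: with the g flag, every hyphen in the name is padded, including ones inside a hyphenated title.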

Hope this helps.
 
10-20-2010, 11:36 AM   #3
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
You simply need to make sure that every character you want to save is inside back-referenced parentheses. In this case, just expand the first set to include the a-Z reference.

It's also easier when dealing with regex to use sed's -r option (extended regular expressions). That way you don't have to backslash-escape the grouping parentheses.
Code:
ls | sed -r -e 's/(.*[[:alnum:]])- (.*)/\1 - \2/'
Notice that I added a space after the hyphen too. Without it, a filename like artist- hyphenated-song would break at the wrong place, because .* is greedy and runs all the way to the last hyphen.
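A rough side-by-side sketch of the three cases, using made-up sample names. It assumes a sed that accepts -E, which GNU sed treats the same as -r:

```shell
name='Artist- Song.mp3'

# Letter outside the group: it is matched and consumed, but never put back.
lost=$(printf '%s\n' "$name" | sed -E 's/(.*)[[:alnum:]]- (.*)/\1 - \2/')

# Letter inside the group: it is returned as part of \1.
kept=$(printf '%s\n' "$name" | sed -E 's/(.*[[:alnum:]])- (.*)/\1 - \2/')

# No space required after the hyphen: the greedy .* runs to the LAST hyphen.
greedy=$(printf '%s\n' 'artist- hyphenated-song' | sed -E 's/(.*[[:alnum:]])-(.*)/\1 - \2/')

printf '%s\n' "$lost" "$kept" "$greedy"
# prints:
# Artis - Song.mp3
# Artist - Song.mp3
# artist- hyphenated - song
```

The first line reproduces the lost character from the original question, the second is the fix, and the third is the wrong split that the extra space guards against.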

Something like this would be even better. It can be run on both correct and incorrect names, as long as there's a space after the hyphen in the middle.
Code:
ls | sed -r -e 's/([^- ]+)- (.*)/\1 - \2/'
Speaking of which, in your first expression, using the 'g' flag in sed would also affect hyphenated names.
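To tie this back to the original loop, here is a minimal sketch that applies the expression to a shell glob instead of ls output. It works in a throwaway directory from mktemp with made-up file names, and it assumes a sed that accepts -E (GNU sed treats it the same as -r):

```shell
# Work in a temporary directory so nothing real gets renamed.
demo_dir=$(mktemp -d)
touch "$demo_dir/Artist- Song.mp3" \
      "$demo_dir/Other - Tune.mp3" \
      "$demo_dir/Band- Hyphenated-Song.mp3"

for f in "$demo_dir"/*.mp3; do
    base=${f##*/}    # strip the directory part
    new=$(printf '%s\n' "$base" | sed -E 's/([^- ]+)- (.*)/\1 - \2/')
    # Rename only when sed actually changed the name.
    if [ "$new" != "$base" ]; then
        mv -- "$f" "$demo_dir/$new"
    fi
done

listing=$(ls "$demo_dir")
printf '%s\n' "$listing"
rm -rf "$demo_dir"
```

After the loop, the listing should contain Artist - Song.mp3, Band - Hyphenated-Song.mp3 and Other - Tune.mp3: the name that was already correct is untouched, and without the g flag the hyphen inside the title is left alone.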

Finally, instead of using sed, ls, and/or loops, I recommend perl rename, a convenient renaming script included in some distros' perl packages. It uses the same general sed/perl substitution syntax as above and works with standard shell file globbing. A stand-alone version is available here:

http://tips.webdesign10.com/files/rename.pl.txt

Last edited by David the H.; 10-20-2010 at 11:39 AM. Reason: changed range to :alnum: based on Druuna's comment
 
10-21-2010, 07:48 AM   #4
Vryali
LQ Newbie
 
Registered: Oct 2010
Location: Columbia, SC
Distribution: Arch Linux
Posts: 11

Original Poster
Rep: Reputation: Disabled
Thanks to both of you for your responses. I didn't realize I could use blank like that and wasn't aware of the -r flag, which will certainly make things easier going forward.

I'll check out the perl mention, but I think the closest answer to the method I was using is:

Code:
ls | sed -r -e 's/([^- ]+)- (.*)/\1 - \2/'
I've never seen/used the plus operator before, but a google said:
Quote:
The plus operator will match the preceding pattern 1 or more times.
Is the + actually needed? I ran it with and without and the results seemed to be the same. (Also, the + seems to be functionally the same as using /g; is that a correct correlation?) At a glance, it looks like the ^ means the expression will only match once regardless of the + (making it unnecessary)?

Thanks again, I'll add some reputation as soon as I figure out how to do it XD
 
10-21-2010, 10:51 AM   #5
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Good catch. No, you don't really need the plus sign there. I put it in out of habit because that pattern is often used to stop a regex pattern from being greedy.

When ^ is at the first position inside brackets it negates the character range, so [^- ] means to match anything that's not a hyphen or a space.

For that matter, you'd probably really only need to negate the space here ([^ ]). It really depends on how careful you need to be in weeding out false matches. You could even use a really simplified version like this if there's no chance of there being multiple not-space+hyphen+space combinations.
Code:
ls | sed -r 's/([^ ])- /\1 - /'
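A quick sketch of how the two negated sets differ, on made-up names. The function names simple and strict are only labels for this example, and -E is assumed to be available (GNU sed treats it the same as -r):

```shell
simple() { sed -E 's/([^ ])- /\1 - /'; }    # negate the space only
strict() { sed -E 's/([^- ])- /\1 - /'; }   # negate hyphen and space

printf '%s\n' 'Artist- Song.mp3'  | simple   # prints: Artist - Song.mp3
printf '%s\n' 'Artist - Song.mp3' | simple   # prints: Artist - Song.mp3 (unchanged)
printf '%s\n' 'Artist-- Song.mp3' | simple   # prints: Artist- - Song.mp3
printf '%s\n' 'Artist-- Song.mp3' | strict   # prints: Artist-- Song.mp3 (unchanged)
```

The double-hyphen name is the kind of false match the stricter set weeds out: with [^ ] the first hyphen counts as the "not a space" character, while [^- ] refuses it.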
Don't get confused between * and + and the sed "g" flag. The first two are part of the regular expression: * matches zero or more, and + one or more, of the preceding character or group. The "g", on the other hand, is the "global" flag of sed's s command, which means that the substitution is applied at every place the regex matches in the input line, instead of just the first instance.
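A small sketch of the difference on toy strings (the inputs are made up; -E is assumed to be available, and GNU sed treats it the same as -r):

```shell
# Without g: only the first match on the line is replaced.
first_only=$(printf '%s\n' 'a-b-c' | sed 's/-/ - /')

# With g: every match on the line is replaced.
every=$(printf '%s\n' 'a-b-c' | sed 's/-/ - /g')

# +: a run of one or more hyphens is ONE match, so no g is needed for it.
plus=$(printf '%s\n' 'a---b' | sed -E 's/-+/=/')

# *: zero hyphens is also a match, so it succeeds at the very start of the line.
star=$(printf '%s\n' 'ab' | sed -E 's/-*/=/')

printf '%s\n' "$first_only" "$every" "$plus" "$star"
# prints:
# a - b-c
# a - b - c
# a=b
# =ab
```

So + and * control how much a single match covers, while g controls how many separate matches get replaced.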

Last edited by David the H.; 10-21-2010 at 10:54 AM. Reason: changed emphasis for clarity
 
All times are GMT -5.