LinuxQuestions.org - BASH: string manipulation

rhuhawk

09-20-2011 07:46 PM

BASH: string manipulation

Hi, I am trying to figure out how to isolate certain parts of a string.

In a given file (test.txt) I have some lines:

blah [Need this phrase] bblah
blah ya [Need
this phrase as well]

Is there a way for me to extract only what is in the square brackets? Even if the text goes to the next line?

The extracted phrase should be saved as a variable. Is there any way to do this with the sed command?
Thank you!

corp769

09-20-2011 08:02 PM

Hello,

You can use sed for this. Look at the following:

Code:

sed -e 's/.*\[\([^]]*\)\].*/\1/g'

This would be the command-line equivalent of the operation you are looking for. As an example (I'm not on Linux right now, so this might be off...):

Code:

echo "blah [Need this phrase] bblah" | sed -e 's/.*\[\([^]]*\)\].*/\1/g'

Which should return for you "Need this phrase".
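Since the question also asked for the result to end up in a variable, here is a minimal sketch using command substitution. It assumes a sed that accepts basic-regex capture groups written as \( ... \); the sample line and the variable names (line, phrase) are made up for illustration and are not from the thread.

```shell
# Hypothetical sample line; variable names are illustrative only.
line='blah [Need this phrase] bblah'

# \( ... \) is a basic-regex capture group; [^]]* matches up to the next ']'.
# $( ... ) stores whatever sed prints in the shell variable "phrase".
phrase=$(printf '%s\n' "$line" | sed -e 's/.*\[\([^]]*\)\].*/\1/')

printf '%s\n' "$phrase"
```

This only covers a phrase that starts and ends on the same line.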

Cheers,

Josh

rhuhawk

09-20-2011 08:12 PM

Great!

Thanks for the quick reply!!!

When I run this:

sed -e 's/.*\[\([^]]*\)\].*/\1/g' test.txt

and test.txt is:
blah [Need this phrase] bblah
blah ya [Need
this phrase as well]

It outputs:
Need this phrase
blah ya [Need
this phrase as well]

Is there any way I can get it to keep reading lines? When the desired string begins on one line and continues on the next, how can I get it to keep reading until it finds the closing ]?

Thanks again in advance

grail

09-20-2011 10:57 PM

How about:

Code:

sed -r ':a /]/! N;ta;s/.*\[(.*)\].*/\1/' file

kurumi

09-21-2011 05:37 AM

Code:

$ ruby -0777 -ne '$_.split("]").each{|x| puts "#{x.split("[")[-1]}" if x[/\[/]  }' file

Need this phrase
Need
this phrase as well

grail

09-21-2011 06:30 AM

So still learning from the master <bow> to kurumi :)

Now that I have seen what 0777 can do:

Code:

ruby -0777 -ne 'puts $_.scan(/\[([^\]]+)/)' file

Kenhelm

09-21-2011 08:55 AM

Using GNU awk

Code:

echo '
blah
blah [Need this phrase] bblah
[
Need
this
phrase
]
[Need this phrase] blah ya [Need
this phrase as well] blah [Need this phrase] blah
blah' | awk '/./' RS='[^]]*[[]\n?|\n?][^[]*'



Need this phrase
Need
this
phrase
Need this phrase
Need
this phrase as well
Need this phrase

Or, to have each phrase on a single line

Code:

awk '/./{gsub(/\n/," ");print}' RS='[^]]*[[]\n?|\n?][^[]*'



Need this phrase
Need this phrase
Need this phrase
Need this phrase as well
Need this phrase

crts

09-21-2011 09:17 AM

small correction

Quote:

Originally Posted by grail (Post 4477733)

How about:

Code:

sed -r ':a /]/! N;ta;s/.*\[(.*)\].*/\1/' file

The 't' command will only jump if an 's' command has made a substitution since the last line was read. So a conditional 't' jump directly after reading a new line has no effect.
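A small sketch of that behaviour, using a made-up input of 'aaa' rather than the data from this thread. The variable names (looped, once) are illustrative only. Each expression is passed with its own -e so the label is terminated unambiguously.

```shell
# 't' follows a successful 's', so it jumps back to :a until no 'a' is left.
looped=$(printf 'aaa\n' | sed -e ':a' -e 's/a/b/' -e 'ta')

# 't' comes before any 's' has run in this cycle, so it never jumps
# and the substitution is applied exactly once.
once=$(printf 'aaa\n' | sed -e ':a' -e 'ta' -e 's/a/b/')

printf '%s\n%s\n' "$looped" "$once"
# prints:
# bbb
# baa
```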

This works as long as there are no multiple patterns on the same line to keep:

Code:

sed -r ':a /]/! N;s/.*\[(.*)\].*/\1/;Ta' file

grail

09-21-2011 09:37 AM

Cheers crts ... still getting my sedfu together :) although I noticed with Kenhelm's example this doesn't get all the necessary ones :(

crts

09-21-2011 11:05 AM

Quote:

Originally Posted by grail (Post 4478185)

Cheers crts ... still getting my sedfu together :) although I noticed with Kenhelm's example this doesn't get all the necessary ones :(

Yes, as I stated above

Quote:

This works as long as there are no multiple patterns on the same line to keep:

the solution has some restrictions. To also accommodate Kenhelm's sample data, we could use:

Code:

sed -nr ':a /\[[^]]*$/ {N;ba}; s/[^[]*\[([^]]*)\][^[]*/\1/pg; ' file

As you can see, with the above solution we have to use an unconditional jump.
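For getting each phrase into a shell variable, as the original question asked, here is a rough sketch of a different technique: plain shell parameter expansion instead of sed. It is not one of the approaches posted in this thread. The sample text mirrors test.txt, and the variable names (text, rest, phrase, count) are assumptions made for illustration.

```shell
# Hypothetical sample text mirroring test.txt from the first post.
text='blah [Need this phrase] bblah
blah ya [Need
this phrase as well]'

rest=$text
count=0
while :; do
    case $rest in
        *\[*\]*) ;;        # at least one "[...]" pair remains
        *) break ;;        # nothing left to extract
    esac
    rest=${rest#*\[}       # drop everything through the first '['
    phrase=${rest%%\]*}    # keep the text before the next ']'
    rest=${rest#*\]}       # continue scanning after that ']'
    # Join a phrase that wrapped onto the next line with a space.
    phrase=$(printf '%s' "$phrase" | tr '\n' ' ')
    count=$((count + 1))
    printf 'phrase %d: %s\n' "$count" "$phrase"
done
```

Inside the loop, "phrase" holds the current match, so it can be used directly or appended to a list. For a real file, text=$(cat test.txt) would load the contents first.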
