LinuxQuestions.org
Old 09-19-2014, 01:30 PM   #16
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 11,606

Rep: Reputation: 3494

Are you interested?
First the shell evaluates the command you entered. It finds the following parts:
Code:
sed               a string
-rn               second string
's/.*foo *\( *['  3rd
\'                4
'"]([^'           5
\'                6
'"]+)['           7
\'                8
'"].*/\1/p;'      9
As you can see, the parts of the sed script are protected by single quotes: 'here is the stuff'. Unfortunately a ' cannot itself appear inside a single-quoted string, so the sed script is split into parts and a \' is inserted wherever a quote is needed. Each \' is evaluated by the shell and becomes a single literal '. Parts 3 to 9 are adjacent, so the shell joins them into one argument.
So the result is the following (you can check it by putting an echo in front of the command: echo sed -rn 's/......):
sed -rn s/.*foo *\( *['"]([^'"]+)['"].*/\1/p;
This will be executed.
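The joining of adjacent quoted pieces can be checked without running sed at all. This is only a small sketch; the variable names word and script are made up for the illustration and are not part of the original command.

```shell
# 'it' + \' + 's' are three adjacent pieces; the shell joins them into one word.
word='it'\''s'
printf '%s\n' "$word"      # prints: it's

# The sed script from above, stored in a variable so it can be inspected.
script='s/.*foo *\( *['\''"]([^'\''"]+)['\''"].*/\1/p;'
printf '%s\n' "$script"    # prints: s/.*foo *\( *['"]([^'"]+)['"].*/\1/p;
```

The second printf shows exactly the text that sed receives as its script argument.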

Code:
sed   this is the command
-r, --regexp-extended     use extended regular expressions in the script. 
-n, --quiet, --silent     suppress automatic printing of pattern space 
s      substitute
/      substitute command delimiter
.*     whatever
foo    the string foo
 *     a space followed by *: any number of spaces (including none)
\(     a literal ( char; the \ protects it, so it is matched as is and not treated as a group
 *     any number of spaces again
['"]   means either " or '
(      beginning of a group (that's why it was escaped above)
[^'"]+ means anything but " or ', at least one char
)      end of group
['"]   the closing quote, again either " or '
.*     whatever
/      substitute command delimiter, next comes the replacement string
\1     means take the first matching group
/      delimiter again
p      print the result, but only if a substitution was made (with -n, lines that do not match print nothing)
;      that is the sed command delimiter
I hope this helps a bit
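To see the pieces working together, the command can be fed a few sample lines. This is a sketch that assumes GNU sed (for -r); the function name extract is made up for the example, and the input lines are borrowed from the test file used elsewhere in this thread plus one line that does not match.

```shell
# Wrap the command from this post in a function that reads stdin.
extract() {
  sed -rn 's/.*foo *\( *['\''"]([^'\''"]+)['\''"].*/\1/p;'
}

# Three matching lines and one non-matching line.
printf '%s\n' "foo ('apple')" 'foo( "plum" )' "some( foo('text'), 'other')" 'no match here' | extract
# prints:
# apple
# plum
# text
```

The last input line produces no output: -n turns off automatic printing and the p flag only fires when the substitution succeeds.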
 
2 members found this post helpful.
Old 09-19-2014, 01:50 PM   #17
metaschima
Senior Member
 
Registered: Dec 2013
Distribution: Slackware
Posts: 1,982

Rep: Reputation: 491
Quote:
Originally Posted by pan64 View Post
@metaschima #12 and @danielbmartin #9: the line some( foo('text'), 'other') is not handled properly.
Code:
sed -rn 's/.*foo *\( *['\''"]([^'\''"]+)['\''"].*/\1/p;'
works properly even with foo( 'banana',"" )
Yeah, I forgot about that one. Here is a possible fix:
Code:
bash-4.2$ cat tt
foo ('apple')
foo( 'banana' )
foo ('cherry' )
foo ("date")
foo ("peach" )
foo( "plum" )
some( foo('text'), 'other')
foo( 'banana',"" )
bash-4.2$ tr -d \ \'\"\, < tt | awk '{ str=substr($0,index($0,"foo(")+4); print substr(str,1,index(str,")")-1) }' 
apple
banana
cherry
date
peach
plum
text
banana
 
1 members found this post helpful.
Old 09-19-2014, 01:53 PM   #18
firstfire
Member
 
Registered: Mar 2006
Location: Ekaterinburg, Russia
Distribution: Debian, Ubuntu
Posts: 709

Rep: Reputation: 427
Quote:
Originally Posted by a4z View Post
(for me this line is like magic :-)
I'll try to fix that.

The -r argument tells sed to use extended regular expressions, which reduces the need to escape special characters such as parentheses.

The -n argument tells sed not to print anything to stdout unless explicitly requested.

The sed script consists of a single substitute command of the form s/<pattern>/<replacement>/p.
This command replaces the matched substring and prints (note the p at the end) the resulting line to stdout.
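The effect of -r on escaping can be seen by writing the same substitution both ways. This is a sketch assuming GNU sed; the function names ere and bre are made up for the comparison, and the BRE form spells "one or more" as a character class followed by the same class with *, since + is not part of basic syntax.

```shell
# ERE (-r): bare ( ) form a group, \( is a literal parenthesis.
ere() {
  sed -rn 's/.*foo *\( *['\''"]([^'\''"]+)['\''"].*/\1/p'
}

# BRE (no -r): the roles are swapped, \( \) form a group and ( is literal.
bre() {
  sed -n 's/.*foo *( *['\''"]\([^'\''"][^'\''"]*\)['\''"].*/\1/p'
}

printf '%s\n' "foo( 'banana' )" | ere   # prints: banana
printf '%s\n' "foo( 'banana' )" | bre   # prints: banana
```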

The pattern is basically what you wrote in post #4, but expressed in extended regular expression notation (with some additional clutter from shell quoting). The main parts are:

.* -- match zero or more (*) arbitrary characters (dot).

foo -- match the string `foo' as is.

<space>* -- match any number of spaces (0 or more). It is actually better written as [[:space:]]*, which also matches tabs and other whitespace.

\( -- match an opening parenthesis. The backslash is required because we are in ERE (-r) mode, in which parentheses are special symbols.

['\''"] -- matches either a single (') or a double (") quote. In an ideal world it would be written as ['"], but because the whole command is written in single quotes we must escape the single quote inside the brackets (the double quote is fine). This is done by ending the single-quoted string ('), appending an escaped single quote (\') and starting a new single-quoted string ('), like this: '..part1..'\''..part2..'.
For more on this peculiarity see here.

([^'\''"]+) -- match one or more (+) non-quote characters (negation is expressed by ^) and assign this substring to group 1 (as this is the first pair of unescaped parentheses in the pattern), which is referenced as \1 in the replacement string.
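The [[:space:]]* variant mentioned in the list above could look like the sketch below. It assumes GNU sed, and the function name extract_ws is made up for the example. The input uses tabs around the parenthesis, which the plain <space>* version would not match.

```shell
# Same command, with each "<space>*" replaced by [[:space:]]*.
extract_ws() {
  sed -rn 's/.*foo[[:space:]]*\([[:space:]]*['\''"]([^'\''"]+)['\''"].*/\1/p'
}

# printf turns \t into real tab characters.
printf 'foo\t(\t"date" )\n' | extract_ws   # prints: date
```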

EDIT: Oops, I'm late again.

Last edited by firstfire; 09-19-2014 at 01:56 PM.
 
1 members found this post helpful.
Old 09-20-2014, 01:38 AM   #19
a4z
Senior Member
 
Registered: Feb 2009
Posts: 1,727

Original Poster
Rep: Reputation: 740
Guys, thanks a lot for the solutions, and especially for the explanations!
 
Old 09-20-2014, 02:56 AM   #20
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,643

Rep: Reputation: 2960
I think there was a little extra we could trim off:
Code:
sed -rn "s/.*foo[^'\"]+['\"]([^'\"]+).*/\1/p" file
Just a slight change to the quotes initially used and a little trimming of what we are and are not looking for.
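What the shell makes of the double-quoted version can be inspected the same way as before. This is a sketch assuming GNU sed; the variable name script is made up for the illustration. Inside double quotes \" becomes a plain ", while \1 is passed through unchanged because a backslash before 1 has no special meaning to the shell.

```shell
# The double-quoted script, stored so it can be printed.
script="s/.*foo[^'\"]+['\"]([^'\"]+).*/\1/p"
printf '%s\n' "$script"    # prints: s/.*foo[^'"]+['"]([^'"]+).*/\1/p

# One of the test lines from earlier in the thread.
printf '%s\n' "foo( 'banana',\"\" )" | sed -rn "$script"   # prints: banana
```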
 
1 members found this post helpful.
Old 09-20-2014, 03:34 AM   #21
firstfire
Member
 
Registered: Mar 2006
Location: Ekaterinburg, Russia
Distribution: Debian, Ubuntu
Posts: 709

Rep: Reputation: 427
@grail, cool! I did not notice that "\"" works. The reworked RE looks much better too!
 
Old 09-20-2014, 02:40 PM   #22
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 11,606

Rep: Reputation: 3494
Quote:
Originally Posted by grail View Post
I think we had a little extra we could trim off it:
Code:
sed -rn "s/.*foo[^'\"]+['\"]([^'\"]+).*/\1/p" file
(like). Just remember that this is a special case where " works to protect the string; usually we need '.
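One reason ' is usually the safer choice: inside double quotes the shell still expands things like $variables before sed ever sees the script. A small sketch, with a made-up variable name and a made-up sed script purely for illustration:

```shell
name=world

# Single quotes: the text reaches sed untouched.
single='s/$name/x/'
# Double quotes: the shell substitutes $name first.
double="s/$name/x/"

printf '%s\n' "$single" "$double"
# prints:
# s/$name/x/
# s/world/x/
```

The script in the quoted post contains no $ or backticks, which is why the double quotes are harmless there.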
 
  


All times are GMT -5. The time now is 03:51 PM.
