All I want to extract is "Version: 150" and "Thread: www.google.com". I've been trying to find a way to do it with re.findall (and would prefer to do it that way if possible) but haven't been able to get it working.
Edit: I should mention that 150 and 20 are variable here, in case that wasn't obvious.
Any help is appreciated.
Last edited by mwwynne; 12-10-2012 at 10:55 PM.
Reason: Correct input line
I'm sure you can find lots of regex-related tutorials online. Creating a particular regular expression often requires some trial and error, so I use little sed one-liners, for example:
$ echo 'field1: ignore, Version: 150, field1: ignore, field 2, ignore, Thread: www.google.com' | sed -r 's/(Version|Thread): [^ ,]*/[&]/g'
field1: ignore, [Version: 150], field1: ignore, field 2, ignore, [Thread: www.google.com]
Or I use ipython when I need a Python solution. This way I can try different ideas and approaches very quickly. It is also very instructive to read the manual and info pages, which are probably already installed on your system: man sed, info sed, man awk, info gawk, man grep, man perlre, man perlretut (from the perl-doc package on Ubuntu), etc. Of course, they each cover a different tool or language, but the regular expression syntax is almost the same.
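Since the question asks for re.findall specifically, here is a minimal sketch of how the same pattern from the sed one-liner might be carried over to Python. The variable names (line, pattern, matches) are only illustrative, and the sample input is the one from the sed example above; it assumes each value ends at the next space or comma, so adjust the character class if your real data differs.

```python
import re

# Sample input copied from the sed example; the real input line may differ.
line = ("field1: ignore, Version: 150, field1: ignore, "
        "field 2, ignore, Thread: www.google.com")

# (?:...) is a non-capturing group, so findall returns the whole match
# ("Version: 150") rather than just the captured keyword ("Version").
# [^ ,]* consumes the value up to the next space or comma.
pattern = re.compile(r"(?:Version|Thread): [^ ,]*")

matches = pattern.findall(line)
print(matches)  # ['Version: 150', 'Thread: www.google.com']
```

Note the non-capturing group: with a plain (Version|Thread) group, re.findall would return only the group contents, which is a common reason a pattern that works in sed seems to fail in Python.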