Python Regex Question
If I have a line of text with only 2 fields that I want to extract, how can I essentially ignore what is between the 2 fields, and only extract what I want?
E.g., in the line:
field1: ignore, Version: 150, field1: ignore, field 2, ignore, Thread: www.google.com
(the Thread: value can be any hostname or IP address)
All I want to extract is "Version: 150" and "Thread: www.google.com". I've been trying to find a way to do it with re.findall (and would prefer to do it that way if possible) but haven't been able to get it working.
Edit: I should mention that the values after Version: and Thread: (150 and 20 in the original example) are variable, in case that wasn't obvious...
Any help is appreciated. Thanks!
Hi.
How about:
Code:
import re
s = 'Version: 150, field1: ignore, field 2, ignore, Thread: 20'
re.findall(r'(Version|Thread): (\d*)', s)  #=> [('Version', '150'), ('Thread', '20')]
Quote:
Sorry, I forgot to mention that there is text in front of the first field I want to extract. I edited the original post to show what the line should look like.
Code:
s = 'field1: ignore, Version: 150, field1: ignore, field 2, ignore, Thread: www.google.com'
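With a hostname after Thread:, the \d* in the earlier pattern no longer fits. One possible way around that, as a rough sketch: capture everything up to the next comma or whitespace instead of digits only. The character class [^\s,]+ is my assumption about what ends a value; adjust it if your real values can contain commas or spaces. The names pairs and fields are just for illustration.

```python
import re

s = 'field1: ignore, Version: 150, field1: ignore, field 2, ignore, Thread: www.google.com'

# Match either label, then capture the value up to the next comma or whitespace.
# Text before, between, and after the two fields is simply skipped by findall.
pairs = re.findall(r'(Version|Thread): ([^\s,]+)', s)
print(pairs)  # [('Version', '150'), ('Thread', 'www.google.com')]

# Rebuild the "label: value" strings asked for in the question.
fields = [f'{label}: {value}' for label, value in pairs]
print(fields)  # ['Version: 150', 'Thread: www.google.com']
```

re.findall only returns the parts that match, so nothing special is needed to ignore the text around the two fields.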
Any tips on learning regular expressions? Websites, tutorials.. etc..?
I'm sure you can find lots of regex-related tutorials online. Creating a particular regular expression often requires some trial and error, so I use little sed one-liners to see what a pattern matches, for example:
Code:
$ echo 'field1: ignore, Version: 150, field1: ignore, field 2, ignore, Thread: www.google.com' | sed -r 's/(Version|Thread): [^ ,]*/[&]/g'
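The same kind of trial can be done without leaving Python, if that is more convenient. This is only a sketch of the idea: re.sub wraps every match in brackets so you can see exactly what the pattern picked up. The variable names line and marked are made up for the example; \g<0> in the replacement stands for the whole match, much like & in sed.

```python
import re

line = 'field1: ignore, Version: 150, field1: ignore, field 2, ignore, Thread: www.google.com'

# Wrap each match in [ ] to make it visible; \g<0> is the entire matched text.
marked = re.sub(r'(Version|Thread): [^ ,]*', r'[\g<0>]', line)
print(marked)
# field1: ignore, [Version: 150], field1: ignore, field 2, ignore, [Thread: www.google.com]
```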