LinuxQuestions.org
05-23-2009, 07:08 PM   #1
donnied
Member
 
Registered: Oct 2006
Distribution: Debian x64
Posts: 198

Rep: Reputation: 30
match text over multiple lines Python


I'm trying to find two separate bits of data. I have a list with multiple entries:
Name: Bob
ID: 123 (I want this)
...
W:
1
2
2
3 (I want this)

I want to save the ID and then get whatever appears on the 8th line after "W:". I'm not sure how to match text over multiple lines. I was thinking along the lines of:
Code:
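import re

def student_id(line):
    # return the first 6-digit number on the line, or None if there is none
    match = re.search(r'[0-9]{6}', line)
    if match:
        return match.group()

def clp(line):
    # return the whole line if it contains "WR"; this only ever sees one line
    if re.search(r'WR.*', line):
        return re.search(r'.*', line).group()

for line in open('workfile'):
    print(student_id(line))
    print(clp(line))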
 
05-23-2009, 07:39 PM   #2
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
If you want to use the re module, compile your pattern with re.M|re.DOTALL to match across multiple lines. However, looking at your case, there is really no need to use a regex.
One way:
Code:
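f = open("file")
for line in f:
    if "ID" in line:
        # the ID is the last whitespace-separated field on the line
        student_id = line.split()[-1]
    if "W:" in line:
        # advance the file iterator 8 lines past the "W:" line
        for i in range(8):
            line = next(f)
        print("Eighth line after W: is " + line.strip())
f.close()
The second way is to use indexing: if your file is not too big, read everything into memory.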
Code:
data = open("file").read().split("\n")
for n, line in enumerate(data):
    if "ID" in line:
        student_id = line.split()[-1]  # get your id
    if "W:" in line:
        # the wanted value sits 8 lines below the "W:" line
        print(data[n + 8])
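
Both approaches above assume there really are 8 more lines after the marker; if "W:" sits too close to the end of the file, indexing raises IndexError (and the iterator version raises StopIteration). A minimal sketch of the indexing approach with that guard added, assuming the data is already split into a list of lines. The function name `line_after` and its parameters are hypothetical, chosen only for illustration:

```python
def line_after(lines, marker, offset=8):
    """Return the line `offset` lines after the first line containing `marker`.

    Returns None if the marker is missing or too close to the end of the data.
    """
    for n, line in enumerate(lines):
        if marker in line:
            target = n + offset
            # guard: do not index past the end of the list
            return lines[target].strip() if target < len(lines) else None
    return None


# hypothetical sample shaped like the entry in the first post
sample = "Name: Bob\nID: 123\nW:\n1\n2\n2\n3".split("\n")
result = line_after(sample, "W:", offset=4)
```

With the sample above, the fourth line after "W:" is returned; asking for the eighth line gives None instead of an exception.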
 
05-23-2009, 08:00 PM   #3
donnied
Member
 
Registered: Oct 2006
Distribution: Debian x64
Posts: 198

Original Poster
Rep: Reputation: 30
Wow. With your code I got done in 5 minutes what I'd been tinkering with for hours. Thank you.

If I did want to use re.M|re.DOTALL as mentioned, how would I do that? (I'm curious for future reference, and I think it might help with chunking things into discrete modules.)
 
05-23-2009, 08:19 PM   #4
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Code:
regex=re.compile("<pattern here>",re.M|re.DOTALL)
Please read the docs, as well as the Python Regular Expression HOWTO (google it).
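
For a concrete picture of what the two flags do, here is a minimal sketch, not a definitive solution. It assumes each entry has an "ID:" line followed somewhere later by a "W:" line, as in the first post; the function name `find_records`, the `offset` parameter, and the sample text are hypothetical. re.M lets `^` and `$` anchor at every line boundary, and re.DOTALL lets `.` run across newlines, which is why the line-bound parts use `[^\n]` instead of `.`:

```python
import re


def find_records(text, offset=8):
    """Return (id, value) pairs, where value is the offset-th line after 'W:'."""
    # skip offset-1 complete lines; [^\n] keeps each repetition on one line
    skip = r"(?:[^\n]*\n){%d}" % (offset - 1)
    pattern = re.compile(
        r"^ID:[ \t]*(\d+)[^\n]*$"   # ID line: capture the digits
        r".*?"                      # re.DOTALL: '.' may cross newlines (lazy)
        r"^W:[ \t]*\n"              # re.M: '^' matches at any line start
        + skip +
        r"([^\n]*)$",               # capture the target line
        re.M | re.DOTALL,
    )
    return pattern.findall(text)


# hypothetical sample shaped like the entry in the first post
sample = "Name: Bob\nID: 123\n...\nW:\n1\n2\n2\n3\n"
records = find_records(sample, offset=4)
```

Without re.DOTALL the lazy `.*?` could not get from the ID line to the "W:" line, and without re.M the `^` anchors would only match at the very start of the text.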
 
05-23-2009, 08:56 PM   #5
donnied
Member
 
Registered: Oct 2006
Distribution: Debian x64
Posts: 198

Original Poster
Rep: Reputation: 30
I read the docs, but I don't really get it until I've used it (after I've seen a concrete example).

Thanks again.
 
05-24-2009, 09:18 AM   #6
donnied
Member
 
Registered: Oct 2006
Distribution: Debian x64
Posts: 198

Original Poster
Rep: Reputation: 30
using indexing python

When I used the indexing solution, I got data that didn't match, for example:
Joe Brown 123456 3 4 1 3.5
Joe Brown 123456 3 4 1 4
or
Jane Doe 654321 1 3 1 2
Jane Doe 654321 1 3 1 3


Oops. The READING-WRITING line also contains 'WRITING', so the "WRITING" test matched it too.
I'll probably go with something like:


Code:
# data is the list of lines from the indexing example; wcm2 is an
# output file that was opened for writing earlier
for n, line in enumerate(data):
    if "Name" in line:
        name = line.split()[-1].strip()
        print(name)

    if "ID" in line:
        student_id = line.split()[-1].strip()
        print(student_id)

    if "ORAL" in line:
        oral = data[n + 8]
        print(oral)

    if "READING" in line:
        reading = data[n + 8]
        print(reading)

    if "BROAD" in line:
        broad = data[n + 8]
        print(broad)

    if "WRITING" in line:
        if "READING" in line:
            # READING-WRITING also contains "WRITING"; skip it
            print('not what we want')
        else:
            writing = data[n + 8]
            print(writing)
            calpstring = name + ' ' + student_id + oral + reading + broad + writing
            # write only once the record is complete
            wcm2.write(calpstring + '\n')

wcm2.close()
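
Another way to sidestep the READING-WRITING collision is to compare the whole stripped line against each label instead of using `in`. A minimal sketch, assuming each label sits alone on its own line and the value is a fixed number of lines below it; the real file layout is not shown in the thread, so `extract_scores`, `LABELS`, and the sample lines are hypothetical:

```python
LABELS = ("ORAL", "READING", "BROAD", "WRITING")


def extract_scores(lines, offset=8):
    """Map each label to the line `offset` lines below it, matching labels exactly."""
    scores = {}
    for n, line in enumerate(lines):
        label = line.strip()
        # exact comparison: "READING-WRITING" equals neither "READING" nor "WRITING"
        if label in LABELS and n + offset < len(lines):
            scores[label] = lines[n + offset].strip()
    return scores


# hypothetical sample with a short offset to keep it readable
sample = ["READING", "x", "3", "READING-WRITING", "y", "9", "WRITING", "z", "4"]
scores = extract_scores(sample, offset=2)
```

If the label lines carry extra text, `line.startswith(...)` or a regex anchored with `^` would be the equivalent test; which one fits depends on the actual file.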
 
  


Tags
match, pattern, python, search

