LinuxQuestions.org
Old 01-17-2009, 01:25 PM   #1
donnied
Python: How to use the re module?


In Python, if I want to use the re module on the contents of a file, how would I do that? I'm imagining opening the file and reading it in with readlines:

Code:
for line in open('file1.csv').readlines():
        lines = [line.rstrip()]
        m = re.match('the string I want to match', lines)
        m.group()
but it doesn't work. I'm basically trying to emulate grep '^[0-9](6)'

And what would be the Python equivalent of sed 's/cat/dog/g'?
(I don't want to use sed or the OS module.)
Old 01-17-2009, 06:26 PM   #2
ntubski
Maybe you are overthinking this.

Code:
import re

with open('in.txt') as f:
    for line in f:
        if re.match(r'^[0-9]\(6\)', line):
            print(line, end='')  # line already ends with a newline
Note that Python's regex syntax is like egrep's rather than grep's: parentheses are grouping operators, so matching a literal "(6)" needs the backslashes.
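To make the difference concrete, here is a small sketch comparing three spellings of the pattern. The sample strings are made up for illustration; the `{6}` form is what you would want if the goal is "six leading digits" rather than a literal "(6)".

```python
import re

# Hypothetical sample lines, chosen so each pattern matches exactly one.
samples = ["1(6) literal parens", "123456 six digits", "16 group match"]

literal = re.compile(r'^[0-9]\(6\)')  # a digit followed by the text "(6)"
grouped = re.compile(r'^[0-9](6)')    # a digit followed by "6", captured as group 1
six = re.compile(r'^[0-9]{6}')        # exactly six leading digits

def matching(pattern, lines):
    """Return the lines that the pattern matches at the start."""
    return [s for s in lines if pattern.match(s)]

literal_hits = matching(literal, samples)
grouped_hits = matching(grouped, samples)
six_hits = matching(six, samples)
```

Since re.match only matches at the beginning of the string, the leading `^` is redundant here, though harmless.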

Quote:
And what would be the Python equivalent of sed 's/cat/dog/g'?
Code:
import re
with open('in.txt') as f:
    for line in f:
        print(re.sub('cat', 'dog', line), end='')
Both these examples should probably use re.compile outside of the loop for efficiency.
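A minimal sketch of that, with the patterns compiled once at module level. The function names and the six-digit pattern are assumptions for illustration, not part of the examples above. The re module also keeps an internal cache of recently compiled patterns, so the speedup is likely to be modest for a small file.

```python
import re

# Compiled once, outside any loop. The six-digit pattern is an assumed
# reading of the original grep expression.
ID_LINE = re.compile(r'[0-9]{6}')
CAT = re.compile(r'cat')

def grep_lines(lines):
    """Rough equivalent of grep: keep lines starting with six digits."""
    return [line for line in lines if ID_LINE.match(line)]

def sed_lines(lines):
    """Rough equivalent of sed 's/cat/dog/g': replace every occurrence."""
    return [CAT.sub('dog', line) for line in lines]

kept = grep_lines(["123456,algebra\n", "no id here\n"])
replaced = sed_lines(["cat concat\n"])
```

Both functions accept any iterable of lines, so an open file object can be passed in place of the lists used here.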
Old 01-19-2009, 01:26 PM   #3
donnied

Original Poster
Thank you. I ended up using:
Code:
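import re

studentid = None  # ID from the most recent line that starts with six digits
with open('workfile3') as infile, open('workfile4', 'w') as g4:
    for line in infile:
        m = re.match(r'[0-9]{6}', line)  # match once and reuse the result
        if m:
            studentid = m.group()
        else:
            # Build "id,line" as plain text; str() on a tuple writes its repr.
            record = '%s,%s' % (studentid, line)
            print(record, end='')
            g4.write(record)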

To clean the file(s) up I used a lot of:
Code:
scriptutil.freplace('.', shellglobs=('workfile3',),regexl=((r'\\$',r'', None),))
scriptutil.freplace('.', shellglobs=('example',),regexl=((r'([A-Z])\,([A-Z])',r'\1\2', None),))
I'm not sure how efficient the scriptutil function is, but I am only working with about one thousand lines of text.
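For comparison, a standard-library-only sketch of the same kind of cleanup. `replace_in_file` is a hypothetical helper, not part of scriptutil, and it assumes the file is small enough to read into memory; the demo file and its contents are made up.

```python
import os
import re
import tempfile

def replace_in_file(path, pattern, repl):
    """Apply re.sub to every line of the file at `path`, rewriting it in place."""
    regex = re.compile(pattern)
    with open(path) as f:
        lines = f.readlines()
    with open(path, 'w') as f:
        f.writelines(regex.sub(repl, line) for line in lines)

# Demo on a throwaway file: one line with a comma between capitals
# and a trailing backslash.
demo_path = os.path.join(tempfile.mkdtemp(), 'example')
with open(demo_path, 'w') as f:
    f.write('A,B trailing\\\n')

replace_in_file(demo_path, r'\\$', r'')                     # drop trailing backslash
replace_in_file(demo_path, r'([A-Z])\,([A-Z])', r'\1\2')    # drop comma between capitals

with open(demo_path) as f:
    result = f.read()
```

At around a thousand lines, reading the whole file and rewriting it should be fast enough that efficiency is unlikely to matter.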