LinuxQuestions.org
02-04-2011, 08:31 PM   #1
Dogs
Member
 
Registered: Aug 2009
Location: Houston
Distribution: Slackware 13.37 x64
Posts: 105

Rep: Reputation: 25
Python Help: Regular expressions not behaving as I might expect


I have a file full of store IDs and store names, and I want to extract either the ID or the store name from each line.

I got the ID part pretty quickly, because the IDs are a fixed length and sit at the beginning of the string. The names, however, vary in length from 0 to 50 alphanumeric characters.

The format is:

12345-my store name-
12346-my Big 1 Store Name The Eleventh-


What I'm trying is:

Code:
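import re

# Pattern in question: it does not yet match what I intend
my_pattern = re.compile('-*-')
storelist = ['12345-my store name-', '12346-my Big 1 Store Name The Eleventh-']

for line in storelist:
    # search each line, not the whole list
    my_store = my_pattern.search(line)
    print(my_store.group())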
This isn't how I actually plan to keep it, but I'm new to Python and trying to figure out how to make something like this work so I can write a function that returns a store name at a given position in the list.

I assumed the pattern would match almost anything between two '-' characters, but that doesn't appear to be the case.

Instead, it finds a single '-' and prints that.

How can I create a pattern that represents a '-', some variable amount of letters/numbers after it, and then another '-'?

Last edited by Dogs; 02-04-2011 at 09:00 PM.
 
02-04-2011, 09:07 PM   #2
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
No need to use a regular expression. You are using Python, and Python already has excellent string manipulation built in.
Just split on "-".
Code:
>>> "12346-my Big 1 Store Name The Eleventh-".split("-",2)
['12346', 'my Big 1 Store Name The Eleventh', '']
>>> "12345-my store name-".split("-",2)
['12345', 'my store name', '']
Now the 2nd element will be your name, and the 1st element will be your ID number. Check the docs for the usage of split.
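A minimal sketch of how the split approach could be wrapped into the lookup function the original poster describes. The names `parse_store` and `store_name_at` are hypothetical, and the sketch assumes every line follows the `id-name-` format with no dash inside the store name.

```python
def parse_store(line):
    """Split an 'id-name-' record into (store_id, store_name)."""
    # maxsplit=2 gives three pieces: the id, the name, and whatever follows the second dash
    store_id, name, _rest = line.split("-", 2)
    return store_id, name


def store_name_at(storelist, position):
    """Return the store name of the record at the given list index."""
    return parse_store(storelist[position])[1]


storelist = ['12345-my store name-', '12346-my Big 1 Store Name The Eleventh-']
print(store_name_at(storelist, 1))
```

A line without a second dash would fail at the unpacking step, which may or may not be the behavior you want for malformed records.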
 
02-04-2011, 09:15 PM   #3
Dogs
Member
 
Registered: Aug 2009
Location: Houston
Distribution: Slackware 13.37 x64
Posts: 105

Original Poster
Rep: Reputation: 25
Oh hey.. That's fancy!

Last edited by Dogs; 02-04-2011 at 09:17 PM.
 
02-05-2011, 02:09 AM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
Obviously ghostdog's solution is the way to go, but on the topic of regular expressions, I believe yours will match the following:
Code:
-
--
----------------------
And so on, because what you have asked for is zero or more dashes (-*) followed by a single dash (-). As every entry has at least a single dash in it, they should all return that dash.
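To see this for yourself, here is a small sketch that runs the original pattern against a couple of dash-only strings and one of the sample lines. The variable names are only for illustration.

```python
import re

# '-*-' means: zero or more dashes, then exactly one dash
pattern = re.compile('-*-')

for text in ['-', '--', '12345-my store name-']:
    match = pattern.search(text)
    # search() stops at the first place the pattern can match
    print(repr(text), '->', repr(match.group()))
```

For the store line, the first possible match is the lone dash right after the ID, which is why only '-' comes back.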

Maybe you wanted a pattern more like: '-[a-zA-Z ]+-'

There is the option of character classes too but I did not want to go over the top.
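As a rough sketch of where that could lead: the sample data contains a digit ("my Big 1 Store..."), so the pattern as written would not match that line. The variant below is an assumption on my part, not the only way to do it. It adds 0-9 to the set, uses * instead of + so an empty name is allowed, and puts a capture group around the name so the dashes are left out of the result.

```python
import re

# Hypothetical variant: letters, digits and spaces between two dashes,
# with parentheses capturing just the name
name_pattern = re.compile(r'-([A-Za-z0-9 ]*)-')

storelist = ['12345-my store name-', '12346-my Big 1 Store Name The Eleventh-']

for line in storelist:
    match = name_pattern.search(line)
    if match:
        # group(1) is the text inside the parentheses, without the dashes
        print(match.group(1))
```

If store names can contain other characters (apostrophes, ampersands and so on), the set would need to be widened accordingly.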

Hope that helps.
 
  


All times are GMT -5. The time now is 12:04 PM.
