LinuxQuestions.org
Old 01-01-2011, 09:30 PM   #1
Garrett85
Member
 
Registered: Jan 2011
Posts: 240

Rep: Reputation: 5
regex


Could someone please help me figure out the regex formula for this string?

City, ST 12345

City can be more than one word, and ST is always any two capital letters.
Since the city name can be almost anything but is always followed by a comma, it might be a good idea to start matching at the comma.

Thanks.

I would really appreciate it if the member darkcrimson would get in touch with me regarding this thread. Thanks again to everyone.
 
Old 01-01-2011, 10:05 PM   #2
AlucardZero
Senior Member
 
Registered: May 2006
Location: USA
Distribution: Debian
Posts: 4,817

Rep: Reputation: 609
Basic (and more) regex info: http://www.regular-expressions.info/ . You should read through it if you want to learn regex.

We need to know where you are using this regular expression (e.g. grep, or a Perl script?), because there are different flavors of regex engines with different capabilities. We also need to know what you want to do with the match: just print it, or parse out the city, state, and zip?

Perl-Compatible Regular Expression:
Code:
^(.+?), ([A-Z]{2}) (\d{5})$
The city will be in $1, the state in $2, and the zip in $3.

That's:
- ^ start of line
- (.+?) One or more characters, non-greedy, captured
- , literal comma, literal space
- ([A-Z]{2}) two capital letters, captured
- literal space
- (\d{5}) five digits, captured
- $ end of line
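
For anyone using grep rather than Perl, the same idea can be approximated in POSIX extended syntax. This is only a sketch and not part of the original post: it assumes `grep -E`, writes `[0-9]` instead of `\d` (which is not part of POSIX extended syntax), and drops the Perl-style non-greedy `?`, since with both anchors in place greediness does not change which lines match. The helper name `is_city_state_zip` is made up for illustration.

```shell
# Hypothetical helper: succeeds if the whole line looks like "City, ST 12345".
# LC_ALL=C keeps [A-Z] limited to ASCII capital letters regardless of locale.
is_city_state_zip() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^.+, [A-Z]{2} [0-9]{5}$'
}

is_city_state_zip 'Salt Lake City, UT 84101'  && echo "match"
is_city_state_zip 'Salt Lake City, ut 84101'  || echo "no match: lowercase state"
is_city_state_zip 'Salt Lake City, UT 841011' || echo "no match: six digits"
```

The anchors do the work in the last case: without `$`, a line with extra trailing characters could still match.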
 
Old 01-02-2011, 02:52 AM   #3
theNbomr
LQ 5k Club
 
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,396
Blog Entries: 2

Rep: Reputation: 908
You have done yourself the favor of describing how the regex should match the input. Having done this much is most of the work; the rest is just translating the long-form description to the concise regex version. AlucardZero has already mentioned the distinction between regex implementations in different tools and languages, so I'll just use Perl as an example.
You said 'anything but is always followed by a comma', which I will translate to the more accurate 'at least one of anything, followed by a comma'. Happily, there is an almost direct translation of these words to regex code.
Code:
.+,
dot (anything)
+ (at least one of the preceding)
, (literal comma)

Then, you didn't mention the whitespace, but it's there, and whitespace can sometimes be in multiples, so as long as we specify at least one, we'll be robust in how we match:
Code:
\s+
\s (whitespace of any sort)
+ (at least one of the preceding)
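
To see why allowing repeated whitespace matters, here is a small sketch for grep users. It assumes `grep -E` and uses the POSIX class `[[:space:]]` as the portable counterpart of Perl's `\s` (some grep implementations also accept `\s` as an extension). The variable names and the sample line are made up for illustration.

```shell
line='Boise,  ID   83702'   # two spaces after the comma, three before the zip

strict=', [A-Z]{2} [0-9]{5}$'                            # exactly one space each time
tolerant=',[[:space:]]+[A-Z]{2}[[:space:]]+[0-9]{5}$'    # one or more whitespace characters

# The single literal space is too strict for this line...
printf '%s\n' "$line" | LC_ALL=C grep -Eq "$strict"   || echo "strict pattern: no match"

# ...while "at least one whitespace character" accepts it.
printf '%s\n' "$line" | LC_ALL=C grep -Eq "$tolerant" && echo "tolerant pattern: match"
```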

Then, you said 'always any two capital letters'. Nice, concise, and once again, directly translates to regex code
Code:
[A-Z][A-Z]
This should need no explanation, I would guess. However, it is distinct from AlucardZero's example in that it seems simpler to read, and from my understanding of the way regexes work, may be slightly more efficient. Until the regex has been used a few million times, I doubt the difference is measurable.

Next comes more whitespace, as before, followed by what many would guess to be a US zip code of five digits. For the five digits, I will agree with AlucardZero's example:
Code:
\s+[0-9]{5}
The whole thing, as a snippet of Perl code:
Code:
# Only use $1..$3 if the match succeeded; otherwise they keep stale values.
if ($address =~ m/(.+),\s+([A-Z][A-Z])\s+([0-9]{5})/) {
    $city  = $1;
    $state = $2;
    $zip   = $3;
}
Now, next time you think about how to describe what you want, simply go the extra step and translate it to code. The problem almost solves itself.
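
For readers working in a shell instead of Perl, the same three captures can be pulled out with sed. This is a rough sketch under a few assumptions, not something from the post above: it assumes a sed that supports `-E` (GNU and BSD sed do), adds `^` and `$` anchors so that nothing outside the match leaks into the output, and uses a made-up helper name, `split_address`, with `|` as an arbitrary field separator.

```shell
# Hypothetical helper: print "city|state|zip" for a matching line,
# and print nothing at all if the line does not match.
split_address() {
    printf '%s\n' "$1" |
        LC_ALL=C sed -E -n 's/^(.+),[[:space:]]+([A-Z][A-Z])[[:space:]]+([0-9]{5})$/\1|\2|\3/p'
}

split_address 'New York,   NY  10001'
```

The `-n` option together with the trailing `p` flag means only lines where the substitution succeeded are printed, which plays the same role as checking that the match succeeded before using the captures.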

--- rod.
 
Old 01-02-2011, 10:52 AM   #4
Garrett85
Member
 
Registered: Jan 2011
Posts: 240

Original Poster
Rep: Reputation: 5
regex and grep

Quote:
Originally Posted by AlucardZero
We need to know where you are using this regular expression (e.g. grep, or a Perl script?), because there are different flavors of regex engines with different capabilities. We also need to know what you want to do with the match: just print it, or parse out the city, state, and zip?

Sorry, yes: it's just a regular Unix expression combined with grep. Thanks for any and all replies.
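
Since the target turns out to be grep, here is a hedged sketch of the "start matching at the comma" idea from the first post. It assumes a grep that supports `-o` (GNU and BSD grep do, though it is not required by POSIX), and the sample data is made up to stand in for a real file.

```shell
# Sample data standing in for a real file (contents are made up).
addresses='Salt Lake City, UT 84101
not an address
Boise, ID 83702'

# -E: extended regular expressions; -o: print only the part that matched.
# LC_ALL=C keeps [A-Z] limited to ASCII capital letters.
tails=$(printf '%s\n' "$addresses" | LC_ALL=C grep -Eo ', [A-Z]{2} [0-9]{5}$')
printf '%s\n' "$tails"
```

Against an actual file, the pattern would go before the file name, e.g. `grep -E ', [A-Z]{2} [0-9]{5}$' addresses.txt`, where `addresses.txt` is a hypothetical name. Dropping `-o` prints the whole matching lines instead of just the part from the comma onward.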
 
  


All times are GMT -5. The time now is 06:05 AM.
