Old 11-30-2011, 09:27 AM   #1
Senior Member
Registered: Jun 2006
Location: Maryland
Distribution: Kubuntu, Fedora, RHEL
Posts: 1,533

Help needed with understanding (Java) Regular Expression

I'm maintaining a Java application that queries a POP server for email replies. I have very limited experience reading regular expressions, and was wondering if someone could help me understand the following statement:
Pattern subjStart = Pattern.compile("^\\s*(?:[Rr][Ee]:\\s*" + SUBJECT_PREFIX + " )?([0-9A-Z]+)\\s*(?:.*)?");
The SUBJECT_PREFIX is a string that can be anything, including an empty string. Pattern is from the java.util.regex package.
Old 11-30-2011, 11:13 AM   #2
Registered: Mar 2006
Location: Ekaterinburg, Russia
Distribution: Debian, Ubuntu
Posts: 704


First, take a look at these two links: 1, 2 (scroll to "Regular Expressions, Literal Strings and Backslashes").

According to the second link:
In literal Java strings the backslash is an escape character. The literal string "\\" is a single backslash. In regular expressions, the backslash is also an escape character. The regular expression \\ matches a single backslash. This regular expression as a Java string, becomes "\\\\". That's right: 4 backslashes to match a single one.
So, "\\s" is the character class \s equal to [ \t\n\x0B\f\r] -- a whitespace character.
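To make the escaping rule above concrete, here is a small sketch (the class and method names are made up for illustration). It only shows that the Java literal "\\s" is the two-character regex \s, and that "\\\\" is the regex that matches one backslash.

```java
import java.util.regex.Pattern;

class BackslashDemo {
    // Java source "\\s" is the two-character regex \s (the whitespace class).
    static final String WHITESPACE = "\\s";
    // Java source "\\\\" is the two-character regex \\ (one literal backslash).
    static final String BACKSLASH = "\\\\";

    static boolean isWhitespace(String s) {
        return Pattern.matches(WHITESPACE, s);
    }

    static boolean isBackslash(String s) {
        return Pattern.matches(BACKSLASH, s);
    }

    public static void main(String[] args) {
        System.out.println(WHITESPACE.length());  // 2: a backslash and 's'
        System.out.println(BACKSLASH.length());   // 2: two backslashes
        System.out.println(isWhitespace("\t"));   // true
        System.out.println(isWhitespace("s"));    // false
        System.out.println(isBackslash("\\"));    // true: the input is one backslash
    }
}
```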

From the first link:
Greedy quantifiers
X? X, once or not at all
Special constructs (non-capturing)
(?:X) X, as a non-capturing group
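Therefore (?:.*)? is an optional (note the second ?) block of zero or more arbitrary characters (other than line terminators, by default). Since .* can already match nothing, the trailing ? is redundant. A non-capturing group only groups; it does not store the matched text, which saves a little work and leaves the numbering of the capturing groups unchanged.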
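A short sketch of the difference between a capturing and a non-capturing tail may help here. The two patterns below are simplified fragments of the one in the question, and the class name is invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    // Same shape twice: the first pattern captures the tail, the second does not.
    static final Pattern CAPTURING = Pattern.compile("([0-9A-Z]+)\\s*(.*)?");
    static final Pattern NON_CAPTURING = Pattern.compile("([0-9A-Z]+)\\s*(?:.*)?");

    // Number of capturing groups if the whole input matches, otherwise -1.
    static int groupCount(Pattern p, String input) {
        Matcher m = p.matcher(input);
        return m.matches() ? m.groupCount() : -1;
    }

    public static void main(String[] args) {
        String subject = "AB12 rest of subject";
        System.out.println(groupCount(CAPTURING, subject));      // 2
        System.out.println(groupCount(NON_CAPTURING, subject));  // 1
        // The tail is optional, so the ID alone also matches.
        System.out.println(groupCount(NON_CAPTURING, "AB12"));   // 1
    }
}
```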

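The same applies to "(?:[Rr][Ee]:\\s*" + SUBJECT_PREFIX + " )?": it is an optional block that matches, for example, "Re: <SUBJECT_PREFIX> " or "RE: <SUBJECT_PREFIX> " (note the literal space after the prefix). Because SUBJECT_PREFIX is concatenated into the pattern, its contents are interpreted as regex syntax too.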
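Since SUBJECT_PREFIX "can be anything", it may be worth checking what happens when it contains regex metacharacters. The sketch below is only an illustration: the prefix value "[Ticket]" and the helper names are assumptions, and wrapping the prefix in Pattern.quote is a suggestion, not something the original code is known to do.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrefixDemo {
    // Hypothetical prefix; the real value comes from the application.
    static final String SUBJECT_PREFIX = "[Ticket]";

    // Builds the pattern from the question around the given prefix text.
    static Pattern build(String prefix) {
        return Pattern.compile(
            "^\\s*(?:[Rr][Ee]:\\s*" + prefix + " )?([0-9A-Z]+)\\s*(?:.*)?");
    }

    // Returns group 1 if the whole subject matches, otherwise null.
    static String id(Pattern p, String subject) {
        Matcher m = p.matcher(subject);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String subject = "Re: [Ticket] AB12 hello";
        // Unquoted, "[Ticket]" is a character class, so the optional block
        // fails and group 1 picks up the "R" of "Re:".
        System.out.println(id(build(SUBJECT_PREFIX), subject));                 // R
        // Quoted, the prefix is matched literally.
        System.out.println(id(build(Pattern.quote(SUBJECT_PREFIX)), subject));  // AB12
    }
}
```

Note the first result: whenever the optional block does not match, a subject that starts with "Re:" still matches, with "R" as group 1.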

The only capturing group here is '([0-9A-Z]+)': one or more digits or uppercase ASCII letters. Its match is available as group 1.
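Putting the pieces together, the following sketch runs the pattern from the question against a few sample subjects and pulls out group 1. The prefix "HELPDESK", the sample subjects, and the use of matches() are assumptions for the example; the real application may use find() or lookingAt() instead.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SubjectDemo {
    // Hypothetical prefix with no regex metacharacters.
    static final String SUBJECT_PREFIX = "HELPDESK";

    static final Pattern SUBJ_START = Pattern.compile(
        "^\\s*(?:[Rr][Ee]:\\s*" + SUBJECT_PREFIX + " )?([0-9A-Z]+)\\s*(?:.*)?");

    // Returns the captured ID (group 1), or null if the subject does not match.
    static String extractId(String subject) {
        Matcher m = SUBJ_START.matcher(subject);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractId("Re: HELPDESK 42AB please reopen")); // 42AB
        System.out.println(extractId("RE:HELPDESK 7Z"));                  // 7Z
        System.out.println(extractId("  ABC123 new request"));            // ABC123
        System.out.println(extractId("lowercase subject"));               // null
    }
}
```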

Hope, I am correct and this will help.

Last edited by firstfire; 11-30-2011 at 12:25 PM.
Old 11-30-2011, 01:07 PM   #3
Original Poster
Originally Posted by firstfire
Hope, I am correct and this will help.
I will read over the information from the links you provided. Thanks a lot for dissecting the regex string I provided earlier. With the information you supplied, and that found within the Java API site, it shouldn't be too hard to grasp the layout of the regex.

Again, thanks for your help.


java, regex
