LinuxQuestions.org
Old 06-08-2009, 04:31 AM   #1
sancho1980
Member
 
Registered: May 2006
Location: Leipzig, Germany
Distribution: Kanotix 64
Posts: 45

Rep: Reputation: 15
regular expression problem


hi

I am using regcomp and regexec to find out whether a string is a valid host name.
A valid host name (according to Wikipedia) is anything that
- starts with a-z or A-Z,
- is followed by 0 or more of a-z, A-Z, 0-9 or '-',
- and ends with a-z, A-Z or 0-9.

The regular expression I use is this one:

#define HOSTNAMEREGEX "^([a-zA-Z])|([a-z0-9A-Z-])*|([a-z0-9A-Z])$"

But strangely enough, this also matches strings like

"a-" and even "a?"

What's wrong with this?

thanks

martin
 
Old 06-08-2009, 04:48 AM   #2
david1941
Member
 
Registered: May 2005
Location: St. Louis, MO
Distribution: CentOS7
Posts: 267

Rep: Reputation: 58
Well, a- and a ARE valid hostnames. It appears little is wrong with it.

Dave
 
Old 06-08-2009, 05:05 AM   #3
sancho1980
Member
 
Registered: May 2006
Location: Leipzig, Germany
Distribution: Kanotix 64
Posts: 45

Original Poster
Rep: Reputation: 15
Even if "a-" WAS a valid host name (which I doubt), then your answer still misses my point: The core of my question was "why does the above regex match something like 'a-' and 'a?'"
 
Old 06-08-2009, 05:11 AM   #4
david1941
Member
 
Registered: May 2005
Location: St. Louis, MO
Distribution: CentOS7
Posts: 267

Rep: Reputation: 58
Code:
"^([a-zA-Z])|([a-z0-9A-Z-])*|([a-z0-9A-Z])$"
It matches the second alternation, ([a-z0-9A-Z-])*

Dave
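
One way to see what is happening, sketched under the assumption that the pattern is compiled with REG_EXTENDED (the thread does not show the regcomp call): alternation has the lowest precedence, so ^ belongs only to the first alternative and $ only to the last one. The helper names match_span and show below are made up for this illustration; they report which bytes of the input the overall match covers.

```c
#include <regex.h>
#include <stdio.h>

/* The pattern from the first post, unchanged. */
#define HOSTNAMEREGEX "^([a-zA-Z])|([a-z0-9A-Z-])*|([a-z0-9A-Z])$"

/* Hypothetical helper: compiles `pattern` as an extended regex and reports
   which part of `text` the overall match covers.
   Returns 1 on a match, 0 on no match, -1 if the pattern does not compile. */
static int match_span(const char *pattern, const char *text,
                      int *start, int *end)
{
    regex_t re;
    regmatch_t m[1];
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    rc = regexec(&re, text, 1, m, 0);
    regfree(&re);
    if (rc != 0)
        return 0;
    *start = (int)m[0].rm_so;
    *end = (int)m[0].rm_eo;
    return 1;
}

/* Prints the matched span so that a partial match becomes visible. */
static void show(const char *text)
{
    int s = 0, e = 0;

    if (match_span(HOSTNAMEREGEX, text, &s, &e) == 1)
        printf("\"%s\": matched bytes %d..%d\n", text, s, e);
    else
        printf("\"%s\": no match\n", text);
}
```

With this pattern, show("a?") should report bytes 0..1, meaning only the leading "a" was matched and the "?" was never looked at. show("?") should report an empty match at 0..0, because the starred middle alternative is allowed to match nothing at all.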
 
Old 06-08-2009, 05:20 AM   #5
sancho1980
Member
 
Registered: May 2006
Location: Leipzig, Germany
Distribution: Kanotix 64
Posts: 45

Original Poster
Rep: Reputation: 15
My regex was indeed wrong, but I think so was your answer.
I really meant the following regex:

#define HOSTNAMEREGEX \
"^([a-zA-Z])|(([a-zA-Z])([a-z0-9A-Z-])*([a-z0-9A-Z]))$"

This clearly has 2 alternatives:

1) ([a-zA-Z]): any ONE letter out of a-z or A-Z
2) (([a-zA-Z])([a-z0-9A-Z-])*([a-z0-9A-Z])): any ONE letter out of a-z or A-Z, followed by 0 or more out of [a-z0-9A-Z-], and ENDING WITH ANY ONE out of [a-z0-9A-Z]

My understanding is that this CANNOT possibly match anything ending with "-", let alone "?" or any other special character... BUT IT DOES!! WHY?

get my point?
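
A likely explanation, sketched here assuming the pattern is compiled with REG_EXTENDED: the | still splits the whole expression, so ^ applies only to the first alternative and $ only to the second. "^([a-zA-Z])" on its own accepts any string that merely starts with a letter. Wrapping the alternation in one more group makes both anchors apply to both alternatives. The macro names and the helper ere_match are made up for this example.

```c
#include <regex.h>
#include <stddef.h>

/* Pattern as posted: ^ binds to the first alternative only,
   $ binds to the second alternative only. */
#define HOSTNAMEREGEX_POSTED \
    "^([a-zA-Z])|(([a-zA-Z])([a-z0-9A-Z-])*([a-z0-9A-Z]))$"

/* Same two alternatives, wrapped in one outer group so that
   ^ and $ apply to the alternation as a whole. */
#define HOSTNAMEREGEX_GROUPED \
    "^(([a-zA-Z])|(([a-zA-Z])([a-z0-9A-Z-])*([a-z0-9A-Z])))$"

/* Hypothetical helper: returns 1 if `text` matches `pattern` (POSIX extended
   syntax), 0 if it does not, -1 if regcomp() rejects the pattern. */
static int ere_match(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With these definitions, ere_match(HOSTNAMEREGEX_POSTED, "a-") returns 1, while ere_match(HOSTNAMEREGEX_GROUPED, "a-") and ere_match(HOSTNAMEREGEX_GROUPED, "a?") return 0. Plain "a" and "a-b" are still accepted by the grouped version.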
 
Old 06-08-2009, 05:40 AM   #6
david1941
Member
 
Registered: May 2005
Location: St. Louis, MO
Distribution: CentOS7
Posts: 267

Rep: Reputation: 58
the ([a-z0-9A-Z-])* should have the final - first in the character class, or else it is a metacharacter. Try it like this: ([-a-z0-9A-Z])*

Dave
 
Old 06-08-2009, 05:45 AM   #7
sancho1980
Member
 
Registered: May 2006
Location: Leipzig, Germany
Distribution: Kanotix 64
Posts: 45

Original Poster
Rep: Reputation: 15
Umm no, that didn't do the trick :-(
 
Old 06-08-2009, 07:15 AM   #8
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
These do exactly the same thing:
Code:
([a-z-])
([-a-z])
To wit: match any character in the range a-z, OR a literal "-"
 
Old 06-08-2009, 07:21 AM   #9
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
@pixellany:
Quote:
These do exactly the same thing:
([a-z-])
([-a-z])
Theoretically you are correct, but the first one ([a-z-]) can fail depending on the program, its version and/or the *nix flavor in use (the error being: A closing character is expected after z-).

If you want to make sure no error will occur use the second one ([-a-z]).
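
For anyone who wants to check this on their own system, here is a minimal sketch, assuming a POSIX regcomp/regexec and REG_EXTENDED. POSIX specifies that a hyphen listed first or last in a bracket expression stands for itself, so both spellings should compile and accept "-". The helper names are made up for this example, and a return value of -1 would point to the kind of compile failure described above.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `text` matches `pattern` (POSIX extended
   syntax), 0 if it does not, -1 if regcomp() rejects the pattern. */
static int ere_match(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}

/* Returns 1 when both spellings of the class compile and agree on every
   sample input, 0 when they disagree, -1 when either fails to compile. */
static int hyphen_placement_agrees(void)
{
    static const char *samples[] = { "-", "a", "z", "?" };
    size_t i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        int last  = ere_match("^([a-z-])$", samples[i]); /* hyphen last  */
        int first = ere_match("^([-a-z])$", samples[i]); /* hyphen first */

        if (last < 0 || first < 0)
            return -1;
        if (last != first)
            return 0;
    }
    return 1;
}
```

If hyphen_placement_agrees() returns 1, the position of the hyphen makes no difference on that platform, and the unexpected matches earlier in the thread have to come from somewhere else in the pattern.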
 
Old 06-08-2009, 07:22 AM   #10
PMP
Member
 
Registered: Apr 2009
Location: ~
Distribution: RHEL, Fedora
Posts: 381

Rep: Reputation: 58
Try using this

"^([a-zA-Z])([a-z0-9A-Z-])*([a-z0-9A-Z])$"
 
Old 06-08-2009, 07:26 AM   #11
PMP
Member
 
Registered: Apr 2009
Location: ~
Distribution: RHEL, Fedora
Posts: 381

Rep: Reputation: 58
I guess this one should work

"^([a-zA-Z])([a-z0-9A-Z])*(-[a-z0-9A-Z])?$"
 
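
A pattern built from three mandatory parts (first letter, middle run, last character) needs at least two characters and so rejects a one-letter name such as "a". A pattern that allows a hyphen only directly before the last character rejects "a-b-c". One possible alternative, given here as a sketch and not as a full RFC check (no length limit, no dotted names), makes everything after the first letter optional as a single unit. It assumes REG_EXTENDED; the macro and function names are made up for this example.

```c
#include <regex.h>
#include <stddef.h>

/* One letter, optionally followed by any run of letters, digits and hyphens
   that ends in a letter or digit. This encodes only the rule quoted in the
   first post of the thread. */
#define HOSTNAME_LABEL_REGEX "^[a-zA-Z]([a-zA-Z0-9-]*[a-zA-Z0-9])?$"

/* Hypothetical helper: 1 = name follows the rule, 0 = it does not,
   -1 = the regex failed to compile. */
static int is_valid_hostname(const char *name)
{
    regex_t re;
    int rc;

    if (regcomp(&re, HOSTNAME_LABEL_REGEX, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, name, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

With this pattern, is_valid_hostname("a") and is_valid_hostname("a-b-c") return 1, while "a-", "a?", "-a" and the empty string return 0. Because there is no top-level alternation, ^ and $ both apply to the whole expression.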
  


All times are GMT -5.
