07-16-2006, 06:37 PM
|
#1
|
Member
Registered: Mar 2005
Location: Montréal, Québec, Canada
Distribution: Slackware 12.1 x32, 13.1 x64
Posts: 90
Rep:
|
perl regular expression a char match
Hi guys,
I want to validate a line of text by searching for any character that is not in this list:
PHP Code:
from a to z
from A to Z
from 0 to 9
. (point)
- (hyphen)
\ (backslash), and finally
[white space]
I want to scan the whole line and print an error message if it contains any character that is not in my list. I'm using this:
Code:
#!/usr/bin/perl
...
...
if( $mystring !~ m/[a-zA-Z_0-9\.\-\\]|[ ]|/ ) # if $mystring has a non-defined char
{
print "Error";
}
Does anyone know what I'm doing wrong?
I guess it's my regular expression.
Regards
|
|
|
07-16-2006, 11:08 PM
|
#2
|
Senior Member
Registered: Mar 2005
Location: USA::Pennsylvania
Distribution: Slackware
Posts: 1,065
Rep:
|
Well, it looks close. Why is there an underscore in there?
It might work like this, maybe:
[a-zA-Z0-9\.\-\\ ]
|
|
|
07-16-2006, 11:51 PM
|
#3
|
Member
Registered: Jun 2006
Location: Colombo, Sri Lanka
Distribution: Ubuntu
Posts: 103
Rep:
|
This should work. You don't have to escape a hyphen if it's the first character in the class.
Code:
/[-a-zA-Z0-9\.\\\ ]+/
Also, with the /i modifier, which makes the match case insensitive, you can drop either a-z or A-Z:
Code:
/[-a-z0-9\.\\\ ]+/i
If you want those characters to appear in that exact order, the expression has to be changed, but I'm not sure if that's what you want.
Last edited by koobi; 07-16-2006 at 11:53 PM.
|
|
|
07-16-2006, 11:53 PM
|
#4
|
Senior Member
Registered: Oct 2003
Posts: 3,057
Rep:
|
I wonder if you might use [\w\s]+ instead.
For example:
Code:
#!/usr/bin/perl
# Example: ./myscript This is \^ test!
# Example: ./myscript There aren\'t \~ 33 Numbers \ HEre.
foreach $mystring (@ARGV) {
if( $mystring !~ m/([\w\s]+)/ ) # if $mystring has a non-defined char
{
print "$mystring is not defined\n";
}
}
|
|
|
07-16-2006, 11:57 PM
|
#5
|
Member
Registered: Jun 2006
Location: Colombo, Sri Lanka
Distribution: Ubuntu
Posts: 103
Rep:
|
Quote:
Originally Posted by homey
I wonder if you might use [\w\s]+ instead.
For example:
Code:
#!/usr/bin/perl
# Example: ./myscript This is \^ test!
# Example: ./myscript There aren\'t \~ 33 Numbers \ HEre.
foreach $mystring (@ARGV) {
if( $mystring !~ m/([\w\s]+)/ ) # if $mystring has a non-defined char
{
print "$mystring is not defined\n";
}
}
|
But \w includes an underscore as well, doesn't it?
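A quick way to check that, as a minimal sketch (the helper name is_word_char is made up for this example and is not from the thread): test a few single characters against \w and see which ones match.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if the single character $ch matches \w.
sub is_word_char {
    my ($ch) = @_;
    return $ch =~ /\A\w\z/ ? 1 : 0;
}

# Letters, digits and the underscore match \w; hyphen and point do not.
for my $ch ('a', '7', '_', '-', '.') {
    printf "'%s' %s \\w\n", $ch, is_word_char($ch) ? 'matches' : 'does not match';
}
```

So a class built on \w lets the underscore through, and it still needs the hyphen, point and backslash added explicitly.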
|
|
|
07-17-2006, 01:32 AM
|
#6
|
Member
Registered: Mar 2005
Location: Montréal, Québec, Canada
Distribution: Slackware 12.1 x32, 13.1 x64
Posts: 90
Original Poster
Rep:
|
Thanks guys,
but it's still not working.
I tried this:
Code:
$mystring = "!20060710,NONE,NONE,GROUP_INC,Swap,Senior Unsecured,CDS SDA,NONE,100y,0.12,0.12";
It contains "!" and ",", which are not supposed to be allowed by my check function. I'm using this:
Code:
foreach $mystring (@ARGV)
{
if( $mystring !~ m/[a-zA-Z_0-9\-\.\\ ]+/i )
{
print "\nError\n";
}
}
Nothing happens. I don't know why; the regular expression seems OK.
Maybe it's something with m/expr/option. The options are i, g, e, but I'm not sure which one is the right one. I tried all of them.
I'm allowing the following characters:
PHP Code:
From a to z
From A to Z
"_" [underscore]
From 0 to 9
"-" [hyphen] or is this [dash]?
"." (point)
"\" [back slash] and
" " [white space]
|
|
|
07-18-2006, 09:01 AM
|
#7
|
Senior Member
Registered: Mar 2004
Location: england
Distribution: Mint, Armbian, NetBSD, Puppy, Raspbian
Posts: 3,515
|
Your logic is wrong.
!~ is the NOT of =~,
so you are testing whether NO legal characters are there at all.
You need a positive search for illegal characters (think about it):
Code:
print "Error\n" if $mystring =~ /[^-.A-Za-z0-9\s]/;
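To see the difference between the two tests side by side, here is a minimal sketch. The helper name has_illegal_char and the sample strings are only for illustration. Note that the class above does not include the backslash or the underscore from the original list, so this version flags those too; add \\ and _ inside the brackets if they should be allowed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if $s contains at least one character
# outside the allowed set (hyphen, point, letters, digits, whitespace).
sub has_illegal_char {
    my ($s) = @_;
    return $s =~ /[^-.A-Za-z0-9\s]/ ? 1 : 0;
}

my $bad  = '!20060710,NONE,NONE';
my $good = 'Senior Unsecured 0.12';

# Original test: only fires when NO allowed character is present at all,
# so it stays silent here because $bad contains digits and letters.
print "old test flags bad line\n" if $bad !~ /[a-zA-Z_0-9\-\.\\ ]+/;

# Negated class: fires as soon as one illegal character is found.
print "new test flags bad line\n"  if has_illegal_char($bad);
print "new test flags good line\n" if has_illegal_char($good);
```

Only the "new test flags bad line" message is printed: the "!" and "," trip the negated class, while the old test is satisfied by the first legal character it sees.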
Last edited by bigearsbilly; 07-18-2006 at 09:03 AM.
|
|
|
07-18-2006, 10:08 PM
|
#8
|
Member
Registered: Mar 2005
Location: Montréal, Québec, Canada
Distribution: Slackware 12.1 x32, 13.1 x64
Posts: 90
Original Poster
Rep:
|
Thanks bigearsbilly,
In the end I searched for the chars that I don't want, because my logic seemed wrong.
I then did the following:
Code:
if( $mystring =~ m/[!"#\$\%&'()\*\+\,\-\/:<=>\?\@\[\\\]\^\`\{\|\}\~]+/g ) # char ";" is valid after substitution of "," for ";".
{
print "Error on line $i: an invalid char was found, aborting the parsing\n";
}
I guess the above is different from what I was doing before, which was negating the regular expression match.
All this is because the data could be corrupted, since it comes from an Excel file. That's why I'm looking for any weird char.
I don't think it could get control characters from the Excel file, could it?
Thanks everybody for your comments.
I love this site.
|
|
|
07-19-2006, 04:37 AM
|
#9
|
Senior Member
Registered: Mar 2004
Location: england
Distribution: Mint, Armbian, NetBSD, Puppy, Raspbian
Posts: 3,515
|
Quote:
Originally Posted by richikiki
In the end I searched for the chars that I don't want, because my logic seemed wrong.
|
I think it's generally agreed that a white list is better than a black list,
i.e. I think the first match is better, as you never know what you've missed
with the second method.
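A white list check could look something like this. It is only a sketch: the allowed set is taken from the list in post #6, and the names $allowed, is_clean and offending_chars are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Allowed characters from post #6: hyphen, point, letters, digits,
# underscore, space and backslash.
my $allowed = qr/[-.A-Za-z0-9_ \\]/;

# Hypothetical helper: true if every character of $s is on the white list.
# chomp the line first, since \z does not tolerate a trailing newline.
sub is_clean {
    my ($s) = @_;
    return $s =~ /\A$allowed*\z/ ? 1 : 0;
}

# Hypothetical helper: each distinct character that is NOT on the white
# list, in order of first appearance.
sub offending_chars {
    my ($s) = @_;
    my %seen;
    return grep { !$seen{$_}++ } $s =~ /([^-.A-Za-z0-9_ \\])/g;
}

my $line = '!20060710,NONE,GROUP_INC';
print "invalid: ", join(' ', offending_chars($line)), "\n"
    unless is_clean($line);
```

For the sample line this prints "invalid: ! ,". Anything not listed is rejected automatically, including control characters, which a black list would have to name one by one.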
|
|
|
All times are GMT -5.