LinuxQuestions.org - [SOLVED] regex match string from start to find unique combinations

fukawi2

02-11-2010 06:12 AM

Well it's late, and I'm way too inexperienced with perl/regex to figure this out on my own...

I'm writing a Perl script that accepts input (commands) from the user. I want to implement a 'closest match' scheme for that input, so that any unambiguous abbreviation of a command is accepted.

Example:
- The valid commands are 'update' and 'upload'.
- The user should be able to type 'upd', 'upda', 'update' etc. to execute the 'update' command. 'upte' is not valid.
- The user should be able to type 'upl', 'uplo' etc. to execute the 'upload' command. 'upod' is not valid.
- The input 'up' can't be matched to a unique command.

I'm using the following regex at the moment:

Code:

/^upd?a?t?e?/

/^upl?o?a?d?/

This works EXCEPT for treating 'upte' and 'upod' as matches.

I think I need something in the regex similar to ?, except that it says "match the preceding character, or match nothing and stop looking" rather than "match the preceding character, or don't".

Any ideas folks? :)

neonsignal

02-11-2010 07:08 AM

You can bracket regular expressions, eg

Code:

/^upd(a(t(e)?)?)?/

Or you could just match on the first three characters and then do a second check that what they entered matches the start of the full command string.
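
Both suggestions above can be sketched roughly as follows. This is only an illustration, not neonsignal's actual code: the sub names matches_update_regex and is_prefix_of and the $min_len parameter are made up for the example, and the trailing $ anchor is an addition (without it, input such as 'updxyz' would still match).

```perl
use strict;
use warnings;

# Nested optional groups: each later letter is allowed only if the
# earlier ones are present. The trailing $ (an addition, not in the
# post above) rejects input with extra characters such as 'updxyz'.
sub matches_update_regex {
    my ($input) = @_;
    return $input =~ /^upd(?:a(?:t(?:e)?)?)?$/ ? 1 : 0;
}

# Second approach: require at least $min_len characters, then check
# that the input is a prefix of the full command string.
# is_prefix_of() and $min_len are hypothetical names for illustration.
sub is_prefix_of {
    my ($input, $command, $min_len) = @_;
    return 0 if length($input) < $min_len;
    return index($command, $input) == 0 ? 1 : 0;
}

for my $try (qw(upd upda update upte up)) {
    printf "%-6s regex=%d prefix=%d\n",
        $try,
        matches_update_regex($try),
        is_prefix_of($try, 'update', 3);
}
```

The prefix check has the advantage that it needs no per-command regex, so adding a command only means adding a string.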

ashok.g

02-11-2010 07:14 AM

I think this will work fine for you.

Code:

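use strict;
use warnings;

# Read one line from the user and strip the trailing newline.
my $input = <STDIN> // '';
chomp $input;

# List every accepted prefix, anchored with $ so nothing longer matches.
if ($input =~ /^(upd|upda|updat|update)$/)
{
    print "UPDATE\n";
}
elsif ($input =~ /^(upl|uplo|uploa|upload)$/)
{
    print "UPLOAD\n";
}
else
{
    print "NONE\n";
}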

bartonski

02-11-2010 08:11 AM

Just out of curiosity, why aren't 'upte' and 'upod' valid matches? If it's 'closest match', anything that could uniquely match would seem to be valid.

I think that I would use a soundex algorithm, and be done with it.

jschiwal

02-11-2010 10:36 AM

You could use the patterns in case statements instead of a string of if/then/else statements.

tuxdev

02-11-2010 10:45 AM

I would consider approaching the problem from the other direction. If, say, the user typed "up", use the regex "^up.*" on each valid command. Since that regex matches more than one command, the input is ambiguous (and you can create a nice error message listing the possibilities). If the user typed "upd", then the regex "^upd.*" would match only "update", so that must be the desired command.
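
A minimal sketch of this "match the input against each command" idea might look like the following. The @commands list comes from the example in the first post; the sub names candidates and resolve are hypothetical, and the input is quoted with \Q...\E on the assumption that users may type regex metacharacters.

```perl
use strict;
use warnings;

# Hypothetical command list, taken from the example in the first post.
my @commands = qw(update upload);

# Return every command that starts with what the user typed.
# \Q...\E quotes the input so characters like '.' or '*' are literal.
sub candidates {
    my ($input) = @_;
    return () if $input eq '';
    return grep { /^\Q$input\E/ } @commands;
}

# Resolve the input to exactly one command, or explain why not.
sub resolve {
    my ($input) = @_;
    my @found = candidates($input);
    return "run: $found[0]" if @found == 1;
    return "ambiguous: @found" if @found > 1;
    return "unknown: $input";
}

print resolve($_), "\n" for qw(up upd uplo upte);
```

One case this sketch does not handle: if one command is itself a prefix of another (say 'up' and 'update'), an exact match would probably need to take priority over the ambiguity check.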

fukawi2

02-11-2010 05:32 PM

Quote:

Originally Posted by neonsignal (Post 3860243)

You can bracket regular expressions, eg

Code:

/^upd(a(t(e)?)?)?/

That was my other thought, but it didn't seem 'graceful' enough, lol

Now that I'm a bit more awake, my Googling skills are working better, and I think I've found my solution here:
http://docstore.mik.ua/orelly/perl/cookbook/ch06_21.htm
http://perldoc.perl.org/Text/Abbrev.html

Thanks for all the suggestions folks :)
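
For readers landing here later, the Text::Abbrev approach linked above could look roughly like this. It is a sketch under assumptions, not fukawi2's final script: the command list is the one from the first post, and lookup is a hypothetical helper name.

```perl
use strict;
use warnings;
use Text::Abbrev;

# abbrev() returns a hash mapping every unambiguous abbreviation
# (and each full word) to the word it expands to.
my %expand = abbrev(qw(update upload));

# lookup() is a hypothetical helper name used only in this sketch.
sub lookup {
    my ($input) = @_;
    return exists $expand{$input} ? $expand{$input} : undef;
}

for my $try (qw(upd update uplo up upte)) {
    my $command = lookup($try);
    print "$try -> ", (defined $command ? $command : 'no unique match'), "\n";
}
```

Because the table contains only unambiguous prefixes, 'up' and 'upte' simply have no entry, which covers both the ambiguity case and the invalid-input case with a single hash lookup.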

All times are GMT -5.