LinuxQuestions.org
Old 10-23-2022, 10:09 AM   #1
aristosv
Member
 
Registered: Dec 2014
Posts: 263

Rep: Reputation: 3
Regular expression improvement to detect a specific set of numbers


I have a list of strings containing text and numbers

Code:
test 1 99 435 18 1 more text
test 2 97 123 1 81 more text2
test 3 96 4 3 5567 more text3
test 4 99 43 5181 more text4
I am using this regular expression to extract numbers that start with 9, followed by 4, 5, 6, 7 or 9, followed by 6 more digits, where the digits may be separated by spaces

Code:
9*[45679]( *[0-9]){6}
So basically I want to end up with something like this

Code:
99435181
97123181
96435567
99435181
I was told that the regular expression I'm using is not good enough though because it will also match 5 8 8 8 8 8 8 and 6888888.

Can you suggest an improved regular expression?
Thanks
 
Old 10-23-2022, 11:55 AM   #2
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,616

Rep: Reputation: 2555

The answer appears to be in this thread: https://www.linuxquestions.org/questions/programming-9/tell-regex-to-run-as-if-there-were-no-spaces-4175712277

 
1 members found this post helpful.
Old 10-23-2022, 11:58 AM   #3
aristosv
Member
 
Registered: Dec 2014
Posts: 263

Original Poster
Rep: Reputation: 3
No, that thread was just about removing spaces. I know I can't do that with regex. This one is about improving how the pattern matches.
 
Old 10-23-2022, 12:13 PM   #4
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,616

Rep: Reputation: 2555

Review that thread and compare the patterns to what you have.

If that isn't enough for you to figure it out, spend some time learning the basics of regex - in particular what "*" means.

 
Old 10-26-2022, 12:34 AM   #5
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,816

Rep: Reputation: 1211
You are missing a space that was given in your earlier thread
Code:
9 *[45679]( *[0-9]){6}
Without the first space, "9*" means a 9 repeated any number of times - even zero times.
With the space, "9 *" means a 9 followed by any number of spaces (including none).
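To see the difference in practice, here is a minimal sketch, assuming Python's re module (the thread does not say which tool is being used, and the names loose, strict and samples are only for illustration). It runs both patterns against the two false positives from the first post and one of the real test strings:

```python
import re

# Pattern from the first post: "9*" means zero or more 9s.
loose = re.compile(r"9*[45679]( *[0-9]){6}")
# Pattern with the space restored: one 9, then zero or more spaces.
strict = re.compile(r"9 *[45679]( *[0-9]){6}")

# Two strings that should NOT match, and one that should.
samples = ["5 8 8 8 8 8 8", "6888888", "99 435 18 1"]

for s in samples:
    # Print whether each pattern finds a match anywhere in the string.
    print(repr(s), bool(loose.search(s)), bool(strict.search(s)))
```

The loose pattern should report a match for all three strings, while the strict one should only match the last.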
 
Old 10-26-2022, 04:44 AM   #6
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,011

Rep: Reputation: 3194
Maybe you could look at something like https://rubular.com/ and test to your heart's content until you learn what some of those symbols mean.
 
Old 11-07-2022, 04:02 AM   #7
lucmove
Senior Member
 
Registered: Aug 2005
Location: Brazil
Distribution: Debian
Posts: 1,434

Rep: Reputation: 110
The error is right at the beginning:

Code:
9*[45679]( *[0-9]){6}
9* matches zero or more 9s, so the leading 9 is optional. The match may or may not start with a 9.

Note that the "4" in "test 4" is matched incorrectly because of that.
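A quick way to check this, sketched here with Python's re module (an assumption, any regex tester would do), is to print what the original pattern actually grabs from the fourth test line:

```python
import re

line = "test 4 99 43 5181 more text4"

# Original pattern: the leading 9 is optional, so the match can start
# at the "4" that belongs to "test 4".
m = re.search(r"9*[45679]( *[0-9]){6}", line)

print(m.group(0))  # prints: 4 99 43 51
```

The match starts at the stray 4 and stops two digits short of the real number.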

This regex works better:
Code:
9[45679]( *[0-9]){6}
But I'm not too sure because the specification is not clear. When you say "followed by 6 more numbers," the distribution of numbers and spaces is not predictable. You provide four different models in your test lines.

Maybe this will work even better:

Code:
9[45679] [0-9 ]{7,8}
But again, I can't be sure because I'm not sure of the specification.
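For what it's worth, here is a rough sketch of the first suggestion applied to the four test lines, assuming Python and assuming the spaces are stripped in a separate step after matching (the helper name extract is made up for this example):

```python
import re

# First suggested pattern: a 9, one of 4/5/6/7/9, then six more digits,
# each optionally preceded by spaces.
PATTERN = re.compile(r"9[45679]( *[0-9]){6}")

def extract(line):
    """Return the matched number with spaces removed, or None if no match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    # The regex only finds the span; removing the spaces is a separate step.
    return m.group(0).replace(" ", "")

lines = [
    "test 1 99 435 18 1 more text",
    "test 2 97 123 1 81 more text2",
    "test 3 96 4 3 5567 more text3",
    "test 4 99 43 5181 more text4",
]

results = [extract(line) for line in lines]
for r in results:
    print(r)
```

On these four lines it should print the same list given in the first post. Whether it holds up on the real data depends on the specification, as noted above.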

Last edited by lucmove; 11-07-2022 at 04:03 AM.
 
  


All times are GMT -5. The time now is 10:24 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration