LinuxQuestions.org
12-12-2011, 07:59 PM   #1
zski128
LQ Newbie
 
Registered: Dec 2011
Posts: 2

Rep: Reputation: Disabled
sed and regex help


Hello,
I am in need of some regex help. I need to pull a string of digits out of a single line.

The input can look like this, these are 3 separate examples:
o.text text 336 09-Dec-11 13:33:10
o.text text 3350126 09-Dec-11 13:33:10
o.texttext text 30473 09-Dec-11 13:33:10

I need to pull out the middle number, 336, 3350126, 30473

I am close, I think, here is the command I am running:

Code:
sed -r 's/([0-9]+).*/\1/'
Any help would be greatly appreciated!!
 
12-12-2011, 08:21 PM   #2
corp769
Guru
 
Registered: Apr 2005
Posts: 5,807

Rep: Reputation: 995
Honestly, if the fields do not change, you could use awk to extract the data, like so:
Code:
cat filename | awk '{ print $3 }'
Where filename is the name of the file that holds the data.

Cheers,

Josh
 
12-12-2011, 08:40 PM   #3
danielbmartin
Senior Member
 
Registered: Apr 2010
Location: Apex, NC, USA
Distribution: Ubuntu
Posts: 1,057

Rep: Reputation: 284
Quote:
Originally Posted by zski128 View Post
The input can look like this, these are 3 separate examples:
o.text text 336 09-Dec-11 13:33:10
o.text text 3350126 09-Dec-11 13:33:10
o.texttext text 30473 09-Dec-11 13:33:10

I need to pull out the middle number, 336, 3350126, 30473
It appears you always want the third field, and your fields are delimited by a single blank. Consider using "cut".

Code:
cut -d' ' -f3 < InFile
Daniel B. Martin
 
12-12-2011, 11:08 PM   #4
Telengard
Member
 
Registered: Apr 2007
Location: USA
Distribution: Kubuntu 8.04
Posts: 579
Blog Entries: 8

Rep: Reputation: 147
zski128, are your fields separated by single space characters?

Code:
test$ cat input-file
o.text text 336 09-Dec-11 13:33:10
o.text text 3350126 09-Dec-11 13:33:10
o.texttext text 30473 09-Dec-11 13:33:10
Quote:
Originally Posted by corp769 View Post
Code:
cat filename | awk '{ print $3 }'
It should work without cat.

Code:
test$ awk '{print $3}' input-file
336
3350126
30473
test$
Quote:
Originally Posted by danielbmartin View Post
Code:
cut -d' ' -f3 < InFile
I tried without redirection, and it seemed to work.

Code:
test$ cut -d' ' -f3 input-file
336
3350126
30473
test$
A Bash loop worked too.

Code:
test$ while read -r -a array; do echo "${array[2]}"; done < input-file
336
3350126
30473
test$
As for sed, when I tried the given program I got this.

Code:
test$ sed -r 's/([0-9]+).*/\1/' input-file
o.text text 336
o.text text 3350126
o.texttext text 30473
What seems to be happening is that the regex only starts matching at the first digit character, so everything before it is left untouched and only the text from the number onward is replaced by the backreference. So I decided to also match all characters preceding the number, outside the capture group.

Code:
test$ sed -r 's/.+ ([0-9]+) .+/\1/' input-file
336
3350126
30473
test$
zski128, is that what you wanted?
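
To make the difference between the two patterns concrete, here is a minimal sketch. The sample line (with extra spaces added) and the variable names are hypothetical, and `-E` is used as the portable spelling of GNU sed's `-r`. The anchored pattern is only one possible variant, assuming the number is always the third whitespace-separated field:

```shell
# Hypothetical sample line with variable whitespace between fields.
line='o.text   text    336  09-Dec-11 13:33:10'

# Original attempt: the match starts at the first digit, so the leading
# text is left in place and only the tail of the line is replaced.
original=$(printf '%s\n' "$line" | sed -E 's/([0-9]+).*/\1/')

# Anchored variant: skip two non-blank fields, capture the third if it is
# all digits, and discard the rest of the line.
anchored=$(printf '%s\n' "$line" |
  sed -E 's/^[^[:space:]]+[[:space:]]+[^[:space:]]+[[:space:]]+([0-9]+)[[:space:]].*$/\1/')

printf '%s\n' "$original"   # leading fields are still present
printf '%s\n' "$anchored"   # only the captured number remains
```

Anchoring at `^` and consuming the leading fields explicitly avoids relying on how the greedy `.+` happens to backtrack, which may matter if other fields could also be purely numeric.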
 
12-13-2011, 06:43 AM   #5
zski128
LQ Newbie
 
Registered: Dec 2011
Posts: 2

Original Poster
Rep: Reputation: Disabled
Quote:
What seems to be happening is that the regex is only matching text from the first digit character on (the backreference). So I decided to match all characters preceding the first digit outside the backreference.

Code:

test$ sed -r 's/.+ ([0-9]+) .+/\1/' input-file
336
3350126
30473
test$
Thanks! The first reply with awk is much simpler; however, thanks for the regex, I see where I was going wrong. The cut command would not work in my case: there is a variable amount of whitespace between the strings that was stripped out when I posted this thread, sorry about that.
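
For anyone who hits the same problem, here is a rough sketch of why cut fails on this kind of input and how awk avoids the issue. The sample data and variable names are made up for illustration, and the `tr -s ' '` workaround assumes the separators are spaces only (it does not squeeze tabs):

```shell
# Hypothetical sample with runs of spaces between fields.
data='o.text   text      336  09-Dec-11 13:33:10
o.texttext text  30473   09-Dec-11 13:33:10'

# cut treats every single space as a delimiter, so field 3 is empty here.
with_cut=$(printf '%s\n' "$data" | cut -d' ' -f3)

# Squeezing repeated spaces first makes the fields line up again.
with_tr=$(printf '%s\n' "$data" | tr -s ' ' | cut -d' ' -f3)

# awk splits on runs of blanks by default, so no preprocessing is needed.
with_awk=$(printf '%s\n' "$data" | awk '{ print $3 }')

printf '%s\n' "$with_tr"
printf '%s\n' "$with_awk"
```

The awk form is probably the simplest choice when the amount of whitespace is unpredictable, since its default field splitting already collapses runs of spaces and tabs.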
 
12-13-2011, 10:30 AM   #6
Telengard
Member
 
Registered: Apr 2007
Location: USA
Distribution: Kubuntu 8.04
Posts: 579
Blog Entries: 8

Rep: Reputation: 147
Quote:
Originally Posted by zski128 View Post
The cut command would not work in my case, there is a variable amount of white space between the strings that was stripped out when I posted this thread, sorry about that.
That is one reason you should enclose both code and data blocks in code tags: doing so preserves the whitespace.
 
  



Tags
awk, cut, regular expression, sed

