LinuxQuestions.org
Old 02-28-2011, 04:45 PM   #1
FireRaven
Member
 
Registered: Apr 2006
Location: Australia
Distribution: Debian Squeeze
Posts: 135

Rep: Reputation: 18
Can GREP be used for this?


I have a file for example:
Code:
$ cat /tmp/text.txt
http://www.google.com.au/search?sourceid=chrome&ie=UTF-8&q=test+query
I want to return the string "chrome" out of that file.
Will I need to use grep with backreferences? I want grep to return the word "chrome" to stdout.
How do I do it?
 
Old 02-28-2011, 04:53 PM   #2
arizonagroovejet
Senior Member
 
Registered: Jun 2005
Location: England
Distribution: openSUSE, Fedora, CentOS
Posts: 1,078

Rep: Reputation: 195
grep returns lines.

Code:
$ grep chrome /tmp/text.txt
will return any lines in /tmp/text.txt that contain the word chrome.

What exactly is it that you are trying to achieve?
 
Old 02-28-2011, 04:55 PM   #3
corp769
LQ Guru
 
Registered: Apr 2005
Posts: 5,817

Rep: Reputation: 1002
Came across this link:

http://stackoverflow.com/questions/4...ng-grep-or-sed

There are quite a few ways to do what you are after. Would anything from that page help?
 
Old 02-28-2011, 04:59 PM   #4
FireRaven
Member
 
Registered: Apr 2006
Location: Australia
Distribution: Debian Squeeze
Posts: 135

Original Poster
Rep: Reputation: 18
Quote:
Originally Posted by arizonagroovejet View Post
grep returns lines.

Code:
$ grep chrome /tmp/text.txt
will return any lines in /tmp/text.txt that contain the word chrome.

What exactly is it that you are trying to achieve?
What I want is the shell to do something like this:
Code:
$ grep 'SOMEREGEX' /tmp/text.txt
chrome
$
See how the grep command returned only the word "chrome"?
 
Old 02-28-2011, 05:04 PM   #5
arizonagroovejet
Senior Member
 
Registered: Jun 2005
Location: England
Distribution: openSUSE, Fedora, CentOS
Posts: 1,078

Rep: Reputation: 195
OK, if you just want to return the word chrome, that's pointless. You know it's chrome, so why extract it? What I was getting at is what you want to do in general terms. For example, are you dealing with URLs of this type that may have various values for sourceid, and you want to get that value? If so, then both of these methods work



Code:
$ grep sourceid /tmp/text.txt  | cut -d '?' -f 2 | cut -d '&' -f 1 | cut -d '=' -f 2
chrome

Code:
$ grep sourceid /tmp/text.txt | cut -d '?' -f 2    | sed 's/sourceid=\([^&]*\).*/\1/'
chrome
but are dependent upon the format of the string
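One way to loosen that dependence on parameter order, as a rough sketch only: split the query string into one key=value pair per line and keep the pair named sourceid. The function name get_sourceid and the second sample URL (with sourceid in the middle) are made up for illustration; only the URL format comes from the thread.

```shell
#!/bin/sh
# Hypothetical helper: print the sourceid value from each URL in the file "$1".
# Assumes one URL per line and no '#fragment' after the query string.
get_sourceid() {
    grep 'sourceid=' "$1" |
        cut -d '?' -f 2- |      # keep everything after the first '?'
        tr '&' '\n' |           # one key=value pair per line
        grep '^sourceid=' |     # keep only the sourceid pair
        cut -d '=' -f 2-        # drop the key, keep the value
}

# Sample data: the second URL has sourceid in the middle of the query string.
sample=$(mktemp)
cat > "$sample" <<'EOF'
http://www.google.com.au/search?sourceid=chrome&ie=UTF-8&q=test+query
http://anything.com/search?ie=UTF-8&sourceid=firefox&morequeries=123
EOF

get_sourceid "$sample"
```

This still assumes that '?' starts the query string and '&' separates the parameters, so it trades one format assumption for a weaker one rather than removing it.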
 
Old 02-28-2011, 05:10 PM   #6
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,811
Blog Entries: 1

Rep: Reputation: 1191
Well, the first option that comes to my mind is

Code:
sed 's/\(.*\)\(chrome\)\(.*\)/\2/g' file.txt
but that's not grep!

Additionally, arizonagroovejet is right about it being somewhat pointless. If you know it's always going to be 'chrome', there's no need to extract it.

Last edited by sycamorex; 02-28-2011 at 05:13 PM.
 
Old 02-28-2011, 05:13 PM   #7
FireRaven
Member
 
Registered: Apr 2006
Location: Australia
Distribution: Debian Squeeze
Posts: 135

Original Poster
Rep: Reputation: 18
Let me start again. I have a file with these contents:
Code:
http://www.google.com.au/search?sourceid=chrome&ie=UTF-8&q=test+query
http://www.google.com.au/search?sourceid=firefox&ie=UTF-8&q=test+query
http://anything.com/search?sourceid=chrome&ie=UTF-8&morequeries=123
http://anything.com/search?sourceid=ie&ie=UTF-8&morequeries=123
I want a command (grep or sed) to return something like this:
Code:
$ grep 'SOMEREGEX' text.txt
chrome
firefox
chrome
ie
$
 
Old 02-28-2011, 05:14 PM   #8
arizonagroovejet
Senior Member
 
Registered: Jun 2005
Location: England
Distribution: openSUSE, Fedora, CentOS
Posts: 1,078

Rep: Reputation: 195
Quote:
Originally Posted by FireRaven View Post
I want a command (grep or sed) to return something like this:
Code:
$ grep 'SOMEREGEX' text.txt
chrome
firefox
chrome
ie
$
Both the commands I just posted do that. You can't do it with grep on its own.
 
Old 02-28-2011, 05:18 PM   #9
Dark_Helmet
Senior Member
 
Registered: Jan 2003
Posts: 2,786

Rep: Reputation: 370
Quote:
Originally Posted by FireRaven
I want grep to return the word "chrome" to stdout.
The only option in the grep man page that approaches what you want is:
Quote:
Code:
       -o, --only-matching
              Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
So, in theory, yes you could, but your regex won't be able to work the way you want. There is no text inside the word "chrome" to differentiate it from any other text. If you try to pull in "sourceid", then grep's output will include "sourceid"--which is not what you want.

The back references won't help you because, if you can back reference something, that means it matched in the regular expression. If it matched in the regular expression, then grep will print it. Therefore, the only back reference you can use would be to reference the match itself. If you're back referencing the match, you don't need the back reference to begin with.

I think the tool you are really looking for is sed--as demonstrated by sycamorex.

EDIT:
Given the sample data above, the only thing I can think of would be:
Code:
$ grep -o "chrome\|firefox\|ie" sample_data.txt
chrome
ie
firefox
ie
chrome
ie
ie
ie
ie
ie
And the results illustrate the problem with searching for a string as short as "ie": it also matches inside "ie=UTF-8" and "morequeries".
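A possible middle ground, offered as a sketch rather than a definitive answer: let grep -o print "sourceid=value" and strip the key afterwards with cut. The second function assumes GNU grep built with PCRE support (-P), where \K discards the text matched so far; that option is not available in every grep, so it is only tried after a quick check. The function names are made up for illustration, and the file contents are FireRaven's four sample lines.

```shell
#!/bin/sh
# Sample data copied from the thread.
urls=$(mktemp)
cat > "$urls" <<'EOF'
http://www.google.com.au/search?sourceid=chrome&ie=UTF-8&q=test+query
http://www.google.com.au/search?sourceid=firefox&ie=UTF-8&q=test+query
http://anything.com/search?sourceid=chrome&ie=UTF-8&morequeries=123
http://anything.com/search?sourceid=ie&ie=UTF-8&morequeries=123
EOF

# Two-step variant: -o prints "sourceid=<value>", cut then drops the key.
sourceid_plain() {
    grep -o 'sourceid=[^&]*' "$1" | cut -d '=' -f 2-
}

# GNU-only variant (assumes grep was built with PCRE support):
# \K drops "sourceid=" from the reported match, so no second command is needed.
sourceid_pcre() {
    grep -oP 'sourceid=\K[^&]*' "$1"
}

sourceid_plain "$urls"

# Only try the PCRE variant where this grep accepts -P.
if echo 'a=b' | grep -qP 'a=\Kb' 2>/dev/null; then
    sourceid_pcre "$urls"
fi
```

Because the pattern is tied to the "sourceid=" key, the short value "ie" no longer picks up the stray matches shown above.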

Last edited by Dark_Helmet; 02-28-2011 at 05:22 PM.
 
Old 02-28-2011, 05:24 PM   #10
FireRaven
Member
 
Registered: Apr 2006
Location: Australia
Distribution: Debian Squeeze
Posts: 135

Original Poster
Rep: Reputation: 18
Quote:
Originally Posted by sycamorex View Post
Well, the first option that comes to my mind is

Code:
sed 's/\(.*\)\(chrome\)\(.*\)/\2/g' file.txt
but that's not grep!

Additionally, arizonagroovejet is right about it being somewhat pointless. If you know it's always going to be 'chrome', there's no need to extract it.
This is almost right, but is there a way to avoid hardcoding "chrome" in there? The query string value might be different.

Note that "sourceid=" will always be the same though.
 
Old 02-28-2011, 05:26 PM   #11
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,811
Blog Entries: 1

Rep: Reputation: 1191
Sed only option should work as well.
Code:
sed 's/\(.*sourceid=\)\(.[^&]*\)\(.*\)/\2/g' file.txt
 
Old 02-28-2011, 05:29 PM   #12
arizonagroovejet
Senior Member
 
Registered: Jun 2005
Location: England
Distribution: openSUSE, Fedora, CentOS
Posts: 1,078

Rep: Reputation: 195
My 3rd and 4th solutions...

Code:
$ grep sourceid /tmp/text.txt  | sed 's/[^?]*?sourceid=\([^&]*\).*/\1/'
Or if you know that all the lines will contain sourceid you could dispense with grep

Code:
$ sed 's/[^?]*?sourceid=\([^&]*\).*/\1/' /tmp/text.txt
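If some lines might not contain sourceid, grep can still be dropped by having sed print only the lines where the substitution succeeded. The following is a minimal sketch under assumptions: the function name extract_sourceid, the example.com line, and the URL with sourceid as second parameter are invented for illustration, and the pattern is a variation that accepts either ? or & in front of sourceid rather than the exact expression above.

```shell
#!/bin/sh
# Hypothetical helper: print the sourceid value for every line of "$1" that
# has one, and print nothing for lines that do not.
extract_sourceid() {
    # -n suppresses sed's automatic printing of every line;
    # the trailing p prints a line only if the substitution was made.
    sed -n 's/.*[?&]sourceid=\([^&]*\).*/\1/p' "$1"
}

# Sample data: the middle line has no query string at all.
mixed=$(mktemp)
cat > "$mixed" <<'EOF'
http://www.google.com.au/search?sourceid=chrome&ie=UTF-8&q=test+query
http://example.com/no-query-string-here
http://anything.com/search?ie=UTF-8&sourceid=ie&morequeries=123
EOF

extract_sourceid "$mixed"
```

Without -n and p, sed would echo the non-matching line unchanged, which is why the earlier version needed grep in front of it.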
 
Old 02-28-2011, 05:34 PM   #13
FireRaven
Member
 
Registered: Apr 2006
Location: Australia
Distribution: Debian Squeeze
Posts: 135

Original Poster
Rep: Reputation: 18
Quote:
Originally Posted by sycamorex View Post
Sed only option should work as well.
Code:
sed 's/\(.*sourceid=\)\(.[^&]*\)\(.*\)/\2/g' file.txt
Yes that is it! Thanks everyone for the help.
 
Old 02-28-2011, 05:35 PM   #14
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,811
Blog Entries: 1

Rep: Reputation: 1191
Yep, my version is unnecessarily complicated. The above post from arizona is a clearer/shorter solution.
 
Old 02-28-2011, 05:41 PM   #15
arizonagroovejet
Senior Member
 
Registered: Jun 2005
Location: England
Distribution: openSUSE, Fedora, CentOS
Posts: 1,078

Rep: Reputation: 195
Quote:
Originally Posted by sycamorex View Post
The above post from arizona is a clearer/shorter solution.

Although, mine assumes that there is a ? immediately before sourceid and yours doesn't.

But then that's the thing about problems like this: so many possible solutions.
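The difference between the two sed expressions shows up on a URL where sourceid is not the first parameter. Here is a small sketch to try it; the URL and the function names anchored and unanchored are made up for illustration, while the two sed expressions are the ones posted in this thread.

```shell
#!/bin/sh
# A made-up URL in which sourceid is not the first query parameter.
url='http://anything.com/search?ie=UTF-8&sourceid=opera&morequeries=123'

# arizonagroovejet's expression: requires "?sourceid=" literally.
anchored() {
    printf '%s\n' "$1" | sed 's/[^?]*?sourceid=\([^&]*\).*/\1/'
}

# sycamorex's expression: accepts anything before "sourceid=".
unanchored() {
    printf '%s\n' "$1" | sed 's/\(.*sourceid=\)\(.[^&]*\)\(.*\)/\2/g'
}

anchored "$url"     # no match, so sed prints the line unchanged
unanchored "$url"   # prints: opera
```

Which behaviour is preferable depends on the data: the anchored form refuses to guess when the URL has an unexpected shape, while the unanchored form would also accept a parameter whose name merely ends in "sourceid".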
 
  

