LinuxQuestions.org
04-07-2005, 08:31 PM   #1
redhatbeatswin
LQ Newbie
Registered: Jul 2004
Posts: 15
need your help with GREP!


Hello!

Please help me with this. I'm trying to use grep to find tags in a file called mike.txt.

Within the file I need the information between these two tags:

<mike>
</mike>

AND

the data between <mike2> and </mike2>.

I also need to output the results to a file.

In other words, my file named mike.txt has a bunch of tags, and there ARE other tags between the </mike> and <mike2> tags. I don't want any of the data in between those, just the data I noted above.

Help!!
 
04-07-2005, 11:22 PM   #2
bigrigdriver
LQ Addict
Registered: Jul 2002
Location: East Centra Illinois, USA
Distribution: Debian stable
Posts: 5,908
While grep can search files for a regular expression (read: search term), it can't do the kind of selection you want on its own. You would probably have to use sed or awk to do the selection. So, use grep to find the beginning tag, awk/sed to select what's between the beginning and ending tags, and either print to screen (stdout) or pipe the output through tee to print to a file.
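A minimal sketch of the awk approach described above, assuming each tag sits on a line of its own, as in the question. The function name extract_block is made up for illustration and is not part of any standard tool.

```shell
#!/bin/sh
# extract_block TAG FILE
# Print the lines strictly between <TAG> and </TAG> (the tag lines are not printed).
# Assumption: each tag is alone on its line, with no surrounding whitespace.
extract_block() {
    tag=$1
    file=$2
    awk -v otag="<$tag>" -v ctag="</$tag>" '
        $0 == ctag { inside = 0 }   # closing tag: stop before printing it
        inside     { print }        # print every line while inside the block
        $0 == otag { inside = 1 }   # opening tag: start with the next line
    ' "$file"
}
```

To get both blocks into one file, something like `{ extract_block mike mike.txt; extract_block mike2 mike.txt; } > results.txt` should do it (results.txt is just an assumed name). Appending `| tee results.txt` instead of the redirect would show the output on screen as well.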
 
04-08-2005, 01:55 AM   #3
ginetta
LQ Newbie
Registered: Nov 2004
Location: Canada
Posts: 28
grepping like grampa

If you have a file that has only the tags <mike> and </mike>
you could get away with 'grep -v mike'.
The -v switch tells grep to print only the lines that don't contain mike
(which also drops any data line that happens to contain "mike").
Of course, if your file has other tags too you may want to approach
it differently.

For example, if you know that <mike> is always going to appear
on line 25 and the data will only ever be ten lines in length, then
you can specify explicitly the lines you want to extract with a tool
like sed:

sed -n '26,35p' rawtaggedfile > mikes.file

mikes.file being the output file, used in place of stdout.

Statically located data has its uses and this would be a fine example.


If the tags are at the beginning and the end of a single line then you can
use sed again to delete the first 6 characters (<mike>) and the last 7
characters (</mike>) like this:

sed 's/^......//' rawtaggedfile > tmp1
sed 's/.......$//' tmp1 > mikes.file

or something similar. Piping the output of the first sed straight into the
second should work just as well; I have had trouble with that in the past,
so I use a tmp file and then delete it.
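
The two substitutions above can also be written as a single sed call that names the tags instead of counting characters, which avoids the tmp file. This is only a sketch: the function name strip_mike_tags is invented for the example, and it assumes the opening tag starts the line and the closing tag ends it.

```shell
#!/bin/sh
# strip_mike_tags FILE
# Remove a leading <mike> and a trailing </mike> from every line of FILE.
# Lines without those tags pass through unchanged.
strip_mike_tags() {
    # "|" is used as the s-command delimiter so the "/" in </mike> needs no escaping.
    sed -e 's|^<mike>||' -e 's|</mike>$||' "$1"
}
```

Usage would look like `strip_mike_tags rawtaggedfile > mikes.file`.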

G

Last edited by ginetta; 04-08-2005 at 01:56 AM.
 
04-08-2005, 09:40 AM   #4
nilleso
Member
Registered: Nov 2004
Location: ON, CANADA
Distribution: ubuntu, RHAS, and other unmentionables
Posts: 372
Actually this is quite easy, but you will want to use cgrep and not grep.
cgrep has this very ability built in via delimiters.

In your case I believe you will use:
Code:
cgrep -w"<mike>" +w"<\/mike>"  mike.txt > filename
The -w is the top delimiter and the +w is the bottom delimiter. The output will be everything in between (incl. the delimiters themselves). One note: you may need to use a \ in front of each < and > to indicate that these are just regular characters.
Cheers
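
If cgrep is not installed, a similar delimiter-based selection can be sketched with plain sed address ranges. To be clear, this is a swapped-in technique and not cgrep itself; the function name print_tagged_blocks is hypothetical, and the sketch assumes the blocks are not nested inside each other.

```shell
#!/bin/sh
# print_tagged_blocks FILE
# Print the <mike>...</mike> and <mike2>...</mike2> blocks, delimiter lines included.
# Everything outside those two ranges is suppressed by -n.
print_tagged_blocks() {
    # "<mike>" cannot match "<mike2>" because the ">" must follow "mike" directly.
    sed -n -e '/<mike>/,/<\/mike>/p' -e '/<mike2>/,/<\/mike2>/p' "$1"
}
```

Redirecting works the same way as in the cgrep command: `print_tagged_blocks mike.txt > filename`.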

Last edited by nilleso; 04-08-2005 at 10:29 AM.
 
04-08-2005, 12:16 PM   #5
ginetta
LQ Newbie
Registered: Nov 2004
Location: Canada
Posts: 28
cgrep

Looks like I just found myself a new tool to learn all about - cheers nilleso

G
 
04-08-2005, 12:27 PM   #6
ginetta
LQ Newbie
Registered: Nov 2004
Location: Canada
Posts: 28
I looked up cgrep.
---quote---
cgrep is a context-grep Perl script for showing the given string with several lines of surrounding text.
---end---
There is also a cgrep.sed.
---quote---
cgrep.sed is a context-grep sed script for showing the given string with several lines of surrounding text. It can also match a pattern that's spread across several lines.
---end---
Does anyone have any experience with the latter?

G
 
04-08-2005, 03:23 PM   #7
nilleso
Member
Registered: Nov 2004
Location: ON, CANADA
Distribution: ubuntu, RHAS, and other unmentionables
Posts: 372
cgrep isn't a Perl script. It's a C binary developed by Bell Labs (Lucent)... I think what you have come across are Perl scripts that use cgrep internally.
cgrep's basic functionality already allows matching "a pattern that's spread across several lines". It is excellent for parsing log files in which error messages span multiple lines.

check out this link for cgrep info
and the following to get the source

Hope you find it useful
 
04-08-2005, 03:29 PM   #8
ginetta
LQ Newbie
Registered: Nov 2004
Location: Canada
Posts: 28
Just goes to show... don't believe everything you google!! lol

Thanks Nilleso

Ginetta
 
04-08-2005, 03:42 PM   #9
nilleso
Member
Registered: Nov 2004
Location: ON, CANADA
Distribution: ubuntu, RHAS, and other unmentionables
Posts: 372
Here is the current man page (Description Section) for more info:

DESCRIPTION
cgrep provides all the features of grep, egrep, and fgrep, with greatly enhanced performance (see the section on PERFORMANCE) along with many additional features, one of which is the ability to output the context (surrounding lines) of the matching lines. The use of cgrep is upward-compatible with that of grep, egrep (using the -E option), or fgrep (using the -F option). cgrep searches files for lines matching patterns and normally sends to standard output matching lines, possibly with a user-specified context window. The window may be specified as a constant number of lines before and after the matching lines (the number of context lines before the matching lines and the number after the matching lines may differ); or by specifying beginning and ending delimiters (these may differ); or as any combination thereof. The windows need not be delimited at the nearest occurrence of delimiters, as any number of matches to the beginning and ending delimiters may be independently specified. The lines delimiting the beginning or end of the window may independently be either included in or excluded from the window.

By default, the patterns and delimiters are taken to be limited regular expressions as in grep; however, full regular expressions as in egrep, or fixed strings as in fgrep, may also be used.

More than one pattern or delimiter can be specified by enclosing the entire set of patterns or delimiters within quotes and separating individual patterns or delimiters with newlines. More than one pattern or delimiter can also be specified by using egrep mode (the -E option) and separating individual patterns or delimiters with `|'. More than one pattern can also be specified by using the -f option and listing the patterns, one per line in a file. patterns can also be specified dynamically from the input itself by use of the -t or +t option. cgrep also provides two special-purpose options, -R and -T, for scanning 5ESS(R) ROP output.

If no files are specified, standard input is assumed.

cgrep allows restricting matches to whole words or phrases. The section, WORD MATCHING, explains this in more detail.

cgrep supports approximate matching (matching with mismatches allowed). The description of the -A option explains approximate matching in more detail.

Unlike grep, egrep, or fgrep, cgrep allows the matching of patterns, delimiters, or trail_patterns that may span multiple lines of text through the use of literal newline characters. The section, MULTI-LINE MATCHING, explains this in more detail.

cgrep also supports viewpathing. The section, VIEWPATHING, explains this in more detail.
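
For anyone without cgrep, the fixed-size context window described above has a rough counterpart in GNU grep's -B (lines before) and -A (lines after) options. The sketch below uses GNU grep, not cgrep, and does not cover the delimiter-based windows; the function name show_context and the window sizes are arbitrary choices for illustration.

```shell
#!/bin/sh
# show_context PATTERN FILE
# Print each matching line with 1 line of context before it and 2 lines after it.
# Note: -B and -A are GNU/BSD grep extensions, not part of POSIX grep.
show_context() {
    grep -B 1 -A 2 -e "$1" "$2"
}
```

Something like `show_context 'ERROR' app.log` (app.log being an assumed file name) would then show each error line together with its neighbours.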
 
04-08-2005, 03:49 PM   #10
ginetta
LQ Newbie
Registered: Nov 2004
Location: Canada
Posts: 28
Yeah, I re-researched, just a little more thoroughly and have a good grasp of it now.

Thanks for sharing -- nice to see people still like to follow through with as much as possible when they take on the role of tutor :-)

G is for Grasshopper
 