LinuxQuestions.org - Grep and AWK


keogk

01-04-2010 03:51 PM

Grep and AWK

I have an XML file that I want to pull a number out of.
The file looks like this:
<WEBSERVER>
<actual>
<temp>
27.855
</temp>
<timestamp>
04.01.2010
</timestamp>
</actual>

Then it starts over again, but with item 1, 2, 3, etc. instead of actual.
I just want to pull the number between <temp> and </temp>.

That term appears a few times in the document as well.
Would I use grep or awk to do this?

ozanbaba

01-04-2010 04:04 PM

Quote:

Originally Posted by keogk (Post 3813997)

Are they on the same line?

colucix

01-04-2010 04:11 PM

Quote:

Originally Posted by keogk (Post 3813997)

Would I use grep or awk to do this?

awk or sed or even perl as in:

Code:

while (<>) {
  if (/<temp>/../<\/temp>/) {
    next if /<temp>/ || /<\/temp>/;
    print;
  }
}

keogk

01-04-2010 04:13 PM

No, each piece of code is on its own line.
The number I need is always on line 4 and is always the only thing on line 4.

carolh

01-04-2010 04:15 PM

re: grep and awk

I would use both awk and grep, like this:
$ cat YourFile | awk '/<temp>/,/<\/temp>/' | grep -v temp

The awk command prints all lines between each <temp> and </temp> pair, but that includes the <temp> and </temp> lines themselves.
So:
grep -v temp
removes those lines.

syg00

01-04-2010 05:02 PM

Do it all in one call

Code:

awk '/<temp/,/<\/temp/ {if ($0 !~ /temp/) {print } }' temp.txt

Similarly, the perl above can be reduced to a one-liner.

ghostdog74

01-04-2010 06:45 PM

Quote:

Originally Posted by keogk (Post 3814022)

No, each piece of code is on its own line.
The number I need is always on line 4 and is always the only thing on line 4.

Code:

awk 'NR==4' file

ghostdog74

01-04-2010 06:47 PM

Quote:

Originally Posted by carolh (Post 3814025)

1) No need to use cat.
2) Use grep + awk on BIG files.
3) Other than that, just awk will do.

pixellany

01-04-2010 07:44 PM

Quote:

Originally Posted by keogk (Post 3814022)

No, each piece of code is on its own line.
The number I need is always on line 4 and is always the only thing on line 4.

You are contradicting what you said in the first post. If the number is always on line 4, then all you need is:

Code:

sed -n '4p' filename

If the file is in fact as you first described, then try this:

Code:

sed -n '/<temp>/,/<\/temp>/{/^[0-9]/p}' filename
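
Since the first post says <temp> shows up several times, it may help to see which block each number came from. The following is only a sketch under assumptions: it relies on the value sitting on the line right after <temp> (as the original poster describes), the name label_temps is invented here, and the "item1" block is a guess at what the rest of the file looks like.

```shell
# Hypothetical helper: print "section: value" for every <temp> reading.
label_temps() {
  awk '
    # On a <temp> line, read the next line (the number) and print it
    # together with the most recently seen opening tag.
    /^<temp>$/ { if ((getline value) > 0) print section ": " value; next }
    # Remember any other opening tag, with the angle brackets stripped.
    /^<[A-Za-z]/ { section = $0; gsub(/[<>]/, "", section) }
  ' "$@"
}

# Sample input modeled on the first post (the item1 block is assumed).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<WEBSERVER>
<actual>
<temp>
27.855
</temp>
<timestamp>
04.01.2010
</timestamp>
</actual>
<item1>
<temp>
26.1
</temp>
</item1>
</WEBSERVER>
EOF

label_temps "$sample"
# actual: 27.855
# item1: 26.1
rm -f "$sample"
```

If only the reading under <actual> is wanted, the output can be filtered further, for example with grep '^actual:'.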

sundialsvcs

01-04-2010 07:55 PM

"Tools for the job."

Perl provides a very large library of XML-support routines... all of them thoroughly tested.

Use one to read the XML file and then to apply an "XPath expression" to automagically select from it exactly the nodes that you want. Then, output the results as you please.

The Unix/Linux environments provide you with "an embarrassment of riches" in terms of "possible ways to do it." What you want to find, then, is the best way.

Quite frankly, IMHO, Perl usually is that "best way," hands down. And the reason for this is the astounding "CPAN" library.
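
To give a feel for the XPath approach without installing any CPAN modules, here is a rough sketch that swaps in a different tool, xmllint from libxml2, rather than a Perl library. It assumes xmllint is installed and that the file is well-formed XML with a closing </WEBSERVER> tag, which the first post does not show. The helper name xpath_value is made up for this example.

```shell
# Hypothetical helper: evaluate an XPath expression against an XML file.
xpath_value() {
  # $1 = XPath expression, $2 = XML file
  xmllint --xpath "$1" "$2"
}

# Sample input modeled on the first post, closed so that it parses as XML.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<WEBSERVER>
<actual>
<temp>
27.855
</temp>
<timestamp>
04.01.2010
</timestamp>
</actual>
</WEBSERVER>
EOF

if command -v xmllint >/dev/null 2>&1; then
  # normalize-space() trims the newlines that surround the number.
  xpath_value 'normalize-space(/WEBSERVER/actual/temp)' "$sample"
else
  echo "xmllint is not installed; skipping" >&2
fi
rm -f "$sample"
```

Unlike the line-based one-liners, this keeps working if the tags and the number end up on the same line, at the cost of requiring a file that actually parses.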

ghostdog74

01-04-2010 08:03 PM

Ideally, that should be the case: use libraries to do the job. But for this simple case, there's no need to. It's not that complicated a task.

All times are GMT -5.