LinuxQuestions.org
11-30-2012, 12:18 PM   #1
graphicsmanx1
Member
 
Registered: Oct 2012
Posts: 81

Rep: Reputation: Disabled
grep variable question


I'm looking at grep for a script I want to write. I need to pull values out of an XML file. After searching I found a few options and chose grep. I wrote a for loop to pull data from an .xml file, which works, but I don't need the tags, just the data.

Code:
#!/bin/bash

for X in *.xml; do

	name=$(grep -r "<name>.*<name>" *.xml)
	
	results=$(basename $name)
	
	echo "$results"
	echo "done"
done
The code works, but when it echoes the result, it prints the name tags along with the data. What can I add to get just the data without the <name> tags? I tried basename, but it doesn't work. When I searched, I found this sed command in another post, but I have no clue what's going on in it:

Code:
echo '/foo/fizzbuzz.bar' | sed 's|.*\/\([^\.]*\)\(\..*\)$|\1|g'
 
11-30-2012, 12:45 PM   #2
David the H.
Bash Guru

Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1950
Line- and regex-based programs like grep and sed do not handle formats that use free-form nested tagging, like XML or HTML, very well. You really should use a tool that has a dedicated XML parser, such as perl (with one of its XML modules) or xmlstarlet.

If you would supply an example of the xml data and what you want to extract from it, I could help you write up an xmlstarlet rule to extract it for you.

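As for the above, first of all grep only matches; it has no capture groups, so it can't pull a substring out of the middle of a match (GNU grep's -o option prints just the matched text, but here that would still include the tags). sed can do it, but without a sample of the input to work with we can only guess the expression.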

Code:
sed -rn '/name/ s|.*name>([^>]+)</name.*|\1|p' infile.xml
The above assumes that there's a single tag in the file that is formatted like this:

Code:
<name>data I want</name>
...and you want the part between them.

".*" alone can't be used because "*" is greedy, and will continue matching everything to the end of the line. You have to use a negating pattern to stop it where you want it to.

But again, this depends on the xml being regular enough for the full pattern to always exist on a single line. Again, it's much better to use a real xml parser.
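
To make the greedy-matching point concrete, here is a small sketch. The sample line is made up, and it uses sed -E (which GNU sed accepts as a synonym for -r) with [^<]+ as the negated class, a slight variation on the [^>]+ in the command above.

```shell
#!/bin/bash
# Hypothetical one-line sample with two elements on the same line.
line='<name>alice</name><id>7</id>'

# Greedy: ".*" grabs as much as it can, so the capture runs on past the
# closing </name> tag and only stops before the last "</" on the line.
greedy=$(printf '%s\n' "$line" | sed -E 's|<name>(.*)</.*|\1|')

# Negated class: "[^<]+" cannot cross a "<", so it stops at the closing tag.
bounded=$(printf '%s\n' "$line" | sed -E 's|<name>([^<]+)</name>.*|\1|')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

On this input the greedy version captures alice</name><id>7, while the bounded one captures just alice.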

Edit: To do the same with xmlstarlet, the following command should work:

Code:
xmlstarlet sel -T -t -v '//name' -n input.xml
It will extract the values of all "name" elements, and print them one per line, just like sed. But unlike sed, the exact structure of the file is immaterial.

Do note, however, that it is very picky about the input being well-formed XML.

Last edited by David the H.; 11-30-2012 at 01:01 PM. Reason: fixed sed code
 
11-30-2012, 12:54 PM   #3
graphicsmanx1
Member
 
Registered: Oct 2012
Posts: 81

Original Poster
Rep: Reputation: Disabled
Thanks. I guess I need to learn Perl too.

When I run that line, it just prints the entire file in the terminal.

Last edited by graphicsmanx1; 11-30-2012 at 12:57 PM.
 
11-30-2012, 01:16 PM   #4
David the H.
Bash Guru

Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1950
As I said, what I posted was just a guess. Please give us an actual representative example of the input text if you want to ensure code that really works.

I also updated my post to fix a mistake I made, where I forgot to include the -n in sed. That keeps it from printing the whole file by default. Perhaps you missed that.
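
For anyone wondering what -n actually changes, here is a minimal sketch. The three-line input is invented, and sed -E is used as the equivalent of GNU sed's -r.

```shell
#!/bin/sh
# Hypothetical three-line input; only the middle line holds a name element.
sample=$(printf '<client>\n<name>data I want</name>\n</client>')

# Without -n: sed auto-prints every line, and the "p" flag prints the
# substituted line a second time.
without_n=$(printf '%s\n' "$sample" | sed -E 's|.*<name>([^<]+)</name>.*|\1|p')

# With -n: auto-print is switched off, so only lines printed by "p" appear.
with_n=$(printf '%s\n' "$sample" | sed -En 's|.*<name>([^<]+)</name>.*|\1|p')

printf '%s\n' '--- without -n ---' "$without_n" '--- with -n ---' "$with_n"
```

Without -n the whole input comes back (with the extracted value printed twice); with -n only the extracted value is printed.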
 
11-30-2012, 01:22 PM   #5
graphicsmanx1
Member
 
Registered: Oct 2012
Posts: 81

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by David the H.
As I said, what I posted was just a guess. Please give us an actual representative example of the input text if you want to ensure code that really works.

I also updated my post to fix a mistake I made, where I forgot to include the -n in sed. That keeps it from printing the whole file by default. Perhaps you missed that.

No need, the -n made it work perfectly.

Code:
sed -rn '/name/ s|.*name>([^>]+)</name.*|\1|p' infile.xml
Now I just need to find out exactly what each element is doing so I can use it.

Last edited by graphicsmanx1; 11-30-2012 at 02:26 PM.
 
11-30-2012, 04:09 PM   #6
graphicsmanx1
Member
 
Registered: Oct 2012
Posts: 81

Original Poster
Rep: Reputation: Disabled
This is what I am trying to do, but it's not working:

Code:
#!/bin/bash

today=$(date +%y-%m-%d)
event=$(date +%H:%M)
for name in $(find $folders -type f -name client.xml); do
	result=$(sed -rn "/name/ s|.*name>([^>]+)</name.*|\1|p" .xml)
	echo $name,$result,$today,$event >> log.csv
done
I'm trying to search all folders: for each folder, the script should find the client.xml file and write the name it contains, along with the date and time, to a CSV file. I can run the sed line by itself, but when I run it this way sed reports in the terminal that it can't read .xml. Still trying to figure out why.
 
12-01-2012, 03:40 PM   #7
Reuti
Senior Member

Registered: Dec 2004
Location: Marburg, Germany
Distribution: openSUSE 13.1
Posts: 1,320

Rep: Reputation: 252
Quote:
Originally Posted by graphicsmanx1
Code:
result=$(sed -rn "/name/ s|.*name>([^>]+)</name.*|\1|p" .xml)
Instead of giving .xml as the argument, you need "$name", which holds the path of the file that find already located.
 
12-02-2012, 01:14 PM   #8
David the H.
Bash Guru

Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1950
I've asked a couple of times for you to give us a representative example of the xml that you are using, and exactly what you want to get from it. Without it we can only make educated guesses, but with it, and a clear understanding of your goals, we can run tests ourselves and provide you with real, working solutions.

I still say that you should probably be using a real xml parsing program instead of sed.

Also, what does the file tree look like, exactly?


BTW, for a couple more obvious errors (Reuti has already pointed out one):

Code:
for name in $(find $folders -type f -name client.xml); do
1) Don't Read Lines With For! Lines of text output from a command should always be processed with a while+read loop, and when the "lines" are filenames then it should also ideally use null-character separators.

How can I read a file (data stream, variable) line-by-line (and/or field-by-field)?
http://mywiki.wooledge.org/BashFAQ/001

2) QUOTE ALL OF YOUR VARIABLE EXPANSIONS. You should never leave the quotes off a parameter expansion unless you explicitly want the resulting string to be word-split by the shell (globbing patterns are also expanded). This is a vitally important concept in scripting, so train yourself to do it correctly now. You can learn about the exceptions later.

http://mywiki.wooledge.org/Arguments
http://mywiki.wooledge.org/WordSplitting
http://mywiki.wooledge.org/Quotes
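
Putting points 1) and 2) together with Reuti's fix, the loop might look something like this. It is only a sketch: the temporary directory, the folder names and the file contents are invented so that the example can run on its own, and the sed expression assumes each name element sits on a single line.

```shell
#!/bin/bash
# Sketch only: the directory layout and file contents below are made up.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
mkdir -p "$workdir/client a" "$workdir/client b"
printf '<client>\n<name>Alice</name>\n</client>\n' > "$workdir/client a/client.xml"
printf '<client>\n<name>Bob</name>\n</client>\n'   > "$workdir/client b/client.xml"

today=$(date +%y-%m-%d)
event=$(date +%H:%M)
log="$workdir/log.csv"

# find emits NUL-terminated paths and read -d '' consumes them one at a
# time, so spaces (or even newlines) in a folder name cannot split a path.
while IFS= read -r -d '' file; do
    # "$file" is quoted everywhere and replaces the literal ".xml" argument.
    result=$(sed -En 's|.*<name>([^<]+)</name>.*|\1|p' "$file")
    printf '%s,%s,%s,%s\n' "$file" "$result" "$today" "$event" >> "$log"
done < <(find "$workdir" -type f -name client.xml -print0)

cat "$log"
```

The folder names contain a space on purpose; with the original for loop each path would have been split into two words.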
 
12-03-2012, 10:03 AM   #9
graphicsmanx1
Member
 
Registered: Oct 2012
Posts: 81

Original Poster
Rep: Reputation: Disabled
That's just it, I don't have any XML files. I ask questions to learn, and I use files I deal with every day and try to see if I can get them to do something. I wanted to learn Linux and bash because a few people I know suggested it. I've been out of programming for over a year and I need something to practice with. I almost thought about going back to Java and learning it again.
 
12-05-2012, 08:22 PM   #10
David the H.
Bash Guru

Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1950
Ok, then. I was under the impression that you were working on a specific problem.

As a last bit of advice then, if you aren't very familiar with regular expressions, then I highly recommend putting some time into that. Being able to effectively match text patterns is vital for getting the best out of grep and sed, and they're supported by many other tools as well, including bash.
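
As a small illustration of the "including bash" part: bash's [[ ... =~ ... ]] test takes an extended regular expression and stores any captured groups in the BASH_REMATCH array. The input line here is a made-up sample.

```shell
#!/bin/bash
# Hypothetical input line, shaped like the sample earlier in the thread.
line='  <name>data I want</name>'

# Keep the pattern in a variable: quoting it on the right-hand side of =~
# would make bash treat it as a literal string, not a regular expression.
pattern='<name>([^<]+)</name>'

if [[ $line =~ $pattern ]]; then
    # BASH_REMATCH[0] is the whole match, [1] the first parenthesised group.
    value=${BASH_REMATCH[1]}
    printf '%s\n' "$value"
else
    printf 'no match\n' >&2
fi
```

The same single-line caveat applies as with sed, so this is for practising regular expressions rather than for parsing real XML.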
 
12-06-2012, 03:02 PM   #11
graphicsmanx1
Member
 
Registered: Oct 2012
Posts: 81

Original Poster
Rep: Reputation: Disabled
Do you suggest any good reads, such as links or books? I debated getting O'Reilly's awk and sed and bash book.
 
12-06-2012, 03:16 PM   #12
David the H.
Bash Guru

Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1950
I've learned most of what I know from readily-available online tutorials and hands-on experience.

Let me just post my full list of scripting reference pages:

Here are a few useful bash references:
http://mywiki.wooledge.org/BashGuide
http://wiki.bash-hackers.org/start
http://www.linuxcommand.org/index.php
http://wiki.bash-hackers.org/scripting/newbie_traps
http://mywiki.wooledge.org/BashPitfalls
http://mywiki.wooledge.org/BashFAQ
http://tldp.org/LDP/Bash-Beginners-G...tml/index.html
http://www.tldp.org/LDP/abs/html/index.html
http://www.gnu.org/software/bash/manual/bashref.html
http://ss64.com/bash/

=====
Here are a few useful sed references:
http://www.grymoire.com/Unix/Sed.html
http://sed.sourceforge.net/grabbag/
http://sed.sourceforge.net/sedfaq.html
http://sed.sourceforge.net/sed1line.txt

=====
Here are a few useful awk references:
http://www.grymoire.com/Unix/Awk.html
http://www.gnu.org/software/gawk/man...ode/index.html
http://www.pement.org/awk/awk1line.txt
http://www.catonmat.net/blog/awk-one...ined-part-one/

=====
Here are a couple of good links about using find:
http://mywiki.wooledge.org/UsingFind
http://www.grymoire.com/Unix/Find.html

=====
Here are a few regular expressions tutorials:
http://mywiki.wooledge.org/RegularExpression
http://www.grymoire.com/Unix/Regular.html
http://www.regular-expressions.info/

=====
How to use ed:
http://wiki.bash-hackers.org/howto/edit-ed
http://snap.nlc.dcccd.edu/learn/nlc/ed.html
(also read the info page)

=====
wget tutorial:
http://www.thegeekstuff.com/2009/09/...esome-examples


Just google if you want more.

Last edited by David the H.; 12-06-2012 at 03:18 PM. Reason: expanded references
 
  

