04-17-2006, 03:32 AM
|
#1
|
|
Member
Registered: Nov 2004
Location: Noreg
Distribution: ubuntu
Posts: 107
Rep:
|
how to use sed to redirect only pattern match to file (not entire line)
Hi, I'm trying to figure out how to use sed and am having some problems. I have this file:
tcxmlmeldinger/innfraergo/TC-20060413121638410.xml
The whole XML file is on one line, so I can't grep out the text I want (grrr). When I try the w command in sed:
Code:
sed -e '/Varsling_[0-9]+/w tmp' tcxmlmeldinger/innfraergo/TC-20060413121638410.xml
it writes the whole matching line (in this case, the entire file) to tmp. What I'm really after is just the regexp match (it will look something like Varsling_902394039023), so I can save it in a bash variable. I guess I could rephrase the whole thing:
How do I extract a substring from a line of text in a file using bash?
Last edited by nickleus; 04-17-2006 at 04:06 AM.
|
|
|
|
04-17-2006, 04:06 AM
|
#2
|
|
LQ Veteran
Registered: Sep 2003
Location: the Netherlands
Distribution: lfs, debian, rhel
Posts: 8,703
|
Hi,
Getting part of a string with sed:
Code:
sed -n 's/.*\(Varsling_[0-9]*\).*/\1/p' infile
The search string is made up of the following 3 parts:
.* => everything in front of Varsling_[0-9]*
\(Varsling_[0-9]*\) => What we are looking for. The \( and \) are special: whatever matches between them can be referred to as \1 in the replacement part.
.* => everything after Varsling_[0-9]*
The -n makes sed suppress its normal output; the p at the end prints the line only when a substitution was made, so only the replacement is shown.
Hope this clears things up.
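Since the goal was to get the match into a bash variable, here is a minimal sketch of that step. The sample line and the variable name varsling are made up for illustration, and the pattern uses [0-9][0-9]* so that at least one digit is required:

```shell
# Hypothetical sample: one long line, like the single-line XML file in the question.
sample=$(mktemp)
printf '%s\n' '<msg><id>Varsling_902394039023</id><body>text</body></msg>' > "$sample"

# Command substitution captures whatever sed prints.
# -n plus the p flag means only the substituted line is printed.
varsling=$(sed -n 's/.*\(Varsling_[0-9][0-9]*\).*/\1/p' "$sample")

echo "$varsling"
rm -f "$sample"
```

If your grep supports the -o option (GNU and BSD grep do), grep -o 'Varsling_[0-9][0-9]*' file is another way to print only the matched text, one match per line.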
|
|
|
|
04-17-2006, 04:22 AM
|
#3
|
|
Member
Registered: Nov 2004
Location: Noreg
Distribution: ubuntu
Posts: 107
Original Poster
Rep:
|
druuna, thank you thank you thank you! i had thought about the whole \1 thing, but wasn't sure how to write it syntactically correct. you da man! worked like a charm =)
|
|
|
|
04-17-2006, 07:08 AM
|
#4
|
|
Member
Registered: Nov 2004
Location: Noreg
Distribution: ubuntu
Posts: 107
Original Poster
Rep:
|
I tried expanding your example to use multiple backward references in my sh script, but it doesn't seem to work:
Code:
echo "TIMESTAMP from SENT file $i: $(sed -n 's/.*DateAndTimes id="206">\s*<Year>\([0-9]*\)<\/Year>\s*<Month>\([0-9]*\)<\/Month>\s*<Day>\([0-9]*\)<\/Day>\s*<Hour>\([0-9]*\)<\/Hour>\s*<Minute>\([0-9]*\)<\/Minute>.*/\1\2\3\4\5/p' $SENT$i)"
Nothing gets printed, but I can't see why, since the file contents look like this:
Code:
<DateAndTimes id="206">
<Year>2006</Year>
<Month>04</Month>
<Day>13</Day>
<Hour>11</Hour>
<Minute>57</Minute>
</DateAndTimes>
It should match, or have I just written a crappy regex? =)
Last edited by nickleus; 04-17-2006 at 07:21 AM.
|
|
|
|
04-17-2006, 08:21 AM
|
#5
|
|
LQ Veteran
Registered: Sep 2003
Location: the Netherlands
Distribution: lfs, debian, rhel
Posts: 8,703
|
Hi,
Is the input still 1 line, or is the example in post #4 a new, multi-line input file?
If the infile has multiple lines, you could do something like this (only the first 2 tags are shown):
Code:
sed -n -e 's%<Year>\([0-9]*\)</Year>%\1%p' -e 's%<Month>\([0-9]*\)</Month>%\1%p' infile
Some chars are special and need to be escaped, but you can also change the separator that sed uses (I changed it from / to % in the above example). Now you do not need to escape the / (in </zzzzz> constructs).
The -e option is also new here. It makes it possible to give sed multiple commands in one call.
If it is one line, please post the line so I can have a look.
Hope this clears things up a bit.
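To see the two ideas above (a different separator and several -e commands) working together, here is a small sketch. The sample file is invented in the shape of post #4, and the leading and trailing .* are added so that indentation does not end up in the output:

```shell
# Hypothetical multi-line sample in the shape of post #4.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<DateAndTimes id="206">
  <Year>2006</Year>
  <Month>04</Month>
</DateAndTimes>
EOF

# % as the separator means the / in </Year> needs no backslash.
# Each -e adds one command; every command is tried on every input line.
fields=$(sed -n -e 's%.*<Year>\([0-9][0-9]*\)</Year>.*%\1%p' \
                -e 's%.*<Month>\([0-9][0-9]*\)</Month>.*%\1%p' "$sample")

echo "$fields"
rm -f "$sample"
```

Each matching line is printed on its own line, so the variable holds two lines here.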
|
|
|
|
04-17-2006, 08:43 AM
|
#6
|
|
Moderator
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733
|
Code:
\s*<Year>\([0-9]*\)<\/Year>
It might be better to use [0-9][0-9]* so that at least one digit is required for a match. You could also use [[:digit:]][[:digit:]]*
What is the "\s" for?
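A quick sketch of the difference, using made-up one-line inputs. The last command assumes a sed that accepts -E for extended regular expressions (GNU and BSD sed do); in sed's default basic regular expressions a bare + is just an ordinary character, not "one or more":

```shell
# [0-9]* also matches zero digits, so an empty tag still "matches".
empty=$(echo '<Year></Year>' | sed -n 's%<Year>\([0-9]*\)</Year>%[\1]%p')

# [0-9][0-9]* needs at least one digit, so the empty tag prints nothing.
strict=$(echo '<Year></Year>' | sed -n 's%<Year>\([0-9][0-9]*\)</Year>%[\1]%p')

# With -E, + means "one or more" and groups are written as plain ( ).
ere=$(echo '<Year>2006</Year>' | sed -n -E 's%<Year>([0-9]+)</Year>%\1%p')

echo "empty=$empty strict=$strict ere=$ere"
```

The square brackets in the replacement are only there to make the empty match visible.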
|
|
|
|
04-17-2006, 08:47 AM
|
#7
|
|
LQ Veteran
Registered: Sep 2003
Location: the Netherlands
Distribution: lfs, debian, rhel
Posts: 8,703
|
@jschiwal:
Quote:
|
Originally Posted by jschiwal
It might be better to use [0-9][0-9]* so that at least one digit is required for a match. You could also use [[:digit:]][[:digit:]]*
|
Good point! Overlooked that myself 
|
|
|
|
04-17-2006, 09:35 AM
|
#8
|
|
Member
Registered: Nov 2004
Location: Noreg
Distribution: ubuntu
Posts: 107
Original Poster
Rep:
|
Thanks for your feedback guys =)
Quote:
|
Originally Posted by jschiwal
It might be better to use [0-9][0-9]* so that at least one digit is required for a match.
|
I tried [0-9]+, but it didn't work. Why not? I thought '+' means one or more is required...??
Quote:
|
Originally Posted by jschiwal
What is the "\s" for?
|
i read here:
http://www.webcom.com/glossary/regexp.shtml
that it means:
Quote:
|
\s Matches a whitespace char (space, tab, newline...)
|
Since the file is multi-line, I thought that would take care of matching the newline and any tabs and/or spaces before and after the XML tags...
Ok, so here is what i came up with:
Code:
echo "TIMESTAMP from SENT file $i: $(sed -n -e 's%.*<Year>\([0-9]*\)</Year>.*%\1%p' -n -e 's%.*<Month>\([0-9]*\)</Month>.*%\1%p' -n -e 's%.*<Day>\([0-9]*\)</Day>.*%\1%p' -n -e 's%.*<Hour>\([0-9]*\)</Hour>.*%\1%p' -n -e 's%.*<Minute>\([0-9]*\)</Minute>.*%\1%p' -n -e 's%.*<Second>\([0-9]*\)</Second>.*%\1%p' tmp)"
I added the .* before and after to strip the whitespace, but the problem is that the output on the screen looks like this:
Code:
TIMESTAMP from SENT file test20060413134349.xml: 2006
04
13
13
43
i need it to look like this:
Quote:
|
TIMESTAMP from SENT file test20060413115712.xml: 200604131157
|
It seems like the 'p' flag acts like a println, but I need it to act like a print (no newline). druuna, thanks for the cool tip about switching the separator =)
PS. so it isn't possible to use multiple backward references like in my previous post?
Last edited by nickleus; 04-17-2006 at 10:14 AM.
|
|
|
|
04-17-2006, 10:30 AM
|
#9
|
|
LQ Veteran
Registered: Sep 2003
Location: the Netherlands
Distribution: lfs, debian, rhel
Posts: 8,703
|
Hi,
The newline after each print (from sed) causes this behavior. Short of writing a complete sed script, I don't know how to solve this.
So I would not use sed for this problem. It could probably be done with sed, but awk (to name just one) can do it more elegantly and simply (my opinion):
Code:
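#!/bin/bash
# Usage: pass the XML file as the first argument.
inFile="$1"
printf 'TIMESTAMP from SENT file %s: ' "$inFile"
# Split on < and >, so in "<Year>2006</Year>" the value is field 3.
awk 'BEGIN { FS="[><]" }
/Year/ { printf "%s", $3 }
/Month/ { printf "%s", $3 }
/Day/ { printf "%s", $3 }
/Hour/ { printf "%s", $3 }
/Minute/ { print $3 }
' "$inFile"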
Mind the difference between print and printf: print appends a newline, while printf prints only what its format says, so no newline is added here.
Quote:
|
PS. so it isn't possible to use multiple backward references like in my previous post?
|
It is 'not possible' the way you set it up, because the input file has multiple lines and sed works on one line at a time, so a single pattern never sees all five tags. 'Not possible' is in quotes because it can probably be done by writing a complete sed script, but I don't believe that is what you want (my assumption...).
Hope this helps.
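For completeness, one way around the one-line-at-a-time limit without a full sed script is to flatten the file first. This is only a sketch under assumptions: the sample file is invented from post #4, tr turns the newlines into spaces, and [[:space:]]* is used in place of \s*, which is a GNU extension:

```shell
# Hypothetical sample in the shape of post #4.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<DateAndTimes id="206">
  <Year>2006</Year>
  <Month>04</Month>
  <Day>13</Day>
  <Hour>11</Hour>
  <Minute>57</Minute>
</DateAndTimes>
EOF

# Turn newlines into spaces so sed sees a single line,
# then one substitution with five backward references does the job.
stamp=$(tr '\n' ' ' < "$sample" | sed -n 's%.*<Year>\([0-9]*\)</Year>[[:space:]]*<Month>\([0-9]*\)</Month>[[:space:]]*<Day>\([0-9]*\)</Day>[[:space:]]*<Hour>\([0-9]*\)</Hour>[[:space:]]*<Minute>\([0-9]*\)</Minute>.*%\1\2\3\4\5%p')

echo "$stamp"
rm -f "$sample"
```

Another option in the same spirit is to keep the separate per-tag sed commands and pipe their output through tr -d '\n' to drop the newlines afterwards.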
|
|
|
|
04-18-2006, 07:44 AM
|
#10
|
|
Member
Registered: Nov 2004
Location: Noreg
Distribution: ubuntu
Posts: 107
Original Poster
Rep:
|
druuna, I'm trying to save the $3 value to a local variable and have tried many different things, but can't get it to work:
Code:
awk 'BEGIN { FS="[><]"}
/Year/ { TIMESTAMP=$3 }
/Month/ { TIMESTAMP=$TIMESTAMP$3 }
/Day/ { TIMESTAMP=$TIMESTAMP$3 }
/Hour/ { TIMESTAMP=$TIMESTAMP$3 }
/Minute/ { TIMESTAMP=$TIMESTAMP$3 }
' tmp
What am I doing wrong here? Thanks so much in advance for your help =) I have never used awk before..
|
|
|
|
04-18-2006, 08:04 AM
|
#11
|
|
Member
Registered: Nov 2004
Location: Noreg
Distribution: ubuntu
Posts: 107
Original Poster
Rep:
|
WAIT! i figured it out! =)
Code:
TIMESTAMP=$(awk 'BEGIN { FS="[><]" }
/Year/ { printf "%s", $3 }
/Month/ { printf "%s", $3 }
/Day/ { printf "%s", $3 }
/Hour/ { printf "%s", $3 }
/Minute/ { print $3 }
' tmp)
And if I want to save it to a file instead, I just redirect the output:
Code:
awk 'BEGIN { FS="[><]" }
/Year/ { printf "%s", $3 }
/Month/ { printf "%s", $3 }
/Day/ { printf "%s", $3 }
/Hour/ { printf "%s", $3 }
/Minute/ { print $3 }
' tmp >> timestamp
sweetness!
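For anyone who lands here with the same problem as in post #10: a variable set inside awk lives only inside awk, and awk variables are written without a $ (in awk, $name means "the field whose number is in name"). A minimal sketch of building the value in an awk variable and handing it to the shell through awk's output, with a made-up sample file and the hypothetical variable name stamp:

```shell
# Hypothetical sample in the shape of post #4.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<DateAndTimes id="206">
  <Year>2006</Year>
  <Month>04</Month>
  <Day>13</Day>
  <Hour>11</Hour>
  <Minute>57</Minute>
</DateAndTimes>
EOF

# stamp is an awk variable: no $ in front, and putting two values
# next to each other concatenates them. END prints the result once,
# and $( ) captures that output into the shell variable.
TIMESTAMP=$(awk 'BEGIN { FS="[><]" }
/<(Year|Month|Day|Hour|Minute)>/ { stamp = stamp $3 }
END { print stamp }' "$sample")

echo "$TIMESTAMP"
rm -f "$sample"
```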
|
|
|
|
04-18-2006, 08:34 AM
|
#12
|
|
LQ Veteran
Registered: Sep 2003
Location: the Netherlands
Distribution: lfs, debian, rhel
Posts: 8,703
|
Hi,
You figured it out, nothing to add from my side 
|
|
|
|