Linux - Newbie: This Linux forum is for members that are new to Linux. Just starting out and have a question? If it is not in the man pages or the how-to's, this is the place!
07-28-2010, 12:32 PM | #1 | LQ Newbie | Registered: Jun 2010 | Posts: 13
How to grab certain data from trace file?
Alright, I have a network trace file that I want to parse through.
The file looks like this:
+ 1.002 /NodeList/1/DeviceList/0/$ns3::PointToPointNetDevice/TxQueue/Enqueue ns3::PppHeader (Point-to-Point Protocol: IP (0x0021)) ns3::Ipv4Header (tos 0x0 ttl 62 id 0 protocol 6 offset 0 flags [none] length: 40 10.2.1.1 > 10.1.1.1) ns3::TcpHeader (49153 > 26 [ SYN ] Seq=0 Ack=0 Win=65535)
- 1.002 /NodeList/1/DeviceList/0/$ns3::PointToPointNetDevice/TxQueue/Dequeue ns3::PppHeader (Point-to-Point Protocol: IP (0x0021)) ns3::Ipv4Header (tos 0x0 ttl 62 id 0 protocol 6 offset 0 flags [none] length: 40 10.2.1.1 > 10.1.1.1) ns3::TcpHeader (49153 > 26 [ SYN ] Seq=0 Ack=0 Win=65535)
- 1.32033 /NodeList/1/DeviceList/0/$ns3::PointToPointNetDevice/TxQueue/Dequeue ns3::PppHeader (Point-to-Point Protocol: IP (0x0021)) ns3::Ipv4Header (tos 0x0 ttl 62 id 4406 protocol 6 offset 0 flags [none] length: 576 10.2.3.1 > 10.1.54.1) ns3::TcpHeader (143 > 49152 [ ACK ] Seq=75041 Ack=9 Win=65535) Payload Fragment [160:696]
......
All I want to extract from this very large file is the time and the length of each packet, output to a new CSV file.
Is this possible?
For example:
1.002, 40
1.002, 40
1.32033, 576
...
Thank you for the help!
07-28-2010, 12:41 PM | #2 | LQ Guru | Registered: Dec 2006 | Location: underground | Distribution: Slackware64 | Posts: 7,594
Code:
awk '{print $2","$23}' file > newfile
Note: this is a very basic way of doing it, and it is not very reliable, because it depends on every line having the same number of fields up to and including field #23.
Also, I may have miscounted the 23rd field, so if the second column of your output is not the length, re-count which field the length actually falls in.
If the lines have a varying number of fields (regularly, or at all) before the length field, we'll need something more robust.
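A quick way to check that assumption (a sketch; it assumes the trace is in a file named trace.txt) is to have awk print each line's field count and reduce the result to the distinct values:

```shell
# Print the number of whitespace-separated fields (NF) on each line,
# then reduce to the distinct counts; more than one value means
# fixed field positions like $23 cannot be trusted.
awk '{ print NF }' trace.txt | sort -nu
```

If this prints a single number, the fixed-field awk above is safe; otherwise the length has to be located by its "length:" label instead.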
07-28-2010, 01:35 PM | #3 | LQ Newbie (Original Poster) | Registered: Jun 2010 | Posts: 13
Thanks for the help!
It worked for most of the lines, but not for others.
These are the variations I see:
1.002,40
1.002,40
1.00203,(length:
1.00203,(length:
1.46974,(49152
1.46975,(49152
07-28-2010, 02:10 PM | #4 | LQ Guru | Registered: Dec 2006 | Location: underground | Distribution: Slackware64 | Posts: 7,594
Code:
#!/bin/sh
# start with an empty output file:
: > outputfile
# grab the time (field 2) and the length from each line:
while read -r line; do
    time=$(echo "$line" | cut -f2 -d" ")
    len=$(echo "$line" | grep -oE -- 'length: [0-9]+' | cut -f2 -d" ")
    echo "$time,$len" >> outputfile
done < inputfile
Try this - replace "inputfile" and "outputfile" with the files you really want to use.
07-28-2010, 02:33 PM | #5 | LQ Newbie (Original Poster) | Registered: Jun 2010 | Posts: 13
How do I run this code?
07-28-2010, 02:35 PM | #6 | LQ Guru | Registered: Dec 2006 | Location: underground | Distribution: Slackware64 | Posts: 7,594
It's a shell script. Copy it and paste it into a text file, save it in the same directory as your trace file, then execute it. You can either run it through the shell directly, or make the file executable once and then run it by name.
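Assuming the script was saved as parse.sh (the name is hypothetical; use whatever you called the file), the two ways of running it look like this:

```shell
# Run it through the shell directly:
sh parse.sh

# Or make it executable once, then run it by name:
chmod +x parse.sh
./parse.sh
```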
07-28-2010, 02:37 PM | #7 | LQ Guru | Registered: Dec 2006 | Location: underground | Distribution: Slackware64 | Posts: 7,594
And -- make sure you change "inputfile" to the real name of your trace file, and change "outputfile" to whatever you want to name the script's output (your time and length numbers).
07-28-2010, 07:42 PM | #8 | Senior Member | Registered: Aug 2006 | Posts: 2,697
Quote:
Originally Posted by GrapefruiTgirl
Code:
#!/bin/sh
# clean output file:
echo > outputfile
# grab data:
while read line; do
time=$(echo "$line" | cut -f2 -d" ")
len=$(echo "$line" | grep -oE -- 'length: [0-9]+' | cut -f2 -d" ")
echo "$time,$len" >> outputfile
done < inputfile
Try this - replace "inputfile" and "outputfile" with the files you really want to use.
The OP mentioned he has a very large file. With big files, awk is better than a while-read loop (and the pipes to grep and cut spawned for every line slow things down, too).
07-28-2010, 07:46 PM | #9 | Senior Member | Registered: Aug 2006 | Posts: 2,697
Code:
$ awk '{ printf "%s ",$2;for(i=3;i<=NF;i++){if($i == "length:"){print $(i+1);break}}}' file
1.002 40
1.002 40
1.32033 576
1 member found this post helpful.
07-28-2010, 08:37 PM | #10 | LQ Veteran | Registered: Aug 2003 | Location: Australia | Distribution: Lots ... | Posts: 21,379
I prefer sed for this sort of thing - to make it unambiguous, try this:
Code:
sed -rn 's%^.[[:space:]]+([^[:space:]]*).*length:[[:space:]]+([^[:space:]]*).*%\1,\2%p' trace.txt
1 member found this post helpful.
07-28-2010, 08:53 PM | #11 | LQ Guru | Registered: Dec 2006 | Location: underground | Distribution: Slackware64 | Posts: 7,594
Code:
sed -rn 's#^. ([[:digit:]]+\.[[:digit:]]+)(.*)(length:)([[:space:]])([[:digit:]]+)(.*)#\1,\5#p' trace.txt
1.002,40
1.002,40
1.32033,576
I knew awking this thing would take me forever - thanks ghostdog for posting that. (When I search the internet for help with this stuff, your name always comes up somewhere along the line.)
So I looked at sed instead, as I seem to have better success with it; and thanks to syg00, I finally figured out I was missing the -r and the trailing p ... now it works.
EDIT: P.S. - while the initial scripting method earlier may have been slow and had overhead, it was probably more educational than any of these one-liners. The OP seems pretty new to this sort of thing, and a simple script is less cryptic and probably easier to understand.
Last edited by GrapefruiTgirl; 07-29-2010 at 11:12 AM.
07-28-2010, 09:11 PM | #12 | Moderator | Registered: Apr 2002 | Location: earth | Distribution: slackware by choice, others too :} ... android. | Posts: 23,067
And a slightly shorter sed version ...
Code:
sed -r 's/^.[ \t]+([\.0-9]+).*length:[ \t]+([\.0-9]+).*/\1 \2/' file
Cheers,
Tink
Last edited by Tinkster; 07-28-2010 at 09:12 PM.
Reason: space
07-29-2010, 10:55 AM | #13 | LQ Newbie (Original Poster) | Registered: Jun 2010 | Posts: 13
Thanks for the help, guys. I will try out each one, though, as GrapefruiTgirl mentioned, the sed and awk command lines are all still a bit cryptic to me for now.
Does anyone know of any good tutorials for learning how to parse with sed, grep, and awk?
Which of these is preferred for what types of applications?
07-29-2010, 11:10 AM | #14 | LQ Guru | Registered: Dec 2006 | Location: underground | Distribution: Slackware64 | Posts: 7,594
The key to successful use of ANY of these tools is to learn "Regular Expressions", otherwise called "regexps". They are what allow these tools to identify the chunks of data you want them to. Without knowing regular expressions, doing anything beyond the simplest parsing is pretty much impossible.
I can't really put into words whether sed or awk is better for a given task. Each task is different, and once you are comfortable with both, you tend to develop a feeling for which one fits. Since I know less awk than sed, I usually try sed first because I can get the job done quicker - but that is not always practical or productive, so if sed won't easily do a job, awk is the next thing on the list.
Grep is for finding text or data in files, not so much for processing it - only for filtering it and showing it, or showing where it is.
Sed is a stream editor - it operates line by line on data streamed into it, processing each line and emitting the result on the other end.
AWK is harder to describe: it's like sed, but larger, more powerful, and more complex, and it can also easily do math computations and bitwise operations that are clunky or impractical in sed.
There are loads of tutorials of all shapes and sizes around the 'net on each of these tools. I don't have particular links handy to recommend, but other members will likely have favorites to share. Just remember: learn and understand regexps first; start at the bottom and work up.
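To make the regexp point concrete with the trace lines from this thread: the same pattern drives all three tools. A sketch with grep alone (assuming the trace is in trace.txt):

```shell
# The regexp 'length: [0-9]+' matches the literal text "length: "
# followed by one or more digits; -o prints only the matched part
# of each line, and -E enables extended regular expressions.
grep -oE 'length: [0-9]+' trace.txt
```

The identical pattern is the core of the sed and awk answers earlier in the thread; only the surrounding machinery differs.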
07-29-2010, 08:02 PM | #15 | Senior Member | Registered: Aug 2006 | Posts: 2,697
|
Quote:
Originally Posted by yida
Does anyone know of any good tutorials for learning how to parse with sed, grep, and awk?
You only need to learn awk (and a bit of grep). You can tuck sed away in your toolbox and never have to use it; awk does what sed can do and a whole lot more. See my sig for "learn gawk".