Old 07-28-2010, 12:32 PM   #1
yida
LQ Newbie
 
Registered: Jun 2010
Posts: 13

Rep: Reputation: 0
How to grab certain data from trace file?


Alright, I have a network trace file that I want to parse through.

The file looks like this:

+ 1.002 /NodeList/1/DeviceList/0/$ns3::PointToPointNetDevice/TxQueue/Enqueue ns3::PppHeader (Point-to-Point Protocol: IP (0x0021)) ns3::Ipv4Header (tos 0x0 ttl 62 id 0 protocol 6 offset 0 flags [none] length: 40 10.2.1.1 > 10.1.1.1) ns3::TcpHeader (49153 > 26 [ SYN ] Seq=0 Ack=0 Win=65535)
- 1.002 /NodeList/1/DeviceList/0/$ns3::PointToPointNetDevice/TxQueue/Dequeue ns3::PppHeader (Point-to-Point Protocol: IP (0x0021)) ns3::Ipv4Header (tos 0x0 ttl 62 id 0 protocol 6 offset 0 flags [none] length: 40 10.2.1.1 > 10.1.1.1) ns3::TcpHeader (49153 > 26 [ SYN ] Seq=0 Ack=0 Win=65535)
- 1.32033 /NodeList/1/DeviceList/0/$ns3::PointToPointNetDevice/TxQueue/Dequeue ns3::PppHeader (Point-to-Point Protocol: IP (0x0021)) ns3::Ipv4Header (tos 0x0 ttl 62 id 4406 protocol 6 offset 0 flags [none] length: 576 10.2.3.1 > 10.1.54.1) ns3::TcpHeader (143 > 49152 [ ACK ] Seq=75041 Ack=9 Win=65535) Payload Fragment [160:696]
......


All I want to extract from this very large file is the time and the length of each packet, and then write them out to a new CSV file.
Is this possible?

For example:

1.002, 40
1.002, 40
1.32033, 576
...

Thank you for the help!
 
Old 07-28-2010, 12:41 PM   #2
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
Code:
awk '{print $2","$23}' file > newfile
Note: this is a very basic way of doing this, and is not very reliable because it depends on each line having the same number of fields up to and including field #23.

Also, I may have miscounted the 23rd field, so if the second column in your output is not the length, re-count which field the length actually is.

If the lines have a varying number of fields before the length field (regularly, or at all), we'll need something more robust.
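If you're not sure which field the length ends up in, a quick throwaway one-liner like this (just a sketch for counting, not part of the final solution) prints every field of the first line with its number:
Code:
awk 'NR==1 { for (i = 1; i <= NF; i++) print i, $i }' file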
 
Old 07-28-2010, 01:35 PM   #3
yida
LQ Newbie
 
Registered: Jun 2010
Posts: 13

Original Poster
Rep: Reputation: 0
Thanks for the help!

It worked for most of them, but not for others.

These are the variations I see:

1.002,40
1.002,40
1.00203,(length:
1.00203,(length:
1.46974,(49152
1.46975,(49152
 
Old 07-28-2010, 02:10 PM   #4
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
Code:
#!/bin/sh

# clean (truncate) the output file:
: > outputfile

# grab the time and the length from each line:
while read -r line; do
 time=$(echo "$line" | cut -f2 -d" ")
 len=$(echo "$line" | grep -oE -- 'length: [0-9]+' | cut -f2 -d" ")
 echo "$time,$len" >> outputfile
done < inputfile
Try this - replace "inputfile" and "outputfile" with the files you really want to use.
 
Old 07-28-2010, 02:33 PM   #5
yida
LQ Newbie
 
Registered: Jun 2010
Posts: 13

Original Poster
Rep: Reputation: 0
How do I run this code?
 
Old 07-28-2010, 02:35 PM   #6
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
It's a shell script. Copy it & paste it into a text file, save it in the same directory as your trace file, then execute it.
To execute it, you can do:
Code:
sh filename
Or, make it executable by doing:
Code:
chmod a+x filename
followed by:
Code:
./filename
 
Old 07-28-2010, 02:37 PM   #7
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
And -- make sure you change "inputfile" to the real name of your trace file, and change "outputfile" to whatever you want to name the output of this script (your time & length numbers).
 
Old 07-28-2010, 07:42 PM   #8
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by GrapefruiTgirl
Code:
#!/bin/sh

# clean (truncate) the output file:
: > outputfile

# grab the time and the length from each line:
while read -r line; do
 time=$(echo "$line" | cut -f2 -d" ")
 len=$(echo "$line" | grep -oE -- 'length: [0-9]+' | cut -f2 -d" ")
 echo "$time,$len" >> outputfile
done < inputfile
Try this - replace "inputfile" and "outputfile" with the files you really want to use.
OP mentioned he has a very large file. Using awk is better than a while-read loop for big files (there's also the pipe to grep and cut on every line, which slows things down).
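If you want to see the difference on your own trace, a rough comparison (hypothetical file names, untested) is simply to time both approaches:
Code:
# time the shell while-read loop (hypothetical script name):
time sh parse_trace.sh

# time an awk equivalent on the same file:
time awk '{print $2","$23}' tracefile > newfile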
 
Old 07-28-2010, 07:46 PM   #9
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Code:
$ awk '{ printf "%s ",$2;for(i=3;i<=NF;i++){if($i == "length:"){print $(i+1);break}}}' file
1.002 40
1.002 40
1.32033 576
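If you want the comma-separated output shown in the first post, the same idea with a comma in the printf should do it (untested variation; "file.csv" is just a placeholder name):
Code:
$ awk '{ printf "%s,",$2;for(i=3;i<=NF;i++){if($i == "length:"){print $(i+1);break}}}' file > file.csv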
 
1 member found this post helpful.
Old 07-28-2010, 08:37 PM   #10
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,379

Rep: Reputation: 4190
I prefer sed for this sort of thing - to make it unambiguous, try this:
Code:
sed -rn 's%^.[[:space:]]+([^[:space:]]*).*length:[[:space:]]+([^[:space:]]*).*%\1,\2%p' trace.txt
 
1 member found this post helpful.
Old 07-28-2010, 08:53 PM   #11
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
Code:
sed -rn 's#^. ([[:digit:]]+\.[[:digit:]]+)(.*)(length:)([[:space:]])([[:digit:]]+)(.*)#\1,\5#p' trace.txt
1.002,40
1.002,40
1.32033,576
I knew awking this thing would take me forever - thanks ghostdog for posting that. (When I search the internet for help with this stuff, your name always comes up somewhere along the line.)

So I looked at sed instead, as I seem to have better success with it; and thanks to syg00, I finally figured out I was missing the -r and the trailing p ... now it works.

EDIT: P.S. - while the initial scripting method earlier may have been slow and had overhead, it was probably more educational than any of these one-liners. The OP seems pretty new to this sort of thing, and a simple script is less cryptic and probably easier to understand. :/

Last edited by GrapefruiTgirl; 07-29-2010 at 11:12 AM.
 
Old 07-28-2010, 09:11 PM   #12
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
And a slightly shorter sed version ...
Code:
sed -r 's/^.[ \t]+([\.0-9]+).*length:[ \t]+([\.0-9]+).*/\1 \2/' file
Cheers,
Tink

Last edited by Tinkster; 07-28-2010 at 09:12 PM. Reason: space
 
Old 07-29-2010, 10:55 AM   #13
yida
LQ Newbie
 
Registered: Jun 2010
Posts: 13

Original Poster
Rep: Reputation: 0
Thanks for the help, guys. I will try out each one, though as GrapefruiTgirl mentioned, the sed and awk commands are all kind of cryptic to me for now.
Does anyone know of any good tutorials for learning how to parse with sed, grep, and awk?

Which one of these is preferred for what types of applications?
 
Old 07-29-2010, 11:10 AM   #14
GrapefruiTgirl
LQ Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 556
The key to successful usage of ANY of these tools is to learn "Regular Expressions", otherwise called "regexps". Those are what allow each of these tools to identify the chunks of data that you want them to. Without knowing regular expressions, doing anything beyond the simplest parsing is pretty much impossible.

I can't really put into words whether sed or awk is better for a given task. Each task is different, and after you are comfortable to some degree with both, you tend to get a feeling that one or the other is better. For me, since I know less awk than sed, I often try to do a job with sed first, because usually I can do it quicker - but this is not always practical or productive, so if sed won't easily do a job, awk is the next thing on the list.

Grep is more for just finding text or data in files, but not so much for doing any particular processing of the data - only for filtering it and showing it, or showing where it is.
Sed is a stream editor - it operates line-by-line on streamed data which is fed into it, processing the data and outputting the results on the other end.
AWK is... I don't know how to describe AWK. It's like sed, but different, larger, perhaps more powerful & complex, and it can also easily do math computations and bitwise operations that are clunky or impractical to do with sed.

There are loads of tutorials of all sizes & shapes, all around the 'net, on each of these tools. I haven't got any particular links handy to recommend, but other members will likely have links to their favorite tutorials to share with you. Just remember, learn & understand regexps first; start at the bottom and work up.
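To give a rough feel for the difference, here are three quick sketches on your trace file (the sed one is basically syg00's from above; the grep and awk ones are untested examples):
Code:
# grep: just find/filter -- show the lines that contain a length:
grep -E 'length: [0-9]+' trace.txt

# sed: edit the stream -- keep only the time and the length:
sed -rn 's%^.[[:space:]]+([^[:space:]]+).*length:[[:space:]]+([0-9]+).*%\1,\2%p' trace.txt

# awk: work with fields and do arithmetic -- e.g. add up all the lengths:
awk '{ for (i = 3; i <= NF; i++) if ($i == "length:") sum += $(i+1) } END { print sum, "bytes total" }' trace.txt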
 
Old 07-29-2010, 08:02 PM   #15
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
Quote:
Originally Posted by yida
Does anyone know of any good tutorials for learning how to parse with sed, grep, and awk?
You only need to learn awk (and a bit of grep). You can tuck sed away in your toolbox and never have to use it. Awk does what sed can do and a whole lot more. See my sig for learning gawk.
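For example, a typical sed substitution has a direct awk equivalent (rough sketch):
Code:
# sed style:
sed 's/foo/bar/g' file

# the same thing in awk:
awk '{ gsub(/foo/, "bar"); print }' file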
 
  

