LinuxQuestions.org
Old 02-17-2011, 02:59 AM   #1
fad216
LQ Newbie
 
Registered: Feb 2011
Posts: 11

Rep: Reputation: Disabled
eliminate unwanted text in txt file


Hello...

I have a small problem. I have a txt file that was converted from a pcap file, and I want to strip the unwanted text from it. Here is an example of what I want to do:

This is the original txt file:
Quote:
No. Time Source Destination Protocol Info
1 0.000000 158.27.22.66 61.39.220.82 HTTP Continuation or non-HTTP traffic

0000 00 1f 9e 1a 5b 00 00 21 55 84 9a ff 08 00 45 00 ....[..!U.....E.
0010 05 8c e4 14 40 00 36 06 8d 1c 3a 1b 16 42 a1 8b ....@.6...:..B..
0020 dc 52 00 50 05 12 18 d4 17 f0 64 3f b0 94 80 10 .R.P......d?....
From that file I need to keep only the hex data:
Quote:
001f 9e1a 5b00 0021 5584 9aff 0800 4500 058c e414 4000 3606 8d1c 3a1b 1642 a18b
0020 dc52 0050 0512 18d4 17f0 643f b094 8010
I really need some help. Thanks.
 
Old 02-17-2011, 04:07 AM   #2
thegeek
Member
 
Registered: Oct 2009
Location: Amsterdam
Distribution: CentOS,Fedora,Puppy
Posts: 62

Rep: Reputation: 20
Do all the hex lines begin with 00 ?
If so you could use grep to filter it.

Here is an example:

grep '^00' yourdatafile.txt > newfilteredfile.txt

HTH
 
Old 02-17-2011, 04:51 AM   #3
Noway2
Senior Member
 
Registered: Jul 2007
Distribution: Ubuntu 10.10, Slackware 64-current
Posts: 2,124

Rep: Reputation: 778
If you also want to eliminate the ....[..!U.....E. stuff at the end, you will need a slightly more refined regular expression; ^00 is a very simple one. In case you are not familiar with them, regular expressions are an algebra-like syntax for pattern matching, in which certain symbols have special meaning. They are really quite simple and useful once you get the hang of them, so I would recommend looking for a tutorial on them.

In your case, the ^ means start of line. You can modify the regular expression to something like ^([0-9a-f]+ )+ which means: start of the line, followed by one or more hex digits (0-9, a-f) and a space, with that whole group repeated one or more times. Of course there are an almost infinite number of ways you can do this, including telling it to look for 4 digits, a space, then a pattern of two digits, etc.
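To see what a pattern like this actually picks up, here is a minimal sketch, assuming a grep that supports -o and -E (GNU grep does). The character class is written as [0-9a-f] so that the letters in the hex dump match too. The variable names are only for illustration. Note that an ASCII column that happens to begin with hex-looking characters followed by a space would also be swallowed, so treat this as a starting point rather than a finished filter.

```shell
#!/bin/sh
# Sample dump line copied from the first post.
line='0000 00 1f 9e 1a 5b 00 00 21 55 84 9a ff 08 00 45 00 ....[..!U.....E.'

# -o prints only the matched part, -E enables extended regular expressions.
# [0-9a-f]+ is one run of hex digits; the group "( ... )+" repeats "run plus space".
hex_part=$(printf '%s\n' "$line" | grep -oE '^([0-9a-f]+ )+')

# Brackets make the trailing space of the match visible.
# prints: [0000 00 1f 9e 1a 5b 00 00 21 55 84 9a ff 08 00 45 00 ]
printf '[%s]\n' "$hex_part"
```

The match still includes the offset column (0000) and ends with a space, so it needs a little more trimming before it looks like the wanted output.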

I also noticed that your example joins the bytes into groups of two. If you want this, you should look at using a program like AWK, which is a programming language built around regular expressions. With that, you could identify each byte as a field and then print them out in the desired grouping, using a loop across the line to repeat the process.
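A rough sketch of that field-by-field idea, assuming each dump line has the layout shown in the first post (a 4-digit offset, up to 16 bytes, then the ASCII column). The wanted output in the first post keeps the bytes in their original order and simply joins them two at a time, so that is what the loop does. The variable names are made up for the example.

```shell
#!/bin/sh
# Sample dump line copied from the first post.
line='0000 00 1f 9e 1a 5b 00 00 21 55 84 9a ff 08 00 45 00 ....[..!U.....E.'

# Field 1 is the offset (0000), fields 2..17 are the 16 bytes,
# and anything after that is the ASCII column, which is skipped.
pairs=$(printf '%s\n' "$line" | awk '
/^[0-9a-f][0-9a-f][0-9a-f][0-9a-f] / {
    out = ""
    for (i = 2; i <= 17 && i <= NF; i++) {
        # stop at the first field that is not exactly two hex digits
        if ($i !~ /^[0-9a-f][0-9a-f]$/) break
        # no separator inside a pair, one space after each pair
        out = out $i ((i % 2) ? " " : "")
    }
    sub(/ $/, "", out)   # drop the space left after the last pair
    print out
}')

# prints: 001f 9e1a 5b00 0021 5584 9aff 0800 4500
printf '%s\n' "$pairs"
```

On a short final line the ASCII column could, by coincidence, contain a field that looks like a hex byte, so a careful version would also check the packet length.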
 
Old 02-17-2011, 06:47 AM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,494

Rep: Reputation: 2867
How about:
Code:
egrep -o '^00[[:alnum:] ]+' file
 
Old 02-17-2011, 07:19 AM   #5
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 671
Code:
sed -r '/^[[:xdigit:]]{4}/!d;s/^[[:xdigit:]]{4} //;s/([[:xdigit:]]{2}) ([[:xdigit:]]{2}) /\1\2 /g;s/^(.{40}).*/\1/' file
Delete lines that don't begin with four hex digits: /^[[:xdigit:]]{4}/!d
Delete the addresses (4 hex digits at the beginning): s/^[[:xdigit:]]{4} //
Remove the space inside each pair of hex bytes: s/([[:xdigit:]]{2}) ([[:xdigit:]]{2}) /\1\2 /g
Remove the ASCII characters at the end of the line: s/^(.{40}).*/\1/
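The same four steps can be laid out as a multi-line sed script with one comment per step, which may be easier to follow and adjust. This is only a sketch, assuming GNU sed (comments inside the script and -E; the -r used above is the older GNU spelling of the same option). The sample input is taken from the first post.

```shell
#!/bin/sh
# Header line plus two dump lines copied from the first post.
sample='No. Time Source Destination Protocol Info
0000 00 1f 9e 1a 5b 00 00 21 55 84 9a ff 08 00 45 00 ....[..!U.....E.
0010 05 8c e4 14 40 00 36 06 8d 1c 3a 1b 16 42 a1 8b ....@.6...:..B..'

result=$(printf '%s\n' "$sample" | sed -E '
# 1. drop every line that does not start with four hex digits
/^[[:xdigit:]]{4}/!d
# 2. strip the offset column
s/^[[:xdigit:]]{4} //
# 3. join neighbouring bytes into four-digit groups
s/([[:xdigit:]]{2}) ([[:xdigit:]]{2}) /\1\2 /g
# 4. keep the first 40 characters, which cuts off the ASCII column
s/^(.{40}).*/\1/
')

printf '%s\n' "$result"
```

Each output line keeps one trailing space, because 8 groups of "4 digits plus space" make up exactly the 40 characters that step 4 retains. Step 1 would also let through a packet summary line whose packet number has four digits (1000 and up), so a stricter first pattern may be needed for large captures.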

Last edited by jschiwal; 02-17-2011 at 07:22 AM.
 
Old 02-17-2011, 08:49 AM   #6
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,494

Rep: Reputation: 2867
Based on jschiwal's take (as the original output example does include the first 4 digits of the last entry):
Code:
sed -rn '/^00/{s/^[^ ]+ | [^ ]+$//g;s/(..) (..)/\1\2/gp}' file

or

awk --re-interval '/^[[:xdigit:]]{2}$/ && ORS=(i++%2)?" ":"\0"' RS=" " file
Last one returns all results on a single line.
 
Old 02-18-2011, 02:49 PM   #7
fad216
LQ Newbie
 
Registered: Feb 2011
Posts: 11

Original Poster
Rep: Reputation: Disabled
Thank you for the replies...
Yes, all hex lines start with 00 because that is the address of the hex data.

OK, I have tried all the commands that were given and found that this one works best:

Code:
awk --re-interval '/^[[:xdigit:]]{2}$/ && ORS=(i++%2)?" ":"\0"' RS=" " file
But actually I have thousands of packets in one txt file, something like this:
Quote:
No. Time Source Destination Protocol Info
1 0.000000 158.27.22.66 61.39.220.82 HTTP Continuation or non-HTTP traffic

0000 00 1f 9e 1a 5b 00 00 21 55 84 9a ff 08 00 45 00 ....[..!U.....E.
.
.(up to thousand)
.
.
No. Time Source Destination Protocol Info

1003 145.413121 58.27.22.66 61.29.220.82 HTTP [TCP Previous segment lost] Continuation or non-HTTP traffic

0000 00 1f 9e 1a 5b 00 00 21 55 84 9a ff 08 00 45 00 ....[..!U.....E.
So can I split each packet into its own txt file automatically? For example, if I have 1000 packets sniffed into a pcap file, I would end up with 1000 txt files, each containing the hex data of one packet.
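One possible way to do that split, as a sketch only: it assumes every packet in the text export begins with a "No. Time ..." header line, as in the quote above, and it starts a new output file at each header. The names capture.txt and packet_0001.txt, packet_0002.txt, ... are made up for the example, and the sample input is pieced together from the lines quoted in this thread.

```shell
#!/bin/sh
# Work in a scratch directory so the sketch does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two-packet sample built from lines quoted in the thread.
cat > capture.txt <<'EOF'
No. Time Source Destination Protocol Info
1 0.000000 158.27.22.66 61.39.220.82 HTTP Continuation or non-HTTP traffic

0000 00 1f 9e 1a 5b 00 00 21 55 84 9a ff 08 00 45 00 ....[..!U.....E.
0010 05 8c e4 14 40 00 36 06 8d 1c 3a 1b 16 42 a1 8b ....@.6...:..B..
No. Time Source Destination Protocol Info

1003 145.413121 58.27.22.66 61.29.220.82 HTTP [TCP Previous segment lost] Continuation or non-HTTP traffic

0000 00 1f 9e 1a 5b 00 00 21 55 84 9a ff 08 00 45 00 ....[..!U.....E.
EOF

awk '
# A new packet starts at each "No. Time ..." header line.
/^No\. / {
    if (out != "") close(out)          # avoid running out of open files
    n++
    out = sprintf("packet_%04d.txt", n)
    next
}
# Dump lines: a 4-digit hex offset followed by at least one 2-digit hex byte.
# This also rejects summary lines such as "1003 145.413121 ...".
out != "" && /^[0-9a-f][0-9a-f][0-9a-f][0-9a-f] [0-9a-f][0-9a-f] / {
    line = ""
    for (i = 2; i <= 17 && i <= NF; i++) {
        if ($i !~ /^[0-9a-f][0-9a-f]$/) break
        line = line $i ((i % 2) ? " " : "")
    }
    sub(/ $/, "", line)
    print line > out
}
' capture.txt

ls packet_*.txt
```

The files are numbered by position in the capture (first packet, second packet, ...) rather than by the packet number in the summary line. The dump-line test does not rely on lines starting with 00, since offsets past 00f0 (0100, 0110, ...) no longer do.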
 
  


All times are GMT -5.