LinuxQuestions.org
Old 10-08-2007, 04:40 AM   #1
michaeljoser
LQ Newbie
 
Registered: Oct 2007
Distribution: Debian Etch
Posts: 19

Rep: Reputation: 0
how to sort text file and split into smaller files


Hi,

I want to sort a big file, in the format below, by IP address, and then split it by IP address so that I end up with multiple files, each containing the records for one IP only. I came across the sort command, but I'm not sure how to split the file with the IP address as the criterion.
Thanks.



RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:22: Team "TERRORIST" triggered "Terrorists_Win" (CT "4") (T "2")
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:22: World triggered "Round_End"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:26: "(o.0)_.!.<7><STEAM_ID_LAN><CT>" say "aaaa ti p envi coupe oli" (dead)
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:27: "theEnd<5><STEAM_ID_LAN><TERRORIST>" triggered "Spawned_With_The_Bomb"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:29: "(o.0)_.!.<7><STEAM_ID_LAN><CT>" say "salope vans"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:22: "x3non<2><STEAM_ID_LAN><TERRORIST>" killed "noob<8><STEAM_ID_LAN><CT>" with "galil"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:22: "TaZ<4><STEAM_ID_LAN><CT>" killed "x3non<2><STEAM_ID_LAN><TERRORIST>" with "awp"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:33: World triggered "Round_Start"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:34: "payne<2><STEAM_ID_LAN><TERRORIST>" say " kill oli get 1000 pt XD"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:34: "(o.0)_.!.<7><STEAM_ID_LAN><CT>" say "ti p atane oli"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:27: "TaZ<4><STEAM_ID_LAN><CT>" killed "NuLL<9><STEAM_ID_LAN><TERRORIST>" with "awp"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:29: "TaZ<4><STEAM_ID_LAN><CT>" killed "Emo|Jpol<6><STEAM_ID_LAN><TERRORIST>" with "deagle"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:29: Team "CT" triggered "CTs_Win" (CT "13") (T "5")
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:29: World triggered "Round_End"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:39: "theEnd<5><STEAM_ID_LAN><TERRORIST>" say "twa vans?"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:32: "x3non<2><STEAM_ID_LAN><TERRORIST>" say_team "ok" (dead)
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:33: "noob<8><STEAM_ID_LAN><CT>" say "wawa" (dead)
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:34: "x3non<2><STEAM_ID_LAN><TERRORIST>" say_team "aster" (dead)
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:35: World triggered "Round_Start"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:45: "oLi<3><STEAM_ID_LAN><TERRORIST>" say "lol lash sa? :P"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:36: "NuLL<9><STEAM_ID_LAN><TERRORIST>" say_team "pa P bon ditou :S"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:51: "la vie | Ishikawa<6><STEAM_ID_LAN><CT>" say "koter?"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:51: "- IceBladder -<4><STEAM_ID_LAN><CT>" say "OLI"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:54: "oLi<3><STEAM_ID_LAN><TERRORIST>" say "wa?"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:54: "- IceBladder -<4><STEAM_ID_LAN><CT>" say "apache dir toi rode to team"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:57: "- IceBladder -<4><STEAM_ID_LAN><CT>" say "li vini dan 5min"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:50: "TaZ<4><STEAM_ID_LAN><CT>" killed "x3non<2><STEAM_ID_LAN
 
Old 10-08-2007, 05:27 AM   #2
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

Something like this?

Code:
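#!/bin/bash

INFILE="$1"

# Sort the whole file, then let awk append each line to a file named after its IP.
sort "$INFILE" |
awk '
# Split fields on "<", ">" or ":" so that $2 holds the IP address.
BEGIN { FS = "[<>:]" }
{
    print $0 >> $2
}'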
This sorts the input file first; awk then appends each line to a file named after $2, which is the IP address. Awk can handle multiple field separators: FS is set to the character class [<>:], so a field ends at every <, > or :.
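To see what the field separator does, here is a minimal sketch that feeds one line to awk and prints the first three fields. The sample line is only an illustration in the thread's log format, and the brackets are there to make the trailing space in the first field visible.

```shell
#!/bin/bash
# Illustrative sample line in the thread's log format.
line='RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:22: World triggered "Round_End"'

# Split on any single "<", ">" or ":" and show the first three fields.
fields=$(printf '%s\n' "$line" |
    awk 'BEGIN { FS = "[<>:]" } { printf "1=[%s] 2=[%s] 3=[%s]\n", $1, $2, $3 }')
echo "$fields"
```

This should print `1=[RECV ] 2=[41.212.144.207] 3=[27015]`, so the second field is the bare IP address that the script uses as the file name.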

A sample run looks like this:
Code:
$ cat sort.split.infile
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:22: Team "TERRORIST" triggered "Terrorists_Win" (CT "4") (T "2")
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:22: World triggered "Round_End"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:26: "(o.0)_.!.<7><STEAM_ID_LAN><CT>" say "aaaa ti p envi coupe oli" (dead)
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:27: "theEnd<5><STEAM_ID_LAN><TERRORIST>" triggered "Spawned_With_The_Bomb"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:29: "(o.0)_.!.<7><STEAM_ID_LAN><CT>" say "salope vans"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:22: "x3non<2><STEAM_ID_LAN><TERRORIST>" killed "noob<8><STEAM_ID_LAN><CT>" with "galil"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:22: "TaZ<4><STEAM_ID_LAN><CT>" killed "x3non<2><STEAM_ID_LAN><TERRORIST>" with "awp"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:33: World triggered "Round_Start"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:34: "payne<2><STEAM_ID_LAN><TERRORIST>" say " kill oli get 1000 pt XD"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:34: "(o.0)_.!.<7><STEAM_ID_LAN><CT>" say "ti p atane oli"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:27: "TaZ<4><STEAM_ID_LAN><CT>" killed "NuLL<9><STEAM_ID_LAN><TERRORIST>" with "awp"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:29: "TaZ<4><STEAM_ID_LAN><CT>" killed "Emo|Jpol<6><STEAM_ID_LAN><TERRORIST>" with "deagle"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:29: Team "CT" triggered "CTs_Win" (CT "13") (T "5")
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:29: World triggered "Round_End"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:39: "theEnd<5><STEAM_ID_LAN><TERRORIST>" say "twa vans?"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:32: "x3non<2><STEAM_ID_LAN><TERRORIST>" say_team "ok" (dead)
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:33: "noob<8><STEAM_ID_LAN><CT>" say "wawa" (dead)
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:34: "x3non<2><STEAM_ID_LAN><TERRORIST>" say_team "aster" (dead)
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:35: World triggered "Round_Start"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:45: "oLi<3><STEAM_ID_LAN><TERRORIST>" say "lol lash sa? :P"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:36: "NuLL<9><STEAM_ID_LAN><TERRORIST>" say_team "pa P bon ditou :S"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:51: "la vie | Ishikawa<6><STEAM_ID_LAN><CT>" say "koter?"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:51: "- IceBladder -<4><STEAM_ID_LAN><CT>" say "OLI"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:54: "oLi<3><STEAM_ID_LAN><TERRORIST>" say "wa?"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:54: "- IceBladder -<4><STEAM_ID_LAN><CT>" say "apache dir toi rode to team"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:57: "- IceBladder -<4><STEAM_ID_LAN><CT>" say "li vini dan 5min"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:50: "TaZ<4><STEAM_ID_LAN><CT>" killed "x3non<2><STEAM_ID_LAN


$ ./sort.split.sh sort.split.infile

$ ls -l 4*
-rw-r----- 1 druuna internet 1631 Oct  8 12:24 41.212.144.207
-rw-r----- 1 druuna internet 1370 Oct  8 12:24 41.212.158.233

$ cat 41.212.144.207
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:22: Team "TERRORIST" triggered "Terrorists_Win" (CT "4") (T "2")
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:22: World triggered "Round_End"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:26: "(o.0)_.!.<7><STEAM_ID_LAN><CT>" say "aaaa ti p envi coupe oli" (dead)
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:27: "theEnd<5><STEAM_ID_LAN><TERRORIST>" triggered "Spawned_With_The_Bomb"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:29: "(o.0)_.!.<7><STEAM_ID_LAN><CT>" say "salope vans"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:33: World triggered "Round_Start"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:34: "(o.0)_.!.<7><STEAM_ID_LAN><CT>" say "ti p atane oli"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:34: "payne<2><STEAM_ID_LAN><TERRORIST>" say " kill oli get 1000 pt XD"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:39: "theEnd<5><STEAM_ID_LAN><TERRORIST>" say "twa vans?"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:45: "oLi<3><STEAM_ID_LAN><TERRORIST>" say "lol lash sa? :P"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:51: "- IceBladder -<4><STEAM_ID_LAN><CT>" say "OLI"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:51: "la vie | Ishikawa<6><STEAM_ID_LAN><CT>" say "koter?"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:54: "- IceBladder -<4><STEAM_ID_LAN><CT>" say "apache dir toi rode to team"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:54: "oLi<3><STEAM_ID_LAN><TERRORIST>" say "wa?"
RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:57: "- IceBladder -<4><STEAM_ID_LAN><CT>" say "li vini dan 5min"

$ cat 41.212.158.233
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:22: "TaZ<4><STEAM_ID_LAN><CT>" killed "x3non<2><STEAM_ID_LAN><TERRORIST>" with "awp"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:22: "x3non<2><STEAM_ID_LAN><TERRORIST>" killed "noob<8><STEAM_ID_LAN><CT>" with "galil"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:27: "TaZ<4><STEAM_ID_LAN><CT>" killed "NuLL<9><STEAM_ID_LAN><TERRORIST>" with "awp"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:29: "TaZ<4><STEAM_ID_LAN><CT>" killed "Emo|Jpol<6><STEAM_ID_LAN><TERRORIST>" with "deagle"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:29: Team "CT" triggered "CTs_Win" (CT "13") (T "5")
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:29: World triggered "Round_End"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:32: "x3non<2><STEAM_ID_LAN><TERRORIST>" say_team "ok" (dead)
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:33: "noob<8><STEAM_ID_LAN><CT>" say "wawa" (dead)
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:34: "x3non<2><STEAM_ID_LAN><TERRORIST>" say_team "aster" (dead)
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:35: World triggered "Round_Start"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:36: "NuLL<9><STEAM_ID_LAN><TERRORIST>" say_team "pa P bon ditou :S"
RECV <41.212.158.233:27015>: L 10/07/2007 - 13:23:50: "TaZ<4><STEAM_ID_LAN><CT>" killed "x3non<2><STEAM_ID_LAN
Hope this helps.
 
Old 10-08-2007, 08:12 AM   #3
michaeljoser
LQ Newbie
 
Registered: Oct 2007
Distribution: Debian Etch
Posts: 19

Original Poster
Rep: Reputation: 0
Thanks a lot... wow, that's a nice script.
 
Old 10-08-2007, 10:26 AM   #4
michaeljoser
LQ Newbie
 
Registered: Oct 2007
Distribution: Debian Etch
Posts: 19

Original Poster
Rep: Reputation: 0
The script is very nice, but I get a problem when there's a blank line in the log file. I tried this:
Code:
sed '/^$/d' myFile > tt
without success.

Any tips on how to remove the blank lines first? Oh, and one last little thing:

How can we remove the "RECV <xx.xxx.xxx.xxx:99999>:" bit from the new files being created?

Thanks a lot again.

Last edited by michaeljoser; 10-08-2007 at 10:32 AM.
 
Old 10-08-2007, 10:53 AM   #5
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2405
Hi,

Code:
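#!/bin/bash

INFILE="$1"

sort "$INFILE" |
awk '
BEGIN { FS = "[<>:]" }
# Only lines containing RECV are written; blank lines are skipped.
# substr($0, 30) drops the first 29 characters, i.e. the RECV <ip:port>: prefix.
/RECV/ { print substr($0, 30) >> $2 }
'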
The awk statement is altered. It now only acts on lines that contain RECV and ignores all others, which also takes care of the blank lines. If a line matches, everything from character 30 to the end is printed, which removes the "RECV <xx.xxx.xxx.xxx:99999>: " part (29 characters in the sample data).
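The fixed offset of 30 fits the sample data, where every address is 14 characters long and the port has five digits. As an alternative sketch (not what the script above does), the prefix can be removed with a pattern, so the address length no longer matters. The address 10.0.0.1 below is a made-up example.

```shell
#!/bin/bash
# Two illustrative lines whose IP addresses differ in length (10.0.0.1 is made up).
input='RECV <41.212.144.207:27015>: L 10/07/2007 - 13:25:22: World triggered "Round_End"
RECV <10.0.0.1:27015>: L 10/07/2007 - 13:25:23: World triggered "Round_Start"'

# Delete everything up to and including the first ">: " instead of counting characters.
stripped=$(printf '%s\n' "$input" |
    awk '/^RECV </ { sub(/^RECV <[^>]*>: /, ""); print }')
echo "$stripped"
```

Both lines should come out starting at "L 10/07/2007". Note that sub() changes $0 and re-splits the fields, so in the real script the IP address would have to be saved in a variable before the prefix is removed.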

Hope this helps.
 
Old 10-08-2007, 12:37 PM   #6
michaeljoser
LQ Newbie
 
Registered: Oct 2007
Distribution: Debian Etch
Posts: 19

Original Poster
Rep: Reputation: 0
What should I say... PERFECT!!!

Thanks a lot!!!

Plus, the script teaches me a lot more that can be done now. Thanks!!!
 
Old 10-18-2007, 07:11 AM   #7
michaeljoser
LQ Newbie
 
Registered: Oct 2007
Distribution: Debian Etch
Posts: 19

Original Poster
Rep: Reputation: 0
I modified the script this way:
Code:
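#!/bin/bash

INFILE="$1"

sort "$INFILE" |
awk '
BEGIN { FS = "[<>:]" }
# 16 + length($2) is where the text after the RECV <ip:port>: prefix starts (5-digit port assumed).
/RECV/ { print substr($0, 16 + length($2)) >> ($2 ".log") }
'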
But I want to add a number in front of the log file names so that the output files look like this:
10000.111.111.11.11.log
10001.200.111.11.21.log
10002.111.111.11.31.log
10003.111.111.11.41.log
10004.111.111.11.51.log
10005.111.111.11.61.log

Anyone?

Thanks for the help.
 
Old 10-18-2007, 09:10 AM   #8
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
You can use an awk array whose indices are the IP addresses and whose elements are sequential numbers, like this:
Code:
#!/bin/bash

INFILE="$1"

sort "$INFILE" |
awk '
BEGIN { FS = "[<>:]" ; COUNT = 10000 }
/RECV/ {
    # Give each IP address the next number the first time it is seen.
    if ( ! ($2 in prefix) ) prefix[$2] = COUNT++
    print substr($0, 16 + length($2)) >> (prefix[$2] "." $2 ".log")
}
'
Every time a new IP address is encountered, the next number is taken and used as the prefix of that address's log file name.
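A small sketch of the numbering on its own may make the array logic easier to follow. It assumes made-up addresses and prints the file names rather than writing to them:

```shell
#!/bin/bash
# Illustrative, already-sorted input: two made-up addresses, three lines.
input='RECV <10.0.0.1:27015>: L 10/07/2007 - 13:25:22: World triggered "Round_End"
RECV <10.0.0.1:27015>: L 10/07/2007 - 13:25:23: World triggered "Round_Start"
RECV <10.0.0.2:27015>: L 10/07/2007 - 13:23:35: World triggered "Round_Start"'

# Print the file name each line would be appended to, instead of writing files.
names=$(printf '%s\n' "$input" | awk '
BEGIN { FS = "[<>:]"; count = 10000 }
/^RECV/ {
    if (!($2 in prefix)) prefix[$2] = count++   # first sighting: take the next number
    print prefix[$2] "." $2 ".log"
}')
echo "$names"
```

The first two lines share 10000.10.0.0.1.log and the third gets 10001.10.0.0.2.log, because a number is only handed out when an address is not yet in the array.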
 
Old 10-19-2007, 01:50 AM   #9
michaeljoser
LQ Newbie
 
Registered: Oct 2007
Distribution: Debian Etch
Posts: 19

Original Poster
Rep: Reputation: 0
Thanks a lot :P It did the job very nicely.
 
  



Tags
awk, bash, log, sort, text




All times are GMT -5. The time now is 06:28 PM.
