LinuxQuestions.org
Old 06-14-2013, 06:42 PM   #1
cusvenus
LQ Newbie
 
Registered: Jun 2013
Posts: 3

split a file based on column value awk / sed?


I have a file with some data, and one of the columns (say column $4) holds a long integer value. I want to sort the file on that column and then split it into separate files based on ranges of that value.

Example: the first file holds values 100 - 5000 (where x = 100 and y = 5000), and the next file holds x = 5001 to y = 10000.

Can someone please help?

Thanks in advance
 
Old 06-14-2013, 06:49 PM   #2
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.9, Centos 7.3
Posts: 17,396

It'd be a bit easier if you showed a few lines of the file and also explained a bit more about how you want it split.
For a numeric sort on field 4, try
Code:
sort -k4 -n file
http://linux.die.net/man/1/sort
 
Old 06-14-2013, 06:54 PM   #3
cusvenus
LQ Newbie
 
Registered: Jun 2013
Posts: 3

Original Poster
Thanks Chris.

Below is the data.

t1,e1,l1,r1,t2,s1
137,597,LG1,520000,Group 1-1,true
1370,8,JBC,40000,Group 1-1,false
137,597,LG1,2110000,Group 1-1,true
1370,8,JBC,800000,Group 1-1,false
137,597,LG1,210000,Group 1-1,true
1370,8,JBC,2000,Group 1-1,false
137,597,LG1,20800,Group 1-1,true
1370,8,JBC,2808000,Group 1-1,false
137,597,LG1,20700,Group 1-1,true
1370,8,JBC,2803000,Group 1-1,false
137,597,LG1,20400,Group 1-1,true
1370,8,JBC,28010,Group 1-1,false

Say the 4th column is sorted and I want to split the file every 5 lines based on that column.

sed 1d test.csv | sort -k4 -n

1370,8,JBC,2000,Group 1-1,false
1370,8,JBC,28010,Group 1-1,false
1370,8,JBC,2803000,Group 1-1,false
1370,8,JBC,2808000,Group 1-1,false
1370,8,JBC,40000,Group 1-1,false
1370,8,JBC,800000,Group 1-1,false
137,597,LG1,20400,Group 1-1,true
137,597,LG1,20700,Group 1-1,true
137,597,LG1,20800,Group 1-1,true
137,597,LG1,210000,Group 1-1,true
137,597,LG1,2110000,Group 1-1,true
137,597,LG1,520000,Group 1-1,true

Last edited by cusvenus; 06-14-2013 at 06:59 PM.
 
Old 06-15-2013, 03:13 AM   #4
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,530

Split? Into files? Into arrays? With a gap in the output?
 
Old 06-15-2013, 03:36 AM   #5
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian Jessie / sid
Posts: 1,471

Quote:
Originally Posted by cusvenus
Thanks Chris.

Below is the data.

..snip..

Say the 4th column is sorted and I want to split the file every 5 lines based on that column.

So you want to sort on column 4 and then have a new file for every 6 lines?

This might be a coincidence, but looking at your sorted data, I see a pattern:
Code:
1370,8,JBC,2000,Group 1-1,false
1370,8,JBC,28010,Group 1-1,false
1370,8,JBC,2803000,Group 1-1,false
1370,8,JBC,2808000,Group 1-1,false
1370,8,JBC,40000,Group 1-1,false
1370,8,JBC,800000,Group 1-1,false
137,597,LG1,20400,Group 1-1,true
137,597,LG1,20700,Group 1-1,true
137,597,LG1,20800,Group 1-1,true
137,597,LG1,210000,Group 1-1,true
137,597,LG1,2110000,Group 1-1,true
137,597,LG1,520000,Group 1-1,true
Would you want new files based on fields 1, 2 and 3?

edit:
this would do that
Code:
sed 1d test.csv | sort -k4 -n | awk -F, '{print >> ($1 "-" $2 "-" $3 ".csv")}'
No wait, that data is *not* sorted by column 4.
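To see why, here is a rough sketch (not part of the original posts) that recreates the sample rows from post #3 as test.csv in the current directory and sorts them with the comma declared as the field separator. Without -t, sort splits fields on blanks, so these lines have no fourth field at all and the numeric key is empty for every row; the order you get then falls back to a whole-line comparison. The name sorted.csv is just a scratch name chosen for this example.

```shell
# Recreate the sample data from post #3 (run this in an empty scratch directory).
cat > test.csv <<'EOF'
t1,e1,l1,r1,t2,s1
137,597,LG1,520000,Group 1-1,true
1370,8,JBC,40000,Group 1-1,false
137,597,LG1,2110000,Group 1-1,true
1370,8,JBC,800000,Group 1-1,false
137,597,LG1,210000,Group 1-1,true
1370,8,JBC,2000,Group 1-1,false
137,597,LG1,20800,Group 1-1,true
1370,8,JBC,2808000,Group 1-1,false
137,597,LG1,20700,Group 1-1,true
1370,8,JBC,2803000,Group 1-1,false
137,597,LG1,20400,Group 1-1,true
1370,8,JBC,28010,Group 1-1,false
EOF

# sed 1d drops the header row.
# -t,    : use the comma as the field separator
# -k4,4n : the key starts and ends at field 4 and is compared numerically
sed 1d test.csv | sort -t, -k4,4n > sorted.csv

# Print only the sort key to check the order by eye.
cut -d, -f4 sorted.csv
```

The printed column should climb from 2000 up to 2808000, with the JBC and LG1 rows interleaved rather than grouped.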

Last edited by Firerat; 06-15-2013 at 03:57 AM.
 
Old 06-15-2013, 04:16 AM   #6
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian Jessie / sid
Posts: 1,471

Here it is correctly sorted:

Code:
sed 1d test.csv | sort -t, -k4,4n | awk -F, 'NR%5==1{if (File) close(File); File="File" (++i) ".csv"} {print > File}'

This will give you files containing 5 lines each (the last one may hold fewer).

But I'm not certain that is what you want.
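If the goal is closer to the first post (one file per value range of column 4, such as 1 - 5000, then 5001 - 10000), a minimal sketch could compute a bucket number from the value instead of counting lines. This is only an illustration under assumptions: the width of 1000000 is a made-up value picked so the sample rows share buckets (a width of 5000 would put nearly every sample row in its own file), and the range_N.csv names are hypothetical.

```shell
# Recreate the sample data from post #3 (run this in an empty scratch directory).
cat > test.csv <<'EOF'
t1,e1,l1,r1,t2,s1
137,597,LG1,520000,Group 1-1,true
1370,8,JBC,40000,Group 1-1,false
137,597,LG1,2110000,Group 1-1,true
1370,8,JBC,800000,Group 1-1,false
137,597,LG1,210000,Group 1-1,true
1370,8,JBC,2000,Group 1-1,false
137,597,LG1,20800,Group 1-1,true
1370,8,JBC,2808000,Group 1-1,false
137,597,LG1,20700,Group 1-1,true
1370,8,JBC,2803000,Group 1-1,false
137,597,LG1,20400,Group 1-1,true
1370,8,JBC,28010,Group 1-1,false
EOF

# width is an assumed bucket size; change it with -v width=5000 and so on.
sed 1d test.csv | sort -t, -k4,4n |
awk -F, -v width=1000000 '
{
    # Bucket 0 holds 1..width, bucket 1 holds width+1..2*width, and so on.
    bucket = int(($4 - 1) / width)
    out = "range_" bucket ".csv"
    # The input is sorted, so a bucket is complete once the file name changes.
    if (out != prev) {
        if (prev != "") close(prev)
        prev = out
    }
    print > out
}'
```

On the sample data this should leave nine rows in range_0.csv and three in range_2.csv; no range_1.csv is created because no value falls between 1000001 and 2000000.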
 
  



Tags
awk, sed, shell

