LinuxQuestions.org
Old 09-14-2006, 02:06 PM   #1
th3gh05t
LQ Newbie
 
Registered: Mar 2005
Distribution: Ubuntu
Posts: 9

Rep: Reputation: 0
Splitting files


Hi,

I have a big 450 MB (.csv) file that I need to split up into smaller 2 MB files.

I have run the split cmd like this:

"split -b 2m myfile.csv"

That splits up the main file, but it outputs lots of files that look like this, "xaa, xab, xac, etc" with no extensions. Not only are there no extensions for the 200+ files it created, but it also messed up the formatting.

How do I specify what the filename and extension are to be?

Any help in this matter will be greatly appreciated!

Thanks, th3gh05t
 
Old 09-14-2006, 02:10 PM   #2
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1985
this is all detailed in the manpage for split. there are options to define a standard prefix and a scheme for unique suffixes given there. note that unix doesn't really give much credence to file suffixes; they don't really mean anything, unlike in windows. so while you can provide a suffix which includes a . to make it look like windows, don't forget they are not important
 
Old 09-14-2006, 02:13 PM   #3
tuxrules
Senior Member
 
Registered: Jun 2004
Location: Chicago
Distribution: Slackware64 -current
Posts: 1,158

Rep: Reputation: 62
Quote:
Originally Posted by th3gh05t
Hi,

I have a big 450 MB (.csv) file that I need to split up into smaller 2 MB files.

I have run the split cmd like this:

"split -b 2m myfile.csv"

That splits up the main file, but it outputs lots of files that look like this, "xaa, xab, xac, etc" with no extensions. Not only are there no extensions for the 200+ files it created, but it also messed up the formatting.

How do I specify what the filename and extension are to be?

Any help in this matter will be greatly appreciated!

Thanks, th3gh05t
Code:
man split
It has all the options and it also documents the problems you are having. Basically you are not invoking the command the right way.

Tux,
 
Old 09-14-2006, 02:15 PM   #4
th3gh05t
LQ Newbie
 
Registered: Mar 2005
Distribution: Ubuntu
Posts: 9

Original Poster
Rep: Reputation: 0
Yes, I have looked at that many times, but I can still not figure out what I need to input.

If someone could just post the command, that would be great!

Thanks!
 
Old 09-14-2006, 02:59 PM   #5
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1985
well why not give us an exact example list of filenames you want to end up with and we'll see if we can match them.
 
Old 09-14-2006, 03:09 PM   #6
th3gh05t
LQ Newbie
 
Registered: Mar 2005
Distribution: Ubuntu
Posts: 9

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by acid_kewpie
well why not give us an exact example list of filenames you want to end up with and we'll see if we can match them.
001.csv, 002.csv, 003.csv, etc
 
Old 09-14-2006, 04:16 PM   #7
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1985
ok, well split doesn't provide for that particular format, but it's certainly simple to rename file001, file002, file003 to file001.csv, file002.csv etc....
Code:
for i in file*; do mv "$i" "$i.csv"; done

are you sure that this is the best way to handle csv data though? you should really be looking at copying entire lines surely?
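
A minimal sketch of the "entire lines" idea, assuming GNU coreutils split (the -C/--line-bytes and -d options are GNU extensions). The sample data, the part_ prefix and the 4k size are made up for illustration; for the real file the size would be 2m.

```shell
#!/bin/sh
# Sketch: split a CSV into size-limited pieces without cutting a row in half.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data standing in for myfile.csv.
i=1
while [ "$i" -le 1000 ]; do
    echo "$i,customer-$i,some value"
    i=$((i + 1))
done > myfile.csv

# -C (--line-bytes) puts as many complete lines as fit into each piece.
# 4k is used here only so the small sample yields several pieces.
split -C 4k -d -a 3 myfile.csv part_

# Add the .csv extension afterwards; the quotes guard against odd filenames.
for f in part_*; do
    mv "$f" "$f.csv"
done

ls
```

Each piece then ends on a row boundary, so it can be opened on its own. Note that only the first piece contains the header row, if the CSV has one.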
 
Old 09-14-2006, 04:24 PM   #8
th3gh05t
LQ Newbie
 
Registered: Mar 2005
Distribution: Ubuntu
Posts: 9

Original Poster
Rep: Reputation: 0
Hi,

This is the cmd I used:
"split -a 3 -b 2m -d myfile.csv"

And the files came out like this:
x001, x002, x003

Here is the problem though. Because there is no extension given for the files when they are being split, the formatting gets messed up. Even when I rename the file (x001) to (x001.txt) or (x001.csv), gedit cannot open the files. It says that it doesn't support the character encoding. But (x001.csv) will open up fine in Spreadsheet.

What should I do?
 
Old 09-14-2006, 04:31 PM   #9
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1985
what do the first few lines look like if you run "head x001" for example. the format should just be plain ascii or something will have gone wrong. the original data is readable too? also what does "file x001" say?
 
Old 09-14-2006, 04:39 PM   #10
th3gh05t
LQ Newbie
 
Registered: Mar 2005
Distribution: Ubuntu
Posts: 9

Original Poster
Rep: Reputation: 0
Hi,

file x000:
Code:
x000: MPEG ADTS, layer I, v1,  96 kBits, 44.1 kHz, Stereo
I'm sorry, but I cannot post what shows up when I run "head x000". It is sensitive client information.
 
Old 09-14-2006, 05:03 PM   #11
Electro
LQ Guru
 
Registered: Jan 2002
Posts: 6,042

Rep: Reputation: Disabled
The split command only cuts the file apart. The formatting is still there but only after you join all the files together. You should not make it harder than it really is.

split -b 2m -d -a 3 myfile.csv myfile.csv-

After split, test by doing cat myfile.csv-??? > myfile-splittest.csv
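
The round trip above can be sketched as follows, assuming GNU coreutils. The sample file is a made-up stand-in for myfile.csv, and 2k replaces 2m only to suit its small size.

```shell
#!/bin/sh
# Sketch: split a file into byte-sized pieces, rejoin them, and compare.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for myfile.csv: 10 KiB of sample rows.
yes 'id,name,value' | head -c 10240 > myfile.csv

# Same flags as in the post, with 2k instead of 2m.
split -b 2k -d -a 3 myfile.csv myfile.csv-

# The shell expands ??? in sorted order, so the pieces are joined in sequence.
cat myfile.csv-??? > myfile-splittest.csv

# cmp is silent and exits 0 when the two files are byte-for-byte identical.
cmp myfile.csv myfile-splittest.csv && echo "pieces rejoin cleanly"
```

If cmp reports no difference, split has not altered the data; any piece that looks broken on its own was simply cut in the middle of a row or character.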
 
Old 09-15-2006, 04:46 AM   #12
muha
Member
 
Registered: Nov 2005
Distribution: xubuntu, grml
Posts: 451

Rep: Reputation: 38
What about the original file? What format is that when you do:
Code:
file myfile.csv
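
One possible explanation for the odd "MPEG ADTS" result, which the thread never confirms, is that myfile.csv is stored as UTF-16 rather than plain ASCII: such a file can begin with the bytes ff fe, which file may mistake for an audio header. A minimal sketch for checking the first bytes and converting to UTF-8 before splitting, assuming od and iconv are available. The sample file and the myfile-utf8.csv name are made up for illustration.

```shell
#!/bin/sh
# Sketch: look for a UTF-16LE byte order mark and convert to UTF-8 if found.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample: a small CSV stored as UTF-16LE with a byte order mark.
printf '\377\376' > myfile.csv
printf '1,alice\n2,bob\n' | iconv -f UTF-8 -t UTF-16LE >> myfile.csv

# Show the first two bytes in hex; "fffe" is the UTF-16LE byte order mark.
bom=$(head -c 2 myfile.csv | od -An -tx1 | tr -d ' \n')
echo "first bytes: $bom"

# If the mark is present, convert to UTF-8 so plain-text tools can read it;
# otherwise keep the file as it is.
if [ "$bom" = "fffe" ]; then
    iconv -f UTF-16 -t UTF-8 myfile.csv > myfile-utf8.csv
else
    cp myfile.csv myfile-utf8.csv
fi

cat myfile-utf8.csv
```

If the real file does turn out to be UTF-16, splitting the converted copy on line boundaries should give pieces that a text editor can open.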
 
  

