LinuxQuestions.org

th3gh05t 09-14-2006 02:06 PM

Splitting files
 
Hi,

I have a big 450 MB (.csv) file that I need to split up into smaller 2 MB files.

I have run the split cmd like this:

"split -b 2m myfile.csv"

That splits up the main file, but it outputs lots of files with names like "xaa", "xab", "xac", etc., with no extensions. Not only are there no extensions on the 200+ files it created, but the formatting also got messed up.

How do I specify what the filename and extension are to be?

Any help in this matter will be greatly appreciated!

Thanks, th3gh05t

acid_kewpie 09-14-2006 02:10 PM

this is all detailed in the manpage for split. there are options given there to define a standard prefix and a scheme for unique suffixes. note that unix doesn't really give much credence to file suffixes; unlike in windows, they don't really mean anything. so while you can provide a suffix which includes a "." to make it look like windows, don't forget that they are not important.
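
For reference, here is a rough sketch of the options being referred to, assuming GNU split. The sample data and the `myfile_` prefix are made up for illustration. The prefix goes after the input file as an extra argument, `-d` switches to numeric suffixes, and `-a` sets the suffix length.

```shell
# Work in a scratch directory with stand-in data (a few MB of fake CSV rows).
cd "$(mktemp -d)"
seq 1 200000 | sed 's/$/,sample,row/' > myfile.csv

# General shape: split [OPTIONS] INPUT PREFIX
#   -b 2m   start a new output file every 2 MiB
#   -d      numeric suffixes instead of aa, ab, ...
#   -a 3    three-character suffixes
split -b 2m -d -a 3 myfile.csv myfile_

ls myfile_*    # names of the form myfile_000, myfile_001, ...
```

Without a prefix argument, split falls back to `x`, which is where names like xaa and xab come from.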

tuxrules 09-14-2006 02:13 PM

Code:

man split

It has all the options, and it also documents the problems you are having. Basically, you are not invoking the command the right way.

Tux,

th3gh05t 09-14-2006 02:15 PM

Yes, I have looked at that many times, but I still cannot figure out what I need to input.

If someone could just post the command, that would be great!

Thanks!

acid_kewpie 09-14-2006 02:59 PM

well why not give us an exact example list of filenames you want to end up with and we'll see if we can match them.

th3gh05t 09-14-2006 03:09 PM

001.csv, 002.csv, 003.csv, etc

acid_kewpie 09-14-2006 04:16 PM

ok, well split doesn't provide for that particular format, but it's certainly simple to rename file001, file002, file003 to file001.csv, file002.csv etc.

Code:

for i in file*; do mv "$i" "$i.csv"; done
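
To get names like the ones requested (digits plus `.csv`), one option is to split with a throwaway prefix and strip it while renaming. This is only a sketch: the `part-` prefix and the sample data are hypothetical, and `-d` numbering starts at 000 rather than 001. Newer GNU coreutils releases, later than this thread, also give split an `--additional-suffix` option that can append the extension directly.

```shell
cd "$(mktemp -d)"
seq 1 200000 | sed 's/$/,sample,row/' > myfile.csv   # stand-in data

# "part-" is a hypothetical temporary prefix.
split -b 2m -d -a 3 myfile.csv part-

# Strip the prefix and add the extension: part-000 becomes 000.csv, and so on.
for f in part-[0-9][0-9][0-9]; do
    mv -- "$f" "${f#part-}.csv"
done

ls
```

Matching `part-[0-9][0-9][0-9]` rather than `part-*` keeps the loop from renaming anything it has already renamed if it is run twice.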

are you sure that this is the best way to handle csv data though? you should really be looking at copying entire lines surely?
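
The point about whole lines can be sketched as follows, assuming GNU split: `-C` (`--line-bytes`) fills each output file with as many complete lines as fit in the given size, so no row is cut in half. The data and the `rows-` prefix are placeholders. This is still line-based rather than CSV-aware: a quoted field containing a newline could be split, and only the first piece would keep a header row.

```shell
cd "$(mktemp -d)"
seq 1 200000 | sed 's/$/,sample,row/' > myfile.csv   # stand-in data

# At most 2 MiB per piece, breaking only at line ends.
split -C 2m -d -a 3 myfile.csv rows-

# Show the last row of each piece; each one should be a complete line.
for f in rows-*; do
    tail -n 1 "$f"
done
```

If a fixed number of rows per file matters more than the byte size, `split -l N` splits every N lines instead.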

th3gh05t 09-14-2006 04:24 PM

Hi,

This is the cmd I used:
"split -a 3 -b 2m -d myfile.csv"

And the files came out like this:
x001, x002, x003

Here is the problem though. Because there is no extension given for the files when they are being split, the formatting gets messed up. Even when I rename the file (x001) to (x001.txt) or (x001.csv), gEdit cannot open the files. It says that it doesn't support the character encoding. But (x001.csv) will open up fine in Spreadsheet.

What should I do?

acid_kewpie 09-14-2006 04:31 PM

what do the first few lines look like if you run "head x001", for example? the format should just be plain ascii, or something will have gone wrong. is the original data readable too? also, what does "file x001" say?

th3gh05t 09-14-2006 04:39 PM

Hi,

file x000:
Code:

x000: MPEG ADTS, layer I, v1,  96 kBits, 44.1 kHz, Stereo

I'm sorry, but I cannot post what shows up when I run "head x000". It is sensitive client information.
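
One possible reading of that result, which the thread does not confirm: the bytes FF FE at the start of a file are the UTF-16 little-endian byte order mark, and the same bit pattern looks like an MPEG audio frame header to `file`. If the CSV was exported as UTF-16 (an assumption), that would fit both the odd `file` output and the editor's encoding complaint. The sketch below checks the leading bytes without printing any readable content and converts a file to UTF-8; `sample.csv` is a made-up stand-in.

```shell
cd "$(mktemp -d)"

# Stand-in for a UTF-16LE export: BOM (FF FE), then "1,a" and a newline,
# two bytes per character.
printf '\377\376\061\000\054\000\141\000\012\000' > sample.csv

# Show only the first two bytes in hex; "ff fe" would indicate a UTF-16LE BOM.
head -c 2 sample.csv | od -An -tx1

# Convert to UTF-8 (iconv uses the BOM to pick the byte order).
iconv -f UTF-16 -t UTF-8 sample.csv > sample-utf8.csv
cat sample-utf8.csv
```

If the original file does turn out to be UTF-16, converting it first and splitting the UTF-8 copy would give pieces that plain-text tools can read.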

Electro 09-14-2006 05:03 PM

The split command only cuts the file apart. The formatting is still there but only after you join all the files together. You should not make it harder than it really is.

split -b 2m -d -a 3 myfile.csv myfile.csv-

After split, test by doing cat myfile.csv-??? > myfile-splittest.csv
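
The round-trip test above can be extended with `cmp`, which compares the rejoined file with the original byte for byte. A sketch using stand-in data; the `???` glob expands in sorted order, which matches the split order because the suffixes have a fixed width.

```shell
cd "$(mktemp -d)"
seq 1 200000 | sed 's/$/,sample,row/' > myfile.csv   # stand-in data

split -b 2m -d -a 3 myfile.csv myfile.csv-
cat myfile.csv-??? > myfile-splittest.csv

# cmp prints nothing and exits 0 when the two files are identical.
if cmp myfile.csv myfile-splittest.csv; then
    echo "pieces rejoin to the original"
fi
```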

muha 09-15-2006 04:46 AM

What about the original file? What format is that when you do:
Code:

file myfile.csv


All times are GMT -5.