Splitting files
Hi,
I have a large 450 MB .csv file that I need to split into smaller 2 MB files. I ran the split command like this: "split -b 2m myfile.csv". That splits the main file, but it outputs lots of files named "xaa, xab, xac, etc." with no extensions. Not only are there no extensions on the 200+ files it created, but the formatting also got messed up. How do I specify the filename and extension to use? Any help in this matter will be greatly appreciated! Thanks, th3gh05t |
This is all detailed in the manpage for split. There are options given there to define a standard prefix and a scheme for unique suffixes. Note that Unix doesn't really give much credence to file suffixes; unlike in Windows, they don't really mean anything. So while you can provide a suffix that includes a . to make it look like Windows, don't forget that suffixes are not important.
|
Quote:
Code:
man split Tux, |
Yes, I have looked at that many times, but I still cannot figure out what I need to input.
If someone could just post the command, that would be great! Thanks! |
Well, why not give us an exact example list of the filenames you want to end up with, and we'll see if we can match them.
|
Quote:
|
OK, well, split doesn't provide for that particular format, but it's certainly simple to rename file001, file002, file003 to file001.csv, file002.csv, etc.: [code]for i in file*; do mv "$i" "$i.csv"; done[/code]
Are you sure that this is the best way to handle CSV data, though? Surely you should be looking at copying entire lines? |
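To illustrate the "entire lines" point, here is a minimal sketch, assuming GNU coreutils split (for the -d flag). It splits by line count rather than by bytes, so no CSV row is cut in half, and then adds the .csv extension with the rename loop. The sample data, the "part_" prefix and the 4-line chunk size are made up for the example.

```shell
# Build a small sample CSV so the sketch is self-contained (hypothetical data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 10 | sed 's/$/,foo,bar/' > myfile.csv

# Split on line boundaries: 4 lines per piece, numeric 3-character suffixes,
# output names starting with the (assumed) prefix "part_".
split -l 4 -d -a 3 myfile.csv part_

# Add the .csv extension afterwards; the quotes protect unusual filenames.
for f in part_[0-9][0-9][0-9]; do
    mv "$f" "$f.csv"
done

# List the result: the original plus part_000.csv, part_001.csv, part_002.csv.
ls
```

For the real 450 MB file, GNU split also has -C SIZE (--line-bytes), which caps each piece at a byte size while keeping lines whole, and recent versions have --additional-suffix=.csv, which would make the rename loop unnecessary. Check "man split" on your system to see whether your version supports them.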
Hi,
This is the command I used: "split -a 3 -b 2m -d myfile.csv", and the files came out like this: x001, x002, x003. Here is the problem, though. Because no extension is given to the files when they are split, the formatting gets messed up. Even when I rename the file (x001) to (x001.txt) or (x001.csv), gedit cannot open the files. It says that it doesn't support the character encoding. But (x001.csv) will open up fine in Spreadsheet. What should I do? |
What do the first few lines look like if you run "head x001", for example? The format should just be plain ASCII, or something will have gone wrong. Is the original data readable too? Also, what does "file x001" say?
|
Hi,
file x000: Code:
x000: MPEG ADTS, layer I, v1, 96 kBits, 44.1 kHz, Stereo |
The split command only cuts the file apart. The formatting is still there but only after you join all the files together. You should not make it harder than it really is.
split -b 2m -d -a 3 myfile.csv myfile.csv-
After the split, test by doing:
cat myfile.csv-??? > myfile-splittest.csv |
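The round trip described above can be checked byte for byte with cmp. This is only a sketch, assuming GNU split; the 5000-byte sample file and the 2k chunk size are stand-ins so it runs quickly, not the real 450 MB data.

```shell
# Create a throwaway 5000-byte sample file (hypothetical stand-in data).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
head -c 5000 /dev/zero | tr '\0' 'a' > myfile.csv

# Split into 2 KiB pieces named myfile.csv-000, myfile.csv-001, ...
split -b 2k -d -a 3 myfile.csv myfile.csv-

# Rejoin the pieces; the shell expands ??? in sorted order, so they
# are concatenated in the same order they were cut.
cat myfile.csv-??? > myfile-splittest.csv

# cmp -s is silent and exits 0 only when the two files are identical.
if cmp -s myfile.csv myfile-splittest.csv; then
    echo "identical"
else
    echo "different"
fi
```

If the rejoined file is identical to the original, split has not damaged anything; the individual pieces just are not guaranteed to start or end on a line boundary.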
What about the original file? What format is that when you do:
Code:
file myfile.csv |