I have a big 450 MB (.csv) file that I need to split up into smaller 2 MB files.
I have run the split cmd like this:
"split -b 2m myfile.csv"
That splits up the main file, but it outputs lots of files that look like this: "xaa, xab, xac, etc." with no extensions. Not only are there no extensions for the 200+ files it created, but it also messed up the formatting.
How do I specify what the filename and extension are to be?
Any help in this matter will be greatly appreciated!
This is all detailed in the man page for split. It has options to define a prefix and a scheme for unique suffixes. Note that Unix doesn't really give much credence to file suffixes; unlike in Windows, they don't mean anything to the system. So while you can provide a suffix which includes a . to make it look like Windows, don't forget that it is not important.
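As a rough sketch of what those options look like in practice: this assumes GNU split, and `part_` is a made-up prefix. The small generated file and the 20K piece size are stand-ins so the example runs quickly (the thread's real case would use `-b 2m`). The `--additional-suffix` option only exists in GNU coreutils 8.16 and later, so it may not be available on older systems.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the real file: 5000 short CSV rows.
seq 1 5000 | sed 's/$/,foo,bar/' > myfile.csv

# -b 20K   : 20 KiB per piece (use 2m for the 450 MB file in this thread)
# -d -a 3  : numeric suffixes, three digits wide (000, 001, ...)
# part_    : output prefix (hypothetical name), replaces the default "x"
# --additional-suffix=.csv : appended after the number (GNU coreutils >= 8.16)
split -b 20K -d -a 3 --additional-suffix=.csv myfile.csv part_

# List the pieces that were created.
ls part_*.csv
```

On a split without `--additional-suffix`, the same naming can be had by renaming the pieces afterwards.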
Thanks, th3gh05t
Code:
man split
It has all the options and it also documents the problems you are having. Basically you are not invoking the command the right way.
OK, well, split doesn't provide for that particular format, but it is certainly simple to rename file001, file002, file003 to file001.csv, file002.csv, etc.:
Code:
for i in file*; do mv "$i" "$i.csv"; done
are you sure that this is the best way to handle csv data though? you should really be looking at copying entire lines surely?
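If whole lines matter, one option is to let split cut on line boundaries instead of at an exact byte count. The following is only a sketch, assuming GNU split: `-C` (`--line-bytes`) packs as many complete lines as fit into each piece. The `rows_` prefix, the sample data and the 20K size are made up for illustration.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in data: a header line plus 5000 rows.
{ echo 'id,name,value'; seq 1 5000 | sed 's/$/,foo,bar/'; } > myfile.csv

# -C 20K: at most 20 KiB of complete lines per piece
# (use -C 2m for the sizes discussed in this thread).
split -C 20K -d -a 3 myfile.csv rows_

# Show how many rows ended up in each piece.
wc -l rows_*
```

Two caveats: only the first piece contains the header line, and a CSV with quoted fields that contain embedded newlines can still be cut in the middle of a record.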
This is the cmd I used:
"split -a 3 -b 2m -d myfile.csv"
And the files came out like this:
x001, x002, x003
Here is the problem though. Because there is no extension given for the files when they are being split, the formatting gets messed up. Even when I rename the file (x001) to (x001.txt) or (x001.csv), gEdit cannot open the files. It says that it doesn't support the character encoding. But (x001.csv) will open up fine in Spreadsheet.
What do the first few lines look like if you run "head x001", for example? The format should just be plain ASCII, or something will have gone wrong. Is the original data readable too? Also, what does "file x001" say?
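The thread never establishes why gedit rejects the pieces. One possibility, and it is only a guess, is that the original file is not plain ASCII and a byte-based split landed in the middle of a multibyte character. The sketch below demonstrates that effect on a tiny made-up UTF-8 sample, using iconv as a validity check alongside the head and file commands suggested above. The file name `sample.csv` and the 4-byte piece size are purely illustrative.

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two rows containing "é", which is two bytes in UTF-8 (octal 303 251).
printf 'caf\303\251,1\ncaf\303\251,2\n' > sample.csv

# Cut every 4 bytes, which separates the two bytes of each "é".
split -b 4 sample.csv

# Check each piece on its own: iconv exits non-zero on invalid input.
for f in x??; do
    if iconv -f UTF-8 -t UTF-8 "$f" > /dev/null 2>&1; then
        echo "$f: valid UTF-8"
    else
        echo "$f: not valid UTF-8 on its own"
    fi
done

# Joined back together, the byte stream is intact again.
cat x?? | iconv -f UTF-8 -t UTF-8 > /dev/null && echo "rejoined: valid UTF-8"
```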
The split command only cuts the file apart. The formatting is still there but only after you join all the files together. You should not make it harder than it really is.
split -b 2m -d -a 3 myfile.csv myfile.csv-
After split, test by doing cat myfile.csv-??? > myfile-splittest.csv
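To make that test conclusive, the rejoined file can be compared with the original byte for byte. A minimal sketch, using a small generated stand-in for myfile.csv and a 20K piece size so it runs quickly:

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the real 450 MB file.
seq 1 5000 | sed 's/$/,foo,bar/' > myfile.csv

# Same invocation as above, with a smaller piece size.
split -b 20K -d -a 3 myfile.csv myfile.csv-

# The glob expands in sorted order (000, 001, ...), so the pieces
# are concatenated in the order they were cut.
cat myfile.csv-??? > myfile-splittest.csv

# cmp prints nothing and exits 0 when the files are identical.
if cmp myfile.csv myfile-splittest.csv; then
    echo "pieces rejoin to the original"
fi
```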