Anything that can easily manipulate Comma Separated Value (.csv) files?
Some years ago I wrote a lengthy GW-BASIC (i.e., ancient MS-DOS) program to manipulate a large multi-column Comma Separated Value (.csv) file that I need to use.
I have managed to get GW-BASIC running in Linux under DOSBox, but I have been unable to get a GW-BASIC clone, PC-BASIC, to run on my computer. Perhaps it is time to write something that runs directly in Linux.
The GWbasic program does things like remove columns, re-arrange the order of columns, sort the whole file according to the contents of one particular column, and add up the values of a particular column when the values in another column are the same.
Is there any program with scripting that can do this? Or what computer language would be easiest to write this in?
You can use a lot of languages, like awk, Perl, or Python. The easiest one is the one you already know, but it also depends on how complex your original program is.
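To give you a taste, the column-dropping/reordering part is a one-liner in awk. A minimal sketch (assuming a plain CSV with no commas hiding inside quoted fields, and a made-up input.csv) that keeps only columns 3 and 1, in that order:
Code:
# -F, splits input on commas; OFS=, joins the output fields with commas
awk -F, -v OFS=, '{ print $3, $1 }' input.csv > reordered.csv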
The 'easiest' answer is the one pan64 gave you...it's whatever you're comfortable with. And before I could answer this, I'd ask some questions...like how big this CSV file is (rows/columns), how often you need to manipulate it, how FAST you need it manipulated, and most importantly, how it's generated. Because:
It could very well be that whatever is generating this file can be set to output things the way you want them, and eliminate the need to manipulate the CSV at all.
If it's HUGE (like hundreds of columns/tens-of-thousands of rows), then language choice makes a difference.
If you only have to do this a few times a year, why bother with a program at all? Open the file in a LibreOffice spreadsheet, delete/sort/total whatever you want, and save it back out. You could even write a macro to do this, if it's always a set number of steps/moves/etc.
Personally, I'd do it in Perl, because that's my typical go-to language for text manipulation, and as pan64 alluded to, go with what you know best.
Nice idea TBone - an ex-basic programmer should be able to handle spreadsheet macros pretty naturally.
As for language, here's a vote for awk if the OP doesn't already have a go-to language on Linux. I found the learning curve much easier than Perl's, and the day-to-day usability more accessible. My usage of Perl has dropped dramatically over the years in favour of awk.
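For instance, the OP's "add up one column wherever another column matches" requirement is a single associative array in awk. A sketch, assuming no quoted/embedded commas, with the key in field 1 and the values to sum in field 2 (both hypothetical choices):
Code:
# accumulate field 2 per distinct field 1, print the totals at the end
# (output order is arbitrary; pipe through sort if that matters)
awk -F, '{ sum[$1] += $2 } END { for (k in sum) print k "," sum[k] }' input.csv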
Really? That's interesting, because I find Perl MUCH easier to work in, mainly because of the huge library of modules on CPAN that can cookie-cutter a lot of what you need. There are plenty of modules already written to handle CSV files (Text::CSV, for one), which deal with double/single quotes, etc., without having to re-invent the wheel.
I use awk sparingly, but (like you probably do with awk) I have a library of Perl code I've written over the years that I can re-use easily. Once you've built that library, it's hard to avoid using it.
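That said, if anyone wants to stay in awk for CSVs with quoted fields, GNU awk's FPAT variable gets you part of the way. A rough sketch, not a full CSV parser:
Code:
# gawk only: a field is either a bare run of non-commas or a "quoted" chunk
gawk -v FPAT='([^,]+)|("[^"]*")' '{ print $2 }' input.csv
It still won't handle embedded newlines or escaped quotes, which is exactly where the CPAN modules earn their keep.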
All good points - I take it you haven't moved on to perl6 then ? ....
A new user might think "perl6 must be newer/better than perl5, I'll use that". Now that would be an interesting introduction to languages.
Well...not just yet. I'm waiting. awk is great, but it's just not something I've spent the time to learn as well as I could. Like the old adage, "if all you have is a hammer, everything looks like a nail".
But as far as the OP's question goes, it would (to me) depend on how often this has to be done, and the conditions it has to run under. If this is a CSV that's shoveled over from a vendor daily, then that's something that needs to be automated with a proper program. But once a quarter? Overkill...use Excel/LibreOffice to deal with it in a few minutes and be done. Code a program to do it on a rainy day, when you're out of things to do.
How often does this need to be done, and does the file get overwritten by the program that creates it?
A search for:
Code:
unix|linux manipulate csv files
came up with some nice awk|sed|perl options in just the first three hits.
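For example, the OP's "sort the whole file on one column" requirement doesn't even need a script. A sketch with plain sort, again assuming no embedded commas (the column number and file name are made up):
Code:
# sort numerically on the 3rd comma-separated field
sort -t, -k3,3n input.csv > sorted.csv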
I guess "column" isn't what is wanted. From the man page:
Code:
NAME
     column — columnate lists

SYNOPSIS
     column [-entx] [-c columns] [-s sep] [file ...]

DESCRIPTION
     The column utility formats its input into multiple columns. Rows are
     filled before columns. Input is taken from file operands, or, by
     default, from the standard input. Empty lines are ignored unless the
     -e option is used.

     The options are as follows: