Old 01-01-2017, 11:58 AM   #1
grumpyskeptic
Member
 
Registered: Apr 2016
Posts: 291

Rep: Reputation: Disabled
Anything that can easily manipulate Comma Separated Value (.csv) files?


Some years ago I wrote a lengthy GWbasic (i.e. ancient MS-DOS) program to manipulate a large, multi-column Comma Separated Value (.csv) file that I need to use.

I have managed to get GWbasic running under Linux in DOSBox, but I have been unable to get PC-BASIC, a GWbasic clone, to run on my computer. Perhaps it is time to write something that runs directly in Linux.

The GWbasic program does things like remove columns, rearrange the order of columns, sort the whole file on the contents of one particular column, and add up the values in one column for rows where another column holds the same value.

Is there any program with scripting that can do this? Or which language would be easiest to write it in?

Thanks
 
Old 01-01-2017, 12:01 PM   #2
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 13,579

Rep: Reputation: 4342
You can use a lot of languages for this, such as awk, perl, or python. The easiest one is the one you already know, though it also depends on how complex your original program is.
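To give a sense of scale, the "add up one column grouped by another" step from the original program comes out to just a line or two in any of those languages. A rough perl sketch, assuming a plain comma-only file (no quoted fields) and made-up column positions (group on column 1, sum column 3):
Code:
# sum the 3rd field for every distinct value of the 1st field
perl -F',' -lane '$sum{$F[0]} += $F[2];
                  END { print "$_,$sum{$_}" for sort keys %sum }' input.csv
Sorting the whole file on one column can likewise be handed off to sort, e.g. sort -t, -k2,2 input.csv.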
 
Old 01-01-2017, 03:26 PM   #3
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 22,314

Rep: Reputation: 6016
Quote:
Originally Posted by grumpyskeptic View Post
Some years ago I wrote a lengthy GWbasic (i.e. ancient MS-DOS) program to manipulate a large, multi-column Comma Separated Value (.csv) file that I need to use.

I have managed to get GWbasic running under Linux in DOSBox, but I have been unable to get PC-BASIC, a GWbasic clone, to run on my computer. Perhaps it is time to write something that runs directly in Linux.

The GWbasic program does things like remove columns, rearrange the order of columns, sort the whole file on the contents of one particular column, and add up the values in one column for rows where another column holds the same value.

Is there any program with scripting that can do this? Or which language would be easiest to write it in?
The 'easiest' answer is the one pan64 gave you... it's whatever you're comfortable with. And before I could answer this, I'd ask some questions, like how big this CSV file is (rows/columns), how often you need to manipulate it, how FAST you need it manipulated, and most importantly, how it's generated. Because:
  • It could very well be that whatever is generating this file can be set to output things the way you want them, eliminating the need to manipulate the CSV at all.
  • If it's HUGE (like hundreds of columns/tens-of-thousands of rows), then language choice makes a difference.
  • If you only have to do this a few times a year, why bother with a program at all? Open the file in a LibreOffice spreadsheet, delete/sort/total whatever you want, and save it back out. You could even write a macro to do this, if it's always a set number of steps/moves/etc.
Personally, I'd do it in Perl, because that's my typical go-to language for text manipulation; as pan64 alluded to, go with what you know best.
 
Old 01-01-2017, 05:44 PM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 18,479

Rep: Reputation: 3096
Nice idea, TB0ne - an ex-BASIC programmer should be able to handle spreadsheet macros pretty naturally.
As for language, here's a vote for awk if the OP hasn't already learned a language on Linux. I found the learning curve much easier than perl's and the day-to-day usage more approachable. My own use of perl has dropped dramatically over the years in favour of awk.
 
Old 01-01-2017, 07:21 PM   #5
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 22,314

Rep: Reputation: 6016
Quote:
Originally Posted by syg00 View Post
Nice idea, TB0ne - an ex-BASIC programmer should be able to handle spreadsheet macros pretty naturally.
As for language, here's a vote for awk if the OP hasn't already learned a language on Linux. I found the learning curve much easier than perl's and the day-to-day usage more approachable. My own use of perl has dropped dramatically over the years in favour of awk.
Really? That's interesting, because I find perl MUCH easier to work in, mainly because of the huge library of modules you can get from CPAN to cookie-cutter a lot of what you need. There are plenty of modules already written for handling CSV files, which deal with double/single quotes and the like, so you don't have to re-invent the wheel.

I use awk sparingly, but (as you probably do with awk) I have a library of perl code written over the years that I can re-use easily. Once you have that library built, it's hard to avoid using it.
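To make the quoting point concrete, here is a small sketch with a hypothetical row (not from the OP's data): a naive comma split cuts inside quoted fields, while a CSV-aware module such as Text::CSV returns the intended fields.
Code:
use strict;
use warnings;
use Text::CSV;

my $line = '"Smith, John",42,"Widgets, large"';

# naive split on commas cuts inside the quoted fields: 5 pieces
my @naive = split /,/, $line;
print scalar(@naive), " fields via split\n";

# Text::CSV honours the quoting and returns the intended 3 fields
my $csv = Text::CSV->new({ binary => 1 });
$csv->parse($line) or die "parse failed\n";
my @fields = $csv->fields;
print scalar(@fields), " fields via Text::CSV\n";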
 
Old 01-01-2017, 07:49 PM   #6
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 18,479

Rep: Reputation: 3096
All good points - I take it you haven't moved on to perl6 then? ...
A new user might think "perl6 must be newer/better than perl5, I'll use that". Now that would be an interesting introduction to languages.
 
Old 01-01-2017, 07:55 PM   #7
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 22,314

Rep: Reputation: 6016
Quote:
Originally Posted by syg00 View Post
All good points - I take it you haven't moved on to perl6 then? ...
A new user might think "perl6 must be newer/better than perl5, I'll use that". Now that would be an interesting introduction to languages.
Well... not just yet; I'm waiting. awk is great, but it's just not something I've spent time learning as well as I could. As the old adage goes, "if all you have is a hammer, everything looks like a nail".

But as far as the OP's question goes, it would (to me) depend on how often this has to be done and the conditions it has to run under. If this is a CSV that's shoveled over from a vendor daily, then it's something that needs to be automated with a proper program. But once a quarter? Overkill... use Excel/LibreOffice to deal with it in a few minutes and be done. Code a program to do it on a rainy day, when you're out of things to do.
 
Old 01-02-2017, 07:50 AM   #8
Sector11
Member
 
Registered: Feb 2010
Distribution: BunsenLabs (Debian Stable)
Posts: 132

Rep: Reputation: Disabled
How often do you need to do this, and does the file get overwritten by the program that creates it?

A search for:
Code:
unix|linux manipulate csv files
came up with some nice awk|sed|perl options in just the first three hits.

I guess "column" isn't what is wanted. From the man page:
Code:
NAME
     column - columnate lists

SYNOPSIS
     column [-entx] [-c columns] [-s sep] [file ...]

DESCRIPTION
     The column utility formats its input into multiple columns.  Rows are filled before columns.  Input is taken from file operands, or,
     by default, from the standard input.  Empty lines are ignored unless the -e option is used.

     The options are as follows:
This did a nice job on a test file:
Code:
column -t -s , /media/5/Documents/Text/packages-rev-cols.txt > /media/5/Documents/Text/packages-rev-cols-output.txt
See the results.
It should be noted that I'm "computer language" challenged: I can tweak scripts, but I have problems creating them.
 
Old 01-02-2017, 08:02 AM   #9
Turbocapitalist
Senior Member
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 4,430
Blog Entries: 3

Rep: Reputation: 2206
My 2 cents are for perl; it's quick to make short scripts with. The split() function is what I'd look at for splitting the columns.

Edit: if the data is more complicated, you might use CPAN's Parse::CSV or Text::CSV.
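A minimal sketch of the split() approach, assuming a file with no quoted fields containing commas (the filename and column positions below are made up):
Code:
#!/usr/bin/perl
use strict;
use warnings;

open my $fh, '<', 'input.csv' or die "input.csv: $!";
while (my $line = <$fh>) {
    chomp $line;
    my @cols = split /,/, $line;            # simple comma split
    print join(',', @cols[2, 0, 1]), "\n";  # e.g. drop and reorder columns
}
close $fh;
As soon as fields can contain quoted commas, one of the modules mentioned above (Parse::CSV or Text::CSV) is the safer route.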

Last edited by Turbocapitalist; 01-02-2017 at 08:07 AM.
 
Old 01-02-2017, 12:13 PM   #10
TB0ne
LQ Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 22,314

Rep: Reputation: 6016Reputation: 6016Reputation: 6016Reputation: 6016Reputation: 6016Reputation: 6016Reputation: 6016Reputation: 6016Reputation: 6016Reputation: 6016Reputation: 6016
Quote:
Originally Posted by Turbocapitalist View Post
My 2 cents are for perl; it's quick to make short scripts with. The split() function is what I'd look at for splitting the columns.

Edit: if the data is more complicated, you might use CPAN's Parse::CSV or Text::CSV.
Agreed, and the Text::CSV module is the one I'd use.
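For what it's worth, a minimal Text::CSV sketch covering the column dropping/reordering and the grouped totals from the original program; the file names and column indexes are placeholders, not anything from the OP's data:
Code:
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new({ binary => 1, auto_diag => 1, eol => "\n" });

open my $in,  '<', 'input.csv'  or die "input.csv: $!";
open my $out, '>', 'output.csv' or die "output.csv: $!";

my %total;
while (my $row = $csv->getline($in)) {
    # keep columns 3, 1 and 2 (0-based indexes), in that order
    $csv->print($out, [ @{$row}[2, 0, 1] ]);

    # running total of column 4, grouped by column 1
    $total{ $row->[0] } += $row->[3];
}
close $in;
close $out;

print "$_,$total{$_}\n" for sort keys %total;
Sorting on one column could then be done on the output with sort, or by collecting the rows in an array and sorting them in perl before writing.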
 
  

