LinuxQuestions.org
Visit the LQ Articles and Editorials section
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices



Reply
 
Search this Thread
Old 11-07-2007, 11:25 PM   #1
donv2
Member
 
Registered: Nov 2004
Location: Upper right corner of USA
Distribution: Ubuntu/Mint, unSLUng (NSLU2), Arch/PlugApps (Dockstar)
Posts: 50

Rep: Reputation: 15
.csv file upoload, manipulation, download script help


Hi, looking for any helpful pointers or straight out code. I'm a basic hack at programming/scripting but can generally work with examples well to reach the final desired result. I've been googling for hours but have not found a solution that is close yet.

I'm trying to help a friend that has a repeating task of exporting data from her palm pilot (to a .csv file) and going through some manipulation to support printing out mailing labels of the data. The work can be done through a manual spreadsheet manipulation (concatenating some fields), but let's just say that computer skills are not her best strength and I want to get out of the 11pm phone call business.

What I'd like to do is host a script on my shared hosting linux box that would allow her to upload a .csv file, have it automatically concatenate the few fields required, and then provide a download of the resulting .csv file for importing into her label application.

Format of original file:
Code:
"Last","First","Street","Town","ST","Zip"
"Bush","George","1600 Pennsylvania Avenue NW","Washington","DC","20500"
Format of desired output file:
Code:
"Last","First","Street","Town","ST","Zip","Name","city st zip"
"Bush","George","1600 Pennsylvania Avenue NW","Washington","DC","20500","George Bush","Washington, DC  20500"
Resources available on my host include perl 4.x and 5, php, bash, etc.

TIA,
Don

Last edited by donv2; 11-07-2007 at 11:27 PM.
 
Old 11-08-2007, 08:31 AM   #2
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 231Reputation: 231Reputation: 231
I'm sure some Perl fan can come up w/ a different solution, but it seems to me 1 line of sed could do the job:
Code:
sed -r 's+^"([^"]*)","([^"]*)","([^"]*)","([^"]*)","([^"]*)","([^"]*)"+&,"\2 \1","\4 \5 \6"+' old.csv >new.csv
or, if you're daring:
Code:
sed -ri 's+^"([^"]*)","([^"]*)","([^"]*)","([^"]*)","([^"]*)","([^"]*)"+&,"\2 \1","\4 \5 \6"+' old.csv
Notes
  • You will need to clean up the header line -- it will come out:
    "Last","First","Street","Town","ST","Zip","First Last","Town ST Zip"
  • I have opted to use '+' as my separator instead of the "standard" '/'.
  • RTM sed.
  • I don't think awk will work here because your field separator effectively is '","', not a single character.
  • I am counting on there being quotes around every field.
    I think this is safe because it looks like the exporting program is playing safe & adding them to every field.

Last edited by archtoad6; 11-08-2007 at 08:34 AM. Reason: reformat
 
Old 11-08-2007, 12:20 PM   #3
donv2
Member
 
Registered: Nov 2004
Location: Upper right corner of USA
Distribution: Ubuntu/Mint, unSLUng (NSLU2), Arch/PlugApps (Dockstar)
Posts: 50

Original Poster
Rep: Reputation: 15
Thanks! The difference in the header line I will just fix on the label template side (tell it to look for these new field names instead of the prior ones).

Will try this in a little bit and report back on success.

-Don
 
Old 11-08-2007, 05:46 PM   #4
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 231Reputation: 231Reputation: 231
Quote:
Originally Posted by donv2 View Post
Thanks!
You're very welcome.


Quote:
Originally Posted by donv2 View Post
The difference in the header line I will just fix on the label template side (tell it to look for these new field names instead of the prior ones).
Suit yourself, but it would be another, simpler line of sed to fix them.


Quote:
Originally Posted by donv2 View Post
Will try this in a little bit and report back on success.
Please do, the feedback is nice.
 
Old 11-08-2007, 11:23 PM   #5
angrybanana
Member
 
Registered: Oct 2003
Distribution: Archlinux
Posts: 147

Rep: Reputation: 21
Quote:
Originally Posted by donv2 View Post
Resources available on my host include perl 4.x and 5, php, bash, etc.
I'm gonna take the "etc" to mean that python is also available, then I'm gonna use that as an excuse to write python code .

Code:
import sys
import csv


def mapper(data):
        "Process and modify lines"
        #add fields to first line
        for line in data:
                line += ['name', 'city st zip']
                yield line
                break
        #process rest of lines
        for line in data:
                name = ' '.join(line[1::-1]) 
                address = '%s, %s %s' % tuple(line[3:6])
                line += [name, address]
                yield line

#check for correct number of arguments and parse them
num_args = len(sys.argv)
if not num_args in (2, 3):
        sys.exit('Usage: %s cvsfile [outfile]' %sys.argv[0])
outfile = sys.stdout
if num_args == 3:
        outfile = open(sys.argv[2], 'w')

#setup the reader/writer handlers
reader = csv.reader(open(sys.argv[1]))
writer = csv.writer(outfile, quoting=csv.QUOTE_ALL)
#process and write the data
writer.writerows(mapper(reader))
Usage:
Code:
parse.py csvfile [outfile]
The outfile is optional, if it's not there it will print to stdout.

This script is probably overkill for what you need, but it allows ease of modification if you ever need to add more fields. It also uses the csv module in Python, so it's less error prone and will handle csv that are formatted differently (not all csv's use quotes and such).

EDIT: Do NOT use the same file for input and output, It'll delete your data (since it's reading it line by line)

Last edited by angrybanana; 11-08-2007 at 11:54 PM.
 
Old 11-09-2007, 01:03 AM   #6
angrybanana
Member
 
Registered: Oct 2003
Distribution: Archlinux
Posts: 147

Rep: Reputation: 21
Quote:
Originally Posted by archtoad6 View Post
  • I don't think awk will work here because your field separator effectively is '","', not a single character.
awk can handle this (at least gawk can):
Code:
$ awk 'BEGIN{FS=OFS="\",\"";RS="(\"\n\"|\n\"|\"\n)"} 
{$7="name";$8="city st zip"} NR>1{$7=$2" "$1;$8=$3", "$4" "$5} {ORS=RT}1' file

Last edited by angrybanana; 11-09-2007 at 01:05 AM.
 
Old 12-19-2007, 04:20 PM   #7
archtoad6
Senior Member
 
Registered: Oct 2004
Location: Houston, TX (usa)
Distribution: MEPIS, Debian, Knoppix,
Posts: 4,727
Blog Entries: 15

Rep: Reputation: 231Reputation: 231Reputation: 231
Oops, thanks for the tip.
 
  


Reply

Tags
concatenate, csv, file, manipulation, script


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Filtering a CSV file from web log with shell script? Micro420 Programming 8 08-22-2007 04:13 AM
looking for a perl script to convert html table data into a csv file swiftguy121 Linux - Software 2 04-25-2007 08:28 PM
bash script - remove header row from csv file pljvaldez Programming 5 08-30-2006 12:05 PM
script for file manipulation yongitz Programming 3 08-17-2006 12:22 PM
Shell script to read from csv file hendemeg Programming 1 05-11-2004 09:23 PM


All times are GMT -5. The time now is 03:50 AM.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration