LinuxQuestions.org
Support LQ: Use code LQ3 and save $3 on Domain Registration
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices

Reply
 
Search this Thread
Old 07-25-2012, 08:58 AM   #16
dugan
Senior Member
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 4,571

Rep: Reputation: 1394Reputation: 1394Reputation: 1394Reputation: 1394Reputation: 1394Reputation: 1394Reputation: 1394Reputation: 1394Reputation: 1394Reputation: 1394

In Python (untested):

Code:
from csv import csv
from os import remove

new_rows = []
with open('file.csv') as f:
    csv_reader = reader(f)
    for row in reader:
        new_rows.append(row)
        new_rows[-1][29] = str(float(row[29]) * 1.56)

remove('files.csv')

with open('file.csv', 'w') as f:
    csv_reader = writer(f)
    csv_writer.writerows(new_rows)

Last edited by dugan; 07-25-2012 at 12:38 PM.
 
Old 07-25-2012, 09:23 AM   #17
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,439

Rep: Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879
Unfortunately Wim's solution will not work, as is evident when tested on the file provided. It suffers from greedy regex.

If you have the latest version of gawk (4+) this is quite easy to overcome:
Code:
awk 'BEGIN{FPAT="([^,]+)|(\"[^\"]+\")"}NR > 1 && $29 = sprintf("%.2f", gensub(/[.]/,",","1",$29) * 1,56)' file
With an older version you will need to process each line with a loop and reconstruct the fields, tiresome but possible so advise if you need such a solution?
 
Old 07-26-2012, 04:39 AM   #18
knilux
Member
 
Registered: Mar 2012
Location: The Netherlands
Distribution: OpenSuSE
Posts: 42

Original Poster
Rep: Reputation: Disabled
When I checked for comma's in the fields, I did not find any because I took another part of the file with shorter descriptions. After testing I got these errors and used them to cut and paste. I should have done that before, but then again, you learn something each day.

I will test Grail's latest solution and come back here. The installed version of gawk is 3.1.6.

Has anyone tested Dugan's solution?
 
Old 07-26-2012, 05:24 AM   #19
knilux
Member
 
Registered: Mar 2012
Location: The Netherlands
Distribution: OpenSuSE
Posts: 42

Original Poster
Rep: Reputation: Disabled
I tested the script twice with different files, but it only seems to remove the first line. But maybe this is because I have an older version of gawk? I will see if I can get a newer one installed.
 
Old 07-26-2012, 06:13 AM   #20
knilux
Member
 
Registered: Mar 2012
Location: The Netherlands
Distribution: OpenSuSE
Posts: 42

Original Poster
Rep: Reputation: Disabled
I am afraid a newer version can not be installed. According to the gawk site version 4 is used for SuSE 12.1. This machine runs 11.3 and has gawk 3.1.6-31.1 installed. Updating gives errors because of missing lib's. Is version 3 not capable of doing this?
 
Old 07-26-2012, 06:47 AM   #21
schneidz
Senior Member
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 3,847

Rep: Reputation: 590Reputation: 590Reputation: 590Reputation: 590Reputation: 590Reputation: 590
if all the fields are encapsulated in quotes you can use that as the field separator. you just gotta' make sure there arent any quotes inside of any fields.
 
Old 07-26-2012, 08:16 AM   #22
knilux
Member
 
Registered: Mar 2012
Location: The Netherlands
Distribution: OpenSuSE
Posts: 42

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by schneidz View Post
if all the fields are encapsulated in quotes you can use that as the field separator. you just gotta' make sure there arent any quotes inside of any fields.
I have checked the file but could not find any. Importing it in a spreadsheet works fine when I tell it that the quotes are textseparators. I think it is safe to assume there are no quotes in the fields
 
Old 07-26-2012, 08:50 AM   #23
Wim Sturkenboom
Senior Member
 
Registered: Jan 2005
Location: Roodepoort, South Africa
Distribution: Slackware 10.1/10.2/12, Ubuntu 12.04, Crunchbang Statler
Posts: 3,786

Rep: Reputation: 282Reputation: 282Reputation: 282
Opening it in a spreadsheet should work straight away with the comma as separator; you might have to set some additional parameters.

Both openoffice and excel do not give problems with your sample file.
 
Old 07-26-2012, 08:53 AM   #24
schneidz
Senior Member
 
Registered: May 2005
Location: boston, usa
Distribution: fc-15/ fc-20-live-usb/ aix
Posts: 3,847

Rep: Reputation: 590Reputation: 590Reputation: 590Reputation: 590Reputation: 590Reputation: 590
Quote:
Originally Posted by knilux View Post
I have checked the file but could not find any. Importing it in a spreadsheet works fine when I tell it that the quotes are textseparators. I think it is safe to assume there are no quotes in the fields
\ok, so try it with awk and hope for the best.\
 
Old 07-26-2012, 09:06 AM   #25
knilux
Member
 
Registered: Mar 2012
Location: The Netherlands
Distribution: OpenSuSE
Posts: 42

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by schneidz View Post
\ok, so try it with awk and hope for the best.\
I tried before, but it does not work. Probably because my gawk has an older version. Is there no way to use version 3 instead of 4?
 
Old 07-26-2012, 10:34 AM   #26
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,439

Rep: Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879
Right ... so take 3. Yes my previous was only designed to work on 4 and above.

I feel schneidz has let the light in though:
Code:
awk 'BEGIN{OFS=FS="\",\""}(NR > 1 && $29 = sprintf("%.2f", gensub(/[.]/,",","1",$29) * 1,56)) || NR == 1' file
The NR stuff is just so we ignore the first line as it is the header and will not wish to be multiplied by anything. The last NR will still print this line
 
Old 07-27-2012, 05:43 AM   #27
knilux
Member
 
Registered: Mar 2012
Location: The Netherlands
Distribution: OpenSuSE
Posts: 42

Original Poster
Rep: Reputation: Disabled
Thanks Grail. This line begins to do something. It does work on the 29th column but, strangely enough, not on all numbers and the outcome is not correct. Also, when a number is less than 1 it rounds it down to 0. (I had to change the 1,56 to 1.56)

I hope you have any clue. It seems to be more difficult than we anticipated. But thanks for your help and patience so far.
 
Old 07-27-2012, 10:53 AM   #28
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,439

Rep: Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879
I am not sure I follow? have you new data that it is not working with? I did only peruse the results on your current data but most seemed to be good.

Also, if you are now using 1.56, the gensub is no longer needed as it will replace the dot with a comma and I would assume then treat the value like a string and remove everything after the
first non-number, ie the comma.
 
Old 07-30-2012, 01:28 AM   #29
knilux
Member
 
Registered: Mar 2012
Location: The Netherlands
Distribution: OpenSuSE
Posts: 42

Original Poster
Rep: Reputation: Disabled
Yes, I tested the script with another file. That is, I extracted another part of exactly the same big file. It should not make any difference. The data has the same format as the one you tested. I have attached it for your reference.

I tested it with the comma in the 1,56 and with a dot. The results were not the exactly the same but both are not correct.
Attached Files
File Type: txt testbestand.csv.txt (18.0 KB, 5 views)
 
Old 07-30-2012, 02:25 AM   #30
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,439

Rep: Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879Reputation: 1879
Well I am still a little confused on whether or not the locale is playing any part at all, but the following works for me using 1.56:
Code:
awk 'BEGIN{OFS=FS="\",\""}(NR > 1 && $29 = sprintf("%.2f", $29 * 1.56)) || NR == 1' testbestand.csv.txt
 
  


Reply

Tags
csv, script


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
Replace a field in quotes within a csv file hattori.hanzo Programming 3 05-29-2012 01:10 AM
Bash or PHP: Split csv file based on field value? guest Programming 4 02-06-2009 12:57 AM
AWK: change a particular field in a csv file help help help!!!! haydar68 Programming 20 08-03-2008 01:10 AM
Split CSV, field as filename richmur Programming 2 10-24-2006 08:39 AM


All times are GMT -5. The time now is 08:01 PM.

Main Menu
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration