[SOLVED] Script to multiply 29th field in CSV file by 1,56

knilux · 07-30-2012, 04:14 AM

It works on some rows, not all. The outcome is OK in the following rows (the first row is the header): 3, 4, 7, 8, 11, 12, 13, 22, 24, 25, 27, 28, 29 and 30. The rest are left untouched. This time I used a dot, just like you. Please check if you have the same results.

rosehosting.com · 07-30-2012, 06:35 AM

Not tested, but it should work.

Code:

awk 'BEGIN { FS=OFS="\",\""; } { if (NR > 1) $29 = $29 * 1.56; print}' file.csv

schneidz · 07-30-2012, 08:33 AM

i get this from your test input:

Code:

[schneidz@hyper Downloads]$ awk -F \" '{print $58}' testbestand.csv.txt 
PriceLevel1
2.62
13.65
31.97
2.47
2.28
8.66
10.65
22.13
13.54
1.39
37.85
31.93
9.58
0.0804
0.0804
1.62
1.87
1.78
1.94
0.0229
0.0433
0.0318
0.26
0.22
0.24
0.54
0.43
0.59
0.73

knilux · 07-30-2012, 09:13 AM

This is not what it should be. I have attached a PDF to clarify what I mean. I made 3 columns: input, output and calc. The latter is the right outcome of the calculation. (Note: ignore the commas, read them as dots.)

knilux · 07-30-2012, 09:17 AM

Quote:

Originally Posted by rosehosting.com

Not tested, but it should work.

Code:

awk 'BEGIN { FS=OFS="\",\""; } { if (NR > 1) $29 = $29 * 1.56; print}' file.csv

I am sorry to say, but it does not work completely well. Some of the numbers are not calculated at all. Please also check the previous post with the PDF file.

Maybe awk still gets confused because of the commas? Is it better to use the "" to check/mark where the fields end?

AnanthaP · 07-30-2012, 09:47 AM

Main thing is that in the hand calculation, you want up to 2 decimals, rounded.

a=$29*1.56+.005; b=sprintf("%d",a*100); $29=b/100; print

Check it out on 37.85.
a=59.046+.005=59.051
b=5905.
Therefore the result is 59.05.
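
The add-half-then-truncate arithmetic above can be tried out in isolation. The following is only a sketch: the function names `round_add` and `round_fmt` are made up for illustration, and the second one simply lets `sprintf("%.2f")` do the rounding for comparison.

```shell
# round_add PRICE FACTOR: add half a cent, truncate to whole cents (hypothetical helper)
round_add() {
    awk -v p="$1" -v f="$2" 'BEGIN {
        a = p * f + 0.005            # e.g. 59.046 + 0.005 = 59.051
        b = int(a * 100)             # 5905
        printf "%.2f\n", b / 100     # 59.05
    }'
}

# round_fmt PRICE FACTOR: let the %.2f format round instead (hypothetical helper)
round_fmt() {
    awk -v p="$1" -v f="$2" 'BEGIN { printf "%.2f\n", p * f }'
}

round_add 37.85 1.56
round_fmt 37.85 1.56
```

Both calls should agree on 59.05 for this value; the two approaches can differ for products that land very close to half a cent, so it is worth checking against the hand calculation.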

The rest is standard awk technique. BTW -N / --use-lc-numeric will use the locale's decimal point as intended. In your case, it seems that you don't want this (you need commas as field separators and a dot as the decimal point).

I refer you to:
http://www.gnu.org/software/gawk/manual/gawk.html
OK

grail · 07-30-2012, 10:14 AM

Well I must say schneidz ... you made me look very hard at my code after your simple solution made me feel like a twit. Unfortunately, your solution has been caught by an issue I was struggling with too: some fields are empty and appear as bare commas ( ,,, ). These contain no quotes at all but still count as fields.

Record 2 has such an entry and hence the value 2.62 is actually the field after the required field.
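
The shift is easy to reproduce on a tiny, made-up sample (this is not the real test file). The sketch below uses two hypothetical helper names: `third_by_quote` splits on the quote character, and `count_by_qcq` splits on the three characters `","` and reports how many fields awk sees.

```shell
# Hypothetical two-row sample; in row 2 the second field is empty and unquoted.
sample='"a","x","b","c"
"a",,"b","c"'

# Splitting on the quote character: the 3rd quoted field is expected at $6.
third_by_quote() { awk -F'"' '{ print $6 }'; }

# Splitting on "," (quote, comma, quote) and printing the field count.
count_by_qcq() { awk 'BEGIN { FS = "\",\"" } { print NF }'; }

printf '%s\n' "$sample" | third_by_quote   # prints b, then c
printf '%s\n' "$sample" | count_by_qcq     # prints 4, then 2
```

In the second row the empty field contributes no quotes, so `$6` lands one field too far to the right, and the `","` separator finds only two fields instead of four.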

I have not been able to find a simple solution, short of using a later version:

Code:

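#!/usr/bin/awk -f

# Split on every comma; quoted fields that contain commas are
# put back together by the counting logic below.
BEGIN { OFS = FS = "," }

NR > 1 {
    j = 0                      # CSV field counter, restarted for each record
    for (i = 1; i <= NF; i++) {
        # n == 3: a complete "quoted" piece; n == 0: an empty field ( ,, )
        n = split($i, _, "\"")

        if (n == 3 || !n)
            j++
        else if ($i ~ /"$/)    # last piece of a quoted field that contained commas
            j++

        if (j == 29) {
            # replace the text between the quotes with the scaled, rounded value
            sub(/[^"]+/, sprintf("%.2f", _[2] * 1.56), $i)
            break
        }
    }
}
1    # print every record, including the header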

To use simply issue like so:

Code:

./awk_script.awk testbestand.csv.txt
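
Another way to cope with empty unquoted fields is to walk each line character by character and treat a comma as a separator only when it sits outside quotes. The following is a minimal sketch of that idea, not the script posted above: the function name `scale_field` and its two arguments are hypothetical, and it assumes fields contain no embedded newlines and no doubled "" escapes.

```shell
# scale_field N FACTOR: multiply the N-th CSV field of every data row by FACTOR
scale_field() {
    awk -v n="$1" -v factor="$2" '
    NR == 1 { print; next }                 # header passes through unchanged
    {
        field = 1; inq = 0; start = 1
        len = length($0)
        for (i = 1; i <= len + 1; i++) {
            # a virtual comma after the last character closes the final field
            c = (i <= len) ? substr($0, i, 1) : ","
            if (c == "\"") {
                inq = !inq                  # toggle the inside-quotes state
            } else if (c == "," && !inq) {
                if (field == n) {
                    raw = substr($0, start, i - start)
                    val = raw
                    gsub(/"/, "", val)
                    if (val != "") {        # leave empty fields alone
                        new = sprintf("%.2f", val * factor)
                        if (raw ~ /^".*"$/) new = "\"" new "\""
                        $0 = substr($0, 1, start - 1) new substr($0, i)
                    }
                    break
                }
                field++
                start = i + 1
            }
        }
        print
    }'
}
```

For the case in this thread it would be called as `scale_field 29 1.56 < testbestand.csv.txt`. Because the line is edited in place rather than rebuilt from fields, every other field is left byte-for-byte as it was.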

knilux · 07-30-2012, 11:12 AM

Grail, you did it again!

It works!

I used the original file with 7000 records and as far as I could see there were no errors. It took about 2 seconds to process. The 2-decimal rounding that AnanthaP mentioned is also OK, although this was not my main concern, as the file is imported into another program that handles the rounding too.

Thanks everyone for helping me out. It is great to know there are so many people here that can help you!

schneidz · 07-30-2012, 11:47 AM

good catch grail. i think this would be an example where c would've probably been easier.

grail · 07-31-2012, 11:34 AM

Well I am definitely not as adept in C, but here is an alternative I threw together in ruby

Code:

ruby -pe 'if $. > 1; x = $_.scan(/("[^"]*")?,/)[28][0]; $_.sub!(/#{x}/,sprintf("\"%.2f\"",(x.gsub(/"/,"").to_f * 1.56).round(2))); end' testbestand.csv.txt

And according to a diff against the awk output, it seemed good

knilux · 08-02-2012, 04:43 AM

Thanks for thinking with me. Why did you create another solution in Ruby? Just a challenge?

I tested it, but get an error: -e:1:in `round': wrong number of arguments (1 for 0) (ArgumentError) from -e:1
Something is wrong with the "round" argument. Maybe it needs an extra space after the word "round"?

grail · 08-02-2012, 10:26 AM

Quote:

Why did you create another solution in Ruby? Just a challenge?

I am trying to get better acquainted with Ruby as I think it is a great language and I like its versatility.
And, yes the challenge was fun

As for not working, my only guess would be that you might be using an older version??

Code:

$ ruby --version
ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-linux]

However, I am not 100% sure, as I am fairly new to it. It does run just fine for me and I used copy and paste from your post.
I can tell you that there is no spacing issue, as the call to round is being used on the float output by the previous set of brackets.

Edit: A quick search tells me that you are using a pre-1.9 version, possibly 1.8.7, whose round does not support rounding to a specific number of digits, i.e. it takes no arguments.

knilux · 08-02-2012, 12:18 PM

Yes, I have Ruby 1.8.7. What is the advantage of Ruby over C? Speed?

If you like a challenge, I have another one for you. Let me know if you would like to hear it

grail · 08-02-2012, 01:24 PM

I guess what I like about Ruby is its versatility. As you can see from the example, you can work on the output of a function immediately, because the returned type brings its own methods, e.g. "scan" returns an array, so you can immediately reference elements of that array.

I would guess that speed is not where it has an advantage over something like C; however, I quite like the ease and speed with which you can cobble code together.

Yeah, chuck up another challenge. As usual, I can't guarantee a solution but am happy to have a go

knilux · 08-03-2012, 11:57 AM

So, if I want to know something about programming where do I start? C, Ruby or what?

I will start a new thread for the other "challenge", but have to think how to explain what I really want. Maybe it is not even possible. Will let you know when ready.

I have added a new topic: Add URL to a CSV file