LinuxQuestions.org


sarajevo 10-19-2006 05:13 AM

Value counting in awk scripts
 
Hi all, I have a situation like the one below:

awk -F"|" '

$44 == "31" && $33 == "0554444" {a=a+1}
$44 == "31" && $33 == "3040444" {b=b+1}

and then the script prints the values of a and b for those numbers (code below):
END {print " 0554444 ", a}
END {print " 3040444 ", b}

I want to change this, so I tried the following.

In the file, | is the delimiter, so I used tr to translate it to a space:
tr '|' ' ' < somefile > somefile.txt

awk '$33 ~ /^4[0-2]/ { print $33 }' somefile.txt > numbersd.txt

But I do not know how to implement counting in this case. In other words, how do I count all values for a (for a given case) and then print them? (Printing them is the easy part.)
The critical part is {a=a+1}; I do not understand it. I am new in this position.

Regards

druuna 10-19-2006 05:45 AM

Hi,

Something like this?

awk '$33 ~ /^4[0-2]/ { print $33 ; a++ } END { print "Amount : " a }' somefile.txt > numbersd.txt

Its output will be like this:

4012345
4112345
4212345
Amount : 3

Input used is:

4012345
4112345
4212345
4312345
4412345
4512345
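
About the {a=a+1} part: a variable that has never been assigned counts as 0 in awk arithmetic, so a=a+1 (a++ is shorthand for the same thing) adds one every time the pattern in front of it matches, and the END block prints the final value once all input has been read. A minimal sketch, assuming the number sits in the first field as in the test input above; the function name count_matches is made up for the example.

```shell
# Hypothetical helper: count the lines whose first field starts with 40, 41 or 42.
count_matches() {
  awk '
    $1 ~ /^4[0-2]/ { a = a + 1 }           # same effect as a++; an unset a counts as 0
    END { print "Amount : " (a + 0) }      # (a + 0) prints 0 rather than an empty string if nothing matched
  ' "$@"
}

printf '%s\n' 4012345 4112345 4212345 4312345 4412345 4512345 | count_matches
```

With the six test numbers this prints Amount : 3, since only the numbers starting with 40, 41 and 42 match.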

Hope this helps.

sarajevo 10-19-2006 08:03 AM

I probably explained my problem badly. I have the following situation.

fileA looks something like this:

x1 x2 x3 x4 x5 x6 x7
33 324 43 5454 54 54 545
345 55 5 4233 4 4 4
...........................
............................
Column x4 is the phone number field, and column x5 is how many calls were sent to that number. The other fields are not of interest. Using awk, I have to calculate how many calls were passed to every phone number. In the output file I have to get something like below:

x4 x5
5454 54
4233 4
... ...
... ...

and so on for every phone number.

I can sort the phone numbers and the numbers of calls into files, but I do not know how to count the calls for each particular number.

If someone knows, or has a hint on how to do this, please write it down.

Regards

Sarajevo

druuna 10-19-2006 08:14 AM

Hi,

I still don't understand.

x4 is the unique phone number, x5 is the amount of calls for that number. This means that nothing needs to be calculated.

Isn't printing x4 and x5 if x4 is ^4[0-2] enough?

Something like this:

awk '$33 ~ /^4[0-2]/ { print $33, $34 }' somefile.txt

sarajevo 10-19-2006 08:39 AM

Using this (the $ field numbers are arbitrary here, I am only testing):

cat somefile.txt | awk '$10 ~ /^9[0-2]/ { print $10 "\t" $20 }'
I got output like this:
x4 x5
9333007 7
9333007 8
9333007 9
9333007 7
9333007 9
9000229 9
9000229 5
9000229 7
9001220 7

But I need something like below:

x4 x5
9333007 40 (7+8+9+7+9)
9000229 21 (9+5+7)
9001220 7 (only seven calls)


That means showing the total calls for each particular phone number.

Druuna, thank you anyway.

Regards

druuna 10-19-2006 11:39 AM

Hi,

This is not as simple as it might look, especially when confined to using awk ( and one or two other unix tools). Personally I would choose perl to tackle this.

But.......

It can (partially) be done as a 'one-liner' using awk and sort:

awk '$1 ~ /^9[0-2]/ { i++ ; number[i]=$1 ;telno[$1]=telno[$1]+$2 } END { for (x=1;x<=i;x++ ) print number[x] "\t" telno[number[x]] }' infile | sort -u

Using the example input you gave in post #5, this will be the output:
Quote:

$ awk '$1 ~ /^9[0-2]/ { i++ ; number[i]=$1 ;telno[$1]=telno[$1]+$2 } END { for (x=1;x<=i;x++ ) print number[x] "\t" telno[number[x]] }' infile | sort -u
9000229 21
9001220 7

This will 'only' show the unique phone numbers and the total amount of calls made, that's why I think it's a partial solution to your problem.

You must agree that if one-liners get this long and/or hard to read, you should script them:
Code:

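#!/bin/bash
# Usage: telno <infile>
# Prints each phone number matching ^9[0-2] once, with the sum of field 2.

awk '
$1 ~ /^9[0-2]/ {
  # number[] keeps the numbers in input order, telno[] the total per number
  i++ ; number[i]=$1 ; telno[$1]=telno[$1]+$2
}
END {
  # repeated numbers give identical lines; sort -u keeps one of each
  for (x=1;x<=i;x++ ) print number[x] "\t" telno[number[x]]
}' "$1" | sort -u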


./telno infile
9000229 21
9001220 7

I'll let you try to figure out how the awk command works ;) If you have any trouble with it, just ask.

Hope this helps.
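
For anyone puzzling over how the command works: telno[$1] is an associative array indexed by the phone number itself, so telno[$1]=telno[$1]+$2 keeps one running total per number. The number[] array and sort -u are only there to print each number once. A shorter sketch of the same idea, assuming the same two-column input, loops over the array keys in END instead (the order of for-in is unspecified in awk, hence the plain sort). The function name sum_calls is made up for the example.

```shell
# Hypothetical helper: total of field 2 per phone number in field 1.
sum_calls() {
  awk '
    $1 ~ /^9[0-2]/ { telno[$1] += $2 }     # one running total per phone number
    END {
      for (n in telno)                     # visits every distinct number once
        print n "\t" telno[n]
    }
  ' "$@" | sort
}

printf '%s\n' '9333007 7' '9333007 8' '9000229 9' '9000229 5' '9000229 7' '9001220 7' | sum_calls
```

This prints 9000229 with 21 and 9001220 with 7. 9333007 does not appear because it starts with 93, which /^9[0-2]/ does not match; drop the pattern to total every number in the file.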

ghostdog74 10-19-2006 11:58 AM

Python alternative:
Sample input:
33 324 43 5454 54 54 545
345 55 5 4233 4 4 4
33 324 43 5454 54 54 545
345 55 5 4233 9 4 4
33 324 43 5453 54 54 545
345 55 5 4233 4 4 4
33 324 43 5452 20 54 545
345 55 5 4231 19 4 4

Code:

#!/usr/bin/env python3
store = {}  # phone number -> running total of column 5
with open("file") as infile:
    for line in infile:
        fields = line.split()
        phone, freq = fields[3], fields[4]
        store[phone] = store.get(phone, 0) + int(freq)

for phone, total in store.items():
    print(phone, total)

output:
Code:

sun:/home # ./test.py
5454 108
4233 17
5453 54
5452 20
4231 19


sarajevo 10-20-2006 01:52 AM

My boss .... Column x5 represents the duration of a call, and the solution I got from Druuna works OK, but now I have to produce something like this:

x4 Some column
9333007 5 (1+1+1+1+1=5)
9000229 3 (1+1+1)
9001220 1 (only one call)

They need to know how many calls were sent. For example, for number 9333007 I had 5 calls: the first lasted 7 min, the second 8 min, the third 9 min, the fourth 7 min and the fifth 9 min. I have to do the same for all numbers.
Confusing, isn't it?
Regards
Thank you
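
If field 2 really is the duration of a single call, then the number of calls is simply the number of lines seen for each phone number. A minimal sketch, assuming the two-column number/duration output shown earlier in the thread as input; calls_and_minutes is a hypothetical name, and the third output column (total minutes) is only there for comparison with the earlier sums.

```shell
# Hypothetical helper: per phone number (field 1), count the lines and add up field 2.
calls_and_minutes() {
  awk '
    { calls[$1]++; minutes[$1] += $2 }     # one input line = one call
    END {
      for (n in calls)
        print n, calls[n], minutes[n]
    }
  ' "$@" | sort
}

printf '%s\n' '9333007 7' '9333007 8' '9333007 9' '9333007 7' '9333007 9' \
              '9000229 9' '9000229 5' '9000229 7' '9001220 7' | calls_and_minutes
```

For this input the result is 9000229 3 21, 9001220 1 7 and 9333007 5 40, one per line: number, calls, total minutes.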

druuna 10-20-2006 06:13 AM

Hi,

Your examples and explanation are indeed confusing......

This:

9333007 7
9333007 9
9000229 9
9001220 7


only shows phone number and amount of calls. Where does the duration of the call come from? (You stated: 9333007 had 5 calls, first lasted 7 min, second 8 min ...)

And what about this:

9001220 1(only one call)
9001220 7(only seven calls)
9000229 3(1+1+1 )


When do you want to use (1+1+1) and when (only seven calls)? I can understand the (only one call), although it's not very consistent.

Please clearly explain what it is you must do, including valid (not just some random numbers) examples of the input file(s) and the output expected.

Also post the things you tried yourself.

sarajevo 10-20-2006 09:24 AM

I had a terrible week and made many mistakes, sorry.

I got it working using:

awk -F"|" '

BEGIN { print "" }
$21 == "12" && $10 == "93120000" {t=t+1}
$21 == "12" && $10 == "90003237" {z=z+1}
$21 == "12" && $10 == "90003238" {b=b+1}
$21 == "12" && $10 == "90003239" {d=d+1}
$21 == "12" && $10 == "90013230" {e=e+1}
$21 == "12" && $10 == "91005535" {f=f+1}
$21 == "12" && $10 == "91006535" {g=g+1}
$21 == "12" && $10 == "91005377" {h=h+1}
$21 == "12" && $10 == "90034324" {i=i+1}
$21 == "12" && $10 == "92006344" {j=j+1}

END {print " 93120000 ", t}
END {print " 90003237 ", z}
END {print " 90003238 ", b}
END {print " 90003239 ", d}
END {print " 90013230 ", e}
END {print " 91005535 ", f}
END {print " 91006536 ", g}
END {print " 91005377 ", h}
END {print " 90034324 ", i}
END {print " 92006344 ", j}'


And this prints the values for every number. Thank you very much for your help, Druuna, and sorry again for my messy posts.
It is another story that this solution is not pretty, but it runs on some dusty server and no one will see it....

Thank you again for your help.

Regards :)
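
The ten hard-coded patterns can also be collapsed into one rule with an array indexed by $10. This is only a sketch, assuming the same |-delimited file with the selector in field 21 and the phone number in field 10. The names count_by_number and make_record are hypothetical, and the records are made-up test data with every other field left empty. It also removes the need to keep each pattern and its END label in step by hand (the 91006535 pattern above, for instance, is printed under the label 91006536).

```shell
# Hypothetical helper: count lines per phone number (field 10) where field 21 is "12".
count_by_number() {
  awk -F'|' '
    $21 == "12" { count[$10]++ }           # one counter per distinct phone number
    END {
      for (n in count)
        print n, count[n]
    }
  ' "$@" | sort
}

# Test data only: build a 21-field record with the number in field 10
# and the selector in field 21.
make_record() {
  awk -v num="$1" -v sel="$2" 'BEGIN { OFS = "|"; $10 = num; $21 = sel; print }'
}

{
  make_record 93120000 12
  make_record 90003237 12
  make_record 93120000 12
  make_record 93120000 31    # ignored: field 21 is not 12
} | count_by_number
```

With these four records the output is 90003237 1 and 93120000 2. Unlike the original, this counts every number that occurs with $21 == "12", not just the ten listed ones, so add a filter on $10 if only certain numbers are wanted.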


All times are GMT -5.