10-19-2006, 04:13 AM
#1
Member
Registered: Apr 2005
Distribution: Debian, OpenBSD,Fedora,RedHat
Posts: 228
Value counting in awk scripts
Hi all, I have a situation like the one below:
awk -F"|" '
$44 == "31" && $33 == "0554444" {a=a+1}
$44 == "31" && $33 == "3040444" {b=b+1}
and then this script prints the values of a and b for those numbers (code below):
END {print " 0554444 ", a}
END {print " 3040444 ", b}
I want to change this, so I tried the approach below.
In the file, | is the delimiter, so I used tr to translate it to a space.
cat somefile | tr '|' ' ' > somefile.txt
awk '$33 ~ /^4[0-2]/ { print $33 }' somefile.txt > numbersd.txt
But I do not know how to implement counting in this case. In other words, how do I count (for some condition) all the values for a and then print them?
(Printing them is the easy part.)
The critical part is {a=a+1}; I do not understand it. I am new in this position.
Regards

10-19-2006, 04:45 AM
#2
LQ Veteran
Registered: Sep 2003
Location: the Netherlands
Distribution: lfs, debian, rhel
Posts: 8,718
Hi,
Something like this?
awk '$33 ~ /^4[0-2]/ { print $33 ; a++ } END { print "Amount : " a }' somefile.txt > numbersd.txt
Its output will be like this:
4012345
4112345
4212345
Amount : 3
Input used is:
4012345
4112345
4212345
4312345
4412345
4512345
Hope this helps.
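For readers puzzled by `{a=a+1}` in the first post: an awk variable that has never been assigned behaves as 0 in arithmetic, so the action simply adds one each time its pattern matches a line, and `a++` is shorthand for the same thing. A minimal sketch, assuming the number sits in the first field as in the sample input above; `count_matches` is a hypothetical helper name, not something from the thread:

```shell
# count_matches is a hypothetical helper name used only for illustration.
# It counts input lines whose first field starts with 40, 41 or 42.
count_matches() {
    awk '
        $1 ~ /^4[0-2]/ { a = a + 1 }        # runs once per matching line; same as a++
        END { print "Amount : " (a + 0) }   # a + 0 prints 0 rather than an empty string
    ' "$@"
}
```

It reads standard input or a file argument, for example `count_matches somefile.txt`.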

10-19-2006, 07:03 AM
#3
Member
Registered: Apr 2005
Distribution: Debian, OpenBSD,Fedora,RedHat
Posts: 228
Original Poster
I probably explained my problem badly. I have the following situation:
fileA looks something like
x1 x2 x3 x4 x5 x6 x7
33 324 43 5454 54 54 545
345 55 5 4233 4 4 4
...........................
............................
The x4 column is the phone number field, and the x5 column is how many calls were sent to that number. The other fields are not of interest. Using awk, I have to calculate how many calls were made to every phone number. In the output file I need something like below:
x4 x5
5454 54
3244 4
... ...
... ...
and so on for every phone number.
I can sort the phone numbers and the numbers of calls into files, but I do not know how to count the calls for each particular number.
If someone knows how, or has a hint, please write it down.
Regards
Sarajevo

10-19-2006, 07:14 AM
#4
LQ Veteran
Registered: Sep 2003
Location: the Netherlands
Distribution: lfs, debian, rhel
Posts: 8,718
Hi,
I still don't understand.
x4 is the unique phone number, x5 is the amount of calls for that number. This means that nothing needs to be calculated.
Isn't printing x4 and x5 if x4 is ^4[0-2] enough?
Something like this:
awk '$33 ~ /^4[0-2]/ { print $33, $34 }' somefile.txt

10-19-2006, 07:39 AM
#5
Member
Registered: Apr 2005
Distribution: Debian, OpenBSD,Fedora,RedHat
Posts: 228
Original Poster
Using this (the field numbers are arbitrary here; I am just testing):
cat somefile.txt | awk '$10 ~ /^9[0-2]/ { print $10 "\t" $20 }'
I got output like this
x4 x5
9333007 7
9333007 8
9333007 9
9333007 7
9333007 9
9000229 9
9000229 5
9000229 7
9001220 7
But I need something like below
x4 x5
9333007 40 (7+8+9+7+9)
9000229 21 (9+5+7)
9001220 7 (only seven calls)
That is, show the total calls for each particular phone number.
Druuna, thank you anyway.
Regards

10-19-2006, 10:39 AM
#6
LQ Veteran
Registered: Sep 2003
Location: the Netherlands
Distribution: lfs, debian, rhel
Posts: 8,718
Hi,
This is not as simple as it might look, especially when confined to using awk ( and one or two other unix tools). Personally I would choose perl to tackle this.
But.......
It can (partially) be done as a 'one-liner' using awk and sort:
awk '$1 ~ /^9[0-2]/ { i++ ; number[i]=$1 ;telno[$1]=telno[$1]+$2 } END { for (x=1;x<=i;x++ ) print number[x] "\t" telno[number[x]] }' infile | sort -u
Using the example input you gave in post #5, this will be the output:
Quote:
$ awk '$1 ~ /^9[0-2]/ { i++ ; number[i]=$1 ;telno[$1]=telno[$1]+$2 } END { for (x=1;x<=i;x++ ) print number[x] "\t" telno[number[x]] }' infile | sort -u
9000229 21
9001220 7
This will 'only' show the unique phone numbers and the total amount of calls made, that's why I think it's a partial solution to your problem.
You must agree, that if one-liners get this long and/or hard to read, you should script them:
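Code:
#!/bin/bash
# Usage: telno <infile>
awk '
$1 ~ /^9[0-2]/ {
    # remember each number in input order and add its calls to the running total
    i++ ; number[i]=$1 ; telno[$1]=telno[$1]+$2
}
END {
    # prints one line per input line; sort -u removes the repeats
    for (x=1;x<=i;x++ ) print number[x] "\t" telno[number[x]]
}' "$1" | sort -u

$ ./telno infile
9000229 21
9001220 7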
I'll let you try to figure out how the awk command works. If you have any trouble with it, just ask.
Hope this helps.
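A shorter variant of the same idea, offered as a sketch rather than as the method used above: a single associative array keyed on the phone number is enough, because `for (n in total)` visits every key exactly once, so neither the index array nor `sort -u` is needed. Note that the `^9[0-2]` filter above skips 9333007, which is why it is missing from the output shown. The sketch assumes a `^9[0-3]` filter so that number is included, and `sum_calls` is a hypothetical helper name:

```shell
# sum_calls is a hypothetical helper name used only for illustration.
# Input: phone number in field 1, value to add up in field 2.
sum_calls() {
    awk '
        $1 ~ /^9[0-3]/ { total[$1] += $2 }                # accumulate per phone number
        END { for (n in total) print n "\t" total[n] }    # one line per distinct number
    ' "$@" | sort                                         # awk key order is unspecified
}
```

Run against the nine lines from post #5, `sum_calls infile` should print 21 for 9000229, 7 for 9001220 and 40 for 9333007.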

10-19-2006, 10:58 AM
#7
Senior Member
Registered: Aug 2006
Posts: 2,695
Python alternative:
Sample input:
33 324 43 5454 54 54 545
345 55 5 4233 4 4 4
33 324 43 5454 54 54 545
345 55 5 4233 9 4 4
33 324 43 5453 54 54 545
345 55 5 4233 4 4 4
33 324 43 5452 20 54 545
345 55 5 4231 19 4 4
Code:
#!/usr/bin/python
store = {}  # phone number -> running total of field 5
for line in open("file"):
    line = line.strip().split()
    phone, freq = line[3], line[4]
    if phone not in store:
        store[phone] = 0
    store[phone] = store[phone] + int(freq)
for i, j in store.items():
    print("%s %s" % (i, j))
output:
Code:
sun:/home # ./test.py
4233 17
5452 20
5453 54
5454 108
4231 19

10-20-2006, 12:52 AM
#8
Member
Registered: Apr 2005
Distribution: Debian, OpenBSD,Fedora,RedHat
Posts: 228
Original Poster
Quote:
Originally Posted by sarajevo
Using this (the field numbers are arbitrary here; I am just testing):
cat somefile.txt | awk '$10 ~ /^9[0-2]/ { print $10 "\t" $20 }'
I got output like this
x4 x5
9333007 7
9333007 8
9333007 9
9333007 7
9333007 9
9000229 9
9000229 5
9000229 7
9001220 7
But I need something like below
x4 x5
9333007 40 (7+8+9+7+9)
9000229 21 (9+5+7)
9001220 7 (only seven calls)
That is, show the total calls for each particular phone number.
Druuna, thank you anyway.
Regards
My boss .... Column x5 actually represents the duration of each call, and the solution I got from Druuna works OK, but now I have to produce something like
x4 Some column
9333007 5 (1+1+1+1+1=5)
9000229 3 (1+1+1)
9001220 1 (only one call)
They need to know how many calls were sent. For example, number 9333007 had 5 calls: the first lasted 7 min, the second 8 min, the third 9 min, the fourth 7 min and the fifth 9 min. The same has to be done for all numbers.
Confusing, isn't it?
Regards
Thank you
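If the requirement is the number of calls rather than their total duration, the same associative-array approach applies: increment a per-number counter instead of adding the second field. A minimal sketch, assuming the two-column layout of the quoted output (phone number in field 1, duration in field 2). `count_calls` is a hypothetical helper name, and the `^9[0-3]` filter is an assumption chosen so that 9333007 is included:

```shell
# count_calls is a hypothetical helper name used only for illustration.
# Prints: phone number, number of calls, total duration (tab separated).
count_calls() {
    awk '
        $1 ~ /^9[0-3]/ {
            calls[$1]++           # one more call for this number
            minutes[$1] += $2     # keep the duration total alongside
        }
        END { for (n in calls) print n "\t" calls[n] "\t" minutes[n] }
    ' "$@" | sort
}
```

For the nine sample lines in the quote, this should report 5 calls (40 minutes) for 9333007, 3 calls (21 minutes) for 9000229 and 1 call (7 minutes) for 9001220.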

10-20-2006, 05:13 AM
#9
LQ Veteran
Registered: Sep 2003
Location: the Netherlands
Distribution: lfs, debian, rhel
Posts: 8,718
Hi,
Your examples and explanation are indeed confusing......
This:
9333007 7
9333007 9
9000229 9
9001220 7
Only shows the phone number and the amount of calls. Where does the duration of the call come from (you stated: 9333007 had 5 calls, first lasted 7 min, second 8 min........)??
And what about this:
9001220 1(only one call)
9001220 7(only seven calls)
9000229 3(1+1+1 )
When do you want to use (1+1+1) and when (only seven calls)? I can understand the only one call, although it's not very consistent.
Please clearly explain what it is you must do, including valid (not just some random numbers) examples of the input file(s) and the output expected.
Also post the things you tried yourself.

10-20-2006, 08:24 AM
#10
Member
Registered: Apr 2005
Distribution: Debian, OpenBSD,Fedora,RedHat
Posts: 228
Original Poster
I had a terrible week and made many mistakes, sorry.
I got it working using
awk -F"|" '
BEGIN { print "" }
$21 == "12" && $10 == "93120000" {t=t+1}
$21 == "12" && $10 == "90003237" {z=z+1}
$21 == "12" && $10 == "90003238" {b=b+1}
$21 == "12" && $10 == "90003239" {d=d+1}
$21 == "12" && $10 == "90013230" {e=e+1}
$21 == "12" && $10 == "91005535" {f=f+1}
$21 == "12" && $10 == "91006535" {g=g+1}
$21 == "12" && $10 == "91005377" {h=h+1}
$21 == "12" && $10 == "90034324" {i=i+1}
$21 == "12" && $10 == "92006344" {j=j+1}
END {print " 93120000 ", t}
END {print " 90003237 ", z}
END {print " 90003238 ", b}
END {print " 90003239 ", d}
END {print " 90013230 ", e}
END {print " 91005535 ", f}
END {print " 91006536 ", g}
END {print " 91005377 ", h}
END {print " 90034324 ", i}
END {print " 92006344 ", j}'
And this prints the values for every number. Thank you very much for your help, Druuna, and sorry again for my messy posts.
It is another story that this solution is not pretty, but it runs on some dusty server, and no one will see it....
Thank you again for your help.
Regards 
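The ten hard-coded patterns above can also be collapsed into one rule by keying an associative array on the phone number field. This is a sketch under the same assumptions as the script (`|` as delimiter, the number in field 10, the code "12" in field 21), and `count_by_number` is a hypothetical helper name. Unlike the script, it counts every number that appears with code 12, not only the ten listed, and because each label is the key itself, a printed label cannot drift from the pattern it belongs to (compare the 91006535 pattern with the 91006536 label above).

```shell
# count_by_number is a hypothetical helper name used only for illustration.
# Field positions 10 and 21 and the "12" code are taken from the script above.
count_by_number() {
    awk -F'|' '
        $21 == "12" { calls[$10]++ }                   # one counter per phone number
        END { for (n in calls) print n, calls[n] }     # the label is the key itself
    ' "$@" | sort
}
```

To restrict the report to a fixed list, the output could be filtered afterwards, for example with `grep -F -f` against a file of wanted numbers.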

All times are GMT -5.