Old 05-26-2005, 12:14 AM   #16
jschiwal
LQ Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 682

I used his sample data set as a sed exercise. I used variables in the sample sed one-liner, because the patterns would be way too long to post otherwise.
If he were to use sed, he would have more "\($fp\),\($pp\)" pairs in the input pattern to cover each frequency,power pair. The substitution patterns would be like "#\1,\2,\4,\5#w freq1" and "#\1,\2,\6,\7#w freq2".

Code:
export dayp='[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]]'
export tp='[[:digit:]][[:digit:]]:[[:digit:]][[:digit:]]:[[:digit:]][[:digit:]]'
export durp='[[:digit:]][[:digit:]]*'
export fp='[[:digit:]][[:digit:]]*\.[[:digit:]][[:digit:]]*'
export pp='-[[:digit:]][[:digit:]]*'

sed -n 's#^\('$dayp'\),\('$tp'\),\('$durp'\),\('$fp'\),\('$pp'\),.*$#date \1\ttime \2\tduration \3\tfreq \4\tpower \5#p' sensordata
date 05/02/05   time 17:40:41   duration 22     freq 853.26250  power -120
date 05/02/05   time 17:40:53   duration 345    freq 853.26250  power -120
date 05/02/05   time 17:40:54   duration 372    freq 853.26250  power -120
date 05/02/05   time 17:40:55   duration 399    freq 853.26250  power -120
date 05/02/05   time 17:40:56   duration 427    freq 853.26250  power -120
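A hedged sketch of that extension, reusing the exported patterns above and covering just the first two pairs (the output file names freq1 and freq2 are placeholders): hold the original line, substitute-and-write the first pair, then get the line back before matching the second pair.
Code:
sed -n -e 'h' \
       -e 's#^\('$dayp'\),\('$tp'\),\('$durp'\),\('$fp'\),\('$pp'\),.*$#\1,\2,\4,\5#w freq1' \
       -e 'g' \
       -e 's#^\('$dayp'\),\('$tp'\),\('$durp'\),\('$fp'\),\('$pp'\),\('$fp'\),\('$pp'\),.*$#\1,\2,\6,\7#w freq2' sensordata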
I imagine that if there are many frequency/power pairs in the data set, an awk script may run faster, because the sed version needs a hold and a get command for each additional output file (freq/power pair).
Just in case it isn't clear:
dayp = day pattern
tp = time pattern
durp = duration pattern
fp = frequency pattern
pp = power pattern
 
Old 05-26-2005, 07:24 AM   #17
oracle11112
LQ Newbie
 
Registered: May 2005
Posts: 12

Original Poster
Rep: Reputation: 0
Tink,

That worked wonderfully. Actually, both jschiwal's and yours worked, but the awk command ran faster on the larger chunks of data. Since speed is an issue when I get the final data, I went with awk ("and it's much shorter, cleaner code"...). My second-to-last step is to sort the data from each of the files that were generated and copy out only the lines where the power of the frequencies is greater than -100. As in:


05/02/05,17:40:53,853.26250,-120
05/02/05,17:40:53,853.26250,-70
05/02/05,17:40:53,853.26250,-70
05/02/05,17:40:53,853.26250,-120
05/02/05,17:40:53,853.26250,-120
...

So again we're looking at

date, time, frequency, power

If I copy out only the powers greater than -100, the new output file will contain

05/02/05,17:40:53,853.26250,-70
05/02/05,17:40:53,853.26250,-70
....

In case you're wondering, the power is measured in dBm; not that that is important, but that's why it's measured in negatives.
 
Old 05-26-2005, 01:31 PM   #18
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
The chunks of data you mention there don't really match the
stuff the first awk-run would have created, nor the original
data ... is this a whole new approach to extract data from the
original data-set, like the first approach of mine where I
misunderstood your intentions?



Cheers,
Tink
 
Old 05-26-2005, 03:17 PM   #19
oracle11112
LQ Newbie
 
Registered: May 2005
Posts: 12

Original Poster
Rep: Reputation: 0
OK, so let's start from the beginning.

So far my original data looked like this:

Code:
05/02/05,17:40:46,156,853.26250,-120,853.83750,-80,854.51250,-120,855.21250,-120,855.71250,-120,868.60000,-121,868.91250,-121,869.00000,-121,867.00000,-121,868.00000,-121
05/02/05,17:40:46,157,853.26250,-120,853.83750,-80,854.51250,-120,855.21250,-120,855.71250,-120,868.60000,-121,868.91250,-121,869.00000,-121,867.00000,-121,868.00000,-121
05/02/05,17:40:46,158,853.26250,-120,853.83750,-80,854.51250,-120,855.21250,-120,855.71250,-120,868.60000,-121,868.91250,-121,869.00000,-121,867.00000,-121,868.00000,-121
05/02/05,17:40:53,345,853.26250,-70,853.83750,-83,854.51250,-120,855.21250,-120,855.71250,-120,868.60000,-121,868.91250,-121,869.00000,-121,867.00000,-121,868.00000,-121
05/02/05,17:40:53,346,853.26250,-70,853.83750,-83,854.51250,-120,855.21250,-120,855.71250,-120,868.60000,-121,868.91250,-121,869.00000,-121,867.00000,-121,868.00000,-121
05/02/05,17:40:53,347,853.26250,-70,853.83750,-83,854.51250,-120,855.21250,-120,855.71250,-120,868.60000,-121,868.91250,-121,869.00000,-121,867.00000,-121,868.00000,-121
Where it's in the format of:

Code:
date,time,repetition,frequency1,power1,frequency2,power2,frequency3,power3,frequency4,power4,frequency5,power5,frequency6,power6,freq7,pwr7,freq8,pwr8,freq9,pwr9,freq10,pwr10
You can see from the data that it scans multiple times per second... so I used the uniq command like this:

Code:
uniq --check-chars=17 file_name_input.txt > phase1_output_file_name.txt
This gives me the following output in phase1_output_file_name.txt

Code:
05/02/05,17:40:46,156,853.26250,-120,853.83750,-80,854.51250,-120,855.21250,-120,855.71250,-120,868.60000,-121,868.91250,-121,869.00000,-121,867.00000,-121,868.00000,-121
05/02/05,17:40:53,347,853.26250,-70,853.83750,-83,854.51250,-120,855.21250,-120,855.71250,-120,868.60000,-121,868.91250,-121,869.00000,-121,867.00000,-121,868.00000,-121
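In case the option choice isn't obvious: --check-chars=17 works here because the date and time fields together are exactly 17 characters, so uniq collapses all the scans from the same second:
Code:
# 8 (date) + 1 (comma) + 8 (time) = 17 characters
echo -n "05/02/05,17:40:46" | wc -c    # prints 17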
Next I used your script of:

Code:
awk -F, '{for(i=4; i < NF; i+=2){printf "%9s %8s %2d %-8f %-8f\n", $1, $2, $3, $i, $(i+1) >> (i/2-1)}}' phase1_output_file_name.txt
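A quick gloss on the redirection target (my reading of it): >> (i/2-1) computes the output file name from the field index, so each frequency/power pair lands in its own numbered file:
Code:
# field pair -> output file (the file name is just the number)
# i=4  -> 4/2-1  = 1   (frequency1,power1)
# i=6  -> 6/2-1  = 2   (frequency2,power2)
# ...
# i=22 -> 22/2-1 = 10  (frequency10,power10)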
This gives me 10 output files, 1 for each of the frequencies. They each look like:

Code:
05/02/05  17:40:46  156  853.26250000  -120.00000
05/02/05  17:40:53  347  853.26250000  -70.00000
Which is exactly what I want to see for that phase of my processing. Now, of course, each of the 10 files has upwards of 60,000 seconds' worth of records, and I really only need the data that is greater than -100 in the power field. So I need something that will process the above text and leave me only the lines where the power field is greater than -100. So the output would simply be:

Code:
05/02/05  17:40:53  347  853.26250000  -70.00000
Where the line containing -120.00000 as a power was dropped because it is less than -100.

Hopefully that will clear it up.

Last edited by oracle11112; 05-26-2005 at 03:19 PM.
 
Old 05-26-2005, 03:23 PM   #20
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Code:
awk -F, '{for(i=4; i < NF; i+=2){if( $(i+1) > -100 ){ printf "%9s %8s %2d %8f %8f\n", $1, $2,$3,$i,$(i+1) >> (i/2-1) } }} ' file
should give you only the entries greater than -100 in the various files.
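Worth noting: the filter relies on awk's numeric comparison. Input fields that look like numbers compare numerically, so $(i+1) > -100 does the right thing; a plain string comparison would get negative values wrong:
Code:
awk 'BEGIN{ print (-120  > -100) }'    # 0 -- numeric: -120 is excluded, as intended
awk 'BEGIN{ print ("-120" > "-100") }' # 1 -- string: "-120" would wrongly pass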


Cheers,
Tink


P.S.: I find awk awksome ;}
 
Old 05-26-2005, 09:05 PM   #21
oracle11112
LQ Newbie
 
Registered: May 2005
Posts: 12

Original Poster
Rep: Reputation: 0
You're awesome, I'm learning so much, and awk is the best.

Ok so everything went great, and from:

Code:
awk -F, '{for(i=4; i < NF; i+=2){if( $(i+1) > -100 ){ printf "%9s %8s %2d %8f %8f\n", $1, $2,$3,$i,$(i+1) >> (i/2-1) } }} ' file
I got the following output:

Code:
05/02/05  17:40:53  347  853.26250000  -70.00000
05/02/05  17:40:54  348  853.26250000  -70.00000
05/02/05  17:40:55  349  853.26250000  -70.00000
05/02/05  17:41:01  355  853.26250000  -76.00000
05/02/05  17:41:02  356  853.26250000  -76.00000
05/02/05  17:41:03  359  853.26250000  -76.00000
05/02/05  17:41:04  360  853.26250000  -76.00000
05/02/05  17:41:05  361  853.26250000  -76.00000
05/02/05  17:41:06  362  853.26250000  -76.00000
05/02/05  17:41:07  363  853.26250000  -76.00000
Which is perfect. It tells me that someone pushed a button on a "push-to-talk" radio and sent a message at a frequency of 853.26250000. For the time frame 17:40:53-55 the transmission had a power at the receiver of -70 dBm, and for the time frame 17:41:01-07 it had a power at the receiver of -76 dBm.

So far we've generated 20 files named 1-20, and each contains data like the above, each holding its own frequency log.

So for the final task, I need a way to calculate the span of each talk period. But as you can see from the output of the above sample, there is a gap in time when the system was off, between 17:40:55 and 17:41:01. In MATLAB I was able to generate the following by subtracting each time from the time before it to come up with a 1 or a 0: a 1 for periods where time - previous time = 1, and a 0 where time - previous time > 1. Then I added up all the 1's until I hit a 0, and started over on the next line.

My output looked like this:

Code:
3
7
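(Spelling the sample out: 17:40:53-55 gives second-to-second differences of 1,1, a run of 3 seconds; 17:41:01-07 gives differences of 1,1,1,1,1,1, a run of 7.)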
So what fancy awk or sed command do you have for me now that can do this? Tink, if you can do this, I'm donating at least $50.
 
Old 05-26-2005, 11:00 PM   #22
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Is the discriminating feature the time, or could one
safely assume that the difference in the power is an
indicator for the change as well? Just looking for an
optimum approach to the problem ;)


Cheers,
Tink
 
Old 05-27-2005, 05:39 AM   #23
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Ooooh kay :)

On the last bit of data the following awk script (a bit
more complex than the plain splitting) gives the desired
result...

Code:
#!/usr/bin/awk -f
# Convert an "HH:MM:SS" string to seconds since midnight.
function tim_secs(string){
  split( string, secs, ":")
  return ( secs[1]*3600 + secs[2]*60 + secs[3] )
}
BEGIN{
  first=0   # have we seen a line yet?
  new=1     # index of the current run of consecutive seconds
}
{
  one=tim_secs( $2 )   # time of the current line, in seconds
  if(first!=0){
    if((one - two)==1){
      a[new]+=1        # still consecutive: extend the current run
    }else{
      new+=1           # gap: start a new run
    }
  }
  first=1
  two=one              # remember this time for the next line
}
END{
  # each run was counted from its second line onwards, hence the +1
  for (i in a) print a[i]+1
}
Just save it to some file, chmod u+x it, and run it like so:

./awkfile data.txt
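For instance, with the ten sample lines from post #21 saved as data.txt, it should print
Code:
3
7
matching the MATLAB result.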

For a set of e.g. 20 files, an invocation like
Code:
shopt -s extglob; for i in +([0-9]); do echo $i:; ./awkfile $i; echo; done; shopt -u extglob
should output the seconds for each file...
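Against the sample data, the block for file 1 (frequency1 = 853.26250) would look like
Code:
1:
3
7
with one such block per numbered file.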

Cheers,
Tink
 
Old 05-27-2005, 12:27 PM   #24
oracle11112
LQ Newbie
 
Registered: May 2005
Posts: 12

Original Poster
Rep: Reputation: 0
As usual, Tinkster, you are brilliant. That worked better than I could have possibly imagined. It generates output that I redirect (>) to separate files for each of the 20 frequencies. Each file contains a list of the call durations. I know I said that that was the final task, but I just found out that the lead engineer needs another file with the total seconds from each file.

I would assume that I could just add up all the lines in the file, but I don't know how. The files contain only numbers, like

10
16
1
5
18
44

and so on...

Once I have that information, I'll know the total usage time for the system. I'm sure I'll have to calculate something else as well, but for now that's what I've been told.
 
Old 05-27-2005, 04:06 PM   #25
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
Thinking too complicated here :}

Since every line in the per-frequency files represents one second, you just need to add a wc -l
to the loop ;}

Code:
rm -f totals; shopt -s extglob; for i in +([0-9]); do wc -l $i >> totals; echo $i:; ./awkfile $i; echo; done; shopt -u extglob
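And if you'd rather total up the duration files you already generated, summing a column of numbers is an awk one-liner, too (durations.txt is a stand-in for one of your files):
Code:
# sum the first field of every line and print the total
awk '{ sum += $1 } END{ print sum }' durations.txt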

Cheers,
Tink
 
Old 05-28-2005, 11:20 AM   #26
oracle11112
LQ Newbie
 
Registered: May 2005
Posts: 12

Original Poster
Rep: Reputation: 0
Yes, I was thinking way too complicated. I got it to work with an expr loop, but your way works way better and is way shorter. Thanks.
 
Old 05-28-2005, 07:00 PM   #27
Tinkster
Moderator
 
Registered: Apr 2002
Location: earth
Distribution: slackware by choice, others too :} ... android.
Posts: 23,067
Blog Entries: 11

Rep: Reputation: 928
One is glad to be of service :D


Cheers,
Tink
 
  

