Help working on a script to search for specific data.
I used his sample data set as a sed exercise. I used variables in the sample sed one-liner, because the patterns would be way too long to post otherwise.
If he were to use sed, he would have more "\($fp\),\($pp\)" pairs in his input pattern to cover each frequency/power pair. The substitution patterns would be like "#\1,\2,\4,\5#w freq1" and "#\1,\2,\6,\7#w freq2" (see the sketch after the code block below).
Code:
# pattern pieces for the comma-separated sensor log fields
export dayp='[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]]/[[:digit:]][[:digit:]]'
export tp='[[:digit:]][[:digit:]]:[[:digit:]][[:digit:]]:[[:digit:]][[:digit:]]'
export durp='[[:digit:]][[:digit:]]*'
export fp='[[:digit:]][[:digit:]]*\.[[:digit:]][[:digit:]]*'
export pp='-[[:digit:]][[:digit:]]*'
# print date, time, duration and the first freq/power pair from each line
sed -n 's#^\('$dayp'\),\('$tp'\),\('$durp'\),\('$fp'\),\('$pp'\),.*$#date \1\ttime \2\tduration \3\tfreq \4\tpower \5#p' sensordata
date 05/02/05 time 17:40:41 duration 22 freq 853.26250 power -120
date 05/02/05 time 17:40:53 duration 345 freq 853.26250 power -120
date 05/02/05 time 17:40:54 duration 372 freq 853.26250 power -120
date 05/02/05 time 17:40:55 duration 399 freq 853.26250 power -120
date 05/02/05 time 17:40:56 duration 427 freq 853.26250 power -120
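Spelled out for a line with two pairs, the hold/get version described above might look like this (an untested sketch, reusing the exported patterns; freq1 and freq2 are the per-pair output files):
Code:
# save the line, write pair one to freq1, restore the line, write pair two to freq2
sed -n -e 'h' \
    -e 's#^\('$dayp'\),\('$tp'\),\('$durp'\),\('$fp'\),\('$pp'\),\('$fp'\),\('$pp'\),.*$#\1,\2,\4,\5#w freq1' \
    -e 'g' \
    -e 's#^\('$dayp'\),\('$tp'\),\('$durp'\),\('$fp'\),\('$pp'\),\('$fp'\),\('$pp'\),.*$#\1,\2,\6,\7#w freq2' \
    sensordata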
I imagine that if there are many frequency/power pairs in the data set, an awk script may run faster, because the sed version needs a hold and a get command for each additional output file (freq/power pair). A rough awk sketch follows the legend below.
Just in case it isn't clear,
dayp = day pattern
tp = time pattern
durp = duration pattern
fp = frequency pattern
pp = power pattern
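For comparison, a rough awk equivalent might be the following (a sketch only, assuming every field after the duration is a freq/power pair; the freq1, freq2, ... output names mirror the sed version):
Code:
# walk the freq/power pairs from field 4 onward, two fields at a time,
# appending each pair (with date and time) to its own numbered file
awk -F, '{
    for (i = 4; i < NF; i += 2)
        print $1 "," $2 "," $i "," $(i+1) > ("freq" (i-2)/2)
}' sensordata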
That worked wonderfully; actually both jschiwal's and your solutions worked, but the awk command ran faster on the larger chunks of data. Since speed is an issue when I get the final data, I went with awk (and it's much shorter, cleaner code). My second-to-last step is to sort the data from each of the files that were generated and copy out only the lines where the power of the frequencies is greater than -100. As in:
The chunks of data you mention there don't really match the stuff the first awk run would have created, nor the original data ... is this a whole new approach to extract data from the original data set, like the first approach of mine where I misunderstood your intentions?
Which is exactly what I want to see for that phase of my processing. Now, of course, each of the 10 files has upwards of 60,000 seconds' worth of records, and I really only need the data where the power field is greater than -100. So I need something that will process the above text and leave me only the lines where the power field is greater than -100. So the output would simply be:
Code:
05/02/05 17:40:53 347 853.26250000 -70.00000
Where the line containing -120.00000 as a power was dropped because it is less than -100.
Hopefully that will clear it up.
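Assuming the whitespace-separated layout shown above, with power in the fifth field, a minimal awk filter should do it (the file names here are placeholders):
Code:
# keep only the lines whose fifth (power) field is greater than -100
awk '$5 > -100' freqfile > freqfile.filtered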
Which is perfect. It tells me that someone pushed a button on a "push-to-talk" radio and sent a message at a frequency of 854.26250000. For the time frame of 17:40:53-55 the transmission had a power at the receiver of -70 dBm, and for the time frame of 17:41:01-07 the transmission had a power at the receiver of -76 dBm.
So far we've generated 20 files named 1-20, each containing data like the above, one file per frequency log.
So for the final task I need a way to calculate the span of each talk period. But as you can see from the output of the above sample, there is a gap in time, when the system was off, between 17:40:55 and 17:41:01.
In MATLAB I was able to generate the following by subtracting each time from the time before it to come up with a 1 or a 0: a 1 for periods where time minus the previous time = 1, and a 0 where time minus the previous time > 1. Then I added up all the 1's until I hit a 0 and started over on the next line.
My output looked like this:
Code:
3
7
So what fancy awk or sed command do you have for me now that can do this? Tink, if you can do this I'm donating at least $50.
Is the discriminating feature the time, or could one safely assume that the difference in the power is an indicator for the change as well? Just looking for an optimum approach to the problem ;)
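Tinkster's actual one-liner isn't quoted here, but as an illustration of the run-length logic described above, a sketch like this would turn the sample times into the 3/7 output (assuming the time sits in the second field and no run crosses midnight):
Code:
# convert HH:MM:SS to seconds; count runs of consecutive seconds and
# print each run's length whenever a gap of more than one second appears
awk '{
    split($2, t, ":")
    s = t[1]*3600 + t[2]*60 + t[3]
    if (NR > 1 && s - prev > 1) { print run; run = 0 }
    run++
    prev = s
}
END { if (run) print run }' freqfile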
As usual, TINKSTER, you are brilliant. That worked better than I could have possibly imagined. It generates output that I > to separate files for each of the 20 frequencies. Each file contains a list of the call durations, one number per line. I know I said that that was the final task, but I just found out that the Lead Engineer needs another file with the total of seconds from each file.
I would assume that I could just add up all the lines in the file, but I don't know how. The files contain only numbers, like:
10
16
1
5
18
44
and so on...
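Summing a file of one number per line is a one-liner in awk; durations here is a stand-in for any one of the twenty files:
Code:
# add up every line and print the grand total
awk '{ total += $1 } END { print total }' durations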
Once I have that information, I'll know the total usage time for the system. I'm sure I'll have to calculate something else as well, but for now that's what I've been told.