For information only:
Interesting, the outputs differ: many fields agree, but others are wrong,
with the same input file,
as you can see here:
Minimum_ 10.0 1.0 27.0 1.0 34.0 1.0 24.0 1.0 #your code
daily-minimum 10.0 27.8 34.5 24.4 23.6 41.6 36.9 17.2 #the output from Nominal Animal is correct
Yes, I noticed that after I posted, and was working on a correction with a somewhat more complicated calculation algorithm. (There is a problem with using avg(x^2)-(avg(x))^2 in a computation: if the numbers are large and not too far from the average, the two terms are nearly equal, so the computed difference may be zero or consist of nothing but accumulated rounding error. Your numbers seem like they might be like that.)
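To make that concrete, here is a minimal sketch of the cancellation. The three readings are made-up values (not taken from your file), chosen to be large and close to their mean; the exact sample variance for them is 0.0100.

```shell
# Three hypothetical readings: large values that sit close to their mean.
naive=$(printf '100000000.1\n100000000.2\n100000000.3\n' | awk '
    { n++; sum += $1; ss += $1 * $1 }      # running sum and sum of squares
    END {
        avg = sum / n
        # One-pass formula: (sum(x^2) - n*avg^2) / (n - 1)
        printf "%.4f", (ss - n * avg * avg) / (n - 1)
    }')
echo "naive variance: $naive (exact value: 0.0100)"
```

The squares are around 1e16, where double-precision numbers are spaced several units apart, so the 0.02 that should survive the subtraction is lost entirely.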
The problem in the code I posted is the test if ($field) { which should be if ($field !="") { so the first function looks like this:
Code:
function get_stats_for(date, dates, count, sum, ss, min, max, field)
{
    dates[date]++
    for (field = 6; field <= NF; field++) {
        if ($field != "") {
            ++count[date, field]
            if (count[date, field] == 1) {
                min[date, field] = max[date, field] = $field
                sum[date, field] = ss[date, field] = 0.0
            }
            sum[date, field] += $field
            ss[date, field] += $field * $field
            if ($field < min[date, field]) min[date, field] = $field
            if ($field > max[date, field]) max[date, field] = $field
        }
    }
}
That test may not, in fact, be necessary: It protects against the case where one of the input fields is entered as a null string, which would only happen if you were using an actual csv file for your input, and an observation was missing (i.e., entered as ",,").
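A quick way to see the difference between the two tests is a one-line CSV record with a zero and a missing observation. This is only an illustrative sketch; the input line is invented.

```shell
# Hypothetical CSV record: a zero reading, a missing reading, and a 3.
counts=$(printf '0,,3\n' | awk -F, '
    {
        for (i = 1; i <= NF; i++) {
            if ($i)       truthy++      # false for "" and also for a numeric 0
            if ($i != "") nonempty++    # false only for the empty field
        }
    }
    END { print truthy + 0, nonempty + 0 }')
echo "$counts"    # prints: 1 2
```

The plain `if ($i)` test drops the zero reading along with the missing one, while the comparison against "" only skips the empty field.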
Oh, if you're willing to assume that the first observation represents a "typical" value, then the following pair of functions may produce more numerically stable results:
Code:
function get_stats_for(date, dates, count, sum, ss, min, max, field, diff)
{
    # base[] is a global array: write_results() reads it to undo the shift.
    dates[date]++
    for (field = 6; field <= NF; field++) {
        if ($field != "") {
            ++count[date, field]
            if (count[date, field] == 1) {
                min[date, field] = max[date, field] = $field
                sum[date, field] = ss[date, field] = 0.0
                base[date, field] = $field    # first observation is the reference value
            }
            # Accumulate differences from the reference, not the raw values.
            diff = ($field) - base[date, field]
            sum[date, field] += diff
            ss[date, field] += diff^2
            if ($field < min[date, field]) min[date, field] = $field
            if ($field > max[date, field]) max[date, field] = $field
        }
    }
}
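function write_results(unit, dates, count, sum, ss, min, max, n, k, i, sorted, date, field, avg)
{
    # Relies on the global base[] filled by get_stats_for(), and on the global
    # fields, which the rest of the script is assumed to set to the last data column.
    printf("\n%s results\n", unit)
    n = asorti(dates, sorted)    # asorti() is a gawk extension
    for (k = 1; k <= n; ++k) {
        date = sorted[k]
        print " " date
        printf("\tCount")
        for (field = 6; field <= fields; ++field) {
            printf("\t%d", count[date, field])
        }
        printf("\n\tMinimum")
        for (field = 6; field <= fields; field++) {
            printf("\t%.1f", min[date, field])
        }
        printf("\n\tAverage")
        for (field = 6; field <= fields; field++) {
            # Add the reference value back to the mean of the differences.
            printf("\t%.1f", (count[date, field] > 0) ? ((sum[date, field] / count[date, field]) + base[date, field]) : 0)
        }
        printf("\n\tStdErr")
        for (field = 6; field <= fields; field++) {
            if (count[date, field] > 1) {
                # avg is the mean of the shifted values; the shift does not change the spread.
                # (ss - n*avg^2) / (n - 1) is the sample variance of those values.
                avg = sum[date, field] / count[date, field]
                printf("\t%.1f", sqrt((ss[date, field] - count[date, field] * avg^2) / (count[date, field] - 1)))
            }
            else {
                printf("\t%.1f", 0.0)
            }
        }
        printf("\n\tMaximum")
        for (field = 6; field <= fields; field++) {
            printf("\t%.1f", max[date, field])
        }
        printf("\n\n")
    }
}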
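For anyone who wants to try the shifting idea in isolation, here is a stripped-down sketch on a single column. The three readings are made-up values, and the variance line uses the usual n-1 sample formula, which is my choice for the illustration rather than something taken from your script.

```shell
# Same idea on one column: subtract the first observation before accumulating.
shifted=$(printf '100000000.1\n100000000.2\n100000000.3\n' | awk '
    NR == 1 { base = $1 }                  # first observation is the reference
    {
        d = $1 - base                      # small number, so d*d keeps its precision
        n++; sum += d; ss += d * d
    }
    END {
        mean = base + sum / n              # add the offset back for the mean
        var  = (ss - sum * sum / n) / (n - 1)
        printf "%.1f %.4f", mean, var
    }')
echo "mean and variance: $shifted"    # prints: mean and variance: 100000000.2 0.0100
```

Because the differences are of order 0.1 instead of 1e8, their squares are small and the subtraction in the variance no longer cancels away the significant digits.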
I have a new final output of a log file.
The number of columns is the same.
However, the date format has changed.
Unfortunately, the date handling is complicated;
see here.
The old, complicated command string:
{ cmd = "LANG=C LC_ALL=C date -d \047" $1 " " $2 " " $3 " " $4 " " $5 "\047 +\047%Y-%m-%d %GW%V %Y-%m\047"
cmd | getline datestr
close(cmd)
I haven't actually looked, but I did mention that I'd changed the variance estimator to a standard error estimate to illustrate how easy it was to make simple changes. (Depending on your data model, a standard error is often preferable to an uncorrected variance. But that's a different subject.)
Thanks for the answer.
I have another problem:
the output of the date has changed.
Try using -Rd instead of -d and dropping the whole output format specification. Here's an example:
Code:
$ date -d '03/04/2012 20:55:59' +'%Y-%m-%d %GW%V %Y-%m' #(Using your output specification)
2012-03-04 2012W09 2012-03
$ date -Rd '03/04/2012 20:55:59' #(Using the RFC 2822 standard. My local time zone setting is UTC-8 by default.)
Sun, 04 Mar 2012 20:55:59 -0800
Note that the output format you used (the first output, above) does not produce the "new output" string you displayed. Is this embedded in your code somewhere? (I didn't try to figure out what code you were actually using, since this thread is somewhat long. I'd suggest that you reference a post number for the whole block of code from which you extract parts for which you have questions.)
Note also that trying your commands interactively (as I did, above) is a useful technique for isolating problems.
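The same kind of check works for the awk side. The sketch below runs the date command through a pipe into getline, outside of the full script. It assumes GNU date, reuses the sample timestamp from my example above (not a line from your log), and pins TZ=UTC only so the result does not depend on the machine's time zone.

```shell
# Build the date command in awk and read its single line of output with getline.
datestr=$(awk 'BEGIN {
    stamp = "03/04/2012 20:55:59"          # sample timestamp, not from the real log
    # \047 is a single quote, so the shell sees quoted arguments.
    cmd = "LC_ALL=C TZ=UTC date -d \047" stamp "\047 +\047%Y-%m-%d %GW%V %Y-%m\047"
    cmd | getline datestr
    close(cmd)
    print datestr
}')
echo "$datestr"    # prints: 2012-03-04 2012W09 2012-03
```

If the string that comes back is empty or not what you expect, print cmd itself and paste it into the shell to see what date makes of it.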