° is the degree character. The actual binary data depends on which character set is used: in UTF-8 it is \xC2\xB0, and in ISO-8859-1, ISO-8859-15 and Windows-1252 it is \xB0.
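If you want to check those byte values yourself, here is a small sketch. It assumes od, tr and iconv are installed (they are on most Linux systems); the variable names are only for illustration.

```shell
# The degree sign written out as its UTF-8 bytes (octal 302 260 = hex C2 B0).
utf8_hex=$(printf '\302\260' | od -An -tx1 | tr -d ' \n')

# The same character converted to ISO-8859-1 by iconv, then dumped as hex.
latin1_hex=$(printf '\302\260' | iconv -f UTF-8 -t ISO-8859-1 | od -An -tx1 | tr -d ' \n')

printf 'UTF-8:      %s\nISO-8859-1: %s\n' "$utf8_hex" "$latin1_hex"
```

Running od -An -tx1 on a few lines of the real input file is a quick way to see which of the two encodings it uses.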
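Let's assume you don't know the character set, and that you're only interested in getting the five numeric values (and nothing else) from the file using awk. The solution is simple: use a field separator that swallows everything from the end of one number through the word "high" (the degree sign, the F, and any spacing), and only handle the first line that has such fields. Since the separator starts with a run of non-numeric characters, it does not matter whether the degree sign is one byte or two. Note that because a separator follows each value, NF will be one more than the number of temperatures; the last field is whatever follows the final separator, normally an empty string. (A regular expression in RS is an extension supported by gawk and mawk; POSIX leaves it unspecified.)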
Code:
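awk 'BEGIN { RS="[\r\n]+"; FS="[^-+0-9.,]+[Hh][Ii][Gg][Hh][\t\v\f ]*"; }
     (NF > 1) {
         # $NF is the field after the last separator, so stop at NF-1
         printf("%s", $1);
         for (i = 2; i < NF; i++) printf("\t%s", $i);
         printf("\n");
         exit(0);    # only the first matching line is wanted
     }' input-file > output-file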
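To see the "one more than the number of temperatures" effect, here is a small sketch that feeds the same separator a made-up line in both encodings. The sample line and the variable names are hypothetical, and LC_ALL=C is my own addition so that awk works on plain bytes and never trips over a byte sequence that is invalid in the current locale.

```shell
# Same FS as above; print NF and every field in brackets.
prog='BEGIN { FS="[^-+0-9.,]+[Hh][Ii][Gg][Hh][\t\v\f ]*" }
{ printf("%d fields:", NF); for (i = 1; i <= NF; i++) printf(" [%s]", $i); printf("\n") }'

# Hypothetical sample line with the degree sign as UTF-8 (octal 302 260) ...
utf8_out=$(printf '72\302\260F High 75\302\260F High 68\302\260F High\n' | LC_ALL=C awk "$prog")

# ... and the same line with the degree sign as ISO-8859-1 (octal 260).
latin1_out=$(printf '72\260F High 75\260F High 68\260F High\n' | LC_ALL=C awk "$prog")

printf '%s\n%s\n' "$utf8_out" "$latin1_out"
```

Both lines should report four fields for three temperatures, the fourth being the empty string after the last "High".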
If you wanted to record the high and low temperatures, and output them on different lines, use
Code:
awk 'BEGIN { RS="[\r\n]+"; FS="[^-+0-9.,]+([Hh][Ii][Gg][Hh]|[Ll][Oo][Ww])[\t\v\f ]*"; }
     (NF > 1) {
         if ($0 ~ /[Hh][Ii][Gg][Hh]/) {
             split("", hi);              # clear any earlier highs
             nhi = NF - 1;               # skip the field after the last separator
             for (i = 1; i <= nhi; i++) hi[i] = $i;
         } else if ($0 ~ /[Ll][Oo][Ww]/) {
             split("", lo);              # clear any earlier lows
             nlo = NF - 1;
             for (i = 1; i <= nlo; i++) lo[i] = $i;
         }
     }
     END {
         # highs on the first line, lows on the second, tab-separated
         printf("%s", hi[1]);
         for (i = 2; i <= nhi; i++) printf("\t%s", hi[i]);
         printf("\n");
         printf("%s", lo[1]);
         for (i = 2; i <= nlo; i++) printf("\t%s", lo[i]);
         printf("\n");
     }' input-file > output-file
If you want to output each high and low value in pairs (low1 high1 low2 high2 ... lowN highN), use the END rule
Code:
     END {
         n = nhi; if (nlo > n) n = nlo;
         printf("%s\t%s", lo[1], hi[1]);
         for (i = 2; i <= n; i++) printf("\t%s\t%s", lo[i], hi[i]);
         printf("\n");
     }
instead.
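For reference, here is a sketch of the paired variant run end to end on a made-up two-line input. The sample data and the shell variable names are hypothetical, the loops store all NF-1 values of each line, and LC_ALL=C is again my own addition to keep awk byte-oriented.

```shell
# Record highs and lows, then print them as low/high pairs on one line.
pairs='BEGIN { RS="[\r\n]+"; FS="[^-+0-9.,]+([Hh][Ii][Gg][Hh]|[Ll][Oo][Ww])[\t\v\f ]*" }
(NF > 1) {
    if ($0 ~ /[Hh][Ii][Gg][Hh]/) {
        split("", hi); nhi = NF - 1
        for (i = 1; i <= nhi; i++) hi[i] = $i
    } else if ($0 ~ /[Ll][Oo][Ww]/) {
        split("", lo); nlo = NF - 1
        for (i = 1; i <= nlo; i++) lo[i] = $i
    }
}
END {
    n = nhi; if (nlo > n) n = nlo
    printf("%s\t%s", lo[1], hi[1])
    for (i = 2; i <= n; i++) printf("\t%s\t%s", lo[i], hi[i])
    printf("\n")
}'

# Hypothetical input: one line of highs, one line of lows (UTF-8 degree sign).
out=$(printf '72\302\260F High 75\302\260F High 68\302\260F High\n55\302\260F Low 58\302\260F Low 51\302\260F Low\n' |
      LC_ALL=C awk "$pairs")

printf '%s\n' "$out"
```

If one of the two lines has fewer values than the other, the missing entries simply come out as empty strings between the tabs.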