Please refer to the GNU Awk User's Guide.
The initial chapters are short and easy to read, and they describe the methodology behind awk scripts.
Understanding those first makes it much, much easier to write and understand awk scripts.
Awk scripts are composed of actions, which are applied to records.
By default, each line is a separate record.
Each action is a snippet of code within braces, optionally preceded by a pattern. If there is a pattern, the action will be applied only to matching records.
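As a rough sketch of the pattern/action idea, here is one action without a pattern and one with. The sample input and the function names are made up for illustration, and I use plain awk so it runs where gawk is not installed; gawk should behave the same here.

```shell
# Hypothetical sample input: one record per line, fields split on whitespace.
sample() { printf 'alpha 1\nbeta 2\ngamma 3\n'; }

# No pattern: the action runs for every record.
all_names() { sample | awk '{ print $1 }'; }

# Pattern before the braces: the action runs only when field 2 is at least 2.
big_names() { sample | awk '$2 >= 2 { print $1 }'; }

all_names    # alpha, beta, gamma (one per line)
big_names    # beta, gamma (one per line)
```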
To remove all non-digits from the sixth field, you can use e.g.
Code:
gsub(/[^0-9]+/, "", $6)
within an action. On the other hand, you can ignore leading characters, and convert the first numeric sequence to a number, using e.g.
Code:
gsub(/^[^+\-.0-9]+/, "", $6)
$6 = 1.0 * $6
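where the second line converts the rest of the field to a number, via awk's automatic type conversion. The conversion uses only the leading numeric part of the string and ignores anything after it. This way, "aa-1.2e6bbc" is first trimmed to "-1.2e6bbc", and then converted to -1.2e6 == -1200000.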
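To try the conversion on that example string, something like the following should do. The function name and the five filler fields are just for illustration. I wrote the bracket expression with the hyphen first, [^-+.0-9], which is intended to match the same characters as the version above without relying on the backslash escape.

```shell
# Strip leading non-numeric characters from field 6, then force a numeric conversion.
# [^-+.0-9] means: any character that is not a sign, a decimal point, or a digit.
to_number() {
    awk '{ gsub(/^[^-+.0-9]+/, "", $6); $6 = 1.0 * $6; print $6 }'
}

echo 'a b c d e aa-1.2e6bbc' | to_number    # prints -1200000
```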
Also, if you check the Guide, you'll see that gsub() modifies the string in place. This is typical for awk, and may be surprising.
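A small sketch of that in-place behaviour: gsub() returns the number of substitutions it made, and the changed text ends up in the variable or field you passed as the third argument. The function name and the input are hypothetical.

```shell
# n receives the substitution count; $1 itself holds the modified text afterwards.
count_and_strip() {
    awk '{ n = gsub(/[^0-9]+/, "", $1); print n, $1 }'
}

echo 'x1y22z' | count_and_strip    # prints: 3 122
```

Here the three non-digit runs "x", "y" and "z" are removed, so the count is 3 and the field is left as "122".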
To print the modified record in an action, a plain print suffices; it prints the entire record if no arguments are given.
If you wish to reorder the fields, or print only some of them, you can use e.g.
Code:
print $6, $5, $4, $3, $2, $1
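Note that the comma is an argument separator, and is not itself printed; print puts the output field separator (OFS, a single space by default) between the items.
Leaving the commas out would concatenate the fields without any separators.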
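To see the difference between the comma and plain concatenation, you can compare something like the following. The function names and the three-field input are only for illustration.

```shell
# Comma: items are joined with OFS, which is a single space by default.
with_comma()    { echo 'a b c' | awk '{ print $3, $1 }'; }

# No comma: the two fields are concatenated into one string.
without_comma() { echo 'a b c' | awk '{ print $3 $1 }'; }

# OFS can be changed, here to a semicolon, with -v on the command line.
custom_ofs()    { echo 'a b c' | awk -v OFS=';' '{ print $3, $1 }'; }

with_comma       # prints: c a
without_comma    # prints: ca
custom_ofs       # prints: c;a
```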
To do the above for each input record, put it in an action. The end result will be something like
Code:
gawk '{ gsub(/[^0-9]+/, "", $6); print }'
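For a quick test without a data file, you can pipe a couple of made-up lines through the same program. The input below is hypothetical, the function name is mine, and I use plain awk so it runs where gawk is not installed; substitute gawk if you prefer.

```shell
# Same program as above, reading records from standard input.
digits_only_field6() {
    awk '{ gsub(/[^0-9]+/, "", $6); print }'
}

printf 'a  b c d e id#0042x\nf g h i j k7\n' | digits_only_field6
# prints:
# a b c d e 0042
# f g h i j 7
```

Note that the double space in the first input line comes out as a single space: when a field is modified, awk rebuilds the record by joining the fields with OFS.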
Cheers,
Nominal Animal