LinuxQuestions.org

indiancosmonaut 07-03-2007 04:40 AM

awk program. a little query!
 
Hello All,

I have a small query.
I used the following awk command on the unix command line...

awk ' { print $0 } { FS = "( [[:digit:]]{4} )"; print $1; } ' coins.txt

The data in coins.txt is...

gold 1 1986 USA American Eagle
gold 1 1908 Austria-Hungary Franz Josef 100 silver 10 1981 USA ingot
gold 1 1984 Switzerland ingot
gold 1 1979 RSA Krugerrand
gold 0.5 1981 RSA Krugerrand

The output should be:

gold 1 1986 USA American Eagle
gold 1
gold 1 1908 Austria-Hungary Franz Josef 100 Korona
gold 1
silver 10 1981 USA ingot
silver 10
gold 1 1984 Switzerland ingot
gold 1
gold 1 1979 RSA Krugerrand
gold 1
gold 0.5 1981 RSA Krugerrand
gold 0.5

But, it is...

gold 1 1986 USA American Eagle
gold
gold 1 1908 Austria-Hungary Franz Josef 100 Korona
gold 1
silver 10 1981 USA ingot
silver 10
gold 1 1984 Switzerland ingot
gold 1
gold 1 1979 RSA Krugerrand
gold 1
gold 0.5 1981 RSA Krugerrand
gold 0.5

i.e. the second line of the output should be "gold 1" instead of just "gold".

Why is it like this?

Thanks in advance,

indiancosmonaut.

druuna 07-03-2007 05:15 AM

Hi,

The awk line shown does not do what you expect. To my knowledge, setting FS in the main part only affects records read after the assignment, so the first line is still split on whitespace (FS should be set in the BEGIN part or given to awk with the -F option).
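
A minimal sketch of the difference, assuming a POSIX-style awk (gawk or mawk, for instance) in which an assignment to FS only takes effect for records read after it. The two-line sample is cut down from coins.txt, and the variable names late and early are only for illustration:

```shell
# Two records cut down from the coins.txt data in this thread.
data='gold 1 1986 USA American Eagle
gold 0.5 1981 RSA Krugerrand'

# FS assigned in the main rule: the current record was already split
# with the default separator, so the new FS applies from record 2 on.
late=$(printf '%s\n' "$data" | awk '{ FS = " [0-9][0-9][0-9][0-9] "; print $1 }')

# FS assigned in BEGIN: in effect before the first record is read.
early=$(printf '%s\n' "$data" | awk 'BEGIN { FS = " [0-9][0-9][0-9][0-9] " } { print $1 }')

printf 'main rule:\n%s\nBEGIN:\n%s\n' "$late" "$early"
```

With such an awk, the main-rule version should print "gold" for the first record and "gold 0.5" for the second, while the BEGIN version prints "gold 1" and "gold 0.5".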

BTW the input you show does not match the output examples: "Korona" is nowhere to be found in the infile.

Another thing: is line 2 correct (gold 1 1908 Austria-Hungary Franz Josef 100 silver 10 1981 USA ingot), or should this one line be two lines? I.e.:
gold 1 1908 Austria-Hungary Franz Josef 100
silver 10 1981 USA ingot

This will do what you want if line 2 is actually two lines:

awk '{ print $0 } { print $1, $2 }' coins.txt

Hope this helps.

jschiwal 07-03-2007 05:28 AM

I guess that BEGIN {FS="[[:digit:]]{4}"} will not work.

Code:

cat coins
gold 1 1986 USA American Eagle
gold 1 1908 Austria-Hungary Franz Josef 100
silver 10 1981 USA ingot
gold 1 1984 Switzerland ingot
gold 1 1979 RSA Krugerrand
gold 0.5 1981 RSA Krugerrand
jschiwal@hpamd64:~/testdir> awk 'BEGIN {FS="[[:digit:]]{4}"}{ print $0 "\n" $1,$2 }' coins
gold 1 1986 USA American Eagle
gold 1 1986 USA American Eagle
gold 1 1908 Austria-Hungary Franz Josef 100
gold 1 1908 Austria-Hungary Franz Josef 100
silver 10 1981 USA ingot
silver 10 1981 USA ingot
gold 1 1984 Switzerland ingot
gold 1 1984 Switzerland ingot
gold 1 1979 RSA Krugerrand
gold 1 1979 RSA Krugerrand
gold 0.5 1981 RSA Krugerrand
gold 0.5 1981 RSA Krugerrand
jschiwal@hpamd64:~/testdir> awk 'BEGIN {FS="[0-9][0-9][0-9][0-9]"}{ print $0 "\n" $1 }' coins
gold 1 1986 USA American Eagle
gold 1
gold 1 1908 Austria-Hungary Franz Josef 100
gold 1
silver 10 1981 USA ingot
silver 10
gold 1 1984 Switzerland ingot
gold 1
gold 1 1979 RSA Krugerrand
gold 1
gold 0.5 1981 RSA Krugerrand
gold 0.5


indiancosmonaut 07-04-2007 12:38 AM

Hi druuna, jschiwal

Thank you for responding to my query.

druuna,

You are correct. My apologies for writing the wrong infile content.

It's actually:

gold 1 1986 USA American Eagle
gold 1 1908 Austria-Hungary Franz Josef 100 Korona
silver 10 1981 USA ingot
gold 1 1984 Switzerland ingot
gold 1 1979 RSA Krugerrand
gold 0.5 1981 RSA Krugerrand

jschiwal,

The following are working fine...

1. awk 'BEGIN {FS="[0-9][0-9][0-9][0-9]"}{ print $0 "\n" $1 }' coins
2. awk 'BEGIN {FS="[[:digit:]]{4}"}{ print $0 "\n" $1 }' coins
3. awk 'BEGIN {FS=" [[:digit:]]{4} "}{ print $0 "\n" $1 }' coins

The output:

gold 1 1986 USA American Eagle
gold 1
gold 1 1908 Austria-Hungary Franz Josef 100 Korona
gold 1
silver 10 1981 USA ingot
silver 10
gold 1 1984 Switzerland ingot
gold 1
gold 1 1979 RSA Krugerrand
gold 1
gold 0.5 1981 RSA Krugerrand
gold 0.5
----------------------------------------------------------------------

One more question:
In example 3, I have put a space before and after the [[:digit:]]{4}.
Do you see any difference between examples 2 and 3?
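
One way to see the difference is to bracket $1 so that a trailing space becomes visible. A small sketch, using the portable [0-9][0-9][0-9][0-9] form in place of [[:digit:]]{4} so that it does not depend on interval-expression support; the variable names bare and spaced are only for illustration:

```shell
line='gold 1 1986 USA American Eagle'

# Separator without surrounding spaces: the space before the year stays in $1.
bare=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "[0-9][0-9][0-9][0-9]" } { print "[" $1 "]" }')

# Separator with surrounding spaces: the spaces are consumed along with the year.
spaced=$(printf '%s\n' "$line" | awk 'BEGIN { FS = " [0-9][0-9][0-9][0-9] " } { print "[" $1 "]" }')

printf '%s\n%s\n' "$bare" "$spaced"
```

The first form should give "[gold 1 ]" and the second "[gold 1]". The spaced form also only splits on a four-digit run that stands alone between spaces, whereas the bare form would split inside any longer number as well.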

Thanks a lot to both.

Best Regards,
indiancosmonaut

ghostdog74 07-04-2007 01:02 AM

What about this:
Code:

awk '{ print $0 }{print $1,$2}' "file"
output:
Code:

# ./test1.sh
gold 1 1986 USA American Eagle
gold 1
gold 1 1908 Austria-Hungary Franz Josef 100 silver 10 1981 USA ingot
gold 1
gold 1 1984 Switzerland ingot
gold 1
gold 1 1979 RSA Krugerrand
gold 1
gold 0.5 1981 RSA Krugerrand
gold 0.5


jschiwal 07-04-2007 01:33 AM

The BEGIN {FS="[[:digit:]]{4}"} part does not work on my system.
Code:

awk 'BEGIN {FS="[[:digit:]]{4}"}{ print $1  }' coins
gold 1 1986 USA American Eagle
gold 1 1908 Austria-Hungary Franz Josef 100
silver 10 1981 USA ingot
gold 1 1984 Switzerland ingot
gold 1 1979 RSA Krugerrand
gold 0.5 1981 RSA Krugerrand

Here I tried to just print the first field.
This works:
Code:

awk 'BEGIN {FS="[0-9][0-9][0-9][0-9]"}{ print $1  }' coins
gold 1
gold 1
silver 10
gold 1
gold 1
gold 0.5

Code:

awk --version
GNU Awk 3.1.5
...


indiancosmonaut 07-05-2007 04:38 AM

Hi jschiwal,

It is working on my system. I couldn't find the version of my awk!
Apparently it is 'POSIX friendly'.

Thanks a lot for the replies. They really helped! :)

-----

Hi ghostdog74,

Actually, I was looking to clarify my understanding of the field separator.
But thanks anyway.

Best Regards,
indiancosmonaut

ghostdog74 07-05-2007 09:48 AM

To use interval expressions such as {4} in regular expressions with gawk, use the --re-interval option, or use the --posix option.
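
A small sketch of the --re-interval route, assuming GNU awk is installed as gawk (mawk and other awks do not take this flag). The two sample lines come from the coins file above, and the variable name interval is only for illustration:

```shell
# Two records from the coins file in this thread.
data='gold 1 1986 USA American Eagle
silver 10 1981 USA ingot'

if command -v gawk >/dev/null 2>&1; then
    # --re-interval turns on {n} interval expressions in regexps.
    # gawk 4.0 and later enable them by default and still accept the flag.
    interval=$(printf '%s\n' "$data" |
        gawk --re-interval 'BEGIN { FS = " [[:digit:]]{4} " } { print $1 }')
    printf '%s\n' "$interval"
else
    echo "gawk not found; skipping" >&2
fi
```

With gawk available, this should print "gold 1" and "silver 10".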


All times are GMT -5.