I want to rearrange a log file so that I can process it with an analytics tool. For this I'm using commands like this:
Code:
awk '{print $2,$1,$4,$7}' file.log
My problem is that some of these log fields are enclosed in double quotes and contain blank spaces, so they are being treated as several fields instead of one.
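To make the problem concrete, here is a small sketch with a made-up log line (the `line` value is hypothetical, not taken from the real log file). With awk's default whitespace splitting, the quoted user-agent string is broken into several fields:

```shell
# Hypothetical sample line; the real log layout is not shown in this thread.
line='2012-09-26 "Mozilla/4.0 (compatible; MSIE 7.0)" 200'

# Field 2 is only the first word of the quoted string.
printf '%s\n' "$line" | awk '{ print $2 }'

# The line is counted as 6 fields, although it logically has 3.
printf '%s\n' "$line" | awk '{ print NF }'
```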
In your case you can try something like this:
awk ' { $3 = $1; $1 = ""; $5 = ""; $6 = ""; print } '
Or you can use Perl, which will keep $7 in one piece.
In a more general case, you should try the following awk code:
Code:
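BEGIN {
    FS = OFS = "\""                  # split on double quotes
}
{
    # Even-numbered fields are the text inside each pair of double quotes.
    for ( i = 2; i < NF; i += 2 )
        gsub(/ +/, "\033", $i)       # hide the spaces inside the quotes
    n = split($0, m, " ")            # split() returns the element count
    for ( i = 1; i <= n; i++ )
        gsub("\033", " ", m[i])      # restore the hidden spaces
    print m[2] " " m[1] " " m[4] " " m[7]
}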
Basically, it uses the double quote as the field separator and replaces the spaces inside each pair of double quotes with a non-printing character (ESC, octal code 033). It then splits the modified record on the remaining blank spaces, which are the effective separators for your requirement. Finally, it changes the hidden characters back to blank spaces and prints the desired fields. Note that / +/ matches a run of spaces, so consecutive spaces inside a quoted field come back as a single space.
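To see the intermediate step, here is a minimal sketch that stops after the masking pass and prints the rebuilt record. It uses "_" as a visible stand-in for the \033 character, and `mask_quoted` is a hypothetical helper name; both are my own choices for illustration only.

```shell
# mask_quoted is a hypothetical helper name used only for this illustration.
mask_quoted() {
    awk 'BEGIN { FS = OFS = "\"" }
    {
        # Even-numbered fields hold the text between a pair of double quotes.
        for (i = 2; i < NF; i += 2)
            gsub(/ +/, "_", $i)   # "_" stands in for \033 so the step is visible
        print                     # the record is rebuilt with OFS, quotes intact
    }' "$@"
}

printf '%s\n' 'one "two two" three four "five five five" six "seven seven"' | mask_quoted
# prints: one "two_two" three four "five_five_five" six "seven_seven"
```

After this pass every quoted field is a single whitespace-free token, so a plain split on blanks keeps it in one piece.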
Here is an example:
Code:
$ cat file
one "two two" three four "five five five" six "seven seven"
one "two two" three four "five five five" six "seven seven"
one "two two" three four "five five five" six "seven seven"
$ awk 'BEGIN{ FS = OFS = "\"" }{ for ( i = 2; i < NF; i += 2 ) gsub(/ +/, "\033", $i); split($0, m, " "); for ( i = 1; i <= length(m); i++ ) gsub("\033", " ", m[i]); print m[2] " " m[1] " " m[4] " " m[7] }' file
"two two" one four "seven seven"
"two two" one four "seven seven"
"two two" one four "seven seven"
Using your sample:
Code:
$ awk 'BEGIN{ FS = OFS = "\"" }{ for ( i = 2; i < NF; i += 2 ) gsub(/ +/, "\033", $i); split($0, m, " "); for ( i = 1; i <= length(m); i++ ) gsub("\033", " ", m[i]); print m[2] " " m[1] " " m[4] " " m[7] }' file
09:13:42.483 2012-09-26 - "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/4.0; GTB7.4; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.3; .NET4.0C)"
Thank you colucix, it helped me a lot, but it was not completely valid for my purpose because the field enclosed in double quotes is not the last field that I need to print. I tried this command:
Code:
awk 'BEGIN{ FS = OFS = "\"" }{ for ( i = 2; i < NF; i += 2 ) gsub(/ +/, "\033", $i); split($0, m, " "); for ( i = 1; i <= length(m); i++ ) gsub("\033", " ", m[i]); print m[1] " " m[2] " " m[5] " " m[12] " " m[8] " " m[6] " " m[16] " " m[4] " " m[3] " " m[7] " m[9] }'
But the system threw the following error:
Code:
^ unfinished string
Last edited by diegovillar; 11-06-2012 at 11:18 AM.
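The "unfinished string" error comes from the end of the print statement in the command above: after m[7] there is a single `"` that opens a string which is never closed, because the `" "` separator before m[9] is missing one of its quotes. A minimal corrected sketch follows, keeping the field numbers from that command. The wrapper name `reorder_fields` and the use of the return value of split() instead of length(m) are my own assumptions, and the field numbers are only meaningful for the poster's log layout, which is not shown here.

```shell
# reorder_fields is a hypothetical wrapper name; pass the log file as argument
# or pipe the log into it.
reorder_fields() {
    awk 'BEGIN { FS = OFS = "\"" }
    {
        for (i = 2; i < NF; i += 2)
            gsub(/ +/, "\033", $i)      # hide spaces inside double quotes
        n = split($0, m, " ")           # split on the remaining blanks
        for (i = 1; i <= n; i++)
            gsub("\033", " ", m[i])     # restore the hidden spaces
        # Every separator is a complete " " string, including the one before m[9].
        print m[1] " " m[2] " " m[5] " " m[12] " " m[8] " " m[6] " " m[16] " " m[4] " " m[3] " " m[7] " " m[9]
    }' "$@"
}

# Usage: reorder_fields file.log
```

Writing the program over several lines, or saving it to a file and running it with awk -f, makes an unbalanced quote like this much easier to spot than in a long one-liner.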