LinuxQuestions.org

int0x80 02-24-2009 12:04 PM

Simple (?) awk, two delimiters
 
I am trying to compile a list of the IP addresses, timestamps, and user agents of requests to my site. My log format is as follows:

Code:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined

Example:
216.54.147.14 - - [24/Feb/2009:08:53:50 -0500] "GET / HTTP/1.1" 200 56 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.6) Gecko/2009020911 Ubuntu/8.04 (hardy) Firefox/3.0.6"

I am able to pull items out separately, but cannot figure out how to do this more efficiently.

awk '{print $1, $4, $5}' gets me the IP address and timestamp.
awk -F\" '{print $6}' gets me the full useragent.

Is there a way to combine the two into one awk command? Or is there a fast/easy way to run the two awks separately and correlate the two lists?

theNbomr 02-24-2009 01:41 PM

I have written numerous Perl scripts to parse these files, and have not yet found a one-line (or even a small number of lines) way of splitting the fields cleanly (too many delimiters used). I cannot see how such a format could have been adopted as any kind of standard.
Here's hoping someone comes forward with a clean solution in some regex-supporting language.
--- rod.

ghostdog74 02-24-2009 07:48 PM

Quote:

Originally Posted by int0x80 (Post 3455843)
Is there a way to combine the two into one awk command?

try reading the documentation. Here's a section on field separators.
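
For what it's worth, here is a minimal sketch of where the field-separator route can lead, assuming the combined log format from the first post and no escaped quotes inside the request or referer. With the separator set to a double quote, the user agent is $6 and everything before the request is $1, which split() can then break on whitespace. The temporary file and the array name head are only there for illustration.

```shell
# Sample line from the first post, written to a temporary file for the demo.
log=$(mktemp)
cat > "$log" <<'EOF'
216.54.147.14 - - [24/Feb/2009:08:53:50 -0500] "GET / HTTP/1.1" 200 56 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.6) Gecko/2009020911 Ubuntu/8.04 (hardy) Firefox/3.0.6"
EOF

# Split each record on double quotes, so $6 is the user agent.
# $1 holds the text before the request; split() breaks it on whitespace,
# giving head[1] = IP, head[4] and head[5] = the two halves of the timestamp.
result=$(awk -F'"' '{ split($1, head, " "); print head[1], head[4], head[5], $6 }' "$log")
printf '%s\n' "$result"
rm -f "$log"
```

A request line or referer that itself contains a quote would shift the field numbers, so this is a sketch for well-behaved lines rather than a full parser.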

int0x80 02-25-2009 08:53 AM

Quote:

Originally Posted by ghostdog74 (Post 3456296)
try reading the documentation. Here's a section on field separators.

Perhaps I am approaching this in the wrong manner. My thought is to print the first fields (IP, timestamp) then switch the FS and print the remaining field (useragent). I don't see anything discussing switching the FS inline, however.
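
One possible way to get the same effect without touching FS at all, offered as a sketch rather than a definitive answer: as far as I know, assigning FS inside an action normally only affects records read afterwards, but split() can apply a second delimiter to the record already in hand. The array name q and the temporary file are assumptions made for the demo, and the approach assumes no escaped quotes in the request or referer.

```shell
# Sample line from the first post, written to a temporary file for the demo.
log=$(mktemp)
cat > "$log" <<'EOF'
216.54.147.14 - - [24/Feb/2009:08:53:50 -0500] "GET / HTTP/1.1" 200 56 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.6) Gecko/2009020911 Ubuntu/8.04 (hardy) Firefox/3.0.6"
EOF

# Keep the default whitespace FS for $1, $4 and $5, then apply a second
# delimiter to the same record with split(): q[6] is the user agent.
combined=$(awk '{ split($0, q, "\""); print $1, $4, $5, q[6] }' "$log")
printf '%s\n' "$combined"
rm -f "$log"
```

This keeps the original $1, $4, $5 numbering from the first command and borrows only the quote splitting from the second.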


All times are GMT -5.