LinuxQuestions.org
11-10-2009, 03:23 PM   #1
LinuxGold
LQ Newbie
 
Registered: Aug 2006
Posts: 2

Retrieving FQDN from squid access.log


I am trying to figure out a way to retrieve the FQDN from each URL in access.log, i.e. for http(s)://www.cnn.com/whever/complex/line/this/might/be I would like the output to be:

www.cnn.com

Here is what my access.log contains:

Code:
root@cachepilot:/var/log/squid# tail /var/log/squid/access.log
1257884141.586    119 10.182.16.205 TCP_MISS/200 2423 GET http://t0.gstatic.com/images?q=tbn:B3386mi19hBkMM:http://www.artistryofiron.com/Images/Cutouts/lizzard.jpg - DEFAULT_PARENT/wwwproxy.k12.de.us image/jpeg
1257884141.618    176 10.182.16.205 TCP_MISS/200 5745 GET http://t1.gstatic.com/images?q=tbn:DLRVN_v-bM3PsM:http://lounginlizzard.com/store/images/llhammockchair.jpg - DEFAULT_PARENT/wwwproxy.k12.de.us image/jpeg
1257884141.736    117 10.182.16.205 TCP_MISS/200 4045 GET http://t2.gstatic.com/images?q=tbn:UHSmX_XQMcMzOM:http://1.bp.blogspot.com/_ks1ZRojRD8k/SJJTVhimFVI/AAAAAAAAAiE/bhpsac_DHpY/s400/alien_lizzard.jpg - DEFAULT_PARENT/wwwproxy.k12.de.us image/jpeg
1257884141.736      0 10.182.16.205 TCP_NEGATIVE_HIT/204 370 GET http://clients1.google.com/generate_204 - NONE/- text/html
1257884141.756    169 10.182.16.205 TCP_MISS/200 3743 GET http://t1.gstatic.com/images?q=tbn:pMfVy6iLu7pjPM:http://www.buchinger.or.at/pic/animals/20050101-Lizzard-Nakkuru.jpg - DEFAULT_PARENT/wwwproxy.k12.de.us image/jpeg
1257884141.765    192 10.182.16.205 TCP_MISS/200 4775 GET http://t1.gstatic.com/images?q=tbn:zXJFTlypl8JCVM:http://bp1.blogger.com/_RloTrSWsm7A/RipCKwIS3_I/AAAAAAAAAGc/pXo0bLfKV-Q/s400/caiman_lizzard_pantanal-FBAL.jpg - DEFAULT_PARENT/wwwproxy.k12.de.us image/jpeg
1257884141.826    269 10.182.16.205 TCP_MISS/200 6589 GET http://t3.gstatic.com/images?q=tbn:87-PS67EHxfMJM:http://nmsouthernskies.com/PhotoHiRes/Lizzard.JPG - DEFAULT_PARENT/wwwproxy.k12.de.us image/jpeg
1257884141.978    152 10.182.16.205 TCP_MISS/204 452 GET http://images.google.com/csi?v=3&s=images&action=&e=17259,21329,21517,21766,22107,22712&ei=Acr5St7BNIKV8Abl9pTSDA&rt=prt.422,xjs.516,ol.1703 - DEFAULT_PARENT/wwwproxy.k12.de.us text/html
1257884144.617     64 10.182.16.156 TCP_MISS/304 452 GET http://www.etymonline.com/style.css - DEFAULT_PARENT/wwwproxy.k12.de.us text/css
1257884144.618     59 10.182.16.156 TCP_MISS/304 457 GET http://www.etymonline.com/graphics/header.jpg - DEFAULT_PARENT/wwwproxy.k12.de.us image/jpeg
Here is the script I have so far; it has been a hair-pulling experience:
Code:
tail /var/log/squid/access.log | 
awk '{print $1,$3,$7}' | 
while read line; do
echo $line
time=$(echo "$line" | cut -d ' ' -f 1)
ip=$(echo "$line" | cut -d ' ' -f 2)
url=$(echo "$line" | cut -d ' ' -f 3)
fqdn=$(echo "$url" | cut -d '/' -f 2)
echo "Time: $time $ip => $fqdn"
done
The output is as follows:

Code:
root@cachepilot:/var/log/squid# tail /var/log/squid/access.log |
> awk '{print $1,$3,$7}' |
> while read line; do
> echo $line
> time=$(echo "$line" | cut -d ' ' -f 1)
> ip=$(echo "$line" | cut -d ' ' -f 2)
> url=$(echo "$line" | cut -d ' ' -f 3)
> fqdn=$(echo "$url" | cut -d '/' -f 2)
> echo "Time: $time $ip => $fqdn"
> done
1257884402.710 10.182.16.156 http://www.etymonline.com/style.css
Time: 1257884402.710 10.182.16.156 =>
1257884402.750 10.182.16.153 http://www.schoolnotes.com/cgi-bin/notesupdate-new.pl
Time: 1257884402.750 10.182.16.153 =>
1257884403.135 10.182.16.153 http://clk.atdmt.com/go/175199563/direct;vt.1;wi.160;hi.600;ai.131775080;ct.d;ea.364/01/
Time: 1257884403.135 10.182.16.153 =>
1257884404.009 10.182.16.97 http://64.12.161.103/aim/fetchEvents?aimsid=088.1218851095.1557230312:cchowe&seqNum=10&rnd=1257884404.436330&timeout=20000&r=84&k=ke1KS-K6BdPDaRnC&f=json&a=%252FwQAAAAAAABBSgVh6p%252BakWe8%252FHCb%252F3YAGJT5VgU26pjC3sle9NGn0XL11zN9gHdCK32tHIf1OMCIJL%252B7B6cabYQnU0BHtC2HFXsNP5h2%252BsbgdAcT0lc2q5o%252FoVloSsXKXsJD1S5YUOHl5KamubP1a6QUEdbGRfrxr6nrkw%253D%253D&dojo.preventCache=1257884379785&c=dojo.io.script.jsonp_dojoIoScript85._jsonpCallback
Time: 1257884404.009 10.182.16.97 =>
I am trying to output the domain name after "=>", i.e. the last line should read:
Code:
Time: 1257884404.009 10.182.16.97 => 64.12.161.103
Any suggestions?
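
A note on why nothing appears after "=>": splitting http://host/path on "/" gives "http:" as field 1, an empty string (the gap between the two slashes) as field 2, and the host as field 3. Below is a minimal sketch of the same loop with that one change. The function name print_hosts is made up for illustration, and read is used to split the three fields directly instead of three cut calls.

```shell
# Hypothetical helper: reads squid access.log lines on stdin.
print_hosts() {
    awk '{print $1, $3, $7}' |
    while read -r ts ip url; do
        # "http://host/path" split on "/" gives: "http:", "", "host", "path"
        fqdn=$(printf '%s\n' "$url" | cut -d '/' -f 3)
        echo "Time: $ts $ip => $fqdn"
    done
}

# Sample line taken from the log excerpt above
printf '%s\n' '1257884144.617     64 10.182.16.156 TCP_MISS/304 452 GET http://www.etymonline.com/style.css - DEFAULT_PARENT/wwwproxy.k12.de.us text/css' | print_hosts
# prints: Time: 1257884144.617 10.182.16.156 => www.etymonline.com
```

Against the live log this would be run as: tail /var/log/squid/access.log | print_hosts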
 
11-10-2009, 04:32 PM   #2
kbp
Senior Member
 
Registered: Aug 2009
Posts: 3,758

Hey,

This is a bit hacky, since it relies on the greediness of the first '.*', but it may help:

Code:
sed 's/.*http[s]*:\/\/\([^/]*\).*/\1/' /path/to/file
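
One caveat when this runs over whole log lines: because the leading '.*' is greedy, a line whose query string contains a second http:// (as in the gstatic.com entries in the sample log) yields the embedded host rather than the requested one. A rough sketch that sidesteps this isolates the URL column with awk first and anchors the pattern at the start of that field. It assumes the URL is always field 7, as in the sample, and url_host is a hypothetical name.

```shell
# Hypothetical helper: print the requested host for each access.log line on stdin.
# Assumes the URL is whitespace-separated field 7, as in the sample log.
url_host() {
    awk '{print $7}' |
    sed -e 's|^[a-zA-Z]*://||' -e 's|[/:].*$||'   # drop the scheme, then everything from the first "/" or ":"
}

line='1257884141.586    119 10.182.16.205 TCP_MISS/200 2423 GET http://t0.gstatic.com/images?q=tbn:B3386mi19hBkMM:http://www.artistryofiron.com/Images/Cutouts/lizzard.jpg - DEFAULT_PARENT/wwwproxy.k12.de.us image/jpeg'

# Whole-line greedy match: picks up the host of the embedded URL
printf '%s\n' "$line" | sed 's/.*http[s]*:\/\/\([^/]*\).*/\1/'    # www.artistryofiron.com

# URL field only, anchored at the start of the field
printf '%s\n' "$line" | url_host                                   # t0.gstatic.com
```

Stripping at the first ":" also removes a port such as ":8080". It does not handle URLs that carry user@host credentials or bracketed IPv6 literals, so treat it as a starting point.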
cheers
 
  


All times are GMT -5.
