LinuxQuestions.org
04-27-2009, 03:34 AM   #1
ytd
Member
 
Registered: Jan 2009
Posts: 197

Rep: Reputation: 31
How do I extract "words" from a log in Linux?


I have a big log file on Linux (approx. 700 MB), and I want to extract from that log all the lines that contain, e.g.: http://mywebsite/link1/2.html

How do I do that?

Searching in vi with /string takes too much time, because the access logs contain a lot of entries for http://mywebsite/link1/2.html from a lot of IPs, and I want to extract all the IPs that accessed this link: http://mywebsite/link1/2.html
 
04-27-2009, 03:57 AM   #2
namit
Member
 
Registered: Aug 2005
Distribution: Debian
Posts: 355

Rep: Reputation: 30
grep would be good in this case.

Type:
man grep

Otherwise you can just use:
cat logfile | grep "http://mywebsite/link1/2.html"
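
Since the question asks for the IPs rather than the whole lines, here is a minimal sketch of one way to get them. It assumes the client address is the first whitespace-separated field, as in the usual Apache access log layout. The sample file and the 10.0.0.7 address are made up for illustration. grep can read the file directly, so cat is not needed, and -F makes grep treat the URL as a fixed string, so the dots are not regex wildcards.

```shell
# Hypothetical sample data so the sketch is self-contained; use your real log instead.
log=$(mktemp)
cat > "$log" <<'EOF'
213.233.92.150 - - [20/Apr/2009:08:22:31 +0300] "GET /xxx/css/xxx.css HTTP/1.1" 200 7171 "http://mywebsite/link1/2.html" "UA"
10.0.0.7 - - [20/Apr/2009:08:23:02 +0300] "GET /index.html HTTP/1.1" 200 512 "http://mywebsite/other.html" "UA"
213.233.92.150 - - [20/Apr/2009:09:10:00 +0300] "GET /xxx/img/a.png HTTP/1.1" 200 99 "http://mywebsite/link1/2.html" "UA"
EOF

# -F: match the URL literally; awk prints field 1 (the client IP);
# sort -u removes duplicate addresses.
ips=$(grep -F 'http://mywebsite/link1/2.html' "$log" | awk '{print $1}' | sort -u)
echo "$ips"
rm -f "$log"
```

On a 700 MB file this still reads the log only once; the awk and sort stages only see the lines that matched.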
 
04-27-2009, 04:10 AM   #3
ytd
Member
 
Registered: Jan 2009
Posts: 197

Original Poster
Rep: Reputation: 31
Quote:
Originally Posted by namit
grep would be good in this case.

Type:
man grep

Otherwise you can just use:
cat logfile | grep "http://mywebsite/link1/2.html"
OK, this is good. But I want to filter further: I want to extract the lines with "http://mywebsite/link1/2.html" between [20/Apr/2009:08:22:31 +0300] and [20/Apr/2009:22:00:00 +0300], i.e. from 08:22:31 to 22:00:00.

How do I do that?
 
04-27-2009, 04:12 AM   #4
namit
Member
 
Registered: Aug 2005
Distribution: Debian
Posts: 355

Rep: Reputation: 30
Please post a section of the log file.
 
04-27-2009, 04:16 AM   #5
ytd
Member
 
Registered: Jan 2009
Posts: 197

Original Poster
Rep: Reputation: 31
Quote:
Originally Posted by namit
Please post a section of the log file.
213.233.92.150 - - [20/Apr/2009:08:22:31 +0300] "GET /xxx/css/xxx.css HTTP/1.1" 200 7171 "http://mywebsite/link1/2.html" "HTC_TyTN_II Mozilla/4.0 (compatible; MSIE 6.0; Windows CE; IEMobile 7.11)"
 
04-27-2009, 04:46 AM   #6
namit
Member
 
Registered: Aug 2005
Distribution: Debian
Posts: 355

Rep: Reputation: 30
In that case, just chain multiple greps, or read up on regular expressions.

cat logfile | grep "http://mywebsite/link1/2.html" | grep "\[20/Apr/2009"
 
04-27-2009, 05:03 AM   #7
ytd
Member
 
Registered: Jan 2009
Posts: 197

Original Poster
Rep: Reputation: 31
Quote:
Originally Posted by namit
In that case, just chain multiple greps, or read up on regular expressions.

cat logfile | grep "http://mywebsite/link1/2.html" | grep "\[20/Apr/2009"
There are too many entries for the same date, the same hour, even the same minute. I cannot copy them all, because the output is larger than what PuTTY's scrollback holds, so I cannot copy the whole log. Any hints?
 
04-27-2009, 05:13 AM   #8
ytd
Member
 
Registered: Jan 2009
Posts: 197

Original Poster
Rep: Reputation: 31
Quote:
Originally Posted by namit
In that case, just chain multiple greps, or read up on regular expressions.

cat logfile | grep "http://mywebsite/link1/2.html" | grep "\[20/Apr/2009"
This is so annoying: there are too many entries. I can't even grep per minute because there are WAY TOO MANY!

Even if I do: cat access_log.2 | grep "http://mysite" | grep "20/Apr/2009:09:1" to get all the entries for 2009, hour 09, minute 1x, it doesn't show me all of them because there are too many, and I need to export them all.

And please don't tell me to do cat access_log.2 | grep "http://mysite" | grep "20/Apr/2009:09:11" and then ...2009:09:12 and then 2009:09:13, because I'm going to go crazy. I need to export the logs from 9:00 to 22:00, so please.

Last edited by ytd; 04-27-2009 at 05:16 AM.
 
04-28-2009, 05:25 AM   #9
namit
Member
 
Registered: Aug 2005
Distribution: Debian
Posts: 355

Rep: Reputation: 30
No, that is not the way to do it. Use a regular expression in grep and it should give you the whole list, because [0-9] can stand for the digits of the minutes and seconds. Also look at tools like sed, which can select sections out of a log.
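
To make the regular-expression idea concrete, here is a minimal sketch that covers the hours 09 through 21 on 20/Apr/2009 in a single pattern, i.e. 09:00:00 up to 21:59:59 (an entry at exactly 22:00:00 would need one more alternative). The sample lines and their 10.0.0.x addresses are made up for illustration.

```shell
# Hypothetical sample lines (only the timestamps matter here); use your real log instead.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [20/Apr/2009:08:59:59 +0300] "GET /a HTTP/1.1" 200 1 "http://mywebsite/link1/2.html" "UA"
10.0.0.2 - - [20/Apr/2009:09:00:00 +0300] "GET /a HTTP/1.1" 200 1 "http://mywebsite/link1/2.html" "UA"
10.0.0.3 - - [20/Apr/2009:21:59:59 +0300] "GET /a HTTP/1.1" 200 1 "http://mywebsite/link1/2.html" "UA"
10.0.0.4 - - [20/Apr/2009:22:00:01 +0300] "GET /a HTTP/1.1" 200 1 "http://mywebsite/link1/2.html" "UA"
10.0.0.5 - - [21/Apr/2009:10:00:00 +0300] "GET /a HTTP/1.1" 200 1 "http://mywebsite/link1/2.html" "UA"
EOF

# -E enables extended regular expressions.
# \[ matches a literal bracket; (09|1[0-9]|2[01]) matches hours 09, 10-19, 20-21;
# [0-9]{2} matches any two digits for the minutes and the seconds.
hits=$(grep -E '\[20/Apr/2009:(09|1[0-9]|2[01]):[0-9]{2}:[0-9]{2} ' "$log" | awk '{print $1}')
echo "$hits"
rm -f "$log"
```

Only the 09:00:00 and 21:59:59 lines match; the 08:59:59, 22:00:01 and next-day entries are skipped.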
 
04-28-2009, 05:31 AM   #10
PMP
Member
 
Registered: Apr 2009
Location: ~
Distribution: RHEL, Fedora
Posts: 381

Rep: Reputation: 58
This is an Apache access log. Use any utility that parses this format; Perl has many modules for it, so check them out.
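
If a full parser is more than you need, the format can also be taken apart by hand. Here is a rough sketch with awk, assuming the layout of the sample line posted in #5 (user agent shortened to "UA" here): splitting on double quotes puts the request in field 2, the referer in field 4 and the user agent in field 6. It does not handle quotes escaped inside a field.

```shell
# The sample line from post #5, with the user agent shortened to "UA".
line='213.233.92.150 - - [20/Apr/2009:08:22:31 +0300] "GET /xxx/css/xxx.css HTTP/1.1" 200 7171 "http://mywebsite/link1/2.html" "UA"'

# Split on double quotes: $2 = request line, $4 = referer, $6 = user agent.
# The part before the first quote ($1) is split on whitespace to get the IP and timestamp.
parsed=$(printf '%s\n' "$line" | awk -F'"' '{
    split($1, head, " ")      # head[1] is the IP, head[4] is the bracketed date and time
    ts = substr(head[4], 2)   # drop the leading bracket
    print head[1] "|" ts "|" $4
}')
echo "$parsed"
```

This prints the IP, the timestamp and the referer separated by "|", which makes an exact comparison against the referer possible instead of a substring match on the whole line.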
 
04-28-2009, 09:30 AM   #11
AlucardZero
Senior Member
 
Registered: May 2006
Location: USA
Distribution: Debian
Posts: 4,808

Rep: Reputation: 604
Redirect the output to a file, or add another pipe to "more" or "less".
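
A minimal sketch of the redirect, using made-up sample lines and temporary file names (pick any output path you like on the real system):

```shell
# Hypothetical sample data and output file; use your real log and your own file name.
log=$(mktemp)
out=$(mktemp)
printf '%s\n' \
  '10.0.0.1 - - [20/Apr/2009:09:11:00 +0300] "GET /a HTTP/1.1" 200 1 "http://mysite/x" "UA"' \
  '10.0.0.2 - - [20/Apr/2009:09:12:00 +0300] "GET /b HTTP/1.1" 200 1 "http://elsewhere/y" "UA"' > "$log"

# > sends the matches to a file instead of the terminal, so nothing is lost to scrollback.
grep -F 'http://mysite' "$log" > "$out"

# Count the saved lines; to page through them interactively, run: less "$out"
count=$(wc -l < "$out" | tr -d ' ')
echo "$count"
rm -f "$log" "$out"
```

The saved file can then be copied off the server in one piece instead of being copied out of the terminal window.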
 
04-28-2009, 10:04 AM   #12
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,697
Blog Entries: 5

Rep: Reputation: 244
It will be slow if you run multiple greps on a 700+ MB file.
Code:
grep -o "http.*\" " file
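
One way to avoid several passes is to do the URL test, the time window and the de-duplication in a single awk run. This is only a sketch under assumptions: the timestamp is the fourth whitespace-separated field, both bounds fall on the same day, and the sample lines with their 10.0.0.x addresses are made up for illustration.

```shell
# Hypothetical sample data; use your real log instead.
log=$(mktemp)
cat > "$log" <<'EOF'
10.0.0.1 - - [20/Apr/2009:08:00:00 +0300] "GET /a HTTP/1.1" 200 1 "http://mywebsite/link1/2.html" "UA"
10.0.0.2 - - [20/Apr/2009:08:22:31 +0300] "GET /a HTTP/1.1" 200 1 "http://mywebsite/link1/2.html" "UA"
10.0.0.2 - - [20/Apr/2009:12:00:00 +0300] "GET /b HTTP/1.1" 200 1 "http://mywebsite/link1/2.html" "UA"
10.0.0.3 - - [20/Apr/2009:13:00:00 +0300] "GET /a HTTP/1.1" 200 1 "http://mywebsite/other.html" "UA"
10.0.0.4 - - [20/Apr/2009:22:00:00 +0300] "GET /a HTTP/1.1" 200 1 "http://mywebsite/link1/2.html" "UA"
10.0.0.5 - - [20/Apr/2009:22:00:01 +0300] "GET /a HTTP/1.1" 200 1 "http://mywebsite/link1/2.html" "UA"
EOF

# One pass: keep lines that contain the URL and whose timestamp field ($4) is in the window,
# and print each IP only the first time it is seen.
# Plain string comparison works here only because both bounds are on the same day
# and HH:MM:SS is fixed-width and zero-padded.
ips=$(awk -v url='http://mywebsite/link1/2.html' \
          -v from='[20/Apr/2009:08:22:31' \
          -v to='[20/Apr/2009:22:00:00' '
    index($0, url) && $4 >= from && $4 <= to && !seen[$1]++ { print $1 }
' "$log")
echo "$ips"
rm -f "$log"
```

For a window that spans several days the string comparison no longer holds, and the timestamps would have to be converted to something sortable first.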
 
04-28-2009, 10:54 AM   #13
onebuck
Moderator
 
Registered: Jan 2005
Location: Midwest USA, Central Illinois
Distribution: Slackware®
Posts: 12,549
Blog Entries: 23

Rep: Reputation: 1943
Hi,

Look at the 'Advanced Bash-Scripting Guide' to learn more about scripting!

You could also look at the 'Tutorial' section of 'Slackware-Links'. More than just Slackware® links!
 
  


All times are GMT -5.
