LinuxQuestions.org
Old 04-23-2011, 08:09 PM   #1
TheNewGuy2936
LQ Newbie
 
Registered: Apr 2011
Location: Brooklyn NYC
Distribution: Redhat & CentOS
Posts: 23

Rep: Reputation: 6
Extracting text from a file.


Hello everyone:

There are some log files that I wish to get some information from (Apache Access Log) but it is HUGE! All I need as of right now is any information from date and time A to date and time B. What commands can I use to extract this information from the access_log and put it into another file with just that information? I created a file called "access_info" by doing
Code:
touch access_info
but I was not sure where to go from there. Thank you everyone in advance for your help!
 
Old 04-23-2011, 08:29 PM   #2
frankbell
Guru
 
Registered: Jan 2006
Location: Virginia, USA
Distribution: Slackware, Mageia, Mint
Posts: 7,627

Rep: Reputation: 1443
grep should do the job.

grep 'date' /path-to/filename

Replace date with the date you want. It has to be written in the same format used in the file.

See man grep for more.
 
Old 04-23-2011, 08:48 PM   #3
TheNewGuy2936
LQ Newbie
 
Registered: Apr 2011
Location: Brooklyn NYC
Distribution: Redhat & CentOS
Posts: 23

Original Poster
Rep: Reputation: 6
So just to be sure before I do so, can I do something like this:

Code:
grep 2011-03-18 access_log > access_info
 
Old 04-23-2011, 08:59 PM   #4
frankbell
Guru
 
Registered: Jan 2006
Location: Virginia, USA
Distribution: Slackware, Mageia, Mint
Posts: 7,627

Rep: Reputation: 1443
I think so (I'm still learning grep).

You can test with

grep 2011-03-18 access_log

to see whether the desired output appears on the screen.

Depending on your current directory, you may have to include the full path to access_log.
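
A minimal sketch of that test-then-save workflow, run against a tiny made-up log so it is safe to try. The sample lines, the scratch directory, and the 2011-03-18 date format are assumptions for illustration only. A stock Apache access_log in Common Log Format writes dates like 18/Mar/2011, so check what your file actually contains first.

```shell
# Scratch directory with a small, made-up log (illustrative data only).
workdir=$(mktemp -d)
log="$workdir/access_log"
printf '%s\n' \
  '2011-03-17 23:59:58 GET /index.html 200' \
  '2011-03-18 00:00:03 GET /about.html 200' \
  '2011-03-18 14:21:40 GET /missing 404' \
  '2011-03-19 00:00:01 GET /index.html 200' > "$log"

# 1. Preview: show only the first few matches instead of flooding the screen.
grep '2011-03-18' "$log" | head -n 5

# 2. Count the matching lines before writing anything.
match_count=$(grep -c '2011-03-18' "$log")
echo "matching lines: $match_count"

# 3. Save the matches. The > redirection creates access_info (or overwrites it),
#    so running touch beforehand is not required.
grep '2011-03-18' "$log" > "$workdir/access_info"
saved_count=$(wc -l < "$workdir/access_info" | tr -d ' ')
```

On the real file, use the path to access_log in place of "$log". Each grep is a full pass over the file, so on a very large log the count step can be skipped.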
 
Old 04-23-2011, 10:22 PM   #5
Telengard
Member
 
Registered: Apr 2007
Location: USA
Distribution: Kubuntu 8.04
Posts: 579
Blog Entries: 8

Rep: Reputation: 147
You may want to adopt the practice of enclosing regular expressions in single quotes when invoking grep within bash. Bash uses some of the same meta-characters as used in regular expressions (even though they mean different things).

Code:
grep '2011-03-18' access_log
In this case it shouldn't matter, so frankbell's example should work fine. If your regular expression included [, *, or | then Bash would eat them before grep.
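
A small sketch of what that looks like in practice. The scratch directory, the two-line log, and the file named 404 are all made up for the demonstration. An unquoted [ or * is only rewritten when some file name in the current directory matches it, which is why the problem can go unnoticed for a long time. An unquoted | is different: the shell always treats it as a pipe.

```shell
# Scratch directory with a made-up log and a file whose name matches the glob [0-9]*.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'error 404' 'ok 200' > access_log
touch 404

# Unquoted: the shell expands [0-9]* to the file name 404 before grep runs,
# so grep ends up searching for the literal text 404.
unquoted=$(grep -c [0-9]* access_log)

# Quoted: grep receives the regular expression [0-9]* unchanged. It can match
# an empty string, so every line counts as a match.
quoted=$(grep -c '[0-9]*' access_log)

echo "unquoted: $unquoted  quoted: $quoted"
```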

Last edited by Telengard; 04-23-2011 at 10:24 PM.
 
Old 04-23-2011, 10:30 PM   #6
frankbell
Guru
 
Registered: Jan 2006
Location: Virginia, USA
Distribution: Slackware, Mageia, Mint
Posts: 7,627

Rep: Reputation: 1443
Telengard, becoming proficient at regular expressions is next on my list.

Thanks.
 
Old 04-23-2011, 10:55 PM   #7
Telengard
Member
 
Registered: Apr 2007
Location: USA
Distribution: Kubuntu 8.04
Posts: 579
Blog Entries: 8

Rep: Reputation: 147
Quote:
Originally Posted by frankbell View Post
Telengard, becoming proficient at regular expressions is next on my list.
I began experimenting with Linux in 2005 to help with my Unix and programming classes. I've been using Linux on my own machines since 2006. In April 2009 I switched to Linux full time on my personal desktop. I still don't consider myself proficient with regular expressions. Take your time. ;D

A few tips you may find helpful:
  • man 7 regex provides a fairly easy-to-read overview of regular expressions.
  • The GNU Grep 2.7 manual is more in depth.
  • Regular-Expressions.info is a good introductory course for practical applications.
  • Almost every program using regular expressions has its own special syntax conventions. Regular expressions which work in one program don't necessarily work in all of them.
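
To illustrate the last tip, here is a minimal sketch using made-up sample lines. The same pattern text, GET|POST, is read differently by plain grep (basic regular expressions), grep -E (extended regular expressions), and awk (which also uses extended syntax).

```shell
# Three made-up request lines to search through.
sample=$(printf '%s\n' 'GET /a' 'POST /b' 'HEAD /c')

# Plain grep uses basic regular expressions: an unescaped | is an ordinary
# character, so this looks for the literal text GET|POST and finds nothing.
# (grep exits non-zero when nothing matches, hence the || true.)
bre_count=$(printf '%s\n' "$sample" | grep -c 'GET|POST' || true)

# grep -E uses extended regular expressions: | means "or".
ere_count=$(printf '%s\n' "$sample" | grep -E -c 'GET|POST')

# awk patterns are extended regular expressions, so | alternates here too.
awk_count=$(printf '%s\n' "$sample" | awk '/GET|POST/ { n++ } END { print n + 0 }')

echo "grep: $bre_count  grep -E: $ere_count  awk: $awk_count"
```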

HTH
 
Old 04-23-2011, 11:27 PM   #8
kurumi
Member
 
Registered: Apr 2010
Posts: 223

Rep: Reputation: 45
There are other tools besides grep that can do pattern matching, for example awk, Perl, Python, etc. I prefer Ruby.
Code:
$ ruby -ne 'print if /your date pattern/../next date pattern/' file
 
Old 04-23-2011, 11:36 PM   #9
Telengard
Member
 
Registered: Apr 2007
Location: USA
Distribution: Kubuntu 8.04
Posts: 579
Blog Entries: 8

Rep: Reputation: 147
Quote:
Originally Posted by kurumi View Post
There are other tools besides grep that can do pattern matching, for example awk, Perl, Python, etc. I prefer Ruby.
Code:
$ ruby -ne 'print if /your date pattern/../next date pattern/' file
I see your ruby and raise you an AWK

Code:
awk '/2011-03-18/' access_log
There are many, many ways:

Code:
while IFS= read -r ; do [[ "$REPLY" =~ 2011-03-18 ]] && printf '%s\n' "$REPLY" ; done < access_log
 
Old 04-25-2011, 02:14 AM   #10
kurumi
Member
 
Registered: Apr 2010
Posts: 223

Rep: Reputation: 45
Quote:
Originally Posted by Telengard View Post
I see your ruby and raise you an AWK

Code:
awk '/2011-03-18/' access_log
Lol, that's not equivalent to my Ruby example. Same with the shell one.
 
Old 04-25-2011, 04:57 PM   #11
Telengard
Member
 
Registered: Apr 2007
Location: USA
Distribution: Kubuntu 8.04
Posts: 579
Blog Entries: 8

Rep: Reputation: 147
Quote:
Originally Posted by kurumi View Post
Lol, that's not equivalent to my Ruby example. Same with the shell one.
I don't know Ruby
 
Old 04-25-2011, 07:00 PM   #12
kurumi
Member
 
Registered: Apr 2010
Posts: 223

Rep: Reputation: 45
Quote:
Originally Posted by Telengard View Post
I don't know Ruby
awk has a similar syntax
Code:
awk '/pattern1/,/pattern2/' file
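
A minimal sketch of that range syntax on a made-up log. The sample lines, the scratch directory, and the timestamp format are assumptions for illustration; substitute patterns that match lines actually present in your own file.

```shell
# Scratch directory with a small, made-up log (illustrative data only).
workdir=$(mktemp -d)
log="$workdir/access_log"
printf '%s\n' \
  '2011-03-17 23:59:58 GET /index.html 200' \
  '2011-03-18 08:00:00 GET /about.html 200' \
  '2011-03-18 12:30:15 GET /missing 404' \
  '2011-03-18 17:00:00 GET /index.html 200' \
  '2011-03-18 17:00:01 GET /late.html 200' > "$log"

# Print from the first line matching the start pattern through the next line
# matching the end pattern, both inclusive, and save the result.
awk '/2011-03-18 08:00:00/,/2011-03-18 17:00:00/' "$log" > "$workdir/access_info"

range_count=$(wc -l < "$workdir/access_info" | tr -d ' ')
echo "lines in range: $range_count"
```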
 
Old 04-25-2011, 09:56 PM   #13
Telengard
Member
 
Registered: Apr 2007
Location: USA
Distribution: Kubuntu 8.04
Posts: 579
Blog Entries: 8

Rep: Reputation: 147
Quote:
Originally Posted by kurumi View Post
Code:
awk '/pattern1/,/pattern2/' file
The Gawk manual calls this a range pattern. I guess that means your Ruby program begins printing at the first line matching your date pattern and stops after the next line matching next date pattern.
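
One caveat with a range pattern: it switches on and off only at lines that match. If no line matches the start pattern nothing is printed, and if no later line matches the end pattern printing continues to the end of the file. A hedged alternative is to compare timestamps instead. The sketch below assumes the date and time are the first two fields and are written so that they sort correctly as plain text (YYYY-MM-DD HH:MM:SS); the sample data and the variable names start and stop are made up, and this would not work as-is on Apache's default 18/Mar/2011 style.

```shell
# Scratch directory with a made-up log that has no line exactly at 08:00:00 or 17:00:00.
workdir=$(mktemp -d)
log="$workdir/access_log"
printf '%s\n' \
  '2011-03-18 07:59:59 GET /early.html 200' \
  '2011-03-18 08:00:03 GET /about.html 200' \
  '2011-03-18 12:30:15 GET /missing 404' \
  '2011-03-18 16:59:50 GET /index.html 200' \
  '2011-03-18 17:00:01 GET /late.html 200' > "$log"

# Join the first two fields into one timestamp and keep lines whose timestamp
# falls between start and stop (string comparison, both ends inclusive).
awk -v start='2011-03-18 08:00:00' -v stop='2011-03-18 17:00:00' '
  { ts = $1 " " $2 }
  ts >= start && ts <= stop
' "$log" > "$workdir/access_info"

window_count=$(wc -l < "$workdir/access_info" | tr -d ' ')
echo "lines in window: $window_count"
```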
 
Old 04-26-2011, 10:16 AM   #14
TheNewGuy2936
LQ Newbie
 
Registered: Apr 2011
Location: Brooklyn NYC
Distribution: Redhat & CentOS
Posts: 23

Original Poster
Rep: Reputation: 6
The awk approach worked out great! Thanks again for everyone's help! This really helped me with this big access_log file, which was 7.2 GB.
 
  



Tags
grep, regular expressions

