LinuxQuestions.org
02-17-2008, 09:00 AM   #1
homey
Senior Member
 
Registered: Oct 2003
Posts: 3,057

Rep: Reputation: 53
trimming perl array elements


Hi,
I am using a Perl script to read the DansGuardian access.log and output to an HTML file, which I email to myself and other admins.
The site field can be humongous, so I trim it down a bit so that one doesn't have to use the scroll bar to view the rest of the page.
For example:
Code:
http://toolbar.google.com/service/update?as=tbie&version=4.0.1020.6156&os=big&hl=en&tbbrand=SUNA&sd=com&osver=5.1&ossp=2.0&browser=6.0.2900.2180&rlz=1T4SUNA_en___US205&id=457FF06587E1BE25D4AC192011FA87B26DEACsQYIO&ds=1&lurslt=105,5,105,5,105  GET 146
I can get the basic information from that field and the next one using this Perl code. This works fine; I just wondered if there is a better, more correct way of doing this?
Code:
	$site = ($logfile_fields[4]); $site =~ s/(.{90}).*/$1/;
	$reason = ($logfile_fields[5]); $reason =~ s/(.*)\:.*/$1/;
 
02-17-2008, 10:37 AM   #2
otheus
LQ Newbie
 
Registered: Jun 2006
Location: Austria
Distribution: RHEL AS 4
Posts: 25

Rep: Reputation: 16
Quote:
Originally Posted by homey
This works fine; I just wondered if there is a better, more correct way of doing this?
Code:
	$site = ($logfile_fields[4]); $site =~ s/(.{90}).*/$1/;
	$reason = ($logfile_fields[5]); $reason =~ s/(.*)\:.*/$1/;
This is probably more efficient, for what it's worth:
Code:
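        # The parentheses capture the matched text; {0,90} also matches fields shorter than 90 characters.
        ($site) = ($logfile_fields[4] =~ /^(.{0,90})/);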
        ($reason) = ($logfile_fields[5] =~ /^(.*?):/);
The difference would probably only be noticeable on very large files, such as the logs you are processing.
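Whether one form is actually faster can be measured rather than guessed. Here is a minimal sketch using the core Benchmark module. The 300-character `$field` and the labels `subst` and `capture` are made up for illustration and are not from the original script.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample field: a 300-character URL-like string.
my $field = 'http://example.com/' . ('x' x 281);

my %candidates = (
    # Copy the field, then substitute in place (the form from the first post).
    subst   => sub { my $site = $field; $site =~ s/(.{90}).*/$1/; $site },
    # Match once and capture at most 90 characters.
    capture => sub { my ($site) = $field =~ /^(.{0,90})/; $site },
);

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, \%candidates);
```

The numbers depend on the Perl build and on the real log data, so treat the table as a rough guide only.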
 
02-17-2008, 11:41 AM   #3
osor
HCL Maintainer
 
Registered: Jan 2006
Distribution: (H)LFS, Gentoo
Posts: 2,450

Rep: Reputation: 64
And for the first one, you can eliminate RE scanning by using substr. So you end up with:
Code:
	# The third argument of substr is a length, so 90 keeps the first 90 characters.
	$site = substr($logfile_fields[4], 0, 90);
	($reason) = ($logfile_fields[5] =~ /^(.*?):/);
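To see that substr and a capturing regex trim the same way, including on fields shorter than the limit, here is a small sketch. The sample strings `$long` and `$short` are invented for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical sample fields: one longer than 90 characters, one shorter.
my $long  = 'http://example.com/' . ('a' x 200);
my $short = 'http://example.com/index.html';

for my $field ($long, $short) {
    # substr takes (string, offset, length); a short string is returned whole.
    my $by_substr = substr($field, 0, 90);

    # Regex equivalent: capture at most 90 characters from the start.
    my ($by_regex) = $field =~ /^(.{0,90})/;

    printf "%3d chars in, %2d chars out, same result: %s\n",
        length($field), length($by_substr),
        ($by_substr eq $by_regex ? 'yes' : 'no');
}
```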
 
02-17-2008, 12:08 PM   #4
homey
Senior Member
 
Registered: Oct 2003
Posts: 3,057

Original Poster
Rep: Reputation: 53
Thank you!

The first part works very nicely and looks much better than what I had.
However, the reason line pops an error when I use your code.
Code:
	($reason) = ($logfile_fields[5] =~ /^(.*?):/);

Use of uninitialized value in concatenation (.) or string at ./test.pl line 83, <LOG_FILE> line 10697.
 
02-17-2008, 12:35 PM   #5
osor
HCL Maintainer
 
Registered: Jan 2006
Distribution: (H)LFS, Gentoo
Posts: 2,450

Rep: Reputation: 64
Quote:
Originally Posted by homey
However, the reason line pops an error when I use your code.
Code:
	($reason) = ($logfile_fields[5] =~ /^(.*?):/);

Use of uninitialized value in concatenation (.) or string at ./test.pl line 83, <LOG_FILE> line 10697.
What Perl version are you using? You can always get rid of the ? (the non-greedy modifier); it’s not strictly necessary if you have only one colon.
Code:
	($reason) = ($logfile_fields[5] =~ /^(.*):/);

Last edited by osor; 02-17-2008 at 12:49 PM.
 
02-17-2008, 01:02 PM   #6
homey
Senior Member
 
Registered: Oct 2003
Posts: 3,057

Original Poster
Rep: Reputation: 53
Thanks,
perl -v
This is perl, v5.8.8 built for i386-linux-thread-multi

That still pops the error...
Use of uninitialized value in concatenation (.) or string at ./test.pl line 83, <LOG_FILE> line 10697.


Just to clarify, I want to show everything up to the : in that field. If no : is there, just print what is there.
For example:
Code:
GET 7
GET 0
*DENIED* Banned site
Banned extension
*DENIED* Banned Phrase found

from these snips...

http://www.streamaudio.com/stations/player/pages/newplayer/nowplay/adtrack.asp?type=Replacement&adid=74158&station=WHKO_FM  GET 7

http://ads1.msn.com/library/dap.js *DENIED* Banned site: ads1.msn.com GET 0

http://dl.google.com/toolbar/T4/data/en/big/4.0.1601.4978-big/GoogleNav.cab *DENIED* Banned extension: .cab GET 0

http://www.wlwt.com/index.html *DENIED* Banned Phrase found:  gator  GET 89562

http://oascentral.clearchannel.com/RealMedia/ads/adstream_sx.ads/wlw-am/home/@Top,Middle1,Left1!Top?_RM_HTML_LANDINGSITE_=www.700wlw.com *DENIED* Banned site: clearchannel.com GET 0

http://oascentral.clearchannel.com/RealMedia/ads/adstream_sx.ads/wlw-am/newsondemand/@Top,Middle1,Left1!Top?_RM_HTML_LANDINGSITE_=www.700wlw.com *DENIED* Banned site: clearchannel.com GET 0
 
02-17-2008, 01:24 PM   #7
osor
HCL Maintainer
 
Registered: Jan 2006
Distribution: (H)LFS, Gentoo
Posts: 2,450

Rep: Reputation: 64
I see. I think the initial assumption was that there is always a colon. Anyway, you can do something like this:
Code:
($reason) = ($logfile_fields[5] =~ /^(.*?)(:|$)/);
But at this point, I am not sure how much faster it is than:
Code:
($reason = $logfile_fields[5]) =~ s/:.*//;
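The difference between the three patterns shows up on a field without a colon. Here is a small sketch. The helper names (`reason_strict`, `reason_either`, `reason_subst`, `show`) are invented for illustration, and the two sample values are assumed to resemble the reason field.

```perl
use strict;
use warnings;

# Requires a colon; on a failed match the list assignment leaves $r undef.
sub reason_strict { my ($r) = $_[0] =~ /^(.*?):/;     return $r }
# Accepts either a colon or the end of the string.
sub reason_either { my ($r) = $_[0] =~ /^(.*?)(:|$)/; return $r }
# Copies the field, then deletes everything from the first colon onward.
sub reason_subst  { (my $r = $_[0]) =~ s/:.*//;       return $r }

# Render undef visibly instead of triggering a warning.
sub show { my ($v) = @_; return defined $v ? "[$v]" : 'undef' }

for my $field ('*DENIED* Banned site: ads1.msn.com', 'GET 7') {
    printf "%-38s strict=%s either=%s subst=%s\n", show($field),
        show(reason_strict($field)), show(reason_either($field)),
        show(reason_subst($field));
}
```

An undefined `$reason` that is later concatenated or interpolated would be consistent with the "Use of uninitialized value" warning quoted earlier in the thread.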
 
02-17-2008, 03:48 PM   #8
homey
Senior Member
 
Registered: Oct 2003
Posts: 3,057

Original Poster
Rep: Reputation: 53
Thank you, wish I'd thought of such a nice solution!
Code:
	$site = substr($logfile_fields[4],0,87);
	($reason = $logfile_fields[5]) =~ s/:.*//;
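For reuse, the two lines could be wrapped in a small helper that also guards against missing fields. This is only a sketch: the name `trim_fields`, the `$max_len` parameter and the sample pairs are assumptions, and `defined` checks are used instead of `//` so that it also runs on Perl 5.8.

```perl
use strict;
use warnings;

# Hypothetical helper; the name and the default of 87 are assumptions.
sub trim_fields {
    my ($site, $reason, $max_len) = @_;
    $max_len = 87 unless defined $max_len;

    # Treat missing fields as empty strings so later output does not warn.
    $site   = '' unless defined $site;
    $reason = '' unless defined $reason;

    $site = substr($site, 0, $max_len);   # keep at most $max_len characters
    $reason =~ s/:.*//;                   # drop everything from the first colon
    return ($site, $reason);
}

# Made-up (site, reason) pairs, loosely based on the snippets in this thread.
my @samples = (
    [ 'http://ads1.msn.com/library/dap.js', '*DENIED* Banned site: ads1.msn.com' ],
    [ 'http://www.wlwt.com/index.html',     'GET 89562' ],
    [ undef, undef ],
);

for my $sample (@samples) {
    my ($site, $reason) = trim_fields(@$sample);
    print "$site | $reason\n";
}
```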
 
  


All times are GMT -5.
