LinuxQuestions.org
Old 03-17-2005, 05:40 AM   #1
Baltasar
Member
 
Registered: Jan 2004
Distribution: Fedora & Debian
Posts: 43

Rep: Reputation: 15
splitlog.pl


Hi,

I need some help with the Perl script splitlog.pl from the Apache source.
I have many name-based virtual hosts sharing one access_log, and now I need to split it per virtual host. What do I have to change so that each virtual host gets its own access_log in its DocumentRoot or home folder?
Thank you in advance!



#!@perlbin@
# The combined log file is read from stdin. Records read
# will be appended to any existing log files.

%is_open = ();

while ($log_line = <STDIN>) {
    #
    # Get the first token from the log record; it's the
    # identity of the virtual host to which the record
    # applies.
    #
    ($vhost) = split (/\s/, $log_line);
    #
    # Normalize the virtual host name to all lowercase.
    # If it's blank, the request was handled by the default
    # server, so supply a default name. This shouldn't
    # happen, but caution rocks.
    # (Uses || rather than "or": "or" binds more loosely than "=",
    # so the default would never be assigned.)
    #
    $vhost = lc($vhost) || "access";
    #
    # If the vhost contains a "/" or "\", it is illegal, so just use
    # the default log to avoid any security issues if it is interpreted
    # as a directory separator.
    #
    if ($vhost =~ m#[/\\]#) { $vhost = "access" }
    #
    # If the log file for this virtual host isn't opened
    # yet, do it now.
    #
    if (! $is_open{$vhost}) {
        open $vhost, ">>${vhost}.log"
            or die ("Can't open ${vhost}.log");
        $is_open{$vhost} = 1;
    }
    #
    # Strip off the first token (which may be null in the
    # case of the default server), and write the edited
    # record to the current log file.
    #
    $log_line =~ s/^\S*\s+//;
    printf $vhost "%s", $log_line;
}
exit 0;
 
Old 03-17-2005, 11:36 AM   #2
TheLinuxDuck
Member
 
Registered: Sep 2002
Location: Tulsa, OK
Distribution: Slack, baby!
Posts: 349

Rep: Reputation: 33
If you want separate log files for specific virtual hosts on one machine, edit your httpd.conf file and add ErrorLog and CustomLog lines similar to the following:
Code:
<VirtualHost *:80>
ServerName nakocity.com
ServerAlias www.nakocity.com
DocumentRoot /domains/nakocity.com/public_html
ErrorLog /domains/nakocity.com/logs/error_log
CustomLog /domains/nakocity.com/logs/access_log common
</VirtualHost>

<VirtualHost *:80>
ServerName basiccablehacks.org
ServerAlias www.basiccablehacks.org
DocumentRoot /domains/basiccablehacks/public_html
ErrorLog /domains/basiccablehacks/logs/error_log
CustomLog /domains/basiccablehacks/logs/access_log common
</VirtualHost>
Once you restart Apache, you'll be good to go.
 
Old 03-18-2005, 07:31 AM   #3
Baltasar
Member
 
Registered: Jan 2004
Distribution: Fedora & Debian
Posts: 43

Original Poster
Rep: Reputation: 15
Thank you for your reply.
I know that approach, but it is only practical when there are just a few virtual hosts.
When you have hundreds of virtual hosts, you run out of file descriptors.
This is described at http://httpd.apache.org/docs-2.0/misc/descriptors.html
So the only solution is to have one log file and split it for each virtual host.
My problem is that I'm not a Perl coder and can't get it to work. Could someone please help me?
 
Old 03-18-2005, 08:55 AM   #4
TheLinuxDuck
Member
 
Registered: Sep 2002
Location: Tulsa, OK
Distribution: Slack, baby!
Posts: 349

Rep: Reputation: 33
There isn't a super easy solution to this. The reason is: how does the script know which dir to put each log file in? You'll either have to maintain a listing somewhere, or the path to each domain has to follow a consistent dir tree layout, so that the location can be built dynamically. What is the dir structure for the domains? Also, could you provide a bit of a sample log, showing several different hosts?
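
To illustrate the two options, here is a rough sketch: an explicit lookup table first, falling back to a path built from a consistent layout. The host names, the /home/<host>/logs layout, and the helper name log_path_for are all placeholders for illustration, not anything from your setup:

```perl
use strict;
use warnings;

# Option 1: an explicit listing of host name => log directory (placeholder data).
my %log_dir_for = (
    'example-one.test' => '/srv/sites/a1/logs',
);

# Option 2: a consistent layout, assumed here to be /home/<host>/logs.
my $layout_base = '/home';

# Hypothetical helper: return the access log path for a host, or nothing
# if the host name is empty, starts with a dot, or contains a directory
# separator.
sub log_path_for {
    my ($vhost) = @_;
    $vhost = lc($vhost // '');
    return if $vhost eq '' || $vhost =~ m{[/\\]} || $vhost =~ /^\./;
    # prefer the explicit listing, fall back to the layout rule
    my $dir = $log_dir_for{$vhost} // "$layout_base/$vhost/logs";
    return "$dir/access_log";
}

print log_path_for('Example-One.test'), "\n";
print log_path_for('example-two.test'), "\n";
```

Which of the two is workable depends on whether the directory names can be derived from the host names at all.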
 
Old 03-18-2005, 09:28 AM   #5
Baltasar
Member
 
Registered: Jan 2004
Distribution: Fedora & Debian
Posts: 43

Original Poster
Rep: Reputation: 15
The structure looks like this:

/home/vhost1/public_html
/home/vhost1/cgi-bin
/home/vhost1/logs
/home/vhost2/public_html
/home/vhost2/cgi-bin
/home/vhost2/logs
....

The name of the vhost folder doesn't match the virtual host domain.
For example, the home folder for www.vhost1.com would be /home/li5fs78d/ and the Apache DocumentRoot /home/li5fs78d/public_html. I know this makes it even more complicated, but an "egrep 'vhost1.com' /etc/passwd" would return li5fs78d:x:1425:1425:vhost1.com:/home/li5fs78d:/sbin/nologin
Maybe this helps a little bit.
I already use the combined_vhost log format, so the first entry in each line of the access_log will always be the virtual host name.
LogFormat "%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined_vhost
Here is an example of the output:

vhost1.com 111.222.111.222 - - [18/Mar/2005:16:21:04 +0100] "GET / HTTP/1.1" 200 13050 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en_US; rv:1.7.5) Gecko/20041108 Firefox/1.0"
vhost2.com 111.222.111.222 - - [18/Mar/2005:16:21:04 +0100] "GET / HTTP/1.1" 200 14335 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en_US; rv:1.7.5) Gecko/20041108 Firefox/1.0"
 
Old 03-18-2005, 03:39 PM   #6
TheLinuxDuck
Member
 
Registered: Sep 2002
Location: Tulsa, OK
Distribution: Slack, baby!
Posts: 349

Rep: Reputation: 33
Ok. What you'll need to do is this:

At the beginning of the script, you'll need to add a section that reads the passwd entries, specifically each account's comment field (which holds the host name in your setup) and its home directory.

Might look something like this:
Code:
  my($hostname, $pathname);
  my($userlist) = {};

  while(($hostname,$pathname) = (getpwent())[6,7]) {
    #  skip anything that doesn't have its home in /home
    #
    next if($pathname !~ /^\/home/);
    #  the split script lowercases the vhost, so lowercase the key too
    #
    $userlist->{lc($hostname)} = $pathname;
  }

  #  don't forget to include a directory for the default logfile!
  #
  $userlist->{access} = '/var/log/access'; # or whatever yours is
For some reason, what getpwent() returns on my system seems to differ from what the Perl book says it should be, but for me, position 6 was the comment field (the host name) and position 7 was the home dir. You might want to verify this on your system.

This code will make the path checking much easier.
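
For what it's worth, perldoc -f getpwent lists the return values as ($name, $passwd, $uid, $gid, $quota, $comment, $gcos, $dir, $shell, $expire), which would put the GECOS text at index 6 and the home directory at index 7. If you'd rather not depend on those positions, here is a minimal sketch that builds the same map from passwd-format lines. The helper name build_userlist is made up, and the sample lines are modelled on the passwd entry quoted earlier in the thread:

```perl
use strict;
use warnings;

# Hypothetical helper: build a { host name => home dir } map from
# passwd-format lines (name:passwd:uid:gid:gecos:dir:shell).
sub build_userlist {
    my (@lines) = @_;
    my %userlist;
    for my $line (@lines) {
        chomp $line;
        my ($name, $passwd, $uid, $gid, $gecos, $dir, $shell) = split /:/, $line, 7;
        # skip malformed lines and anything that doesn't live under /home
        next unless defined $dir && $dir =~ m{^/home/};
        next unless defined $gecos && length $gecos;
        # lowercase the key, since the split script lowercases the vhost
        $userlist{lc $gecos} = $dir;
    }
    return \%userlist;
}

# Sample data only; on a real system you would read /etc/passwd instead.
my $userlist = build_userlist(
    "root:x:0:0:root:/root:/bin/bash\n",
    "li5fs78d:x:1425:1425:vhost1.com:/home/li5fs78d:/sbin/nologin\n",
);

print "$_ => $userlist->{$_}\n" for sort keys %$userlist;
```

Reading the file directly only sees local accounts, so if your users come from LDAP or NIS, getpwent is still the better source.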

Now, right after
Code:
# If the log file for this virtual host isn't opened
# yet, do it now.
#
but before

Code:
 if (! $is_open{$vhost}) {
We're going to add in some path building, as long as the host exists in the userlist hash. Here is my take on it:
Code:
  # first check to see if the host exists in our userlist
  #
  if(defined($userlist->{$vhost})) {
    #  build the path to the log file inside the home directory
    #
    my($logdir)  = $userlist->{$vhost};
    my($logpath) = $logdir . '/' . $vhost . ".log";

    # make sure the directory exists, and that we may write the log:
    # an existing file must be writable, otherwise the directory must
    # be (-w on a file that doesn't exist yet is always false)
    #
    if(-d $logdir && (-e $logpath ? -w $logpath : -w $logdir)) {

      # been opened already?
      #
      if (!$is_open{$vhost}) {
        open $vhost, ">>$logpath" or die ("Can't open $logpath: $!\n");
        $is_open{$vhost} = 1;
      }

      #
      # Strip off the first token (which may be null in the
      # case of the default server), and write the edited
      # record to the current log file.
      #
      $log_line =~ s/^\S*\s+//;
      printf $vhost "%s", $log_line;

      # this record is done; skip the original open/strip/print
      # code that follows so it isn't written a second time
      #
      next;
    }
  }
This should take care of it. Note that I have not tested this code, because my setup is different than yours.
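
If you'd rather not use the host name string itself as the filehandle, here is a minimal sketch of the same routing idea with lexical filehandles cached in a hash. It is an illustration only: split_records, the throwaway temp directory, and the unknown.example host are made up, and it hands back records for unknown hosts instead of deciding where they should go.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: append each record to <dir>/<vhost>.log, where
# <dir> comes from the $userlist map (host name => directory).
# Records for unknown hosts are returned to the caller, not dropped.
sub split_records {
    my ($userlist, @records) = @_;
    my (%fh_for, @unmatched);
    for my $record (@records) {
        # first token is the vhost, the rest is the record to write
        my ($vhost, $rest) = $record =~ /^(\S*)\s+(.*)$/s;
        $vhost = lc($vhost // '');
        my $dir = $userlist->{$vhost};
        if (!defined $dir || !-d $dir) {
            push @unmatched, $record;
            next;
        }
        # open each log once, in append mode, and cache the handle
        if (!$fh_for{$vhost}) {
            open($fh_for{$vhost}, '>>', "$dir/$vhost.log")
                or die "Can't open $dir/$vhost.log: $!\n";
        }
        print { $fh_for{$vhost} } $rest;
    }
    close $_ or die "close failed: $!\n" for values %fh_for;
    return @unmatched;
}

# Demo against a throwaway directory standing in for a home directory.
my $home = tempdir(CLEANUP => 1);
my @left = split_records(
    { 'vhost1.com' => $home },
    qq{vhost1.com 111.222.111.222 - - [18/Mar/2005:16:21:04 +0100] "GET / HTTP/1.1" 200 13050\n},
    qq{unknown.example 111.222.111.222 - - [18/Mar/2005:16:21:04 +0100] "GET / HTTP/1.1" 200 1\n},
);
print "unmatched: ", scalar(@left), "\n";
```

Keep in mind that the splitter still holds one open descriptor per host it has seen, so with hundreds of hosts you may need to check the per-process limit (ulimit -n) or close handles as you go.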

--edit--

Sorry, I noticed a few bugs in the code

Last edited by TheLinuxDuck; 03-18-2005 at 03:43 PM.
 
  

