Old 02-10-2011, 04:25 PM   #16
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Rep: Reputation: 713

Unless I misunderstood you, you said there are some fields that have more than one "name: value" statement per line.
 
Old 02-10-2011, 06:18 PM   #17
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 942
The script will skip log files that do not strictly match the pattern. I assumed DDMMYYYY-IT.log format; is that correct?

As to the multi-line reports: you can do additional filtering before processing, to make sure all input lines are of the format key:value. Here is a second version with some input filtering. This time, the file is read using the file_get_contents function and filtered using the preg_replace function. Any line not containing a colon or not starting with a capital letter is merged with the previous line. If you want to keep the newlines in the report texts, that too is possible; it just takes more detailed filtering.
Code:
<html>
 <head>
  <title>
   Example problem report
  </title>
  <style type="text/css">
   table.report {
    width: 40em !important;
    padding: 0 0 0 0;
    border: 1px solid #cccccc;
    margin: 0 0 2em 0;
    border-collapse: collapse;
    border-spacing: 0;
   }
   table.report td {
    padding: 0.5em 0.5em 0.5em 0.5em;
    border: 0 none;
    margin: 0 0 0 0;
    text-align: left;
    vertical-align: top;
    font-weight: normal;
   }
   table.report th {
    padding: 0.5em 0.5em 0.5em 0.5em;
    border: 0 none;
    margin: 0 0 0 0;
    text-align: right;
    vertical-align: top;
    font-weight: bold;
   }
   table.report th.title {
    padding: 0.5em 0.5em 0.5em 0.5em;
    border-top: 0 none;
    border-right: 0 none;
    border-bottom: 1px solid #cccccc;
    border-left: 0 none;
    background: #efefef;
    text-align: center;
    vertical-align: middle;
    font-weight: bold;
   }
  </style>
 </head>
 <body>
<?PHP

   /* Combine all whitespace into a single space. Make sure there is a space
    * at the beginning, no space at the end. Append a newline, and return the result.
   */
   function fixline($line) {
       return ' ' . trim(@preg_replace('/[\t\n\v\f\r ]+/', ' ', $line), ' ') . "\n";
   }

   $files = glob('/path/to/it/logs/*.log', GLOB_NOSORT);
   echo "  <p>Found ", ($files === FALSE ? 0 : count($files)), " log files.</p>\n";

   if ($files !== FALSE) {

       $temp  = $files;
       $files = array();
       foreach ($temp as $logfile) {
           $index = preg_replace('/^.*\/([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]).*$/', '$1', $logfile);
           $time = mktime(0,0,0, intval(substr($index, 2, 2), 10), intval(substr($index, 0, 2), 10), intval(substr($index,4,4), 10));
           $files[$time] = $logfile;
       }
       unset($temp);
       krsort($files);

       foreach ($files as $time => $logfile) {

           $data = @file_get_contents($logfile);
           if ($data !== FALSE) {

               /* Convert all newline conventions to a single "\n". Also,
                * remove all leading and trailing whitespace from each line.
               */
               $data = @preg_replace('/[\t\n\v\f\r ]*[\n\r][\t\n\v\f\r ]*/', "\n", $data);

               /* Merge any line not starting with a capital letter with the previous line. */
               $data = @preg_replace('/\n([^A-Z])/', " \\1", $data);

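               /* Merge all lines not containing a colon to the previous line.
                * A callback is used instead of the /e modifier, which PHP 7 removed. */
               $data = @preg_replace_callback('/\n([^:]*)\n/', function ($m) { return fixline($m[1]); }, $data);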

               /* Convert to an array of lines. */
               $data = @explode("\n", $data);

               $title = date('D, j M Y', $time);
               echo "  <table class=\"report\">\n";
               echo "   <tr>\n";
               echo "    <th class=\"title\" colspan=\"2\">", htmlentities($title, ENT_QUOTES, 'UTF-8'), "</th>\n";
               echo "   </tr>\n";
               foreach ($data as $entry) {
                   @list($key, $value) = @explode(':', $entry, 2);
                   echo "   <tr>\n";
                   echo "    <th>", htmlentities(trim($key), ENT_COMPAT, 'UTF-8'), "</th>\n";
                   echo "    <td>", htmlentities(trim($value), ENT_COMPAT, 'UTF-8'), "</td>\n";
                   echo "   </tr>\n";
               }
               echo "  </table>\n";
           }
       }
   }
?>
 </body>
</html>
Note that this is still not bulletproof. If you have a list of all possible headings, or a pattern that only matches a heading, you could split the line at their occurrences, instead of newlines.
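As a rough sketch of splitting at known headings: the $headings list and the split_at_headings() helper below are assumptions made for illustration, so substitute the headings your logs really use. The text is split wherever a listed heading starts a line and is followed by a colon, so a colon inside a value does not start a new field.

```php
<?php
// Hypothetical list of headings; replace with the headings used in your logs.
$headings = array('User Name', 'Group', 'Detailed Biodata', 'Conclusion');

/* Split $text at every line that starts with one of $headings followed by a colon.
 * Returns an array of array(heading, value) pairs, in file order.
 */
function split_at_headings($text, $headings) {
    $quoted = array();
    foreach ($headings as $heading) {
        $quoted[] = preg_quote($heading, '/');
    }
    $pattern = '/^[ \t]*(' . implode('|', $quoted) . ')[ \t]*:/m';

    /* With PREG_SPLIT_DELIM_CAPTURE the result alternates:
     * text before the first heading, heading, value, heading, value, ...
     */
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    $fields = array();
    for ($i = 1; $i + 1 < count($parts); $i += 2) {
        /* Collapse all whitespace, including newlines, into single spaces. */
        $value = trim(preg_replace('/\s+/', ' ', $parts[$i + 1]));
        $fields[] = array($parts[$i], $value);
    }
    return $fields;
}
```

A value line that itself begins with one of the listed headings and a colon would still be split, so this only helps when the headings are distinctive.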

Does this script work for you better?
Nominal Animal

Last edited by Nominal Animal; 03-21-2011 at 08:35 AM.
 
Old 02-11-2011, 09:18 AM   #18
devUnix
Member
 
Registered: Oct 2010
Location: Bengaluru, India
Distribution: RHEL 5.1 on My PC, & SunOS / Sun Solaris, RHEL, SuSe, Debian, FreeBSD and other Linux flavors @ Work
Posts: 553

Original Poster
Rep: Reputation: 46
Quote:
Originally Posted by Nominal Animal
The script will skip log files that do not strictly match the pattern. I assumed DDMMYYYY-IT.log format; is that correct?

Does this script work for you better?
Nominal Animal
It works fine! There was only one weird thing:

It reported:

Found 5 log files.

That was correct. But only 3 of them were displayed. I added one more log file. This time it reported:

Found 6 log files.

But again, 2 files were not displayed.



The log files have this format: DDMMYYYY-SomeDepartment.log

SomeDepartment could be any one-word value consisting of the characters A-Z and a-z, such as:

DDMMYYYY-IT.log
DDMMYYYY-Messaging.log
DDMMYYYY-Server.log

More specifically:


Code:
-bash-2.05b# ls -1t /tmp/logs
11022011-Mike.log
10022011-Mariyam.log
09022011-Mariyam.log
09022011-Ken.log
09022011-Mike.log
08022011-Mike.log
So, I changed this line to:

$files = glob('/tmp/logs/*.log', GLOB_NOSORT);

Everything looks fine. I am not sure why some files are being skipped. Only these reports are being displayed:

Fri, 11 Feb 2011

Thu, 10 Feb 2011

Wed, 9 Feb 2011

Tue, 8 Feb 2011


These files are being skipped:

09022011-Ken.log
09022011-Mike.log
 
Old 02-11-2011, 09:21 AM   #19
devUnix
Member
 
Registered: Oct 2010
Location: Bengaluru, India
Distribution: RHEL 5.1 on My PC, & SunOS / Sun Solaris, RHEL, SuSe, Debian, FreeBSD and other Linux flavors @ Work
Posts: 553

Original Poster
Rep: Reputation: 46
Quote:
Originally Posted by MTK358
Unless I misunderstood you, you said there are some fields that have more than one "name: value" statement per line.

Hm... Name: Value is correct. But the value could span multiple lines.
 
Old 02-11-2011, 09:47 AM   #20
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Rep: Reputation: 713
Quote:
Originally Posted by devUnix
Hm... Name: Value is correct. But the value could span multiple lines.
OK. I assumed that the end of the line == the end of the value. So the value ends on the next line that contains a ':' character?
 
Old 02-11-2011, 10:00 AM   #21
devUnix
Member
 
Registered: Oct 2010
Location: Bengaluru, India
Distribution: RHEL 5.1 on My PC, & SunOS / Sun Solaris, RHEL, SuSe, Debian, FreeBSD and other Linux flavors @ Work
Posts: 553

Original Poster
Rep: Reputation: 46
Quote:
Originally Posted by MTK358
OK. I assumed that the end of the line == the end of the value. So the value ends on the next line that contains a ':' character?
Consider this log:


User Name: Mr. Cracker
Group: Hackers
Detailed Biodata:
Cracked over 1000 propietory applications. Kills several processes whenever logs in. Does not let the IT world go smoothly. Never gets trapped. Blah! Blah!
Conclusion:
He is selected for the job!

What appears in bold above (the field names) should also appear in bold on the web page. That's it. But the original log file does not contain any HTML tags or code whatsoever; it contains only logs! Looking at the log above, we can easily identify the fields: User Name, Group, Detailed Biodata, and Conclusion. Each of them begins on a separate line and ends with a colon (:). Nothing marks where a particular field's value (Detailed Biodata, for example) ends; we can identify that only by looking at it. We see that the next item (Conclusion, in the given example) begins, and we conclude that the previous item's value (the Detailed Biodata) ends there.
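The rule described here (a field name starts a line and ends with a colon, and everything up to the next field name is its value) can be sketched roughly as follows. The names parse_fields() and render_fields() are hypothetical, and the heading pattern (a capital letter followed only by letters and spaces, then a colon) is an assumption; a value line that happens to match it would be mistaken for a new field.

```php
<?php
/* Parse "Name: value" text where a value may continue over several lines.
 * Returns an array of array(name, value) pairs, in file order.
 */
function parse_fields($text) {
    $fields  = array();
    $current = -1;
    foreach (preg_split('/\r\n|\r|\n/', $text) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        if (preg_match('/^([A-Z][A-Za-z ]*):\s*(.*)$/', $line, $m)) {
            /* Assumed heading pattern matched: start a new field. */
            $fields[] = array($m[1], $m[2]);
            $current++;
        } elseif ($current >= 0) {
            /* Continuation line: append it to the current field's value. */
            $fields[$current][1] = trim($fields[$current][1] . ' ' . $line);
        }
    }
    return $fields;
}

/* Render each field with its name in bold, escaping everything for HTML. */
function render_fields($fields) {
    $html = '';
    foreach ($fields as $field) {
        $html .= '<p><b>' . htmlentities($field[0], ENT_QUOTES, 'UTF-8') . ':</b> '
               . htmlentities($field[1], ENT_QUOTES, 'UTF-8') . "</p>\n";
    }
    return $html;
}
```

Calling render_fields(parse_fields($text)) on the contents of one log file would then give one paragraph per field, with the field name in bold.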
 
Old 02-11-2011, 06:07 PM   #22
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 942
Quote:
Originally Posted by devUnix
The log files have this format: DDMMYYYY-SomeDepartment.log
That's the reason. The files are keyed by date alone, so when several files share a date each one overwrites the previous entry, and only one report per day is shown. It's easy to fix, though; just append the department to the key:
Code:
<html>
 <head>
  <title>
   Example problem report
  </title>
  <style type="text/css">
   table.report {
    width: 40em !important;
    padding: 0 0 0 0;
    border: 1px solid #cccccc;
    margin: 0 0 2em 0;
    border-collapse: collapse;
    border-spacing: 0;
   }
   table.report td {
    padding: 0.5em 0.5em 0.5em 0.5em;
    border: 0 none;
    margin: 0 0 0 0;
    text-align: left;
    vertical-align: top;
    font-weight: normal;
   }
   table.report th {
    padding: 0.5em 0.5em 0.5em 0.5em;
    border: 0 none;
    margin: 0 0 0 0;
    text-align: right;
    vertical-align: top;
    font-weight: bold;
   }
   table.report th.title {
    padding: 0.5em 0.5em 0.5em 0.5em;
    border-top: 0 none;
    border-right: 0 none;
    border-bottom: 1px solid #cccccc;
    border-left: 0 none;
    background: #efefef;
    text-align: center;
    vertical-align: middle;
    font-weight: bold;
   }
  </style>
 </head>
 <body>
<?PHP

   /* Combine all whitespace into a single space. Make sure there is a space
    * at the beginning, no space at the end. Append a newline, and return the result.
   */
   function fixline($line) {
       return ' ' . trim(@preg_replace('/[\t\n\v\f\r ]+/', ' ', $line), ' ') . "\n";
   }

   $files = glob('/path/to/log/files/*.log', GLOB_NOSORT);
   echo "  <p>Found ", ($files === FALSE ? 0 : count($files)), " log files.</p>\n";

   if ($files !== FALSE) {

       $temp  = $files;
       $files = array();
       foreach ($temp as $logfile) {
           $index = preg_replace('/^.*\/([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])-*(.*)\.log$/', '$1 $2', $logfile);
           $time = mktime(0,0,0, intval(substr($index, 2, 2), 10), intval(substr($index, 0, 2), 10), intval(substr($index,4,4), 10));
           $files[$time  . substr($index, 8)] = $logfile;
       }
       unset($temp);
       krsort($files);

       foreach ($files as $key => $logfile) {
           list($time, $department) = explode(' ', $key, 2);
           $time = intval($time, 10);
           $department = trim($department, "\n\t\v\f\r ");

           $data = @file_get_contents($logfile);
           if ($data !== FALSE) {

               /* Convert all newline conventions to a single "\n". Also,
                * remove all leading and trailing whitespace from each line.
               */
               $data = @preg_replace('/[\t\n\v\f\r ]*[\n\r][\t\n\v\f\r ]*/', "\n", $data);

               /* Merge any line not starting with a capital letter with the previous line. */
               $data = @preg_replace('/\n([^A-Z])/', " \\1", $data);

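               /* Merge all lines not containing a colon to the previous line.
                * A callback is used instead of the /e modifier, which PHP 7 removed. */
               $data = @preg_replace_callback('/\n([^:]*)\n/', function ($m) { return fixline($m[1]); }, $data);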

               /* Convert to an array of lines. */
               $data = @explode("\n", $data);

               $title = date('D, j M Y', $time);
               echo "  <table class=\"report\">\n";
               echo "   <tr>\n";
               echo "    <th class=\"title\" colspan=\"2\">", htmlentities($department, ENT_QUOTES, 'UTF-8'),
                    " - ", htmlentities($title, ENT_QUOTES, 'UTF-8'), "</th>\n";
               echo "   </tr>\n";
               foreach ($data as $entry) {
                   @list($key, $value) = @explode(':', $entry, 2);
                   echo "   <tr>\n";
                   echo "    <th>", htmlentities(trim($key), ENT_COMPAT, 'UTF-8'), "</th>\n";
                   echo "    <td>", htmlentities(trim($value), ENT_COMPAT, 'UTF-8'), "</td>\n";
                   echo "   </tr>\n";
               }
               echo "  </table>\n";
           }
       }
   }
?>
 </body>
</html>
Only seven lines changed. The department is now listed before the date. Hope this works for you,
Nominal Animal
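
To see the key collision in isolation, here is a minimal sketch. parse_log_name() is a hypothetical helper, not part of the script, and the file names are four of those from the listing earlier in the thread. Keying by date alone keeps one file per day, while keying by date plus department keeps them all.

```php
<?php
// Four of the file names from the directory listing earlier in the thread.
$names = array(
    '09022011-Mariyam.log',
    '09022011-Ken.log',
    '09022011-Mike.log',
    '08022011-Mike.log',
);

/* Hypothetical helper: split a DDMMYYYY-Department.log name into its parts.
 * Returns FALSE if the name does not match that format.
 */
function parse_log_name($name) {
    if (!preg_match('/^(\d{2})(\d{2})(\d{4})-([A-Za-z]+)\.log$/', basename($name), $m)) {
        return FALSE;
    }
    return array(
        'time'       => mktime(0, 0, 0, (int)$m[2], (int)$m[1], (int)$m[3]),
        'department' => $m[4],
    );
}

$by_date          = array();
$by_date_and_dept = array();
foreach ($names as $name) {
    $info = parse_log_name($name);
    if ($info === FALSE) {
        continue;
    }
    /* Same date means same key, so a later file replaces an earlier one. */
    $by_date[$info['time']] = $name;
    /* Adding the department makes the key unique per file. */
    $by_date_and_dept[$info['time'] . ' ' . $info['department']] = $name;
}

echo count($by_date), " entries keyed by date, ",
     count($by_date_and_dept), " keyed by date and department\n";
```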

Last edited by Nominal Animal; 03-21-2011 at 08:11 AM.
 
Old 02-28-2011, 10:51 AM   #23
devUnix
Member
 
Registered: Oct 2010
Location: Bengaluru, India
Distribution: RHEL 5.1 on My PC, & SunOS / Sun Solaris, RHEL, SuSe, Debian, FreeBSD and other Linux flavors @ Work
Posts: 553

Original Poster
Rep: Reputation: 46
I have not checked out this last version of your script. But I must say you are really awesome when it comes to helping others, and you are also very creative.

Well, it is weird. I was doing all this to facilitate the team's reporting work, but they said they would continue sending reports by email. I developed similar web applications for Command Center / Data Center Operations and Helpdesk at an organization, and they were so happy that they cannot do anything without my applications. Really, people need to have a sense of "process enhancement". What do you say?
 
Old 02-28-2011, 02:17 PM   #24
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 942
Quote:
Originally Posted by devUnix
Well, it is weird. I was doing all this to facilitate the team's reporting work, but they said they would continue sending reports by email. I developed similar web applications for Command Center / Data Center Operations and Helpdesk at an organization, and they were so happy that they cannot do anything without my applications. Really, people need to have a sense of "process enhancement". What do you say?
It all depends on the workflow as a whole. I mean, my e-mail client sorts my e-mail into categories, and if there is a class of e-mails I need to respond to on short notice, I can have it alert me. That works for me when I'm programming or doing some sort of creative work and want to concentrate on one thing without distractions. On the other hand, when I'm monitoring a large number of resources, I prefer a web interface with visuals that are easy on the eye.

It may be frustrating at times, but whether or not a tool enhances an environment or a process is always a complex question -- no matter how simple or easy or brilliant the tool is. I've personally found that trying things out, even if they are discarded or found lacking or just torpedoed without explanation, is beneficial in the long run. The ideas tend to lurk in the subconscious, and pop out the next time they might be useful. Don't get discouraged if this solution does not fit your team. Let them know you only wanted to make everybody's workflow easier, and that you accept that the solution wasn't suited to current needs; you'll think of something else, something better, later on. You might find that that attitude impresses coworkers much more than brilliant solutions.

After all, the important thing is to keep trying to improve. Not necessarily the tools per se, but the workflow.
 
  


All times are GMT -5.
