Formatting Fields and Text Being Displayed from Text File

MTK358 · 02-10-2011, 04:25 PM

Unless I misunderstood you, you said there are some fields that have more than one "name: value" statement per line.

Nominal Animal · 02-10-2011, 06:18 PM

The script will skip log files that do not strictly match the pattern. I assumed DDMMYYYY-IT.log format; is that correct?

As to the multi-line reports: You can do additional filtering before processing, to make sure all input lines are of format key:value. Here is a second version with some input filtering. This time, the file is read using the file_get_contents function, and filtered using the preg_replace function. Any line not containing a colon or not starting with a capital letter is merged with the previous line. If you want to keep the newlines in the report texts, that too is possible, it just takes more detailed filtering.

Code:

<html>
 <head>
  <title>
   Example problem report
  </title>
  <style type="text/css">
   table.report {
    width: 40em !important;
    padding: 0 0 0 0;
    border: 1px solid #cccccc;
    margin: 0 0 2em 0;
    border-collapse: collapse;
    border-spacing: 0;
   }
   table.report td {
    padding: 0.5em 0.5em 0.5em 0.5em;
    border: 0 none;
    margin: 0 0 0 0;
    text-align: left;
    vertical-align: top;
    font-weight: normal;
   }
   table.report th {
    padding: 0.5em 0.5em 0.5em 0.5em;
    border: 0 none;
    margin: 0 0 0 0;
    text-align: right;
    vertical-align: top;
    font-weight: bold;
   }
   table.report th.title {
    padding: 0.5em 0.5em 0.5em 0.5em;
    border-top: 0 none;
    border-right: 0 none;
    border-bottom: 1px solid #cccccc;
    border-left: 0 none;
    background: #efefef;
    text-align: center;
    vertical-align: middle;
    font-weight: bold;
   }
  </style>
 </head>
 <body>
<?PHP

   /* Combine all whitespace into a single space. Make sure there is a space
    * at the beginning, no space at the end. Append a newline, and return the result.
   */
   function fixline($line) {
       return ' ' . trim(@preg_replace('/[\t\n\v\f\r ]+/', ' ', $line), ' ') . "\n";
   }

   $files = glob('/path/to/it/logs/*.log', GLOB_NOSORT);
   echo "  <p>Found ", @count($files), " log files.</p>\n";

   if ($files !== FALSE) {

       $temp  = $files;
       $files = array();
       foreach ($temp as $logfile) {
           $index = preg_replace('/^.*\/([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]).*$/', '$1', $logfile);
           $time = mktime(0,0,0, intval(substr($index, 2, 2), 10), intval(substr($index, 0, 2), 10), intval(substr($index,4,4), 10));
           $files[$time] = $logfile;
       }
       unset($temp);
       krsort($files);

       foreach ($files as $time => $logfile) {

           $data = @file_get_contents($logfile);
           if ($data !== FALSE) {

               /* Convert all newline conventions to a single "\n". Also,
                * remove all leading and trailing whitespace from each line.
               */
               $data = @preg_replace('/[\t\n\v\f\r ]*[\n\r][\t\n\v\f\r ]*/', "\n", $data);

               /* Merge any line not starting with a capital letter with the previous line. */
               $data = @preg_replace('/\n([^A-Z])/', " \\1", $data);

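               /* Merge all lines not containing a colon to the previous line.
                * A callback is used because the /e modifier is not available in PHP 7 and later. */
               $data = @preg_replace_callback('/\n([^:]*)\n/', function ($m) { return fixline($m[1]); }, $data);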

               /* Convert to an array of lines. */
               $data = @explode("\n", $data);

               $title = date('D, j M Y', $time);
               echo "  <table class=\"report\">\n";
               echo "   <tr>\n";
               echo "    <th class=\"title\" colspan=\"2\">", htmlentities($title, ENT_QUOTES, 'UTF-8'), "</th>\n";
               echo "   </tr>\n";
               foreach ($data as $entry) {
                   @list($key, $value) = @explode(':', $entry, 2);
                   echo "   <tr>\n";
                   echo "    <th>", htmlentities(trim($key), ENT_COMPAT, 'UTF-8'), "</th>\n";
                   echo "    <td>", htmlentities(trim($value), ENT_COMPAT, 'UTF-8'), "</td>\n";
                   echo "   </tr>\n";
               }
               echo "  </table>\n";
           }
       }
   }
?>
 </body>
</html>

Note that this is still not bulletproof. If you have a list of all possible headings, or a pattern that matches only headings, you could split the input at their occurrences instead of at newlines.
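As a rough illustration of splitting at known headings, here is a minimal sketch. The heading list and the function name split_by_headings are hypothetical; the real headings would have to come from your report template. It assumes each heading starts a line and is followed by a colon, and it treats everything up to the next known heading as the value, so a colon inside the value text does not start a new field.

```php
<?php
// Hypothetical list of headings; replace with the ones your reports use.
$headings = array('User Name', 'Group', 'Detailed Biodata', 'Conclusion');

/* Split raw report text into heading => value pairs, using the known
 * headings (each at the start of a line and followed by a colon) as
 * the only delimiters. */
function split_by_headings($text, $headings) {
    $alternatives = array();
    foreach ($headings as $heading) {
        $alternatives[] = preg_quote($heading, '/');
    }
    $pattern = '/^[ \t]*(' . implode('|', $alternatives) . ')[ \t]*:/m';

    // With PREG_SPLIT_DELIM_CAPTURE the result alternates:
    // text before the first heading, heading, value, heading, value, ...
    $parts = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    $fields = array();
    for ($i = 1; $i + 1 < count($parts); $i += 2) {
        // Collapse all whitespace runs (including newlines) into single spaces.
        $fields[$parts[$i]] = trim(preg_replace('/\s+/', ' ', $parts[$i + 1]));
    }
    return $fields;
}
```

Calling split_by_headings($data, $headings) on the raw file contents returns an array keyed by heading. If a heading occurs twice in one file, the later value replaces the earlier one, so this only suits reports where each heading appears once.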

Does this script work better for you?

Nominal Animal

devUnix · 02-11-2011, 09:18 AM

Quote:

Originally Posted by Nominal Animal

The script will skip log files that do not strictly match the pattern. I assumed DDMMYYYY-IT.log format; is that correct?

It works fine! There was only one weird thing:

It reported:

Found 5 log files.

That was correct. But only 3 of them were displayed. I added one more log file. This time it reported:

Found 6 log files.

But again, 2 files were not displayed.

The log files have this format: DDMMYYYY-SomeDepartment.log

SomeDepartment could be any one-word value consisting of characters from A-Z and a-z, such as:

DDMMYYYY-IT.log
DDMMYYYY-Messaging.log
DDMMYYYY-Server.log

More specifically:

Code:

-bash-2.05b# ls -1t /tmp/logs
11022011-Mike.log
10022011-Mariyam.log
09022011-Mariyam.log
09022011-Ken.log
09022011-Mike.log
08022011-Mike.log

So, I changed this line:

$files = glob('/tmp/logs/*.log', GLOB_NOSORT);

Everything looks fine. I am not sure why some files are being skipped. Only these reports are being displayed:

Fri, 11 Feb 2011

Thu, 10 Feb 2011

Wed, 9 Feb 2011

Tue, 8 Feb 2011

These files are being skipped:

09022011-Ken.log
09022011-Mike.log

devUnix · 02-11-2011, 09:21 AM

Quote:

Originally Posted by MTK358

Unless I misunderstood you, you said there are some fields that have more than one "name: value" statement per line.

Hm... Name: Value is correct. But the value could span multiple lines.

MTK358 · 02-11-2011, 09:47 AM

Quote:

Originally Posted by devUnix

Hm... Name: Value is correct. But the value could span multiple lines.

OK. I assumed that the end of the line == the end of the value. So the value ends on the next line that contains a ':' character?

devUnix · 02-11-2011, 10:00 AM

Quote:

Originally Posted by MTK358

OK. I assumed that the end of the line == the end of the value. So the value ends on the next line that contains a ':' character?

Consider this log:

User Name: Mr. Cracker
Group: Hackers
Detailed Biodata:
Cracked over 1000 propietory applications. Kills several processes whenever logs in. Does not let the IT world go smoothly. Never gets trapped. Blah! Blah!
Conclusion:
He is selected for the job!

What appears in bold above (the field names) should also appear in bold on a web page. That's it. But the original log file does not contain any HTML tags or code whatsoever. It contains only logs! When we look at the above log, we can easily identify the fields: User Name, Group, Detailed Biodata, and Conclusion. Each of them begins on a separate line and ends in a colon (:). Nothing marks where a particular field's value (Detailed Biodata, for example) ends; we can identify it only by looking at it. We see that the next item (Conclusion, in the given example) begins here, and we conclude that the previous item's value (the Detailed Biodata) ends here.
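The rule described above (a field starts a new line and ends in a colon; its value runs until the next field) could be sketched roughly like this. The function names parse_report and render_report are made up for illustration, and the pattern for what counts as a field label (a capital letter followed by at most 39 letters or spaces before the colon) is only an assumption. A value line that itself looks like "Word: text" would be mistaken for a new field.

```php
<?php
/* Parse a plain-text report into an ordered list of (label, value) pairs.
 * Assumption: a field label is a short run of letters and spaces at the
 * start of a line, followed by a colon. Any other line continues the
 * value of the most recent field. Lines before the first field are dropped. */
function parse_report($text) {
    $fields = array();
    foreach (preg_split('/\r\n|\r|\n/', $text) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        if (preg_match('/^([A-Z][A-Za-z ]{0,39}):\s*(.*)$/', $line, $m)) {
            // A new field starts here; the rest of the line (if any) begins its value.
            $fields[] = array($m[1], $m[2]);
        } elseif (count($fields) > 0) {
            // Continuation line: append it to the current field's value.
            $last = count($fields) - 1;
            $fields[$last][1] = trim($fields[$last][1] . ' ' . $line);
        }
    }
    return $fields;
}

/* Render the pairs as HTML, with each label in bold. */
function render_report($fields) {
    $html = '';
    foreach ($fields as $field) {
        $html .= '<p><b>' . htmlspecialchars($field[0], ENT_QUOTES, 'UTF-8') . ':</b> '
               . htmlspecialchars($field[1], ENT_QUOTES, 'UTF-8') . "</p>\n";
    }
    return $html;
}

// Shortened version of the sample log above.
$log = "User Name: Mr. Cracker\nGroup: Hackers\nDetailed Biodata:\n"
     . "Cracked many applications. Never gets trapped.\nConclusion:\nHe is selected for the job!\n";
echo render_report(parse_report($log));
```

Keeping the parsing separate from the rendering means the same pairs could be fed into a table layout instead of paragraphs.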

Nominal Animal · 02-11-2011, 06:07 PM

Quote:

Originally Posted by devUnix

The log files have this format: DDMMYYYY-SomeDepartment.log

That's the reason. The files are keyed by date alone, so files that share a date overwrite each other in the array and only one report per day is shown. It's easy to fix, though; just append the department to the key:

Code:

<html>
 <head>
  <title>
   Example problem report
  </title>
  <style type="text/css">
   table.report {
    width: 40em !important;
    padding: 0 0 0 0;
    border: 1px solid #cccccc;
    margin: 0 0 2em 0;
    border-collapse: collapse;
    border-spacing: 0;
   }
   table.report td {
    padding: 0.5em 0.5em 0.5em 0.5em;
    border: 0 none;
    margin: 0 0 0 0;
    text-align: left;
    vertical-align: top;
    font-weight: normal;
   }
   table.report th {
    padding: 0.5em 0.5em 0.5em 0.5em;
    border: 0 none;
    margin: 0 0 0 0;
    text-align: right;
    vertical-align: top;
    font-weight: bold;
   }
   table.report th.title {
    padding: 0.5em 0.5em 0.5em 0.5em;
    border-top: 0 none;
    border-right: 0 none;
    border-bottom: 1px solid #cccccc;
    border-left: 0 none;
    background: #efefef;
    text-align: center;
    vertical-align: middle;
    font-weight: bold;
   }
  </style>
 </head>
 <body>
<?PHP

   /* Combine all whitespace into a single space. Make sure there is a space
    * at the beginning, no space at the end. Append a newline, and return the result.
   */
   function fixline($line) {
       return ' ' . trim(@preg_replace('/[\t\n\v\f\r ]+/', ' ', $line), ' ') . "\n";
   }

   $files = glob('/path/to/log/files/*.log', GLOB_NOSORT);
   echo "  <p>Found ", @count($files), " log files.</p>\n";

   if ($files !== FALSE) {

       $temp  = $files;
       $files = array();
       foreach ($temp as $logfile) {
           $index = preg_replace('/^.*\/([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])-*(.*)\.log$/', '$1 $2', $logfile);
           $time = mktime(0,0,0, intval(substr($index, 2, 2), 10), intval(substr($index, 0, 2), 10), intval(substr($index,4,4), 10));
           $files[$time  . substr($index, 8)] = $logfile;
       }
       unset($temp);
       krsort($files);

       foreach ($files as $key => $logfile) {
           list($time, $department) = explode(' ', $key, 2);
           $time = intval($time, 10);
           $department = trim($department, "\n\t\v\f\r ");

           $data = @file_get_contents($logfile);
           if ($data !== FALSE) {

               /* Convert all newline conventions to a single "\n". Also,
                * remove all leading and trailing whitespace from each line.
               */
               $data = @preg_replace('/[\t\n\v\f\r ]*[\n\r][\t\n\v\f\r ]*/', "\n", $data);

               /* Merge any line not starting with a capital letter with the previous line. */
               $data = @preg_replace('/\n([^A-Z])/', " \\1", $data);

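               /* Merge all lines not containing a colon to the previous line.
                * A callback is used because the /e modifier is not available in PHP 7 and later. */
               $data = @preg_replace_callback('/\n([^:]*)\n/', function ($m) { return fixline($m[1]); }, $data);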

               /* Convert to an array of lines. */
               $data = @explode("\n", $data);

               $title = date('D, j M Y', $time);
               echo "  <table class=\"report\">\n";
               echo "   <tr>\n";
               echo "    <th class=\"title\" colspan=\"2\">", htmlentities($department, ENT_QUOTES, 'UTF-8'),
                    " - ", htmlentities($title, ENT_QUOTES, 'UTF-8'), "</th>\n";
               echo "   </tr>\n";
               foreach ($data as $entry) {
                   @list($key, $value) = @explode(':', $entry, 2);
                   echo "   <tr>\n";
                   echo "    <th>", htmlentities(trim($key), ENT_COMPAT, 'UTF-8'), "</th>\n";
                   echo "    <td>", htmlentities(trim($value), ENT_COMPAT, 'UTF-8'), "</td>\n";
                   echo "   </tr>\n";
               }
               echo "  </table>\n";
           }
       }
   }
?>
 </body>
</html>

Only seven lines changed. The department is now listed before the date. Hope this works for you,

Nominal Animal
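P.S. The effect of keying by date can be reproduced in isolation. This is only a minimal sketch, not part of the script: it uses four of the file names from the listing earlier in the thread, and the variable names are chosen just for this example. It keys by the DDMMYYYY string directly rather than by a timestamp, which collides in the same way.

```php
<?php
// File names taken from the listing earlier in the thread.
$names = array(
    '09022011-Mariyam.log',
    '09022011-Ken.log',
    '09022011-Mike.log',
    '08022011-Mike.log',
);

$byDate = array();
$byDateAndDepartment = array();
foreach ($names as $name) {
    $date       = substr($name, 0, 8);   // DDMMYYYY
    $department = substr($name, 9, -4);  // text between '-' and '.log'

    // Same date means same key, so a later file overwrites an earlier one.
    $byDate[$date] = $name;

    // Adding the department makes the key unique per file.
    $byDateAndDepartment[$date . ' ' . $department] = $name;
}

echo count($byDate), " entries when keyed by date\n";
echo count($byDateAndDepartment), " entries when keyed by date and department\n";
```

Keyed by date, the four files collapse into two entries; keyed by date and department, all four survive.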

devUnix · 02-28-2011, 10:51 AM

I have not checked out this last version of your script. But I must say you are really awesome when it comes to helping others, and you are also very creative.

Well, it is weird. I was doing it all to facilitate the reporting work done by the team, but they said they would continue sending reports by email. I developed similar web applications for Command Center / Data Center Operations and Helpdesk in some organization, and they were so happy that they cannot do anything without my applications. Really, people need to have a sense of "process enhancement". What do you say?

Nominal Animal · 02-28-2011, 02:17 PM

Quote:

Originally Posted by devUnix

Well, it is weird. I was doing it all to facilitate the reporting work done by the team, but they said they would continue sending reports by email. I developed similar web applications for Command Center / Data Center Operations and Helpdesk in some organization, and they were so happy that they cannot do anything without my applications. Really, people need to have a sense of "process enhancement". What do you say?

It all depends on the workflow as a whole. I mean, my e-mail client sorts my e-mail into categories, and if there is a class of e-mails I need to respond to on short notice, I can have it alert me. It works for me if I'm programming, or doing some sort of creative work, when I want to concentrate on one thing without distractions. On the other hand, when I'm monitoring a large number of resources, I prefer a web interface with visuals that are easy on the eye.

It may be frustrating at times, but whether or not a tool enhances an environment or a process is always a complex question -- no matter how simple or easy or brilliant it is. I've personally found that trying things out, even if they are discarded or found lacking or just torpedoed without explanation, is beneficial in the long run. The ideas tend to lurk in the subconscious, and pop out the next time they might be useful. Don't get discouraged if this solution does not fit your team. Let them know you only wanted to make everybody's workflow easier, and that you accept that the solution wasn't suitable to current needs; you'll think of something else, something better, later on. You might find that that attitude impresses coworkers much more than brilliant solutions.

After all, the important thing is to keep trying to improve. Not necessarily the tools per se, but the workflow.