LinuxQuestions.org

mychl 02-12-2004 05:03 PM

Flat File QC Scripting
 
Hi All, need some help.

I have a Linux server that runs PHP/MySQL, etc. I'm trying to automate the QC process on various flat-file formats (txt, csv).

All I really need to do is verify the total record length for each line of data.

Does anyone have any ideas about what type of script I could use to do this? If it's a script I can run via the web server, that would be even better.

Thanks for any input in advance----

mychl

Tinkster 02-13-2004 04:10 AM

I don't even know what QC is :}


Cheers,
Tink

mychl 02-13-2004 09:19 AM

Quality Control.... basically just verifying the data....

:)

jim mcnamara 02-13-2004 12:18 PM

This is a start:
Code:

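#!/bin/sh
# script: linecnt
# the $1 parameter is the file name
# usage: linecnt <file name>

echo "Line lengths for $1"
counter=0
# IFS= and -r keep leading/trailing whitespace and backslashes intact,
# which matters for fixed-width records
while IFS= read -r rec
do
    counter=$((counter + 1))
    # wc -c counts bytes, including the newline that printf adds back
    echo "$counter: $(printf '%s\n' "$rec" | wc -c)"
done < "$1"
# eof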
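
The same idea can be narrowed to report only the lines that do not match. This is a minimal sketch, assuming the first line of the file has the correct record length; check_lengths is a hypothetical helper name, and the length it reports excludes the newline.

```shell
#!/bin/sh
# Sketch: report every line whose length differs from the first line's.
# check_lengths is a hypothetical helper name used for illustration.
check_lengths() {
    # awk's length($0) does not count the trailing newline
    awk 'NR == 1 { want = length($0) }
         length($0) != want {
             printf "line %d: length %d, expected %d\n", NR, length($0), want
             bad = 1
         }
         END { exit bad + 0 }' "$1"
}

# Run against a file only when one is given on the command line.
if [ "$#" -gt 0 ]; then
    check_lengths "$1"
fi
```

The function exits non-zero when any line is off, so a cron job or a web-server wrapper could test the exit status instead of parsing the output.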


mychl 02-15-2004 03:19 PM

Thanks Jim, that gave me a good start.....

Works great for TEXT format, now I'm working on CSV.

I'll post what I work out....

Thanks again!
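
For the CSV case, the thread does not say what the check ended up being. One possible approach, offered only as a sketch, is to compare the number of fields on each line against the first line. It assumes a plain comma delimiter with no quoted or embedded commas, and check_fields is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: flag CSV lines whose field count differs from the first line's.
# Assumption: fields are separated by plain commas, with no quoted commas.
# check_fields is a hypothetical helper name used for illustration.
check_fields() {
    awk -F',' 'NR == 1 { want = NF }
               NF != want {
                   printf "line %d: %d fields, expected %d\n", NR, NF, want
                   bad = 1
               }
               END { exit bad + 0 }' "$1"
}

# Run against a file only when one is given on the command line.
if [ "$#" -gt 0 ]; then
    check_fields "$1"
fi
```

Files that quote fields containing commas would need a real CSV parser rather than a simple split on the delimiter.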

mychl 02-20-2004 12:53 PM

Here is what I came up with, using Perl...

[code]
#!/usr/bin/perl
########################################################################
use strict;
use warnings;

open(my $log, '>', 'QC_TXT_LOG.log') or die "Cannot open QC_TXT_LOG.log: $!";
print $log "Begin Parsing All TXT Files\n";
print $log "========================================\n\n";
# Scan Parent Directory for Sub-Folders
opendir(my $main, '.') or die "Cannot open current directory: $!";
my @dirs = readdir($main);
closedir($main);
#######################################################################
# Check each immediate sub-folder (one level deep), skipping . and ..
foreach my $dir (@dirs) {
    next if $dir eq '.' or $dir eq '..';
    next unless -d $dir;
    print $log "Profile: $dir\n";
    print $log "===============================\n";
    opendir(my $dh, $dir) or do { warn "Cannot open $dir: $!\n"; next };
    # Match .txt in any letter case
    my @files = grep { /\.txt$/i } readdir($dh);
    closedir($dh);
    foreach my $file (@files) {
        my @fileinfo = stat("$dir/$file");
        my $filesize = $fileinfo[7] / 1000;      # element 7 is size in bytes
        my $filedate = localtime($fileinfo[9]);  # element 9 is mtime
        print $log "\tFilename: $file\n";
        print $log "\tFile Size: $filesize Kb\n";
        print $log "\tFile Date: $filedate\n";
        my $linecount = 0;
        my $first_length;
        open(my $fh, '<', "$dir/$file")
            or do { warn "Cannot open $dir/$file: $!\n"; next };
        while (my $line = <$fh>) {
            $linecount++;
            # length() includes the trailing newline
            my $lcount = length($line);
            if ($linecount == 1) {
                # The first record sets the expected length for the file
                $first_length = $lcount;
                print $log "\t\tRecord Length:\t$first_length\n";
            }
            if ($lcount != $first_length) {
                print $log "***Line:$linecount Count:$lcount DOES NOT MATCH PROFILE***\n";
            }
        }
        close($fh);
        print $log "\t\tRecord Count:\t$linecount\n";
    }
    print $log "===============================\n";
}
close($log);
[/code]

This script saves my staff and me about 4 hours of work every week......

wahoooooooooooooooo


All times are GMT -5.