Hi All
I have the following structure in a folder containing a very large amount of data:
Code:
Current Folder
    A lot of SubFolders (maybe 2000+)
        A lot more of Other Subfolders (maybe another 2000+) in each single "upper" folder from above
            About 500+ text files per folder from each above folder.
In total you're looking at about 460,000 text files, more or less. I want to extract certain data from each of those files and write it to a single output file, one line per input file.
i.e. my final output.txt file will have about 460,000 entries like:
Blah, blah, blah, blah (from file 1)
Blah, blah, blah, blah (from file 2)
........
Blah, blah, blah, blah (from the last file)
I have the following script, which doesn't work as expected: it gives me a blank output.txt file. I can't trace where or why it's going wrong. I'm pretty new to Perl (this is actually my first script) and would appreciate your help.
Code:
#!/usr/bin/perl -w
use File::Find;
use Digest::MD5 qw(md5 md5_hex md5_base64);

$dir = `pwd`;
chomp($dir);

#globals
@directories = ($dir);
@foundFiles = ();

print("======================\n");
print("searching...\n");

foreach my $d(@directories){
    find( sub { push @foundFiles, $File::Find::name if(/\.txt/) }, @directories );
    print("======================\n");
    print("found " . $#foundFiles . " files\n");
    print("======================\n");

    open(OUT,">output.txt") or die "cant create output file";

    #get files to update:
    foreach my $f (@foundFiles) {
        open(F, $f) or die("WARNING: could not open $f\n");
        $foundDate = 0;
        $foundTime = 0;
        $getInfo = 0;
        $i = 0;
        while($line = <F>) {
            next if($line =~ /^\s*$/);
            chomp($line);
            if($line =~ /\d{4}-\d{2}-\d{2}/) {
                #1992-09-01
                $date = $line;
                $foundDate++;
            } elsif ($line =~ /\d{2}:\d{2}:\d{2}/) {
                #10:59:32
                $time = $line;
                $foundTime++;
            }
            $i++ if($getInfo);
            $getInfo = 1 if($foundTime == 2 && $foundDate == 2);
            if ($i == 3) {
                #ENT
                $code = $line;
            } elsif ($i == 4) {
                #55
                $pages = $line;
            } elsif ($i == 6) {
                #S/Holder Details Change in Substantial (S.43)
                $info = $line;
            }
        }
        #ENT,1992-09-01,10:59:32,55, S/Holder Details Change in Substantial (S.43)
        print OUT "$code|$date|$time|$pages|$info\n" if($code && $code =~ /\d{3}/);
        #system("db insert something something");
        close(F);
    }
}
close(OUT);
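
A possible lead on the blank file, judging only from the comments in the script: the code field is expected to look like ENT, but the final print only fires when $code matches /\d{3}/, i.e. contains three consecutive digits. If the codes really are letters, nothing is ever written. This cannot be confirmed without a sample input file, so the sketch below is an assumption-laden rework rather than a definitive fix. It keeps the same line-counting logic, drops the digit test in favour of "all fields were found", resets every variable per file, anchors the file pattern as /\.txt$/, and skips output.txt itself. The sub names extract_record and write_records are hypothetical, chosen for illustration.

Code:
```perl
#!/usr/bin/perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Find;

# Parse one file with the same line-counting logic as the script above.
# Returns "code|date|time|pages|info", or undef if the fields were not all found.
sub extract_record {
    my ($path) = @_;
    open(my $fh, '<', $path) or do { warn "could not open $path: $!\n"; return; };

    # Declared per file, so values never leak from one file into the next.
    my ($date, $time, $code, $pages, $info);
    my ($found_date, $found_time, $get_info, $i) = (0, 0, 0, 0);

    while (my $line = <$fh>) {
        next if $line =~ /^\s*$/;
        chomp $line;
        if ($line =~ /\d{4}-\d{2}-\d{2}/) {
            $date = $line;
            $found_date++;
        } elsif ($line =~ /\d{2}:\d{2}:\d{2}/) {
            $time = $line;
            $found_time++;
        }
        $i++ if $get_info;
        $get_info = 1 if $found_time == 2 && $found_date == 2;
        if    ($i == 3) { $code  = $line; }
        elsif ($i == 4) { $pages = $line; }
        elsif ($i == 6) { $info  = $line; }
    }
    close($fh);

    # Assumption: a record is complete once the last field ($info) was read.
    return unless defined($code) && defined($info);
    return join('|', $code, $date, $time, $pages, $info);
}

# Walk $dir and write one line per parsable .txt file to $out_path.
# Files are processed as they are found instead of being collected first.
sub write_records {
    my ($dir, $out_path) = @_;
    open(my $out, '>', $out_path) or die "can't create $out_path: $!\n";
    my $out_abs = abs_path($out_path);
    my $count   = 0;

    find({
        no_chdir => 1,    # $_ then holds the full path, same as $File::Find::name
        wanted   => sub {
            return unless -f $_ && /\.txt$/;
            return if abs_path($_) eq $out_abs;    # do not read our own output
            my $record = extract_record($_);
            return unless defined $record;
            print {$out} "$record\n";
            $count++;
        },
    }, $dir);

    close($out) or die "can't close $out_path: $!\n";
    return $count;
}

# Usage: perl extract.pl TOP_DIR   (e.g. "perl extract.pl ." for the current folder)
if (@ARGV) {
    my $n = write_records($ARGV[0], 'output.txt');
    print "wrote $n records to output.txt\n";
}
```

Two smaller points the sketch sidesteps: $#foundFiles is the last index of the array, so it reports one less than the number of files (scalar(@foundFiles) gives the count), and the original loop passes the whole @directories list to find on every iteration rather than the current $d.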