Perl guru needed
I am a complete newcomer to Perl, and I want to write an efficient redirector script for Squid in Perl. The script downloads the target of each URL, checks the downloaded file, and rewrites the URL if the file contains a virus. (There are two proxies in series, so the delay caused by downloading everything twice is avoided.)
My script is finished and works well, but it is not very efficient in its present form, because it calls an external bash script on each file for JPEG sanity checking.
I want to do this check in Perl too, e.g. as a function of my redirector script.
I found this somewhere, and I think I could use it as a starting point:
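# Read the whole file into $input in binary mode.
my $size = -s $file;
open(my $fh, '<', $file) or die "cannot open $file: $!";
binmode($fh);
sysread($fh, my $input, $size);
close($fh);
# Look for a JPEG comment marker (FF FE) with an invalid length of 0 or 1.
if ($input =~ /\xff\xfe\x00[\x00\x01]/s) {
    # Capture djpeg's diagnostics (stderr) and discard the decoded image.
    # $file is interpolated into a shell command, so it must be a trusted path.
    my @debug = `djpeg -debug $file 2>&1 > /dev/null`;
    if (grep { /Corrupt JPEG data/i } @debug) {
        print "jpeg has trojan\n";
    } else {
        print "no trojan found\n";
    }
}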
My questions are:
1.) As far as I can judge, the above script reads the whole file into the variable $input. I suppose this is not very efficient, as it may require a lot of memory. Could it be modified so that it checks the file as a stream? How?
2.) Could the @debug variable be avoided, and the grep performed directly on the output stream of djpeg? How?
3.) How can I modify the script so that it accepts its input from a pipe rather than processing a file? (This would be extremely useful for JPEGs in a zip archive: I could then pipe unzip's standard output to the input of the JPEG checker function.)
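To make questions 1 to 3 concrete, here is a minimal sketch of the kind of thing being asked for, not a tested solution. It assumes the same 4-byte pattern as the snippet above; the function names stream_has_marker and command_output_matches and the 64 KB chunk size are made up for illustration.

```perl
use strict;
use warnings;

# Scan a binary stream in fixed-size chunks for the suspicious marker
# (FF FE followed by a length of 00 00 or 00 01). Only one chunk plus a
# 3-byte overlap is held in memory at any time.
sub stream_has_marker {
    my ($fh) = @_;
    binmode($fh);
    my $tail = '';                       # last 3 bytes of the previous buffer
    while (1) {
        my $n = read($fh, my $chunk, 65536);
        die "read failed: $!" unless defined $n;
        last if $n == 0;                 # end of stream
        my $buf = $tail . $chunk;
        return 1 if $buf =~ /\xff\xfe\x00[\x00\x01]/;
        # Keep 3 bytes so a marker split across two chunks is still seen.
        $tail = length($buf) > 3 ? substr($buf, -3) : $buf;
    }
    return 0;
}

# Read a command's output line by line and stop at the first match,
# instead of collecting all lines in an array first.
sub command_output_matches {
    my ($command, $regex) = @_;
    open(my $out, '-|', $command) or die "cannot run $command: $!";
    my $found = 0;
    while (my $line = <$out>) {
        if ($line =~ $regex) {
            $found = 1;
            last;
        }
    }
    close($out);
    return $found;
}
```

Because stream_has_marker takes any filehandle, it could be called as stream_has_marker(\*STDIN) when the script sits at the end of a pipe such as unzip -p archive.zip picture.jpg, or with a handle opened on a regular file. The djpeg check would then look like command_output_matches("djpeg -debug $file 2>&1 > /dev/null", qr/Corrupt JPEG data/i), with the same caveat that $file ends up in a shell command.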
Squid starts several instances of the redirector Perl script. Each redirector process should have read/write access to a common URL blacklist/whitelist, to speed things up by not re-checking URLs that have already been checked. In addition, the blacklist/whitelist should be kept in memory, as that is faster.
My questions are:
4.) What is the best way to share this whitelist/blacklist between several instances of the same Perl script?
5.) Can the lists be kept in memory efficiently?
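One possible direction for question 4, given only as a hedged sketch: a small DBM file that every redirector process ties, guarded by a lock file. It uses only modules bundled with Perl (Fcntl, SDBM_File, Digest::MD5). The function names and the idea of storing an MD5 digest of the URL as the key (SDBM limits each key/value pair to roughly 1 KB) are assumptions for illustration, not an established recipe for Squid.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use SDBM_File;
use Digest::MD5 qw(md5_hex);

# Run $code with the shared list tied to a hash, holding an exclusive
# lock for the duration so concurrent redirectors do not corrupt it.
sub with_list {
    my ($path, $code) = @_;
    open(my $lock, '>>', "$path.lock") or die "cannot open lock file: $!";
    flock($lock, LOCK_EX) or die "cannot lock: $!";
    tie(my %db, 'SDBM_File', $path, O_RDWR | O_CREAT, 0644)
        or die "cannot tie $path: $!";
    my $result = $code->(\%db);
    untie %db;
    close($lock);                        # closing the handle releases the lock
    return $result;
}

# Store a verdict (e.g. 'clean' or 'infected') for a URL.
sub remember_verdict {
    my ($path, $url, $verdict) = @_;
    with_list($path, sub { $_[0]{ md5_hex($url) } = $verdict });
}

# Return the stored verdict, or undef if the URL has not been checked yet.
sub lookup_verdict {
    my ($path, $url) = @_;
    return with_list($path, sub { $_[0]{ md5_hex($url) } });
}
```

A redirector would call lookup_verdict($path, $url) first and only download and scan when it returns undef, then call remember_verdict with the result. The list lives in a file rather than in process memory; whether that is fast enough, or whether the file should sit on a memory-backed filesystem, would need measuring. Tying and locking on every call keeps the sketch simple but is not the fastest possible design.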
Thanks in advance for your hints.
Last edited by J_Szucs; 10-09-2004 at 08:21 PM.