[Perl] Search for multiple files in multipe directories
ProgrammingThis forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.
Notices
Welcome to LinuxQuestions.org, a friendly and active Linux Community.
You are currently viewing LQ as a guest. By joining our community you will have the ability to post topics, receive our newsletter, use the advanced search, subscribe to threads and access many other special features. Registration is quick, simple and absolutely free. Join our community today!
Note that registered members see fewer ads, and ContentLink is completely disabled once you log in.
If you have any problems with the registration process or your account login, please contact us. If you need to reset your password, click here.
Having a problem logging in? Please visit this page to clear all LQ-related cookies.
Get a virtual cloud desktop with the Linux distro that you want in less than five minutes with Shells! With over 10 pre-installed distros to choose from, the worry-free installation life is here! Whether you are a digital nomad or just looking for flexibility, Shells can put your Linux machine on the device that you want to use.
Exclusive for LQ members, get up to 45% off per month. Click here for more info.
[Perl] Search for multiple files in multipe directories
Hi all!
Still I'm struggling with a script to remove a lot of log file.
In a previous post I asked for the wanted function to search for files or directories.
Now I'm stuck on the next:
On a server there are a lot of log files is directories.
In (korn)shell the search is simple:
find . -name /dirname1/dirname2/*/dir/*log
I'm now looking for a solution to do the same in perl.
The code below should be driven by an Oracle datebase where each line in @rmlines is filled with the follow options:
server/path where files are:search string:days keep files:type file or directory
If i fill an wildcard in the server/path part, the search doesn't work. Does any one have a suggestion?
Code:
use strict;
use warnings;
use English;
use File::Copy;
use File::Find ();
use Env;
use Cwd ();
use vars qw/*name *dir *prune/;
*name = *File::Find::name;
*dir = *File::Find::dir;
*prune = *File::Find::prune;
sub wanted;
my $directory;
my $filename;
my $saveday;
my $filetype;
sub wanted ($@) {
my ($dev,$ino,$mode,$nlink,$uid,$gid);
my $cwd= Cwd::cwd();
/$filename/s &&
(($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &&
( ( $filetype eq 'f' ) ? -f _ : -d _ ) &&
(int(-M _) > $saveday) &&
do_action( "${cwd}/$_",$filetype);
}
ub remove_dir {
my $dir = shift;
local *DIR;
opendir DIR, $dir or die "opendir $dir: $!";
for (readdir DIR) {
next if /^\.{1,2}$/;
my $path = "$dir/$_";
print "Stap 4 f verwijderen file : $_\n";
# unlink $path or print "Error removing file $path uit dir - $!\n" if -f $path ;
remove_dir($path) if -d $path;
}
closedir DIR;
print "Stap 4 d verwijderen directory : $dir\n";
#rmdir $dir or print "Error removing directory $dir - $!\n" if -d $dir ;
}
my @rmlines = ( '//server-x/tmp:^.*\.log|LOG|Log\z:30:f:',
'//server-y/d$:^Oracle/admin/\.*/adump\*.*:30:f'
) ;
foreach (@rmlines)
s/#.*//; # ignore comments by erasing them
next if /^(\s)*$/; # skip blank lines
($directory, $filename, $saveday, $filetype)=split(":");
if ( ! -d $directory ) {
print "Warning: $directory with $filename not found and cleaned\n";
} else {
File::Find::find({wanted => \&wanted }, $directory);
}
};
Still I'm struggling with a script to remove a lot of log file.
In a previous post I asked for the wanted function to search for files or directories.
Now I'm stuck on the next:
On a server there are a lot of log files is directories.
In (korn)shell the search is simple:
find . -name /dirname1/dirname2/*/dir/*log
I'm now looking for a solution to do the same in perl.
The code below should be driven by an Oracle datebase where each line in @rmlines is filled with the follow options:
server/path where files are:search string:days keep files:type file or directory
If i fill an wildcard in the server/path part, the search doesn't work. Does any one have a suggestion?
Code:
use strict;
use warnings;
use English;
use File::Copy;
use File::Find ();
use Env;
use Cwd ();
use vars qw/*name *dir *prune/;
*name = *File::Find::name;
*dir = *File::Find::dir;
*prune = *File::Find::prune;
sub wanted;
my $directory;
my $filename;
my $saveday;
my $filetype;
sub wanted ($@) {
my ($dev,$ino,$mode,$nlink,$uid,$gid);
my $cwd= Cwd::cwd();
/$filename/s &&
(($dev,$ino,$mode,$nlink,$uid,$gid) = lstat($_)) &&
( ( $filetype eq 'f' ) ? -f _ : -d _ ) &&
(int(-M _) > $saveday) &&
do_action( "${cwd}/$_",$filetype);
}
ub remove_dir {
my $dir = shift;
local *DIR;
opendir DIR, $dir or die "opendir $dir: $!";
for (readdir DIR) {
next if /^\.{1,2}$/;
my $path = "$dir/$_";
print "Stap 4 f verwijderen file : $_\n";
# unlink $path or print "Error removing file $path uit dir - $!\n" if -f $path ;
remove_dir($path) if -d $path;
}
closedir DIR;
print "Stap 4 d verwijderen directory : $dir\n";
#rmdir $dir or print "Error removing directory $dir - $!\n" if -d $dir ;
}
my @rmlines = ( '//server-x/tmp:^.*\.log|LOG|Log\z:30:f:',
'//server-y/d$:^Oracle/admin/\.*/adump\*.*:30:f'
) ;
foreach (@rmlines)
s/#.*//; # ignore comments by erasing them
next if /^(\s)*$/; # skip blank lines
($directory, $filename, $saveday, $filetype)=split(":");
if ( ! -d $directory ) {
print "Warning: $directory with $filename not found and cleaned\n";
} else {
File::Find::find({wanted => \&wanted }, $directory);
}
};
Why do you have
Code:
use vars qw/*name *dir *prune/;
*name = *File::Find::name;
*dir = *File::Find::dir;
*prune = *File::Find::prune;
You have a number '_' (not $_) in your code - why ?
What did you do to debug ? I.e. what is the earliest place in your code in which actual behavior differs from actual one ?
Which exactly line with wildcards doesn't work and why (based on what documentation) do you expect it to work ?
What does '//' in
Code:
my @rmlines = ( '//server-x/tmp:^.*\.log|LOG|Log\z:30:f:',
'//server-y/d$:^Oracle/admin/\.*/adump\*.*:30:f'
) ;
use vars qw/*name *dir *prune/;
*name = *File::Find::name;
*dir = *File::Find::dir;
*prune = *File::Find::prune;
You have a number '_' (not $_) in your code - why ?
The main part of the script is coming from the command:
find2perl
where you can put the find command mentioned above.
Quote:
What did you do to debug ? I.e. what is the earliest place in your code in which actual behavior differs from actual one ?
Which exactly line with wildcards doesn't work and why (based on what documentation) do you expect it to work ?
I've cleaned up the script by removing all the print statements.
the frase where I'm looking for like
//server-x/tmp/*/otherdir/*.log
the first wildcard is the problem to solve. In the search it doesn't expand at all
Quote:
What does '//' in
Code:
my @rmlines = ( '//server-x/tmp:^.*\.log|LOG|Log\z:30:f:',
'//server-y/d$:^Oracle/admin/\.*/adump\*.*:30:f'
) ;
mean ?
The script has to run on Windows server using the unc path to the differend shares.
The @rmline contains 4 parts:
server and path
search files
days to keep the file
choice to remove files or entire directory
The main part of the script is coming from the command:
find2perl
where you can put the find command mentioned above.
I've cleaned up the script by removing all the print statements.
the frase where I'm looking for like
//server-x/tmp/*/otherdir/*.log
the first wildcard is the problem to solve. In the search it doesn't expand at all
The script has to run on Windows server using the unc path to the differend shares.
The @rmline contains 4 parts:
server and path
search files
days to keep the file
choice to remove files or entire directory
Well, if you are under Windows, how can you be sure your file paths contain '/' and not '\' ? At all. you can write portably, read
perldoc File::Spec
.
Regarding the wildcard. I guess you are want your regular expression to match. So, have you taken an example path which is supposed to match, and your regular expression, and checked that the expression really works ?
I mean, in such cases I do not trust myself and if I want to check that, say,
/foo/bar/doo.txt
matches
|/foo/bar/*\.txt$|
I am writing a junk script more or less like this:
Code:
my $line = '/foo/bar/doo.txt';
my $re = '/foo/bar/*\.txt$';
$line =~ m|$re| and warn "\$line=$line matches \$re=$re";
My point is that if you suspect the regular expression doesn't work, isolate the case and don't overload yourself and us with unrelated code.
#! C:\Perl\bin\perl.exe -w
use strict;
use File::Find ();
# for the convenience of &wanted calls, including -eval statements:
use vars qw/*name *dir *prune/;
*name = *File::Find::name;
*dir = *File::Find::dir;
*prune = *File::Find::prune;
#my $searchdir='c:/tmp/logdir1/dir2'; # this is working!
my $searchdir="c:/tmp/*/dir2"; # like this is working too!
my $searchfile='^.*\.log\z';
sub wanted;
# Traverse desired filesystems
print "Zoekdir = $searchdir \n";
File::Find::find({wanted => \&wanted}, "$searchdir");
exit;
sub wanted {
my ($dev,$ino,$mode,$nlink,$uid,$gid);
/$searchfile/s &&
print("$name\n");
}
The base is created by find2perl:
find2perl c:/tmp/*/log -name *.log
The problem appears on the working out of the var $searchdir. It doesn't work out to the 3 separate dirs/log.
#! C:\Perl\bin\perl.exe -w
use strict;
use File::Find ();
# for the convenience of &wanted calls, including -eval statements:
use vars qw/*name *dir *prune/;
*name = *File::Find::name;
*dir = *File::Find::dir;
*prune = *File::Find::prune;
#my $searchdir='c:/tmp/logdir1/dir2'; # this is working!
my $searchdir="c:/tmp/*/dir2"; # like this is working too!
my $searchfile='^.*\.log\z';
sub wanted;
# Traverse desired filesystems
print "Zoekdir = $searchdir \n";
File::Find::find({wanted => \&wanted}, "$searchdir");
exit;
sub wanted {
my ($dev,$ino,$mode,$nlink,$uid,$gid);
/$searchfile/s &&
print("$name\n");
}
The base is created by find2perl:
find2perl c:/tmp/*/log -name *.log
The problem appears on the working out of the var $searchdir. It doesn't work out to the 3 separate dirs/log.
Regarding
Quote:
for the convenience of &wanted calls, including -eval statements
- it's really a bad idea to mess up global name space.
You can use a scalar reference, e.g.:
Code:
my $name_ref = \$File::Find::name;
...
my $name = $$name_ref; # dereferencing
Regarding
Quote:
Quote:
perl is able to handle the forward / as a backward \ running in a Windows environment
- of course it can, but the question is how path separator is reported by File::Find module.
Thanks for the remarks!
I'm rearly new to Perl and I will create nice scripts.
Quote:
Now, what doesn't work ? I.e. now your code performs just file search, so does it find the files you expect it to find ?
The script will find the files mentioned only as the complete path is given. The perpose is to make the path variable as well, eg search in several path with one statement, using a wildcard in the pathname.
Thanks for the remarks!
I'm rearly new to Perl and I will create nice scripts.
The script will find the files mentioned only as the complete path is given. The perpose is to make the path variable as well, eg search in several path with one statement, using a wildcard in the pathname.
Tanks in advance.
I don't think you can use wildcard in file name in UNIX/Windows sense, but you should rather use a regular expression.
I think you are overcomplicating your development. You have to accept the stages approach (regardless of language). I mean for your case:
1) find all the files under a given directory root and put them into an array;
2) run a loop over the array using the needed in your opinion regular expression checking whether it works as expected. The word "checking" means s comparing the array element to be matched and the regular expression and analyzing why it doesn't work as expected.
LinuxQuestions.org is looking for people interested in writing
Editorials, Articles, Reviews, and more. If you'd like to contribute
content, let us know.