LinuxQuestions.org

Kebman 12-20-2010 04:44 PM

Perl: Counting Files With File::Find
 
So this is my code:
Code:

#!/usr/bin/perl
# Count every occurrence of a certain type of file (*.old) recursively
use strict;
use File::Find;

my $dir = ".";
my $count;

find( sub {if ("$/"!=(/\.old$/)){$count++}},$dir);

print "$count\n";

Modification of code I found here. It works, but I don't really know why.

Q1: Why is each filter hit counted only when the conditional is not true?

Q2: I've tried putting the file type (.old) into a variable for better usability, but then the script fails.

Sergei Steshenko 12-20-2010 06:42 PM

Quote:

Originally Posted by Kebman (Post 4198186)
So this is my code:
Code:

#!/usr/bin/perl
# Count every occurrence of a certain type of file (*.old) recursively
use strict;
use File::Find;

my $dir = ".";
my $count;

find( sub {if ("$/"!=(/\.old$/)){$count++}},$dir);

print "$count\n";

Modification of code I found here. It works, but I don't really know why.

Q1: Why is each filter hit counted only when the conditional is not true?

Q2: I've tried putting the file type (.old) into a variable for better usability, but then the script fails.

First, add

Code:

use warnings;
just before or after

Code:

use strict;
and fix the warnings if any.

Secondly, post the full non-working code here.

Sergei Steshenko 12-20-2010 06:49 PM

And looking at the code linked by the OP (http://forums.devshed.com/showthread...24#post1473024), I do not see any use of Perl's built-in $/ variable; I do not understand what it has to do with the task.

Telemachos 12-21-2010 10:11 AM

Quote:

Originally Posted by Sergei Steshenko (Post 4198291)
And looking at the code linked by the OP (http://forums.devshed.com/showthread...24#post1473024), I do not see any use of Perl's built-in $/ variable; I do not understand what it has to do with the task.

The code on Devshed uses $/ in place of \n in its call to print:
Code:

print "$File::Find::name$/"
I think the OP has misunderstood the original code, and his use of $/ doesn't make any sense at all.

Telemachos 12-21-2010 10:22 AM

Quote:

Originally Posted by Kebman (Post 4198186)
So this is my code:
Code:

#!/usr/bin/perl
# Count every occurrence of a certain type of file (*.old) recursively
use strict;
use File::Find;

my $dir = ".";
my $count;

find( sub {if ("$/"!=(/\.old$/)){$count++}},$dir);

print "$count\n";

Modification of code I found here. It works, but I don't really know why.

Q1: Why is each filter hit counted only when the conditional is not true?

Q2: I've tried putting the file type (.old) into a variable for better usability, but then the script fails.

I think you've tried rewriting someone else's Perl code without really knowing Perl. $/ is a builtin Perl variable for the default input record separator (it's a newline \n by default). So it makes no sense to test that against a regular expression looking for '.old' at the end of a filename. (See perldoc perlvar for more information on Perl's many builtin variables.)
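
To make that concrete, here is a minimal sketch of how Perl evaluates the original condition. It assumes $/ still holds its default value (a newline), and the two filenames are made up for illustration. The 'numeric' warnings are switched off only around the odd comparison so the output stays readable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical filenames, just for illustration.
my @names = ('notes.old', 'notes.txt');

my @results;
for (@names) {
    # A plain match against $_: true for names ending in .old.
    my $matched = /\.old$/ ? 1 : 0;

    # "$/" is "\n" by default, which the numeric != operator treats as 0.
    # The match yields 1 on success and a false value (0) on failure,
    # so the test is effectively 0 != 1 (true) or 0 != 0 (false).
    my $odd_test;
    {
        no warnings 'numeric';
        $odd_test = ("$/" != (/\.old$/)) ? 1 : 0;
    }

    push @results, sprintf('%s: match=%d odd_test=%d', $_, $matched, $odd_test);
}

print "$_\n" for @results;
```

Both columns agree for every name, which suggests the original condition is true exactly when the regular expression matches. In other words it behaves like a plain if (/\.old$/), and the != against $/ only obscures that.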

If you are familiar with Perl and just don't quite get File::Find - which does have an odd API at first - you might check out this article on it: http://www.stonehenge.com/merlyn/LinuxMag/col45.html

Sergei Steshenko 12-21-2010 11:10 AM

Quote:

Originally Posted by Telemachos (Post 4199002)
The code on Devshed uses $/ in place of \n in its call to print:
Code:

print "$File::Find::name$/"
I think the OP has misunderstood the original code, and his use of $/ doesn't make any sense at all.

I mean the devshed code does not use $/ in 'if' condition/regular expressions.
...
FWIW, if one needs to count files only, he/she should make sure that directories are not counted.
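
A minimal sketch of that check, assuming the goal is to count only plain files whose names end in .old. The -f file test skips directories. The temporary directory and the names in it are invented so the example runs anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a throwaway tree: one matching file, and one directory whose
# name also ends in .old.
my $dir = tempdir(CLEANUP => 1);
mkdir "$dir/backup.old" or die "mkdir failed: $!";
open my $fh, '>', "$dir/a.old" or die "open failed: $!";
close $fh;

my $count = 0;

# Inside the callback, $_ is the bare name of the current item and
# File::Find has changed into the directory that contains it,
# so -f $_ tests the right entry.
find(sub { $count++ if -f $_ && /\.old$/ }, $dir);

print "$count\n";
```

Without the -f test the directory backup.old would be counted as well.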

Kebman 12-23-2010 09:15 AM

First of all, use warnings does not answer my questions.

Second of all, I'd still love to have the questions answered. :)

Thirdly, thank you very much Telemachos, for giving me a heads up on the $/. I'm now in the process of reading the article you recommended.

Sergei Steshenko 12-23-2010 11:40 AM

Quote:

Originally Posted by Kebman (Post 4201147)
First of all, use warnings does not answer my questions.

Second of all, I'd still love to have the questions answered. :)

Thirdly, thank you very much Telemachos, for giving me a heads up on the $/. I'm now in the process of reading the article you recommended.

The official documentation (http://perldoc.perl.org/File/Find.html) was good enough for me to be able to use the function.

If you do not understand the documentation, ask specific questions about the things you do not understand.

"use warnings;" is a must: the Perl runtime will tell you about things that might be wrong (if any). If you do not use "use warnings;", you are assuming your code has no runtime problems, which is not a good assumption.

You shouldn't be even asking questions unless you have

Code:

use strict;
use warnings;

in place and everything is clean.

Telemachos 12-24-2010 10:31 PM

In a nutshell, when you call the find function, you pass it two parameters: first a reference to a subroutine (usually) and second a directory location.

The File::Find module starts from the directory location you specify and travels downward recursively (that is into every sub-dir and sub-dir of that sub-dir etc.). The module then calls the code in the subroutine on every item (file or directory) that is found in that downward spiral.

So here's a case somewhat like yours: let's imagine that I want to find all the files ending in '.rb' in the current directory ('.') or any subdirectory of the current directory. I could do this:

Code:

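#!/usr/bin/env perl
use strict;
use warnings;
use File::Find;

my $dir = '.';
my $count = 0;    # start at 0 so print never sees an undefined value

# Count every item whose full path ends in '.rb'
find(sub { $count++ if $File::Find::name =~ /\.rb$/ }, $dir);

print $count, "\n";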

The File::Find module sets $File::Find::name as a special package variable: it contains the full path (directory plus name) of each item seen as File::Find works. The subroutine above increases $count (that's the effect of $count++) if the item has '.rb' at the end of its name. (That's the effect of the if and the regular expression.)
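
Here is a short sketch of the variables available inside the callback. Besides $File::Find::name, the module also sets $_ (the bare name of the current item) and $File::Find::dir (the directory being visited), as described in the File::Find documentation. The temporary tree and the file name app.rb are invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Throwaway tree with a single Ruby file in a subdirectory.
my $top = tempdir(CLEANUP => 1);
mkdir "$top/lib" or die "mkdir failed: $!";
open my $fh, '>', "$top/lib/app.rb" or die "open failed: $!";
close $fh;

my @seen;
find(sub {
    return unless /\.rb$/;          # $_ is just the name, e.g. app.rb
    push @seen, {
        name => $_,                 # bare file name
        dir  => $File::Find::dir,   # directory currently being visited
        path => $File::Find::name,  # directory and name joined
    };
}, $top);

printf "%s lives in %s\n", $_->{name}, $_->{dir} for @seen;
```

Because $_ holds only the name, a pattern such as /\.rb$/ can be matched against it directly, without spelling out $File::Find::name.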

If the subroutine is simple like this, you can include it inline in the call to find, but if it's more complicated, you can store it separately and pass a reference to the subroutine. For example:

Code:

find(\&find_rubies, $dir);

sub find_rubies {
    # imagine this were long and complicated
    $count++ if $File::Find::name =~ /\.rb$/
}

or

Code:

my $find_rubies = sub {
    # imagine this were long and complicated
    $count++ if $File::Find::name =~ /\.rb$/
};

find($find_rubies, $dir);

Those two are effectively the same thing in slightly different ways. I prefer the second style (storing an anonymous subroutine as a reference), but tastes vary.
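
The same reference style also makes it easy to keep the extension in a variable, which is what the second question asks about. This is only a sketch under a few assumptions: count_by_ext is a made-up helper name, the test files are invented, and \Q...\E is used so that the dot in the extension is matched literally rather than as "any character".

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: count plain files under $dir whose names end in $ext.
sub count_by_ext {
    my ($dir, $ext) = @_;
    my $count = 0;

    # \Q...\E quotes regex metacharacters, so '.old' does not match 'xold'.
    my $re = qr/\Q$ext\E\z/;

    # The anonymous sub closes over $count and $re.
    my $wanted = sub { $count++ if -f $_ && $_ =~ $re };
    find($wanted, $dir);
    return $count;
}

# Throwaway test tree so the sketch runs anywhere.
my $dir = tempdir(CLEANUP => 1);
for my $name ('a.old', 'b.old', 'c.txt', 'xold') {
    open my $fh, '>', "$dir/$name" or die "open failed: $!";
    close $fh;
}

my $n = count_by_ext($dir, '.old');
print "$n\n";
```

If the extension is interpolated without \Q...\E, the leading dot acts as a wildcard, so a file named xold would also be counted.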

Sergei Steshenko 12-24-2010 11:54 PM

Quote:

Originally Posted by Telemachos (Post 4202491)
In a nutshell, when you call the find method, you pass it two parameters: first a reference to a subroutine (usually) and second a directory location.

...

All this is written in the official documentation.

Telemachos 12-25-2010 02:49 PM

Quote:

Originally Posted by Sergei Steshenko (Post 4202536)
All this is written in the official documentation.

Yes, I'm well aware of that. We've had this conversation before, but the documentation can sometimes be overwhelming for new programmers.

What I wrote is all in the docs, but I didn't include everything in the docs. Sometimes it helps to start with a short, somewhat simplified version.

Sergei Steshenko 12-25-2010 09:55 PM

Quote:

Originally Posted by Telemachos (Post 4202934)
... the documentation can sometimes be overwhelming for new programmers. ...

English comprehension skills are taken care of in places other than this forum.

...

It looks to me as if the OP's problem is not specifically the File::Find module, but much more basic things in Perl, or even in programming in general, like the issue of types and comparing apples to oranges.

Kebman 01-04-2011 08:12 PM

Please stop trolling, Sergei. Telemachos is doing a great job explaining. Thanks T! :)

Here's what use warnings; returns:
Code:

Argument "" isn't numeric in numeric ne (!=) at countfilesoftype.pl line 9.
I don't understand what that means.

Again, the (quite specific) questions are:

Q1: Why is each filter hit counted only when the conditional is not true?

You would think it would count only when the conditional was true. (I've tested the script, and it always returns the correct number of files.)

Q2: I've tried putting the file type (.old) into a variable for better usability, but then the script fails.

I gather this is an issue with how regular expressions are used, and I have a hunch it may have something to do with line breaks or something along those lines, but I just don't know for sure, or how to fix it. I will be very grateful for help with this. :)

Sergei Steshenko 01-04-2011 08:40 PM

Quote:

Originally Posted by Kebman (Post 4213417)
Please stop trolling, Sergei. Telemachos is doing a great job explaining. Thanks T! :)

Here's what use warnings; returns:
Code:

Argument "" isn't numeric in numeric ne (!=) at countfilesoftype.pl line 9.
I don't understand what that means.

Again, the (quite specific) questions are:

Q1: Why is each filter hit counted only when the conditional is not true?

You would think it would count only when the conditional was true. (I've tested the script, and it always returns the correct number of files.)

Q2: I've tried putting the file type (.old) into a variable for better usability, but then the script fails.

I gather this is an issue with how regular expressions are used, and I have a hunch it may have something to do with line breaks or something along those lines, but I just don't know for sure, or how to fix it. I will be very grateful for help with this. :)


So, what exactly don't you understand in the item in bold? It's a sentence in English. The warning illustrates your fundamental problem with respect to types, i.e. what types (in programming) are for, which types are compatible and which are not, etc. Or apples vs. oranges.

All the rest of your questions are secondary, because the warning tells you that a particular line of your code is senseless, and the senseless line is the main one, in the sense that it is supposed to produce the correct hit.

And, by the way, are you familiar with the concept of web search? I.e. have you tried entering

Argument "" isn't numeric in numeric ne (!=)

into a web search engine? The third match from Yahoo appears to be relevant.
...
Are you familiar with the RTFM concept? How about 'man perl' for starters? In my case it produces, among other things:

Code:

    67            perldiag            Perl diagnostic messages
    68            perllexwarn        Perl warnings and their control

and 'perldoc perldiag' does have an explanation of the warning.
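
As a small illustration of what that perldiag entry describes: != compares numbers, so a non-numeric string operand triggers the "isn't numeric" warning, while ne compares strings. The sketch below collects the warnings with a local $SIG{__WARN__} handler instead of letting them go to STDERR. The variable names and the '.bak' value are arbitrary.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @warnings;
my $ext = '.old';
my ($numeric_differs, $string_differs);

{
    # Collect warnings instead of printing them to STDERR.
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    # Numeric comparison: both strings are treated as 0, so they look equal.
    $numeric_differs = ($ext != '.bak') ? 1 : 0;

    # String comparison: compares the text itself and raises no warning.
    $string_differs  = ($ext ne '.bak') ? 1 : 0;
}

print "numeric: $numeric_differs, string: $string_differs\n";
print "warning: $_" for @warnings;
```

The numeric test reports the two different strings as equal, while the string test correctly reports them as different.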

Kebman 02-03-2011 08:30 PM

Sergei, When people get pissed off by your posts or the way you act, do you really think your advice is wanted? If you stopped being so rude and condescending, I might choose to listen to you. Heck, if you just bothered to answer the questions plain and simple, instead of complaining about my mental abilities, when it's obviously you who's lacking in mental abilities for explaining things in an understandable and polite fashion, then I might listen to you. However as it is, I choose to ignore you. :)

Sergei Steshenko 02-04-2011 04:32 AM

Quote:

Originally Posted by Kebman (Post 4247770)
Sergei, When people get pissed off by your posts or the way you act, do you really think your advice is wanted? If you stopped being so rude and condescending, I might choose to listen to you. Heck, if you just bothered to answer the questions plain and simple, instead of complaining about my mental abilities, when it's obviously you who's lacking in mental abilities for explaining things in an understandable and polite fashion, then I might listen to you. However as it is, I choose to ignore you. :)

If only you had bothered to read the documentation at hand first. That is, you are doing the following:
  1. effectively declaring that you won't make even a minimal effort to read and learn;
  2. instead asking those who did bother to read and learn.

Alas, I had a very unpleasant experience of working with people like you - a big pain from neck to lower back.

Regarding mental abilities: do you understand RTFM, i.e. "Read The Fine Manual"? If not, programming is not for you; if yes, why haven't you read it?


All times are GMT -5.