LinuxQuestions.org

tunilopez 11-02-2011 11:58 AM

Small script optimization
 
Hello everyone,

I have a simple script that finds every folder below the base directory given as its parameter and prints the number of files inside each one.

The problem is that some folders contain around 900,000 files, and the process is very slow.

Code:

#!/bin/bash
base=$1
# Read directory names line by line so paths containing spaces stay intact.
find "$base" -type d | while IFS= read -r d
do
        echo "$d        $(find "$d" -type f | wc -l)"
done

I am not an experienced Linux user, so any ideas for making it run faster would be much appreciated.

Thanks

PTrenholme 11-02-2011 12:54 PM

Here's another approach that might be faster:
Code:

#!/bin/bash
[ ! -d "${1}" ] && echo \"${1}\" is not a directory. Aborting. >&2 && exit 1
sudo updatedb -U "${1}" -o /tmp/"${1}.db"
echo "$(locate -Sd /tmp/"${1}.db" | grep files) exist in or below ${1}."
# rm -f /tmp/"${1}.db"

Note 1: This is untested code.
Note 2: The deletion of the temporary database file is commented out because you might find the locate command useful for other reasons and therefore want to keep the file around. Creating the db file in /tmp is, of course, arbitrary. It could be placed anywhere you want, although placing it in the tree you want to count might be counter-productive.

<edit>
Here's a version that worked for me:
Code:

#!/bin/bash
if [ $# -lt 1 ] || [ "${1,,}" == "-h" ] || [ "${1,,}" == "--help" ]
then
  cat <<EOF >&2
$0: Count the number of files in or below a specified directory.

Argument: Root directory
EOF
  exit
fi
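# Refuse anything that is not an existing directory.
[ ! -d "${1}" ] && echo "\"${1}\" is not a directory. Aborting." >&2 && exit 1
# Build a throwaway locate database rooted at the given directory.
tmpfile=$(mktemp /tmp/locdb-XXXXX)
sudo updatedb -U "${1}" -o "${tmpfile}"
# locate -S prints database statistics; keep only the line with the file count.
echo "$(locate -Sd "${tmpfile}" | grep files) exist in or below ${1}."
#rm -f "${tmpfile}"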
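
For comparison, a total can also be taken straight from find, without sudo or a database file. This is only a sketch: it assumes GNU find (for -printf), it counts regular files only, so its number may differ from what locate reports, and the function name count_files_below is made up for illustration.

```shell
#!/bin/bash
# Sketch: total number of regular files in or below a directory.
# Assumes GNU find for -printf; count_files_below is an illustrative name.
count_files_below() {
    [ -d "$1" ] || { echo "\"$1\" is not a directory." >&2; return 1; }
    # Print one dot per file and count bytes, so odd file names cannot skew the total.
    find "$1" -type f -printf '.' | wc -c
}

# Run only when a directory argument is given.
if [ "$#" -ge 1 ]; then
    echo "$(count_files_below "$1") files exist in or below $1."
fi
```

Counting bytes instead of lines keeps the result correct even if a file name contains a newline.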


chrism01 11-02-2011 07:16 PM

If that isn't fast enough, consider Perl. It calls the underlying C libraries directly and runs almost as fast as C (it's compiled on the fly before being run). It should be much quicker than calling shell-level programs.
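
Before switching languages, it may be worth noting that much of the cost in the original script likely comes from running one find per directory, so every file is visited once for each of its ancestor directories. Below is a rough single-pass sketch that still prints a recursive count per directory. It assumes GNU find (for -printf), an awk on the PATH, path names without tabs or newlines, and a base directory other than /. The function name count_files_per_dir is made up for illustration.

```shell
#!/bin/bash
# Sketch: one find pass, recursive file counts per directory.
# Assumes GNU find (-printf) and no tabs or newlines in path names.
count_files_per_dir() {
    local base=${1%/}    # drop one trailing slash so paths compare cleanly
    [ -d "$base" ] || { echo "\"$1\" is not a directory." >&2; return 1; }

    # Emit "d<TAB>dir" for each directory and "f<TAB>parent-dir" for each regular file.
    find "$base" \( -type d -printf 'd\t%p\n' \) -o \( -type f -printf 'f\t%h\n' \) |
    awk -F'\t' '
        NR == 1   { base = $2 }    # find lists the start directory first
        $1 == "d" { if (!($2 in n)) n[$2] = 0; next }    # empty directories print as 0
        {
            # Credit the file to its directory and to every ancestor up to base.
            dir = $2
            while (1) {
                n[dir]++
                if (dir == base) break
                if (!sub("/[^/]*$", "", dir)) break    # no parent left to strip
            }
        }
        END { for (d in n) printf "%s\t%d\n", d, n[d] }
    ' | LC_ALL=C sort
}

# Run only when a directory argument is given.
if [ "$#" -ge 1 ]; then
    count_files_per_dir "$1"
fi
```

Each line of output is a directory, a tab, and the number of regular files in or below it. The tree is walked once, and the per-ancestor bookkeeping happens in memory inside awk.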


All times are GMT -5.