tunilopez 11-02-2011 11:58 AM

Small script optimization
Hello everyone,

I have a simple script that searches for folders below the base parameter and prints the quantity of files inside them.

The problem is that some folders contain around 900,000 files, and the process is very slow.


# Print each directory under $base with the number of files at or below it
DIRS=$(find "$base" -type d)
for d in $DIRS
do
        echo "$d        $(find "$d" -type f | wc -l)"
done

I am not an experienced Linux user, so if any of you have ideas to make it run faster, they would be much appreciated.


PTrenholme 11-02-2011 12:54 PM

Here's another approach that might be faster:

[ ! -d "${1}" ] && echo \"${1}\" is not a directory. Aborting. >&2 && exit 1
sudo updatedb -U "${1}" -o /tmp/"${1}.db"
echo "$(locate -Sd /tmp/"${1}.db" | grep files) exist in or below ${1}."
# rm -f /tmp/"${1}.db"

Note 1: This is untested code.
Note 2: The deletion of the temporary database file is commented out because you might find the locate command useful for other reasons and therefore want to keep the file around. Creating the db file in /tmp is, of course, arbitrary. It could be placed anywhere you like, although placing it in the tree you want to count might be counterproductive.

Here's a version that worked for me:

# Show usage when called without arguments or with -h / --help
if [ $# -lt 1 ] || [ "${1,,*}" == "-h" ] || [ "${1,,*}" == "--help" ]
then
  cat <<EOF >&2
$0: Count the number of files in or below a specified directory.

Argument: Root directory
EOF
  exit 1
fi
[ ! -d "${1}" ] && echo \"${1}\" is not a directory. Aborting. >&2 && exit 1
# Build a private locate database rooted at the given directory
tmpfile=$(mktemp /tmp/locdb-XXXXX)
sudo updatedb -U "${1}" -o "${tmpfile}"
echo "$(locate -Sd "${tmpfile}" | grep files) exist in or below ${1}."
#rm -f "${tmpfile}"
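
To sanity-check the figure that locate -S reports, a plain single-pass count can serve as a baseline. This is only a minimal sketch, assuming GNU find (for -printf); count_entries is a hypothetical helper name, not part of the script above. The two numbers may differ, for example if the updatedb configuration prunes some paths or file systems.

```shell
#!/bin/bash
# count_entries (hypothetical helper): count everything at or below a
# directory that is not itself a directory, walking the tree once.
count_entries() {
    # -printf '.' emits one byte per match, so unusual file names
    # (even ones containing newlines) cannot skew the count.
    find "$1" ! -type d -printf '.' | wc -c
}
```

Called as count_entries "$1", it prints a single number that can be compared with the "files" line from locate.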

chrism01 11-02-2011 07:16 PM

If that isn't fast enough, consider Perl. It calls the underlying C libraries directly and runs almost as fast as C (it's compiled on the fly before being run). It should be much quicker than calling shell-level programs.
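
Whatever the language, most of the cost in the original loop comes from running find once per directory, so every file is visited again for each of its ancestor directories. A rough sketch of a single-pass alternative, assuming GNU find (for -printf), awk, and paths without embedded newlines; count_tree is a hypothetical name used only for illustration:

```shell
#!/bin/bash
# count_tree (hypothetical helper): print "<dir><TAB><recursive file count>"
# for every directory at or below the given base, walking the tree only once.
count_tree() {
    local base=${1%/}            # drop a trailing slash so prefixes match
    [ -n "$base" ] || base=/     # the argument was "/" itself
    # One line per directory ("d<TAB>path") and one line per regular file
    # ("f<TAB>parent directory").
    find "$base" -type d -printf 'd\t%p\n' -o -type f -printf 'f\t%h\n' |
    BASE=$base awk '
        {
            type = substr($0, 1, 1)   # "d" or "f"
            p = substr($0, 3)         # the path after the tab
        }
        type == "d" { if (!(p in n)) n[p] = 0; next }   # list empty dirs with 0
        {
            # Credit the file to its directory and to every ancestor up to base.
            while (1) {
                n[p]++
                if (p == ENVIRON["BASE"]) break
                if (!sub(/\/[^\/]*$/, "", p)) break     # no parent left
                if (p == "") p = "/"
            }
        }
        END { for (d in n) printf "%s\t%d\n", d, n[d] }
    ' | sort
}
```

Running count_tree "$base" gives the same kind of per-directory listing as the original loop, but the file system is scanned once and the per-directory totals are accumulated in memory.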
