Quote:
Originally Posted by shane_kerr
$ find . -type f -print0 | xargs -0l basename | awk -F . '/\./{print $NF}' | sort --unique
What could be simpler? 
I guess from my noobie perspective, the original was a little simpler for two reasons.
- basename: I did not know about this. Thanks for sharing the tool.
- regular expressions: I am way below understanding regular expressions. They come up a lot on the web though, so I often just cut and paste such things when I find a cure for a problem. I hope some may find this one too in the future.
I timed the original discovery against this idea.
Reg-ex version:
Code:
user@system $ time find . -type f -print0 | xargs -0l basename | awk -F . '/\./{print $NF}' | sort --unique
.... file types ....
real 0m0.390s
user 0m0.008s
sys 0m0.028s
The execution delay was noticeable. On a huge directory with many, many files this command could eat resources. (If that matters any more.)
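My guess is that the delay comes from xargs starting a separate basename process for every file. Here is a minimal sketch to compare the two approaches, assuming GNU find, xargs and basename (where basename takes -a for multiple names). The function names and the throwaway directory are made up for illustration, and I use -n 1 to force one file per basename call.

```shell
#!/bin/sh
# Hypothetical helper names; assumes GNU find, xargs, basename and sort.

# One basename process per file (-n 1 passes a single name per call).
ext_per_file() {
    find "$1" -type f -print0 | xargs -0 -n 1 basename | awk -F . '/\./{print $NF}' | sort --unique
}

# One basename process per batch of files (basename -a accepts many names).
ext_batched() {
    find "$1" -type f -print0 | xargs -0 basename -a | awk -F . '/\./{print $NF}' | sort --unique
}

# Throwaway demo tree, removed again at the end.
demo=$(mktemp -d)
touch "$demo/a.php" "$demo/b.css" "$demo/README"
ext_per_file "$demo"
ext_batched "$demo"
rm -rf "$demo"
```

Both functions should list the same extensions. Wrapping each call in time on a large tree would show whether the per-file process start-up is really what costs the time.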
Original:
Code:
user@system $ time find ./ -type f | awk -F . '{print $NF}' | sort --unique | awk -F / '{print $NF}'
.... file types ....
real 0m0.008s
user 0m0.004s
sys 0m0.000s
For comparison I also put the -printf option in the mix.
Code:
user@system $ time find ./ -type f -printf '%f\n' | awk -F . '{print $NF}' | sort --unique
.... file types ....
real 0m0.007s
user 0m0.004s
sys 0m0.000s
It looks like the winner!
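As far as I understand it, -printf '%f\n' (GNU find only) prints just the file name with the leading directories stripped, which is why the extra awk -F / at the end is no longer needed. A small sketch on a throwaway directory; the function names and the file inc/page.php are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper names; -printf is a GNU find extension.

# Default output: the path relative to the starting point.
list_paths() {
    (cd "$1" && find . -type f)
}

# %f: the file name with any leading directories removed.
list_names() {
    (cd "$1" && find . -type f -printf '%f\n')
}

# Throwaway demo tree with one file in a subdirectory.
demo=$(mktemp -d)
mkdir "$demo/inc"
touch "$demo/inc/page.php"
list_paths "$demo"   # ./inc/page.php
list_names "$demo"   # page.php
rm -rf "$demo"
```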
However, the same -printf should work in the reg-ex version too, which can eliminate the pipe to basename. Let's see how it does.
Code:
user@system $ time find . -type f -printf '%f\n' | awk -F . '/\./{print $NF}' | sort --unique
.... file types ....
real 0m0.007s
user 0m0.000s
sys 0m0.004s
Well, after all that, I realized that I had already deleted my typeless files from the directory. When I put them back, two of the improved versions (the ones with the reg-ex) failed to find the typeless files.
Example:
Code:
user@system $ touch README
user@system $ touch inc/SOMEFILE
user@system $ time find . -type f -printf '%f\n' | awk -F . '/\./{print $NF}' | sort --unique
css
gif
htaccess
jpg
js
patch
php
png
sql
swf
txt
real 0m0.007s
user 0m0.000s
sys 0m0.004s
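README and SOMEFILE go missing because of the /\./ pattern in the awk script: it only lets through lines that contain a dot, so a name without one never reaches print. A small sketch with made-up file names fed in by printf instead of find (the function names are mine, not from the thread):

```shell
#!/bin/sh
# Made-up file names standing in for the find output.
names() {
    printf '%s\n' index.php style.css README SOMEFILE
}

# With the /\./ pattern: only names containing a dot are printed.
with_pattern() {
    names | awk -F . '/\./{print $NF}' | sort --unique
}

# Without the pattern: a name with no dot is one single field,
# so $NF is the whole name and it gets printed as well.
without_pattern() {
    names | awk -F . '{print $NF}' | sort --unique
}

echo "with pattern:"
with_pattern
echo "without pattern:"
without_pattern
```

The first list has only css and php. The second also has README and SOMEFILE, though where they land in the sort order depends on the locale.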
The original idea and ntubski's -printf option work. So perhaps the winner is:
Code:
user@system $ time find ./ -type f -printf '%f\n' | awk -F . '{print $NF}' | sort --unique
css
gif
htaccess
jpg
js
patch
php
png
README
SOMEFILE
sql
swf
txt
real 0m0.007s
user 0m0.000s
sys 0m0.004s
It is as fast as the original and finds all the file types, including those with no type.