A folder contains about 290,000 HTML files, all residing in monthly subfolders and following the same naming convention. IDs are numeric and of varying length.
A file idnew.log contains 20,000 IDs, one per line. I need to build a list of the files matching these IDs.
First attempt (runs in 0.44s):
find data/ -type f | grep -F -f <(sed -r 's/(.*)/_\1./' idnew.log)
I use sed to surround each ID with a prefix and suffix, _<id>., to ensure only filenames containing the whole ID are matched.
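To see why the anchoring matters, here is a small sketch (the filenames and the id are hypothetical stand-ins for the real tree): without the _ and . anchors, the id 123 also matches a file whose id is 1234.

```shell
# Hypothetical file list, assuming the naming convention embeds the id as _<id>.
printf '%s\n' 'data/2021-01/report_123.html' 'data/2021-01/report_1234.html' > files.txt

# One id to look up
echo '123' > ids.txt

# Unanchored: 123 is a substring of 1234, so both lines match
grep -cF -f ids.txt files.txt
# Anchored as _123. : only the file with the exact id matches
grep -cF -f <(sed -r 's/(.*)/_\1./' ids.txt) files.txt
```

The first count is 2, the second is 1: the anchored pattern rejects the partial match.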
I think this is pretty fast. I am just wondering whether there is a more elegant way of achieving the result, particularly considering that I also need to return all files in data/ when idnew.log does not exist.
The only solution I can think of:

find data/ -type f >fileall.log
cp fileall.log filesome.log
if [[ -e idnew.log ]] ; then
    grep -F -f <(sed -r 's/(.*)/_\1./' idnew.log) fileall.log >filesome.log
fi

Well, this is pretty ugly!
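One somewhat tidier variant, assuming bash, is to redirect the whole conditional once instead of writing filesome.log twice (the mkdir/touch lines are only demo setup standing in for the real data/ tree and idnew.log):

```shell
# Demo setup (stand-in for the real tree): two files, one matching id
mkdir -p data/2021-01
touch data/2021-01/report_123.html data/2021-01/report_1234.html
echo '123' > idnew.log

find data/ -type f > fileall.log
if [[ -e idnew.log ]]; then
    # Filter by id, anchored as _<id>. so 123 does not match 1234
    grep -F -f <(sed -r 's/(.*)/_\1./' idnew.log) fileall.log
else
    # No id list: keep every file
    cat fileall.log
fi > filesome.log
```

The single redirection after fi means each branch just writes to stdout, and filesome.log is produced exactly once either way.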