LinuxQuestions.org

scambro 03-26-2012 03:04 PM

Search all tar.gz files in subdirectories for file
 
I have subfolders of 365 tar.gz files from a webserver across the span of a year. I now need to search all of these tar.gz files for the most recent copy of a few php files. I tried this, but I'm not getting the desired results (the command comes back with Not found in archive, when I know it is):

Code:

www_backups$ tar -ztvf */www*.* | grep ".php"
Code:

tar: 20120324/www-20120324.tar.gz: Not found in archive
But if I cd to 20120324 and do the same command, I get all kinds of matches (as expected). What am I missing in this simple line? Thanks!

colucix 03-26-2012 03:34 PM

After the shell filename expansion, your tar command results in something like:
Code:

tar -ztvf 20120323/www-20120323.tar.gz 20120325/www-20120325.tar.gz 20120326/www-20120326.tar.gz
The archive that tar actually reads is the first name, 20120323/www-20120323.tar.gz, because it is the argument of the -f option. The remaining names are ordinary arguments of the tar command, so tar looks for them as members inside that first archive, which is why it reports "Not found in archive". With -x instead of -t, the same mechanism extracts only the named files from a given archive.
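To see that behaviour in isolation, here is a small sketch you can run in a scratch directory. The names demo.tar.gz, index.php, style.css and out are made up for the illustration and have nothing to do with the real backups.

```shell
# Build a throwaway archive to experiment with (all names here are made up).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch index.php style.css
tar -czf demo.tar.gz index.php style.css

# Only the name right after -f is the archive; anything after it is a member name.
tar -tzf demo.tar.gz index.php

# With -x the same mechanism extracts just the named member.
mkdir out
tar -xzf demo.tar.gz -C out index.php
ls out
```

Only index.php should end up in out, since style.css was never named on the command line.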

If you want to look for a file inside multiple archives, better to use a loop, e.g.
Code:

for archive in */www-*.tar.gz
do
  # Quote both: "$archive" survives odd file names, and '*.php' reaches tar unexpanded.
  tar --wildcards -tvf "$archive" '*.php'
done

As you can see, the grep command is not necessary: since we want to look for php files inside the archive, we pass the pattern to tar as an argument. The --wildcards option is necessary to enable pattern matching with *. Hope this helps.
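Since the goal is the most recent copy of a particular file, the loop could also be turned around to walk the archives newest first and stop at the first hit. This is only a sketch: find_latest is a made-up function name, and it assumes the directories are date-stamped (like 20120324) so that a reverse sort puts the newest first, that no path contains whitespace, and that you pass the member path exactly as it is stored in the archive.

```shell
# Print the newest archive that contains the given member path.
# Assumptions: date-stamped directory names, no whitespace in paths,
# and "find_latest" is just an illustrative name.
find_latest() {
  member=$1
  # Reverse lexical sort: 20120326 comes before 20120325, and so on.
  for archive in $(printf '%s\n' */www-*.tar.gz | sort -r)
  do
    # tar exits non-zero when the member is not in the archive.
    if tar -tzf "$archive" "$member" >/dev/null 2>&1
    then
      printf '%s\n' "$archive"
      return 0
    fi
  done
  # No archive contained the member.
  return 1
}
```

Run from www_backups, something like find_latest index.php would print the path of the newest archive holding that file, which you can then hand to tar -xzf together with the same member name.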

scambro 03-26-2012 04:28 PM

aaaaah, ok. I see what you're saying. That's why I was getting a huge list of tar files not found except for the first day, because it was searching in that file, but then also searching in that file for every subdirectory. Makes sense. Thanks for the explanation and assistance!

