[SOLVED] Search for the file name inside the same file
I want to go through a directory recursively and search each file for the name of that specific file, i.e. the same file name, and record the matches and non-matches in another file. Is there a smart way to do it without opening each file manually and verifying? Please help.
koshy
Are you looking for repeated occurrences of the same filename in different subdirectories, or for the occurrence of a specific filename as part of the text content of other files?
If the latter, something like
Code:
~ $ mkdir fred
~ $ cd fred
~/fred $ touch a.txt b.txt c.txt
~/fred $ x="
> "
~/fred $ echo "a.txt$x"b.txt$x > d.txt
~/fred $ echo "c.txt$x"b.txt$x > e.txt
~/fred $ z=(*)
~/fred $ for ((i=0; i<${#z[@]}; i++)); do grep -H a.txt "${z[$i]}"; done
d.txt:a.txt
~/fred $
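If it really is one fixed name across a whole tree, grep can do the recursion by itself. A rough sketch, assuming GNU grep (for -r and -L); the function name record_matches, its three arguments and the output file names are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: list the files under DIR that do / do not contain NAME.
# Assumes GNU grep (-r for recursion, -L for "files without a match").
# Usage: record_matches NAME DIR OUTDIR   (OUTDIR must already exist)
record_matches() {
    name=$1 dir=$2 out=$3
    # -F: treat NAME as a fixed string, so the dot in "a.txt" is literal
    # -l: print only the names of files that contain a match
    grep -rlF -- "$name" "$dir" | sort > "$out/matched.txt"
    # -L: print only the names of files that contain no match
    grep -rLF -- "$name" "$dir" | sort > "$out/unmatched.txt"
}
```

Called as record_matches a.txt ~/fred /tmp/out, it should leave the two lists in /tmp/out. Keeping the output directory outside the searched tree stops grep from scanning its own results.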
I'm not really sure what you're asking, could you clarify?
You have a directory, and buried in subdirectories of this directory you have a bunch of text files. Now are you trying to search for a SINGLE file name in the contents of all of these text files, or are you trying to find which text files contain their OWN name in the contents?
Either way, you then want to create a new file with a list of which of those text files contained the name you were looking for and which didn't? How do you want this new file formatted?
I'm not sure I understand the question, but I'll give it a shot.
If you want to check each file in a folder hierarchy to see if it contains its own name, do this:
find /path -type f -exec grep -q {} {} \; -fprint matched.txt -o -type f -print >unmatched.txt
The find command looks recursively at everything under the starting path you give it. If you start at the root, you might get screwed if you have loops in your filesystem (for example in /sys or /proc, or if you abuse shootsnap.sh) so be careful to choose a sane starting path.
The remaining switches and options to find are processed left to right with implicit AND operators. Each one is evaluated for success or failure sequentially.
The -type f switch fails for links, directories, devices, etc. and succeeds for regular plain-jane files.
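The -exec spawns a grep in quiet mode (-q), which prints nothing and stops at the first match, so it is a quick way to test whether a file contains a pattern. Note that grep treats the pattern as a regular expression unless you add -F, so a dot in a name matches any character. Grep returns failure if the string is not found or an error occurs, otherwise it returns success.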
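The name of the file currently being looked at will replace each set of paired curly braces, and the backslash-semicolon ends the grep command we told -exec to use. Keep in mind that find substitutes the path as it sees it, leading directories included (for example /path/sub/file.xml), so the pattern grep looks for is that whole path rather than the bare file name.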
The -fprint prints the name of the file currently being worked with, if the current status is success, into an output file. (GNU find creates the output file if it does not exist and truncates it if it does, so there is no need to delete matched.txt before you start.)
The -o stands for OR (remember how everything before this is considered to be joined by an AND?) so whatever follows it is only evaluated when the chain before it has failed. This is cool because it means the grep failing to find the string is going to trigger it, but a failing -type f will also trigger it when you're recursing through directories or links, so we need to do the -type f again if we only want regular files.
The -print prints the name of the file currently being worked with, if the current status is success (which it will be, if it's a regular file and the pattern wasn't matched) and we redirect the output to a file using normal shell I/O redirection.
This scales extremely well, but it handles loony file names poorly, so you should read the find and grep man pages if you have lunatics naming your files.
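For the odd file names, and for matching on the bare file name instead of the whole path, something along these lines might do. It is only a sketch, not a drop-in replacement: the function name self_name_check, its two arguments and the output file names are made up for illustration, and it assumes a find that supports -exec ... {} + (POSIX does).

```shell
#!/bin/sh
# Hypothetical helper: for every regular file under DIR, test whether the
# file contains its own base name, and append its path to one of two lists.
# Usage: self_name_check DIR OUTDIR   (OUTDIR must already exist)
self_name_check() {
    dir=$1 out=$2
    : > "$out/matched.txt"      # start with empty output files
    : > "$out/unmatched.txt"
    # find hands batches of paths to an inline sh script; quoting "$@"
    # keeps names that contain spaces in one piece.
    find "$dir" -type f -exec sh -c '
        out=$1; shift
        for f in "$@"; do
            # ${f##*/} strips the leading directories from the path
            # -F: fixed string, -q: quiet, --: end of options
            if grep -qF -- "${f##*/}" "$f"; then
                printf "%s\n" "$f" >> "$out/matched.txt"
            else
                printf "%s\n" "$f" >> "$out/unmatched.txt"
            fi
        done
    ' sh "$out" {} +
}
```

Called as self_name_check /path /tmp/out, it should leave both lists in /tmp/out. Files that grep cannot read end up in unmatched.txt, names containing newlines would still garble the one-path-per-line lists, and the output directory is best kept outside the tree being searched.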
Last edited by Medievalist; 02-28-2013 at 12:39 PM.
Reason: missed a step in explanation
Dear suicidaleggroll
I had forgotten to mention that I want to look for the name of the file being checked inside the file itself. (My files are XML files.) A plain text file is OK for the output.
In that case you should read Medievalist's post, looks like a good solution.