LinuxQuestions.org
Old 02-28-2013, 10:25 AM   #1
koshy
LQ Newbie
 
Registered: Mar 2010
Posts: 17

Rep: Reputation: 0
Search for the file name inside the same file


I want to go through a directory recursively and search for the name of a specific file, i.e. the same file name, and record the matches and non-matches in another file. Is there a smart way to do it without opening each file manually and verifying? Please help.

koshy
 
Old 02-28-2013, 01:05 PM   #2
porphyry5
Member
 
Registered: Jul 2010
Location: oregon usa
Distribution: Slackware 14.1, Arch
Posts: 424

Rep: Reputation: 18
Quote:
Originally Posted by koshy View Post
I want to go through a directory recursively and search for the name of a specific file, i.e. the same file name, and record the matches and non-matches in another file. Is there a smart way to do it without opening each file manually and verifying? Please help.

koshy
Are you looking for repeated occurrences of the same filename in different subdirectories, or for the occurrence of a specific filename as part of the text content of other files?

If the latter, something like
Code:
~ $ mkdir fred
~ $ cd fred
~/fred $ touch a.txt b.txt c.txt
~/fred $ printf '%s\n' a.txt b.txt > d.txt
~/fred $ printf '%s\n' c.txt b.txt > e.txt
~/fred $ z=(*)   # glob into an array; z=$(ls) would give one long string
~/fred $ for f in "${z[@]}"; do grep -HF a.txt "$f"; done
d.txt:a.txt
~/fred $
If the former, something like
Code:
~/fred $ find . -type f > ofile.txt
~/fred $ grep a.txt ofile.txt
./a.txt
~/fred $
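
For the first case (the same file name turning up in several subdirectories), here is a minimal sketch. It assumes a throwaway tree made with mktemp, the file names are invented for illustration, and it will be confused by names that contain newlines:

```shell
# Scratch tree (hypothetical names) with notes.txt in two places.
demo=$(mktemp -d)
mkdir -p "$demo/one" "$demo/two"
touch "$demo/one/notes.txt" "$demo/two/notes.txt" "$demo/two/other.txt"

# sed strips everything up to the last slash, leaving the base name;
# uniq -d then keeps only the names that occur more than once.
dupes=$(find "$demo" -type f | sed 's|.*/||' | sort | uniq -d)
echo "$dupes"
```

Running find again with -name and one of the reported names then shows where each copy lives.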
 
Old 02-28-2013, 01:18 PM   #3
suicidaleggroll
Senior Member
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 3,221

Rep: Reputation: 1155
I'm not really sure what you're asking, could you clarify?

You have a directory, and buried in its subdirectories you have a bunch of text files. Now are you trying to search for a SINGLE file name in the contents of all of these text files, or are you trying to find which text files contain their OWN name in the contents?

Either way, you then want to create a new file with a list of which of those text files contained the name you were looking for and which didn't? How do you want this new file formatted?
 
Old 02-28-2013, 01:34 PM   #4
Medievalist
Member
 
Registered: Aug 2003
Distribution: Dead Rat
Posts: 175

Rep: Reputation: 37
Use find to call grep in quiet mode

I'm not sure I understand the question, but I'll give it a shot.

If you want to check each file in a folder hierarchy to see if it contains its own name, do this:

find /path -type f -exec grep -q {} {} \; -fprint matched.txt -o -type f -print >unmatched.txt

The find command looks recursively at everything under the starting path you give it. If you start at the root, you might get screwed if you have loops in your filesystem (for example in /sys or /proc, or if you abuse shootsnap.sh) so be careful to choose a sane starting path.

The remaining switches and options to find are processed left to right with implicit AND operators. Each one is evaluated for success or failure sequentially.

The -type f switch fails for links, directories, devices, etc. and succeeds for regular plain-jane files.

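The -exec spawns a grep in quiet mode (-q), which prints nothing and stops reading the file at the first match. Grep returns success if the string is found, and failure if it is not found or an error occurs. Note that grep treats the pattern as a regular expression unless you add -F, so the dots in a file name will match any character.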
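
A small sketch of those exit statuses, using a scratch file with a made-up name. The -F and -s switches are additions for the demonstration (literal matching, and hiding the "no such file" message):

```shell
# Scratch file with known content (hypothetical name).
demo=$(mktemp -d)
printf 'hello sample.xml\n' > "$demo/sample.xml"

found=0;  grep -qF  'sample.xml'  "$demo/sample.xml"   || found=$?   # string present
absent=0; grep -qF  'missing.xml' "$demo/sample.xml"   || absent=$?  # string absent
failed=0; grep -qsF 'sample.xml'  "$demo/no-such-file" || failed=$?  # file cannot be read

# Expect 0 for a match, 1 for no match, and a value above 1 for an error.
echo "found=$found absent=$absent failed=$failed"
```

find only cares whether the status is zero, so "not found" and "error" both count as failure for -exec.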

The name of the file currently being looked at will replace each set of paired curly braces, and the backslash-semicolon ends the grep command we told -exec to use.

The -fprint prints the name of the file currently being worked with, if the current status is success, into an output file. (GNU find creates the output file as soon as it starts and truncates it if it already exists, so an old matched.txt will be overwritten.)

The -o stands for OR (remember how everything before this is considered to be joined by an AND?), so what follows it is evaluated only when the chain before it has failed. This is cool because it means the grep failing to find the string is going to trigger it, but -type f failing will also trigger it when you're recursing through directories or links, so we need to do the -type f again if we only want regular files.

The -print prints the name of the file currently being worked with, if the current status is success (which it will be, if it's a regular file and the pattern wasn't matched) and we redirect the output to a file using normal shell I/O redirection.

This scales extremely well, but it handles loony file names poorly, so you should read the find and grep man pages if you have lunatics naming your files.
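
To try the command safely, here is a sketch on a scratch tree. It assumes GNU find (for -fprint), the directory and file names are invented, and -F is added so that grep treats the path as a literal string. Because {} expands to the full path, a file only "matches" here if it contains its whole path:

```shell
# Scratch tree: self.xml contains its own full path, other.xml does not.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf '<doc src="%s"/>\n' "$demo/sub/self.xml" > "$demo/sub/self.xml"
printf '<doc src="elsewhere"/>\n' > "$demo/sub/other.xml"

# Keep both result lists outside the searched tree so they are not scanned.
out=$(mktemp -d)
find "$demo" -type f -exec grep -qF -- {} {} \; -fprint "$out/matched.txt" \
     -o -type f -print > "$out/unmatched.txt"

cat "$out/matched.txt"     # the path ending in sub/self.xml
cat "$out/unmatched.txt"   # the path ending in sub/other.xml
```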

Last edited by Medievalist; 02-28-2013 at 01:39 PM. Reason: missed a step in explanation
 
Old 02-28-2013, 08:30 PM   #5
koshy
LQ Newbie
 
Registered: Mar 2010
Posts: 17

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by porphyry5 View Post
Are you looking for repeated occurrences of the same filename in different subdirectories, or for the occurrence of a specific filename as part of the text content of other files?

I'm sorry, I forgot to mention that I wanted to check for the name of the opened file inside the file itself.
 
Old 02-28-2013, 08:35 PM   #6
koshy
LQ Newbie
 
Registered: Mar 2010
Posts: 17

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by suicidaleggroll View Post
I'm not really sure what you're asking, could you clarify?

You have a directory, and buried in its subdirectories you have a bunch of text files. Now are you trying to search for a SINGLE file name in the contents of all of these text files, or are you trying to find which text files contain their OWN name in the contents?

Either way, you then want to create a new file with a list of which of those text files contained the name you were looking for and which didn't? How do you want this new file formatted?
Dear suicidaleggroll
I had forgotten to mention that I wanted to look for the name of the file being checked inside the file itself. (My files are XML files.) A plain text file is OK for the output.
 
Old 02-28-2013, 08:44 PM   #7
suicidaleggroll
Senior Member
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 3,221

Rep: Reputation: 1155
Quote:
Originally Posted by koshy View Post
Dear suicidaleggroll
I had forgotten to mention that I wanted to look for the name of the file being checked inside the file itself. (My files are xml files). Plain text file is OK
In that case you should read Medievalist's post, looks like a good solution.
 
Old 02-28-2013, 09:52 PM   #8
sag47
Senior Member
 
Registered: Sep 2009
Location: Philly, PA
Distribution: Kubuntu x64, RHEL, Fedora Core, FreeBSD, Windows x64
Posts: 1,509
Blog Entries: 35

Rep: Reputation: 384
Here's a solution similar to Medievalist's, except that it greps for the base name of the file rather than its full path.

Code:
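find /path -type f | while IFS= read -r line;do if grep -qF -- "$(basename "$line")" "$line";then echo "$line";else echo "$line" >&2;fi;done 1> matched.txt 2> unmatched.txt
grep -q already stops reading a file at the first match, so adding -l would not make it any faster. The -F makes grep treat the name as a fixed string rather than a regular expression, and >&2 is used instead of > /dev/stderr because on Linux reopening /dev/stderr can truncate unmatched.txt on every write.

Here's the above one liner again in a more human readable expanded format.
Code:
find /path -type f | while IFS= read -r line;do
  # Search each file for its own base name, as a fixed string.
  if grep -qF -- "$(basename "$line")" "$line";then
    echo "$line"        # stdout ends up in matched.txt
  else
    echo "$line" >&2    # stderr ends up in unmatched.txt
  fi
done 1> matched.txt 2> unmatched.txt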
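
If file names may contain spaces or other odd characters, one possible variant hands the paths straight from find to a small sh script instead of piping them through read. This is only a sketch on a scratch tree with invented names. The -s switch is an addition that keeps grep's own error messages out of unmatched.txt:

```shell
# Scratch tree with spaces in both the directory and a file name.
demo=$(mktemp -d)
out=$(mktemp -d)
mkdir -p "$demo/sub dir"
printf 'name: odd name.xml\n' > "$demo/sub dir/odd name.xml"
printf 'nothing here\n' > "$demo/sub dir/plain.xml"

# find passes the paths as arguments, so no parsing of a text stream is needed;
# ${path##*/} strips the directory part, leaving the base name.
find "$demo" -type f -exec sh -c '
  for path do
    if grep -qsF -- "${path##*/}" "$path"; then
      printf "%s\n" "$path"
    else
      printf "%s\n" "$path" >&2
    fi
  done' sh {} + > "$out/matched.txt" 2> "$out/unmatched.txt"

cat "$out/matched.txt"     # the path ending in "sub dir/odd name.xml"
cat "$out/unmatched.txt"   # the path ending in "sub dir/plain.xml"
```

The result files are still one path per line, so a name containing a newline would need a different output format.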

Last edited by sag47; 02-28-2013 at 10:03 PM.
 
  

