LinuxQuestions.org
Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

Old 07-31-2013, 11:49 PM   #1
future_computer
Member
 
Registered: Apr 2012
Distribution: Pinguy OS
Posts: 392

Rep: Reputation: Disabled
search file's contents


We can use Konqueror to search by file name,
but how about the contents?

My problem is that I recovered my flash drive and many file names were changed. I am tired of opening them one by one to check their contents.

Is there anything that can help me identify the files' contents?
 
Old 08-01-2013, 12:38 AM   #2
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
if you must use Konqueror, then

Alt+v Alt+v Alt+d

or

View --> View Mode --> details.

You should see a "type" column
Which will tell you the file type

not all that useful, but it is a start
 
Old 08-01-2013, 12:41 AM   #3
barracuda0
LQ Newbie
 
Registered: Aug 2013
Posts: 2

Rep: Reputation: Disabled
I use grep to search for a string of text in one line of a file like this:

Code:
grep <text> <filename>
If you want to search through a whole directory of files, running grep on each one to check whether it contains the text you're looking for, you can use this command:

Code:
find <directory> -type f -exec grep <text> {} +
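and it should print every matching line, normally prefixed with the name of the file it came from (add -l to grep if you only want the file names). The find command replaces {} with as many filenames as fit on one command line, and + marks the end of the grep command.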
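As a rough sketch of the same idea that prints only file names: the function name `list_files_containing` is made up for illustration, and the extra grep flags are my additions rather than part of the command above. Like the original, it only helps with plain text files.

```shell
# list_files_containing DIR TEXT
# Print the name of every regular file under DIR that contains TEXT.
#   -l  print only the names of matching files, not the matching lines
#   -i  ignore case
#   -F  treat TEXT as a fixed string, not a regular expression
list_files_containing() {
  find "$1" -type f -exec grep -liF -- "$2" {} +
}

# Example call (the directory is a placeholder):
# list_files_containing /path/to/dir "some text"
```

The `--` keeps grep from treating a search text that starts with a dash as an option.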
 
Old 08-01-2013, 01:23 AM   #4
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
assumes they are text files

but yes perfect for finding text files with known content

Another cli spin on this would be

Code:
find <directory> -type f -exec file {} +
which would give you the types. Based on that you can move things around, e.g. restore human-readable file extensions, and examine further with mediainfo ( video data ) or exif ( images )

text -- grep for document title
binary document formats? you might be able to get something useful from strings


Really depends on the number of files, and what they are
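A minimal sketch of the "restore file extensions" idea, assuming a version of `file` that supports `--mime-type`. The function names `ext_for_mime` and `print_renames` are hypothetical, and the MIME strings in the table are assumptions about what your `file` reports; older versions may describe Office documents differently, so check a few files by hand first.

```shell
# ext_for_mime MIME: map a MIME type (as printed by `file --mime-type -b`)
# to a file extension. Prints nothing for types it does not know.
ext_for_mime() {
  case "$1" in
    application/pdf) echo pdf ;;
    application/msword) echo doc ;;
    application/vnd.ms-excel) echo xls ;;
    application/vnd.openxmlformats-officedocument.wordprocessingml.document) echo docx ;;
    application/vnd.openxmlformats-officedocument.spreadsheetml.sheet) echo xlsx ;;
    text/plain) echo txt ;;
    image/jpeg) echo jpg ;;
  esac
}

# print_renames DIR: print (not run) one mv command per recognised file.
# File names containing newlines are not handled.
print_renames() {
  find "$1" -type f | while IFS= read -r f; do
    ext=$(ext_for_mime "$(file --mime-type -b "$f")")
    if [ -n "$ext" ]; then
      printf 'mv -- "%s" "%s.%s"\n' "$f" "$f" "$ext"
    fi
  done
}

# Example call (the directory is a placeholder):
# print_renames /path/to/dir
```

It only prints the mv commands so they can be reviewed before anything is renamed; a file that already has the right extension would get it appended a second time.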
 
Old 08-01-2013, 08:05 AM   #5
future_computer
Member
 
Registered: Apr 2012
Distribution: Pinguy OS
Posts: 392

Original Poster
Rep: Reputation: Disabled
They are pdf, excel and Word files.
So, how to do it?
 
Old 08-01-2013, 10:45 AM   #6
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
Quote:
Originally Posted by future_computer
They are pdf, excel and Word files.
So, how to do it?
at the moment the only sure thing I can say is carry on as you are..

but I see your predicament, I imagine you have quite a few files to deal with...

grep *_might_* work, but you don't say you are looking for a particular file, just that you want to know the contents of all the files, I assume so that you can give them proper names..

But I shall have a think
 
Old 08-01-2013, 01:25 PM   #7
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
ok, a little follow up for PDF
( I will have to find some excel/word files )

this is POC ( proof of concept )

Code:
find /path/to/dir -type f -exec file {} ';' \
| awk -F: '/PDF document,/ {
    cmd = "pdftotext \"" $1 "\" - | head -n1"   # quoted, so names with spaces work
    Newfile = ""; cmd | getline Newfile; close(cmd)
    printf "cp \"%s\" \"/path/to/output/%s.pdf\"\n", $1, Newfile
  }'
But it makes a number of assumptions
The first being that you have pdftotext installed
Second .. that the first line of the pdf is a suitable file name ( both descriptive and with "valid" characters , at the moment "/" will break it... )

and others I have not thought about

it won't actually do anything, just print commands

obviously you should replace the /path/to/.. placeholders with existing directories,


pdfinfo instead of pdftotext might get "better" results if the title is embedded



how many files do you have?
that is an important question
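One way the two weak points mentioned above (characters such as "/" in the title, and pdfinfo possibly giving a better title) might be handled is sketched below. The function names `sanitize_name`, `pdf_title` and `print_pdf_copies` are hypothetical, and it assumes `pdfinfo` and `pdftotext` (poppler-utils) are installed. Like the POC, it only prints commands.

```shell
# sanitize_name TEXT: turn TEXT into something safe to use as a file name.
# Anything other than letters, digits, dot, underscore and hyphen becomes "_",
# runs of "_" are squeezed to one, and the result is cut to 80 characters.
sanitize_name() {
  printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9._-' '_' | LC_ALL=C tr -s '_' | cut -c1-80
}

# pdf_title FILE: print the embedded Title reported by pdfinfo if there is
# one, otherwise the first non-empty line of text from pdftotext.
pdf_title() {
  title=$(pdfinfo "$1" 2>/dev/null | sed -n 's/^Title:[[:space:]]*//p' | head -n 1)
  if [ -z "$title" ]; then
    title=$(pdftotext "$1" - 2>/dev/null | grep '[^[:space:]]' | head -n 1)
  fi
  printf '%s\n' "$title"
}

# print_pdf_copies SRC DEST: print (not run) a cp command for each file that
# yields a title. Files pdfinfo and pdftotext cannot read are skipped.
print_pdf_copies() {
  find "$1" -type f | while IFS= read -r f; do
    name=$(sanitize_name "$(pdf_title "$f")")
    if [ -n "$name" ]; then
      printf 'cp -- "%s" "%s/%s.pdf"\n' "$f" "$2" "$name"
    fi
  done
}

# Example call (both directories are placeholders):
# print_pdf_copies /path/to/dir /path/to/output
```

Two PDFs with the same title would map to the same output name, so the printed list is worth checking for duplicates before running it.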
 
Old 08-01-2013, 05:56 PM   #8
future_computer
Member
 
Registered: Apr 2012
Distribution: Pinguy OS
Posts: 392

Original Poster
Rep: Reputation: Disabled
there are thousands of files, mainly pdf, docx, excel .

Let's say I want to find a file containing "LIM YING CHING", in a directory named C:\Backup , how to do it?
 
Old 08-01-2013, 08:35 PM   #9
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
1000s ?
ok yeah

a lot

I'm sure it can be done,
not perfect
but OK

can it be done without access to files?
not sure


LIM YING CHING

seems to be a name
so you want all files that contain that name "LIM YING CHING"

you could try the grep from earlier, but it will probably fail ( as pdf, xls and doc are binary formats )

I gave an example where I converted PDF documents to text; instead of using head -n, you could use grep "LIM YING CHING"
but it is more complicated

I can probably get something, but seeing as it is 1000s of files you might be better with a local data recovery service.

Face to Face, same language.

I can probably do it, but would be looking for some form of reciprocation.
if someone local is already "setup" it will be cheaper ( and if you can stand over them, more private )

Just how important is the data?

You probably don't want to hear this now, but usb drives are good for moving stuff, not backup or exclusive storage.
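For the "convert to text, then grep" approach described in this post, a rough sketch might look like the following. The names `to_text` and `files_with_text` are hypothetical. It assumes `pdftotext` and `unzip` are installed, that the files already carry sensible extensions, and that .docx and .xlsx files keep their text in `word/document.xml` and `xl/sharedStrings.xml` respectively, which is the usual layout but not guaranteed. Old binary .doc and .xls files are simply passed through as raw bytes, so matches in those are hit and miss.

```shell
# to_text FILE: write a plain-text approximation of FILE to stdout,
# choosing a converter by file extension.
to_text() {
  case "$1" in
    *.pdf)  pdftotext "$1" - ;;
    *.docx) unzip -p "$1" word/document.xml | sed 's/<[^>]*>//g' ;;   # strip XML tags
    *.xlsx) unzip -p "$1" xl/sharedStrings.xml | sed 's/<[^>]*>//g' ;;
    *)      cat "$1" ;;                                               # raw bytes
  esac
}

# files_with_text DIR TEXT: print the name of every file under DIR whose
# converted text contains TEXT (fixed string, case-insensitive).
# File names containing newlines are not handled.
files_with_text() {
  dir=$1
  text=$2
  find "$dir" -type f | while IFS= read -r f; do
    if to_text "$f" 2>/dev/null | grep -iF -- "$text" >/dev/null; then
      printf '%s\n' "$f"
    fi
  done
}

# Example call (the directory is a placeholder):
# files_with_text /path/to/Backup "LIM YING CHING"
```

Stripping the XML tags outright glues neighbouring cells and paragraphs together, which helps when a name is split across formatting runs but can also produce the odd false match.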
 
Old 08-01-2013, 09:09 PM   #10
future_computer
Member
 
Registered: Apr 2012
Distribution: Pinguy OS
Posts: 392

Original Poster
Rep: Reputation: Disabled
This will help me?

Quote:
find <directory> -type f -exec grep <text> {} +
 
Old 08-01-2013, 09:42 PM   #11
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
Quote:
Originally Posted by future_computer
This will help me?
no, not really
their assumption was you were interested in plain text files, and looking for something specific.

from the start I knew ( ok felt ) you were not

we are in an almost impossible situation

I can not give you a magic spell for what you want
I can offer a service, but it is a service I can not give a 100% guarantee for

programmatically processing 1000s of pdf, doc, xls files is a little hit and miss

I can probably make some sense of them, and get something "useful",
then again maybe not,
I have no idea what is in the files ( first line is a wild guess ), and thus far I have not even looked at an excel or word document.

From my view , these files are not worth much, since they were only on a usb flash drive.
But this is because I know that USB flash drives are not for primary storage or backup.. So I know they are perhaps important

Right now you have the files, they work but have silly names

You have 1000s, do you actually use them?

look back over what you have posted
Pretty much nothing..
I have been guessing at your needs
and still you come back with nothing
 
Old 08-01-2013, 09:47 PM   #12
frankbell
LQ Guru
 
Registered: Jan 2006
Location: Virginia, USA
Distribution: Slackware, Ubuntu MATE, Mageia, and whatever VMs I happen to be playing with
Posts: 19,272
Blog Entries: 28

Rep: Reputation: 6123
In Konqueror 4.8.x (the one on this computer), when you select "Tools-->Find Files" and the Find Files dialog opens, there is a tab labeled "Contents."

You can enter text you are searching for in "Containing Text" field.

It's ultimately quicker to learn how to do this with command line tools, but this is a nice GUI tool for searching file contents.
 
Old 08-02-2013, 01:37 AM   #13
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,348

Rep: Reputation: 2749
Try Perl; either use binmode directly, e.g. http://www.cs.cf.ac.uk/Dave/PERL/node73.html, or grab the pre-made modules from CPAN to read each file type: search.cpan.org
 
Old 08-02-2013, 04:44 AM   #14
Firerat
Senior Member
 
Registered: Oct 2008
Distribution: Debian sid
Posts: 2,683

Rep: Reputation: 783
Quote:
Originally Posted by frankbell
In Konqueror 4.8.x (the one on this computer), when you select "Tools-->Find Files" and the Find Files dialog opens, there is a tab labeled "Contents."

You can enter text you are searching for in "Containing Text" field.

It's ultimately quicker to learn how to do this with command line tools, but this is a nice GUI tool for searching file contents.
will that work with binary formats like excel and word?
 
Old 08-02-2013, 08:01 PM   #15
frankbell
LQ Guru
 
Registered: Jan 2006
Location: Virginia, USA
Distribution: Slackware, Ubuntu MATE, Mageia, and whatever VMs I happen to be playing with
Posts: 19,272
Blog Entries: 28

Rep: Reputation: 6123
Quote:
will that work with binary formats like excel and word?
I just tested it on the contents of my documents folder, which has a rich array of Open Office, LibreOffice, and MSOffice formats on it.

Yes, it does appear to work on them.
 
  


All times are GMT -5.