LinuxQuestions.org


future_computer 07-31-2013 11:49 PM

search file's contents
 
We can use Konqueror to search by file name,
but how about the contents?

My problem is that I recovered my flash drive and many file names were changed. I am tired of opening the files one by one to check their contents.

Is there anything that can help me identify the files' contents?

Firerat 08-01-2013 12:38 AM

if you must use Konqueror, then

Alt+v Alt+v Alt+d

or

View --> View Mode --> details.

You should see a "type" column
Which will tell you the file type

not all that useful, but it is a start

barracuda0 08-01-2013 12:41 AM

I use grep to search for a string of text in one line of a file like this:

Code:

grep <text> <filename>
If you want to search through a whole directory of files, running grep on each one to check whether it contains the text you're looking for, you can use this command:

Code:

find <directory> -type f -exec grep <text> {} +
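and it should print every matching line, prefixed with the name of the file it was found in (add -l to grep if you only want the list of file names). The find command replaces {} with the file names, passing as many as fit on one command line, and + marks the end of the grep command.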
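
A slightly more defensive variant of the same idea, as a sketch only: the arguments are quoted, -l prints just the file names, and -F treats the text as a literal string rather than a regular expression. The function name list_matching_files is made up for illustration and is not from this thread.

```shell
#!/bin/sh
# Sketch: list regular files under a directory that contain a literal string.
# list_matching_files is a hypothetical helper name.
list_matching_files() {
    dir=$1
    text=$2
    # -l: print only the names of matching files
    # -F: treat the text as a fixed string, not a regex
    # -e: keeps grep from reading a leading "-" in the text as an option
    find "$dir" -type f -exec grep -l -F -e "$text" {} +
}

# Example call (placeholders):
#   list_matching_files /path/to/dir "some text"
```

Like the plain grep above, this only finds text that is stored uncompressed in the file, so it suits plain text files best.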

Firerat 08-01-2013 01:23 AM

assumes they are text files

but yes perfect for finding text files with known content

Another cli spin on this would be

Code:

find <directory> -type f -exec file {} +
which would give you the types. Based on that you can move things around, e.g. restore human-readable file extensions, and examine further with mediainfo ( video data ) or exif ( images )

text -- grep for the document title
binary document formats? you might be able to get something useful from strings


Really depends on the number of files, and what they are
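
As a rough sketch of the "restore file extensions" idea: ask file for the MIME type of each file in one directory and print (not run) a suggested rename. The helper names suggest_extension and print_rename_plan are made up for illustration, the MIME table is deliberately small, and the exact strings file reports for Office documents depend on the installed version, so treat the mapping as an assumption to check on your own system.

```shell
#!/bin/sh
# Sketch: map a MIME type (as printed by `file --mime-type`) to an extension.
# Both function names are hypothetical; the table is an incomplete assumption.
suggest_extension() {
    case $1 in
        application/pdf) echo pdf ;;
        application/vnd.openxmlformats-officedocument.wordprocessingml.document) echo docx ;;
        application/vnd.openxmlformats-officedocument.spreadsheetml.sheet) echo xlsx ;;
        application/msword) echo doc ;;
        application/vnd.ms-excel) echo xls ;;
        text/plain) echo txt ;;
        *) echo unknown ;;
    esac
}

# Print one suggested mv command per regular file directly inside a directory.
# Nothing is renamed; the output is meant to be reviewed first.
print_rename_plan() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue
        mime=$(file -b --mime-type -- "$f")
        ext=$(suggest_extension "$mime")
        if [ "$ext" = unknown ]; then
            printf '# no suggestion for %s (%s)\n' "$f" "$mime"
        else
            printf 'mv -- "%s" "%s.%s"\n' "$f" "$f" "$ext"
        fi
    done
}

# Example call (placeholder path):
#   print_rename_plan /path/to/dir
```

File names that themselves contain double quotes would need extra care before the printed commands are run.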

future_computer 08-01-2013 08:05 AM

They are pdf, excel and Word files.
So, how to do it?

Firerat 08-01-2013 10:45 AM

Quote:

Originally Posted by future_computer (Post 5000925)
They are pdf, excel and Word files.
So, how to do it?

at the moment the only sure thing I can say is carry on as you are..

but I see your predicament, I imagine you have quite a few files to deal with...

grep *_might_* work, but you don't say that you are looking for a particular file, just that you want to know the contents of all the files, I assume so that you can give them proper names.

But I shall have a think

Firerat 08-01-2013 01:25 PM

ok, a little follow up for PDF
( I will have to find some excel/word files )

this is POC ( proof of concept )

Code:

find /path/to/dir -type f -exec file {} ';' \
| awk -F: '/PDF document,/ {
    cmd = "pdftotext \"" $1 "\" - | head -n1"   # first text line of the PDF
    Newfile = ""; cmd | getline Newfile; close(cmd)
    printf "cp \"%s\" \"/path/to/output/%s.pdf\"\n", $1, Newfile
}'

But it makes a number of assumptions
The first being that you have pdftotext installed
Second .. that the first line of the pdf is a suitable file name ( both descriptive and with "valid" characters , at the moment "/" will break it... )

and others I have not thought about

it won't actually do anything, just print commands

obviously you should replace the /path/to/.. placeholders with existing directories


pdfinfo instead of pdftotext might get "better" results if the title is embedded
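
A minimal sketch of the pdfinfo idea, assuming pdfinfo (from poppler-utils) is installed and prints a "Title:" line when a title is embedded. The helper names pdf_title and safe_name are made up for illustration; safe_name only deals with the "/" problem mentioned above.

```shell
#!/bin/sh
# Sketch: print the embedded title of a PDF, or nothing if there is none.
# pdf_title and safe_name are hypothetical helper names.
pdf_title() {
    # pdfinfo prints lines such as "Title:          Some title"
    pdfinfo "$1" 2>/dev/null | sed -n 's/^Title:[[:space:]]*//p' | head -n 1
}

# Replace "/" so the title can be used as a single file name component.
safe_name() {
    printf '%s\n' "$1" | tr '/' '_'
}

# Example call (placeholder path):
#   safe_name "$(pdf_title /path/to/file.pdf)"
```

Many PDFs have no embedded title, or a meaningless one, so an empty or odd result is to be expected for some files.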



how many files do you have?
that is an important question

future_computer 08-01-2013 05:56 PM

There are thousands of files, mainly pdf, docx and excel.

Let's say I want to find a file containing "LIM YING CHING" in a directory named C:\Backup. How do I do it?

Firerat 08-01-2013 08:35 PM

1000s ?
ok yeah

a lot

I'm sure it can be done,
not perfect
but OK

can it be done without access to files?
not sure


LIM YING CHING

seems to be a name
so you want all files that contain that name, "LIM YING CHING"

you could try the grep from earlier, but it will probably fail ( as pdf, xls and doc are binary formats )

I gave an example where I converted PDF documents to text; instead of using head -n1, you could use grep "LIM YING CHING"
but it is more complicated
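
Roughly, that could look like the sketch below, assuming pdftotext is installed. The helper names pdf_contains and list_pdfs_containing are made up for illustration, and the loop only looks at files directly inside one directory.

```shell
#!/bin/sh
# Sketch: succeed if the text extracted from a PDF contains a literal string.
# pdf_contains and list_pdfs_containing are hypothetical helper names.
pdf_contains() {
    # "-" makes pdftotext write the extracted text to standard output
    pdftotext "$1" - 2>/dev/null | grep -q -F -e "$2"
}

# Print the name of every file in a directory whose extracted text matches.
# Files that are not PDFs produce no text and are silently skipped.
list_pdfs_containing() {
    dir=$1
    text=$2
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        if pdf_contains "$f" "$text"; then
            printf '%s\n' "$f"
        fi
    done
}

# Example call (placeholder path):
#   list_pdfs_containing /path/to/dir "LIM YING CHING"
```

A name that is split across two lines in the PDF, or a scanned PDF without a text layer, will not be found this way.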

I can probably get something, but seeing as it is 1000s of files you might be better off with a local data recovery service.

Face to Face, same language.

I can probably do it, but would be looking for some form of reciprocation.
if someone local is already "setup" it will be cheaper ( and if you can stand over them, more private )

Just how important is the data?

You probably don't want to hear this now, but usb drives are good for moving stuff, not backup or exclusive storage.

future_computer 08-01-2013 09:09 PM

Will this help me?

Quote:

find <directory> -type f -exec grep <text> {} +

Firerat 08-01-2013 09:42 PM

Quote:

Originally Posted by future_computer (Post 5001290)
Will this help me?

no, not really
Their assumption was that you were interested in plain text files and looking for something specific.

From the start I knew ( OK, felt ) you were not.

We are in an almost impossible situation.

I cannot give you a magic spell for what you want.
I can offer a service, but it is a service I cannot give a 100% guarantee for.

Programmatically processing 1000s of pdf, doc and xls files is a little hit and miss.

I can probably make some sense of them and get something "useful",
then again maybe not.
I have no idea what is in the files ( using the first line is a wild guess ), and thus far I have not even looked at an Excel or Word document.

From my point of view these files are not worth much, since they were only on a USB flash drive.
But that is because I know that USB flash drives are not for primary storage or backup. So I accept that they are perhaps important to you.

Right now you have the files; they work but have silly names.

You have 1000s, do you actually use them?

look back over what you have posted
Pretty much nothing..
I have been guessing at your needs
and still you come back with nothing

frankbell 08-01-2013 09:47 PM

In Konqueror 4.8.x (the one on this computer), when you select "Tools-->Find Files" and the Find Files dialog opens, there is a tab labeled "Contents."

You can enter the text you are searching for in the "Containing Text" field.

It's ultimately quicker to learn how to do this with command line tools, but this is a nice GUI tool for searching file contents.

chrism01 08-02-2013 01:37 AM

Try Perl: either use binmode directly, e.g. http://www.cs.cf.ac.uk/Dave/PERL/node73.html, or grab the pre-made CPAN modules for reading each file type from search.cpan.org
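
For the newer .docx and .xlsx formats there may be a way without Perl, since those files are ZIP archives of XML. The sketch below assumes unzip is installed; the helper name ooxml_contains is made up for illustration. It reads word/document.xml (Word text) and xl/sharedStrings.xml (Excel strings), strips the XML tags and searches what is left. It does not apply to the older binary .doc and .xls formats.

```shell
#!/bin/sh
# Sketch: succeed if a .docx or .xlsx file contains a literal string.
# ooxml_contains is a hypothetical helper name.
ooxml_contains() {
    # -p writes the named archive members to standard output;
    # a member that does not exist in this archive is simply skipped
    unzip -p "$1" word/document.xml xl/sharedStrings.xml 2>/dev/null |
        sed 's/<[^>]*>//g' |
        grep -q -F -e "$2"
}

# Example call (placeholder path):
#   ooxml_contains /path/to/file.docx "LIM YING CHING" && echo found
```

Because the tags are simply removed, text from neighbouring paragraphs or cells runs together, and characters stored as XML entities (such as &amp;) are not decoded, so some matches can be missed.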

Firerat 08-02-2013 04:44 AM

Quote:

Originally Posted by frankbell (Post 5001310)
In Konqueror 4.8.x (the one on this computer), when you select "Tools-->Find Files" and the Find Files dialog opens, there is a tab labeled "Contents."

You can enter the text you are searching for in the "Containing Text" field.

It's ultimately quicker to learn how to do this with command line tools, but this is a nice GUI tool for searching file contents.

will that work with binary formats like excel and word?

frankbell 08-02-2013 08:01 PM

Quote:

will that work with binary formats like excel and word?
I just tested it on the contents of my documents folder, which has a rich array of Open Office, LibreOffice, and MSOffice formats on it.

Yes, it does appear to work on them.

