Linux - Newbie
This Linux forum is for members that are new to Linux.
Just starting out and have a question?
If it is not in the man pages or the how-to's this is the place!
I use grep to search for a string of text in one line of a file like this:
Code:
grep <text> <filename>
If you want to search through a whole directory of files, running grep on each one to check whether it contains the text you're looking for, you can use this command:
Code:
find <directory> -type f -exec grep <text> {} +
and it should print out a list of all the files that contain it. find replaces {} with the filenames it found, and the trailing + ends the -exec clause and passes many filenames to a single grep invocation. Add grep's -l option if you only want the filenames rather than every matching line.
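A shorter spin on the same idea, assuming a reasonably modern (GNU) grep: its recursive mode walks the directory tree itself. The path and search text here are placeholders.
Code:
```shell
# Recursively search a directory tree; -r descends into subdirectories,
# -l prints only the names of matching files instead of every match.
grep -rl "some text" /path/to/directory
```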
but yes perfect for finding text files with known content
Another cli spin on this would be
Code:
find <directory> -type f -exec file {} +
which would give you the file types. Based on that you can move things around, e.g. restore human-readable file extensions, or examine further with mediainfo (video data) or exif (images).
For text files, grep for the document title.
Binary document formats? You might be able to get something useful from strings.
Really depends on the number of files, and what they are
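One way to get that overview quickly, sketched with standard tools (find, file, sort, uniq); the path is a placeholder:
Code:
```shell
# Summarise what kinds of files a directory holds: run file(1) on
# everything (-b suppresses the filename), then count identical
# type descriptions, most common first.
find /path/to/directory -type f -exec file -b {} + | sort | uniq -c | sort -rn
```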
They are pdf, excel and Word files.
So, how to do it?
at the moment the only sure thing I can say is carry on as you are..
but I see your predicament, I imagine you have quite a few files to deal with...
grep *might* work, but you don't say you are looking for a particular file, just that you want to know the contents of all the files, I assume so you can give them proper names..
But it makes a number of assumptions
The first being that you have pdftotext installed
Second, that the first line of the PDF is a suitable file name (both descriptive and made of "valid" characters; at the moment a "/" will break it...)
and others I have not thought about
It won't actually do anything, just print the commands.
Obviously you should substitute /path/to/.. with existing directories,
pdfinfo instead of pdftotext might get "better" results if the title is embedded
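A minimal sketch of that idea, assuming pdftotext from poppler-utils is installed; the directory is a placeholder, and it only echoes the mv commands, so nothing is renamed until you remove the echo:
Code:
```shell
#!/bin/sh
# For each PDF, extract its text, take the first line as a candidate
# name, strip characters that are unsafe in filenames, and print the
# rename command instead of running it.
for f in /path/to/pdfs/*.pdf; do
    title=$(pdftotext "$f" - 2>/dev/null | head -n 1 | tr -cd 'A-Za-z0-9 ._-')
    [ -n "$title" ] && echo mv "$f" "/path/to/pdfs/$title.pdf"
done
```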
how many files do you have?
that is an important question
seems to be a name
so you want all files that contain the name "LIM YING CHING"
you could try the grep from earlier, but it will probably fail (as pdf, xls and doc are binary formats)
I gave an example where I converted PDF documents to text; instead of using head -n, you could use grep "LIM YING CHING"
but it is more complicated
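A hedged sketch of that combination, again assuming pdftotext is available; pdftotext writes to stdout when given "-", and grep -q only sets the exit status. The path is a placeholder.
Code:
```shell
#!/bin/sh
# Print the names of PDFs whose extracted text mentions the name.
for f in /path/to/pdfs/*.pdf; do
    if pdftotext "$f" - 2>/dev/null | grep -q "LIM YING CHING"; then
        echo "$f"
    fi
done
```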
I can probably get something, but seeing as it is 1000s of files you might be better off with a local data recovery service.
Face to Face, same language.
I can probably do it, but would be looking for some form of reciprocation.
if someone local is already "setup" it will be cheaper ( and if you can stand over them, more private )
Just how important is the data?
You probably don't want to hear this now, but usb drives are good for moving stuff, not backup or exclusive storage.
no, not really
their assumption was you were interested in plain text files, and looking for something specific.
from the start I knew ( ok felt ) you were not
we are in an almost impossible situation
I can not give you a magic spell for what you want
I can offer a service, but it is a service I can not give a 100% guarantee for
pragmatically, processing 1000s of pdf, doc and xls files is a little hit and miss
I can probably make some sense , and get something "useful",
then again maybe not,
I have no idea what is in the files (first line is a wild guess), and thus far I have not even looked at an Excel or Word document.
From my view, these files are not worth much, since they were only on a USB flash drive.
But that is because I know USB flash drives are not for primary storage or backup.. so perhaps they are important to you.
Right now you have the files; they work but have silly names
You have 1000s, do you actually use them?
look back over what you have posted
Pretty much nothing..
I have been guessing at your needs
and still you come back with nothing
In Konqueror 4.8.x (the one on this computer), when you select "Tools-->Find Files" and the Find Files dialog opens, there is a tab labeled "Contents."
You can enter the text you are searching for in the "Containing Text" field.
It's ultimately quicker to learn how to do this with command line tools, but this is a nice GUI tool for searching file contents.
will that work with binary formats like excel and word?
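Not directly. One crude but low-dependency approach for the legacy binary .doc/.xls formats is strings(1), which pulls out printable runs so grep has something to match; the modern .docx/.xlsx formats are zip containers of XML, so streaming them through unzip -p tends to work better. Both are sketches with placeholder filenames, not guaranteed for every file:
Code:
```shell
# Legacy binary formats: extract printable strings, then grep them.
strings report.doc | grep -i "lim ying ching"

# .docx / .xlsx are zip archives of XML; stream the contents out and
# grep that (-a treats the stream as text even if it looks binary).
unzip -p report.docx | grep -a -i "lim ying ching"
```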