How can I do regex recursive file searches that include LF chars?
Hello,
I need to use a regex that includes LF to locate some files in a large dir/file set; recursive searches are required. My understanding is that grep will not search across line feeds: it matches a pattern within a single line and reports whichever lines of a file contain it. Languages like Perl and awk look workable, but I don't know them yet.
Is a short program the best I can do or is there a command I have not learned/found that will recursively search files for a regex that includes LF?
like:
\x0A\x09Material\x20\x20\x0A\x7B
in text it would look like
LF
TABMaterialsSPCSPCLF
{
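For reference, the search described above can be sketched in a few lines of Python. This is only a minimal sketch under some assumptions: the function name find_matching_files is made up for illustration, the pattern is copied from the hex escapes in the post ("Material" without a trailing "s"), and each file is read fully into memory, which is fine for ordinary files but not for very large ones.

```python
import os
import re

# Pattern copied from the post: LF, TAB, "Material", two spaces, LF, "{".
# A bytes pattern is used so that LF is just another byte to match.
PATTERN = re.compile(rb"\x0A\x09Material\x20\x20\x0A\x7B")


def find_matching_files(root):
    """Yield paths under root whose raw bytes contain PATTERN.

    Hypothetical helper for illustration; it reads each file whole.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    data = fh.read()
            except OSError:
                continue  # skip unreadable files
            if PATTERN.search(data):
                yield path
```

Calling list(find_matching_files("/some/dir")) would collect the matching paths; the directory name here is a placeholder.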
It would be helpful if you provided an exact sample of the file (please use proper formatting), and exactly what it is you want to match with your regular expression.
Thanks for the reply. The search string in my post is as exact as I can get. I have suspect files but am not sure if the string is in any of them. Otherwise I would provide one.
What I need is a search for a generalized hex sequence. It seems that there isn't a Linux command or typical tool that will do this, so I should look for a hex editor that supports regex in some form.
This last thought gave me the idea to dd the directory tree to an image and then use a hex editor to search for a hex string and not use a regex or file tool.
Does anyone know of a stable hex editor that can handle 50GiB files?
My first choice is for Slackware 14, but I can do the basic configure and make process.
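Searching one large image for a fixed byte string does not strictly need a hex editor. The following is a rough sketch, not a tested tool: find_offsets and its chunk_size parameter are names invented here, and the needle is the hex form of the pattern from the first post. It reads the file in fixed-size chunks and carries a short overlap between chunks so that a match straddling a chunk boundary is not missed.

```python
# Hex form of: LF, TAB, "Material", two spaces, LF, "{"
NEEDLE = bytes.fromhex("0A094D6174657269616C20200A7B")


def find_offsets(path, needle=NEEDLE, chunk_size=1 << 20):
    """Yield byte offsets of needle in the file at path.

    Hypothetical helper; needle must be non-empty. Memory use stays
    around chunk_size regardless of how large the file is.
    """
    overlap = len(needle) - 1
    tail = b""  # last bytes of the previous buffer
    base = 0    # file offset where `tail + chunk` begins
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            pos = buf.find(needle)
            while pos != -1:
                yield base + pos
                pos = buf.find(needle, pos + 1)
            # Keep just enough bytes to complete a match that starts
            # near the end of this buffer; too short to hold a full match.
            tail = buf[-overlap:] if overlap else b""
            base += len(buf) - len(tail)
```

The offsets it yields could then be inspected with any hex viewer, which avoids loading the whole image into an editor.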
Sounds like you need help learning awk or Perl. I recommend awk, and I usually learn by doing. You already have an existing regex, and awk uses regexes, so try to write an awk program that suits your needs. Have it first find the sequences you need, and then change or process them, depending on what the next step is.
Suggest you choose which language or tool you wish to use and then start researching them.
Once you've started, you can update this thread with information indicating where you're stuck with the particular option you've chosen. Also from that point people can offer you better recommendations.
GNU Emacs will also handle the files and can show you hex/ASCII output; however, what you should decide first is whether you want an editor or a search tool. I'm not sure whether gedit will work similarly in this mode. vi is also a very capable editor. You should cite which editors you have tried.
Searching hex is easy - simple searching across lines is harder.
Perl (and perl mode in grep) can do it simply, but will slurp the entire file - not a good idea for 50Gig (plus ?) files.
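One way around slurping is to memory-map the file, so the operating system pages data in on demand. Below is a minimal Python sketch of that idea, assuming a 64-bit system (needed to map files of this size); the function name file_contains is made up for illustration.

```python
import mmap
import re

# LF, TAB, "Material", two spaces, LF, "{" as a bytes regex
PATTERN = re.compile(rb"\n\tMaterial  \n\{")


def file_contains(path, pattern=PATTERN):
    """Return True if pattern matches anywhere in the file at path.

    Hypothetical helper: the file is mapped read-only rather than
    read into a bytes object.
    """
    with open(path, "rb") as fh:
        try:
            mm = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
        except ValueError:
            return False  # an empty file cannot be mapped
        with mm:
            # re accepts any bytes-like object, including an mmap
            return pattern.search(mm) is not None
```

The regex engine still scans the whole mapping, so this avoids the memory cost of slurping rather than the time cost.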
Strictly speaking you don't care about the first \n: start by searching for lines beginning with "\tMat" and then check the next line. I see mgrep on SourceForge, which should do for a simple sequence like that; the homepage even has an applicable example.
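The line-then-next-line idea can be sketched without any multi-line regex. The following Python is an illustration only: has_material_block is a made-up name, and because the leading LF is dropped, it also matches when the "Material" line is the very first line of the file.

```python
def has_material_block(path):
    """Return True if a line that is exactly TAB + "Material" + two
    spaces is immediately followed by a line starting with "{".

    Hypothetical helper; reads the file one line at a time.
    """
    prev_matches = False
    with open(path, "rb") as fh:
        for line in fh:  # binary mode splits on LF only
            if prev_matches and line.startswith(b"{"):
                return True
            prev_matches = (line == b"\tMaterial  \n")
    return False
```

Memory use is bounded by the longest line, so a file with no line feeds at all would still be read as one huge line.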
Thanks all, I got busy with work but will work this over the weekend. I'll post back with what I learn and am able to make work.
An awk intro sounds like fun, since I haven't tried it before; I last used Perl, Ruby, and Java 10-20 years ago.
I found an editor named wxHexEditor. It supports files up to 2^64 bytes. It is a beta release.
Happy Trails