Searching .txt file for (specific) strings and printing them to new file
Hey, I know the title is long, but basically I'm trying to pull specific strings out of a text file and print each one on a new line in a separate file. It may seem confusing, but here's an example.
Say I have a file with this type of formatting. Using the above example, I would like to take only the emails from the file and write them to a completely separate file with > seperate-file.txt, but the problem is that I don't know how to go about that.
I tried using grep to search for a string like "***@***.com", but that printed the entire line, including Name, PhoneNumber, etcetera. After reading the man page, I found out that grep works on whole lines.
I was wondering if anybody knew of a command that I could do this with? It's beginning to give me a bit of a headache.
This looks like homework to me. Is it? You're not going to find a lot of help for homework on LQ, since it's against the rules. If it is, you should look into awk and loops to obtain the result you want. If it's not homework, post what you've already tried and we'll take it from there.
No, I'm not at school or college at the minute, and this has nothing to do with homework or coursework of any sort. I have been trying to teach myself a bit more about Bash and its commands by reading some of the man pages, making practice text files, and trying to manipulate what they contain from the terminal. I put together this list of random nobodies with some random numbers, addresses, etc. and was wondering whether this is the type of thing that can be done from the terminal. I was pretty sure it could, but it began giving me a headache, so I wanted to ask on a forum.
I've tried such things as grep, but grep could only print the whole lines, and that's about as near as I could get. The latest commands I have been trying are
I have already looked into awk, but the man page I was reading (for mawk?) explained how it's used with programs, and I didn't really understand how I would use awk to process the .txt file like this.
That's pretty much all I've been able to think of at the minute though.
That would give you output with the trailing comma included, so you can use sed to 'clip' it off, or replace it with a newline character, to get all the email addresses in a list.
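A minimal sketch of that clean-up step, assuming the earlier command left one address per line with a trailing comma. The file names (with-commas.txt, one-line.txt, emails.txt, emails-list.txt) and the addresses are made up for illustration.

```shell
# Hypothetical intermediate output: one address per line, each ending in a comma.
printf '%s\n' 'bob@example.com,' 'alice@example.com,' > with-commas.txt

# Clip the comma at the end of each line; each address stays on its own line.
sed 's/,$//' with-commas.txt > emails.txt

# If the addresses sit on a single line instead, turning every comma into a
# newline with tr gives the same kind of list.
printf '%s' 'bob@example.com,alice@example.com,' > one-line.txt
tr ',' '\n' < one-line.txt > emails-list.txt

cat emails.txt
```

The `$` in the sed expression anchors the match to the end of the line, so commas elsewhere in a line would be left alone.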
I'm sure there are even other ways, as other users more acquainted with Bash will surely point out.
Hi, I think I might have the solution to your problem. I have found that your best friends in the Linux world for text problems like this are grep, cut, and awk.
grep searches a text file and prints the whole lines that match a pattern.
cut is often used together with grep to split a line into fields; it is the one I used here. awk does this and a whole bunch more. It takes a bit of practice but is a lifesaver in the end.
The way I would do this is to separate by either the colon : or the comma , first.
1. cut -d: -f5 textfile
The -d sets the delimiter, or separator. In this case it is a : but it could be any single character.
The -f selects which field to print, counting the pieces between the delimiters. Splitting on :, "Name" is field one, "Bob,PhoneNumber" is two, "12345,MobileNumber" is three, "1939493,Email" is four, "***@**.com," is five, etc.
This will return ***@**.com,
The problem is that the emails still have a trailing comma, so we will use cut again, this time with the comma as the delimiter.
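The two cut steps can be chained with a pipe. This is only a sketch: the sample lines below are invented to match the layout described in this post (Name, PhoneNumber, MobileNumber, Email, with a trailing comma), and emails.txt is a hypothetical output name. The field number passed to -f depends entirely on where the address sits in your own file.

```shell
# Hypothetical sample data; adjust to the real file layout.
cat > textfile <<'EOF'
Name:Bob,PhoneNumber:12345,MobileNumber:1939493,Email:bob@example.com,
Name:Alice,PhoneNumber:67890,MobileNumber:2848584,Email:alice@example.com,
EOF

# First cut: split on ':' and keep the field that holds the address
# (the fifth one in this sample layout), which still ends in a comma.
# Second cut: split on ',' and keep only what comes before the first comma.
cut -d: -f5 textfile | cut -d, -f1 > emails.txt

cat emails.txt
```

Because cut counts fields purely by delimiter, a line with an extra or missing key would shift the address into a different field, so this approach suits files where every line has exactly the same layout.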
I've tried such things as grep, but grep could only print the whole lines, and that's about as near as I could get.
Not really true. Look at the -o option: it will print only the matching string. If more matches are on the same line, they will be printed on separate lines.
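For instance, a rough sketch with grep -o. The file name contacts.txt, the sample lines, and the pattern are all assumptions for illustration; the regular expression is a deliberately loose approximation of an email address, not a full validator.

```shell
# Hypothetical sample data; the last line holds two addresses.
cat > contacts.txt <<'EOF'
Name:Bob,PhoneNumber:12345,MobileNumber:1939493,Email:bob@example.com,
Name:Alice,PhoneNumber:67890,MobileNumber:2848584,Email:alice@example.com,
Name:Carol,Email:carol@example.org,AltEmail:carol@work.example.net,
EOF

# -o prints only the text that matched, one match per output line.
# -E switches to extended regular expressions so + and {2,} work unescaped.
grep -oE '[[:alnum:]._%+-]+@[[:alnum:].-]+\.[[:alpha:]]{2,}' contacts.txt > emails.txt

cat emails.txt
```

Neither ':' nor ',' is in the character classes used by the pattern, so the key names and the trailing commas are left out of each match.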
In a more general case, where the file format is strictly the one you've posted in the OP (that is, CSV lines made of key/value pairs), you can try something like this:
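One possible sketch of that idea in awk, assuming each line is a comma-separated list of Key:Value pairs with no spaces around the separators and with the key spelled exactly "Email". The file names contacts.txt and emails.txt and the sample lines are hypothetical.

```shell
# Hypothetical sample data: comma-separated Key:Value pairs, keys in any order.
cat > contacts.txt <<'EOF'
Name:Bob,PhoneNumber:12345,MobileNumber:1939493,Email:bob@example.com,
Email:alice@example.com,Name:Alice,PhoneNumber:67890,
EOF

# -F, makes awk split each line into fields at the commas.
awk -F, '{
    for (i = 1; i <= NF; i++) {
        pos = index($i, ":")            # position of the first colon, 0 if none
        if (pos > 0 && substr($i, 1, pos - 1) == "Email")
            print substr($i, pos + 1)   # everything after the colon is the value
    }
}' contacts.txt > emails.txt

cat emails.txt
```

Unlike counting fields with cut, this looks the value up by its key, so it keeps working when the pairs appear in a different order from line to line.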