LinuxQuestions.org
Old 10-21-2011, 04:52 AM   #1
zonah12
LQ Newbie
 
Registered: Oct 2011
Posts: 5

Rep: Reputation: Disabled
grep multiple words and linking with other file


Just an example: I have two files. One file (top.txt) contains 1000 random words that I want to use. The second file (All.txt) contains 10000 words and their meanings in two fields. What I want is to take the words grepped from top.txt and get the meanings of all of those words from All.txt. If I use the command % grep -e "(foul|.....|zeal) top.txt I grep all the words, but how do I then compare them with the other file? Kindly let me know.
 
Old 10-21-2011, 05:58 AM   #2
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,698

Rep: Reputation: 1988
I think you will need to demonstrate with some small examples, as I do not follow what you require.
 
Old 10-21-2011, 06:53 AM   #3
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1950
As grail said. Vague questions tend to get you vague answers, and one concrete example is worth a thousand lines of explanation. Show us a sample of each file, at least, along with what you want the output to look like.

But in any case, it's possible to use a file as a collection of patterns to search for. As long as the first file has only a single search word per line, you can try this:

Code:
grep -f top.txt All.txt
You can also use other options like -F, to search for fixed strings only, and -i to make them case-insensitive. See the grep man and info pages for more options.
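As a minimal sketch of the options mentioned above, the following uses a few made-up sample lines (the "jargon ns99" entry is hypothetical and only there to show a false positive). The -w option is not mentioned in the post; it is added here on the assumption that only whole-word matches are wanted.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' Foul Tall jar > "$workdir/top.txt"
printf '%s\n' 'system ns01' 'foul ns07' 'jar ns13' 'jargon ns99' 'tall ns21' > "$workdir/All.txt"

# -f top.txt : read one pattern per line from top.txt
# -F         : treat each pattern as a fixed string, not a regex
# -i         : ignore case, so "Foul" matches "foul"
# -w         : match whole words only, so "jar" does not match "jargon"
matches=$(grep -F -i -w -f "$workdir/top.txt" "$workdir/All.txt")
printf '%s\n' "$matches"
```

The matching lines come out in the order they appear in All.txt, not in the order of top.txt.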
 
Old 10-21-2011, 07:12 AM   #4
crts
Senior Member
 
Registered: Jan 2010
Posts: 1,604

Rep: Reputation: 446
My guess would be that the OP wants to grep for a certain pattern(s) in top.txt first and then use the results to grep them in All.txt. Some sort of cascaded (?) filtering process?
Code:
grep -E "(foul|.....|zeal)" top.txt > tmpfile
grep -f tmpfile All.txt
@OP: As stated before please provide an example with some representative sample data.

Last edited by crts; 10-21-2011 at 09:01 AM. Reason: corrected Syntax
 
Old 10-21-2011, 08:12 AM   #5
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1950
Well if that's the case, then we could also use a process substitution as the "file" to search with.

http://mywiki.wooledge.org/ProcessSubstitution

Code:
grep -f <( grep -E -e "(foul|.....|zeal)" top.txt ) All.txt
Not that it really changes anything from the above other than bypassing the need for a tempfile. It still requires two grep processes. If we knew more about the actual requirements, perhaps we could even come up with a single-step solution.

Also be aware that process substitution is not part of POSIX sh; it is an extension found in bash (and also in ksh and zsh).

PS: You need grep -E/egrep for a complex regex like that.
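For shells without process substitution, one possible alternative (not suggested in the thread) is to pipe the first grep into the second and pass "-" as the pattern file, which GNU grep reads as standard input. The sketch below assumes GNU grep, and uses a shortened, hypothetical word list in place of the elided alternation from the original command.

```shell
# Hypothetical sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' foul tall zeal blot > "$workdir/top.txt"
printf '%s\n' 'foul ns07' 'blot ns10' 'zeal ns30' 'tall ns21' > "$workdir/All.txt"

# Two-step version: write the selected words to a temporary file first.
grep -E '^(foul|zeal)$' "$workdir/top.txt" > "$workdir/tmpfile"
with_tmpfile=$(grep -f "$workdir/tmpfile" "$workdir/All.txt")

# Pipeline version: "-f -" tells GNU grep to read the patterns from stdin,
# so neither a tempfile nor process substitution is needed.
with_pipe=$(grep -E '^(foul|zeal)$' "$workdir/top.txt" | grep -f - "$workdir/All.txt")

printf '%s\n' "$with_pipe"
```

Both variants still run two grep processes; only the way the pattern list is handed over differs.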

Last edited by David the H.; 10-21-2011 at 08:14 AM.
 
Old 10-21-2011, 09:03 AM   #6
crts
Senior Member
 
Registered: Jan 2010
Posts: 1,604

Rep: Reputation: 446
Quote:
Originally Posted by David the H. View Post
PS: You need grep -E/egrep for a complex regex like that.
Thanks for the hint. I also noticed that I forgot to close the quote in my previous post. I simply copy+pasted that part from the OP's solution without further examining it. Corrected it now.
 
Old 10-26-2011, 02:47 AM   #7
zonah12
LQ Newbie
 
Registered: Oct 2011
Posts: 5

Original Poster
Rep: Reputation: Disabled
grep multiple words

Thank you so much for the solutions, but it's not working. I will try to elaborate by giving more examples.
File TOP.txt:

Foul
Tall
blot
grail
House
System
Galaxy
jar
trophy
laptop

This file contains 10 words.

Second file, all.txt:


system ns01
broad ns02
house ns03
laptop ns04
trophy ns05
ginger ns06
foul ns07
dustbin ns08
mugs ns09
blot ns10
pack ns11
butter ns12
jar ns13
knife ns14
kangroo ns15
galaxy ns16
kind ns17
heart ns18
grail ns19
short ns20
tall ns21
table ns22
chair ns23
blot ns24
onion ns25
foul ns26

This file contains 26 words with their codes. What I want to do is relate the words in the top file to their codes in the all.txt file, omitting the words that are not present in the top file. That is, I want the result to look like this:
Result
Foul ns26
Tall ns21
blot ns24
grail ns19
House ns03
System ns01
Galaxy ns16
jar ns13
trophy ns05
laptop ns04

Two things are important: first, the words are not arranged alphabetically; second, the matching should not be case-sensitive, that is, a word might be in lowercase in the top file but capitalized in the all file.
I hope I have given a clear example now.
 
Old 10-26-2011, 05:19 AM   #8
crts
Senior Member
 
Registered: Jan 2010
Posts: 1,604

Rep: Reputation: 446
Hi,

Your example still needs a bit more explanation. What happens with multiple matches? Do you want to keep the first match or the last match? Or maybe something else? Your sample output suggests that you want to print only the last match.
Code:
grep -i -f top.txt all.txt | tac | awk '(a[$1]++ == 0) {print}'
However, the order is not the same as in your sample. If you wish to keep the first match:
Code:
grep -i -f top.txt all.txt | awk '(a[$1]++ == 0) {print}'
And to keep all matches:
Code:
grep -i -f top.txt all.txt
If none of the above works then you need to provide some more criteria for the filtering process.
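If the output should follow the order and capitalization of the top file, as in the sample result, a single awk pass over both files is one possible approach. This is only a sketch on a trimmed-down copy of the sample data, and it assumes the first match should win when a word appears twice; the array name code is arbitrary.

```shell
# Trimmed-down sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' Foul Tall blot grail House > "$workdir/top.txt"
printf '%s\n' 'system ns01' 'house ns03' 'foul ns07' 'blot ns10' \
    'grail ns19' 'tall ns21' 'blot ns24' > "$workdir/all.txt"

result=$(awk '
    # First file (all.txt): remember the first code seen for each lowercased word.
    NR == FNR { key = tolower($1); if (!(key in code)) code[key] = $2; next }
    # Second file (top.txt): print the word as written there, plus its code if known.
    { key = tolower($1); if (key in code) print $1, code[key] }
' "$workdir/all.txt" "$workdir/top.txt")
printf '%s\n' "$result"
```

To keep the last match instead, the assignment could be made unconditional (code[key] = $2), so later lines overwrite earlier ones.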
 
Old 11-02-2011, 12:28 AM   #9
zonah12
LQ Newbie
 
Registered: Oct 2011
Posts: 5

Original Poster
Rep: Reputation: Disabled
hi still not working

Hi,
It's still not working. No, I don't have multiple matches; each word appears once in all.txt with a single code, and there is no repetition in either top.txt or all.txt. Since I am a new user, please also guide me on how to use the commands, if possible. Thank you so much for your answers.
 
Old 11-02-2011, 01:59 AM   #10
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.6, Centos 5.10
Posts: 16,324

Rep: Reputation: 2041
Quote:
Its still not working.
... in what way? We need exact details/example
 
Old 11-02-2011, 02:44 AM   #11
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,698

Rep: Reputation: 1988
So to confirm, the data you provided is wrong as it does have duplicates?
Code:
system ns01
broad ns02
house ns03
laptop ns04 
trophy ns05
ginger ns06
foul ns07
dustbin ns08
mugs ns09
blot ns10 
pack ns11
butter ns12
jar ns13 
knife ns14 
kangroo ns15
galaxy ns16 
kind ns17
heart ns18
grail ns19
short ns20 
tall ns21
table ns22
chair ns23 
blot ns24
onion ns25 
foul ns26
 
Old 11-17-2011, 11:29 PM   #12
zonah12
LQ Newbie
 
Registered: Oct 2011
Posts: 5

Original Poster
Rep: Reputation: Disabled
reply

system ns01
broad ns02
house ns03
laptop ns04
trophy ns05
ginger ns06
foul ns07
dustbin ns08
mugs ns09
blot ns10
pack ns11
butter ns12
jar ns13
knife ns14
kangroo ns15
galaxy ns16
kind ns17
heart ns18
grail ns19
short ns20
tall ns21
table ns22
chair ns23
blot ns24
onion ns25
jacket ns26

I am sorry about that. I have changed the repeated word; if there is any more repetition, I wish to keep the first match. I tried the commands, but no codes appeared after my words in the top.txt file, and there were no errors. I checked the all.txt file as well, but it remained the same, and no new file was created either.
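On the point about no file changing: grep only prints matching lines to the terminal and never modifies its input files, so a result file appears only if the output is redirected. The sketch below uses a hypothetical output name, result.txt, and a small sample. Note also that Linux file names are case-sensitive, so TOP.txt and top.txt are different files.

```shell
# Small sample data in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' Foul jar > "$workdir/top.txt"
printf '%s\n' 'system ns01' 'foul ns07' 'jar ns13' > "$workdir/all.txt"

# grep itself only prints; the shell redirection ">" is what creates result.txt.
# top.txt and all.txt are left untouched.
grep -i -w -f "$workdir/top.txt" "$workdir/all.txt" > "$workdir/result.txt"

# Show what was saved.
cat "$workdir/result.txt"
```

Without the redirection, the same lines would simply scroll past on the screen.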
 
  

