LinuxQuestions.org
Old 12-13-2019, 11:09 AM   #1
Ikebukuro
LQ Newbie
 
Registered: Dec 2019
Posts: 4

Rep: Reputation: Disabled
Using grep to compare two files and filter on keywords


Hello experts,

I have a problem with two files.
File 1: one word per line; each word is the name of an Oracle table.
File 2: many words per line; it contains the SELECT * statements from Oracle views.

I need to find all the words in file 1 that are not in file 2.
I tried with grep but I failed...

Can you tell me how to do it?

Here is an extract of file 1. There are many space characters at the end of each line; they have to be ignored in the search.
Quote:
WRH$_ACTIVE_SESSION_HISTORY
WRH$_ACTIVE_SESSION_HISTORY_BL
WRH$_ASM_BAD_DISK
WRH$_ASM_DISKGROUP
WRH$_ASM_DISKGROUP_STAT
WRH$_BG_EVENT_SUMMARY
Here is a simplified and modified extract of file 2.
PHP Code:
SELECT from WRH$_ACTIVE_SESSION_HISTORY                                                                                                                                                                             
SELECT 
from WRH$_ASM_BAD_DISK                                                                                                                                                                                       
SELECT col12
col34 from WRH$_ASM_DISKGROUPWRH$_ASM_BAD_DISK etc etc                                                                                                                                                                                      
SELECT 
from WRH$_ASM_DISKGROUP_STAT ORDER BY 1 
The result I want to see:
Quote:
WRH$_ACTIVE_SESSION_HISTORY_BL
WRH$_BG_EVENT_SUMMARY
Sorry, I don't see where I can attach a whole file (is it possible?). I don't want to copy ALL the content of my files, they are too big...
And sorry if I made mistakes in my first post here...

Have a very nice day.

Last edited by Ikebukuro; 12-13-2019 at 11:20 AM.
 
Old 12-13-2019, 11:15 AM   #2
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,312
Blog Entries: 3

Rep: Reputation: 3722
You're probably going to need an awk or perl script with an associative array. Read the shorter word list into the array and then run the second, larger file through it.
 
Old 12-13-2019, 11:17 AM   #3
Farcrada
LQ Newbie
 
Registered: Dec 2019
Posts: 4

Rep: Reputation: Disabled
You could write a small Python script/program that takes two files as arguments and spits out a difference file and a file with everything that matches. I feel like that would be the best short-term solution, assuming you need to do this for the mentioned big file(s).
 
Old 12-13-2019, 11:21 AM   #4
Ikebukuro
LQ Newbie
 
Registered: Dec 2019
Posts: 4

Original Poster
Rep: Reputation: Disabled
Thank you for your answers, but I am an Oracle DBA, not a Linux expert... I don't know how to use Python or Perl or Awk :-(
 
Old 12-13-2019, 11:34 AM   #5
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,312
Blog Entries: 3

Rep: Reputation: 3722
No time like the present. It's pretty much not possible to do system administration without periodically needing some awk or perl or (maybe) python.

AWK is a full language and would take ages to master but getting an idea of the basics can take but a few minutes:

https://www.grymoire.com/Unix/Awk.html

AWK in its most elementary form is just a bunch of abbreviated if-then statements:

Code:
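awk '
# First file (file1.txt): remember every table name as an array key.
NR==FNR {
        a[$1]++;
        next;
}

# Second file (file2.txt): if field 4 is a known table name, forget it.
# "in" is used because a bare a[$4] test would create an empty entry.
$4 in a {
        delete a[$4];
}

# After all input: whatever is left was never seen in file2.txt.
END {
        for (i in a) {
                print i;
        }
}
' file1.txt file2.txt | sort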
$1, $4, NR, and FNR are built-in variables. END {} is a clause which gets run once after there is no more input.
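
To see why NR==FNR singles out the first file, it can help to print both counters. A small sketch with made-up sample files (one.txt and two.txt are placeholders, not files from this thread): NR keeps counting across all input, while FNR restarts at 1 for each file, so the two are equal only while the first file is being read.

```shell
#!/bin/sh
# Throwaway sample files, invented for this illustration.
dir=$(mktemp -d)
printf '%s\n' 'alpha' 'beta' > "$dir/one.txt"
printf '%s\n' 'gamma' > "$dir/two.txt"

# Print the counters for every input line; first_file is 1 only
# while NR and FNR are still equal, i.e. during the first file.
show_counters() {
    awk '{ print FILENAME ": NR=" NR " FNR=" FNR " first_file=" (NR == FNR) }' "$@"
}

show_counters "$dir/one.txt" "$dir/two.txt"
```

One caveat with this idiom: if the first file is empty, NR==FNR is also true for the second file.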
 
Old 12-13-2019, 02:04 PM   #6
Ikebukuro
LQ Newbie
 
Registered: Dec 2019
Posts: 4

Original Poster
Rep: Reputation: Disabled
Thank you Turbocapitalist for Awk, I had forgotten that I have a book about it...
Tomorrow I will read it, I think it will be very useful to solve my problem.
 
Old 12-13-2019, 10:28 PM   #7
Turbocapitalist
LQ Guru
 
Registered: Apr 2005
Distribution: Linux Mint, Devuan, OpenBSD
Posts: 7,312
Blog Entries: 3

Rep: Reputation: 3722
Having the pattern in an unpredictable place on each line and possibly multiple times means you'll probably have to work out something with the match() function or similar.
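
One possible alternative to match() is to test every field of each line instead of only $4. This is a minimal sketch, assuming the table names in the second file are separated from other text by whitespace (a name glued to a comma or to another name would not be recognised). The function name missing_tables and the sample data are made up for illustration.

```shell
#!/bin/sh
# Sample data invented for this sketch; note the trailing spaces in tables.txt.
dir=$(mktemp -d)
printf '%s\n' 'WRH$_ASM_BAD_DISK   ' 'WRH$_ASM_DISKGROUP  ' 'WRH$_BG_EVENT_SUMMARY ' > "$dir/tables.txt"
printf '%s\n' 'SELECT col12' 'col34 from WRH$_ASM_DISKGROUP WRH$_ASM_BAD_DISK etc' > "$dir/queries.txt"

# Print every name from the first file that never shows up
# as a whitespace-separated field anywhere in the second file.
missing_tables() {
    awk '
    # First file: store each name (field splitting drops trailing spaces).
    NR == FNR { if (NF) a[$1] = 1; next }
    # Second file: check every field, not just one fixed position.
    { for (f = 1; f <= NF; f++) if ($f in a) delete a[$f] }
    # Whatever is left was never seen.
    END { for (i in a) print i }
    ' "$1" "$2" | sort
}

missing_tables "$dir/tables.txt" "$dir/queries.txt"
```

With the sample data above, only WRH$_BG_EVENT_SUMMARY is left over.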
 
Old 12-14-2019, 03:07 PM   #8
MadeInGermany
Senior Member
 
Registered: Dec 2011
Location: Simplicity
Posts: 2,797

Rep: Reputation: 1201
Unfortunately, the following simple word grep returns the non-matching lines from file2, not the missing names from file1:
Code:
grep -vwf file1 file2
And the reverse, with the two files swapped, won't match as words, because each whole line of file2 is then used as one pattern.
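
The reverse direction might be made to work by first splitting file2 into one word per line, so that whole-line matching (-x) takes over the job of -w. This is a sketch under the assumption that names in file2 are delimited by whitespace; the function name not_in_file2 and the sample data are placeholders.

```shell
#!/bin/sh
# Sample data invented for this sketch; file1 has trailing spaces on purpose.
dir=$(mktemp -d)
printf '%s\n' 'WRH$_ASM_DISKGROUP   ' 'WRH$_ASM_DISKGROUP_STAT ' 'WRH$_BG_EVENT_SUMMARY  ' > "$dir/file1"
printf '%s\n' 'SELECT' 'from WRH$_ASM_DISKGROUP_STAT ORDER BY 1' > "$dir/file2"

# Usage: not_in_file2 FILE1 FILE2
not_in_file2() {
    words=$(mktemp)
    # One word of file2 per line, duplicates removed.
    tr -s '[:space:]' '\n' < "$2" | sort -u > "$words"
    # Strip trailing blanks from file1, then drop every line that is
    # exactly equal (-x) to one of the fixed-string (-F) words.
    sed 's/[[:space:]]*$//' "$1" | grep -vxFf "$words"
    status=$?
    rm -f "$words"
    return "$status"
}

not_in_file2 "$dir/file1" "$dir/file2"
```

With the sample data this prints WRH$_ASM_DISKGROUP and WRH$_BG_EVENT_SUMMARY, since only WRH$_ASM_DISKGROUP_STAT occurs in file2.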

Last edited by MadeInGermany; 12-14-2019 at 03:19 PM. Reason: Does not work
 
Old 12-14-2019, 03:44 PM   #9
boughtonp
Senior Member
 
Registered: Feb 2007
Location: UK
Distribution: Debian
Posts: 3,603

Rep: Reputation: 2546
Doing this with grep isn't difficult:
Code:
while read -r line; do
	grep -wFq "$line" queries.txt || echo "$line"
done < tables.txt
tables.txt is your file 1 - the list of tables (needles) to search for, with $line being each one.
queries.txt is your file 2 - the queries file (haystack) to search within.

The first and last lines are for looping through the file (for variations see BashFAQ/001).

Grep flags:
-w matches whole words only, so WRH$_ASM_DISKGROUP is not found inside WRH$_ASM_DISKGROUP_STAT
-F is for Fixed strings - i.e. disables regex matching
-q is for quiet - i.e. don't output when matches found (we want the opposite)

The || is so that when grep doesn't match, the line searched for is then output.
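
For reuse, the same loop could be wrapped in a function with a few guards. This is a sketch, not a replacement for the code above: the function name find_missing and the sample files are made up. With the default IFS, read already trims the trailing spaces mentioned in the first post, and -e keeps a name that starts with a dash from being taken as an option.

```shell
#!/bin/sh
# Sample data invented for this sketch: trailing spaces, an empty line,
# and a last line without a final newline.
dir=$(mktemp -d)
printf '%s\n' 'WRH$_ASM_DISKGROUP   ' '' 'WRH$_ASM_DISKGROUP_STAT' > "$dir/tables.txt"
printf '%s' 'WRH$_BG_EVENT_SUMMARY  ' >> "$dir/tables.txt"
printf '%s\n' 'SELECT' 'from WRH$_ASM_DISKGROUP_STAT ORDER BY 1' > "$dir/queries.txt"

# Usage: find_missing TABLES_FILE QUERIES_FILE
find_missing() {
    # The extra test after || handles a last line with no trailing newline.
    while read -r line || [ -n "$line" ]; do
        # Skip empty lines: an empty pattern would match every line.
        [ -z "$line" ] && continue
        grep -wFq -e "$line" "$2" || printf '%s\n' "$line"
    done < "$1"
}

find_missing "$dir/tables.txt" "$dir/queries.txt"
```

Note that grep is started once per table name, which is fine for a few thousand names but may become slow for very large lists.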

Last edited by boughtonp; 12-14-2019 at 03:50 PM.
 
1 members found this post helpful.
Old 12-15-2019, 06:36 AM   #10
Ikebukuro
LQ Newbie
 
Registered: Dec 2019
Posts: 4

Original Poster
Rep: Reputation: Disabled
boughtonp, you are GREAT!
It works

Thank you everybody for your help, I was reading a book about Awk but I see we can manage my problem with grep!

Have a nice day :-)
 
Old 12-15-2019, 07:47 PM   #11
frankbell
LQ Guru
 
Registered: Jan 2006
Location: Virginia, USA
Distribution: Slackware, Ubuntu MATE, Mageia, and whatever VMs I happen to be playing with
Posts: 19,328
Blog Entries: 28

Rep: Reputation: 6142
See man diff.
 
  


All times are GMT -5.