LinuxQuestions.org
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Old 03-28-2011, 10:40 AM   #1
s_linux
Member
 
Registered: Jul 2009
Posts: 83

Rep: Reputation: 15
script to compare users in files


All,
I have two files with user DNs that were exported from two different LDAP directories. I want to write a script that reads each user (cn=user1) in file A, checks whether that user (cn=user1) exists in file B, and gives me clean output listing the users that are missing from file B.
I have around 30k users in file A, in the following format:
Quote:
cn=user1,ou=some,o=org
cn=user2,ou=some,o=org
cn=user3,ou=some,o=org
cn=user4,ou=some,o=org
-
-
-
etc
File B uses the same format.
Does anyone have an idea how I can do that with a shell script?
Thanks
 
Old 03-28-2011, 11:55 AM   #2
rtmistler
Moderator
 
Registered: Mar 2011
Location: USA
Distribution: MINT Debian, Angstrom, SUSE, Ubuntu, Debian
Posts: 9,882
Blog Entries: 13

Rep: Reputation: 4930
See this thread to learn how to write a loop in your script. Then pass each line you read from file A to a grep command run against file B, using -c to get a count of matches.

http://www.linuxquestions.org/questi...aratly-364259/

To do something this complex, I'd save the user name from file A and the "found count" from file B into an array defined in your script, and then process that array and create an output file using the entries where the found count is zero on a per-user basis.

A suggestion is to add:

set -x
set -v

Near the top of your script to trace its flow (the trace is written to stderr) in order to debug it, and then later comment out those lines as you put the script into use.

Further, use functions instead of writing one big, hard to read script. If you aren't familiar with functions in a script, search for some examples, there are plenty.
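
A minimal sketch of the loop described above, assuming the sample data from the first post. The file names fileA and fileB and the function name count_in_file are made up for illustration. It skips the array and writes each zero-count DN straight to an output file, and it compares whole lines, so it only works when the DNs are identical in both files.

```shell
# Sample data (hypothetical file names) in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 'cn=user1,ou=some,o=org' 'cn=user2,ou=some,o=org' \
              'cn=user3,ou=some,o=org' 'cn=user4,ou=some,o=org' > "$workdir/fileA"
printf '%s\n' 'cn=user1,ou=some,o=org' 'cn=user2,ou=some,o=org' \
              'cn=user4,ou=some,o=org' > "$workdir/fileB"

# Print how many lines of file $2 are exactly equal to the string $1.
# -F: literal string, -x: whole line, -c: count.
# grep exits non-zero when the count is 0; that is not an error here.
count_in_file() {
    grep -c -x -F -e "$1" "$2" || true
}

# Read fileA line by line and keep the lines that fileB does not contain.
while IFS= read -r line; do
    count=$(count_in_file "$line" "$workdir/fileB")
    if [ "$count" -eq 0 ]; then
        printf '%s\n' "$line"
    fi
done < "$workdir/fileA" > "$workdir/missing"

cat "$workdir/missing"
```

This starts one grep per line of file A, so with around 30k users it will be slow; it is meant to show the flow, not to be fast.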
 
Old 04-01-2011, 04:11 PM   #3
s_linux
Member
 
Registered: Jul 2009
Posts: 83

Original Poster
Rep: Reputation: 15
Thanks.
I know how a "for ... do something" loop works. But what I need is:

for line in filea;do
get the "cn=user1"
and store in a variable
then
check fileb to see that variable exists
if not write the whole dn to filec.

But I'm not sure how to get only the cn value, store it in a variable, and then check whether that cn value exists in fileb.

Some of my users have a space in the cn, like cn=user one,ou=some,o=org
Thanks again..
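
The steps listed above could be sketched roughly as follows. This assumes the cn is always the first component of the DN and never contains an escaped comma. The sample data is invented to match the format in this thread.

```shell
# Sample data: same users, different DN context in each file.
workdir=$(mktemp -d)
printf '%s\n' 'cn=user one,ou=some,o=org' 'cn=user2,ou=some,o=org' \
              'cn=user3,ou=some,o=org' > "$workdir/filea"
printf '%s\n' 'cn=user one,ou=other,o=corp' \
              'cn=user3,ou=other,o=corp' > "$workdir/fileb"

while IFS= read -r dn; do
    # Strip everything from the first comma onwards: "cn=user one,ou=..." -> "cn=user one"
    cn=${dn%%,*}
    # -q: no output, only an exit status; -F: treat the pattern as a literal string.
    # The trailing comma stops cn=user1 from matching cn=user10.
    if ! grep -q -F -e "$cn," "$workdir/fileb"; then
        printf '%s\n' "$dn"      # not found: keep the whole DN
    fi
done < "$workdir/filea" > "$workdir/filec"

cat "$workdir/filec"
```

Reading with while and read keeps "cn=user one" in one piece, and quoting "$cn," passes it to grep as a single argument.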
 
Old 04-01-2011, 04:22 PM   #4
s_linux
Member
 
Registered: Jul 2009
Posts: 83

Original Poster
Rep: Reputation: 15
Also, I just started writing the script:

Quote:
#!/bin/bash
#set -x

if [ -s filea ];then
for line in $(< filea);do
echo $line
linea=`$line |grep -e ".*CN=([0-9A-Za-z]+),*"`
echo "$linea"
done
fi
I just wanted to see what value is in the "line" variable. I'm getting output like this:

Quote:
cn=usera
one,ou=some,ou=some,o=org
cn=userb
one,ou=some,ou=some,o=org
When I run the script, I see the "linea" variable as below:
cn=usera
cn=userb

but I want to get "cn=usera one".
I'm not sure if the regular expression works; I'm still testing.
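
For reference, a small sketch of one way to pull out the cn with a regular expression, assuming the cn is the first component of the line. Three things differ from the attempt above: the line is fed to grep through printf (a bare $line in backticks would be run as a command), extended regular expressions are switched on with -E so that + is an operator, and the character class allows anything except a comma, so spaces are kept. Note that -o is not in POSIX, but GNU and BSD grep both support it.

```shell
line='cn=usera one,ou=some,ou=some,o=org'

# -E: extended regex, -i: match "cn" or "CN", -o: print only the matched text.
linea=$(printf '%s\n' "$line" | grep -o -i -E '^cn=[^,]+')

echo "$linea"    # prints: cn=usera one
```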
 
Old 04-01-2011, 04:50 PM   #5
Nominal Animal
Senior Member
 
Registered: Dec 2010
Location: Finland
Distribution: Xubuntu, CentOS, LFS
Posts: 1,723
Blog Entries: 3

Rep: Reputation: 948
Please tell me this is not homework. (No, I'm serious.)
Quote:
Originally Posted by s_linux
Code:
cn=user1,ou=some,o=org
cn=user2,ou=some,o=org
cn=user3,ou=some,o=org
cn=user4,ou=some,o=org
I'd say awk would be a good match for this.
Code:
awk -v "file1=path-to-file1" -v "file2=path-to-file2" '
    BEGIN {
        RS="[\t\n\v\f\r ]*[\r\n]+[\t\n\v\f\r ]*"
        FS="[\t\v\f ]*[,][\t\v\f ]*"

        # Read first file into list1
        split("", list1)
        while ((getline < file1) > 0)
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[\t\v\f ]*[Cc][Nn][\t\v\f ]*=/) {
                    cn = tolower($i)
                    sub(/^[\t\v\f ]*[Cc][Nn][\t\v\f ]*=[\t\v\f ]*/, "", cn)
                    sub(/[\t\v\f ]*$/, "", cn)
                    list1[cn] = $0
                }

        # Read second file into list2
        split("", list2)
        while ((getline < file2) > 0)
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[\t\v\f ]*[Cc][Nn][\t\v\f ]*=/) {
                    cn = tolower($i)
                    sub(/^[\t\v\f ]*[Cc][Nn][\t\v\f ]*=[\t\v\f ]*/, "", cn)
                    sub(/[\t\v\f ]*$/, "", cn)
                    list2[cn] = $0
                }

        # List all users in list1 that do not exist in list2.
        for (cn in list1)
            if (!(cn in list2))
                printf("%s (%s)\n", cn, list1[cn])
    }'
In the above code, I tell awk that records are separated by newlines, and any leading or trailing whitespace is part of the separator. Fields are separated by commas, again with any surrounding whitespace treated as part of the separator.

I used two identical loops to read in the files. Each checks whether one of the fields is the common name field (cn=) and, if so, adds the entire record (as a string) to an associative array keyed by the value of the common name in lower case -- I assume you wish the comparison to be case insensitive. (If not, use $i instead of tolower($i).)

The sub commands remove the cn= part and any leading and trailing whitespace.

Finally, the script loops over all names in list1, and outputs the ones that are not listed in list2.

Note that unlike normal awk scripts, this one has no input files. It would have been pretty natural to read only the second user list in the BEGIN section, and use a normal rule to process each record in the first file; however, I think you'll probably want to do the check the other way too -- list all users that are listed in the second file but not the first -- and you can only do both if you read both into arrays. So I'm anticipating your needs a bit.

I hope this helps, but is not your homework,
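
For comparison, the "pretty natural" one-directional variant mentioned above might look like the sketch below. It is simplified: it assumes the cn is always the first comma-separated field and that there is no stray whitespace to trim. The file names and sample data are invented.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'cn=User One,ou=some,o=org' 'cn=user2,ou=some,o=org' \
              'cn=user3,ou=some,o=org' > "$workdir/file1"
printf '%s\n' 'cn=user one,ou=other,o=corp' \
              'cn=user3,ou=other,o=corp' > "$workdir/file2"

# file2 is named first so its names are loaded before file1 is checked.
awk -F',' '
    # While reading the first file named (file2), NR equals FNR:
    # remember each lower-cased cn field and skip to the next record.
    NR == FNR { seen[tolower($1)] = 1; next }
    # For the second file named (file1): print records whose cn was never seen.
    !(tolower($1) in seen)
' "$workdir/file2" "$workdir/file1" > "$workdir/missing"

cat "$workdir/missing"
```

The NR == FNR test misfires if the first file named is empty, and this form only reports one direction, which is the trade-off described above.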
 
Old 04-02-2011, 03:14 AM   #6
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
Could we not just use grep or comm?
Code:
#fileA
cn=user1,ou=some,o=org
cn=user2,ou=some,o=org
cn=user3,ou=some,o=org
cn=user4,ou=some,o=org

#fileB
cn=user1,ou=some,o=org
cn=user2,ou=some,o=org
cn=user4,ou=some,o=org
So we want what is in fileA but not in fileB:
Code:
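grep -v -x -F -f fileB fileA   # -F: literal strings, -x: whole-line matches only
comm -23 fileA fileB           # -23: keep only the lines unique to fileA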
Both will return the third line in fileA. The nice thing about the grep is that the data does not need to be sorted as it does with comm.
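
If the files are not already sorted, a sketch along these lines sorts copies first and then lets comm report each direction. The sample data and file names are invented. In comm, -23 suppresses columns 2 and 3, leaving lines found only in the first file, and -13 leaves lines found only in the second.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'cn=user3,ou=some,o=org' 'cn=user1,ou=some,o=org' \
              'cn=user4,ou=some,o=org' 'cn=user2,ou=some,o=org' > "$workdir/fileA"
printf '%s\n' 'cn=user4,ou=some,o=org' 'cn=user1,ou=some,o=org' \
              'cn=user5,ou=some,o=org' > "$workdir/fileB"

# comm expects both inputs in the same sort order, so sort copies first.
sort "$workdir/fileA" > "$workdir/fileA.sorted"
sort "$workdir/fileB" > "$workdir/fileB.sorted"

# Lines only in fileA, then lines only in fileB.
comm -23 "$workdir/fileA.sorted" "$workdir/fileB.sorted" > "$workdir/only_in_A"
comm -13 "$workdir/fileA.sorted" "$workdir/fileB.sorted" > "$workdir/only_in_B"

cat "$workdir/only_in_A"
cat "$workdir/only_in_B"
```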
 
Old 04-05-2011, 10:01 AM   #7
s_linux
Member
 
Registered: Jul 2009
Posts: 83

Original Poster
Rep: Reputation: 15
Thanks for your help.
This is NOT homework. I have been learning scripting, and I take every chance I can to write a script at my company. I have written a few so far. Now I have a requirement to compare a couple of files from time to time and make changes to the directory based on what the comparison finds.
Since I'm also learning, I don't want to use someone else's script. I want to write it myself so that I can learn and help others in the future.
So, back to my previous post of 04-01-2011, 04:22 PM: I'm not sure why a single line splits into two, three, or maybe more lines based on the spaces in between.
Quote:
cn=usera
one,ou=some,ou=some,o=org
cn=userb
one,ou=some,ou=some,o=org
cn=userb
one
test,ou=some,ou=some,o=org
but the actual data is:
Quote:
cn=usera one,ou=some,ou=some,o=org
cn=userb one,ou=some,ou=some,o=org
cn=userb one test,ou=some,ou=some,o=org
If I can get each record on a single line, I can put it in a variable, use a regular expression to get the cn (cn=*), and then use that cn value to compare against the second file.

Grail - I'm not sure your solution will work, because the DN context is entirely different between the two files in most cases. I will test it when I have similar DNs in both files.
 
Old 04-05-2011, 10:34 AM   #8
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3191
Quote:
I'm not sure why the single line splits into 2-3 or may be more line based on the spaces in between.
This one is easy: your for loop performs word splitting based on the value of IFS, which by default is whitespace, so each space-separated piece is passed individually into your line variable.
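
A small sketch of the difference, using invented sample data in the thread's format. The for loop iterates once per whitespace-separated word, while the while/read loop iterates once per line.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'cn=usera one,ou=some,o=org' \
              'cn=userb one test,ou=some,o=org' > "$workdir/filea"

# for + command substitution: the result is split on every space, tab and newline.
for_count=0
for word in $(cat "$workdir/filea"); do
    for_count=$((for_count + 1))
done

# while + read: one iteration per line. IFS= keeps leading and trailing
# blanks, and -r keeps backslashes literal.
read_count=0
while IFS= read -r line; do
    read_count=$((read_count + 1))
done < "$workdir/filea"

echo "for loop iterations:   $for_count"     # 5
echo "while loop iterations: $read_count"    # 2
```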

I am a little more with it now: I misread the question. The cn=userX part is what you need to look for, and the rest is irrelevant.
So how about:
Code:
grep -E -v "$(cut -d',' -f1 fileB | sed ':a N;s/\n/|/;ta')" fileA
Again this seems to work with the examples I provided previously.
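
A variation on the same idea, as a sketch only: write the cn values to a pattern file and let grep read them with -f. This keeps a list of around 30k names off the command line and copes with spaces in names. Appending a comma to each pattern means cn=user1 does not also match cn=user10, and -F treats the names as literal strings rather than regular expressions. It assumes the cn is the first component of every DN. The sample data and the patterns file name are invented.

```shell
workdir=$(mktemp -d)
printf '%s\n' 'cn=user1,ou=some,o=org' 'cn=user10,ou=some,o=org' \
              'cn=user one,ou=some,o=org' > "$workdir/fileA"
printf '%s\n' 'cn=user1,ou=other,o=corp' \
              'cn=user one,ou=other,o=corp' > "$workdir/fileB"

# One pattern per line: the first field of fileB with a trailing comma.
cut -d',' -f1 "$workdir/fileB" | sed 's/$/,/' > "$workdir/patterns"

# -v: print the lines of fileA that match none of the patterns.
grep -v -F -f "$workdir/patterns" "$workdir/fileA" > "$workdir/missing"

cat "$workdir/missing"
```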
 
  

