AWK: convert LDIF file Help Help Help !!!!!!!!

haydar68 · 07-31-2008, 01:29 PM

Hi,

I am having trouble using awk on an LDIF file generated from LDAP:

bb.ldif
dn: cn=10,ou=work,o=dom.com
cn: denis.vezina@dom.com
sn: vezina
givenName: denis
mail: denis.vezina@dom.com
displayName: denis vezina

dn: cn=20,ou=work,o=dom.com
cn: jean.paul@dom.com
sn: Paul
givenName: jean
mail: jean.paul@dom.com
displayName: jean paul

dn: cn=30,ou=work,o=dom.com
cn: isabelle.didi@dom.com
sn: didi
givenName: isabelle
mail: isabelle.didi@dom.com
displayName: isabelle didi

dn: cn=40,ou=work,o=dom.com
cn: alain.papa@dom.com
sn: papa
givenName: alain
mail: alain.papa@dom.com
displayName: alain papa

I use this awk file:

function print_user()
{
if(cn1 != "")
{
printf("%s:%s:%s:%s:%s:%s", cn1,cn2,sn1,givenName1,mail1,displayName1)
printf("\n")
}
}

BEGIN {
cn1 = ""
cn2 = ""
sn1 = ""
givenName1 = ""
mail1 = ""
displayName1 = ""

}
/^dn/ {cn1=$2}
/^cn/ {cn2=$2}
/^sn/ {sn1=$2}
/^givenName/ {givenName1=$2}
/^mail/ {mail1=$2}
/^displayName/ {displayName1=$2" " $3}
/^dn/ {
print_user()
cn1= ""
cn2 = ""
sn1 = ""
givenName1 = ""
mail1 = ""
displayName1 = ""
}
END {
print_user()
}

When I run this command:

awk -f book.awk bb.ldif >tata.txt

I get this wrong result:

cn=10,ou=work,o=dom.com:::::
cn=20,ou=work,o=dom.com::vezina:denis:denis.vezina@dom.com:denis vezina
cn=30,ou=work,o=dom.com::Paul:jean:jean.paul@dom.com:jean paul
cn=40,ou=work,o=dom.com::didi:isabelle:isabelle.didi@dom.com:isabelle didi

Something is wrong in my awk file. Can someone help me to fix this issue please?

thanks a lot,

Haydar

pixellany · 07-31-2008, 02:10 PM

I cannot see what the code is supposed to be doing. For example, you have several variables being set to the same passed argument ($2), but it's not obvious what gets passed as $2. (To find out, insert a print statement to just print out that variable.)

Why would you set several different variables to the same value?

Why would you not extract the data needed using AWK's field syntax?

haydar68 · 07-31-2008, 02:34 PM

Hi,

I want to convert this file:

bb.ldif
dn: cn=10,ou=work,o=dom.com
cn: denis.vezina@dom.com
sn: vezina
givenName: denis
mail: denis.vezina@dom.com
displayName: denis vezina

dn: cn=20,ou=work,o=dom.com
cn: jean.paul@dom.com
sn: Paul
givenName: jean
mail: jean.paul@dom.com
displayName: jean paul

dn: cn=30,ou=work,o=dom.com
cn: isabelle.didi@dom.com
sn: didi
givenName: isabelle
mail: isabelle.didi@dom.com
displayName: isabelle didi

dn: cn=40,ou=work,o=dom.com
cn: alain.papa@dom.com
sn: papa
givenName: alain
mail: alain.papa@dom.com
displayName: alain papa

to this format:
dn:cn:sn:givenName:Mail:displayName
cn=10,ou=work,o=dom.com:denis.vezina@dom.com:vezina:denis:denis.vezina@dom.com:denis vezina
cn=20,ou=work,o=dom.com:jean.paul@dom.com:Paul:jean:jean.paul@dom.com:jean paul
cn=30,ou=work,o=dom.com:isabelle.didi@dom.com:didi:isabelle:isabelle.didi@dom.com:isabelle didi
cn=40,ou=work,o=dom.com:alain.papa@dom.com:papa:alain:alain.papa@dom.com:alain papa

If you check my previous result, you will notice that the cn field is missing and that the first field does not match the rest of the data on the same line.

Is there a way to change the .awk file to fix this?

Thanks,
Haydar

jlliagre · 07-31-2008, 03:21 PM

Try this one (I renamed the variables, as the ones used were somewhat confusing):

Code:

function print_user()
{
  if(cn!="")
  {
    printf("%s:%s:%s:%s:%s:%s\n", dn,cn,sn,givenName,mail,displayName)
  }
}

BEGIN {
  dn = ""
  cn = ""
  sn = ""
  givenName = ""
  mail = ""
  displayName = ""
}
/^dn/ {ndn=$2}
/^cn/ {cn=$2}
/^sn/ {sn=$2}
/^givenName/ {givenName=$2}
/^mail/ {mail=$2}
/^displayName/ {displayName=$2" "$3}
/^dn/ {
  print_user()
  dn= ndn
  cn = ""
  sn = ""
  givenName = ""
  mail = ""
  displayName = ""
}
END {
  print_user()
}
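
A note on why the original script was off by one record: its first /^dn/ rule overwrote cn1 with the new entry's DN before the second /^dn/ rule had printed the previous entry, and that second rule then cleared cn1 again, so the last entry was never printed. Holding the new DN in ndn until print_user() has run avoids both problems. The same idea can also be written with a single dn rule that prints first and assigns afterwards. A cut-down sketch with only two attributes and inlined sample data (the names ldif_to_lines and emit are made up for this example):

```shell
# Hypothetical helper: reads LDIF on stdin, prints dn:cn:sn for each entry.
ldif_to_lines() {
  awk '
    function emit() { if (cn != "") print dn ":" cn ":" sn }
    /^dn:/ { emit(); dn = $2; cn = sn = ""; next }  # print previous entry first, then start the new one
    /^cn:/ { cn = $2 }
    /^sn:/ { sn = $2 }
    END    { emit() }                               # the last entry has no following dn line
  '
}

out=$(printf 'dn: cn=10,ou=work,o=dom.com\ncn: denis.vezina@dom.com\nsn: vezina\n\ndn: cn=20,ou=work,o=dom.com\ncn: jean.paul@dom.com\nsn: Paul\n' | ldif_to_lines)
printf '%s\n' "$out"
```

Anchoring the patterns with the colon (/^cn:/ rather than /^cn/) also keeps them from matching other attribute names that merely start with the same letters.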

ghostdog74 · 07-31-2008, 10:04 PM

Code:

awk 'BEGIN{ RS=""; FS=":"}{ print $2":"$4":"$6":"$8":"$10":"$12}' file

PTrenholme · 07-31-2008, 10:23 PM

I think that ghostdog's point is that ":" needs to be in your field separator (FS) list. I think you might also want to include a blank, but that could "mess up" the name parsing. Perhaps you'd want to strip leading and trailing blanks (if any) from $2.

While setting the record separator (RS) to null, as he suggests, will let you parse the whole set as a single record, I don't think his suggested program would scale very well to a long file.

ghostdog74 · 07-31-2008, 10:38 PM

Quote:

Originally Posted by PTrenholme

I don't think his suggested program would scale very well to a long file.

how long ?
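
For what it's worth, RS="" does not turn the whole file into a single record. It puts awk in paragraph mode, where every blank-line-separated block is its own record, so awk handles one entry at a time. A small check with two inlined entries (the variable name count is just for this example):

```shell
# Count the records awk sees when RS is empty: one per LDIF entry.
count=$(printf 'dn: cn=10,ou=work,o=dom.com\nsn: vezina\n\ndn: cn=20,ou=work,o=dom.com\nsn: Paul\n' |
  awk 'BEGIN { RS = "" } END { print NR }')
echo "$count"
```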

haydar68 · 07-31-2008, 10:43 PM

Thanks a lot jlliagre, it works fine.

haydar68 · 07-31-2008, 10:47 PM

Quote:

Originally Posted by ghostdog74

Code:

awk 'BEGIN{ RS=""; FS=":"}{ print $2":"$4":"$6":"$8":"$10":"$12}' file

Hi,

This command works, but the result has a space after each colon, as you can see here:

cn=01,ou=people,o=tt.ca: tata.toto@tt.ca: tata: toto: tata.toto@tt.ca: tata toto
cn=02,ou=people,o=tt.ca: tata2.toto2@tt.ca: tata2: toto2: tata2.toto2@tt.ca: tata2 totot2
cn=03,ou=people,o=tt.ca: tata3.toto3@tt.ca: tata3: toto3: tata3.toto3@tt.ca: tata3 totot3
cn=04,ou=people,o=tt.ca: tata4.toto4@tt.ca: tata4: toto4: tata4.toto4@tt.ca: tata4 totot4
cn=05,ou=people,o=tt.ca: tata5.toto5@tt.ca: tata5: toto5: tata5.toto5@tt.ca: tata5 totot5
cn=06,ou=people,o=tt.ca: tata6.toto6@tt.ca: tata6: toto6: tata6.toto6@tt.ca: tata6 totot6

Is there a way to drop this space in each field?

Thanks,

Haydar

pixellany · 07-31-2008, 11:10 PM

To drop all the spaces, just pipe the data through a sed command:

<existing code> | sed 's/ //g' > newfilename

Do you have a good book on shell scripting? You can get "Bash Guide for Beginners" free at http://tldp.org. Also, go here for really good tutorials:
http://www.grymoire.com/Unix/
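
One caveat: 's/ //g' removes every space, including the one inside displayName, so "tata toto" becomes "tatatoto". If only the space after each colon should go, a narrower substitution should be enough. A sketch using one line of the output above as input (the variable names line and cleaned are just for this example):

```shell
# Remove only the space that directly follows a colon; other spaces stay.
line='cn=01,ou=people,o=tt.ca: tata.toto@tt.ca: tata: toto: tata.toto@tt.ca: tata toto'
cleaned=$(printf '%s\n' "$line" | sed 's/: /:/g')
printf '%s\n' "$cleaned"
```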

ghostdog74 · 07-31-2008, 11:14 PM

just do the substitution in awk

Code:

awk 'BEGIN{ FS=":";RS=""; OFS=":"}
{
 gsub(": ",":")
 print $2,$4,$6,$8,$10,$12}' file

pixellany · 07-31-2008, 11:32 PM

Sigh......I guess I need to learn AWK one of these days......I feel SO inadequate...

haydar68 · 07-31-2008, 11:33 PM

Oh cool, it works fine, Thanks a lot guys

radoulov · 08-01-2008, 05:25 AM

Or:

Code:

awk -F': |\n' '{print $2,$4,$6,$8,$10,$12}' RS= OFS=: file
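
The positional one-liners in this thread assume that every entry lists its attributes in the same order and that no value contains ": ". If that is not guaranteed, the attributes can be looked up by name instead. A rough sketch under those assumptions (it ignores LDIF line folding and base64 "::" values, and the function name ldif_by_name is made up for this example):

```shell
# Hypothetical helper: reads LDIF on stdin, prints the six attributes by name.
ldif_by_name() {
  awk '
    BEGIN { RS = ""; FS = "\n"; OFS = ":" }   # one entry per record, one line per field
    {
      split("", v)                            # clear values left over from the previous entry
      for (i = 1; i <= NF; i++) {
        p = index($i, ": ")                   # the first ": " separates name and value
        if (p > 0) v[substr($i, 1, p - 1)] = substr($i, p + 2)
      }
      print v["dn"], v["cn"], v["sn"], v["givenName"], v["mail"], v["displayName"]
    }
  '
}

# The sample entry deliberately lists sn before cn.
result=$(printf 'dn: cn=10,ou=work,o=dom.com\nsn: vezina\ncn: denis.vezina@dom.com\ngivenName: denis\nmail: denis.vezina@dom.com\ndisplayName: denis vezina\n' | ldif_by_name)
printf '%s\n' "$result"
```

A missing attribute simply comes out as an empty field, so the column count stays the same for every entry.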

haydar68 · 08-01-2008, 08:38 AM

Yes it works fine. Thanks a lot