LinuxQuestions.org - substitute few words + change all the lines starting with a specific word + put blank

Hi All,

I have an old-address-list file with around 1500 entries. I need to convert these addresses into a specific format.

The old-address-list file:

# cat old-address-list-file
dn: CN=Muhammad Hadhi K.M,OU=IT Dept,OU=Example Company H.O,DC=example,DC=com
cn: Muhammad Hadhi K.M
sn: Hadhi
l: Calicut
title: IT Manager
postalCode: 5456
postOfficeBox: 3434
physicalDeliveryOfficeName: IT Department
telephoneNumber: +111 1 111111 Ext:111
facsimileTelephoneNumber: +222 2 222222
givenName: Muhammad
department: IT
company: Example Company Ltd.
proxyAddresses: SMTP:hadhi@example.com
streetAddress: P.O.Box 3434
mobile: +4444 4444 4444
dn: CN=Gheevar Scaria,OU=Civil,OU=MNM Site,OU=Example Company MNM,DC=example,DC=com
cn: Gheevar Scaria
sn: Scaria
l: Cochin
title: Civil Engineer
postalCode: 2323
postOfficeBox: 4545
physicalDeliveryOfficeName: ABC Office
telephoneNumber: +111 123456789 Ext:111
facsimileTelephoneNumber: +222 212121
givenName: Gheevar
department: Civil Department
company: Example Company Ltd.
proxyAddresses: SMTP:gheevar@mnm-example.com
streetAddress: Nut Street
mobile: +444 12211221
dn: CN=Siva Kumar KP,OU=Marketing,OU=RTZ Site,OU=Example Company RTZ,DC=example,DC=com
cn: Siva Kumar KP
sn: Kumar
l: Tirur
title: Marketing Manager
telephoneNumber: +111 987654321 Ext:999
givenName: Siva
department: Marketing
company: Example Company Ltd.
proxyAddresses: SMTP:siva@rtz-example.com

Some information about this file:
==============================/

1. Every address starts with a dn: line.
2. A single address entry therefore runs from one dn: line up to (but not including) the next dn: line.
3. The example above holds 3 valid addresses.
4. An address entry has at most 16 values (each line, including dn:, holds one value). The first 2 addresses in this example have all 16; the last one has only 10.
5. Some addresses may have only the 4 mandatory values (dn:, cn:, sn:, proxyAddresses: SMTP:), and others may have anything between 4 and 16 (it is not fixed).

The tasks to do:
===================/

1. Change all the dn: lines as follows:

dn: CN=Muhammad Hadhi K.M,ou=Addressbook,dc=good,dc=com
dn: CN=Gheevar Scaria,ou=Addressbook,dc=good,dc=com
dn: CN=Siva Kumar KP,ou=Addressbook,dc=good,dc=com

That is, replace everything after "dn: CN=Full Name," with "ou=Addressbook,dc=good,dc=com".

2. Add the following 2 lines just below every dn: line:

objectClass: inetOrgPerson
objectClass: top

(These are not address-specific values; they are the same for all addresses.)

3. Substitute the following words:

"department:"------>to----->"ou:"
"company:"------>to----->"o:"
"proxyAddresses: SMTP:"------>to----->"mail:"
"streetAddress:"------>to----->"street:"

4. Put a blank line between addresses (i.e. a blank line above every dn: line).

5. Suggest a procedure to verify that the entire process went fine.

Expected result:
================/

The resulting file should look like this:

# cat new-address-list
dn: CN=Muhammad Hadhi K.M,ou=Addressbook,dc=good,dc=com
objectClass: inetOrgPerson
objectClass: top
cn: Muhammad Hadhi K.M
sn: Hadhi
l: Calicut
title: IT Manager
postalCode: 5456
postOfficeBox: 3434
physicalDeliveryOfficeName: IT Department
telephoneNumber: +111 1 111111 Ext:111
facsimileTelephoneNumber: +222 2 222222
givenName: Muhammad
ou: IT
o: Example Company Ltd.
mail: hadhi@example.com
street: P.O.Box 3434
mobile: +4444 4444 4444

dn: CN=Gheevar Scaria,ou=Addressbook,dc=good,dc=com
objectClass: inetOrgPerson
objectClass: top
cn: Gheevar Scaria
sn: Scaria
l: Cochin
title: Civil Engineer
postalCode: 2323
postOfficeBox: 4545
physicalDeliveryOfficeName: ABC Office
telephoneNumber: +111 123456789 Ext:111
facsimileTelephoneNumber: +222 212121
givenName: Gheevar
ou: Civil Department
o: Example Company Ltd.
mail: gheevar@mnm-example.com
street: Nut Street
mobile: +444 12211221

dn: CN=Siva Kumar KP,ou=Addressbook,dc=good,dc=com
objectClass: inetOrgPerson
objectClass: top
cn: Siva Kumar KP
sn: Kumar
l: Tirur
title: Marketing Manager
telephoneNumber: +111 987654321 Ext:999
givenName: Siva
ou: Marketing
o: Example Company Ltd.
mail: siva@rtz-example.com

### NOTE ###: I have already done this entire procedure once (without any verification step). But it was not done in a professional way, and it is something horrible.

As I am a system admin with no exposure to the programming side, I find this very difficult. Please provide me with a solution for this task.

* * * *
* *
* * /############################\ *
# Bunch of THANKS in Advance...# *
* \############################/ *
* * *
* * * * * * * *
*

KMR

Um, I don't mean to be rude, but we're all busy, just like you. The reason people help here is something called leverage. We spend just a few minutes to give someone a big boost. That happens if someone has written a script like the one you should be writing, and it doesn't work quite right, and he asks a detailed question.

Though I'm sure there's someone here who would be willing to do the whole job. For a consideration.

Quote:

and it is something horrible

So fix it. Builds character.

If you have any questions along the way, bring 'em here!

Quote:

Originally Posted by wje_lq (Post 3412082)

So fix it. Builds character.

If you have any questions along the way, bring 'em here!

Dear, I already mentioned that I had done it... But when I asked questions in the middle of the task, no one understood exactly what I needed... and sometimes I was not able to explain exactly where I was in the whole process either... (just because, as I mentioned, I don't have any programming background or anything like that)

That's why, now that I have done it, I thought of asking again with a detailed description of the entire task.

NOTE: Actually I started my question from the middle: the old-address-list file is created by some processing of the original file, and I think that part of the procedure is OK as a starting point for the next task.

Thanks in Advance...

KMR

wje_lq makes a valid point, that we don't just jump in and do your job for you, here. However, since you've taken the time and effort to explain in some detail what you need, and you say (some source code to support your claim would have been good) you've tried to do it once, I will offer some advice.
Your input data is relatively well structured as a series of records, each of which begins with a '<beginning-of-line>dn:' line. Within a record, the fields are delimited by newlines. The field of interest seems to be the first field in all cases, and it can itself be decomposed into smaller fields using commas.
All of this talk of fields, records, and delimiters suggests an AWK procedure, although there seems to be just enough additional stuff that a more procedural solution like Perl is appropriate. I've left a tiny bit for you to finish up.

Code:

#! /usr/bin/perl -w

use strict;

    while( <> ){
        if( $_ =~ m/^dn:/ ){
            # delete everything after the first comma...
            my $record = $_;
            chomp $record;
            $record =~ s/,.*//;
            print "\n$record,ou=Addressbook,dc=good,dc=com\n";
            print "objectClass: inetOrgPerson\nobjectClass: top\n";
        }
        else{
            my $record = $_;
            $record =~ s/department:/ou:/g;
            $record =~ s/company:/o:/g;
            #
            # Add more sub's here, as needed
            #
            print $record;
        }
    }

Run it from the commandline, giving the input filename as an argument.
--- rod.

Edit: If you look even a little bit, you will see that I disregarded my own analysis, and simply edited the file, line-by-line. I did however check for the 'special' line that starts a new record.
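
For readers following along, here is one way the remaining substitutions from task 3 might be filled in. This is only a sketch, not the script the original poster ended up with: the helper name convert_lines and the built-in sample are made up for illustration, and the patterns are anchored at the start of the line on the assumption that attribute names never appear inside values. It also skips the blank line before the very first record.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Attribute renames from task 3 of the original post.
# Anchoring at line start is an assumption, so values are never rewritten.
my @renames = (
    [ qr/^department:/,                'ou:'     ],
    [ qr/^company:/,                   'o:'      ],
    [ qr/^proxyAddresses:\s*SMTP:\s*/, 'mail: '  ],
    [ qr/^streetAddress:/,             'street:' ],
);

# convert_lines (hypothetical helper): input lines in, converted lines out.
sub convert_lines {
    my @in = @_;
    my @out;
    for my $line (@in) {
        chomp( my $rec = $line );
        if ( $rec =~ /^dn:/ ) {
            $rec =~ s/,.*//;          # keep only "dn: CN=Full Name"
            push @out, '' if @out;    # blank line between records, none at the top
            push @out, "$rec,ou=Addressbook,dc=good,dc=com",
                       'objectClass: inetOrgPerson',
                       'objectClass: top';
        }
        else {
            for my $r (@renames) {
                my ( $pattern, $replacement ) = @$r;
                last if $rec =~ s/$pattern/$replacement/;   # at most one rename per line
            }
            push @out, $rec;
        }
    }
    return @out;
}

# Small built-in sample (third record from the thread) so the sketch runs on its own.
my @sample = (
    'dn: CN=Siva Kumar KP,OU=Marketing,OU=RTZ Site,OU=Example Company RTZ,DC=example,DC=com',
    'cn: Siva Kumar KP',
    'sn: Kumar',
    'department: Marketing',
    'company: Example Company Ltd.',
    'proxyAddresses: SMTP:siva@rtz-example.com',
);
print "$_\n" for convert_lines(@sample);
```

To run it against a real file instead of the sample, the last line could be changed to print "$_\n" for convert_lines(<>); and the input filename given on the command line, as with the script above.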

theNbomr, thank you very much for your script. It is working fantastically.

As you asked for the source code, what I was doing was:

1. grep all the dn: lines and redirect them to a file.
2. Then, using awk, print up to the dn: CN=Full Name part and append ou=Addressbook,dc=good,dc=com.
3. Then, using sed -e '/^dn:.*/R newfile' oldfile | sed -e '/^dn:.*/d;n'
I changed all the dn: lines in the original file.
4. Then I ran the following commands individually to substitute:
sed 's/company:/o:/' file
sed 's/department:/ou:/' file

5. Then, using sed or awk (I don't remember which, and when I checked my history it had already been overwritten), I put in the blank lines.

Some corrections were still required, which I did manually. It took me around 3 days to complete the entire task.

NOW "theNbomr's" perl script is working fine for me... Thanks a lot...

Just one more thing: is there any way to check that all the mandatory fields (dn:, cn:, sn:, mail:) are present in each address entry?

After doing all this, I faced a problem: some of the addresses had no sn: entry. It was only around 40 records, so I edited them manually. Is there any way to automate this?

FYI --> This is part of a migration from MS Exchange to a Linux mail server.

As our other sites are still running on Exchange+ADS, we cannot set up an automated address book update on our new "100% Linux Server" site. So we will have to send them a form, which they will fill in whenever they create new users in their offices. Using this form we can update our Linux address book. I created a form, but I am again facing a problem in converting it to the required format.

Here we can go with one of two options, as long as we reach the same goal.

One is to keep my form and convert it to the required file.

Or

Build another, simpler form, which can easily be converted to the required file.

The following is the form which I created:
Full-Name: Muhammad Hadhi K.M
First-Name: Muhammad
Middle-Name: Hadhi
Last-Name: K. M.
Designation-or-Title: IT Head
Department: IT Department
Mobile-number: 55667788
Telephone-number-with-extension: 776655 555
Fax-number: 8294304
Physical-Office-Location: Calicut
Company: The Best Company
Location: Calicut
Street: Good Street
Post-Box: 3456
Postal-Code: 1231
Mail-id: hadhi@best.com

This needs to be converted to the following:

dn: cn= Muhammad Hadhi K.M, ou=Addressbook, dc=best,dc=com
objectClass: inetOrgPerson
objectClass: top
cn: Muhammad Hadhi K.M
sn: Hadhi
l: Calicut
title: IT Head
postalCode: 1231
postOfficeBox: 3456
physicalDeliveryOfficeName: Calicut
telephoneNumber: 776655 555
facsimileTelephoneNumber: 8294304
givenName: Muhammad
ou: IT Department
o: The Best Company
mail: hadhi@best.com
mobile: 55667788

I tried changing a few lines of "theNbomr's" script. But as I don't know anything about Perl, I was changing the lines blindly, which ended in failure. From his script I now know how to substitute the field names with what I need, e.g. convert "Mail-id:" to "mail:". But I am not able to build dn:, cn:, sn: from the Full Name, First Name, Middle Name and Last Name. I am also not able to filter out the unwanted lines to get the format LDAP requires.

I hope "theNbomr" can slightly modify his script to do this task. Could you help me, please?

Thanks in Advance...

KMR

With regard to your prior question about testing for all required fields, I don't see a way of handling this without having some source for the correct content to put in the missing fields. In your sample data, the content of the 'sn:' field is different for each record. How is one to know the correct content for a given record, especially if it is missing?
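
Reporting which records lack a mandatory field is a separate and simpler job than filling the field in. A minimal sketch of such a check follows, assuming the converted file uses the attribute names dn, cn, sn and mail and that every record starts with a dn: line. The helper name find_incomplete and the sample data are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Mandatory attributes as named in the thread.
my @required = qw(dn cn sn mail);

# find_incomplete (hypothetical helper): returns one message per record
# that lacks at least one mandatory attribute.
sub find_incomplete {
    my @lines = @_;
    my ( @records, @problems );
    for my $line (@lines) {
        chomp( my $rec = $line );
        next unless $rec =~ /^([A-Za-z]+):/;    # skip blank or malformed lines
        my $attr = $1;
        # a dn: line opens a new record
        push @records, { dn => $rec, seen => {} } if $attr eq 'dn';
        next unless @records;                    # ignore anything before the first dn:
        $records[-1]{seen}{$attr} = 1;
    }
    for my $r (@records) {
        my @missing = grep { !$r->{seen}{$_} } @required;
        push @problems, "$r->{dn} => missing: @missing" if @missing;
    }
    return @problems;
}

# Built-in sample: the second record has no sn: line.
my @sample = (
    'dn: CN=Gheevar Scaria,ou=Addressbook,dc=good,dc=com',
    'cn: Gheevar Scaria',
    'sn: Scaria',
    'mail: gheevar@mnm-example.com',
    '',
    'dn: CN=Siva Kumar KP,ou=Addressbook,dc=good,dc=com',
    'cn: Siva Kumar KP',
    'mail: siva@rtz-example.com',
);
print "$_\n" for find_incomplete(@sample);
```

As a rough check that no record was lost during conversion, the counts printed by grep -c '^dn:' for the old and the new file can also be compared; they should be equal.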

For your form conversion tool, here is a code fragment that handles the conversion of the 'dn:' record:

Code:

  if( $_ =~ m/^Full-Name:\s*(.+)$/ ){
      print "dn: cn=$1,ou=Addressbook, dc=best,dc=com\n";
  }

This captures the content part of the record ($1), and uses it in printing the new version of it. The rest of the conversion is just a bunch of substitutions like the code I've already given you.

--- rod.
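
For anyone landing on this thread later, the whole form conversion might be sketched along these lines. It is not rod's script, only an illustration built on several assumptions: the form holds a single person, sn is taken from Middle-Name because that is what the example in the thread shows, Street and Last-Name are dropped because the desired output does not list them, and the dn is written without spaces after the commas. The helper name form_to_ldif is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# LDAP attribute and the form field it is taken from, in output order.
# The sn <- Middle-Name pairing follows the thread's example and is an assumption.
my @map = (
    [ 'cn',                         'Full-Name' ],
    [ 'sn',                         'Middle-Name' ],
    [ 'l',                          'Location' ],
    [ 'title',                      'Designation-or-Title' ],
    [ 'postalCode',                 'Postal-Code' ],
    [ 'postOfficeBox',              'Post-Box' ],
    [ 'physicalDeliveryOfficeName', 'Physical-Office-Location' ],
    [ 'telephoneNumber',            'Telephone-number-with-extension' ],
    [ 'facsimileTelephoneNumber',   'Fax-number' ],
    [ 'givenName',                  'First-Name' ],
    [ 'ou',                         'Department' ],
    [ 'o',                          'Company' ],
    [ 'mail',                       'Mail-id' ],
    [ 'mobile',                     'Mobile-number' ],
);

# form_to_ldif (hypothetical helper): lines of one filled-in form in,
# lines of one address book entry out.
sub form_to_ldif {
    my @lines = @_;
    my %form;
    for my $line (@lines) {
        # "Field-Name: value", with surrounding whitespace trimmed from the value
        $form{$1} = $2 if $line =~ /^([\w-]+):\s*(.*?)\s*$/;
    }
    my $name = $form{'Full-Name'};
    die "form has no Full-Name\n" unless defined $name && length $name;
    my @out = (
        "dn: cn=$name,ou=Addressbook,dc=best,dc=com",
        'objectClass: inetOrgPerson',
        'objectClass: top',
    );
    for my $pair (@map) {
        my ( $attr, $field ) = @$pair;
        my $value = $form{$field};
        push @out, "$attr: $value" if defined $value && length $value;   # skip empty fields
    }
    return @out;
}

# Built-in sample (part of the form from the thread) so the sketch runs on its own.
my @sample = (
    'Full-Name: Muhammad Hadhi K.M',
    'First-Name: Muhammad',
    'Middle-Name: Hadhi',
    'Department: IT Department',
    'Company: The Best Company',
    'Mail-id: hadhi@best.com',
);
print "$_\n" for form_to_ldif(@sample);
```

Keeping the field mapping in one table means a changed form only needs the table edited, not the code. A form that arrives without Middle-Name would produce an entry with no sn: line, which is one of the mandatory fields, so such entries still need checking.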