[SOLVED] bash script to edit file

seeberg · 04-03-2012, 04:11 AM

Hi, I need a bash script that can make the following changes to an LDIF file.

Before:
dn: uid=test,ou=People,dc=skole,dc=domain,dc=com
uidNumber: 10007
gidNumber: 10007
uid: test
cn: Test User
sn:

After:
dn: uid=test,ou=People,dc=skole,dc=domain,dc=com
uidNumber: 10007
gidNumber: 10007
uid: test
cn: Test User
sn: User

Any help is much appreciated.

David the H. · 04-03-2012, 04:43 AM

Please use [code][/code] tags around your code and data, to preserve formatting and to improve readability. Please do not use quote tags, colors, or other fancy formatting.

First of all, what are the criteria for matching the line(s) to be changed? Is there only the one line? Or are there multiple entries? Or is it the last line in the file? Or what? The solution used depends on your exact requirements.

Assuming you want to append a single name to all lines starting with "sn:", the most commonly-used tool is sed:

Code:

sed '/^sn:/ s/^sn:/sn: User/' file.txt

This prints the modified content to stdout. Assuming you're using the GNU version of sed, add the -i option to the command to make it edit the file directly. Otherwise you'd have to work through a temporary file.
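For a sed without -i, the temporary-file route could look something like the sketch below. The sample file and directory are throwaway names created just for the demo, not anything from the original question.

```shell
#!/bin/sh
# Sketch: edit a file "in place" through a temporary file,
# for sed versions that lack the -i option.
# The sample file below is made up purely for illustration.
dir=$(mktemp -d)
file="$dir/file.txt"
printf '%s\n' 'cn: Test User' 'sn:' > "$file"

tmp=$(mktemp "$dir/tmp.XXXXXX")
# Replace the original only if sed finished successfully.
sed 's/^sn:/sn: User/' "$file" > "$tmp" && mv "$tmp" "$file"

result=$(cat "$file")
echo "$result"
rm -rf "$dir"
```

Note that mv swaps in a new file, so the original's permissions may not carry over. Using cat "$tmp" > "$file" followed by rm "$tmp" keeps them, at the cost of a second copy.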

I spent a few hours this weekend familiarizing myself with ed, though, and it does the job slightly more cleanly, as it's an actual text editor (as opposed to sed, the "stream editor"). The syntax is almost identical:

Code:

echo -e 'g/^sn:/ s/^sn:/sn: User/\n,p' | ed -s file.txt 
echo -e 'g/^sn:/ s/^sn:/sn: User/\nw' | ed -s file.txt

ed takes commands straight from stdin. I used "echo -e" here to interpret the "\n" newlines that separate the commands. The ",p" command at the end causes it to print the modified file to stdout (for confirmation purposes), while the "w" command writes the changes to the file.

Also note that in both cases above all lines will get the same "User" name. If you need them to be different then a different technique will be required.
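As one possible "different technique", here is a rough sed sketch that uses the hold space to take each name from the preceding "cn:" line. It assumes (my assumptions, not stated in the question) that every "cn:" line comes before its "sn:" line and that the wanted name is the last word of cn. The sample data is made up.

```shell
#!/bin/sh
# Sketch: fill each empty "sn:" line with the last word of the preceding "cn:" line.
# Assumes cn always precedes sn; the sample entries are hypothetical.
dir=$(mktemp -d)
cat > "$dir/people.ldif" <<'EOF'
cn: Test User
sn:

cn: Jane Doe
sn:
EOF

# /^cn:/h         : remember the cn line in the hold space
# on an empty sn: : fetch the held cn line (g), then replace everything
#                   up to the last space with "sn: "
result=$(sed -e '/^cn:/h' \
             -e '/^sn:[[:space:]]*$/{g;s/.* /sn: /;}' "$dir/people.ldif")

echo "$result"
rm -rf "$dir"
```

Lines where sn already has a value are left alone, since the second pattern only matches an empty "sn:".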

Here are a few useful sed references.
http://www.grymoire.com/Unix/Sed.html
http://sed.sourceforge.net/grabbag/
http://sed.sourceforge.net/sedfaq.html
http://sed.sourceforge.net/sed1line.txt

How to use ed:
http://wiki.bash-hackers.org/howto/edit-ed
http://snap.nlc.dcccd.edu/learn/nlc/ed.html
(also read the info page)

Nominal Animal · 04-03-2012, 06:42 AM

If you want to use the final word of cn as the sn if the sn is empty, and if each cn is always listed before the sn for that person, then a bit of awk magic will help:

Code:

awk 'BEGIN { RS = "[\t\v\f ]*[\r\n][\t\v\f\r ]*"
             FS = "([\t\v\f ]+|[\t\v\f ]*:[\t\v\f ]*)"
           }
     ($1 == "cn") { sn = $NF }
     ($1 == "sn" && $2 == "") { printf("sn: %s\n", sn) ; next }
     { printf("%s\n", $0) }
    ' infile > outfile

The BEGIN rule sets the record separator to any newline convention. Any preceding or trailing whitespace is removed; I think leading or trailing whitespace is ignored anyway in LDIF files. The field separator is set to whitespace, or a colon. If there is whitespace around the colon, it is included in the separator, too.

The next rule saves the last word on the line, if the first token on the line is "cn".

The next rule matches if the only token on the line is "sn". It will print the saved sn field, and skip the rest of the rules for that record (line).

The last rule will print the record (line) as it was read. If the previous rule matched, then this one will be skipped.
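To see the three rules at work on the sample entry from the first post, here is a minimal sketch. It is a simplified variant, not the exact script above: it assumes Unix line endings and no whitespace before the colon, so it keeps awk's default separators and compares against "cn:" and "sn:" with the colon attached. The temporary file names are made up.

```shell
#!/bin/sh
# Sketch: simplified version of the cn -> sn rules, run on the sample entry.
# Assumes Unix line endings and no space before the colon.
tmp=$(mktemp -d)
cat > "$tmp/in.ldif" <<'EOF'
dn: uid=test,ou=People,dc=skole,dc=domain,dc=com
uidNumber: 10007
gidNumber: 10007
uid: test
cn: Test User
sn:
EOF

awk '
    # Remember the last word of the cn line.
    $1 == "cn:" { sn = $NF }
    # An sn line with no value: print it with the saved word, skip other rules.
    $1 == "sn:" && NF == 1 { print "sn: " sn; next }
    # Every other line passes through unchanged.
    { print }
' "$tmp/in.ldif" > "$tmp/out.ldif"

result=$(grep '^sn:' "$tmp/out.ldif")
echo "$result"    # prints "sn: User"
rm -rf "$tmp"
```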

If you have a larger set of files, I recommend you put the originals in one directory, say original/, and create a new directory for the modified ones, modified/. Then, in a Bash shell, run this in their parent directory:

Code:

for OLD in original/* ; do
    [ -f "$OLD" ] || continue
    NEW="modified/${OLD#*/}"
    awk 'BEGIN { RS = "[\t\v\f ]*[\r\n][\t\v\f\r ]*"
                 FS = "([\t\v\f ]+|[\t\v\f ]*:[\t\v\f ]*)"
               }
         ($1 == "cn") { sn = $NF }
         ($1 == "sn" && $2 == "") { printf("sn: %s\n", sn) ; next }
         { printf("%s\n", $0) }
        ' "$OLD" > "$NEW" || break
    echo "$NEW: Saved"
done

You can then easily compare all files in the two directories using e.g.

Code:

diff -raby original/ modified/ | less

The modified/ ones will be shown on the right side. Any lines that differ will have a pipe (|) in the middle.

seeberg · 04-04-2012, 03:37 AM

I see I should have described more carefully what I needed, but Nominal Animal managed to guess it perfectly.

David the H. · 04-05-2012, 11:17 AM

Heh. This just goes to show that you can't expect us to be psychic all the time. Always define your problem (or requirements) clearly and in detail.

This goes double for text editing problems; don't just post the input and desired output, explain it. Point out exactly how to target the text that needs to be edited (by word pattern, position in the file, etc.), and what variations we could expect that may need to be accounted for (e.g. could there be blank space in the string, multiple instances on a line or in the file, etc.). The more you can tell us about it, the more complete the replies are likely to be.

I don't know anything about ldif files, but just for my own practice, I worked up a short ed script that can handle the above example. It's very limited though, and will only work under these assumptions: 1) there's only this single record, 2) the "User" name is a single whole word at the end of the line, and 3) the "sn:" line is blank to start with (although it could be modified for points 2 & 3).

The #comments explain the commands used.

Code:

ed -s edfile.txt <<"EOF"
#copy the "cn:" line to just after the "sn:" line
/^cn:/t/^sn:/
#edit the copied line (it's still the active one) to
#remove everything but the last word, the "User" name.
#a single space is left at the front of the line.
.s/.* / /
#join the two lines together
-1,.j
#print the resulting file
,p
#write (save) the edits to the file
w
EOF