If you want to use the final word of cn as the sn if the sn is empty, and if each cn is always listed before the sn for that person, then a bit of awk magic will help:
Code:
awk 'BEGIN { RS = "[\t\v\f ]*(\r\n|\n|\r)[\t\v\f ]*"
FS = "([\t\v\f ]+|[\t\v\f ]*:[\t\v\f ]*)"
}
($1 == "cn") { sn = $NF }
($1 == "sn" && $2 == "") { printf("sn: %s\n", sn) ; next }
{ printf("%s\n", $0) }
' infile > outfile
The BEGIN rule sets the record separator to match any newline convention, together with any whitespace directly before or after the newline, so leading and trailing whitespace is removed from each line; I think leading or trailing whitespace is ignored anyway in LDIF files. (A regular expression as the record separator is an extension supported by gawk and mawk, not by strict POSIX awk.) The field separator is set to a run of whitespace, or a colon. If there is whitespace around the colon, it is included in the separator, too.
The next rule saves the last word on the line, if the first token on the line is "cn".
The next rule matches if the first token on the line is "sn" and nothing follows it. It will print the saved word as the sn field, and skip the rest of the rules for that record (line).
The last rule will print the record (line) as it was read. If the previous rule matched, then this one will be skipped.
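As a quick check of the rules above, here is a small self-contained run on a made-up two-entry file. The entries, the names, and the scratch directory are invented for illustration, and it assumes an awk that accepts a regular expression as RS (gawk or mawk, for example). The record separator is spelled out here with explicit newline alternatives.

```shell
# Scratch directory and sample data; all names here are made up.
workdir=$(mktemp -d)

# Two entries: the first has an empty sn, the second already has one.
printf '%s\n' \
    'dn: uid=jdoe,dc=example,dc=com' \
    'cn: John Doe' \
    'sn:' \
    '' \
    'dn: uid=asmith,dc=example,dc=com' \
    'cn: Alice Smith' \
    'sn: Smith' > "$workdir/infile"

awk 'BEGIN { RS = "[\t\v\f ]*(\r\n|\n|\r)[\t\v\f ]*"
             FS = "([\t\v\f ]+|[\t\v\f ]*:[\t\v\f ]*)"
     }
     ($1 == "cn") { sn = $NF }    # remember the last word of cn
     ($1 == "sn" && $2 == "") { printf("sn: %s\n", sn) ; next }
     { printf("%s\n", $0) }
' "$workdir/infile" > "$workdir/outfile"

cat "$workdir/outfile"
```

The empty sn: line in the first entry should come out as sn: Doe, while the second entry, which already has sn: Smith, should pass through unchanged.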
If you have a larger set of files, I recommend you put the originals in one directory, say original/, and create a new directory for the modified ones, modified/. Then, in a Bash shell, run in their parent directory:
Code:
for OLD in original/* ; do
[ -f "$OLD" ] || continue
NEW="modified/${OLD#*/}"
awk 'BEGIN { RS = "[\t\v\f ]*(\r\n|\n|\r)[\t\v\f ]*"
FS = "([\t\v\f ]+|[\t\v\f ]*:[\t\v\f ]*)"
}
($1 == "cn") { sn = $NF }
($1 == "sn" && $2 == "") { printf("sn: %s\n", sn) ; next }
{ printf("%s\n", $0) }
' "$OLD" > "$NEW" || break
echo "$NEW: Saved"
done
You can then easily compare all files in the two directories using e.g.
Code:
diff -raby original/ modified/ | less
The modified/ ones will be shown on the right side. Any lines that differ will have a pipe (|) in the middle.
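If you only want to know which files changed before paging through the side-by-side view, diff's -q option prints one line per differing file. A minimal sketch, run here in a scratch directory with two made-up files so that it is self-contained:

```shell
# Scratch copy of the directory layout; file names and contents are made up.
workdir=$(mktemp -d)
mkdir "$workdir/original" "$workdir/modified"

printf 'cn: John Doe\nsn:\n'          > "$workdir/original/a.ldif"
printf 'cn: John Doe\nsn: Doe\n'      > "$workdir/modified/a.ldif"
printf 'cn: Alice Smith\nsn: Smith\n' > "$workdir/original/b.ldif"
cp "$workdir/original/b.ldif" "$workdir/modified/b.ldif"

# -r recurses into the directories, -q only reports whether files differ.
# diff exits with status 1 when it finds differences, hence the || true.
changed=$(cd "$workdir" && diff -rq original/ modified/ || true)
printf '%s\n' "$changed"
```

Only a.ldif should be reported here. For the real files, the same diff -rq original/ modified/ can be run from the parent directory of the two directories.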