CS on Friday, March 16, 2007 at 12:47 am
Fixed it ! The culprit was “LDAP referrals”. Having just spent several hours figuring it out, this actually refreshed my memory that it was the same problem – and its cause – that stopped me implementing this the last time (which, in my defense, was 18 – 24 months ago, now that I go back and search my emails, not 12 months).
We have a slightly complicated domain structure in that we have multiple child domains (for each branch office). With referrals turned on, nss_ldap recurses down from domain.com to child1.domain.com, child2.domain.com, etc before returning results. There must be something that happens in this process that sometimes triggers the errors I was seeing (because they appear at random) and also causing the slowdowns.
If you put:
into /etc/ldap.conf, the problems all go away (or, at least, they haven’t happened since – with these sort of random errors you never can be 100% sure
). Response time is also substantially faster.
I can get away with this _now_ because (for unrelated reasons) we are flattening our domain structure and nixing the child domains. Previously this was not an option so any authentication system had to be able to include users in the child domains.
I also seem to vaguely recall that you can get around much of the slowness related to multi-domain configurations by a) pointing at a DC that is a Global Catalogue and b) pointing at the GC port instead of the regular LDAP port. This didn’t help the random crashes with things like ‘getent group’ though, IIRC.
Well, now all I need to do is get the automounter going with NFS’d home directories, and we’ll be golden. I might have to bend your ear about that shortly as well