LinuxQuestions.org


bluenix 11-21-2008 11:58 AM

LDAP unresponsive for periods of time
 
I really hope one of the LDAP experts can shed some light on the following:

I'm running OpenLDAP and Samba on Debian etch (both on the same machine).
Everything runs fine most of the time, but there are periods when LDAP seems to stop responding. After a while, it resumes normal operation.

At all times the CPU load is low and there's plenty of free memory.

The slowdown shows up in the following ways:

1) Users can't log on to the Samba PDC from Windows workstations, or logging on takes a really long time (e.g. 10 minutes)
2) Running "ls -l" on a directory with a lot of files takes ages to complete
3) The same is true for running "ps axu"

Thank you all very much for having a look!


The following is a snippet from the syslog at the time the problems occur:

Code:


Nov 20 12:16:18 server1 smbd[29868]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
Nov 20 12:16:18 server1 slapd[2676]: conn=11580 fd=132 ACCEPT from IP=127.0.0.1:34251 (IP=0.0.0.0:389)
Nov 20 12:16:20 server1 smbd[29868]:  ^I(unknown)
Nov 20 12:16:20 server1 slapd[2676]: conn=11575 op=2 UNBIND
Nov 20 12:16:23 server1 CRON[3637]: pam_ldap: ldap_result Timed out
Nov 20 12:16:23 server1 smbd[1786]: [2008/11/20 12:11:02, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:23 server1 smbd[31434]: [2008/11/20 12:11:09, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:24 server1 slapd[2676]: conn=11575 fd=75 closed
Nov 20 12:16:25 server1 smbd[2253]: [2008/11/20 12:12:28, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2997]: [2008/11/20 12:12:33, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2229]: [2008/11/20 12:12:34, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2251]: [2008/11/20 12:12:36, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2928]: [2008/11/20 12:12:42, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2366]: [2008/11/20 12:12:44, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2234]: [2008/11/20 12:12:44, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2240]: [2008/11/20 12:12:58, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2310]: [2008/11/20 12:13:05, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2310]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
Nov 20 12:16:25 server1 smbd[2260]: [2008/11/20 12:13:10, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2354]: [2008/11/20 12:13:25, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2199]: [2008/11/20 12:14:41, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2199]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
Nov 20 12:16:25 server1 smbd[2971]: [2008/11/20 12:14:33, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[29830]: [2008/11/20 12:14:54, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[29830]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
Nov 20 12:16:25 server1 smbd[1781]: [2008/11/20 12:14:58, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2951]: [2008/11/20 12:15:00, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2989]: [2008/11/20 12:15:26, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[2951]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
Nov 20 12:16:25 server1 smbd[2951]:  ^I(unknown)
Nov 20 12:16:25 server1 smbd[2989]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
Nov 20 12:16:25 server1 smbd[30804]: [2008/11/20 12:15:43, 0] lib/smbldap.c:smbldap_connect_system(977)
Nov 20 12:16:25 server1 smbd[30804]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
Nov 20 12:16:25 server1 CRON[3653]: pam_ldap: ldap_result Timed out
Nov 20 12:16:25 server1 smbd[1786]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
Nov 20 12:16:25 server1 slapd[2676]: conn=11577 op=0 SEARCH RESULT tag=101 err=0 nentries=1 text=
Nov 20 12:16:26 server1 smbd[2253]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
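
To see how these failures cluster over time, the syslog could be summarized per minute. The following is only a sketch: the function name `ldap_failures_per_minute` is made up for illustration, and it assumes the classic syslog timestamp prefix ("Mon DD HH:MM:SS") seen in the snippet above.

```shell
# Count "Can't contact LDAP server" lines per minute in syslog-format input.
# Assumes chronological input, so lines from the same minute are adjacent.
ldap_failures_per_minute() {
    grep "Can't contact LDAP server" \
        | awk '{ print $1, $2, substr($3, 1, 5) }' \
        | uniq -c
}

# Usage (the log path is an assumption; adjust it to wherever syslog writes):
# ldap_failures_per_minute < /var/log/syslog
```

Each output line is a count followed by the month, day and HH:MM, which makes it easier to line the outages up against cron jobs or other periodic activity.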


irishbitte 11-23-2008 11:45 AM

Sounds like a resolution issue! Is there more than one machine called "dc=admin"?

bluenix 11-24-2008 04:44 AM

Thank you very much for your reply!

There's only one server, which runs LDAP and all other services. The rest of the machines on the network are Windows workstations, so I conclude (correct me if I'm wrong) that there's only one machine (the server) with cn=admin.

What should I check to find out whether it's a resolution issue?
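
One rough way to check name resolution on the server is to resolve the relevant names through NSS and time each lookup. This is a sketch, not a definitive test: `check_names` is a hypothetical helper, and `server1` is simply the hostname that appears in the log lines above. Note that the smbd messages use ldap://127.0.0.1/, an IP address, so reaching slapd itself should not need a forward lookup.

```shell
# Resolve each name via `getent hosts` (which goes through /etc/nsswitch.conf)
# and report the result together with the elapsed whole seconds.
check_names() {
    for name in "$@"; do
        start=$(date +%s)
        if result=$(getent hosts "$name"); then
            status="$result"
        else
            status="NOT RESOLVED"
        fi
        end=$(date +%s)
        echo "$name -> $status ($((end - start))s)"
    done
}

# Usage:
# check_names localhost server1
```

A lookup that takes several seconds, or a name that resolves to an unexpected address, would be worth following up; instant and correct answers would point away from resolution.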

bluenix 11-24-2008 10:49 AM

It happened again a couple of times today...

Code:

Nov 24 16:16:14 server1 smbd[30077]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server

Today, each outage lasted only a few minutes...

Your input is greatly appreciated!

djsoundfx 11-24-2008 11:16 AM

Intermittent problems are always fun. Is Samba the only thing you use to connect to your LDAP server? I'm curious to see if any other programs are having this problem.

It may also help to enable more verbose logging on the LDAP side to see if there's a problem with slapd itself. As root (or via sudo), try something like:

Code:

/usr/local/libexec/slapd -d -1 -h ldap:///

Invoked like this, slapd stays in the foreground and prints very verbose debug output to the terminal. (That path is where a source build installs slapd; the Debian package normally puts it in /usr/sbin/slapd.) Once you've verified that slapd is running like that, try your Samba connections again and post the output.

Outside of that, what do your firewall and hosts file configuration look like? Is localhost set up correctly in /etc/hosts, and do you at least allow localhost to connect to port 389 (the default LDAP port)? Maybe it's just something simple...
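
To catch the moment slapd stops answering without wading through full debug output, a small probe could be run in a loop alongside normal traffic. This is a sketch under assumptions: `probe_once` is a hypothetical helper, and the ldapsearch call in the usage comment assumes anonymous reads of the root DSE are allowed, which may not be the case on this server.

```shell
# Run one probe command and print a timestamp, OK/FAIL, and elapsed seconds.
probe_once() {
    start=$(date +%s)
    if "$@" > /dev/null 2>&1; then
        status=OK
    else
        status=FAIL
    fi
    end=$(date +%s)
    echo "$(date '+%b %e %H:%M:%S') $status $((end - start))s"
}

# Usage: probe the same URI that smbd uses, every 10 seconds.
# while true; do
#     probe_once ldapsearch -x -H ldap://127.0.0.1/ -s base -b "" -l 5
#     sleep 10
# done
```

The timestamps use the same format as syslog, so FAIL lines or long elapsed times can be compared directly with the smbd and slapd entries.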

bluenix 11-26-2008 10:21 AM

Running slapd with those parameters produces enormous amounts of output! The problem is that the LDAP server is always in use, so I can never simply try a few Samba connections in isolation and see what output they produce...

About your question whether Samba is the only client: some nss_ldap messages appear too:

Code:

/var/log/auth.log:Nov 26 15:33:19 server1 ls: nss_ldap: failed to bind to LDAP server ldap://127.0.0.1/: Unknown error
/var/log/auth.log:Nov 26 15:33:20 server1 ls: nss_ldap: reconnecting to LDAP server...
/var/log/auth.log:Nov 26 15:33:21 server1 ls: nss_ldap: failed to bind to LDAP server ldap://127.0.0.1/: Unknown error
/var/log/auth.log:Nov 26 15:33:24 server1 ls: nss_ldap: reconnecting to LDAP server...
/var/log/auth.log:Nov 26 15:33:26 server1 smbd[29141]: nss_ldap: failed to bind to LDAP server ldap://127.0.0.1/: Unknown error
/var/log/auth.log:Nov 26 15:33:28 server1 smbd[29141]: nss_ldap: reconnecting to LDAP server (sleeping 1 seconds)...
/var/log/auth.log:Nov 26 15:33:37 server1 smbd[29141]: nss_ldap: reconnected to LDAP server ldap://127.0.0.1/ after 2 attempts
/var/log/auth.log:Nov 26 15:33:37 server1 ls: nss_ldap: reconnected to LDAP server ldap://127.0.0.1/ after 1 attempt
/var/log/auth.log:Nov 26 15:33:37 server1 ls: nss_ldap: reconnected to LDAP server ldap://127.0.0.1/ after 1 attempt

My hosts file looks like this:

Code:


127.0.0.1      localhost
172.30.0.2      server1.domain  server1 admin domain

Localhost must be allowed on port 389, since it works most of the time...

I really appreciate your help!

bluenix 11-27-2008 04:56 AM

This time LDAP seemed to fail when I listed a directory with ls -l...

I didn't wait until it recovered, but restarted slapd manually. After that I was able to run ls -l again, and LDAP seemed OK.
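
Since the auth.log excerpt above shows `ls` itself going through nss_ldap, one hedged way to check whether the slow listings come from owner and group name lookups is to compare `ls -l` with `ls -ln`. The `-n` option prints numeric IDs, so it should not need to look names up. `compare_listing` is an illustrative helper, not something from this setup.

```shell
# Time a listing that resolves owner/group names against one that prints
# numeric IDs only. Elapsed times are in whole seconds.
compare_listing() {
    dir="${1:-.}"
    start=$(date +%s)
    ls -l "$dir" > /dev/null
    mid=$(date +%s)
    ls -ln "$dir" > /dev/null
    end=$(date +%s)
    echo "ls -l:  $((mid - start))s"
    echo "ls -ln: $((end - mid))s"
}

# Usage (pick a directory with many files owned by LDAP users):
# compare_listing /path/to/big/directory
```

If `ls -ln` stays fast while `ls -l` hangs during an outage, the delay is in the name lookups rather than in the filesystem.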

bluenix 11-28-2008 09:50 AM

Please, does anybody have another suggestion, or could you tell me what else to check to see whether it has to do with resolution?

Thank you very much!

irishbitte 12-01-2008 09:04 AM

OK, LDAP is generally a very stable daemon / service to run on any machine. Is there any other daemon or service on your machine using port 389?

This is a good command to give you an idea of what is listening on your server:

Code:

netstat -a | grep LISTEN

(from http://ubuntuforums.org/showthread.php?t=437888)
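
`netstat -a` prints service names such as `ldap` rather than port numbers. As a sketch, numeric output can be filtered down to a single port; `listeners_on_port` is an illustrative helper, and the `-p` flag (which shows the owning process) generally needs root.

```shell
# Keep only LISTEN sockets whose local address ends in ":<port>".
# Expects `netstat -tln` / `netstat -tlnp` style input on stdin.
listeners_on_port() {
    awk -v port="$1" '$6 == "LISTEN" && $4 ~ (":" port "$")'
}

# Usage:
# netstat -tlnp | listeners_on_port 389
```

With `-p`, the last column names the process holding the socket, which shows directly whether anything other than slapd is bound to 389.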

bluenix 12-01-2008 09:07 AM

Thank you very much for your reply!

This is the output of

Code:

netstat -a | grep LISTEN

As far as I can see, nothing else is listening on port 389?

Code:

tcp        0      0 localhost:2208          *:*                     LISTEN
tcp        0      0 *:imaps                 *:*                     LISTEN
tcp        0      0 localhost:49250         *:*                     LISTEN
tcp        0      0 *:pop3s                 *:*                     LISTEN
tcp        0      0 *:ldap                  *:*                     LISTEN
tcp        0      0 *:swat                  *:*                     LISTEN
tcp        0      0 localhost:7080          *:*                     LISTEN
tcp        0      0 localhost:10024         *:*                     LISTEN
tcp        0      0 *:netbios-ssn           *:*                     LISTEN
tcp        0      0 *:pop3                  *:*                     LISTEN
tcp        0      0 *:imap2                 *:*                     LISTEN
tcp        0      0 *:sunrpc                *:*                     LISTEN
tcp        0      0 *:webcache              *:*                     LISTEN
tcp        0      0 *:webmin                *:*                     LISTEN
tcp        0      0 *:auth                  *:*                     LISTEN
tcp        0      0 192.168.1.2:domain      *:*                     LISTEN
tcp        0      0 server1.domain:domain   *:*                     LISTEN
tcp        0      0 localhost:domain        *:*                     LISTEN
tcp        0      0 localhost:ipp           *:*                     LISTEN
tcp        0      0 *:3128                  *:*                     LISTEN
tcp        0      0 *:smtp                  *:*                     LISTEN
tcp        0      0 localhost:953           *:*                     LISTEN
tcp        0      0 *:60858                 *:*                     LISTEN
tcp        0      0 *:microsoft-ds          *:*                     LISTEN
tcp6       0      0 *:ldap                  *:*                     LISTEN
tcp6       0      0 *:www                   *:*                     LISTEN
tcp6       0      0 *:domain                *:*                     LISTEN
tcp6       0      0 *:ssh                   *:*                     LISTEN
tcp6       0      0 ip6-localhost:953       *:*                     LISTEN
tcp6       0      0 *:https                 *:*                     LISTEN
unix  2      [ ACC ]    STREAM    LISTENING    8132    /var/run/dbus/system_bus_socket
unix  2      [ ACC ]    STREAM    LISTENING    952458  /tmp/.dguardianipc
unix  2      [ ACC ]    STREAM    LISTENING    952459  /tmp/.dguardianurlipc
unix  2      [ ACC ]    STREAM    LISTENING    7721    /var/run/acpid.socket
unix  2      [ ACC ]    STREAM    LISTENING    10649    public/cleanup
unix  2      [ ACC ]    STREAM    LISTENING    10656    private/tlsmgr
unix  2      [ ACC ]    STREAM    LISTENING    10665    private/rewrite
unix  2      [ ACC ]    STREAM    LISTENING    10669    private/bounce
unix  2      [ ACC ]    STREAM    LISTENING    10673    private/defer
unix  2      [ ACC ]    STREAM    LISTENING    10677    private/trace
unix  2      [ ACC ]    STREAM    LISTENING    10681    private/verify
unix  2      [ ACC ]    STREAM    LISTENING    10685    public/flush
unix  2      [ ACC ]    STREAM    LISTENING    10689    private/proxymap
unix  2      [ ACC ]    STREAM    LISTENING    10693    private/smtp
unix  2      [ ACC ]    STREAM    LISTENING    10697    private/relay
unix  2      [ ACC ]    STREAM    LISTENING    10701    public/showq
unix  2      [ ACC ]    STREAM    LISTENING    10705    private/error
unix  2      [ ACC ]    STREAM    LISTENING    10709    private/discard
unix  2      [ ACC ]    STREAM    LISTENING    10713    private/local
unix  2      [ ACC ]    STREAM    LISTENING    10717    private/virtual
unix  2      [ ACC ]    STREAM    LISTENING    10721    private/lmtp
unix  2      [ ACC ]    STREAM    LISTENING    10725    private/anvil
unix  2      [ ACC ]    STREAM    LISTENING    10729    private/scache
unix  2      [ ACC ]    STREAM    LISTENING    10733    private/maildrop
unix  2      [ ACC ]    STREAM    LISTENING    10737    private/uucp
unix  2      [ ACC ]    STREAM    LISTENING    10741    private/ifmail
unix  2      [ ACC ]    STREAM    LISTENING    10745    private/bsmtp
unix  2      [ ACC ]    STREAM    LISTENING    10749    private/scalemail-backend
unix  2      [ ACC ]    STREAM    LISTENING    10753    private/mailman
unix  2      [ ACC ]    STREAM    LISTENING    7534    /var/run/amavis/amavisd.sock
unix  2      [ ACC ]    STREAM    LISTENING    952441  /var/run/cups/cups.sock
unix  2      [ ACC ]    STREAM    LISTENING    9582    /var/lib/dcc/dccifd
unix  2      [ ACC ]    STREAM    LISTENING    9546    /var/run/avahi-daemon/socket
unix  2      [ ACC ]    STREAM    LISTENING    7922    /var/run/clamav/clamd.ctl
unix  2      [ ACC ]    STREAM    LISTENING    11062    /var/run/dovecot/dict-server
unix  2      [ ACC ]    STREAM    LISTENING    11064    /var/run/dovecot/login/default
unix  2      [ ACC ]    STREAM    LISTENING    8149    @/var/run/hald/dbus-FiB1DkXGgU
unix  2      [ ACC ]    STREAM    LISTENING    8148    @/var/run/hald/dbus-8wWHaUkhPu
unix  2      [ ACC ]    STREAM    LISTENING    11069    /var/run/dovecot/auth-worker.4121


bluenix 12-02-2008 05:00 AM

It's in the logs again today... What about those "cannot access LDAP when not root" messages? Why do they sometimes show up?

Code:


Dec  2 09:03:52 server1 smbd[21238]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
Dec  2 09:03:52 server1 smbd[21281]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
Dec  2 09:03:53 server1 smbd[21206]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
Dec  2 09:03:53 server1 smbd[21316]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
Dec  2 09:03:53 server1 smbd[21242]:  failed to bind to server ldap://127.0.0.1/ with dn="cn=admin,dc=domain" Error: Can't contact LDAP server
Dec  2 09:03:53 server1 smbd[21281]:  smbldap_open: cannot access LDAP when not root..
Dec  2 09:04:00 server1 smbd[21281]:  smbldap_open: cannot access LDAP when not root..
Dec  2 10:43:04 server1 smbd[24916]:  smbldap_open: cannot access LDAP when not root..
Dec  2 10:43:04 server1 smbd[24916]:  smbldap_open: cannot access LDAP when not root..


irishbitte 12-04-2008 04:04 PM

Hmmm. When you were asked the question

"Allow local root to be LDAP admin"

on the LDAP client side, what did you answer?

Also, did you store the LDAP root / admin password on the LDAP client? It needs to be there; otherwise the LDAP tree must allow binding without admin rights.

rock and hard place...

bluenix 12-05-2008 04:58 AM

Quote:

Originally Posted by irishbitte (Post 3364709)
Hmmm. When you were asked the question

"Allow local root to be LDAP admin"

on the LDAP client side, what did you answer?

'Yes'

Quote:

Also, did you store the LDAP root / admin password on the LDAP client? It needs to be there; otherwise the LDAP tree must allow binding without admin rights.

rock and hard place...

It's there. The strange thing is that this whole setup worked for a couple of weeks and then suddenly started behaving like this, even though I didn't really change anything around that time...

The only thing I did change around then was replacing the DNS servers in squid.conf (proxy) with those of OpenDNS...


All times are GMT -5.