I am running a home server on Hardened Gentoo. I have set up a hardware watchdog (Intel TCO on an 82801 (ICH6) chipset), and it works. I am using the watchdog utility from Portage.
The watchdog functions correctly, but when I run "/etc/init.d/watchdog stop" or "/etc/init.d/watchdog start", it just hangs at "Stopping watchdog ...".
Additionally, I would like to set up a repair script, but I can't find any documentation on the error codes it would have to handle, and my search for an example script turned up nothing. According to the Gentoo Wiki article I was reading, an example should be installed somewhere along with watchdog, but I have been unable to locate it.
My watchdog options, from /etc/conf.d/watchdog, are "-v -b", but I do not see any messages in dmesg. I have not yet investigated whether I should expect any there.
Here is my watchdog.conf:
Code:
#ping = 172.31.14.1
#ping = 172.26.1.255
#interface = eth0
#file = /var/log/messages
#change = 1407
# Uncomment to enable test. Setting one of these values to '0' disables it.
# These values will hopefully never reboot your machine during normal use
# (if your machine is really hung, the loadavg will go much higher than 25)
max-load-1 = 10
max-load-5 = 5
max-load-15 = 2
# Note that this is the number of pages!
# To get the real size, check how large the pagesize is on your machine.
#min-memory = 1
#repair-binary = /usr/sbin/repair
#test-binary =
#test-timeout =
watchdog-device = /dev/watchdog
# Defaults compiled into the binary
#temperature-device =
#max-temperature = 120
# Defaults compiled into the binary
admin = <my email here>
interval = 10
#logtick = 1
# This greatly decreases the chance that watchdog won't be scheduled before
# your machine is really loaded
#realtime = yes
#priority = 1
# Check if syslogd is still running by enabling the following line
#pidfile = /var/run/syslogd.pid
pidfile = /var/run/sshd.pid
pidfile = /var/run/apache2.pid
pidfile = /var/run/dhcpcd-eth0.pid
pidfile = /var/run/ddclient/ddclient.pid
Does anyone have an idea about my hanging problem, or a link to documentation on the repair program?
Thank you for any help.
EDIT:
I have found that uptime reports a load average of 1.00 (it's a single-processor system) during what should be a reboot, which "Stopping watchdog ..." is blocking. According to top, runscript is using 99.9% CPU. I diagnosed this over SSH. (I had set watchdog to run in the default runlevel and to start after the processes it monitors; I am changing that now.)
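To rule out the load thresholds as the cause, here is a rough sketch that compares a /proc/loadavg-style line against the max-load-1, max-load-5 and max-load-15 values from my watchdog.conf above. check_load is a hypothetical helper, not something shipped with watchdog, and the strict greater-than comparison is my assumption about how the daemon evaluates the limits.

```shell
#!/bin/sh
# Hypothetical helper (not part of the watchdog package).
# $1 = a line in /proc/loadavg format
# $2, $3, $4 = thresholds for the 1-, 5- and 15-minute averages
# Prints "over" if any average exceeds its threshold, otherwise "ok".
# Assumption: the limits are compared with a strict "greater than".
check_load() {
    printf '%s\n' "$1" | awk -v m1="$2" -v m5="$3" -v m15="$4" '{
        if ($1 > m1 || $2 > m5 || $3 > m15)
            print "over"
        else
            print "ok"
    }'
}

# Example with a made-up load line where all three averages are 1.00,
# checked against max-load-1 = 10, max-load-5 = 5, max-load-15 = 2:
check_load "1.00 1.00 1.00 1/120 4321" 10 5 2
```

On the live system the first argument would be "$(cat /proc/loadavg)". With a load of 1.00 against these limits the sketch prints "ok", so as far as I can tell the load settings are not what is triggering anything here.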