First, please excuse my poor English.
I have a two-node cluster running RHEL with Pacemaker.
Recently, one of the cluster nodes hung (it was still pingable, but the KVM console showed a black screen and SSH was not possible). The other, still-active node was serving the MySQL service, but users reported that the application couldn't access the database.
I had no choice but to reboot the hung system (to get the console back). After the reboot, the cluster returned to normal.
Lots of ERROR messages appeared (every second) in the messages log file on the affected cluster node, as shown below:
ERROR: Message hist queue is filling up (500 messages in queue)
WARN: Gmain_timeout_dispatch: Dispatch function for send_reqnodes_msg took too long to execute: 240 ms (> 100 ms) (GSource: 0x9c3f660)
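In case it helps with the diagnosis, here is a rough Python sketch of how these two kinds of lines could be tallied from the log. The function name summarize and the regular expressions are my own assumptions, based only on the two sample lines above; they are not part of Pacemaker or any cluster tool.

```python
import re
from collections import Counter

# Patterns are assumptions derived from the two sample log lines quoted above.
QUEUE_RE = re.compile(r"Message hist queue is filling up \((\d+) messages in queue\)")
SLOW_RE = re.compile(
    r"Dispatch function for (\w+) took too long to execute: (\d+) ms \(> (\d+) ms\)"
)


def summarize(lines):
    """Count queue-full errors and slow-dispatch warnings in an iterable of log lines."""
    queue_sizes = []      # queue length reported by each "filling up" error
    slow_counts = Counter()  # number of slow-dispatch warnings per function name
    worst_ms = {}         # longest reported dispatch time per function name

    for line in lines:
        match = QUEUE_RE.search(line)
        if match:
            queue_sizes.append(int(match.group(1)))
            continue
        match = SLOW_RE.search(line)
        if match:
            name, took_ms = match.group(1), int(match.group(2))
            slow_counts[name] += 1
            worst_ms[name] = max(worst_ms.get(name, 0), took_ms)

    return {
        "queue_errors": len(queue_sizes),
        "max_queue": max(queue_sizes, default=0),
        "slow_dispatch_counts": dict(slow_counts),
        "worst_dispatch_ms": worst_ms,
    }
```

It could be run over the log by passing an open file object, for example summarize(open(path)) where path is wherever the messages log lives on the node, to see how often each message occurred and how slow the worst dispatch was.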
Could anyone advise what went wrong on my system?
Also, how do the above messages relate to the hang?
Your advice is highly appreciated. Thanks.