LinuxQuestions.org
06-12-2019, 06:35 AM   #1
topiyobol
Pacemaker/Corosync Cluster Monitoring Action Keeps Failing (Wildfly)


Hi all,

I have a 2-node cluster without STONITH based on the following software:
Ubuntu 18.04.1 LTS
Pacemaker 1.1.18
Corosync Cluster Engine, version '2.4.3'

It is not the first cluster I have built, but it is the first one based on Ubuntu 18.04 (so far I've been working with 16.04).

The following resources are configured: DRBD storage, a virtual IP, a database (postgres), Apache, and a Wildfly server.

Everything works as expected, except that the Wildfly service restarts quite frequently (roughly once every 1 to 5 days), and I see this error message in the crm_mon output:
Migration Summary:
* Node test-node2:
res_wildfly: migration-threshold=1000000 fail-count=16 last-failure='Sat Jun 8 06:55:20 2019'
* Node test-node1:

Failed Actions:
* res_wildfly_monitor_30000 on test-node2 'unknown error' (1): call=306, status=complete, exitreason='',
last-rc-change='Sat Jun 8 06:55:20 2019', queued=0ms, exec=0ms


The corosync log doesn't reveal much more:
Jun 08 06:55:20 [882] test-node2 crmd: info: process_lrm_event: Result of monitor operation for res_wildfly on test-node2: 1 (unknown error) | call=306 key=res_wildfly_monitor_30000 confirmed=false cib-update=2815
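
To line such entries up with other events on the node, the log line above could be pulled apart with something like the following minimal Python sketch. The function names are hypothetical, and the regular expression is fitted only to the single line shown here, so other Pacemaker versions or log formats may need a different pattern.

```python
import re

# Pattern fitted to the one crmd line quoted above (an assumption:
# other log formats may differ and simply will not match).
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3} \d{2} \d{2}:\d{2}:\d{2}) .*?"
    r"Result of (?P<op>\w+) operation for (?P<resource>\S+) "
    r"on (?P<node>[^:\s]+): "
    r"(?P<rc>\d+) \((?P<reason>[^)]*)\)"
)


def parse_lrm_result(line):
    """Return the fields of a 'Result of ... operation' line, or None."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["rc"] = int(fields["rc"])
    return fields


def nonzero_results(lines):
    """Keep only the parsed results whose return code is not 0."""
    parsed = (parse_lrm_result(line) for line in lines)
    return [fields for fields in parsed if fields and fields["rc"] != 0]
```

Feeding the lines of the log file through nonzero_results() would give a list of dictionaries with the timestamp, operation, resource, node, return code and reason for each non-zero result.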

---------
The resource is configured with the systemd resource class.

Has anyone else experienced this problem, or does anyone have an idea what could be causing it? Wildfly itself appears to run stably, but the cluster manager restarts it because the monitor operation apparently fails. Disabling the monitoring is not an option, because then we would not notice if the Wildfly service became unavailable.
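
One way to narrow this down might be to record the state of the unit independently of the cluster (for example from cron) and compare the timestamps with the failed monitor operations. Below is a minimal sketch of that idea, assuming the unit is named wildfly.service (the real unit name is not shown in this post) and that Python 3.6 or later is available. The helper names are hypothetical; the only external command used is `systemctl is-active`, which prints the unit state and exits with 0 when the unit is active.

```python
import subprocess
from datetime import datetime

# Assumption: the actual unit name may differ on the cluster nodes.
UNIT = "wildfly.service"


def unit_state(unit, run=subprocess.run):
    """Return (exit code, state text) reported by 'systemctl is-active'."""
    result = run(
        ["systemctl", "is-active", unit],
        stdout=subprocess.PIPE,
        universal_newlines=True,  # text mode; works on Python 3.6
    )
    return result.returncode, result.stdout.strip()


def log_line(unit, run=subprocess.run, now=datetime.now):
    """Build one timestamped line describing the current unit state."""
    rc, state = unit_state(unit, run)
    stamp = now().isoformat(timespec="seconds")
    return "{} {} rc={} state={}".format(stamp, unit, rc, state)
```

Appending the output of log_line(UNIT) to a file every minute would give an independent record. If that record shows the unit as active at the moments the cluster reports a failed monitor, the problem is more likely in the monitoring path than in Wildfly itself.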

Let me know if any additional logs or configuration details would help clarify the situation.

Thank you in advance.
 
  

