LinuxQuestions.org
Old 08-11-2020, 05:35 AM   #1
mozer
LQ Newbie
 
Registered: Sep 2013
Posts: 28

Rep: Reputation: Disabled
Corosync/pacemaker fencing issue


Hello all,

I've created a Corosync/Pacemaker cluster on CentOS 8 with three VMware nodes. Everything runs as expected: I configured a floating IP among them, and it responds well.

Quote:
pcs status,

Full List of Resources:
* Cluster_VIP (ocf::heartbeat:IPaddr2): Started esba-cl-test-01
Now I want to set up fencing for the cluster, following this (among others).

I used the following command:

Quote:
pcs stonith create vmfence fence_vmware_rest pcmk_host_map="esba-cl-test-01:vm-1;esba-cl-test-02:vm-2;esba-cl-test-03:vm-3" ipaddr=192.168.25.45 ssl=1 login=XXXX passwd=XXXX ssl_insecure=1
Everything goes well, no errors

Quote:
Full List of Resources:
* Cluster_VIP (ocf::heartbeat:IPaddr2): Started esba-cl-test-01
* vmfence (stonith:fence_vmware_rest): Started esba-cl-test-02
Problem comes when I try to test the fencing

Quote:
stonith_admin --reboot esba-cl-test-03
I receive

Quote:
pacemaker-fenced[9403]: error: Operation 'reboot' targeting esba-cl-test-03 on <no-one> for stonith_admin.745607@esba-cl-test-01.8e1c6371: No route to host
This problem has been reported in the Red Hat forums, but unfortunately I don't have an account there, so I can't check the solution.

I don't have DNS configured; the servers resolve each other through the hosts file, but this should be enough.

Has anyone encountered this problem?
Can anyone please help?


Thanks

Last edited by mozer; 08-11-2020 at 05:40 AM.
 
Old 08-23-2020, 05:12 PM   #2
tshikose
Member
 
Registered: Apr 2010
Location: Kinshasa, Democratic Republic of Congo
Distribution: RHEL, Fedora, CentOS
Posts: 525

Rep: Reputation: 95
Hi,

The following is taken from the link you provided above.
I have split it across several posts so that no single post gets too long.


Environment

Red Hat Enterprise Linux (RHEL) 7 Update 5
Red Hat Enterprise Linux (RHEL) 8
Pacemaker High Availability or Resilient Storage Add On
VMware vSphere version 6.5 and above.
 
Old 08-23-2020, 05:14 PM   #3
tshikose
Member
 
Registered: Apr 2010
Location: Kinshasa, Democratic Republic of Congo
Distribution: RHEL, Fedora, CentOS
Posts: 525

Rep: Reputation: 95
Resolution

Assume the following cluster architecture:
The cluster node hostnames are node1 and node2.
The cluster node names as seen by the VMware hypervisor (ESXi/vCenter) are node1-vm and node2-vm.
<ESXi/vCenter IP address> is the IP address of the VMware hypervisor that manages the cluster node VMs.

First, check whether the cluster node can reach the hypervisor and list the VMs on it. The following commands connect to the hypervisor with the provided credentials, list the matching machines, and query the status of one of them.

Code:
     # fence_vmware_rest -a <ESXi/vCenter IP address> -l <esxi_username> -p <esxi_password> --ssl-insecure -z -o list | egrep "(node1-vm|node2-vm)"
     node1-vm,
     node2-vm,
     # fence_vmware_rest -a <ESXi/vCenter IP address> -l <esxi_username> -p <esxi_password> --ssl-insecure -z -o status -n node1-vm
     Status: ON
If the list command above fails, make sure the following are true:
The node is able to communicate with ESXi/vCenter on port 443/tcp (when using SSL) or on port 80/tcp (without SSL).
The user has permissions on ESXi/vCenter for fencing.
The ESXi/vCenter has a trusted SSL certificate. If the certificate cannot be trusted, check the solution on how to relax some SSL checks.
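
The first point of the checklist can be tested from a cluster node without any extra tools. The snippet below is only a sketch: check_tcp_port is a made-up helper name (it is not part of fence-agents), and it assumes bash and the coreutils timeout command are available on the node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a TCP port on a host accepts connections.
# It relies on bash's built-in /dev/tcp redirection, so no extra packages are needed.
check_tcp_port() {
    local host=$1 port=$2 wait_secs=${3:-5}
    # timeout(1) bounds the attempt so a filtered port does not hang the shell.
    if timeout "$wait_secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null; then
        echo "reachable: $host:$port"
    else
        echo "unreachable: $host:$port"
        return 1
    fi
}
```

It would be called as check_tcp_port <ESXi/vCenter IP address> 443 on every cluster node, since any node may be asked to run the fence agent. An "unreachable" result points at routing or firewall problems rather than at the cluster configuration.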
 
Old 08-23-2020, 05:15 PM   #4
tshikose
Member
 
Registered: Apr 2010
Location: Kinshasa, Democratic Republic of Congo
Distribution: RHEL, Fedora, CentOS
Posts: 525

Rep: Reputation: 95
If the command succeeded, the node is able to communicate with the hypervisor. The stonith device should be configured using the same options that were tested in the listing. Some arguments have slightly different names in the fence_vmware_rest command and in the fence_vmware_rest fencing agent in Pacemaker.
For this reason, check the help pages of both the fence_vmware_rest command and the fence_vmware_rest fencing agent. (The diagnostics section has a shortened listing of the options used by this solution.)

Create the stonith device using the command below. The pcmk_host_map attribute maps each node's hostname as seen by the cluster to the name of its virtual machine as seen on the VMware hypervisor.

In each pcmk_host_map entry, the first value is the cluster node name as seen in the /etc/corosync/corosync.conf file, and the second value, after the colon, is the node's name as seen by the VMware hypervisor. Entries are separated by semicolons.

Code:
    # cat /etc/corosync/corosync.conf
    [...]
    nodelist {
        node {
            ring0_addr: node1  <<<=== Cluster node name
            nodeid: 1
        }

        node {
            ring0_addr: node2
            nodeid: 2
        }
    }

    # pcs stonith create vmfence fence_vmware_rest pcmk_host_map="node1:node1-vm;node2:node2-vm" ipaddr=<ESXi/vCenter IP address> ssl=1 login=<esxi_username> passwd=<esxi_password> ssl_insecure=1
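
To double-check a pcmk_host_map value before passing it to pcs, it can help to print it one entry per line. This is a minimal sketch: print_host_map is a made-up name, and it only handles the semicolon-separated node:vm form used in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one "cluster-node -> VM-name" line per entry
# of a pcmk_host_map value (entries split on ';', each entry split on ':').
print_host_map() {
    local map=$1 entry
    local -a entries
    IFS=';' read -r -a entries <<< "$map"
    for entry in "${entries[@]}"; do
        [ -n "$entry" ] || continue
        case $entry in
            *:*) ;;
            *) echo "malformed entry: $entry" >&2; return 1 ;;
        esac
        # Text before the first ':' is the cluster node, the rest is the VM name.
        printf '%s -> %s\n' "${entry%%:*}" "${entry#*:}"
    done
}

print_host_map "node1:node1-vm;node2:node2-vm"
```

The left-hand names should match the ring0_addr values in corosync.conf, and the right-hand names should match what the hypervisor reports.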
To check the status of stonith device and its configuration use the commands below.

Code:
    # pcs stonith show
    Full list of resources:
    vmfence (stonith:fence_vmware_rest):    Started node1

    # pcs stonith show vmfence --full
     Resource: vmfence (class=stonith type=fence_vmware_rest)
      Attributes: pcmk_host_map=node1:node1-vm;node2:node2-vm ipaddr=<ESXi/vCenter IP address> ssl=1 login=<esxi_username> passwd=<esxi_password> ssl_insecure=1
Once the stonith device has started, proceed with proper testing of fencing in the cluster.
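
Before testing, one cheap sanity check is to confirm that every VM name on the right-hand side of pcmk_host_map actually shows up in the "-o list" output. The sketch below assumes the list output has the VM name before the first comma on each line, as in the listing earlier in this thread; missing_vms is a made-up helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "-o list" output on stdin and report every VM name
# from the given pcmk_host_map value that does not appear in it.
missing_vms() {
    local map=$1 entry vm listed status=0
    local -a entries
    # Keep only the text before the first comma on each line, trimmed of blanks.
    listed=$(sed -e 's/,.*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
    IFS=';' read -r -a entries <<< "$map"
    for entry in "${entries[@]}"; do
        vm=${entry#*:}
        # -F fixed string, -x whole-line match, -q quiet
        if ! grep -Fxq -- "$vm" <<< "$listed"; then
            echo "not listed: $vm"
            status=1
        fi
    done
    return "$status"
}
```

It is meant to sit at the end of a pipe, for example fence_vmware_rest -a <ESXi/vCenter IP address> -l <esxi_username> -p <esxi_password> --ssl-insecure -z -o list | missing_vms "node1:node1-vm;node2:node2-vm". No output and a zero exit status mean every mapped VM name was found.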

Additional notes and recommendations:

Make sure the package fence-agents-4.0.11-86.el7 or later is installed; it provides the new fence_vmware_rest agent.
fence_vmware_rest works with VMware vSphere version 6.5 or higher.
Please refer to the following link for the support policies of fence_vmware_rest.
Once configured, it is highly recommended to test the fence functionality.
The fence agent fence_vmware_soap causes CPU usage to spike.
There is a known limitation imposed by the VMware Rest API of 1000 VMs: fence_vmware_rest monitor fails with error: "Exception: 400: Too many virtual machines. Add more filter criteria to reduce the number."
 
Old 08-23-2020, 05:16 PM   #5
tshikose
Member
 
Registered: Apr 2010
Location: Kinshasa, Democratic Republic of Congo
Distribution: RHEL, Fedora, CentOS
Posts: 525

Rep: Reputation: 95
A final note: since I am not a VM user myself, I cannot help beyond copying and pasting as I did.
I hope it will help.
 
  



Tags
corosync, pacemaker

