IMNOboist
02-13-2012 03:20 PM
Xen cluster (drbd, ocfs2, pacemaker) auto-recover problem
I have a Xen cluster running DRBD, OCFS2 and Pacemaker on Ubuntu 11.10. I can live-migrate a VM from node1 to node2 and vice versa without any problems, but if I pull the power cord on the node running the VM, the VM fails to start on the other node.
Here is what crm_mon shows after the failure:
Code:
============
Last updated: Mon Feb 13 13:06:20 2012
Stack: openais
Current DC: clutest2 - partition WITHOUT quorum
Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
5 Resources configured.
============
Online: [ clutest2 ]
OFFLINE: [ clutest1 ]
 Master/Slave Set: ms_drbd_master [p_drbd]
     Masters: [ clutest2 ]
     Stopped: [ p_drbd:0 ]
 Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
     Started: [ clutest2 ]
     Stopped: [ g_ocfs2mgmt:0 ]
 Clone Set: cl_fs_ocfs2 [p_fs_ocfs2]
     Started: [ clutest2 ]
     Stopped: [ p_fs_ocfs2:0 ]
 p_xen-lolwut (ocf::heartbeat:Xen): Started clutest2 (unmanaged) FAILED

Failed actions:
    p_xen-lolwut_stop_0 (node=clutest2, call=-1, rc=1, status=Timed Out): unknown error
    p_xen-lolwut_start_0 (node=clutest2, call=-1, rc=1, status=Timed Out): unknown error
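In case anyone wants to pull the failed actions out of crm_mon output like the above (for logging or comparing runs), here is a minimal Python sketch. The function name parse_failed_actions and the regular expression are only an illustration, not part of Pacemaker, and the pattern assumes nothing beyond the line layout shown in this post.

```python
import re

# Matches one "Failed actions" entry in the layout shown above, e.g.
#   p_xen-lolwut_stop_0 (node=clutest2, call=-1, rc=1, status=Timed Out): unknown error
# Assumption: other Pacemaker versions may print these lines differently.
FAILED_RE = re.compile(
    r"^\s*(?P<op>\S+)\s+\(node=(?P<node>[^,]+),\s*call=(?P<call>-?\d+),\s*"
    r"rc=(?P<rc>-?\d+),\s*status=(?P<status>[^)]+)\):\s*(?P<reason>.*)$"
)


def parse_failed_actions(text):
    """Return one dict per failed-action line found in crm_mon text output."""
    failures = []
    for line in text.splitlines():
        match = FAILED_RE.match(line)
        if match is None:
            continue  # status lines, headers, resource lines, etc.
        entry = match.groupdict()
        entry["call"] = int(entry["call"])
        entry["rc"] = int(entry["rc"])
        failures.append(entry)
    return failures


# The relevant lines copied from the crm_mon output in this post.
SAMPLE = """\
p_xen-lolwut (ocf::heartbeat:Xen): Started clutest2 (unmanaged) FAILED
Failed actions:
p_xen-lolwut_stop_0 (node=clutest2, call=-1, rc=1, status=Timed Out): unknown error
p_xen-lolwut_start_0 (node=clutest2, call=-1, rc=1, status=Timed Out): unknown error
"""

if __name__ == "__main__":
    for failure in parse_failed_actions(SAMPLE):
        print(failure["op"], failure["node"], failure["status"])
```

Run against the output above, this picks up the two timed-out operations (stop and start of p_xen-lolwut on clutest2) and skips the resource status line.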
Here is the output from crm configure show:
Code:
node clutest1
node clutest2
primitive p_controld ocf:pacemaker:controld
primitive p_drbd ocf:linbit:drbd \
    params drbd_resource="r0" \
    operations $id="op_drbd" \
    op monitor interval="20" role="Master" timeout="20" \
    op monitor interval="30" role="Slave" timeout="20" \
    meta target-role="started"
primitive p_fs_ocfs2 ocf:heartbeat:Filesystem \
    params device="/dev/drbd/by-res/r0" directory="/domains" fstype="ocfs2" options="rw,noatime"
primitive p_o2cb ocf:pacemaker:o2cb
primitive p_xen-lolwut ocf:heartbeat:Xen \
    params xmfile="/domains/lolwut/lolwut.cfg" \
    op monitor interval="10s" \
    meta target-role="Started" allow-migrate="true"
primitive xen-domutest ocf:heartbeat:Xen \
    params xmfile="/domains/domutest/domutest.cfg" \
    op monitor interval="10s" \
    meta target-role="Stopped" allow-migrate="true"
group g_ocfs2mgmt p_controld p_o2cb
ms ms_drbd_master p_drbd \
    meta resource-stickiness="100" master-max="2" clone-max="2" notify="true" interleave="true"
clone cl_fs_ocfs2 p_fs_ocfs2
clone cl_ocfs2mgmt g_ocfs2mgmt \
    meta interleave="true"
colocation c_lolwut_fs inf: p_xen-lolwut cl_fs_ocfs2
colocation c_ocfs2 inf: cl_fs_ocfs2 cl_ocfs2mgmt ms_drbd_master:Master
order o_lolwut-after-fs inf: cl_fs_ocfs2:start p_xen-lolwut:start
order o_ocfs2 0: ms_drbd_master:promote cl_ocfs2mgmt:start cl_fs_ocfs2:start
property $id="cib-bootstrap-options" \
    dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    no-quorum-policy="ignore" \
    stonith-enabled="false" \
    default-resource-stickiness="1000"
Any help would be appreciated!