LinuxQuestions.org
02-13-2012, 04:20 PM   #1
IMNOboist
Member
 
Registered: Nov 2003
Location: Northern Utah
Distribution: OpenBSD, Ubuntu, Linux Mint, Knoppix
Posts: 56

Rep: Reputation: 16
Xen cluster (drbd, ocfs2, pacemaker) auto-recover problem


I have a Xen cluster running DRBD, OCFS2 and Pacemaker on Ubuntu 11.10. I can live-migrate a VM from node1 to node2 and vice versa without any problems, but if I pull the power cord on the node running the VM, the VM fails to start on the other node.

Here is what crm_mon shows after the failure:
Code:
============
Last updated: Mon Feb 13 13:06:20 2012
Stack: openais
Current DC: clutest2 - partition WITHOUT quorum
Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
5 Resources configured.
============

Online: [ clutest2 ]
OFFLINE: [ clutest1 ]

 Master/Slave Set: ms_drbd_master [p_drbd]
     Masters: [ clutest2 ]
     Stopped: [ p_drbd:0 ]
 Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
     Started: [ clutest2 ]
     Stopped: [ g_ocfs2mgmt:0 ]
 Clone Set: cl_fs_ocfs2 [p_fs_ocfs2]
     Started: [ clutest2 ]
     Stopped: [ p_fs_ocfs2:0 ]
p_xen-lolwut    (ocf::heartbeat:Xen):   Started clutest2 (unmanaged) FAILED

Failed actions:
    p_xen-lolwut_stop_0 (node=clutest2, call=-1, rc=1, status=Timed Out): unknown error
    p_xen-lolwut_start_0 (node=clutest2, call=-1, rc=1, status=Timed Out): unknown error
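
For comparing runs, the "Failed actions" entries above can be pulled out of saved crm_mon output. This is only a minimal sketch: the function name `parse_failed_actions` and the regular expression are illustrative assumptions based on the line format shown in this post, not part of Pacemaker.

```python
import re
from typing import Dict, List

# Matches one "Failed actions" entry in the format shown above, e.g.
#   p_xen-lolwut_stop_0 (node=clutest2, call=-1, rc=1, status=Timed Out): unknown error
# The key before the parenthesis is <resource>_<operation>_<interval>.
FAILED_ACTION = re.compile(
    r"^\s*(?P<resource>\S+)_(?P<operation>[a-z]+)_(?P<interval>\d+) "
    r"\(node=(?P<node>[^,]+), call=(?P<call>-?\d+), "
    r"rc=(?P<rc>-?\d+), status=(?P<status>[^)]+)\): (?P<message>.*)$"
)


def parse_failed_actions(crm_mon_text: str) -> List[Dict[str, object]]:
    """Return one dict per failed-action line found in saved crm_mon output."""
    failures: List[Dict[str, object]] = []
    for line in crm_mon_text.splitlines():
        match = FAILED_ACTION.match(line)
        if match is None:
            continue  # header, resource status, or blank line
        entry: Dict[str, object] = dict(match.groupdict())
        # Numeric fields are easier to compare as integers.
        for field in ("interval", "call", "rc"):
            entry[field] = int(match.group(field))
        failures.append(entry)
    return failures


if __name__ == "__main__":
    sample = (
        "Failed actions:\n"
        "    p_xen-lolwut_stop_0 (node=clutest2, call=-1, rc=1, "
        "status=Timed Out): unknown error\n"
    )
    for failure in parse_failed_actions(sample):
        print(failure["resource"], failure["operation"], failure["status"])
```

Lines that do not look like a failed action (the header block, resource status lines) are skipped rather than reported as errors.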
Here is the output from crm configure show:
Code:
node clutest1
node clutest2
primitive p_controld ocf:pacemaker:controld
primitive p_drbd ocf:linbit:drbd \
        params drbd_resource="r0" \
        operations $id="op_drbd" \
        op monitor interval="20" role="Master" timeout="20" \
        op monitor interval="30" role="Slave" timeout="20" \
        meta target-role="started"
primitive p_fs_ocfs2 ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/r0" directory="/domains" fstype="ocfs2" options="rw,noatime"
primitive p_o2cb ocf:pacemaker:o2cb
primitive p_xen-lolwut ocf:heartbeat:Xen \
        params xmfile="/domains/lolwut/lolwut.cfg" \
        op monitor interval="10s" \
        meta target-role="Started" allow-migrate="true"
primitive xen-domutest ocf:heartbeat:Xen \
        params xmfile="/domains/domutest/domutest.cfg" \
        op monitor interval="10s" \
        meta target-role="Stopped" allow-migrate="true"
group g_ocfs2mgmt p_controld p_o2cb
ms ms_drbd_master p_drbd \
        meta resource-stickiness="100" master-max="2" clone-max="2" notify="true" interleave="true"
clone cl_fs_ocfs2 p_fs_ocfs2
clone cl_ocfs2mgmt g_ocfs2mgmt \
        meta interleave="true"
colocation c_lolwut_fs inf: p_xen-lolwut cl_fs_ocfs2
colocation c_ocfs2 inf: cl_fs_ocfs2 cl_ocfs2mgmt ms_drbd_master:Master
order o_lolwut-after-fs inf: cl_fs_ocfs2:start p_xen-lolwut:start
order o_ocfs2 0: ms_drbd_master:promote cl_ocfs2mgmt:start cl_fs_ocfs2:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        no-quorum-policy="ignore" \
        stonith-enabled="false" \
        default-resource-stickiness="1000"
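
One thing that may be relevant to the hang: the configuration above combines `stonith-enabled="false"` with a DLM resource (`ocf:pacemaker:controld`) and dual-primary DRBD (`master-max="2"`). The DLM generally expects a failed node to be fenced before it releases locks, so this combination is worth checking. The sketch below only flags that combination in saved `crm configure show` output. It is an assumption about a possible cause, not a confirmed diagnosis, and `fencing_warnings` is a hypothetical helper name.

```python
import re
from typing import List

# name="value" pairs as they appear in `crm configure show` output.
PAIR = re.compile(r'([\w-]+)="([^"]*)"')


def fencing_warnings(config_text: str) -> List[str]:
    """Flag fencing-related settings in saved `crm configure show` output."""
    # If a name appears more than once, the last occurrence wins; the names
    # checked here (stonith-enabled, master-max) appear once in the config above.
    settings = dict(PAIR.findall(config_text))
    uses_dlm = "ocf:pacemaker:controld" in config_text
    dual_primary = settings.get("master-max") == "2"
    # Pacemaker enables STONITH by default when the property is not set.
    stonith_off = settings.get("stonith-enabled", "true").lower() == "false"

    warnings: List[str] = []
    if stonith_off and uses_dlm:
        warnings.append(
            "stonith-enabled is false, but a DLM (controld) resource is configured"
        )
    if stonith_off and dual_primary:
        warnings.append(
            "stonith-enabled is false, but DRBD is dual-primary (master-max=2)"
        )
    return warnings


if __name__ == "__main__":
    sample = (
        "primitive p_controld ocf:pacemaker:controld\n"
        'ms ms_drbd_master p_drbd meta master-max="2" clone-max="2"\n'
        'property stonith-enabled="false"\n'
    )
    for warning in fencing_warnings(sample):
        print("WARNING:", warning)
```

The check is purely textual; it does not contact the cluster or validate the rest of the configuration.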
Any help would be appreciated!

Last edited by IMNOboist; 02-13-2012 at 04:21 PM. Reason: Specified "Live" migration
 
02-13-2012, 05:53 PM   #2
IMNOboist
It looks like the problem may lie in one of the components of the OCFS2 stack. When I try to cd into /domains (the mount point of the OCFS2 filesystem running on DRBD), the terminal hangs.
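
To test whether the mount point responds without tying up the terminal, the access can be run in a child process with a timeout. This is a minimal sketch; `probe_command` and `probe_path` are hypothetical helper names, and the 5-second default timeout is an arbitrary assumption.

```python
import subprocess
import sys
from typing import List


def probe_command(argv: List[str], timeout: float = 5.0) -> str:
    """Run argv in a child process and return "ok", "error" or "hung"."""
    child = subprocess.Popen(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        returncode = child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Best effort only: a process blocked in uninterruptible I/O may not
        # die when killed, so deliberately do not wait for it a second time.
        child.kill()
        return "hung"
    return "ok" if returncode == 0 else "error"


def probe_path(path: str, timeout: float = 5.0) -> str:
    """Check whether a stat() of path completes within timeout seconds."""
    # The stat runs in a separate interpreter so that a blocked filesystem
    # call cannot block the calling process.
    stat_script = "import os, sys; os.stat(sys.argv[1])"
    return probe_command([sys.executable, "-c", stat_script, path], timeout)


if __name__ == "__main__":
    # /domains is the OCFS2 mount point from the configuration in this thread.
    print("/domains:", probe_path("/domains"))
```

A result of "hung" only says that the stat did not return in time; it does not say which layer (DRBD, the DLM, or OCFS2 itself) is blocking.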
 
02-14-2012, 01:25 PM   #3
IMNOboist
When the node that is running the VM (in this case, clutest2) is shut down with halt -p, the VM tries to migrate to the other node, but the migration fails and the cluster status shows this:
Code:
============
Last updated: Tue Feb 14 10:20:39 2012
Stack: openais
Current DC: clutest2 - partition with quorum
Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
5 Resources configured.
============

Online: [ clutest1 clutest2 ]

 Master/Slave Set: ms_drbd_master [p_drbd]
     Masters: [ clutest1 clutest2 ]
 Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
     Started: [ clutest1 clutest2 ]
 Clone Set: cl_fs_ocfs2 [p_fs_ocfs2]
     p_fs_ocfs2:1       (ocf::heartbeat:Filesystem):    Started clutest2 (unmanaged) FAILED
     Started: [ clutest1 ]
p_xen-lolwut    (ocf::heartbeat:Xen):   Started clutest1

Failed actions:
    p_fs_ocfs2:1_stop_0 (node=clutest2, call=103, rc=-2, status=Timed Out): unknown exec error
    p_o2cb:1_stop_0 (node=clutest2, call=105, rc=1, status=complete): unknown error
The VM then starts on the other node (it did not live-migrate; it shut down and then started up), but the node that received the halt command never shuts down. It just hangs.

However, if the server that isn't running the VM is shut down, the VM continues humming away on the remaining node without a problem.

Still not sure what's going on...
 
  


All times are GMT -5.
