| Linux - Server This forum is for the discussion of Linux Software used in a server related context. |
02-13-2012, 03:20 PM
|
#1
|
|
Member
Registered: Nov 2003
Location: Northern Utah
Distribution: OpenBSD, Ubuntu, Linux Mint, Knoppix
Posts: 54
Rep:
|
Xen cluster (drbd, ocfs2, pacemaker) auto-recover problem
I have a Xen cluster running DRBD, OCFS2 and Pacemaker on Ubuntu 11.10. I can live migrate a VM from node1 to node2 and vice versa without any problems, but if I pull the power cord on the node running the VM, the VM fails to start on the other node.
Here is what crm_mon shows after the failure:
Code:
============
Last updated: Mon Feb 13 13:06:20 2012
Stack: openais
Current DC: clutest2 - partition WITHOUT quorum
Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
5 Resources configured.
============
Online: [ clutest2 ]
OFFLINE: [ clutest1 ]
 Master/Slave Set: ms_drbd_master [p_drbd]
     Masters: [ clutest2 ]
     Stopped: [ p_drbd:0 ]
 Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
     Started: [ clutest2 ]
     Stopped: [ g_ocfs2mgmt:0 ]
 Clone Set: cl_fs_ocfs2 [p_fs_ocfs2]
     Started: [ clutest2 ]
     Stopped: [ p_fs_ocfs2:0 ]
 p_xen-lolwut (ocf::heartbeat:Xen): Started clutest2 (unmanaged) FAILED
Failed actions:
    p_xen-lolwut_stop_0 (node=clutest2, call=-1, rc=1, status=Timed Out): unknown error
    p_xen-lolwut_start_0 (node=clutest2, call=-1, rc=1, status=Timed Out): unknown error
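When comparing several crm_mon captures, it can help to pull the "Failed actions" entries into structured records. The following is a minimal sketch, assuming the entries keep the `resource_op_interval (node=..., call=..., rc=..., status=...): reason` layout shown above. The function name `parse_failed_actions` and the regular expression are illustrative and not part of Pacemaker.

```python
import re

# Pattern for one entry under "Failed actions:" in crm_mon output, e.g.
#   p_xen-lolwut_stop_0 (node=clutest2, call=-1, rc=1, status=Timed Out): unknown error
FAILED_ACTION = re.compile(
    r"^\s*(?P<resource>\S+)_(?P<op>[a-z]+)_(?P<interval>\d+)\s+"
    r"\(node=(?P<node>[^,]+), call=(?P<call>-?\d+), rc=(?P<rc>-?\d+), "
    r"status=(?P<status>[^)]+)\):\s*(?P<reason>.*)$"
)


def parse_failed_actions(crm_mon_text):
    """Return one dict per line that looks like a failed-action entry."""
    actions = []
    for line in crm_mon_text.splitlines():
        match = FAILED_ACTION.match(line)
        if match:
            entry = match.groupdict()
            # Convert the numeric fields so they can be compared as integers.
            entry["rc"] = int(entry["rc"])
            entry["call"] = int(entry["call"])
            actions.append(entry)
    return actions


# Sample copied from the crm_mon output in this post.
sample = """\
Failed actions:
p_xen-lolwut_stop_0 (node=clutest2, call=-1, rc=1, status=Timed Out): unknown error
p_xen-lolwut_start_0 (node=clutest2, call=-1, rc=1, status=Timed Out): unknown error
"""

for action in parse_failed_actions(sample):
    print(action["resource"], action["op"], action["node"], action["status"])
```

Here both the stop and the start of p_xen-lolwut on clutest2 are reported with status "Timed Out", which suggests the operations never completed rather than returning an explicit error.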
Here is the output from crm configure show:
Code:
node clutest1
node clutest2
primitive p_controld ocf:pacemaker:controld
primitive p_drbd ocf:linbit:drbd \
        params drbd_resource="r0" \
        operations $id="op_drbd" \
        op monitor interval="20" role="Master" timeout="20" \
        op monitor interval="30" role="Slave" timeout="20" \
        meta target-role="started"
primitive p_fs_ocfs2 ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/r0" directory="/domains" fstype="ocfs2" options="rw,noatime"
primitive p_o2cb ocf:pacemaker:o2cb
primitive p_xen-lolwut ocf:heartbeat:Xen \
        params xmfile="/domains/lolwut/lolwut.cfg" \
        op monitor interval="10s" \
        meta target-role="Started" allow-migrate="true"
primitive xen-domutest ocf:heartbeat:Xen \
        params xmfile="/domains/domutest/domutest.cfg" \
        op monitor interval="10s" \
        meta target-role="Stopped" allow-migrate="true"
group g_ocfs2mgmt p_controld p_o2cb
ms ms_drbd_master p_drbd \
        meta resource-stickiness="100" master-max="2" clone-max="2" notify="true" interleave="true"
clone cl_fs_ocfs2 p_fs_ocfs2
clone cl_ocfs2mgmt g_ocfs2mgmt \
        meta interleave="true"
colocation c_lolwut_fs inf: p_xen-lolwut cl_fs_ocfs2
colocation c_ocfs2 inf: cl_fs_ocfs2 cl_ocfs2mgmt ms_drbd_master:Master
order o_lolwut-after-fs inf: cl_fs_ocfs2:start p_xen-lolwut:start
order o_ocfs2 0: ms_drbd_master:promote cl_ocfs2mgmt:start cl_fs_ocfs2:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        no-quorum-policy="ignore" \
        stonith-enabled="false" \
        default-resource-stickiness="1000"
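The crm_mon header above reports "partition WITHOUT quorum", yet the surviving node still tries to run resources. A rough sketch of the arithmetic, assuming a simple majority rule over expected-quorum-votes and treating no-quorum-policy="ignore" as "carry on regardless"; the function names are made up for illustration and the other policies (stop, freeze, suicide) are collapsed into "do not start":

```python
def has_quorum(votes_present, expected_votes):
    """Simple majority rule: strictly more than half of the expected votes."""
    return 2 * votes_present > expected_votes


def may_start_resources(votes_present, expected_votes, no_quorum_policy):
    """Simplified view: without quorum, only the 'ignore' policy keeps starting resources."""
    if has_quorum(votes_present, expected_votes):
        return True
    return no_quorum_policy == "ignore"


# Values taken from this post: 2 expected votes, one node left after the power pull.
print(has_quorum(1, 2))                     # surviving node alone
print(has_quorum(2, 2))                     # both nodes up
print(may_start_resources(1, 2, "ignore"))  # policy from the configuration above
```

With two expected votes, a single surviving node can never hold a majority, which is why two-node clusters like this one are commonly run with no-quorum-policy="ignore".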
Any help would be appreciated!
Last edited by IMNOboist; 02-13-2012 at 03:21 PM.
Reason: Specified "Live" migration
|
|
|
|
02-13-2012, 04:53 PM
|
#2
Original Poster
It looks like the problem might be in one of the components of the ocfs2 stack. When I try to cd into the /domains directory (the mount point of the ocfs2 filesystem running on drbd), the terminal hangs.
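One possibility worth ruling out, stated here as an assumption rather than a confirmed diagnosis: the DLM started by ocf:pacemaker:controld generally expects working fencing and may block lock operations after a peer disappears until that peer has been fenced. The configuration in post #1 has stonith-enabled="false", which would be consistent with a hang on the ocfs2 mount point. A small sketch that flags this combination in `crm configure show` output; the helper names are hypothetical and the check is a plain text scan, not a Pacemaker API:

```python
import re


def find_property(config_text, name):
    """Return the value of name="value" in `crm configure show` text, or None."""
    # The lookbehind rejects longer names that merely end with `name`.
    pattern = r'(?<![\w-])' + re.escape(name) + r'="([^"]*)"'
    match = re.search(pattern, config_text)
    return match.group(1) if match else None


def uses_dlm(config_text):
    """True if any resource uses the ocf:pacemaker:controld agent."""
    return "ocf:pacemaker:controld" in config_text


def fencing_warning(config_text):
    """Return a warning string when controld is configured with STONITH disabled."""
    if uses_dlm(config_text) and find_property(config_text, "stonith-enabled") == "false":
        return "controld (DLM) is configured but stonith-enabled is false"
    return None


# Excerpt copied from the configuration in post #1.
sample_config = r"""
primitive p_controld ocf:pacemaker:controld
primitive p_o2cb ocf:pacemaker:o2cb
property $id="cib-bootstrap-options" \
expected-quorum-votes="2" \
no-quorum-policy="ignore" \
stonith-enabled="false" \
default-resource-stickiness="1000"
"""

print(fencing_warning(sample_config))
```

If the warning fires, checking the system logs on the surviving node for DLM or fencing messages would be a reasonable next step before changing any configuration.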
|
|
|
|
02-14-2012, 12:25 PM
|
#3
Original Poster
When the node running the VM (in this case, clutest2) is shut down with halt -p, the VM tries to migrate to the other node, but the migration fails and crm_mon shows this:
Code:
============
Last updated: Tue Feb 14 10:20:39 2012
Stack: openais
Current DC: clutest2 - partition with quorum
Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
5 Resources configured.
============
Online: [ clutest1 clutest2 ]
 Master/Slave Set: ms_drbd_master [p_drbd]
     Masters: [ clutest1 clutest2 ]
 Clone Set: cl_ocfs2mgmt [g_ocfs2mgmt]
     Started: [ clutest1 clutest2 ]
 Clone Set: cl_fs_ocfs2 [p_fs_ocfs2]
     p_fs_ocfs2:1 (ocf::heartbeat:Filesystem): Started clutest2 (unmanaged) FAILED
     Started: [ clutest1 ]
 p_xen-lolwut (ocf::heartbeat:Xen): Started clutest1
Failed actions:
    p_fs_ocfs2:1_stop_0 (node=clutest2, call=103, rc=-2, status=Timed Out): unknown exec error
    p_o2cb:1_stop_0 (node=clutest2, call=105, rc=1, status=complete): unknown error
The VM then starts on the other node (it did not live migrate; it shut down and then started up again), but the node that received the halt command never finishes shutting down. It just hangs.
However, if the server that isn't running the VM is shut down, the VM continues humming away on the remaining node without a problem.
Still not sure what's going on...