midorikawa |
03-21-2013 04:00 PM |
Quote:
Originally Posted by jefro
(Post 4916151)
Wonder if moving the VMs to a cleanly installed new host would show any improvement. At first it would seem to be an issue with the clients, but if both fail it leads me to consider the host or host hardware. I'd bet host hardware at this point.
|
What's strange is that the host has no issues whatsoever. It runs on two 320GB SATA drives in md RAID1, and the array shows as clean. The other strange thing is that it wasn't both guests at first: initially only one locked up, and the problem only spread to the other when I started up MySQL. Shutting down all VMs and pushing the host's resource consumption higher than it gets with MySQL running in either guest does not trigger any problem on the host.
I'd really rather not reinstall unless I were sure the host is the cause.
I also just tried using virtio and booting from an ISO, and still no luck. It locked up here:
Code:
livecd ~ # mount /dev/vda1 /mnt/gentoo
and here:
Code:
livecd ~ # fsck.ext4 -fy /dev/vda1
e2fsck 1.42 (29-Nov-2011)
/: recovering journal
It doesn't matter whether I attach the ISO via virtio or not; the result is the same.
I just ran strace -ff against the qemu process while the guest was locked up, and got the following output:
Code:
[pid 13453] rt_sigaction(SIGALRM, NULL, {0x7ff23af35dc0, ~[KILL STOP RTMIN RT_1], SA_RESTORER, 0x7ff238876460}, 8) = 0
[pid 13453] write(4, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] read(3, 0x7fff5ce802b0, 128) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13453] select(1, [0], NULL, NULL, {0, 0}) = 0 (Timeout)
[pid 13453] timer_gettime(0x1, {it_interval={0, 0}, it_value={0, 0}}) = 0
[pid 13453] timer_settime(0x1, 0, {it_interval={0, 0}, it_value={0, 29541303}}, NULL) = 0
[pid 13453] timer_gettime(0x1, {it_interval={0, 0}, it_value={0, 29433214}}) = 0
[pid 13453] timer_settime(0x1, 0, {it_interval={0, 0}, it_value={0, 29067851}}, NULL) = 0
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [4])
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] read(4, "\1\0\0\0\0\0\0\0", 512) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [5])
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [5])
[pid 13453] read(5, "\4\0\0\0\0\0\0\0", 16) = 8
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [3])
[pid 13453] read(5, 0x7fff5ce80340, 16) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13453] read(3, "\16\0\0\0\0\0\0\0\376\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0"..., 128) = 128
[pid 13453] rt_sigaction(SIGALRM, NULL, {0x7ff23af35dc0, ~[KILL STOP RTMIN RT_1], SA_RESTORER, 0x7ff238876460}, 8) = 0
[pid 13453] write(4, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] read(3, 0x7fff5ce802b0, 128) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13453] select(1, [0], NULL, NULL, {0, 0}) = 0 (Timeout)
[pid 13453] timer_gettime(0x1, {it_interval={0, 0}, it_value={0, 0}}) = 0
[pid 13453] timer_settime(0x1, 0, {it_interval={0, 0}, it_value={0, 29563778}}, NULL) = 0
[pid 13453] timer_gettime(0x1, {it_interval={0, 0}, it_value={0, 29465375}}) = 0
[pid 13453] timer_settime(0x1, 0, {it_interval={0, 0}, it_value={0, 29115804}}, NULL) = 0
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [4])
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] read(4, "\1\0\0\0\0\0\0\0", 512) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [5])
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [5])
[pid 13453] read(5, "\4\0\0\0\0\0\0\0", 16) = 8
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [3])
[pid 13453] read(5, 0x7fff5ce80340, 16) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13453] read(3, "\16\0\0\0\0\0\0\0\376\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0"..., 128) = 128
[pid 13453] rt_sigaction(SIGALRM, NULL, {0x7ff23af35dc0, ~[KILL STOP RTMIN RT_1], SA_RESTORER, 0x7ff238876460}, 8) = 0
[pid 13453] write(4, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] read(3, 0x7fff5ce802b0, 128) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13453] select(1, [0], NULL, NULL, {0, 0}) = 0 (Timeout)
[pid 13453] timer_gettime(0x1, {it_interval={0, 0}, it_value={0, 0}}) = 0
[pid 13453] timer_settime(0x1, 0, {it_interval={0, 0}, it_value={0, 29581552}}, NULL) = 0
[pid 13453] timer_gettime(0x1, {it_interval={0, 0}, it_value={0, 29243760}}) = 0
[pid 13453] timer_settime(0x1, 0, {it_interval={0, 0}, it_value={0, 29098059}}, NULL) = 0
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [4])
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] read(4, "\1\0\0\0\0\0\0\0", 512) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [5])
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [5])
[pid 13453] read(5, "\4\0\0\0\0\0\0\0", 16) = 8
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [3])
[pid 13453] read(5, 0x7fff5ce80340, 16) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13453] read(3, "\16\0\0\0\0\0\0\0\376\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0"..., 128) = 128
[pid 13453] rt_sigaction(SIGALRM, NULL, {0x7ff23af35dc0, ~[KILL STOP RTMIN RT_1], SA_RESTORER, 0x7ff238876460}, 8) = 0
[pid 13453] write(4, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] read(3, 0x7fff5ce802b0, 128) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13453] select(1, [0], NULL, NULL, {0, 0}) = 0 (Timeout)
[pid 13453] timer_gettime(0x1, {it_interval={0, 0}, it_value={0, 0}}) = 0
[pid 13453] timer_settime(0x1, 0, {it_interval={0, 0}, it_value={0, 29278736}}, NULL) = 0
[pid 13453] timer_gettime(0x1, {it_interval={0, 0}, it_value={0, 29188725}}) = 0
[pid 13453] timer_settime(0x1, 0, {it_interval={0, 0}, it_value={0, 28789433}}, NULL) = 0
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [4])
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] read(4, "\1\0\0\0\0\0\0\0", 512) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [5])
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [5])
[pid 13453] read(5, "\4\0\0\0\0\0\0\0", 16) = 8
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [3])
[pid 13453] read(5, 0x7fff5ce80340, 16) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13453] read(3, "\16\0\0\0\0\0\0\0\376\377\377\377\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\0\0\0\0"..., 128) = 128
[pid 13453] rt_sigaction(SIGALRM, NULL, {0x7ff23af35dc0, ~[KILL STOP RTMIN RT_1], SA_RESTORER, 0x7ff238876460}, 8) = 0
[pid 13453] write(4, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] read(3, 0x7fff5ce802b0, 128) = -1 EAGAIN (Resource temporarily unavailable)
[pid 13453] select(1, [0], NULL, NULL, {0, 0}) = 0 (Timeout)
[pid 13453] timer_gettime(0x1, {it_interval={0, 0}, it_value={0, 0}}) = 0
[pid 13453] timer_settime(0x1, 0, {it_interval={0, 0}, it_value={0, 29696573}}, NULL) = 0
[pid 13453] timer_gettime(0x1, {it_interval={0, 0}, it_value={0, 29380885}}) = 0
[pid 13453] timer_settime(0x1, 0, {it_interval={0, 0}, it_value={0, 29029681}}, NULL) = 0
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [4])
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] read(4, "\1\0\0\0\0\0\0\0", 512) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [5])
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [5])
[pid 13453] read(5, "\4\0\0\0\0\0\0\0", 16) = 8
[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL^CProcess 13453 detached
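Since the trace above is the same handful of calls over and over, tallying it makes the pattern easier to see. This is only a minimal sketch, assuming the strace output has been saved or pasted as a list of lines; the function name `tally_syscalls` and the regular expression are made up for illustration and are not part of strace or qemu.

```python
import re
from collections import Counter

# Matches lines such as: [pid 13453] write(4, "\1\0...", 8) = 8
LINE_RE = re.compile(r"^\[pid\s+(\d+)\]\s+(\w+)\((.*)$")


def tally_syscalls(lines):
    """Count (syscall, first argument) pairs and EAGAIN results in strace -ff output."""
    calls = Counter()
    eagain = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a syscall line
        _pid, name, rest = m.groups()
        # The first argument is usually the fd (or nfds for select).
        first_arg = rest.split(",", 1)[0].strip()
        key = (name, first_arg)
        calls[key] += 1
        if "EAGAIN" in rest:
            eagain[key] += 1
    return calls, eagain


# A few lines copied from the trace above (hypothetical input list).
sample = [
    r'[pid 13453] write(4, "\1\0\0\0\0\0\0\0", 8) = 8',
    r'[pid 13453] read(3, 0x7fff5ce802b0, 128) = -1 EAGAIN (Resource temporarily unavailable)',
    r'[pid 13453] select(11, [3 4 5 8 9 10], [], [], NULL) = 1 (in [4])',
    r'[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8',
    r'[pid 13453] write(5, "\1\0\0\0\0\0\0\0", 8) = 8',
]

calls, eagain = tally_syscalls(sample)
for (name, arg), count in calls.most_common():
    print(f"{name}({arg}, ...) x{count}, EAGAIN x{eagain[(name, arg)]}")
```

Run over the full dump, this would show which descriptors the process keeps touching while the guest is hung. On its own that does not say where the hang is.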
While I'm at it, lsof dumps the following:
Code:
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
qemu-syst 13453 root cwd DIR 9,127 4096 2 /
qemu-syst 13453 root rtd DIR 9,127 4096 2 /
qemu-syst 13453 root txt REG 9,127 4920872 2419871 /usr/bin/qemu-system-x86_64
qemu-syst 13453 root mem REG 9,127 14592 2326682 /lib64/libdl-2.15.so
qemu-syst 13453 root mem REG 9,127 1909952 1350217 /usr/lib64/libcrypto.so.1.0.0
qemu-syst 13453 root mem REG 9,127 431760 1350220 /usr/lib64/libssl.so.1.0.0
qemu-syst 13453 root mem REG 9,127 1732824 2326238 /lib64/libc-2.15.so
qemu-syst 13453 root mem REG 9,127 135074 2312695 /lib64/libpthread-2.15.so
qemu-syst 13453 root mem REG 9,127 88440 1353096 /lib64/libz.so.1.2.7
qemu-syst 13453 root mem REG 9,127 1009848 2325424 /lib64/libm-2.15.so
qemu-syst 13453 root mem REG 9,127 588432 1340465 /usr/lib64/libpixman-1.so.0.28.0
qemu-syst 13453 root mem REG 9,127 5280 1369378 /lib64/libaio.so.1.0.1
qemu-syst 13453 root mem REG 9,127 71904 2087180 /usr/lib64/libseccomp.so.1.0.1
qemu-syst 13453 root mem REG 9,127 264704 1361335 /usr/lib64/libjpeg.so.8.0.2
qemu-syst 13453 root mem REG 9,127 174928 2268981 /usr/lib64/libpng15.so.15.13.0
qemu-syst 13453 root mem REG 9,127 18728 1375234 /lib64/libuuid.so.1.3.0
qemu-syst 13453 root mem REG 9,127 921272 1358418 /usr/lib64/libasound.so.2.0.0
qemu-syst 13453 root mem REG 9,127 345968 1350297 /lib64/libncurses.so.5.9
qemu-syst 13453 root mem REG 9,127 360456 1393330 /usr/lib64/libcurl.so.4.3.0
qemu-syst 13453 root mem REG 9,127 10456 2326765 /lib64/libutil-2.15.so
qemu-syst 13453 root mem REG 9,127 1192888 1352994 /usr/lib64/libglib-2.0.so.0.3200.4
qemu-syst 13453 root mem REG 9,127 35656 2326731 /lib64/librt-2.15.so
qemu-syst 13453 root mem REG 9,127 144816 2326771 /lib64/ld-2.15.so
qemu-syst 13453 root mem REG 0,9 3826 anon_inode:kvm-vcpu (stat: No such file or directory)
qemu-syst 13453 root 0u CHR 136,4 0t0 7 /dev/pts/4
qemu-syst 13453 root 1u CHR 136,4 0t0 7 /dev/pts/4
qemu-syst 13453 root 2u CHR 136,4 0t0 7 /dev/pts/4
qemu-syst 13453 root 3u 0000 0,9 0 3826 anon_inode
qemu-syst 13453 root 4u 0000 0,9 0 3826 anon_inode
qemu-syst 13453 root 5u 0000 0,9 0 3826 anon_inode
qemu-syst 13453 root 6u CHR 10,232 0t0 1256 /dev/kvm
qemu-syst 13453 root 7u 0000 0,9 0 3826 anon_inode
qemu-syst 13453 root 8u CHR 10,200 0t0 45 /dev/net/tun
qemu-syst 13453 root 9u CHR 10,200 0t0 45 /dev/net/tun
qemu-syst 13453 root 10u 0000 0,9 0 3826 anon_inode
qemu-syst 13453 root 11r REG 9,127 179519488 791369 /root/install-amd64-minimal-20130110.iso
qemu-syst 13453 root 12u REG 9,2 137679273984 12 /web/web.img
qemu-syst 13453 root 13u BLK 9,3 0x4af000000 4638 /dev/md3
qemu-syst 13453 root 14u 0000 0,9 0 3826 anon_inode
qemu-syst 13453 root 15u 0000 0,9 0 3826 anon_inode
I do see a "No such file or directory" on anon_inode:kvm-vcpu, but I'm entirely unsure what it is or whether it's the cause of my problems, and Google doesn't turn up much useful info. At least, not that I've found.
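To read the fd numbers from the select() calls against the lsof listing, something like the following could help. It is a rough sketch, assuming the lsof lines are pasted into a list; the helper name `fd_names` is hypothetical and is not anything lsof or qemu provides.

```python
import re


def fd_names(lsof_lines):
    """Map numeric file descriptors to the NAME column of lsof output."""
    names = {}
    for line in lsof_lines:
        # NAME may contain spaces, so cap the split at the ninth column.
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue
        # The FD column looks like "3u" or "11r"; "cwd", "mem" etc. are skipped.
        m = re.match(r"^(\d+)[a-zA-Z]", fields[3])
        if m:
            names[int(m.group(1))] = fields[8]
    return names


# Lines copied from the lsof dump above (only a subset, for brevity).
sample = [
    "COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME",
    "qemu-syst 13453 root 3u 0000 0,9 0 3826 anon_inode",
    "qemu-syst 13453 root 4u 0000 0,9 0 3826 anon_inode",
    "qemu-syst 13453 root 5u 0000 0,9 0 3826 anon_inode",
    "qemu-syst 13453 root 8u CHR 10,200 0t0 45 /dev/net/tun",
    "qemu-syst 13453 root 9u CHR 10,200 0t0 45 /dev/net/tun",
    "qemu-syst 13453 root 10u 0000 0,9 0 3826 anon_inode",
    "qemu-syst 13453 root 12u REG 9,2 137679273984 12 /web/web.img",
    "qemu-syst 13453 root 13u BLK 9,3 0x4af000000 4638 /dev/md3",
]

names = fd_names(sample)
select_set = [3, 4, 5, 8, 9, 10]  # fd set passed to select() in the strace
for fd in select_set:
    print(fd, names.get(fd, "?"))
```

Going by the listing, the descriptors in the select() set are the anon_inode entries and /dev/net/tun, while 12 (/web/web.img) and 13 (/dev/md3) do not show up in the traced calls at all. That covers only this excerpt, so I wouldn't read too much into it.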
|