LinuxQuestions.org


Cesare 05-29-2011 07:23 PM

VirtualBox 4.0.4 + crashing guest HDD
 
Since upgrading to 13.37 + VirtualBox 4.0.4 (from SBo), I've been having trouble with one of my VMs. It contains a PostgreSQL installation (also from SBo) and had been running fine on 13.1 + VBox 3.2.10.

This is what happens: I start an I/O-heavy job on the DB that should take about an hour, but it never finishes because the virtual disk stops working after 10 to 40 minutes.

Code:

ata1.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x6 frozen
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/00:00:e8:11:44/01:00:02:00:00/40 tag 0 ncq 131072 in
        res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
...lots of these
ata1.00: device reported invalid CHS sector 0
...and lots of these

The messages vary slightly from time to time, and so do the symptoms. Sometimes just the DB crashes, sometimes I get a forced filesystem shutdown. It happens on the original 13.1-VM and also on a freshly installed 13.37-VM.

There's one message...
Code:

hrtimer: interrupt took 6544232 ns
...in the VM's syslog as soon as I/O starts, which suggests timing problems and which I don't think I've seen before. It's the only hint. The host machine is working fine and the VM's VBox.log doesn't log anything when the crash happens.

Here's what I've tried so far to solve this problem:
  • upgraded to VirtualBox 4.0.8 and back to 4.0.4 again
  • switched virtual disk between IDE and SATA - with and without host cache
  • tested countless kernel boot options in the VM
  • downgraded the host-kernel to 2.6.35.12 (config from /testing)
None of this helped. The virtual IDE-disk has different error messages (including a very interesting "lost interrupt" message) but crashes nonetheless.

The Postgres-DB is for development only, so I'm not losing any data here, but I haven't been getting any work done since, either :-(

Has anyone else encountered this problem, or can anyone at least point me in the right direction?

PS: VirtualBox was built in a multilib VM, but host and guest are pure 64-bit. I've been using this setup since 13.0.

sysfce2 05-29-2011 08:03 PM

I recall reading somewhere that VirtualBox 4 is still a lot buggier than the 3 series (which is what I am still using). Hopefully most of that would be fixed by the .8 update, but it may be worth trying to downgrade to 3.2 again.

I also don't use the version from the SBo: I use the binary from virtualbox.org. It is more full featured (at the expense of not being open source) and also has a 64 bit version.

chrisretusn 05-30-2011 01:08 AM

I don't see that 4.0+ is buggier. The SBo version though, does seem to have problems. I prefer and use the binary version also.

ppr:kut 05-30-2011 02:44 AM

Quote:

I also don't use the version from the SBo: I use the binary from virtualbox.org. It is more full featured (at the expense of not being open source) and also has a 64 bit version
This is no longer true for version 4. The binary version you'd get from virtualbox.org is just a badly compiled version of what you'd get from SBo if you compiled it yourself (it even links with pam). There's no feature difference whatsoever, except for what is intentionally controlled by the switches explained in the README.

---------- Post added 30-05-11 at 09:45 ----------

Quote:

The SBo version though, does seem to have problems
I'd be glad to hear about those problems

Darth Vader 05-30-2011 03:01 AM

Quote:

Originally Posted by sysfce2 (Post 4370541)
I also don't use the version from the SBo: I use the binary from virtualbox.org. It is more full featured (at the expense of not being open source) and also has a 64 bit version.

The binary from virtualbox.org is just the OSE edition, which is available on SBo, too. To make VirtualBox really full featured, you should install the Extension Pack. ;)

chrisretusn 05-30-2011 09:10 AM

Quote:

Originally Posted by ppr:kut (Post 4370733)
I'd be glad to hear about those problems

I see you're the SBo maintainer. Thanks for asking.

Even if you are a member of vboxusers, USB does not work unless you set permissions for /proc/bus/usb via rc.S or via fstab entries.

tronayne 05-30-2011 10:36 AM

I have only installed the binary versions directly from http://www.virtualbox.org/: currently VirtualBox-4.0.8-71778-Linux_amd64.run, Oracle_VM_VirtualBox_Extension_Pack-4.0.8-71778.vbox-extpack and the Guest Additions (stuff like USB just doesn't like to work without both the Extension Pack and Guest Additions). With those, the only problems I have had have been due to a lack of RAM available to the guest and, in the case of a DBMS running in the guest, a lack of shared memory available to the guest. MySQL and PostgreSQL tend to use a lot of shared memory, message queues and semaphores (you can see what's going on there with ipcs).

You did not mention how much RAM you have assigned to the guest; on my 8G box, I assign at least 4G to a guest that's going to be doing large database transactions or updates (updates in particular eat RAM for breakfast). Have you tuned your DBMS installation in the guest following the recommendations for shared memory and transaction sizing?
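
A rough way to check the shared memory side inside the guest is to compare the kernel's kernel.shmmax limit with the segment size the DBMS is expected to request. The sketch below is only an illustration: the helper name shm_fits and the example sizes are made up here, and the size you actually need depends on your own PostgreSQL or MySQL configuration.

```shell
#!/bin/sh
# Hypothetical helper: report whether a shared memory segment of the given
# size (in MiB) fits under a kernel.shmmax limit (in bytes).
# The second argument is optional and defaults to the running kernel's value.
shm_fits() {
    need_bytes=$(( $1 * 1024 * 1024 ))
    shmmax=${2:-$(cat /proc/sys/kernel/shmmax)}
    # awk does the comparison so very large shmmax values do not overflow
    # the shell's integer arithmetic.
    if awk -v max="$shmmax" -v need="$need_bytes" 'BEGIN { exit !(max >= need) }'; then
        echo "ok"
    else
        echo "too small"
    fi
}

# Example (run in the guest): would a 128 MiB segment fit?
#   shm_fits 128
# List the shared memory segments currently allocated:
#   ipcs -m
```

If the answer is "too small", the limit can be raised with sysctl -w kernel.shmmax=<bytes> as root; choosing the value is a tuning question this sketch does not answer.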

Something you can do that may provide a dynamic indication of what's going on is to start GKrellM on the host (it's provided with Slackware 13.x) and keep an eye on the CPU(s), disk and memory displays, and perhaps adjust host or guest parameters. You know, if you see your CPU(s) at 99% and swapping going on, well, that might give you a hint.

Hope this helps some.

Cesare 06-07-2011 08:07 PM

I *think* I've nailed this down...

The DB-VM produced not only lots, but TONS of I/O - a bit too much for my not-quite-fast host HDD. I've seen it happen live with "watch -n 1 cat /proc/meminfo" and "vmstat 1": suddenly there's more than 1GB of dirty RAM, the machine hits vm.dirty_ratio enforcing sync I/O, everything gets really slow, and the VM gets virtual SATA timeouts.
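
For anyone who wants to watch the same thing, here is a minimal sketch that pulls just the Dirty and Writeback lines out of /proc/meminfo. The function name dirty_kb is made up for this example, and the optional file argument exists only so the function can be tried against a saved copy of meminfo.

```shell
#!/bin/sh
# Hypothetical helper: print the Dirty and Writeback values (in kB) from a
# meminfo-style file. Defaults to /proc/meminfo when no file is given.
dirty_kb() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { printf "%s %s kB\n", $1, $2 }' \
        "${1:-/proc/meminfo}"
}

# Poll once per second while the I/O-heavy job runs:
#   while sleep 1; do dirty_kb; done
# The thresholds the kernel enforces can be read with:
#   sysctl vm.dirty_ratio vm.dirty_background_ratio
```

A Dirty value that keeps climbing toward the vm.dirty_ratio share of memory, followed by a stall, would match the behaviour described above.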

If I'm right, this would also explain why I didn't have this problem with the other, much less I/O-heavy VMs. But why did it happen after the upgrade? The DB is constantly growing and I guess it's just a coincidence that I hit the critical size just now.

If you're experiencing problems with I/O-heavy VBox-VMs the following steps could help:
  • enforce bandwidth limits (new in VBox 4.0)
  • reduce vm.dirty_background_ratio or set vm.dirty_background_bytes to get the bgwriter going earlier
  • try to reduce I/O :-)
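
The second step can be sketched as follows, applied on the machine that hits the dirty limit. This is only an illustration: the helper name dirty_bg_settings and the 64 MiB figure are assumptions made for the example, not tuned or recommended values. Note that the kernel treats vm.dirty_background_bytes and vm.dirty_background_ratio as alternatives, so setting one makes the other read as 0.

```shell
#!/bin/sh
# Hypothetical helper: emit a sysctl.conf line that starts background
# writeback once the given number of MiB is dirty (default: 64 MiB, an
# arbitrary example value).
dirty_bg_settings() {
    mib=${1:-64}
    echo "vm.dirty_background_bytes = $(( mib * 1024 * 1024 ))"
}

# Apply immediately as root (lost at reboot):
#   sysctl -w vm.dirty_background_bytes=$(( 64 * 1024 * 1024 ))
# Persist, assuming the system loads /etc/sysctl.conf at boot:
#   dirty_bg_settings 64 >> /etc/sysctl.conf
```

A lower threshold makes the kernel start flushing earlier and in smaller bursts, which is the effect the list item above is after.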

Besides... I never had a need for any VirtualBox feature that wasn't part of the OSE, so I never used anything but the SBo version. A Slackware package is just so much nicer than an installer that does things like compiling kernel modules at boot time.


All times are GMT -5.