Is there a way to stop Linux from choking up and often crashing if disk I/O is slow?
Linux - Server: This forum is for the discussion of Linux software used in a server-related context.
Distribution: Mint 20.1 on workstation, Debian 11 on servers
Posts: 1,336
Original Poster
Rep:
Quote:
Originally Posted by jpollard
One advantage iSCSI has over NFS for the host providing the virtual disk is that it would bypass the two level I/O. In the NFS mode, the VM making an I/O request to its virtual disk first gets translated to a host reference - then the host translates that to an NFS reference... then the NFS server translates that to a disk reference.
With iSCSI, the VM would make an I/O request to the virtual device, which then gets sent to the iSCSI target, which then translates it to a disk block and a disk reference.
This would eliminate the VM host from a lot of excess work - including buffer management which can add latency to the usual NFS delays from both the file server and the VM host.
Note: the iSCSI target does not have to be a hardware unit - it CAN be, but it isn't required.
That's what I'm thinking too, iSCSI would probably take out a lot of overhead. For actual file access I can just set up a file server VM with a large virtual disk, then do NFS and SMB for that. This would also have the advantage of being able to eventually get iSCSI cards so I can put the OS on the SAN too for physical servers. Fewer parts that can fail.
You don't need "iSCSI" cards. It is all software passing SCSI commands over a network connection. The targeted host then interprets the SCSI commands - which COULD just pass them to a dedicated disk, but usually interprets them to access a disk file. The VM would use an iSCSI driver to intercept the commands - and encapsulate them to send to a server over the network.
Quote:
Originally Posted by jpollard
Actually you can.
VMFS is a distributed shared filesystem. You can get that with gluster.
An iscsi target is more aimed at giving the VM an appearance of a dedicated device. Suitable for a root disk.
Yeah, but VMFS is proprietary, is it not? If I want to set up multiple Linux HOSTS to use KVM or another VM solution (not VMware) and I want them to be able to access the same iSCSI targets, what file system would I use on the HOSTS? That's what I'm asking. Or would Gluster take care of that? Is that a file system on its own?
Ex: I go on one of the VM hosts and set up an iSCSI target, which will be like a raw hard drive. I need to know what file system I would format it with so that I can set up that same target on another host and see the same files, without risk of corruption. Not all file systems will work this way.
Quote:
Originally Posted by jpollard
You don't need "iSCSI" cards. It is all software passing SCSI commands over a network connection. The targeted host then interprets the SCSI commands - which COULD just pass them to a dedicated disk, but usually interprets them to access a disk file. The VM would use an iSCSI driver to intercept the commands - and encapsulate them to send to a server over the network.
I was talking about physical servers. If I wanted to, I could put an iSCSI card in one and have it boot off a target, rather than put a hard drive in the server. It would eliminate a point of failure. Those are ridiculously expensive though, so I probably would not bother... For VMs, the VM hosts would use a software iSCSI initiator. I've managed SANs before, just never in Linux/open source, but I want to set up my environment that way if it means better performance. I kinda treat my file server as a SAN anyway, so there's really no point in the overhead of NFS when I can do block storage.
Been trying to find info on setting up an HA iSCSI environment in Linux and there is little to no documentation out there, so I think I will scrap that idea for now. I'd rather not try to completely overhaul my environment live anyway; I'll wait until I decide to actually get more hardware to do HA.
I just want to know what I can do to make my existing setup choke less. What files do I have to edit, what do I have to put in them, etc.? For example, how do I disable the caching that was suggested? Where do I go for that?
The problem isn't caching - but that COULD introduce problems with multiple updates to a file from different places...
The problem appears to be timeouts, which is why I indicated a number of options for NFS mounts to change the timeouts...
One last NFS option (and I don't like it, as it makes things harder to shut down) is to use the "hard" option. This causes NFS clients to hang while an NFS server reboots - and if it never comes back, you can't easily shut down the client, as it is locked in an uninterruptible wait for the server...
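The mount options being discussed might look like this in /etc/fstab (a sketch only; the server name, export path, and mount point are hypothetical, and note that timeo is in tenths of a second):

```shell
# "soft" makes the client return an I/O error after retrans retries instead
# of hanging forever; timeo=600 means 60 seconds per try, so this gives up
# after roughly five minutes of an unresponsive server.
server:/export  /mnt/data  nfs  soft,timeo=600,retrans=5,_netdev  0 0

# The "hard" variant discussed above: the client blocks until the server
# returns - it survives server reboots but makes client shutdown painful.
#server:/export  /mnt/data  nfs  hard,timeo=600,_netdev  0 0
```

The trade-off is exactly the one described above: "soft" can surface I/O errors to applications during a long stall, while "hard" never errors out but can wedge the client.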
Linux has a command top which shows which process is taking up lots of CPU and memory resources.
Use the top command and kill the process which is unnecessarily taking up lots of CPU and memory resources.
For a file server, top will only report itself... NFS is done within the kernel.
And within this particular context, I think it will show sufficient idle time...
I BELIEVE (not having proof) that the sum of latencies involved with the I/O is causing the problem, not necessarily a lack of CPU time. It may be an overloaded network... or an overloaded disk... and neither is examined by top. Might try "iotop" instead.
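A quick way to check whether the box is I/O-bound rather than CPU-starved (the awk one-liner only assumes a Linux /proc; iotop needs root and its own package):

```shell
# iowait is the 6th field of the "cpu" summary line in /proc/stat; if it
# grows quickly relative to the other counters, processes are stalled on disk.
awk '/^cpu /{print "iowait jiffies:", $6}' /proc/stat

# Per-process detail (root required, iotop package):
#   iotop -oPa    # -o only procs doing I/O, -P per-process, -a accumulated totals
```

Sampling /proc/stat twice a few seconds apart and comparing the iowait deltas is more telling than a single reading.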
NFS doesn't have a timeout. If a read takes an hour, everything waits and runs fine after it completes.
However the disk block device driver in your VM does have a timeout. You can change it from the default, usually 30 or 60 seconds, to 5 minutes by doing the command below:
Code:
echo 300 >/sys/class/block/sda/device/timeout
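That echo only lasts until the device node goes away; to make the change stick across reboots, a udev rule is one option (a sketch, assuming the disks all appear as sd* devices):

```shell
# /etc/udev/rules.d/99-disk-timeout.rules
# Reapplies the 300 s SCSI command timeout each time the kernel creates a
# whole-disk sd device, so it survives reboots unlike the one-shot echo.
ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd[a-z]", ATTR{device/timeout}="300"
```

Run `udevadm control --reload` (or reboot) for the rule to take effect.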
NFS does indeed have a timeout - unless you mount "hard", which introduces management problems on the clients. The timeouts can also cause total system hangs when multiple systems depend on a single export... as one client can lock the entire tree. And if that client then enters a LONG timeout cycle, other clients will gradually back up behind that lock.
Now changing the VM device driver timeout would be an interesting modification. I hadn't considered that.
And that thought brings up another thought...
Has the possibility of using an NFS mounted root filesystem been considered?
This would remove the VM drivers from the loop and allow direct NFS handling of the root filesystem between the VM and the server. It is "close" to the way iSCSI would interact with the server, by not having to work through the VM host, which would then have to work through NFS.
PS:
There would be a couple of advantages provided:
1. shared space with the file server where unused storage by one VM would be available to another...
2. Possible sharing of /usr among all VMs (assuming all are at the same level)
3. Possibly easier updating? I haven't done this in a long time, but when I was doing it, only the file server needed updating - as updating it would update the /usr filesystem (presumably shared). If the NFS /usr is separate, only one temporary host would need updating, and that one would update the shared /usr for all. The only things the root filesystem would have that HAVE to be separate are /etc and /var (assuming /tmp is mounted as a tmpfs mount).
Alternatively (and likely simpler) would be to have the root filesystem (with /usr combined) separate for each VM. That takes up more disk space, though, as there would be no shared binaries. The unused space would still be shared.
One way to view this model is that the VMs are all treated as diskless clients of a file server.
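For that diskless-client model, the moving parts might look like this (a sketch only; the server address, export paths, and NFS version are hypothetical, and the VM's kernel/initramfs must have NFS-root support built in):

```shell
# On the file server, /etc/exports - one root tree per VM:
#   /srv/vmroots/vm1  192.168.1.0/24(rw,no_root_squash,sync)

# On the VM, kernel command line - mount the root filesystem over NFS
# instead of from a virtual disk:
#   root=/dev/nfs nfsroot=192.168.1.10:/srv/vmroots/vm1,vers=3 ip=dhcp
```

no_root_squash is needed here because the VM's root user has to own files in its own root tree; that is exactly why each VM should get its own export rather than sharing one writable tree.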
I've used top and iotop, and backup jobs will naturally cause a lot of usage. I don't want to stop those, I just don't want the system to choke up because there's a lot of activity. Torrents seem to cause a lot of activity too, due to dealing with lots of small writes. It's one thing if access is slower because of increased I/O; I just don't want the systems to crash or have issues and end up generating tons of errors, which is what happens now. For that timeout command, which system do I put that on, the ESX hosts? The file server? Or each VM? Guessing those changes are not persistent, so I'd have to set it in my startup script too?
Also figured this might help, this is what my exports file looks like:
I've used top and iotop, and backup jobs will naturally cause a lot of usage. I don't want to stop those, I just don't want the system to choke up because there's a lot of activity. Torrents seem to cause a lot of activity too, due to dealing with lots of small writes. It's one thing if access is slower because of increased I/O; I just don't want the systems to crash or have issues and end up generating tons of errors, which is what happens now. For that timeout command, which system do I put that on, the ESX hosts? The file server? Or each VM?
EACH VM.
Quote:
Guessing those changes are not persistent so I'd have to set it in my startup script too?
Since you have a RH-based kit, no. There is /etc/sysctl.d (and see the man page on sysctl) that handles that.