Old 07-07-2015, 06:21 PM   #16
Red Squirrel
Senior Member
 
Registered: Dec 2003
Distribution: Mint 20.1 on workstation, Debian 11 on servers
Posts: 1,336

Original Poster
Rep: Reputation: 54

Quote:
Originally Posted by jpollard View Post
One advantage iSCSI has over NFS for the host providing the virtual disk is that it bypasses the two-level I/O. In the NFS mode, an I/O request the VM makes to its virtual disk first gets translated to a host reference - then the host translates that to an NFS reference... then the NFS server translates that to a disk reference.

With iSCSI, the VM makes an I/O request to the virtual device, which gets sent to the iSCSI target, which then translates it to a disk block and a disk reference.

This would relieve the VM host of a lot of excess work - including buffer management, which can add latency on top of the usual NFS delays from both the file server and the VM host.

Note: the iSCSI target does not have to be a hardware unit - it CAN be, but it isn't required.
That's what I'm thinking too, iSCSI would probably take out a lot of overhead. For actual file access I can just set up a file server VM with a large virtual disk, then do NFS and SMB from that. This would also have the advantage of eventually being able to get iSCSI cards so I can put the OS on the SAN for the physical servers too. Fewer parts that can fail.
 
Old 07-07-2015, 06:26 PM   #17
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Actually you can.

VMFS is a distributed shared filesystem. You can get the same thing with Gluster.

An iSCSI target is more aimed at giving the VM the appearance of a dedicated device - suitable for a root disk.
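As an illustration (the host names, brick paths, volume name, and mount point below are hypothetical), a replicated Gluster volume shared by the VM hosts might look something like this:

Code:
# on one storage node, after installing glusterfs-server on both nodes
gluster peer probe storage2.loc
gluster volume create vmstore replica 2 storage1.loc:/bricks/vmstore storage2.loc:/bricks/vmstore
gluster volume start vmstore

# on each VM host, mount the volume with the FUSE client
mount -t glusterfs storage1.loc:/vmstore /var/lib/libvirt/images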

Last edited by jpollard; 07-07-2015 at 06:27 PM.
 
Old 07-07-2015, 06:39 PM   #18
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by Red Squirrel View Post
That's what I'm thinking too, iSCSI would probably take out a lot of overhead. For actual file access I can just set up a file server VM with a large virtual disk, then do NFS and SMB from that. This would also have the advantage of eventually being able to get iSCSI cards so I can put the OS on the SAN for the physical servers too. Fewer parts that can fail.

You don't need "iSCSI" cards. It is all software passing SCSI commands over a network connection. The targeted host then interprets the SCSI commands - which COULD just pass them to a dedicated disk, but usually interprets them to access a disk file. The VM would use an iscsi driver to intercept the commands - and encapsulate them to send to a server over the network.
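To illustrate the software-only path (the IQN, portal address, and names here are invented examples, not anything from this setup): the storage box would export a backing file as a target with a tool such as targetcli or tgt, and the VM host's software initiator then discovers and logs in with iscsiadm:

Code:
# on the VM host (initiator side), with open-iscsi / iscsi-initiator-utils installed
iscsiadm -m discovery -t sendtargets -p 192.168.1.10:3260
iscsiadm -m node -T iqn.2015-07.loc.fileserver:vmlun1 -p 192.168.1.10:3260 --login

# the LUN then appears as a local /dev/sdX block device on the VM host
lsblk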
 
Old 07-07-2015, 07:06 PM   #19
Red Squirrel
Senior Member
 
Registered: Dec 2003
Distribution: Mint 20.1 on workstation, Debian 11 on servers
Posts: 1,336

Original Poster
Rep: Reputation: 54
Quote:
Originally Posted by jpollard View Post
Actually you can.

VMFS is a distributed shared filesystem. You can get the same thing with Gluster.

An iSCSI target is more aimed at giving the VM the appearance of a dedicated device - suitable for a root disk.
Yeah, but VMFS is proprietary, is it not? If I want to set up multiple Linux HOSTS to use KVM or another VM solution (not VMware) and I want them to be able to access the same iSCSI targets, what file system would I use on the HOSTS? That's what I'm asking. Or would Gluster take care of that? Is that a file system on its own?

Ex: I go on one of the VM hosts and set up an iSCSI target, which will look like a raw hard drive. I need to know what file system to format it with so that I can set up that same target on another host and see the same files, without risk of corruption. Not all file systems will work this way.

Quote:
Originally Posted by jpollard View Post
You don't need "iSCSI" cards. It is all software passing SCSI commands over a network connection. The targeted host then interprets the SCSI commands - which COULD just pass them to a dedicated disk, but usually interprets them to access a disk file. The VM would use an iscsi driver to intercept the commands - and encapsulate them to send to a server over the network.
I was talking about physical servers. If I wanted to, I could put an iSCSI card in one and have it boot off a target rather than put a hard drive in the server. It would eliminate a point of failure. Those are ridiculously expensive though, so I probably would not bother... For VMs, the VM hosts would use a software iSCSI initiator. I've managed SANs before, just never in Linux/open source, but I want to set up my environment that way if it means better performance. I kinda treat my file server as a SAN anyway, so there's really no point in the overhead of NFS when I can do block storage.
 
Old 07-08-2015, 03:46 AM   #20
Red Squirrel
Senior Member
 
Registered: Dec 2003
Distribution: Mint 20.1 on workstation, Debian 11 on servers
Posts: 1,336

Original Poster
Rep: Reputation: 54
I've been trying to find info on setting up an HA iSCSI environment in Linux and there is little to no documentation out there, so I think I will scrap that idea for now. I'd rather not try to completely overhaul my environment live anyway; I'll wait until I decide to actually get more hardware to do HA.

I just want to know what I can do to make my existing setup choke less. What files do I have to edit, what do I have to put in them, etc.? For example, how do I disable the caching like was suggested? Where do I go for that?
 
Old 07-08-2015, 05:54 AM   #21
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
You don't disable the caching.

The problem isn't caching - though caching COULD introduce problems with multiple updates to a file from different places...

The problem appears to be timeouts, which is why I indicated a number of options for NFS mounts to change the timeouts...

One last NFS option (and I don't like it, as it makes things harder to shut down) is to use the "hard" option. This causes NFS clients to hang while an NFS server reboots - and if it never comes back, you can't easily shut down the client, as it is locked in an uninterruptible wait for the server...
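For reference, those timeout-related options go on the client mount. A sketch of an fstab entry (the server name and the values are placeholders, not a recommendation): timeo is in tenths of a second, retrans is the retry count, and soft vs. hard decides whether the client eventually gives up.

Code:
# example client-side /etc/fstab line, assuming NFSv3 over TCP
fileserver:/volumes/raid1/vms_lun1  /mnt/vms_lun1  nfs  proto=tcp,vers=3,soft,timeo=600,retrans=5  0 0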
 
Old 07-08-2015, 06:08 AM   #22
Slax-Dude
Member
 
Registered: Mar 2006
Location: Valadares, V.N.Gaia, Portugal
Distribution: Slackware
Posts: 528

Rep: Reputation: 272
Regarding host disk cache: Google is your friend
https://pubs.vmware.com/vsphere-4-es...hard_disk.html

Regarding cluster filesystems on Linux: I use OCFS2 and like it a lot, as it is simple and gets the job done, but you can use others.
https://en.wikipedia.org/wiki/List_o...systems#SHARED
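To give an idea of the moving parts (the node names are borrowed from the hosts mentioned earlier in the thread, but the IPs, device, and mount point are invented): OCFS2 wants the same small cluster.conf on every node, the o2cb service configured and running, and then the shared LUN is formatted once and mounted on each host.

Code:
# /etc/ocfs2/cluster.conf - identical on every node
cluster:
        node_count = 2
        name = vmcluster

node:
        ip_port = 7777
        ip_address = 10.1.0.21
        number = 0
        name = borg
        cluster = vmcluster

node:
        ip_port = 7777
        ip_address = 10.1.0.22
        number = 1
        name = moria
        cluster = vmcluster

# then: configure/start o2cb on each node, format the shared LUN once, mount everywhere
mkfs.ocfs2 -L vmstore -N 4 /dev/sdb
mount -t ocfs2 /dev/sdb /var/lib/libvirt/images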
 
Old 07-08-2015, 06:42 AM   #23
navigatorsystemindia
LQ Newbie
 
Registered: Feb 2015
Location: bangalore
Posts: 3

Rep: Reputation: Disabled
Linux has a command "top" which shows which processes are taking up a lot of CPU and memory resources.

Use the top command and kill any process that is unnecessarily taking up a lot of CPU and memory.
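For completeness, the usual interactive sequence (these are standard top keybindings, nothing specific to this thread):

Code:
top               # Shift+P sorts by CPU, Shift+M sorts by memory
kill -15 <PID>    # send SIGTERM to the offending process; use kill -9 only if it ignores that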
 
Old 07-08-2015, 08:01 AM   #24
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by navigatorsystemindia View Post
Linux has a command "top" which shows which processes are taking up a lot of CPU and memory resources.

Use the top command and kill any process that is unnecessarily taking up a lot of CPU and memory.
For a file server, top will only report itself... NFS is done within the kernel.

And in this particular context, I think it will show plenty of idle time...

I BELIEVE (not having proof) that the sum of latencies involved in the I/O is causing the problem, not necessarily a lack of CPU time. It may be an overloaded network... or an overloaded disk... and neither is examined by top. You might try "iotop" instead.
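A quick sketch of how iotop is often run for this kind of check (standard iotop flags; the log path is just an example):

Code:
# interactive: only show processes actually doing I/O, with accumulated totals
iotop -o -a

# non-interactive: 12 samples, 5 seconds apart, handy for logging during a backup window
iotop -b -o -n 12 -d 5 >> /var/log/iotop-sample.log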

Last edited by jpollard; 07-08-2015 at 08:13 AM.
 
Old 07-08-2015, 08:20 AM   #25
smallpond
Senior Member
 
Registered: Feb 2011
Location: Massachusetts, USA
Distribution: Fedora
Posts: 4,149

Rep: Reputation: 1264
NFS doesn't have a timeout. If a read takes an hour, everything waits and runs fine after it completes.

However, the disk block device driver in your VM does have a timeout. You can change it from the default, usually 30 or 60 seconds, to 5 minutes with the command below:

Code:
echo 300 >/sys/class/block/sda/device/timeout
 
Old 07-08-2015, 08:31 AM   #26
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by smallpond View Post
NFS doesn't have a timeout. If a read takes an hour, everything waits and runs fine after it completes.

However, the disk block device driver in your VM does have a timeout. You can change it from the default, usually 30 or 60 seconds, to 5 minutes with the command below:

Code:
echo 300 >/sys/class/block/sda/device/timeout
NFS does indeed have a timeout - unless you mount "hard", which introduces management problems on the clients. The timeouts can also cause total system hangs when multiple systems depend on a single export... as one client can lock the entire tree. And if that client then enters a LONG timeout cycle, other clients will gradually back up behind that lock.

Now, changing the VM device driver timeout would be an interesting modification. I hadn't considered that.
 
Old 07-08-2015, 08:35 AM   #27
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by jpollard View Post
NFS does indeed have a timeout - unless you mount "hard", which introduces management problems on the clients. The timeouts can also cause total system hangs when multiple systems depend on a single export... as one client can lock the entire tree. And if that client then enters a LONG timeout cycle, other clients will gradually back up behind that lock.

Now, changing the VM device driver timeout would be an interesting modification. I hadn't considered that.
And that brings up another thought...

Has the possibility of using an NFS-mounted root filesystem been considered?

This would remove the VM disk drivers from the loop and allow direct NFS handling of the root filesystem between the VM and the server. It is "close" to the way iSCSI would interact with the server, by not having to work through the VM host, which would then have to work through NFS.

PS:
There would be a few advantages:
1. Shared space with the file server, where storage unused by one VM would be available to another...
2. Possible sharing of /usr among all VMs (assuming all are at the same level).
3. Possibly easier updating? I haven't done this in a long time, but when I was doing it, only the file server needed updating - as updating it would update the /usr filesystem (presumably shared). If the NFS /usr is separate, only one temporary host would need updating, and that one would update the shared /usr for all. The only things the root filesystem HAS to keep separate are /etc and /var (assuming /tmp is mounted as a tmpfs mount).

Alternatively (and likely simpler) would be to have the root filesystem (with /usr combined) kept separate for each VM. That takes up more disk space, though, as there would be no shared binaries. The unused space would still be shared.

One way to view this model is that the VMs are all treated as diskless clients of a file server.
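To make that diskless-client model concrete (the export path, guest name, and server address below are invented for illustration, and this assumes a Linux guest whose initramfs supports NFS root): the guest's kernel is pointed at an export instead of a virtual disk.

Code:
# on the file server: one export per guest root, with root squashing disabled
/volumes/raid1/vmroots/guest1   guest1.loc(rw,no_root_squash,sync)

# on the guest: kernel command line, replacing the usual root= that points at a virtual disk
root=/dev/nfs nfsroot=10.1.0.2:/volumes/raid1/vmroots/guest1,v3,tcp ip=dhcp rw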

Last edited by jpollard; 07-08-2015 at 08:46 AM.
 
Old 07-08-2015, 03:57 PM   #28
Red Squirrel
Senior Member
 
Registered: Dec 2003
Distribution: Mint 20.1 on workstation, Debian 11 on servers
Posts: 1,336

Original Poster
Rep: Reputation: 54
I've used top and iotop, and backup jobs will naturally cause a lot of usage. I don't want to stop those; I just don't want the system to choke up because there's a lot of activity. Torrents seem to cause a lot of activity too, due to lots of small writes. It's one thing if access is slower because of increased I/O; I just don't want the systems to crash or have issues and end up generating tons of errors, which is what happens now. For that timeout command, which system do I put that on: the ESX hosts, the file server, or each VM? Guessing those changes are not persistent, so I'd have to set it in my startup script too?

Also, figured this might help - this is what my exports file looks like:

Code:
/volumes/raid1/p2p       falcon.loc(rw,all_squash,anonuid=1020,anongid=1039) falcon2.loc(rw) borg.loc(rw) p2p.loc(rw,all_squash,anonuid=1020,anongid=1039) htpc.loc(rw) hal9000.loc(rw) 10.5.5.5(rw)
/volumes/raid3/userdata/ryan    falcon.loc(rw) falcon2.loc(rw) borg.loc(rw) 10.5.5.5(rw)
/volumes/raid3/public           10.1.0.0/16(ro) falcon.loc(rw) falcon2.loc(rw) borg.loc(rw,root_squash) 10.5.5.5(rw) moria.loc(rw)
#/volumes/raid3/applications     falcon.loc(rw) falcon2.loc(rw) borg.loc(rw) 10.5.5.5(rw)
/volumes/raid3/intranet         falcon.loc(rw) falcon2.loc(rw) borg.loc(rw) 10.5.5.5(rw)
/volumes/raid2/backups          falcon.loc(rw) falcon2.loc(rw) hal9000.loc(rw) borg.loc(rw) appdev.loc(ro)
/volumes/raid1/temp             10.1.0.0/16(rw,no_root_squash) 10.5.5.0/16(rw,no_root_squash)
/volumes/raid2/temp             10.1.0.0/16(rw,no_root_squash) 10.5.5.0/16(rw,no_root_squash)
/volumes/raid3/temp             10.1.0.0/16(rw,no_root_squash) 10.5.5.0/16(rw,no_root_squash)
/volumes/raid1/appdev           10.0.0.0/8(rw,all_squash,anonuid=1066,anongid=1066)

/volumes/raid1/vms_lun1              borg.loc(rw,all_squash,anonuid=1046,anongid=1046) moria.loc(rw,all_squash,anonuid=1046,anongid=1046)
/volumes/raid2/vms_lun2              borg.loc(rw,all_squash,anonuid=1046,anongid=1046) moria.loc(rw,all_squash,anonuid=1046,anongid=1046)
/volumes/raid3/vms_lun3              borg.loc(rw,all_squash,anonuid=1046,anongid=1046) moria.loc(rw,all_squash,anonuid=1046,anongid=1046)

/volumes/raid1/mysql/borg.loc          borg.loc(rw,all_squash,anonuid=27,anongid=27) 
/volumes/raid1/mysql/appdev.loc        appdev.loc(rw,all_squash,anonuid=27,anongid=27)
Any options I can add there to make it better?
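Purely as a reference point (these are standard exports(5) options, not a recommendation made in this thread, and async trades crash safety for speed): two options that often come up when tuning exports like these are async and no_subtree_check, e.g.:

Code:
# async acknowledges writes before they reach disk (faster, but data can be lost if the server crashes);
# no_subtree_check skips per-request subtree verification on the server
/volumes/raid1/vms_lun1    borg.loc(rw,async,no_subtree_check,all_squash,anonuid=1046,anongid=1046)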
 
Old 07-08-2015, 04:35 PM   #29
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Quote:
Originally Posted by Red Squirrel View Post
I've used top and iotop, and backup jobs will naturally cause a lot of usage. I don't want to stop those; I just don't want the system to choke up because there's a lot of activity. Torrents seem to cause a lot of activity too, due to lots of small writes. It's one thing if access is slower because of increased I/O; I just don't want the systems to crash or have issues and end up generating tons of errors, which is what happens now. For that timeout command, which system do I put that on: the ESX hosts, the file server, or each VM?
EACH VM.
Quote:
Guessing those changes are not persistent, so I'd have to set it in my startup script too?
Since you have an RH-based kit, no. There is /etc/sysconfig.d (and see the man page on sysctl), which handles that.
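One common way to make a sysfs setting like that timeout stick across reboots, shown only as a sketch (the rule file name is arbitrary, and this is an alternative to the sysconfig route mentioned above): a udev rule inside each VM that re-applies the value whenever a disk appears.

Code:
# /etc/udev/rules.d/99-disk-timeout.rules
ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd[a-z]", RUN+="/bin/sh -c 'echo 300 > /sys/class/block/%k/device/timeout'"

Reload with "udevadm control --reload-rules" or just reboot the VM.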
 
Old 07-09-2015, 03:15 AM   #30
voleg
Member
 
Registered: Oct 2013
Distribution: RedHat CentOS Fedora SuSE
Posts: 354

Rep: Reputation: 51
Definitely a hardware error on sda; replace it.
Another option: replace the SATA cable (is it SATA?).
 
  

