SSH login spins up all file server hard drives. Any ideas on why? (hdparm related)
I have a home file server (i.e., NAS) running Ubuntu 12.04 minimal. The system disk and home directories are on a USB flash drive, and there are 6 spinning hard drives containing my data (they are separate disks, not configured for RAID). They are connected to SATA ports on my motherboard. Most of them get little regular use, so I keep the rotating disks spun down (i.e., I use hdparm to spin them down after an hour of inactivity). Whenever I try to read from one of them (generally over samba), it spins up without waking the other 5. That's just what I want.
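For anyone reproducing this setup, here is a minimal sketch of how the one-hour timeout maps onto hdparm's -S argument. The encoding follows hdparm(8): values 1 to 240 are multiples of 5 seconds, and 241 to 251 are 1 to 11 units of 30 minutes. The function name hdparm_s_value and the device /dev/sdb are assumptions for illustration only, not something from the original post.

```shell
#!/bin/sh
# Convert a spin-down timeout in minutes to an hdparm -S value.
# hdparm_s_value is a hypothetical helper, not part of hdparm itself.
hdparm_s_value() {
    minutes=$1
    seconds=$((minutes * 60))
    if [ "$seconds" -le 0 ]; then
        echo 0                                   # 0 disables the timeout
    elif [ "$seconds" -le 1200 ]; then
        echo $(( (seconds + 4) / 5 ))            # 5-second steps, rounded up
    elif [ "$minutes" -le 330 ]; then
        echo $(( 240 + (minutes + 29) / 30 ))    # 30-minute steps, rounded up
    else
        echo 251                                 # cap at 11 x 30 minutes
    fi
}

# Example (as root; /dev/sdb is a placeholder for one data drive):
#   hdparm -S "$(hdparm_s_value 60)" /dev/sdb
```

With this encoding, an hour corresponds to -S 242.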
Everything works great except that if I wait a few days and then ssh into the server, for some reason all 6 rotating disks spin up (the login takes a long time while it waits for them), despite the fact that I have not tried to access any of them. If I then spin them down (or let them spin down using the timeout) and then ssh in again, they do not spin up unless I try to access them, which is the behavior I want.
A few more specifics of my system, in case they matter:
* I do run apache, with all the apache content on the USB drive (available only to the LAN). I don't think it should be touching my data drives.
* I login to ssh using asymmetric key authentication. The .ssh directory is in my home directory, which is on the USB flash drive.
* I have a .hushlogin file in my home directory, so there's no message of the day or anything. Actually, I'm not sure of everything .hushlogin does.
* My shell is bash.
* None of the data drives or their subdirectories is in my path.
* I can only test this once a day at most (probably less frequently than that), because if it hasn't been long enough since the last login there is no problem (i.e., none of the drives spin up unless they are actually accessed).
* I do have to log in to this server relatively frequently as this is how I connect to my home box from outside the network (that is, I connect to my file server, which is always on, and run a script on it that wakes my regular box, then ssh to the regular box).
Do you have any ideas on what is causing the wakeup signal to be sent to all those drives? Or if there's a log file that is likely to have information on what happened, that would be a great start.
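Since the problem only shows up after days of idleness, one way to gather evidence is to record each drive's power state periodically and compare the timestamps with login times. The sketch below assumes that hdparm -C reports a line of the form "drive state is:  standby" and that querying the state does not itself wake the drive, which is what hdparm(8) describes for most drives. The function names, the device glob and the log path are placeholders.

```shell
#!/bin/sh
# Read `hdparm -C` output on stdin and print only the power state
# (for example "standby" or "active/idle").
drive_state() {
    sed -n 's/^[[:space:]]*drive state is:[[:space:]]*//p'
}

# Print one timestamped line per device given as an argument.
log_drive_states() {
    for dev in "$@"; do
        state=$(hdparm -C "$dev" 2>/dev/null | drive_state)
        echo "$(date '+%F %T') $dev ${state:-unknown}"
    done
}

# Example (as root, e.g. from cron; paths are placeholders and the log
# should live on the USB system drive, not on a data drive):
#   log_drive_states /dev/sd[b-g] >> /var/log/drive-states.log
```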
Interesting. I wonder whether spin-ups are controlled by the kernel or by userland. The "vm.block_dump" sysctl dumps inode info to syslog, though I can't remember whether it logs device names as well, and syscalls could be logged using the audit service. Logging to disk needs syscalls too, so at first it is probably best to limit any audit rule to stat.* and read, I think.
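To make the block_dump suggestion concrete, here is a rough sketch. It assumes a kernel that still provides vm.block_dump (an Ubuntu 12.04 kernel should; newer kernels have dropped it) and that the kernel log lines look like "sshd(1234): READ block 2048 on sdb1 (8 sectors)", i.e. process name, PID and device name. The helper name block_dump_hits and the device pattern sd[b-g] are assumptions. Note that a spin-up caused by something other than a block read or write (for example a drive status query) would not show up here.

```shell
#!/bin/sh
# Filter kernel log lines on stdin, keeping only block_dump READ/WRITE
# entries that touch devices matching the given extended regex.
block_dump_hits() {
    pattern=$1
    grep -E "\): (READ|WRITE) block [0-9]+ on (${pattern})[0-9]* "
}

# Example session (as root; sd[b-g] is a placeholder for the data drives):
#   sysctl -w vm.block_dump=1
#   ... wait, then log in over ssh ...
#   dmesg | block_dump_hits 'sd[b-g]'
#   sysctl -w vm.block_dump=0
```

Reading the results from dmesg rather than from a log file on disk avoids the logging itself generating disk activity.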
Sorry, some of that went above my head. Tell me if I'm on the right track:
I tailed /var/log/syslog and the last entry was a couple of weeks ago although I experienced the spinup issue just now, so I don't think it's being logged there. I assume that's what you were referring to.
When one connects by ssh, what things happen? It should just be .profile, .bashrc, /etc/bash.bashrc, right?
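As a partial answer to the startup-file question: per the bash manual, a login shell reads /etc/profile and then only the first readable file among ~/.bash_profile, ~/.bash_login and ~/.profile; ~/.bashrc is typically pulled in from one of those on Ubuntu rather than read directly. sshd may also run PAM modules and an ssh rc file (~/.ssh/rc or /etc/ssh/sshrc) before the shell starts. The sketch below only covers the bash part, and the function name bash_login_file is made up for illustration.

```shell
#!/bin/sh
# Print the per-user startup file a bash login shell would pick,
# i.e. the first readable one in bash's documented search order.
bash_login_file() {
    home=${1:-$HOME}
    for f in .bash_profile .bash_login .profile; do
        if [ -r "$home/$f" ]; then
            echo "$home/$f"
            return 0
        fi
    done
    return 1    # none of the three files exists
}

# To see every command the login files actually run, a trace helps:
#   bash -lxc true 2>&1 | less
```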
Is there a sleep state (computerwide) that might be happening and when it wakes, it also spins up the drives? Shot in the dark, here.
I need to test whether it also happens when I log in with a password. I'll test tomorrow.
I've been working on a new hypothesis and so far it seems to have worked (I'd like to check if it's even reasonable, though, in case it has been a coincidence):
The system drive is a USB flash drive, so I was thinking maybe the issue had to do with a USB suspend function (the computer itself never goes to sleep that I've seen). I was thinking maybe when I try and ssh in and the system drive (and my home directory) is not available a more general wakeup command got issued that affected the spinning drives.
To check this, I now pass usbcore.autosuspend=-1 to the kernel at boot. I've only been able to check the computer a few times since then, but each time it was working correctly, and at no time did it make me wait or spin up all the SATA drives.
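A quick way to confirm the boot parameter took effect is to read it back from sysfs. This is a small sketch assuming the usual location /sys/module/usbcore/parameters/autosuspend; the function name usb_autosuspend_state is hypothetical. The kernel documentation gives the default delay as 2 seconds, and a negative value disables autosuspend. Whether a particular device is allowed to autosuspend at all is controlled separately by its power/control attribute under /sys/bus/usb/devices/.

```shell
#!/bin/sh
# Report the global USB autosuspend setting. The optional argument
# overrides the sysfs path, which is mainly useful for testing.
usb_autosuspend_state() {
    param=${1:-/sys/module/usbcore/parameters/autosuspend}
    if [ ! -r "$param" ]; then
        echo "unknown"
        return 1
    fi
    value=$(cat "$param")
    if [ "$value" -lt 0 ]; then
        echo "disabled"
    else
        echo "enabled (${value}s delay)"
    fi
}

# Example:
#   usb_autosuspend_state
#   cat /proc/cmdline    # should contain usbcore.autosuspend=-1
```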
Do you think this is a coincidence, or can you imagine my hypothesis being correct? Somehow I thought USB suspend functions worked at a much higher frequency (like, a few seconds after last use) but until recently I had never even heard of USB suspend, so I'm by no means an expert on the subject.