LinuxQuestions.org


CNBarnes 01-04-2011 12:24 PM

"sudo: Can't mkdir /var/run/sudo/%user%: File exists"
 
For some reason, my mysql server (debian lenny) has begun having semi-frequent flakiness. I don't think MySQL itself is causing the problems, but rather something "system related".

I have looked at all the typical suspects (/var being full, etc), and nothing in any of the log files jumps out at me. But there is 1 thing that might be an indicator: The error message in the subject line.

What is really weird is that the sudo command DOES work.

Rebooting the machine resolves the problem ... for a while. Which is to say, I do not get the error message every time I use the sudo command.


Last indicator is in /var/log/auth.log - logons (and sudo's) during the period of time of system flakiness show no entries in the auth.log file. Despite the fact that I can indeed logon, su to root, and reboot the computer.
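
A gap like this in auth.log can be located mechanically, which helps pin down when logging stopped. The following is a minimal sketch in Python, assuming the classic syslog timestamp prefix ("Jan  4 12:24:01"), which carries no year, so the year is passed in. The function name find_log_gaps and the one-hour threshold are illustrative choices, not part of any existing tool.

```python
from datetime import datetime, timedelta


def find_log_gaps(lines, year, min_gap=timedelta(hours=1)):
    """Return (previous, current) timestamp pairs separated by at least min_gap.

    Assumes each line starts with a 15-character syslog timestamp such as
    "Jan  4 12:24:01". Lines that do not parse are skipped.
    """
    gaps = []
    previous = None
    for line in lines:
        try:
            # The year is not recorded in the log line, so it is supplied by the caller.
            current = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue
        if previous is not None and current - previous >= min_gap:
            gaps.append((previous, current))
        previous = current
    return gaps


# Example use against the real file (needs read access to the log):
#   with open("/var/log/auth.log") as handle:
#       for start, end in find_log_gaps(handle, year=2011):
#           print("no entries between", start, "and", end)
```

A long silent stretch that ends at a reboot would be consistent with nothing being written to /var during that period.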

unSpawn 01-05-2011 08:10 AM

Quote:

Originally Posted by CNBarnes (Post 4212930)
For some reason, my mysql server (debian lenny) has begun having semi-frequent flakiness.

Since when?
What happened at that time?


Quote:

Originally Posted by CNBarnes (Post 4212930)
Last indicator is in /var/log/auth.log - logons (and sudo's) during the period of time of system flakiness show no entries in the auth.log file.

If nothing gets logged or written to in /var during that time then what does 'touch /var/tmp/testfile' result in (as unprivileged user, as root)? And running strace on your sudo command?
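
The write test can also be scripted so that it runs unattended, for example from a monitoring job. This is a minimal sketch in Python, assuming Python is installed on the machine; probe_write is a made-up helper name. It creates and immediately removes a temporary file in the given directory and reports whether the failure, if any, was a read-only filesystem.

```python
import errno
import tempfile


def probe_write(directory):
    """Try to create a temporary file in directory and report the outcome.

    Returns "writable", "read-only" (errno EROFS), or "error: <ERRNO NAME>"
    for any other failure such as a missing directory or missing permissions.
    """
    try:
        # The file is deleted again as soon as the with-block exits.
        with tempfile.NamedTemporaryFile(dir=directory):
            pass
    except OSError as exc:
        if exc.errno == errno.EROFS:
            return "read-only"
        return "error: " + errno.errorcode.get(exc.errno, str(exc.errno))
    return "writable"


# Example use:
#   print(probe_write("/var/tmp"))
```

Run it both as an unprivileged user and as root, as with the touch test, since a permission problem and a read-only filesystem fail in different ways.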

CNBarnes 01-05-2011 09:27 AM

Quote:

Originally Posted by unSpawn (Post 4213909)
Since when?
What happened at that time?

It started roughly 2 weeks ago.. The server has been up and running for a couple of years now (6 months since the last upgrade).


Quote:

If nothing gets logged or written to in /var during that time then what does 'touch /var/tmp/testfile' result in (as unprivileged user, as root)? And running strace on your sudo command?
I'll give that a try next time it happens.

unSpawn 01-05-2011 11:07 AM

Quote:

Originally Posted by CNBarnes (Post 4214044)
It started roughly 2 weeks ago..

What happened at that time? Retracing your steps update/installation/re-configuration-wise could help.

CNBarnes 01-05-2011 11:32 AM

Quote:

Originally Posted by unSpawn (Post 4214147)
What happened at that time? Retracing your steps update/installation/re-configuration-wise could help.

Nothing. Nobody had even logged into the machine in months (it's a mysql server that sits there and collects data from various websites).

CNBarnes 01-06-2011 09:24 AM

Quote:

Originally Posted by unSpawn (Post 4213909)
If nothing gets logged or written to in /var during that time then what does 'touch /var/tmp/testfile' result in (as unprivileged user, as root)? And running strace on your sudo command?


touch: cannot touch `/var/tmp/testfile': Read-only file system

And then this:

vmsql:~# mount
/dev/sda1 on / type ext3 (rw,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)


vmsql:~# ls -alF
total 0

vmsql:~# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        16T   16T  2.7G 100% /
tmpfs          1015M     0 1015M   0% /lib/init/rw
udev             10M   52K   10M   1% /dev
tmpfs          1015M     0 1015M   0% /dev/shm



Ok, so what would cause /dev/sda1 to mount ro?

unSpawn 01-06-2011 09:36 AM

Quote:

Originally Posted by CNBarnes (Post 4215294)
Ok, so what would cause /dev/sda1 to mount ro?

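This, the 'tune2fs -e remount-ro' equivalent: '/dev/sda1 on / type ext3 (rw,errors=remount-ro)'. With that option the kernel remounts the filesystem read-only as soon as it detects a filesystem error. Deploying a server w/o a robust partitioning scheme and w/o monitoring is not such a fab idea.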
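
On the monitoring point, a check for read-only mounts is easy to sketch. The 'mount' output earlier in the thread still lists / as "rw", most likely because on a system of that era mount reports from /etc/mtab, a regular file that cannot be updated once the root filesystem is read-only. /proc/mounts reflects the kernel's current view instead. The Python below is a minimal sketch under that assumption; read_only_mounts is a made-up name, and the input is expected in /proc/mounts format (device, mount point, type, comma-separated options).

```python
def read_only_mounts(proc_mounts_text):
    """Return the mount points whose option list contains 'ro'.

    Expects text in /proc/mounts format: device, mount point, filesystem
    type and comma-separated options, separated by whitespace.
    """
    result = []
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3]
        # Compare whole options so that "errors=remount-ro" does not count as "ro".
        if "ro" in options.split(","):
            result.append(mount_point)
    return result


# Example use on a live system:
#   with open("/proc/mounts") as handle:
#       print(read_only_mounts(handle.read()))
```

Splitting the option string on commas matters here: a plain substring search would flag every filesystem mounted with errors=remount-ro, including healthy ones.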

CNBarnes 01-10-2011 04:11 PM

Just a follow-up: we finally figured out what was corrupting the sda1 disk.

Answer: VMware

This server is a VMware 4.0 virtual machine. The Lenny kernel we were running was *slightly* older (2.6.18). Upgrading to 2.6.26-2-688 seems to have solved the problem.

