/var file system suddenly utilizes 100%
Hi,
I had an issue with a Red Hat server where the /var file system suddenly hit 100% utilization. It was picked up by an alert and was short-lived; the server was fine again a few minutes later. There were no entries in /var/log/messages.
I suspect a large amount of data was written to one of the directories under /var shortly beforehand, but I could not confirm this, and it may be something else entirely.
How can I trace what was going on at that particular time?
If /var suddenly filled up and then emptied, my guess is that something humongous was written to /var/tmp. Perhaps the process crashed when it ran out of space, or it moved the data elsewhere. Not much else writes to /var/tmp.
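One way to check for remnants of such a write is a quick find pass. A minimal sketch, assuming GNU find; the size and age thresholds are examples, not values from this thread:

```shell
# Look for large files left behind under /var/tmp (size threshold is an
# example; adjust to taste). -xdev keeps find on this one filesystem.
find /var/tmp -xdev -type f -size +100M -exec ls -lh {} \; 2>/dev/null

# Anything under /var modified in the last 60 minutes and over 10 MB.
# Permission errors are expected when run as non-root, hence || true.
find /var -xdev -type f -mmin -60 -size +10M 2>/dev/null || true
```

If the writer cleaned up after itself this will come back empty, which is itself a hint that you need to catch it while it is happening.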
Indeed, the timestamp on the "check_log_messages._var_log_messages.messagelog" file matches the exact time the issue occurred. Could this be the cause? I have no idea what this file is for.
I compared with another server: all the files in /var/tmp/check_logfiles/ are owned by nagios and total only 8.0K, so I do not think this is the cause. I have googled around but haven't found anything yet.
Of course it's not there anymore; your space issue resolved itself when usage went from 100% back to normal, so whatever wrote the data has also erased it. The time to check is while usage is at 100%.
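Since you have to catch it live, one option is a small one-shot watcher run from cron every minute while you chase this. A hedged sketch; the mount point and threshold are assumptions, and it uses only df, du, and lsof:

```shell
#!/bin/sh
# One-shot disk-usage check, meant to be run from cron every minute.
# Mount point and threshold default to example values.
MOUNT="${1:-/var}"
THRESHOLD="${2:-90}"

# df -P guarantees one output line per filesystem; field 5 is "Use%".
usage=$(df -P "$MOUNT" | awk 'NR==2 { gsub("%", "", $5); print $5 }')
echo "usage of $MOUNT: ${usage}%"

if [ "$usage" -ge "$THRESHOLD" ]; then
    # Snapshot the largest top-level consumers under the mount point.
    du -xsk "$MOUNT"/* 2>/dev/null | sort -rn | head -10
    # Files that are open but already unlinked still hold space;
    # lsof marks them "(deleted)".
    lsof +L1 "$MOUNT" 2>/dev/null || true   # lsof may not be installed
fi
```

Redirect its output to a log somewhere outside /var so the snapshot survives even if /var is completely full.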
The spike was at 12:30, but nothing in any log shows what was going on, which makes it hard to trace at the OS level.
There were no cron jobs running at the time, and logrotate was fine. It could be the application, but I would like to rule things out at the OS level first before asking the application team to investigate further.
Well, one way is to turn on process accounting. That gives you a log of every process that ran and when it terminated. If a process aborted because the disk was full, the space would be freed when it exited, and I believe the accounting entry records the exit status. This is not perfectly precise, as it will not identify the name of the file that failed to write.
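On Red Hat, process accounting comes from the psacct package. A sketch of the setup, assuming a systemd-based release; it must be run as root, and package/service names can differ between releases:

```shell
# Install and start BSD process accounting (RHEL package name: psacct).
yum install -y psacct
systemctl enable --now psacct     # older releases: service psacct start

# After the next spike, list the processes that ran around that time.
# lastcomm shows command name, flags, user, tty, CPU time, start time;
# the flag column marks abnormal exits (e.g. X = killed by a signal).
lastcomm | head -50

# sa summarizes the accounting file per command.
sa | head -20
```

Accounting data lands in /var/account/pacct, which itself lives under /var, so keep an eye on its size or rotate it while the investigation runs.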