Monitoring uptime
Any uptimed users here?
Monitoring uptime used to be a badge of honor and a fun way to measure reliability and availability. These days, with frequent reboots for security patching, that is somewhat out of style. OK, go to the next level: how many nines of uptime? Most folks won't know without some kind of help. uptimed seems suited to tracking how many nines of availability a system achieves. Just looking for conversation. Thanks again. :) |
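For anyone who wants the "nines" arithmetic spelled out, here is a minimal sketch in shell. The function name downtime_budget is made up for illustration, and a 365-day year is assumed.

```shell
# downtime_budget NINES
# Prints the minutes of downtime per 365-day year permitted by NINES nines
# of availability (2 -> 99%, 3 -> 99.9%, and so on).
# The function name is hypothetical, not part of any tool.
downtime_budget() {
    awk -v n="$1" 'BEGIN { printf "%.1f\n", 365 * 24 * 60 * 10 ^ (-n) }'
}

for n in 2 3 4 5; do
    printf '%s nines: %s minutes/year\n' "$n" "$(downtime_budget "$n")"
done
```

Three nines works out to about 525.6 minutes, or roughly 8.8 hours, of downtime a year.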
I use GKrellM and it has a built-in uptime display.
Per the man page, I believe that it monitors /proc/uptime. I think the uptime command also reads /proc/uptime. |
My bad for not explaining better. I don't want to monitor or display uptime. I am thinking about uptime history because a reboot restarts the uptime counter.
A cron job run every minute to store data and a script to do the math could suffice. If on average a server is rebooted once every two weeks, and the reboot takes about 4 minutes, that is about 99.98% uptime. Might be nice to display that history. |
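The arithmetic behind that 99.98% figure can be checked with a short shell sketch. The function name availability_pct is an assumption for illustration; awk is used because plain shell arithmetic is integer-only.

```shell
# availability_pct INTERVAL_MINUTES DOWNTIME_MINUTES
# Prints the percentage of the interval during which the machine was up.
# The function name is hypothetical.
availability_pct() {
    awk -v total="$1" -v down="$2" \
        'BEGIN { printf "%.2f\n", (total - down) / total * 100 }'
}

# Two weeks is 14 * 24 * 60 = 20160 minutes; one 4-minute reboot in that window:
availability_pct 20160 4    # prints 99.98
```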
What if you put something like "uptime >> /some_directory/uptime.log" in rc.local_shutdown? Then as long as you shut down/reboot cleanly, you will have a log of all your uptimes. You could also timestamp each entry and then write a script to parse the log and do the math to average it all out for your percentage stats.
Just a thought |
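A timestamped variant of that idea might look like the sketch below. It logs seconds rather than the human-readable output of uptime, since that is easier to parse later. The function name log_uptime, the one-line log format, and the optional second argument are all assumptions for illustration; /proc/uptime is Linux-specific.

```shell
# log_uptime LOGFILE [UPTIME_FILE]
# Appends "<epoch timestamp> <uptime in whole seconds>" to LOGFILE.
# UPTIME_FILE defaults to /proc/uptime, whose first field is the number of
# seconds since boot. Function name and log format are hypothetical.
log_uptime() {
    logfile=$1
    source_file=${2:-/proc/uptime}
    up=$(cut -d' ' -f1 "$source_file" | cut -d. -f1)
    printf '%s %s\n' "$(date +%s)" "$up" >> "$logfile"
}

# In rc.local_shutdown one might then call, with a path of your choosing:
#   log_uptime /some_directory/uptime.log
```

Each clean shutdown then leaves one line that a later script can sum or average.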
My Slackware stable uptime is always the length of time between kernel updates.
No need to monitor anything, it's all in the changelog. |
Perhaps you could write a script to log the uptime to a file at shutdown.
This article explains how to do it with SysVinit and systemd: https://opensource.com/life/16/11/ru...shutdown-linux
Here's a (very old) LQ thread on the topic: https://www.linuxquestions.org/quest...utdown-323412/ |
Thanks for the replies. The original question was whether anybody is using uptimed. :)
That said, a script at shutdown/reboot is okay most of the time but won't help with inadvertent shutdowns. A cron job run every minute or two would be better. The output of the last command includes boot times and might suffice. Either way, a script is needed to do the math. How to display the data is another question. Yeah, I can do that. :) I started thinking about the idea. I love shell scripting, but I looked online to see if anybody had already invented the same wheel. I found uptimed, hence the original question. BTW, late this afternoon at work I installed uptimed on some test systems. It looks like the tool does not understand containers, because inside a container I got the same results as on the host system. I need to look into that after the weekend. :) |
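A rough sketch of the cron approach: record a heartbeat each run, then compare how many beats were recorded against how many were expected. The function names, the log format (one epoch timestamp per line), and the crontab path are assumptions for illustration. Downtime before the first or after the last heartbeat is not counted.

```shell
# heartbeat LOGFILE
# Appends the current epoch time; meant to be run from cron, for example:
#   * * * * * /path/to/heartbeat.sh    (hypothetical path)
heartbeat() { date +%s >> "$1"; }

# heartbeat_pct LOGFILE INTERVAL_SECONDS
# Estimates availability as beats recorded divided by beats expected
# between the first and last entry. Rounding absorbs a second or two of
# cron jitter.
heartbeat_pct() {
    awk -v step="$2" '
        NR == 1 { first = $1 }
        { last = $1 }
        END {
            if (NR == 0) exit 1
            expected = int((last - first) / step + 0.5) + 1
            printf "%.2f\n", NR / expected * 100
        }' "$1"
}
```

The resolution is only as good as the cron interval, so a one-minute job cannot see outages shorter than a minute.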
Disclaimer: I've only tested it for a short while since I've only hacked this out over the last hour. Please make suggestions or edits if you please.
Code:
#!/bin/bash
Code:
Current uptime of 0 Days, 00 Hours, 15 Minutes, 53 Seconds
Edit 2: I cleaned up the script a bit and added more statistics
Edit 3: Added cron job routines, streamlined script and better documented |
I'm not yet committed to uptimed. Just curious if others use the tool. As uptimed might not support containers, a home-grown solution might be preferred.
The last command reads /var/log/wtmp. That log stores system boot times and uptime, which could be used to create a cumulative history. I tried a quick test in a VM, and the output of last doesn't handle inadvertent shutdowns/reboots nicely. The last command is probably not a reliable candidate for creating an uptime history. |
I edited the script in my previous post to add some more stats like shortest, longest, average, and total downtime. See the edited post for more details. The only reason for the crappy stats there is because I rebooted a couple times to test functionality :)
The issue with the cron job will be that you won't get clean stats on individual uptime sessions because it will be logging in intervals determined by the cron's frequency, not actual lengths of full sessions. This wouldn't be an issue if all you're interested in is total uptime and percentage, but you won't be able to easily determine the other stats I've mentioned, and you also still have a chance of an unclean shutdown between cron jobs. That would be the point where I'd break off into more of an actual program to keep track of everything, and I guess "uptimed" is an option where someone has done this. Also, I just like practicing my scripting and saw no need to make things more complex with writing/compiling programs in C or some other language. :) I guess you could use the output of "last" like you mentioned, but I think the script would get ugly with cleaning up and parsing that info to get what you want. |
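To illustrate the per-session stats (shortest, longest, average, total downtime), here is a minimal sketch. It is not the script from the post above. It assumes a log with one line per clean shutdown in the form "<shutdown epoch> <uptime seconds>", and the function name session_stats is made up.

```shell
# session_stats LOGFILE
# Each line is assumed to be "<shutdown epoch> <uptime seconds>".
# Function name and log format are hypothetical.
session_stats() {
    awk '
        NR == 1 { min = max = $2 }
        {
            if ($2 < min) min = $2
            if ($2 > max) max = $2
            up += $2
            # This session booted at (shutdown time - uptime); the gap
            # since the previous shutdown counts as downtime.
            if (NR > 1) down += ($1 - $2) - prev
            prev = $1
        }
        END {
            if (NR == 0) exit 1
            printf "sessions=%d shortest=%d longest=%d average=%d\n",
                   NR, min, max, int(up / NR)
            printf "uptime=%d downtime=%d availability=%.2f%%\n",
                   up, down, up / (up + down) * 100
        }' "$1"
}
```

Because every line covers a whole session, this gives the session-level numbers that a fixed-interval cron log cannot.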
There's a very nice program that is most suitable here, called downtimed. Instead of measuring uptime, it measures downtime, which, if you think about it, is possibly more important. https://github.com/snabb/downtimed
|
I didn't know about uptimed until reading about it here.
Traditionally I've used Nagios to track/monitor system uptimes, etc, but that's a heavy solution compared to uptimed, which might fill the niche for systems which don't warrant Nagios monitoring. |
This thread is getting a little old, but I had time today and I updated the bash script in post #8 again to catch improper shutdowns with a cron job. It requires adding lines to rc.local_shutdown, rc.local, and crontab to function. See the script header for details.
The output report has also expanded a bit. Edit out whatever you don't want if you use it, or don't use it at all; either way, it was fun to play around with scripting again. |
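For readers wondering how the three hooks could fit together, here is one possible sketch. It is not the script from post #8. The state directory, file names, and function names are all hypothetical, and the directory must already exist.

```shell
# STATE_DIR and the file names below are hypothetical; adjust to taste.
STATE_DIR=${STATE_DIR:-/var/lib/uptime-history}

# From cron: remember when the system was last known to be running.
on_heartbeat() { date +%s > "$STATE_DIR/last_seen"; }

# From rc.local_shutdown: leave a marker saying the shutdown was orderly.
on_shutdown() { : > "$STATE_DIR/clean"; }

# From rc.local at boot: report how the previous session ended, then
# clear the marker so the next boot starts fresh.
on_boot() {
    if [ -e "$STATE_DIR/clean" ]; then
        rm -f "$STATE_DIR/clean"
        echo "clean"
    elif [ -e "$STATE_DIR/last_seen" ]; then
        echo "unclean, last seen at $(cat "$STATE_DIR/last_seen")"
    else
        echo "no history"
    fi
}
```

If the marker is missing at boot, the last heartbeat gives the approximate time the machine went down, accurate to within one cron interval.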