Daily server checklist for newbies

anon091 · 10-10-2012, 09:40 AM

Hi guys. We have a variety of Linux servers, and I want to start getting my non-Linux folks into the habit of checking them every morning for like disk space, logs, at least the basics for now.

Does anyone have, or know a link to, a good basic/general daily Linux server checklist that says what to check and how? Or maybe we can make one up in this thread, as I'm not having any luck finding one. Heck, maybe I'll even learn a few things from this.

Thanks in advance for everyone chiming in, this could be fun.

unSpawn · 10-10-2012, 10:14 AM

Quote:

Originally Posted by rjo98

I want to start getting my non-Linux folks into the habit of checking them every morning for like disk space, logs, at least the basics

This may not be the type of response you were looking for, but I would suggest using a "dashboard" instead, unless these servers have been problematic for ages without a chance of change, require fussing over for some other reason, are too disparate, or are too few to warrant any investment (or if you just want to play BOFH-like games with employees ;-p). Trending allows you to respond before problems manifest themselves, and an overview allows you to pinpoint and respond to problems quickly. A dashboard is efficient anyway, since it tends to consolidate all sorts of info (agents, SNMP, logs, whatever else) and may require less effort and knowledge to use. If you disagree, then at least agree that you don't need to check things that Just Work (tm) and can focus on anomalies. Process reports (Monit), disk space, user, syslog and daemon errors (Logwatch), network anomalies (Snort report): just about everything can be mailed to a central mailbox, right?
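To make "focus on anomalies" concrete, here is a minimal sketch of filtering log lines down to the ones worth a human look. The keyword list and the sample lines are made up for illustration; real logs would need a pattern tuned per service.

```python
import re

# Hypothetical keyword list, not an authoritative set; tune it per service.
ANOMALY_PATTERN = re.compile(
    r"\b(error|fail(ed|ure)?|critical|panic|denied)\b", re.IGNORECASE
)


def anomalies(lines):
    """Return only the lines that match the anomaly pattern."""
    return [line.rstrip("\n") for line in lines if ANOMALY_PATTERN.search(line)]


if __name__ == "__main__":
    # Made-up sample lines in a syslog-like shape.
    sample = [
        "Oct 10 09:40:01 web1 CRON[123]: session opened for user root\n",
        "Oct 10 09:41:12 web1 sshd[456]: Failed password for invalid user admin\n",
        "Oct 10 09:42:00 web1 kernel: EXT3-fs error (device sda1)\n",
    ]
    for line in anomalies(sample):
        print(line)
```

Anything that Just Works produces no output, so an empty result means there is nothing to read that morning.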

anon091 · 10-10-2012, 10:23 AM

That's the kind of response I was looking for, kinda. We're evaluating different free monitoring software now to see what works, although we haven't found one yet that does everything. All the servers I was thinking of having them monitor are very old and do have problems from time to time, so I thought a daily morning check of certain things on each one might give us a warning before something bad happens. Ideally we'll find some monitoring software that can email us scheduled reports of disk space/usage, daemon errors, etc., but we're really not sure what would do it best. Not so concerned with network anomalies yet. The next couple we're going to try are Nagios and OpenNMS, to see how those work. In the meantime I was hoping to put together a basic, until-we-get-monitoring-set-up morning routine. Plus it would help familiarize these people with Linux a bit more, as they have no real experience with it now.

But yeah, I do agree, a dashboard and email alerting is the end game for us. I'm just looking for a good intermediate (manual labor) step until we get that squared away.

chrism01 · 10-11-2012, 04:59 AM

Well, as far as manual labour goes, you could run 'top' to check CPU load, RAM and swap space.
Also 'df -h' for disk space.
You should probably also check some key logfiles, but which ones would depend on the services each system runs.
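The generic/default logfile is /var/log/messages (on Debian-style systems it is usually /var/log/syslog).
Depending on the exact distro, you could also check /var/log/secure (Red Hat family) or /var/log/auth.log (Debian family).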
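
For anyone who would rather script the disk and load checks than eyeball 'df -h' and 'top', here is a minimal sketch using only the Python standard library. The 90% threshold and the function names are assumptions for illustration, not a recommendation.

```python
import os
import shutil

DISK_WARN_PERCENT = 90  # assumed threshold; pick whatever suits your servers


def percent_used(total, used):
    """Used space as a percentage of total (0.0 when total is zero)."""
    return 100.0 * used / total if total else 0.0


def check_disk(path="/", warn_percent=DISK_WARN_PERCENT):
    """One-line disk status for the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    # used/total can differ slightly from the Use% column of df,
    # which accounts for reserved blocks.
    pct = percent_used(usage.total, usage.used)
    status = "WARN" if pct >= warn_percent else "ok"
    return f"{status}: {path} is {pct:.0f}% full"


def check_load():
    """The 1, 5 and 15 minute load averages, as shown by top and uptime."""
    try:
        one, five, fifteen = os.getloadavg()
    except OSError:
        return "load average: unavailable"
    return f"load average: {one:.2f} {five:.2f} {fifteen:.2f}"


if __name__ == "__main__":
    print(check_disk("/"))
    print(check_load())
```

A newcomer only has to look for lines starting with WARN, which keeps the morning check short.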

HTH

anon091 · 10-11-2012, 07:58 AM

Thanks Chris, that does help, and those things you mentioned should probably be at the top of the list. Great start to the list, maybe some other people will chime in with other stuff as well.

oneindelijk · 10-11-2012, 08:48 AM

Try Cacti.
It's free and it does a decent job displaying all kinds of stuff in neat graphics.
You can find a lot of templates on the internet for various devices
(such as Linux machines, Cisco routers, etc.).

JaseP · 10-11-2012, 09:07 AM

Quote:

Originally Posted by unSpawn

...(or if you just want to play BOFH-like games with employees ;-p) ...

Oh, I love that reference!

anon091 · 10-11-2012, 04:46 PM

I tried Cacti, but all it did was graphs, from what I could tell.

That was a funny reference.

chrism01 · 10-12-2012, 12:20 AM

Generally I'd go with Nagios for alerts and Cacti for graphs.
Theoretically Nagios has graphs (sort of), but I can't recommend them.
There are many alerting and graphing tools, some of which purport to do both, but as with programming languages or Linux distros, there's no 'best'; it's a subjective choice.

As for my previous comment, it'd be pretty easy to write a simple script to do those basic checks once a day (e.g. at 4am) and email the results.
Obviously there's an extensive list of things you could check for, but I'd keep it pared down if I were you; you do want people to actually read them, right?
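
A rough sketch of the "email the results" half, assuming a mail relay is listening on localhost. The addresses, the subject format and the sample check lines are all placeholders, not anything from a real setup.

```python
import smtplib
import socket
from email.message import EmailMessage


def build_report(checks):
    """Join individual check results into a short plain-text body."""
    return "\n".join(checks)


def build_message(body, sender, recipient, host=None):
    """Wrap the report in an email whose subject names the server."""
    host = host or socket.gethostname()
    msg = EmailMessage()
    msg["Subject"] = f"Daily check: {host}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(body)
    return msg


def send(msg, smtp_host="localhost"):
    """Hand the message to an SMTP relay (assumed to run on localhost)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)


if __name__ == "__main__":
    # Placeholder results; in practice these come from the real checks.
    body = build_report(["ok: / is 42% full", "load average: 0.10 0.08 0.05"])
    msg = build_message(body, "checks@example.com", "admins@example.com")
    print(msg)  # swap for send(msg) once a mail relay is confirmed
```

To run it at 4am, a crontab entry along the lines of `0 4 * * * /usr/bin/python3 /path/to/daily_check.py` would do, where the script path is hypothetical. Keeping the body to a handful of lines fits the advice above about keeping the report short enough that people actually read it.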

Actually, a good tool that already does a lot for you as far as log check + email goes is Logwatch.

anon091 · 10-12-2012, 10:02 AM

Thanks Chris. I think here it's going to come down to Nagios or OpenNMS, just not sure which one yet. Seems Nagios has been around longer so more people use that one.