[SOLVED] Centralized monitoring solution for small computer network
Hi fellow linux users!
I have 3 machines at home, each running a different version of Slackware and all connected in a small network. Everything works fine so far. However, I am searching for a way to monitor all the machines from a single point. Right now, the machines are more or less independent. Ideally, it should be available through a webpage so I can access it from anywhere using only a web browser.
I have a desktop PC (my main machine, which I use every day but which doesn't run anything specific in the background: no Apache server, no database server, etc., just a stock install of Slack).
Another machine is a media center in my living room. This one is tailored toward media playback with XBMC but also doesn't run anything except a few scripts for MythTV and a small MySQL server for MythTV's recordings. Apart from that, I can't recall it running anything else.
Finally, the main machine is my home server. This machine runs dozens of processes and applications (MySQL, Apache, Zend, Zoneminder, CouchPotato, Sickbeard, SABnzbd+, Webmin, and literally dozens of rc scripts).
Basically, the solution I am seeking would provide me with a general overview of the machine's state (both hardware & software) and provide some information such as:
hard drive status
hard drive temperatures
storage status (RAID array states, LVM, leftover space & usage)...
Temperature sensors
RAM usage
CPU usage
The status of the servers & applications (mysql, apache, etc)
Some kind of "syslog" where I would have all relevant logs in the same place..
The machine's network info and hardware spec (mac address, IP address, uptime, blabla)
Logged on users
Running processes
Ethernet connection status (bandwidth, speeds, transfers, opened ports, etc)
In other words, something very similar to what Webmin provides, only I would like all machines to be accessible in the same place, without having to install Webmin on all machines. Having such a solution would prevent some bad stuff like MySQL running out of disk space and crashing other apps... My goal is to have all relevant info in a snapshot to identify potential problems and prevent losses. I like to use the CLI via SSH or TightVNC and run my own probing, but it's time consuming.
I am sure others have found a decent or excellent solution. I have found Nagios. Has anyone used it? Does it do what I want it to do? Can anybody provide feedback?
Please note that I am not really searching for a management console (to do admin stuff like installs/uninstalls or modifications) but a monitoring solution. I will keep doing all admin tasks locally on the machine or via SSH (I'm old school).
Hated nagios.
All that c-li (and I don't fear no c-li) editing for remote hosts, and more editing for user-supplied custom commands +\- syntax, blah blah blah. Inheritance in Nagios Objects is a bitch.
zabbix is what I choose to use.
It replaced Icinga+Cacti here at cirrhus9.com for our client host monitoring and alerts. It saves me time moving elements (as a copy of...) to new hosts.
the zabbix_agentd is in a repo near you and can be configured with 2 edits in the /etc/zabbix/zabbix_agentd.conf
Others may have more to say and I'm not sure what "Webmin provides".
It doesn't utilize rrd tools and the data is stored in a mysql db.
Hosts can be given a range of specialized "items": some are zabbix-specific (internal to zabbix) entries, others are snmp-based queries, and others are custom client-side scripts.
Properly configured, you could easily create a zabbix user that has R/O perms on the hosts assigned to that user, who can then log in from anywhere and view those hosts' data.
I got particularly interested in 3 of them (by comparing features).
Zabbix (the one you recommended), which seems very good but doesn't offer a trending forecast (minor setback IMO)
Pandora FMS: seems to have all the same features as Zabbix except it does trending forecast. Plus, the interface looks snappy and nice and uses Perl (better than PHP?)
Zenoss: similar to Pandora & Zabbix but uses Python for the interface.
Again, without trying all three of them and comparing (I won't, as I don't have time to do so), it seems to boil down to the UI coding language (Perl, Python or PHP)... Other than that, they all support MySQL, which just became a requirement for me.
Have you (or anyone else) tried any of the above solutions? If so, how did it go?
Well, I have used Nagios (for alerts etc) & Cacti for trend graphing.
I was quite happy with it, but everybody has preferences (much like distros).
You should have a good look at the relevant home websites for eg screenshots and the like.
As per that wikipedia page, there's a lot to choose from; there's no one 'best' for everybody.
I have tried zabbix on my Slackware server. The installation went fairly well (if you read the README.Slackware files that come with the packages).
However, I have some issues with the hosts config. After login, I see several red error messages saying "Unable to select configuration" and there is nothing in the pages (no machine info, etc.). Everything is empty.
I have tried to fiddle with the config files (in /etc/zabbix/) to no avail.
In the zabbix_server logs I see:
Code:
28771:20130213:221255.143 Sending list of active checks to [127.0.0.1] failed: host [localhost] not found
28771:20130213:221455.275 Sending list of active checks to [127.0.0.1] failed: host [localhost] not found
28771:20130213:221655.407 Sending list of active checks to [127.0.0.1] failed: host [localhost] not found
28771:20130213:221855.473 Sending list of active checks to [127.0.0.1] failed: host [localhost] not found
28771:20130213:222055.497 Sending list of active checks to [127.0.0.1] failed: host [localhost] not found
28771:20130213:222255.529 Sending list of active checks to [127.0.0.1] failed: host [localhost] not found
And in zabbix_agentd logs I see:
Code:
29927:20130213:221255.143 No active checks on server: host [localhost] not found
29927:20130213:221455.275 No active checks on server: host [localhost] not found
29927:20130213:221655.407 No active checks on server: host [localhost] not found
29927:20130213:221855.473 No active checks on server: host [localhost] not found
29927:20130213:222055.497 No active checks on server: host [localhost] not found
29927:20130213:222255.529 No active checks on server: host [localhost] not found
I wonder if these are related? The options in the config files are:
telnet localhost 10050 (after about 2 seconds the command exited with the message in bold below):
Code:
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Connection closed by foreign host.
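The telnet test above can also be scripted. What follows is only a minimal Python sketch, and the function name port_is_open is made up for illustration. It reports whether something accepts TCP connections on a port (10050 is the agent port from the telnet test). Like telnet, it says nothing about whether the agent is willing to answer the server's checks.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and connects; closing the
        # socket right away is enough for a reachability test.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


if __name__ == "__main__":
    # Same target as the telnet test: the agent port on the local machine.
    print(port_is_open("localhost", 10050))
```

A result of True here together with "host [localhost] not found" in the logs would suggest the network path is fine and the problem is in the host definition instead.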
If I replace the "localhost" statement for the Server (actually there is no Server argument, only DBHost), zabbix_server complains that it can't connect to the MySQL server on the server (Cannot connect to the mysql server at 192.168.0.101).
By reinstalling the databases, I ended up having a setup that works!!!
(So I think...). If that's true, then I am really not sure what happened the first time I installed the databases...
I am not very good with it so far. I have trouble navigating and finding the info I want to see. I guess a bit of practice and reading the manual will greatly help.
In the meantime, have you disabled some of the alarms such as:
Code:
POP3 server is down on Server
News (NNTP) server is down on Server
IMAP server is down on Server
FTP server is down on Server
Since I am not running these services on my local server, I find it useless (for me) that Zabbix reports these as problems... I'll dig more and post back some of the findings!
localhost is just a plain-jane vanilla host template definition that includes a broad range of checks for many standardized daemons and their ports for the Linux platform.
If you wish to disable those:
/zabbix/hosts.php > "zabbix server" > items.
you can toggle Active or Disabled for each of those elements.
Similarly you can also delete them by using the check-box (left column) and the drop-down at the bottom of list on any page.
I tend to disable before remove.
If you are not intending to ever have a need to monitor POP3, NNTP, IMAP, or FTP, then delete away.
You can always put them 'back' from the "Template Linux", surprise!, later.
zabbix_agentd on any foreign host and 2 edits to /etc/zabbix/zabbix_agentd.conf and restart, done. 5 minute client.
Hey Habitual, if you don't mind giving me a hand with the setup, just to get me going, I'll be very glad!
So I've added all hosts and the icons are blue (blue Z icon), which I believe indicates the server can communicate with the hosts.
What I don't understand is that there seem to be no metrics (things to measure) on the 2nd and 3rd hosts I added. For the first host I added (the actual server where zabbix server runs), the host was neatly organized and metrics were categorized as such:
But for the other hosts, I only get a "dump" of the metrics, not organized by categories. I am confused as to how the "templates" work and what they are actually doing.
What use does the network map have? I have built mine but I'm not sure what it's good for.
As stupid as it sounds, can you monitor little network appliances such as switches, a Wi-Fi access point, and a UPS connected via USB to a workstation?
Also, do you know if there is a way to add custom triggers or metrics? For example, on my server, I run a lot of PHP/Python apps (SAB, CouchPotato, Newznab, Sickbeard, and other apps I access through the browser). I'd like zabbix to monitor these as well to make sure they run. Same for Slackware's well-known rc scripts.
If you know any of these answers, please share! In the meantime, I will continue playing with it!
Thanks!
Quote:
hey Habitual...
What I don't understand is that there seem to be no metrics (things to measure) on the 2nd and 3rd hosts I added.
Please call me JJ, Habitual is just an .alias I use for formality.
Understanding will come.
These 2nd and 3rd hosts...
You installed zabbix_agentd on them, I take it?
These 2 values need to be edited on each remote host that has a "host" entry at /zabbix/hosts.php
Code:
Hostname=uniq_identifier used in /zabbix/hosts.php
Server=IP.of.ZBX.SRV
Don't forget to "bounce" the zabbix_agentd daemon.
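To double-check those two values on each remote host, something like the following Python sketch could read them back from the config. The function name read_agent_settings is made up, and the sample text and its values are placeholders, not real settings. It assumes the file uses plain key=value lines with # for comments.

```python
def read_agent_settings(conf_text: str) -> dict:
    """Parse key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            # If a key appears more than once, the last value wins here.
            settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    # Placeholder content; on a real host you would read
    # /etc/zabbix/zabbix_agentd.conf instead.
    sample = "# agent config\nServer=192.0.2.10\nHostname=mediacenter\n"
    found = read_agent_settings(sample)
    for wanted in ("Hostname", "Server"):
        print(wanted, "=", found.get(wanted, "<missing>"))
```

The Hostname printed on the remote host should match the host entry in /zabbix/hosts.php exactly.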
These are titled "Applications" in /zabbix/hosts.php and are just 'logical' groupings defined in each host, or template used for it.
Items are also grouped this way under /zabbix/latest.php, which gives you an easy way to classify relevant checks.
Also under /zabbix/hosts.php you can easily edit one of those metrics by "getting there faster" should you need/desire to edit. Let me give a real-life example...
I use a combination of zabbix_agent and snmp items to pull data from a remote host.
When I use the "Create Item" function in /zabbix/items.php (We're adding a desired metric) I designate 'snmp' under the "New application" entry on the create item sub-page, and I have all snmp items grouped neatly together as an Application subset at /zabbix/hosts.php or /zabbix/latest.php
This is a time-saving feature IMO and saves you, me, and we a lot of time scrolling through what can be pages and pages of "items" in a hosts' definition. They are merely a classification mechanism and a neat navigation feature.
As an snmp-related note: sometimes you'll have a target to monitor where the agent can't be installed. For such hosts, zabbix will have to rely on simple checks or snmp data only.
Quote:
What I don't understand is that there seem to be no metrics (things to measure) on the 2nd and 3rd hosts I added. For the first host I added (the actual server where zabbix server runs), the host was neatly organized and metrics were categorized as such:
But for the other hosts, I only get a "dump" of the metrics, not organized by categories. I am confused as to how the "templates" work and what they are actually doing.
I'll show you how to do a mass update to the list of items that are "dumped" without any categories. It's a 2 step affair and about 2 minutes. No data loss. You just concentrate on getting comfortable with the new shiny thingie.
Quote:
What use does the network map have? I have built mine but I'm not sure what it's good for.
Maps are great for a quick summary or "cheat-sheet" of what is happening, graphically, with a particular set of hosts. See attached, I use it for a NOC View. Each element is just a host Type and I believe the rendered map will show any errors on the host. The way /zabbix/dashboard.php shows you issues via "Last 20 issues" in a textual manner, the map shows issues graphically.
Quote:
As stupid as it sounds, can you monitor little network appliances such as switches, a Wifi access point, and a UPS connected via USB to a workstation??
Switches and UPS I believe are do-able. Any host you can install zabbix_agentd on, or query with net-snmp, is a candidate for monitoring. UPS via USB? Sure, monitor the mount or something a little more clever.
IPMI also. Wifi: zabbix_agentd.conf can be configured to work.
Quote:
Also, do you know if there is a way to add custom triggers or metrics? For example, on my server, I run a lot of PHP/Python apps (SAB, CouchPotato, Newznab, Sickbeard, and other apps I access through the browser). I'd like zabbix to monitor these as well to make sure they run.
Same for Slackware's well-known rc scripts.
You lost me on the rc scripting requirement? "Custom triggers or metrics" we may have to discuss but there are tons and tons of ways to customize triggers. If Network is down on X, don't report process Y as down, or the Z[fs] mount as 'down'. It's a bit complicated and not my strong suit but the zabbix.com forum community is helpful.
wrt:
Quote:
(SAB, Couchpotato, Newznab, Sickbeard)
Easy, Easier, especially if they have their own port
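For browser-based apps like these, one way to see whether an app is answering on its own port is a plain HTTP request. The following is only a Python sketch: the function name http_alive is made up, and the URL in the usage lines is a placeholder, not the real address of any of the apps. It treats any HTTP response, including an error page or a login prompt, as "the app is up".

```python
import urllib.error
import urllib.request

# Opener that ignores any proxy settings from the environment, since the
# targets are assumed to be on the local network.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def http_alive(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with any HTTP response at all."""
    try:
        with _opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server replied with an error status (401, 404, ...),
        # so something is listening and speaking HTTP.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure and similar.
        return False


if __name__ == "__main__":
    # Placeholder address and port; substitute the app's real URL.
    print(http_alive("http://127.0.0.1:8080/"))
```

A check like this only tells you the web front end responds, not that the app is doing its job behind it.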
Scripts on the remote host to check that same host are very easy and are achieved in 2 ways, a remote script file proper, or a UserParameter= entry in /etc/zabbix/zabbix_agentd.conf on the desired target.
I added this to /etc/zabbix/zabbix_agentd.conf on one of my client's hosts to check for the mongod process. It does a "pidof mongod | wc -l" and reports the result back to zabbix, so zabbix knows whether mongod is running or not.
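The same kind of check can live in a small script that a UserParameter= entry calls. Below is a hedged Python sketch, not the exact entry used above: the function name, the script path and the item key in the comment are all hypothetical. It prints 1 if pidof finds the process and 0 otherwise, similar to the "pidof mongod | wc -l" approach.

```python
import subprocess
import sys

# Assumed wiring in /etc/zabbix/zabbix_agentd.conf (key and path are
# made up for illustration):
#   UserParameter=proc.running[*],/usr/local/bin/proc_running.py $1


def process_running_flag(name: str) -> int:
    """Return 1 if pidof reports at least one PID for name, else 0."""
    try:
        result = subprocess.run(
            ["pidof", name], capture_output=True, text=True
        )
    except FileNotFoundError:
        # pidof itself is not installed; report 0 like a missing process.
        return 0
    return 1 if result.stdout.strip() else 0


if __name__ == "__main__":
    # Default to mongod, the process from the example above.
    target = sys.argv[1] if len(sys.argv) > 1 else "mongod"
    print(process_running_flag(target))
```

On the zabbix side, an item using that key would then receive 0 or 1, and a trigger could fire when the value is 0.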