Sorry for the long post. I am completely new to Linux and am having difficulty with this particular problem, so any help would be great!
We run the MRTG application for router monitoring on a Linux server, polling around 300 routers. The server has 2 CPUs and 6 GB of RAM.
For this we have 200+ cron jobs scheduled to start every 5 minutes, so that every router is polled every 5 minutes.
My question is: is there a limit on how many cron jobs can run on a server, either as a matter of good practice or as a hard limit of the cron queue? I don't see any related configuration on the server currently.
The problem is that the server goes down every 2-3 days, and there are no panic or other messages in the messages file that point to a cause.
The only message we see is:
crond: System error
which keeps appearing every minute or so.
This has prompted me to suspect that the server is going down because the cron queue keeps growing. Could that be the case?
Also, in top I can see around 800-1400 zombie processes, and the ps output shows lots of crond [defunct] processes.
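For anyone who wants to check the same thing on their own box, here is a minimal Python sketch of how the zombies could be counted per parent process, to see whether they really all hang off crond. It assumes the text comes from `ps -eo stat,ppid,pid,comm`; the function name and the sample data are made up for illustration and are not output from our server.

```python
from collections import Counter


def count_zombies_by_parent(ps_output: str) -> Counter:
    """Count zombie (state Z) processes per parent PID.

    Expects text in the layout printed by `ps -eo stat,ppid,pid,comm`:
    one header row, then STAT, PPID, PID and COMMAND columns.
    """
    zombies = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue  # ignore blank or truncated lines
        stat, ppid = fields[0], fields[1]
        if stat.startswith("Z"):  # Z marks a defunct (zombie) process
            zombies[ppid] += 1
    return zombies


# Hypothetical sample for illustration only, not real server output.
SAMPLE = """\
STAT  PPID   PID COMMAND
Ss       1  2101 crond
Z     2101  2150 crond <defunct>
Z     2101  2151 crond <defunct>
S     2101  2160 mrtg
"""

if __name__ == "__main__":
    for ppid, n in count_zombies_by_parent(SAMPLE).most_common():
        print(f"parent {ppid}: {n} zombie(s)")
```

On a live system the sample string could be replaced with the output of `subprocess.run(["ps", "-eo", "stat,ppid,pid,comm"], capture_output=True, text=True).stdout`. If nearly all zombies share one parent PID, that parent is the process that is not reaping its children.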