Fedora This forum is for the discussion of the Fedora Project. |
|
 |
08-04-2009, 06:33 PM
|
#1
|
LQ Newbie
Registered: Aug 2009
Posts: 3
Rep:
|
Hello,
I've had this Fedora box set up for me to use as a mail server. It's running sendmail and I'm using pop3d to let users check their email.
The problem is that for the past 3-4 days the server has kept crashing at around 10am. I want to know how I should go about troubleshooting this. Where are the logs I can look at? /var/log doesn't seem to have the proper logs.
Thank you
Just an update:
I checked the "messages" and "secure" files in /var/log and all I see is someone running a brute-force attack on my SSH port, trying different users and failing.
Last edited by unSpawn; 08-04-2009 at 07:31 PM.
Reason: //Merge posts to retain 0-reply status
|
|
|
08-04-2009, 07:55 PM
|
#2
|
Moderator
Registered: May 2001
Posts: 29,415
|
Quote:
Originally Posted by kir2u
The problem is now that for the past 3-4 days the server keeps crashing at around 10am. I wanted to know how i would go about troubleshooting this. Where is the logs i can look out? /var/logs doesn't seem to have the proper logs.
|
Crashing how? Does it reboot spontaneously, or do you have to reboot it? Does it show errors on the console or when you log in? When a machine reboots unexpectedly, reading back the /var/log/messages lines from around the time of the reboot might reveal which processes ran or errored out. Also check when logrotate kicks in (/etc/crontab) and with what configuration (/etc/logrotate.d/syslog) so you know whether you also need to read archived copies of /var/log/messages. Since 10AM sounds too regular, I'd check (copies of) /var/log/cron and root's crontab (/var/spool/cron/root) as well. If none of the logs reveal clues around the time of the reboot, you might want to start logging more information by tweaking what gets logged in /etc/syslog.conf (e.g. '*.debug -/var/log/debug'), running SMART checks, and collecting system statistics with Atop, Dstat or Collectl.
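A minimal sketch of reading back /var/log/messages around the suspect time, assuming the traditional syslog line layout (month, day, HH:MM:SS, host, message). The function name lines_in_window and the 09:45 to 10:15 window are illustrative choices, not something taken from the system in this thread:

```python
from datetime import datetime, time


def lines_in_window(lines, start=time(9, 45), end=time(10, 15)):
    """Yield syslog-style lines whose time of day falls in [start, end].

    Assumes each line begins with "Mon DD HH:MM:SS"; lines that do not
    parse that way are skipped rather than treated as errors.
    """
    for line in lines:
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue
        try:
            stamp = datetime.strptime(parts[2], "%H:%M:%S").time()
        except ValueError:
            continue
        if start <= stamp <= end:
            yield line


if __name__ == "__main__":
    # Reading this file usually requires root privileges.
    with open("/var/log/messages") as fh:
        for hit in lines_in_window(fh):
            print(hit, end="")
```

The same function can be pointed at /var/log/cron or at a rotated copy of the log, as long as the file uses the same timestamp layout.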
|
|
|
08-04-2009, 10:11 PM
|
#3
|
LQ Newbie
Registered: Aug 2009
Posts: 3
Original Poster
Rep:
|
Quote:
Originally Posted by unSpawn
Crashing how? Does it reboot spontaneously? Or do you have to reboot it? Does it show errors on the console or when you log in? When a machine reboots unintendedly, reading back /var/log/messages lines from the approximate time of reboot might reveal information about processes that ran or errored out. Also check at which time logrotate kicks in (/etc/crontab) and with what configuration (/etc/logrotate.d/syslog) so you know if you also need to read back archived copies of /var/log/messages. Since 10AM sounds too regular I'd check (copies of) /var/log/cron and root crontab (/var/spool/cron/root) as well. If none of the logs reveal clues at the approximate time of reboot then you might want to start logging more information by tweaking what gets logged in /etc/syslog.conf (e.g.: '*.debug -/var/log/debug'), running SMART checks and collect system statistics with Atop, Dstat or Collectl.
|
I don't think it's an actual reboot. It just hangs in some way: I stop getting my mail and can't SSH to the box, so I have to manually restart the server to get back into it.
|
|
|
08-04-2009, 10:18 PM
|
#4
|
LQ Newbie
Registered: Aug 2009
Posts: 3
Original Poster
Rep:
|
Another thing I see is:
error: stat of /var/log/ppp/connect-errors failed: No such file or directory
when I run: logrotate /etc/logrotate.conf
Could this be it? It's in the daily cron folder.
|
|
|
08-05-2009, 12:46 AM
|
#5
|
LQ Guru
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,430
|
That probably(??) shouldn't cause as much trouble as you're having, but it's definitely worth fixing.
Have a good look through your logfiles for anything at that time of day or just before.
|
|
|
08-05-2009, 05:18 AM
|
#6
|
Moderator
Registered: May 2001
Posts: 29,415
|
Quote:
Originally Posted by kir2u
i dont think it's an actual reboot. It just hangs of some sort because i stop getting my mails and can't SSH to the box so have to manually restart the server to get back into it.
|
Depending on hardware specs and load, a machine may appear to hang for some period of time, but since you did not post details showing that it actually crashed, that's just speculation. If "manually restart the server" means hard resetting the machine, then you may expect all sorts of problems. Filesystems are quite robust, but they were not designed to survive repeated, deliberate power cuts like that. As Chrism01 said, the missing /var/log/ppp/connect-errors is not going to make the machine hang. I think I gave you enough pointers to get started, so do get back to us in more detail about which logs you looked at and what you found.
|
|
|
08-06-2009, 07:49 AM
|
#7
|
Member
Registered: Jul 2003
Posts: 244
Rep:
|
One thing I've seen on rare occasions is system hangs caused by flaky hardware or by a high-priority process that takes over the system; both show up as gaps in collectl data. In other words, when collectl runs as a daemon and takes samples every 10 seconds, the samples are 10 seconds apart to within a millisecond, with virtually no missed samples. If some piece of hardware misbehaves, or some very high-priority process such as the 'oom killer' takes over the system, no other process gets any run time until it finishes. This shows up as a few missing collectl samples, and sometimes as much as several minutes' worth.
-mark
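To illustrate the idea of spotting gaps, here is a small sketch. It assumes the sample times have already been extracted from the collected data as epoch seconds; it does not parse any collectl file format, and find_gaps and its tolerance parameter are hypothetical names chosen for this example:

```python
def find_gaps(timestamps, interval=10.0, tolerance=1.0):
    """Return (previous, current, missed) for each pair of consecutive
    samples that are further apart than interval + tolerance seconds.

    'missed' is an estimate of how many samples should have landed
    in between the two.
    """
    gaps = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        if delta > interval + tolerance:
            missed = round(delta / interval) - 1
            gaps.append((prev, cur, missed))
    return gaps


if __name__ == "__main__":
    # Made-up sample times: one 40-second hole between 20 and 60.
    for prev, cur, missed in find_gaps([0, 10, 20, 60, 70]):
        print(f"gap from {prev} to {cur}: about {missed} samples missing")
```

A gap that lines up with the 10am hang would suggest looking at what was running just before the last good sample.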
|
|
|
08-06-2009, 08:13 PM
|
#8
|
LQ Guru
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,430
|
If it's always around 10am, I'd start by looking at all the crontabs... and also look through any logfiles at about that time (start from 09:45).
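As a rough illustration of scanning crontabs for jobs that can fire around 10am, the sketch below checks the hour field of each entry. It assumes plain numeric schedules with five time fields followed by a command (no @daily-style shortcuts), and the helper names field_matches and jobs_in_hours are hypothetical:

```python
def field_matches(field, value):
    """True if a numeric cron field (e.g. '*', '10', '9-11', '8-18/2',
    '9,10', '*/2') includes the given value. Hour range 0-23 is assumed
    when expanding '*'."""
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_text = part.split("/", 1)
            step = int(step_text)
        if part == "*":
            first, last = 0, 23
        elif "-" in part:
            lo, hi = part.split("-", 1)
            first, last = int(lo), int(hi)
        else:
            first = last = int(part)
        if value in range(first, last + 1, step):
            return True
    return False


def jobs_in_hours(lines, hours=(9, 10)):
    """Return crontab entries whose hour field matches any of 'hours'.
    Comments, blank lines, variable assignments and anything that does
    not parse as a numeric schedule are skipped."""
    hits = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 6:
            continue
        try:
            if any(field_matches(fields[1], h) for h in hours):
                hits.append(line)
        except ValueError:
            continue
    return hits


if __name__ == "__main__":
    with open("/etc/crontab") as fh:
        for job in jobs_in_hours(fh):
            print(job)
```

Entries that run every hour will also be listed, which is intended: an hourly job can be the one that misbehaves at 10am.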
|
|
|
08-06-2009, 09:00 PM
|
#9
|
Moderator
Registered: May 2001
Posts: 29,415
|
I already mentioned all of that in post #2.
|
|
|
All times are GMT -5.