LinuxQuestions.org
Old 06-26-2008, 12:25 PM   #1
baazil1
LQ Newbie
 
Registered: Jun 2008
Location: Cincinnati
Posts: 5

Rep: Reputation: 0
Linux Server Crashing and this newb don't know why


OK, so I was thrown into a situation and am hoping this place can help me out. If someone is REALLY interested in helping me out, baazil1 is my AIM and Yahoo IM alias, contact me anytime.

Here is the situation:

First off, operating system and software:
Fedora Core 6 with Plesk 8.2 (64 bit)
- hosted at 1and1.com

Basically, the server keeps crashing on a somewhat regular basis. After reading through threads and googling some things, I found the messages file in /var/log and opened it. I found the following:

Code:
Jun 24 10:53:36 u15288850 dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Jun 24 10:53:36 u15288850 dhclient: DHCPACK from xxx.xxx.xxx.xxx
Jun 24 10:53:36 u15288850 dhclient: bound to xxx.xxx.xxx.xxx -- renewal in 74067 seconds.


Jun 26 01:45:51 u15288850 xinetd[3306]: START: smtp pid=10713 from=xxx.xxx.xxx.xxx
Jun 26 01:45:58 u15288850 dhclient: DHCPREQUEST on eth0 to xx.xxx.x.xxx port 67
Jun 26 01:46:10 u15288850 dhclient: DHCPREQUEST on eth0 to xx.xxx.x.xxx port 67
Jun 26 01:46:21 u15288850 xinetd[3306]: EXIT: smtp status=1 pid=10713 duration=30(sec)
Jun 26 01:46:22 u15288850 dhclient: DHCPREQUEST on eth0 to xx.xxx.x.xxx port 67
Jun 26 01:47:09 u15288850 last message repeated 3 times
Jun 26 01:48:20 u15288850 last message repeated 5 times
Jun 26 01:49:26 u15288850 last message repeated 5 times
Jun 26 01:50:28 u15288850 last message repeated 5 times
Jun 26 01:51:43 u15288850 last message repeated 7 times
Jun 26 01:52:51 u15288850 last message repeated 5 times
Jun 26 01:54:00 u15288850 last message repeated 5 times
Jun 26 01:55:16 u15288850 last message repeated 4 times
Jun 26 01:56:24 u15288850 last message repeated 6 times
Jun 26 01:57:34 u15288850 last message repeated 4 times
Jun 26 01:58:40 u15288850 last message repeated 4 times
Jun 26 01:59:45 u15288850 last message repeated 5 times
Jun 26 02:01:05 u15288850 last message repeated 5 times
Jun 26 02:02:16 u15288850 last message repeated 5 times
Jun 26 02:03:19 u15288850 last message repeated 4 times
Jun 26 02:04:28 u15288850 last message repeated 6 times
Jun 26 02:05:32 u15288850 last message repeated 5 times
Jun 26 02:06:43 u15288850 last message repeated 5 times
Jun 26 02:07:46 u15288850 last message repeated 5 times
Jun 26 02:08:56 u15288850 last message repeated 5 times
Jun 26 02:10:03 u15288850 last message repeated 4 times
Jun 26 02:11:14 u15288850 last message repeated 5 times
Jun 26 02:12:15 u15288850 last message repeated 5 times
Jun 26 02:13:19 u15288850 last message repeated 4 times
Jun 26 02:14:23 u15288850 last message repeated 4 times
Jun 26 02:15:28 u15288850 last message repeated 4 times
Jun 26 02:16:29 u15288850 last message repeated 4 times
Jun 26 02:17:40 u15288850 last message repeated 6 times
Jun 26 02:18:46 u15288850 last message repeated 7 times
Jun 26 02:19:56 u15288850 last message repeated 4 times
Jun 26 02:21:05 u15288850 last message repeated 7 times
Jun 26 02:22:13 u15288850 last message repeated 5 times
Jun 26 02:23:31 u15288850 last message repeated 4 times
Jun 26 02:24:35 u15288850 last message repeated 4 times
Jun 26 02:25:47 u15288850 last message repeated 5 times
Jun 26 02:27:00 u15288850 last message repeated 5 times
Jun 26 02:28:03 u15288850 last message repeated 4 times
Jun 26 02:29:19 u15288850 last message repeated 6 times
Jun 26 05:24:10 u15288850 syslogd 1.4.1: restart.
Jun 26 05:24:10 u15288850 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Jun 26 05:24:10 u15288850 kernel: Linux version 2.6.23.14-20080205a (root@buildd-amd64) (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #1 SMP Tue Feb 5 14:13:59 CET 2008
I thought the lines from June 24 would be useful, as that is pretty much the only change over a three-day period. The rest of the log is condensed to what takes place on the June 26 lines. At some point toward the end of the log (which I marked in bold), the server stopped responding and had to be restarted.

Can anyone help me understand what is going on here? Some more useful information: I have basically NO experience with Linux servers, Fedora, or Plesk. I am more of a Windows guy (please hold applause) and was asked to take a look here. I can get around the server via SSH, so if there is another file or log that would be helpful to post, just let me know which one or ones to gather.

Thanks in advance for any help. Again, if you feel inclined to be a super helper, buzz me via IM at baazil1 (AOL and Yahoo)

Thanks and take care,
Clay

PS. From what I gather, it seems as if the DHCP client on the server is requesting an IP address A LOT. Periodically, the log is interrupted by someone checking or sending mail, and then for some reason the server stops responding. 1and1.com have been absolutely NO HELP at all (not surprising) and I am at my last resort here, so, again, any help would be grand.

Last edited by baazil1; 06-26-2008 at 12:27 PM.
 
Old 06-26-2008, 01:03 PM   #2
dkm999
Member
 
Registered: Nov 2006
Location: Seattle, WA
Distribution: Fedora
Posts: 407

Rep: Reputation: 35
First off, my condolences on your specialization in Windoze.

Second, the log messages you have posted mostly indicate that this box does not have a static IP address, and that it is trying desperately to get one from the DHCP server at xx.xxx.x.xxx. There could be several reasons for this lack of success: first, there may not be a DHCP server at that address; or, there might be one there, but it is configured to deny your request; or, there might be some firewall in the way (perhaps even on your machine) that is interfering either with the outbound packet (to UDP port 67 on the server) or with the return (to UDP port 68 on your machine).

Your June 24 messages show that, when the DHCP client broadcasts its request for an address, it succeeds, so if there is a firewall problem, it is not a blanket blockage on the UDP ports. But once the DHCP client has heard from the server, it wants to try to renew the lease with that same server. The succeeding requests directed to that address seem to be getting lost. Perhaps a tcpdump trace of UDP traffic on ports 67 and 68 would reveal whether the problem is outbound or inbound.

Third, is there any actual evidence that this system crashes, or only that it becomes unreachable? I realize this is a fine distinction, but one that may affect the analysis. It appears that the DHCP client activity may precede the expiration of the IP address lease, and once it expires, the machine may just become unreachable.

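Parenthetically, your kernel banner shows a kernel built with a Debian gcc on a host named buildd-amd64, not a stock Fedora Core 6 kernel. That says nothing certain about the installed distribution (it may simply be a kernel supplied by the hosting company), and it should not make too much difference in working out this problem.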
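
To check the renewal pattern described above against the whole log rather than an excerpt, a rough shell sketch like the following could count the dhclient requests, acknowledgements, and reboots. The function name dhcp_summary is made up for illustration, and the default path is simply the file quoted in the first post. The counts are only a lower bound, because syslog collapses repeats into "last message repeated N times" lines.

```shell
#!/bin/sh
# dhcp_summary [LOGFILE]
# Hypothetical helper: summarise dhclient activity in a syslog file.
# The default path is an assumption taken from the first post.
dhcp_summary() {
    log="${1:-/var/log/messages}"
    # grep -c prints 0 and returns non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    requests=$(grep -c 'dhclient: DHCPREQUEST' "$log" || true)
    acks=$(grep -c 'dhclient: DHCPACK' "$log" || true)
    # syslogd logs one "restart" line each time it starts, i.e. once per boot.
    boots=$(grep -c 'syslogd .*: restart' "$log" || true)
    echo "requests=$requests acks=$acks boots=$boots"
}
```

Run as root, `dhcp_summary /var/log/messages` prints a single line of counts. Many requests with very few acknowledgements would be consistent with the renewals getting lost.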
 
Old 06-26-2008, 01:49 PM   #3
baazil1
LQ Newbie
 
Registered: Jun 2008
Location: Cincinnati
Posts: 5

Original Poster
Rep: Reputation: 0
Information good

Yes, sadly, Windows is what I grew up on and is all I really know, but I have been thinking about a switch to Linux, and based on the wealth of information in that last post, I can see it is the way to go. Windows forums have one-sentence replies that leave you scratching your head, asking the same question again, but I digress.

With all that information (thanks again), I need to pull the newbie card out again. I'll try to be precise.

There is a firewall set up on the server. It is activated; here is the current list of rules associated with it:

Code:
1	All	Any	21	TCP	Allow
2	All	Any	22	TCP	Allow
3	All	Any	80	TCP	Allow
4	All	Any	443	TCP	Allow
5	All	Any	25	TCP	Allow
6	All	Any	110	TCP	Allow
7	All	Any	143	TCP	Allow
8	All	Any	465	TCP	Allow
9	All	Any	993	TCP	Allow
10	All	Any	995	TCP	Allow
11	All	Any	8443	TCP	Allow
12	All	123	Any	UDP	Allow
13	All	53	Any	UDP	Allow
14	All			ICMP	Allow
15	All	Any	66	UDP	Allow
16	All	Any	67	UDP	Allow
I added the last two, not knowing whether this is good practice (probably not) or whether I got them right. Is this the way to go, or do the previous rules already allow for this?

How does one run a tcpdump trace effectively? (again, extreme newb)

The evidence of the apparent crash comes from the web site that is hosted on the server. The web site becomes unreachable, and the server is presumed crashed and restarted.

Thanks again for the in-depth explanation. It was insightful; I just don't know enough about Linux, yet, to get by.

Thanks and take care,
Clay
 
Old 06-26-2008, 02:18 PM   #4
dkm999
Member
 
Registered: Nov 2006
Location: Seattle, WA
Distribution: Fedora
Posts: 407

Rep: Reputation: 35
I am unfamiliar with the format you posted for your firewall rules; this must not be an iptables ruleset. As far as DHCP is concerned, the client sends its requests to UDP port 67 on the server, and the server sends its replies to UDP port 68 on the client. If the last two rules are meant to cover DHCP, the one for port 66 should therefore be for port 68 instead.
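
For what it is worth, a listing like the one posted can be checked mechanically. DHCP replies travel from UDP port 67 on the server to UDP port 68 on the client, so the question is whether any rule admits that pair. The sketch below assumes the rules are saved as tab-separated text with the columns rule number, remote IP, remote port, local port, protocol, action; that column order and the function name allows_udp are assumptions, not something the control panel is known to export.

```shell
#!/bin/sh
# allows_udp REMOTE_PORT LOCAL_PORT RULES_FILE
# Hypothetical helper: succeeds if some "Allow" rule for UDP matches the
# given remote and local ports ("Any" in a column matches every port).
# Assumed columns: number, remote IP, remote port, local port, protocol, action.
allows_udp() {
    rport="$1"
    lport="$2"
    rules="$3"
    awk -F '\t' -v r="$rport" -v l="$lport" '
        $5 == "UDP" && $6 == "Allow" &&
        ($3 == "Any" || $3 == r) &&
        ($4 == "Any" || $4 == l) { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$rules"
}
```

With the sixteen rules as posted, `allows_udp 67 68 rules.txt` would fail, since no UDP rule admits local port 68. Whether the provider's firewall lets replies to outbound requests through on its own is not something this thread establishes.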

To run tcpdump in this situation (where you will need to restart the server in order to see the result), you need to do a two-step: first, capture packets and write them to a file, and (when the server comes back to life), read out the results. To do this trace, you will probably have to become super-user. This is the way you give yourself temporary root privileges; the command is su. Here is a pair of commands that should capture the necessary data:
Code:
# tcpdump -i eth0 -U -s 0 -w trace.dmp 'udp and (port 67 or port 68)'
and
Code:
# tcpdump -nnX -r trace.dmp
The first command captures any UDP packets on the DHCP ports (67 for the server, 68 for the client) and stores them in trace.dmp, writing to the file after each packet. The filter is quoted so that the shell passes it to tcpdump unchanged, and -s 0 keeps whole packets so that the DHCP options are not cut off. (Be sure you know what directory this file is going into, so that you can connect to the same directory after the reboot.) The second produces a (numeric) listing of this dump file. Armed with that, you ought to be in better shape to see where the DHCP exchange is breaking down.
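
One detail worth spelling out: the shell, not tcpdump, interprets characters such as |, & and parentheses, so a filter expression that contains them has to be quoted. The sketch below shows the difference and prints a capture command instead of running it, so it can be inspected without root. Both function names are hypothetical, and the ports are the standard DHCP pair (server 67, client 68).

```shell
#!/bin/sh
# count_args prints how many separate arguments the shell handed over.
# An unquoted filter arrives as several words; a quoted one arrives as one.
count_args() {
    echo "$#"
}

# dhcp_capture_cmd [INTERFACE] [OUTFILE]
# Hypothetical helper: prints (does not run) a tcpdump capture command
# with the filter expression safely quoted.
dhcp_capture_cmd() {
    iface="${1:-eth0}"
    out="${2:-trace.dmp}"
    printf "tcpdump -i %s -U -s 0 -w %s 'udp and (port 67 or port 68)'\n" \
        "$iface" "$out"
}
```

For example, `count_args udp port 67` reports three arguments while `count_args 'udp and (port 67 or port 68)'` reports one. Written unquoted, a sequence like `66||67` never reaches tcpdump at all: the shell treats `||` as "run the next command if the first one fails".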
 
Old 06-26-2008, 02:24 PM   #5
baazil1
LQ Newbie
 
Registered: Jun 2008
Location: Cincinnati
Posts: 5

Original Poster
Rep: Reputation: 0
The port read out

It was the 1and1.com control panel gui. Sorry, I should have posted that and written out the column titles.

Here it is again:

Code:
#	RmtIP	RmtPort	LclPort	Protocol	Action
1	All	Any	21	TCP	Allow
2	All	Any	22	TCP	Allow
3	All	Any	80	TCP	Allow
4	All	Any	443	TCP	Allow
5	All	Any	25	TCP	Allow
6	All	Any	110	TCP	Allow
7	All	Any	143	TCP	Allow
8	All	Any	465	TCP	Allow
9	All	Any	993	TCP	Allow
10	All	Any	995	TCP	Allow
11	All	Any	8443	TCP	Allow
12	All	123	Any	UDP	Allow
13	All	53	Any	UDP	Allow
14	All			ICMP	Allow
15	All	Any	66	UDP	Allow
16	All	Any	67	UDP	Allow
I will follow your instructions and see if I can get some more information.

Thanks again,
Clay

Last edited by baazil1; 06-26-2008 at 02:32 PM.
 
Old 06-26-2008, 02:47 PM   #6
baazil1
LQ Newbie
 
Registered: Jun 2008
Location: Cincinnati
Posts: 5

Original Poster
Rep: Reputation: 0
The readout

I cannot read it. I performed the tcpdump and rebooted the server. When it came back up, I ran the command you gave and received...

Code:
[root@u15288850 /]# tcpdump -nnX -r trace.dmp
reading from file trace.dmp, link-type EN10MB (Ethernet)
[root@u15288850 /]#
I even tried vi trace.dmp, to no avail.

I am using PuTTY over SSH to log into the box. That's about the extent of my knowledge when it comes to PuTTY commands or even Linux commands.

I'll keep trying.

Thanks,
clay
 
Old 06-26-2008, 04:18 PM   #7
dkm999
Member
 
Registered: Nov 2006
Location: Seattle, WA
Distribution: Fedora
Posts: 407

Rep: Reputation: 35
Sorry, I was imprecise in my instructions; I guess I was thinking about my own world, where I could just log onto the console and run the tcpdump command and let it wait forever.

You will need to send that first command off and wait for the problem to develop; this is complicated by the fact that you are logging in remotely via PuTTY, because you will want to log off, which will terminate the command. To get this right remotely, you will need to get tcpdump running without supervision. On my system, I was able to do this by this incantation (borrowed from the way that the Fedora system starts up daemons)
Code:
$ su
Password:
# runuser -s /bin/bash -c "tcpdump -i eth0 -U -s 0 -w trace.dmp 'udp and (port 67 or port 68)' >/dev/null 2>&1" &
# ^D
$ logout
What this does is to send off a shell to execute your tcpdump write phase, and detach it from your login session. Then you wait for your server to become inaccessible, reboot (which will terminate the tcpdump and its parents), log in again, and execute the read phase directly from your shell.
 
Old 06-26-2008, 06:57 PM   #8
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.9, Centos 7.3
Posts: 17,362

Rep: Reputation: 2377
Actually, you could just use the nohup cmd to detach. That's what it's for:

nohup tcpdump -i eth0 -U -s 0 -w trace.dmp 'udp and (port 67 or port 68)' &
 
Old 06-26-2008, 08:17 PM   #9
dkm999
Member
 
Registered: Nov 2006
Location: Seattle, WA
Distribution: Fedora
Posts: 407

Rep: Reputation: 35
Well, glory be. One learns all sorts of things on these boards. :-)
 
Old 06-27-2008, 01:28 AM   #10
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.9, Centos 7.3
Posts: 17,362

Rep: Reputation: 2377

Just FYI, for a history lesson, it's short for no-hangup, from the old days, when if you weren't in the same building, you were almost definitely on a dial-up connection.
(and yes, I've used an acoustic coupler..)

'&' just puts the prog in the background, but does not disconnect it from the terminal session, which is why it dies when you log out. Normally, these days, you'll be warned if you do that.
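
Putting the two pieces together (nohup to survive the logout, & to get the prompt back), a small wrapper along these lines might be handy. The name run_detached is hypothetical; it sends the command's output to a log file of your choosing instead of nohup.out and prints the process ID so the job can be stopped later.

```shell
#!/bin/sh
# run_detached LOGFILE COMMAND [ARGS...]
# Hypothetical helper: starts COMMAND immune to hangup signals, detached
# from the terminal's input and output, and prints its process ID.
run_detached() {
    log="$1"
    shift
    # nohup ignores SIGHUP for the command; the redirections make sure
    # nothing is left attached to the login session.
    nohup "$@" >"$log" 2>&1 </dev/null &
    echo "$!"
}
```

As root, something like `run_detached capture.log tcpdump -i eth0 -U -s 0 -w trace.dmp 'udp and (port 67 or port 68)'` would start the capture and print a PID; `kill` with that PID stops it cleanly if no reboot turns out to be necessary.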
 
Old 06-27-2008, 01:40 AM   #11
Mr. C.
Senior Member
 
Registered: Jun 2008
Posts: 2,529

Rep: Reputation: 61
I'm not sure I understand something. You said the server 'crashed' but also that you had to reboot.

Has the system actually hard crashed, and rebooted itself? Or is it still running but unresponsive? Console access?

Does your dmesg show any data from the crash ?
 
Old 06-27-2008, 07:26 AM   #12
baazil1
LQ Newbie
 
Registered: Jun 2008
Location: Cincinnati
Posts: 5

Original Poster
Rep: Reputation: 0
Update

I have run the tcpdump script that dkm and chris have suggested. I am currently waiting for another "incident" so I can then go in and read the files, and hopefully there will be information in those files that will be useful.

@Mr. C - I believe I am using the term crashing very loosely. With the server location being off-site, we are not sure if it has actually crashed. The more likely condition is that it is unresponsive. Again, we have a web site hosted on this server that becomes unreachable, and based on this, we conclude the server is down, crashed, unresponsive, etc., and reboot it.

@Mr. C - kindly explain what dmesg is and how it is used?

@ChrisM and DKM - is there any special command (via PuTTY) that I need to use to view the files that have been created once the reboot is needed? I currently have trace.dmp and nohup.out files in my root directory (I also have the root password and log in with that each time).

Thanks for everyone's help, this is quite interesting stuff.

Take care,
Clay
 
Old 06-27-2008, 12:29 PM   #13
Mr. C.
Senior Member
 
Registered: Jun 2008
Posts: 2,529

Rep: Reputation: 61
It might be more useful to everyone if you reserve crash for an actual
machine/kernel crash, which requires a reboot. And if you use the
term unresponsive, indicate whether it is the machine that is
unresponsive (e.g., no console), the network (no ping), or the host
applications you use. They all have different diagnostics and of
course remedies.

When the kernel crashes, there is valuable data available via dmesg,
also stored in /var/log/dmesg (man dmesg). Or if a NIC is having
trouble, error diagnostics may be available as well.

If you're just talking about the apache daemon becoming unresponsive,
that too requires a different diagnostic.
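
The distinction above can be sketched as a simple triage order: network first, then login, then the application. The function below is purely illustrative (its name and messages are made up); it takes the result of three probes, each given as "ok" or "fail", and names the layer to look at first.

```shell
#!/bin/sh
# classify_outage PING SSH HTTP
# Hypothetical helper: each argument is "ok" or "fail", as observed from
# a machine outside the server. Prints which layer to investigate first.
classify_outage() {
    ping_state="$1"
    ssh_state="$2"
    http_state="$3"
    if [ "$ping_state" != ok ]; then
        echo "network: host unreachable, check console, IP lease and NIC"
    elif [ "$ssh_state" != ok ]; then
        echo "machine: answers ping but no login, check console and dmesg"
    elif [ "$http_state" != ok ]; then
        echo "application: host is up, check the web server and its logs"
    else
        echo "none: all probes answered"
    fi
}
```

The three inputs could come from a ping to the server's address, an SSH login attempt, and a request for the web site's front page, all made the next time the site appears to be down and before anyone reboots.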

Are you able to remotely login to the off-site server?
 
  

