LinuxQuestions.org
Old 09-18-2005, 05:43 PM   #1
chiggly
LQ Newbie
 
Registered: Sep 2005
Location: United Kingdom
Posts: 5

Rep: Reputation: 0
all ports down


Hi there,

I am hoping someone can help. I recently signed up for a 3-year managed OS agreement with *****, under which they installed Red Hat Enterprise Linux on the server.

DirectAdmin, PHP, MySQL, Apache and SSL were also configured. The server had only one website running.

Everything was OK for 3 months, and then one day we noticed that we could not log on via SSH or FTP.

The services were down. The next day Apache died, with no errors in the log files. Then FTP and SSH went down again. It got to the stage where the server was being rebooted several times a day, every day.

The server had very low traffic. I asked my ISP to look into it and they came back with the following:
***********************************************************
The load average is fine, there are no errors at all in any of the log files.

I will continue to monitor the server to see if any process causes a severe spike in system resource usage.

If not, and the problem persists, I would recommend scheduling a time to swap the drives into a new chassis. This will confirm or eliminate bad hardware as the culprit.
***********************************************************

Before we had the chance to look further into this, no one could log on via SSH. The ISP couldn't get in via their serial consoles and neither could the support staff in the data centre, though Apache was still running.

If you tried to log in you would get the login prompt and password prompt, but then the session would just hang and not report anything back, or it would just time out.

Pinging the server still got a reply, though.

I was then told that they were mounting the disks to stop services from starting at boot, and then booting to see if they would be able to log in. It was unsuccessful.

THEN I was told they were giving me a new chassis (CPU and RAM) and new disks, and that I should install the server from scratch again!

I am not a systems administrator, but I would like to know if anyone has come across anything like this before with Red Hat.

Do services going down sound like a software or hardware problem?

As they cannot tell me what went wrong, I have no confidence that it won't recur, and I do not have the money or time to set everything up again.

Can anyone shed some light? Someone with more knowledge than myself?

Any ideas or advice are much appreciated.
 
Old 09-18-2005, 05:52 PM   #2
JimBass
Senior Member
 
Registered: Oct 2003
Location: New York City
Distribution: Debian Sid 2.6.32
Posts: 2,100

Rep: Reputation: 49
Chances are very good that it is hardware failure. Were it Apache or the server itself screwing up, you would have logs of it. Also, the fact that everything breaks (Apache, FTP and SSH) suggests some type of problem that is beyond the control of the OS, meaning something is broken or breaking at the hardware level.

Also, what in the hell is a managed OS agreement if they don't manage the box? Nothing you did caused this problem, so how can they sell you a support contract and then not provide a running machine? Sounds very suspicious. They are pulling the old "cover-thine-own-buttocks" approach to this, which is a fairly good indication that it could be their fault.

Peace,
JimBass
 
Old 09-18-2005, 06:06 PM   #3
Matir
LQ Guru
 
Registered: Nov 2004
Location: San Jose, CA
Distribution: Debian, Arch
Posts: 8,507

Rep: Reputation: 128
Sounds to me, most likely, like failing hard disk(s). Suspicious program failure usually indicates bad RAM or disk; consistent failures of the same services (after reboot) sound like disk to me.
 
Old 09-18-2005, 07:23 PM   #4
chiggly
LQ Newbie
 
Registered: Sep 2005
Location: United Kingdom
Posts: 5

Original Poster
Rep: Reputation: 0
Thank you very much for your answers.


They are telling me that they are bending over backwards to help, while trying to say it is my fault and that I NEED to pay to have it all configured again as it is not part of their SLA!!!!!

They are also trying to say I got hacked into... and need to fork out an extra £145 a month after 3 'free' months!!

All they monitor for £300 a month is PING!!!!!

They said their hardware monitors did not show anything wrong with their disks or hardware and that they monitor hundreds of other servers.

Do companies sue if you break out of a contract? I am not confident about setting up a new server if they don't give me a diagnosis. Though you have confirmed what I thought!!!!
 
Old 09-18-2005, 07:28 PM   #5
Matir
LQ Guru
 
Registered: Nov 2004
Location: San Jose, CA
Distribution: Debian, Arch
Posts: 8,507

Rep: Reputation: 128
This all depends on the contract you have with them. Some observations, though. For 300 pounds (sorry, no pound sign on my keyboard) on a dedicated server, they should damn well monitor more than that.
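
For what it's worth, checking the actual service ports rather than just ping is not much work. Below is a minimal Python sketch, not anything the host is known to use: the function names and the `SERVICES` table are made up for illustration, and the ports are just the usual defaults for FTP, SSH and HTTP. Note that it only catches a service that refuses or times out on connect; it would not catch a session that hangs after the password prompt, as described in the first post.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False


def check_services(host, services, timeout=3.0):
    """Map each service name to True (reachable) or False (not reachable)."""
    return {name: port_open(host, port, timeout) for name, port in services.items()}


# Usual default ports for the services mentioned in this thread (assumption:
# the server had not been moved to non-standard ports)
SERVICES = {"ftp": 21, "ssh": 22, "http": 80}
```

Calling `check_services("example.com", SERVICES)` (the hostname is a placeholder) returns a dict mapping each service name to True or False, which a cron job could log or mail out.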

Who is your current host? I've heard very good things about Rackspace in the UK. Might want to talk to them.

It's possible you were hacked into. It *DOES* happen, like it or not. And if it did happen, it's not their fault. Conversely, if it did happen, all you need to do is reformat and reinstall, not move to new hardware. Something smells fishy here.
 
Old 09-18-2005, 07:47 PM   #6
chiggly
LQ Newbie
 
Registered: Sep 2005
Location: United Kingdom
Posts: 5

Original Poster
Rep: Reputation: 0
VERIO UK.

I should have gone with Rackspace. I didn't, only because I had been with VERIO since December 1999.

I asked about hacking and they said there was no increase in bandwidth, no unusual activity, nothing on the server. Talking to them, though, it is like they are looking for anything to blame it on but themselves. PLUS, they still have the website up! How? It doesn't make sense if MY SOFTWARE caused the problem, as they tried to suggest.

The fact is that from the 3rd of Sept to the 13th of Sept it was ME telling them it was down, ME telling them to reboot. When it first happened on the 3rd, their support staff said it could be the disks and suggested moving them to a new chassis. They didn't. Every time they rebooted, I asked them to look into the problem and stop rebooting, as it was hiding the true nature of the problem.

My colleague checked all the OS log files: nothing!!! They said their hardware monitors showed nothing. It's the fact that they were so comfortable doing nothing but rebooting, then got active towards the end, then calmly said: here's a NEW empty server, start again!!!!!

They only back up data. When I asked if they could restore the server to the state it was in on the 30th of August, I was told that they are 'unable to do a bare metal restore of applications etc. to get you into a state that resembles where you were last week', as ONLY Windows servers have that feature.

So I am out of pocket in terms of re-configuring the server...

I have a high volume site still on the VPS (virtual private server) and to tell the truth if it was on that server I would be bankrupt now.

As they are offering no diagnosis of the problem, I do not feel confident about re-configuring (and paying for it) and starting afresh, as I have no assurance it won't happen again.
 
Old 09-18-2005, 08:50 PM   #7
Matir
LQ Guru
 
Registered: Nov 2004
Location: San Jose, CA
Distribution: Debian, Arch
Posts: 8,507

Rep: Reputation: 128
I would definitely move somewhere else. At the kind of prices you pay, they should offer bare metal restoration services. It's complete BS that they only back up data. On a DEDICATED server, how do they know how your filesystem is set up?

I can't imagine working with this company.
 
Old 09-19-2005, 03:18 AM   #8
chiggly
LQ Newbie
 
Registered: Sep 2005
Location: United Kingdom
Posts: 5

Original Poster
Rep: Reputation: 0
So bare metal restoration is possible on a Linux server?

The sales rep told me it was ONLY available on Windows.
 
Old 09-19-2005, 12:39 PM   #9
Matir
LQ Guru
 
Registered: Nov 2004
Location: San Jose, CA
Distribution: Debian, Arch
Posts: 8,507

Rep: Reputation: 128
They may only provide it on Windows, but yes, it is possible on Linux. They could copy your drive, use a tool like Norton Ghost, whatever.
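
As a rough illustration of "copy your drive": a raw image is just a byte-for-byte copy of the block device. Here is a minimal Python sketch, assuming the disk is not in use (e.g. the box is booted from rescue media), that the script runs as root, and that the device path is something like `/dev/sda` (a hypothetical path, check your own system). `copy_image` is an illustrative name, not part of any backup product, and this is not how any particular host does its restores.

```python
import hashlib


def copy_image(source_path, image_path, chunk_size=4 * 1024 * 1024):
    """Copy source_path to image_path byte for byte.

    Returns (bytes_copied, sha256_hex) so the image can be verified later.
    """
    digest = hashlib.sha256()
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                # End of device or file reached
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()
```

Restoring is the same copy in the other direction, from the image file back onto a disk of at least the same size. Something like `copy_image("/dev/sda", "/mnt/backup/sda.img")` would be the call, with both paths being placeholders.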
 
Old 10-16-2005, 05:12 PM   #10
chiggly
LQ Newbie
 
Registered: Sep 2005
Location: United Kingdom
Posts: 5

Original Poster
Rep: Reputation: 0
They have finally told me what went wrong, after I insisted:


We have identified a bug with the auditd daemon that froze access via shell (console and ssh) when the /var partition was more than 80% full. /var is now 21.3% full.


Anyone ever heard of this? Considering the box was hardly used? Little to no server activity.


Thanks in advance.
 
Old 10-16-2005, 06:09 PM   #11
BoldKiller
Member
 
Registered: Apr 2002
Location: Montreal, Quebec
Distribution: Debian, Gentoo, RedHat
Posts: 142

Rep: Reputation: 15
I did a quick Google. There seem to be quite a lot of known bugs with auditd on Red Hat Enterprise. But I did not see anything saying it would freeze the console and SSH when /var got filled.

Frankly, I don't see how this could happen. If it was something like "when there is less than XX MB free", that would make sense. But a percentage??
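
To put numbers on the percentage vs. megabytes point, here is a small Python sketch. It uses `shutil.disk_usage` from the standard library; the helper names are made up for illustration and have nothing to do with how auditd itself decides anything. The percentage is computed as used/total, which can differ slightly from what `df` prints because of reserved blocks.

```python
import shutil

MB = 1024 * 1024


def usage_report(path="/var"):
    """Return (percent_used, free_mb) for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    # Guard against pseudo-filesystems that report a total size of zero
    percent_used = 100.0 * usage.used / usage.total if usage.total else 0.0
    return percent_used, usage.free / MB


def free_mb_at(percent_used, total_mb):
    """Free space left on a partition of total_mb when it is percent_used full."""
    return total_mb * (100.0 - percent_used) / 100.0
```

For example, `free_mb_at(80, 1000)` gives 200 MB while `free_mb_at(80, 100000)` gives 20000 MB, so the same 80% mark leaves very different amounts of room depending on the partition size.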

I think they really don't know, but they don't want to say it. I too would be ashamed to tell a client paying me this kind of money that I have no clue what is going on!!

Anyway, good luck.
 
  

