LinuxQuestions.org
Old 08-18-2006, 06:32 AM   #1
timmeke
Senior Member
 
Registered: Nov 2005
Location: Belgium
Distribution: Red Hat, Fedora
Posts: 1,515

Rep: Reputation: 61
SCSI bus hangs on Fedora Core 3


Occasionally, I see a lot of ABORT operations, device reset operations and SCSI bus reset operations, all of which time out.
The only SCSI device I have is a Promise Ultratrak RAID device.

The messages (taken from dmesg) look something like this:
sym0:1:0: ABORT operation started.
sym0:1:0: ABORT operation timed-out.

and they seem to be coming from the kernel (according to /var/log/messages).
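
To get a feel for how many of these time-outs pile up before the disk goes away, the log lines can be tallied per operation type. This is only a sketch: it assumes every line follows the `symN:T:L: <OPERATION> operation timed-out.` pattern quoted above, and the function name `summarize_scsi_timeouts` is made up for illustration.

```shell
# Hypothetical helper: tally "operation timed-out" lines per operation type.
# Reads kernel log text on stdin and prints "<count> <operation>" per type.
summarize_scsi_timeouts() {
    sed -n -E 's/.*sym[0-9]+:[0-9]+:[0-9]+: (.*) operation timed-out.*/\1/p' |
        awk '{ n[$0]++ } END { for (op in n) print n[op], op }' |
        sort -k2
}

# Usage (not run here): dmesg | summarize_scsi_timeouts
```

Lines that only say "operation started" are ignored, so the count reflects time-outs alone.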

After a few such messages, the disk becomes unusable. Any attempt to access it (ls, find, cd, etc.) just hangs. ps shows that processes are hanging (presumably waiting on I/O).
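
One way to check the "waiting on I/O" guess is to look for processes in the D (uninterruptible sleep) state. The sketch below assumes `ps -eo pid,stat,comm` style input with a header line; `list_blocked` is a hypothetical name, and only the first word of the command name is printed.

```shell
# Hypothetical helper: print "<pid> <command>" for every process whose
# STAT column starts with D (uninterruptible sleep, usually blocked on I/O).
# Expects "PID STAT COMMAND" columns on stdin, header line included.
list_blocked() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3 }'
}

# Usage (not run here): ps -eo pid,stat,comm | list_blocked
```

If ls, find and friends all show up in this list after the SCSI messages appear, that would be consistent with the bus no longer answering.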

Attempts to unmount the disk fail (umount says "device is busy"). Shutting down the machine ("shutdown -r now") doesn't work either.

To resolve the issue, I now do a hard shutdown (i.e. power button) of the machine, shut down the RAID device, restart it and then boot the FC3 computer. After that, everything returns to normal.

My kernel is 2.6.9-1.667
lspci reports:
02:08.0 SCSI storage controller: LSI Logic / Symbios Logic 53c1010 Ultra3 SCSI Adapter (rev 01)

Any ideas on how to fix the issue? Is this bug fixed in later kernel versions? Or is my RAID device to blame?
 
Old 08-19-2006, 10:20 PM   #2
macemoneta
Senior Member
 
Registered: Jan 2005
Location: Manalapan, NJ
Distribution: Fedora x86 and x86_64, Debian PPC and ARM, Android
Posts: 4,593
Blog Entries: 2

Rep: Reputation: 344
The symptom you're reporting is typical of a failing hard drive. Make sure you have a good backup.

You may also want to apply some maintenance; the current kernel for FC3 is: 2.6.12-1.1381_FC3
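
A quick way to tell whether the running kernel is behind that level is to compare the two version strings component by component. This is a simplified sketch, not rpm's own version comparison: it splits on dots and dashes and compares the numeric part of each piece. The name `kernel_is_older` is hypothetical.

```shell
# Hypothetical helper: succeed (exit 0) if version $1 is older than $2.
# Splits both strings on "." and "-" and compares the pieces numerically;
# trailing text such as "_FC3" is ignored by awk's numeric conversion.
kernel_is_older() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, /[.-]/); nb = split(b, y, /[.-]/)
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            if (x[i] + 0 < y[i] + 0) exit 0   # first version is older
            if (x[i] + 0 > y[i] + 0) exit 1   # first version is newer
        }
        exit 1                                # versions are equal
    }'
}

# Usage (not run here):
#   kernel_is_older "$(uname -r)" 2.6.12-1.1381_FC3 && echo "kernel update pending"
```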

Last edited by macemoneta; 08-19-2006 at 10:25 PM.
 
Old 08-21-2006, 02:17 AM   #3
timmeke
Senior Member
 
Registered: Nov 2005
Location: Belgium
Distribution: Red Hat, Fedora
Posts: 1,515

Original Poster
Rep: Reputation: 61
Thanks for the response, macemoneta. I'll try the maintenance first. If that doesn't help, I'll try to get my hands on new disks.

Are there any risks involved in upgrading from the 2.6.9 kernel to the 2.6.12 kernel?
For instance, do I need to upgrade gradually (i.e. first to 2.6.10, then 2.6.11, then 2.6.12)?
 
Old 08-21-2006, 07:59 AM   #4
macemoneta
Senior Member
 
Registered: Jan 2005
Location: Manalapan, NJ
Distribution: Fedora x86 and x86_64, Debian PPC and ARM, Android
Posts: 4,593
Blog Entries: 2

Rep: Reputation: 344
There is never a requirement for intermediate upgrade, and there are no risks involved in the process. When you upgrade the kernel, the old kernel remains available (you will be given the choice at boot in the grub menu). If you encounter a problem, you can reboot and select the old kernel.

yum -y update kernel kernel-devel
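
After the update, the boot entries can be listed to confirm that the old kernel is still there. The sketch below assumes the legacy GRUB config lives at /boot/grub/grub.conf, as on a stock FC3 install, and `list_grub_entries` is a hypothetical name. The numbers it prints are zero-based, which is also how the `default=` line in that file counts entries.

```shell
# Hypothetical helper: print the "title" entries of a legacy GRUB config
# with their zero-based index. Takes the config path as an optional
# argument; /boot/grub/grub.conf is assumed when none is given.
list_grub_entries() {
    awk '/^[[:space:]]*title[[:space:]]/ {
        sub(/^[[:space:]]*title[[:space:]]+/, "")
        printf "%d: %s\n", n++, $0
    }' "${1:-/boot/grub/grub.conf}"
}

# Usage (not run here): list_grub_entries
```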
 
Old 08-22-2006, 02:02 AM   #5
timmeke
Senior Member
 
Registered: Nov 2005
Location: Belgium
Distribution: Red Hat, Fedora
Posts: 1,515

Original Poster
Rep: Reputation: 61
OK. I'll give that a go, macemoneta, and post back when I have some results.
Thanks again.
 
Old 08-25-2006, 02:31 AM   #6
timmeke
Senior Member
 
Registered: Nov 2005
Location: Belgium
Distribution: Red Hat, Fedora
Posts: 1,515

Original Poster
Rep: Reputation: 61
The kernel upgrade has been done. After the latest SCSI hang, I decided to reboot with the new kernel to see if the problem is fixed.
Unfortunately, MySQL is now broken. It won't start and it lists a pthread_create error as the cause.

pthread_create seems to be a standard C function, so it may be part of glibc or something like that. Could it be that the new kernel can't work with my old C lib and I need to upgrade that lib too? I'm a bit hesitant to upgrade my C libraries, since this may very well break my processing.
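
Before changing anything else, it may help to write down which kernel and C library are actually in use, so the two boots can be compared. This sketch does not diagnose the pthread_create error; it only records versions. It assumes `getconf GNU_LIBC_VERSION` is available (it is a glibc feature) and prints "unknown" otherwise. `report_versions` is a hypothetical name.

```shell
# Hypothetical helper: print the running kernel release and the C library
# version. Falls back to "unknown" where getconf cannot report glibc.
report_versions() {
    echo "kernel: $(uname -r)"
    echo "libc:   $(getconf GNU_LIBC_VERSION 2>/dev/null || echo unknown)"
}

# Usage (not run here): run once under each kernel and compare the output.
#   report_versions
```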
 
Old 08-25-2006, 07:37 AM   #7
macemoneta
Senior Member
 
Registered: Jan 2005
Location: Manalapan, NJ
Distribution: Fedora x86 and x86_64, Debian PPC and ARM, Android
Posts: 4,593
Blog Entries: 2

Rep: Reputation: 344
You appear to be caught between a rock and a hard place. You'll need to decide whether the problem is severe enough that you are willing to upgrade your entire system (yum -y update) or if you want to live with the current situation.

Another alternative would be to try to find a SCSI controller that doesn't exhibit a problem at your current software level.
 
Old 08-25-2006, 08:13 AM   #8
timmeke
Senior Member
 
Registered: Nov 2005
Location: Belgium
Distribution: Red Hat, Fedora
Posts: 1,515

Original Poster
Rep: Reputation: 61
Indeed. I'm caught between upgrading the entire system (unknown impact) and a disk that fails occasionally (it used to be around once every few months, but now it's up to once a week).

It's probably the best solution to get a new disk drive (the increasing frequency may indicate a more fundamental hardware problem).
Unless this "yum -y update" method is relatively painless?

Just to add to the confusion: normally, I only get the SCSI bus messages (i.e. ABORT, BUS RESET, etc.) I posted above and the disk is inaccessible (any attempt to access it freezes the terminal window).
Yesterday, on the other hand, the system hit similar SCSI problems (after installation of the new kernel, but still running under the old one), yet the disk came back "up" afterwards, in the sense that my programs didn't freeze while they were accessing the disk. Instead, a bunch of error messages came up telling me that the disk was "read-only" and that my programs' attempts to write files to it had failed.
Could the kernel install have caused this change in behaviour, even though the new kernel wasn't used?
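
Whether the disk really is mounted read-only at that point can be read from /proc/mounts. Some filesystems switch to read-only after I/O errors (ext3 mounted with errors=remount-ro, for instance); that this is what happened here is an assumption, since the filesystem on the RAID isn't named. The function name `mount_is_readonly` and the /data mount point in the usage line are hypothetical.

```shell
# Hypothetical helper: succeed (exit 0) if the given mount point is listed
# with the "ro" option. Reads /proc/mounts unless another file in the same
# "device mountpoint fstype options ..." layout is given as second argument.
mount_is_readonly() {
    awk -v mp="$1" '
        $2 == mp { opts = $4 }      # the last matching entry wins
        END {
            if (("," opts ",") ~ /,ro,/) exit 0
            exit 1
        }' "${2:-/proc/mounts}"
}

# Usage (not run here):
#   mount_is_readonly /data && echo "/data is mounted read-only"
```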
 
Old 08-25-2006, 07:53 PM   #9
macemoneta
Senior Member
 
Registered: Jan 2005
Location: Manalapan, NJ
Distribution: Fedora x86 and x86_64, Debian PPC and ARM, Android
Posts: 4,593
Blog Entries: 2

Rep: Reputation: 344
The kernel install could not have changed the behavior, if it was not used. It's likely that you are seeing multiple failures in that case, some of which are recoverable by your current kernel.

I obviously don't know your physical environment, but is it possible that the equipment is running hot? I've seen overheated drives/controllers give an assortment of failures. That you go so long between failures seems suspicious.
 
Old 08-28-2006, 02:19 AM   #10
timmeke
Senior Member
 
Registered: Nov 2005
Location: Belgium
Distribution: Red Hat, Fedora
Posts: 1,515

Original Poster
Rep: Reputation: 61
As for physical location, the computer is in a special room, with air-conditioning.
However, the air-conditioning can easily be shut down, and it is a bit suspicious that a similar computer in the same room also had disk issues around the same time. So you may be on to something with the overheating.
I'll try to look into the issue a bit further and will post back if I find anything.
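
For lining up the hangs with air-conditioning outages, the time-outs can be counted per day from the system log. A rough sketch, assuming the traditional syslog layout in /var/log/messages ("Mon DD HH:MM:SS host kernel: ...") and the sym message format from the first post; `timeouts_per_day` is a hypothetical name.

```shell
# Hypothetical helper: count "timed-out" sym messages per calendar day.
# Reads the named log files, or stdin when no file is given, and prints
# "<Mon> <DD> <count>" lines. The sort is textual, so days are only in
# order within one month.
timeouts_per_day() {
    grep -h -E 'kernel: sym[0-9]+:.*timed-out' "$@" |
        awk '{ count[$1 " " $2]++ } END { for (d in count) print d, count[d] }' |
        sort
}

# Usage (not run here): timeouts_per_day /var/log/messages
```

Days with a burst of time-outs can then be checked against when the cooling was off, on both machines in the room.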
 
  


All times are GMT -5.
