kernel panic reading specific sectors of sata hard disk
Hi everybody,
Since yesterday I have consistently been getting a kernel panic when reading certain sectors of a partition on my SATA hard disk. Those sectors belong to a file (which I was trying to back up yesterday), so any read of it, even a simple "cat FILENAME", provokes a kernel panic every time (whether after hours of use or right after a fresh boot following a long rest).
I get exactly the same problem with the same file (and so far only with that file) using Arch Linux (kernel 2.6.25), openSUSE 10.3 (kernel 2.6.22), Mandriva 2008.1 live (kernel 2.6.24) and SystemRescueCd 0.4.2 (kernel 2.6.23).
The last of these gives me the most information about what happens before the freeze (I found nothing in /var/log/messages; the caps-lock and scroll-lock lights blink):
ata4: timeout waiting for ADMA IDLE, stat=0x440
ata4: timeout waiting for ADMA LEGACY, stat=0x440
ata4.00: exception Emask 0x1 SAct 0x0 SErr 0x0 action 0x0
ata4.00: CPB resp_flags 0x11: , CMD error
ata4.00: cmd c8/00:00:f4:83:ed/00:00:00:00:00/e8 tag 0 cdb 0x0 data 131072 in
res 51/40:00:ef:84:ed/00:00:00:00:00/e8 Emask 0x9 (media error)
CPU 0: Machine Check Exception: 0000000000000004
CPU 0: Bank 4: b200000000070f0f
Kernel panic - not syncing: CPU context corrupt
Clocksource tsc unstable (delta = 4686082248 ns)
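The "cmd" and "res" lines in the log above carry the ATA taskfile registers, so the failing sector can be read straight out of them. Below is a minimal sketch of that decoding in Python. The helper name decode_lba28 and the regular expression are illustrative and not part of any tool; the sketch assumes the libata field order (command or status, feature or error, sector count, LBA low, LBA mid, LBA high, five high-order bytes, device) and 28-bit LBA addressing, which is what command 0xc8 (READ DMA) uses.

```python
import re

# Field order assumed for both "cmd" and "res" lines:
# command-or-status / feature-or-error : nsect : lbal : lbam : lbah /
# five high-order-byte fields / device
TASKFILE_RE = re.compile(
    r"\b(cmd|res) ([0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/([0-9a-f]{2})"
)


def decode_lba28(line):
    """Extract the 28-bit LBA and error byte from a libata cmd/res line."""
    m = TASKFILE_RE.search(line)
    if m is None:
        raise ValueError("no taskfile found in line")
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in m.group(3).split(":"))
    device = int(m.group(5), 16)
    # In LBA28 mode the low nibble of the device register holds LBA bits 24-27.
    lba = ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal
    return {
        "kind": m.group(1),
        "command_or_status": int(m.group(2), 16),
        "feature_or_error": feature,
        "nsect": nsect,
        "lba": lba,
    }


cmd = decode_lba28(
    "ata4.00: cmd c8/00:00:f4:83:ed/00:00:00:00:00/e8 tag 0 cdb 0x0 data 131072 in"
)
res = decode_lba28("res 51/40:00:ef:84:ed/00:00:00:00:00/e8 Emask 0x9 (media error)")

print(hex(cmd["lba"]), hex(res["lba"]), hex(res["feature_or_error"]))
```

With the lines from this log, the read starts at LBA 0x8ed83f4 and the drive reports the error at LBA 0x8ed84ef, which lies inside the 256-sector (131072-byte) request. The error byte 0x40 is the UNC (uncorrectable data) bit, consistent with the "media error" text.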
My guess was a hardware failure of the disk, possibly bad blocks (but shouldn't I then get only errors in the logs?). However, running badblocks on the (unmounted) partition produces the same kind of panic, so that is no way to fix the problem.
Also, the file got written to disk at some point in the past, and that gave no error!
I tried booting with the option 'nomce': the machine still freezes at the same point, I just do not see the kernel panic message.
The hardware: Athlon 64 3200+ on an ASUS motherboard with an NVIDIA chipset. The kernel is 32-bit (x86). The SATA drive is from Maxtor. I have been using this hardware for almost three years, and the openSUSE 10.3 kernel for more than six months, without problems.
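The "Bank 4" value printed with the Machine Check Exception is a machine-check status register, and its top bits have the same meaning on all x86 CPUs. The sketch below decodes only those architectural flags. The function name decode_mci_status is illustrative, and the model-specific fields (such as the 0x7 in bits 16-19) are deliberately left undecoded, since their meaning depends on the CPU family.

```python
# Architectural flag bits of an x86 machine-check bank status register.
MCI_STATUS_FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting of this error was enabled
    59: "MISCV",  # the bank's MISC register holds extra information
    58: "ADDRV",  # the bank's ADDR register holds an address
    57: "PCC",    # processor context corrupt
}


def decode_mci_status(value):
    """Return the set flags (highest bit first) and the low 16-bit error code."""
    flags = [
        name
        for bit, name in sorted(MCI_STATUS_FLAGS.items(), reverse=True)
        if (value >> bit) & 1
    ]
    return {"flags": flags, "error_code": value & 0xFFFF}


status = decode_mci_status(0xB200000000070F0F)
print(status["flags"], hex(status["error_code"]))
```

For the value in this log the set flags are VAL, UC, EN and PCC. PCC is the bit behind the "CPU context corrupt" panic text, and it suggests why the machine halts instead of only logging an I/O error.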
This does sound like the disk is having trouble. Go to the vendor's site and see whether they have any diagnostic utilities that you can run on the disk. These will probably be DOS command-line programs or bootable floppy images.
The problem was solved.
Thanks for the advice, Mr. C. I downloaded the Seagate/Maxtor utility (iso9660 format) and ran it: it found one error and offered to repair it, which I let it do. Indeed, reading the file now provokes neither a panic nor any errors.
I still have to find out whether the repair was destructive or not (it's an avi, I'll have to watch it).
Great advice! It fixed the issue for me as well. My symptoms and configuration were very similar to the original poster's.
I was getting kernel messages like
ata1: timeout waiting for ADMA IDLE, stat=0x440
ata1: timeout waiting for ADMA LEGACY, stat=0x440
during an fsck, after which the kernel would freeze. I downloaded SeaTools for DOS, burned it to a CD, booted from it and ran the long test, which found 2 errors. When asked, I let it repair them (it overwrote them with zeros), and after a reboot the disk worked again. The errors were only on one drive of a RAID5 set, so I hope that all the data is still there.
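The hope that the data survives rests on RAID5 parity: each stripe stores the XOR of its data blocks, so any single block can be recomputed from the others. The toy sketch below shows only that arithmetic. The block contents and the helper xor_blocks are made up for illustration and say nothing about how a particular RAID implementation lays out or scrubs its stripes.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equally sized byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical stripe: three data blocks plus their parity block.
d0, d1, d2 = b"some", b"data", b"here"
parity = xor_blocks([d0, d1, d2])

# If d1 is lost, XOR of the surviving blocks and the parity gives it back.
rebuilt = xor_blocks([d0, d2, parity])
print(rebuilt)
```

One caveat: a sector overwritten with zeros reads back without error, so the array only restores it if it notices that data and parity no longer agree and treats the zeroed block as the bad one. Whether that happens depends on the RAID implementation.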
The disk is a Seagate Barracuda 7200.8, ST3250823AS. No SMART errors had been reported, and SeaTools said "SMART values NOT tripped" or something similar. AMD Athlon X2 3800+, NVIDIA CK804 chipset and an ASUS motherboard here too. Debian unstable, kernel version 2.6.24-1-amd64.
Last edited by rickardh; 07-02-2009 at 08:05 PM.
Reason: Added more info on my HW.