I've been getting hard-disk-related errors. It first happened with a one-year-old disk, and I thought it was a hardware failure. However, I replaced it with a new one and I still get errors. Here is an excerpt from /var/log/kern.log:
Sometimes this completely locks up the system. Since this happened with two disks, I suppose some other factor is to blame. I tried another IDE cable and different kernels (2.4.25 and 2.6.5) without success. Finally, I found out that disabling DMA (with ide=nodma) makes the problem disappear, but of course with a big performance penalty. The other strange thing is that the disk worked flawlessly for about a week.
How can I discover the culprit? Thanks for any idea or help.
It may be the IDE chipset. You could switch the cable from IDE0 to IDE1 (and make the corresponding changes in /etc/fstab and your bootloader configuration). This is only a partial test, because IDE0 and IDE1 do have some circuitry in common. But if the errors stop after the swap, that points to the IDE chipset, specifically the part of the circuitry dedicated to IDE0.
Like the previous poster, I started getting this error (and a bunch of related ones) on a system that had been stable for several years, and it started the second I installed Fedora Core 3. Before, I had been using the default Red Hat 2.4 kernel, and now I'm using the 2.6 kernel. I'm also using reiserfs, and this error has been corrupting the file systems on my second IDE controller quite dramatically. I can't even back the files up.
I found a workaround for backing up files while I nail down the source of this problem...
I grabbed an ISO of Knoppix a while ago (version 3.3) and booted off that Knoppix disc. Then I was able to mount my filesystems and rsync my important stuff onto other disk drives, in case I corrupt them beyond recovery while troubleshooting.
I've been playing with re-compiling kernel 2.6.9 but haven't yet been able to nail down which disk option is causing these errors.
Maybe the DMA code changed in the newer kernels? Maybe it's enabled by default now and it wasn't in the older kernels?
My hdparm -I /dev/hde output shows an asterisk next to UDMA5... but does that really mean DMA is on or off? I dunno.....
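As far as I know, the asterisk in `hdparm -I` output marks the transfer mode the drive currently has selected, while `hdparm -d /dev/hde` reports whether the kernel driver is actually using DMA (the `using_dma` line). Here is a minimal Python sketch for reading both, assuming those two output formats. The function names and the sample text are made up for illustration and do not come from the poster's machine.

```python
import re


def selected_dma_mode(identify_text):
    """Return the mode marked with '*' on the 'DMA:' line of `hdparm -I` output, or None."""
    for line in identify_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("DMA:"):
            for token in stripped[len("DMA:"):].split():
                if token.startswith("*"):
                    return token[1:]
    return None


def using_dma(dflag_text):
    """Return True/False from the 'using_dma' line of `hdparm -d` output, or None if absent."""
    match = re.search(r"using_dma\s*=\s*(\d+)", dflag_text)
    if match is None:
        return None
    return match.group(1) != "0"


# Illustrative sample text only (assumed format, hypothetical values).
SAMPLE_IDENTIFY = "\tDMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 *udma5\n"
SAMPLE_DFLAG = "/dev/hde:\n using_dma    =  0 (off)\n"

if __name__ == "__main__":
    print("selected mode:", selected_dma_mode(SAMPLE_IDENTIFY))
    print("using DMA:", using_dma(SAMPLE_DFLAG))
```

In other words, a starred UDMA5 with `using_dma` off would mean the drive has the mode selected but the driver is falling back to PIO, which would fit a large drop in throughput.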
ICH4: IDE controller at PCI slot 0000:00:1f.1
PCI add-in card:
PDC20267: IDE controller at PCI slot 0000:01:05.0
Since my last post, I have recompiled dozens of times (literally), from kernel 2.4.22 up through 2.6.10 RC2, enabling and disabling many different options related to IDE. I also swapped controller cards to rule out hardware failure. The best I have come up with was grabbing the .config file from Knoppix and recompiling 2.6.10 RC2, turning off just enough "options" to get a successful compile.
I'm stable again, but at a serious hit on drive performance.
Timing buffered disk reads: 10 MB in 3.27 seconds = 3.06 MB/sec
It used to be more like 60 MB in the same 3.27 seconds.
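To put those two figures side by side, here is a small Python sketch of the arithmetic. It uses only the numbers quoted above (10 MB and roughly 60 MB, each in 3.27 seconds); the function name is just for illustration.

```python
def throughput_mb_per_s(megabytes, seconds):
    """Throughput in MB/sec for a read of `megabytes` taking `seconds`."""
    return megabytes / seconds


# Figures quoted in the post: 10 MB now, roughly 60 MB before, both in 3.27 s.
current = throughput_mb_per_s(10, 3.27)
previous = throughput_mb_per_s(60, 3.27)

if __name__ == "__main__":
    print(f"now:      {current:.2f} MB/sec")
    print(f"before:   {previous:.2f} MB/sec")
    print(f"slowdown: {previous / current:.1f}x")
```

So the stable configuration is reading at about one sixth of the earlier rate.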
@elfoozo: Do you know at which kernel you started seeing this behavior? I've always used a 2.5/2.6 kernel on this machine and I'm pretty sure this problem was absent before. I do not really remember when it started, because at first I thought it was a failing drive. If it hadn't been a laptop, I probably would have replaced it already...
Oh My Gods.
I went through the exact same experience as the first poster, with the one-year-old drive, the exchange, the one-week wait and so on, except I didn't switch cables because this is on my Toshiba laptop. I'll try installing that 2.6.10 kernel now. Must I "strip out everything" to make it work? I'm fairly n00b at that stuff...
THANKS so much for the tips.
I should've noticed that FC2test3 worked but newer distros (including Skolelinux) all got messed up. I didn't notice until I managed to do a minimal install without my HD breaking down; that was in a terminal, so I saw the errors come up.
Scratch that. When my laptop is warm, I can't even reformat the drive from my Partition Magic floppies (the longer it's been on, the ...). I guess I need a new one... although it seems weird for it to be dying on me after just one week. How sad.