LinuxQuestions.org
10-27-2009, 09:27 AM   #1
zamula
LQ Newbie
 
Registered: Feb 2008
Posts: 9

Rep: Reputation: 0
Repeated hard drive thrashing followed by lockup


A couple weeks ago I started having odd problems with my laptop that may be hardware related, but I'm not sure.

First, my specs: I have a Dell Inspiron E1505 (aka 6400) running Arch Linux and KDE4, completely up to date. I also have a Windows XP partition that I boot into occasionally. The Linux partition is fully encrypted: I'm using LVM on top of dm-crypt.

The problem: I'll be using my laptop normally (some light web surfing, etc.) when all of a sudden programs will go completely unresponsive. This is accompanied by continuous hard drive access. Sometimes, after 30 seconds or so, the hard drive will stop and I'll regain control. Sometimes I'm forced to hold down the power button for 5 seconds to force my computer to turn off. (And if it stops on its own, odds are good it'll have another episode in a few minutes.)

Occasionally I can get to a terminal (Ctrl-Alt-F1), though usually I can't do anything there either. One time, though, I found a ton of error messages that might shed some light on my problem:

EXT3fs error (device dm-1): ext3_find_entry: reading directory # _____ offset 0

[clip many more lines saying the same thing, but with different directory numbers]

EXT3fs error (device dm-3) in ext3_reserve_inode_write: Journal has aborted
EXT3fs error (device dm-3) in ext3_orphan_del: Journal has aborted
EXT3fs error (device dm-3) in ext3_reserve_inode_write: Journal has aborted
EXT3fs error (device dm-3) in ext3_delete_inode: Journal has aborted

I've seen similar messages a few times since; also, at least once I got a kernel panic after such messages. (I suspect that these messages are always displayed, but not necessarily where I can see them, since I can't always get to a terminal.)
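
Since messages like these normally land in the kernel log too, they can be tallied after the fact rather than caught on a console. Here is a minimal Python sketch, not a definitive tool. It assumes the log text has already been saved somewhere (for example, the output of `dmesg`), and it accepts both the `EXT3fs` spelling typed above and the `EXT3-fs` spelling the kernel usually prints. The function name `count_ext3_errors` is made up for illustration; the sample lines are the ones quoted above.

```python
import re
from collections import Counter

# Matches both message shapes quoted in this thread:
#   EXT3fs error (device dm-1): ext3_find_entry: reading directory ...
#   EXT3fs error (device dm-3) in ext3_orphan_del: Journal has aborted
EXT3_ERROR = re.compile(
    r"EXT3-?fs error \(device (?P<device>[^)]+)\)"
    r"(?:\s+in\s+|:\s*)(?P<function>\w+):"
)


def count_ext3_errors(log_text):
    """Return a Counter keyed by (device, function) for each ext3 error line."""
    counts = Counter()
    for line in log_text.splitlines():
        match = EXT3_ERROR.search(line)
        if match:
            counts[(match.group("device"), match.group("function"))] += 1
    return counts


if __name__ == "__main__":
    # Sample input: the lines from the post, directory number left elided.
    sample = """\
EXT3fs error (device dm-1): ext3_find_entry: reading directory # _____ offset 0
EXT3fs error (device dm-3) in ext3_reserve_inode_write: Journal has aborted
EXT3fs error (device dm-3) in ext3_orphan_del: Journal has aborted
EXT3fs error (device dm-3) in ext3_reserve_inode_write: Journal has aborted
EXT3fs error (device dm-3) in ext3_delete_inode: Journal has aborted
"""
    for (device, function), n in sorted(count_ext3_errors(sample).items()):
        print(f"{device:6} {function:28} {n}")
```

Grouping by device is the useful part: errors spread across several `dm-N` volumes at once hint at something underneath all of them rather than one damaged file system, though that is only a hint.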

I began to think perhaps my hard drive was failing; it's fairly new (I got it last summer) and I treat my laptop with care, but it can happen. So I ran some diagnostics. Unfortunately, the diagnostic utility from Western Digital (who makes my hard drive) wouldn't finish booting; I'd get as far as a boot screen saying it was loading Caldera DR-DOS, and then nothing. So I used Hitachi's tool instead. After an hour to an hour and a half (it did a thorough surface scan and queried the S.M.A.R.T. information, among other things), it found no problems. Hm.
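
On the Linux side, the same S.M.A.R.T. data can be read with `smartctl` from smartmontools, e.g. `smartctl -A /dev/sda` (the device name is an assumption). A rough Python sketch of pulling the raw values out of that attribute table follows. The sample table, the helper names, and the choice of attributes to watch are all illustrative assumptions, and a clean result would not rule out a controller or connector fault.

```python
# Attributes commonly watched for media or interface trouble (an assumption,
# not an exhaustive or vendor-specific list).
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
)


def parse_smart_attributes(smartctl_output):
    """Map attribute name -> integer raw value from `smartctl -A` table text."""
    attributes = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have these columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attributes[fields[1]] = int(fields[9])
    return attributes


def nonzero_watched(attributes):
    """Return only the watched attributes whose raw value is above zero."""
    return {name: attributes[name] for name in WATCHED if attributes.get(name, 0) > 0}


if __name__ == "__main__":
    # Hypothetical sample output, not taken from the drive in this thread.
    sample = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       12
"""
    print(nonzero_watched(parse_smart_attributes(sample)))
```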

One more thing that makes me wonder if it's a hardware problem: at the same time this started, my Windows partition started acting up as well. I'd be watching a television show online (e.g., Hulu)--I can get reasonable full-screen graphics performance under Windows but not in Linux--when I'd get a blue screen of death and the hard drive would thrash. I don't remember the STOP code, unfortunately; if it would help, I can switch back over to XP and try to get the error again.

So what do you think? Are the Windows crashes related, or could that just be coincidence (and maybe a buggy version of Adobe Flash or something)? Should I trust the diagnosis of the Hitachi tool, or is it possibly missing something that only a WD-produced tool would know about my hard drive? Maybe it's something about my file system, rather than hardware? I'm very open to suggestions, since this is super annoying. (And yes, I have full, up-to-date backups, in case things go really wrong.)
 
10-27-2009, 10:22 AM   #2
TB0ne
Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 14,989

Rep: Reputation: 2672
Based on the symptoms described, the errors, and the physical notes (i.e. 'thrashing'), I'd say your drive is having a problem. The surface-scan diags check the physical media, but won't catch a flaky logic board on your drive unless the problem occurs DURING that scan.

I'd replace the drive ASAP, since you've got good backups. I'd also probably get a cheap external USB enclosure for the existing drive, so you'll have an easier time copying files back and forth, since you can still access it.
 
10-27-2009, 11:48 AM   #3
zamula
LQ Newbie
 
Registered: Feb 2008
Posts: 9

Original Poster
Rep: Reputation: 0
You're correct; I don't know what the exact problem is, but just in the last couple of hours it's gotten to the point that half the time my computer doesn't even recognize that the hard drive exists. (There's a diagnostic utility built into the BIOS, and when I ran it, it complained that I had no hard drive, so I don't think it's just a bad boot sector.) So perhaps it's the logic board, as you suggested.

Oh well! It's still under warranty, and I even have my old hard drive set up with a fresh install of Arch from a couple months ago when I was looking into switching (from Ubuntu), so this'll be a small headache but nothing serious. Thanks for the help! I'll be calling Western Digital today to get a replacement.
 
10-27-2009, 12:13 PM   #4
TB0ne
Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 14,989

Rep: Reputation: 2672
No problem, and good luck. I've been in your shoes more times than I care to remember.
 
11-08-2009, 02:38 AM   #5
zamula
LQ Newbie
 
Registered: Feb 2008
Posts: 9

Original Poster
Rep: Reputation: 0
Alas, it looks like it is hardware, but not the hard drive: I swapped in the other hard drive I mentioned, and my laptop still claims it can't find a bootable device. Plus, I ran Western Digital's disk scan (using the Windows version of their tool), and it came out totally clean. I'm thinking a failing SATA controller now.

Due to the intermittent nature of the problem, and the way it most often occurs after moving the laptop (or shifting its position on my lap), I suspect a loose connection somewhere. I'm not afraid to use a soldering iron (even on surface mount); does anyone have any ideas where to start on something like this? (I figure if it's dying anyway, at least there's little danger I'll make it worse.)

Alternatively, does anyone have any other ideas how to fix this? Is the SATA controller something you can swap out on a laptop, and would it be worth it on one that's slightly over 3 years old?
 
11-08-2009, 02:46 PM   #6
TB0ne
Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 14,989

Rep: Reputation: 2672
No, it's not typically something you can upgrade/replace, as it's usually part of the mobo. You MIGHT want to consider getting a replacement motherboard, though...for a 3-year-old laptop, you can perhaps find one fairly cheap....
 
11-08-2009, 03:26 PM   #7
MrCode
Member
 
Registered: Aug 2009
Location: Oregon, USA
Distribution: Arch
Posts: 864
Blog Entries: 31

Rep: Reputation: 148
Quote:
No, they're not typically something you can upgrade/replace, as they're usually part of the mobo. You MIGHT want to consider getting a brand-new motherboard, though...3 years old, you can perhaps find one fairly cheap....
For a laptop? I thought the only user-replaceable components in a laptop were the HDD and RAM...

Last edited by MrCode; 11-08-2009 at 03:27 PM.
 
11-08-2009, 03:31 PM   #8
TB0ne
Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 14,989

Rep: Reputation: 2672
Quote:
Originally Posted by MrCode View Post
For a laptop? I thought the only user-replaceable components in a laptop were the HDD and RAM...
Normally, yes. But at 3 years old and not under warranty, why not? Nothing magic about it...the board is just held in with tiny screws, so as long as you've got some patience and feel like doing it, it's not difficult.

The hardest part is finding the replacement part.
 
11-09-2009, 10:40 AM   #9
zamula
LQ Newbie
 
Registered: Feb 2008
Posts: 9

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by TB0ne View Post
No, they're not typically something you can upgrade/replace, as they're usually part of the mobo. You MIGHT want to consider getting a brand-new motherboard, though...3 years old, you can perhaps find one fairly cheap....
Yeah, that's a good idea; I'll look into it. (Thanks again for the suggestions!) I took my laptop apart on a whim the other night; I knew it wouldn't help, but it was mildly cathartic. Anyway, it's not too bad to take apart, and replacing the motherboard would only be one step farther than I've been already.

Random question: without thinking, I took the heat sink off (to more easily blow some dust out of the fins). If I find a way to keep this machine going I should probably apply some fresh thermal grease, yes? (The stuff that was on the CPU was totally dried out, and the GPU seemed to just have something rubbery--not thermal grease at all--so I'm assuming I just leave that be.)
 
11-09-2009, 10:58 AM   #10
TB0ne
Guru
 
Registered: Jul 2003
Location: Birmingham, Alabama
Distribution: SuSE, RedHat, Slack,CentOS
Posts: 14,989

Rep: Reputation: 2672
I certainly would...fresh thermal paste is always a must, in my opinion.
 