Hitachi 7K1000.B Hard drive causes sudden shutdowns
First things first, this happens in BOTH Windows Vista and Ubuntu 9.04. My previous WD 250GB drive was sticking on start-up and prone to failure, so I bought this 1TB hard drive as a replacement. Now, for seemingly no reason, it causes my computer to shut down instantly, as if losing power. This never happened with my WD drive, and it isn't caused by processor usage or drive writing. So far it has run anywhere between 10 minutes and 13 hours without shutting down. What I was doing before each shutdown is not consistent, and not processor intensive. My GPU and CPU are both running at normal temperatures, and I have blown out my entire PC just in case. Hardware: GeForce 8200 motherboard (has VERY bad SATA controllers... but works), onboard video and sound, HP CD burner and DVD drive. Feel free to ask for outputs of commands, I'm fairly comfortable in Linux. I chose to post here as Linux is much better for diagnosing underlying issues in my experience.
Looks like you are not alone:
Might try the drives in another system if possible just to test. I don't know if this problem is specific to certain hard drives or just a general problem.
I would have tried it in a different box if any of them supported SATA, but they don't. The link you gave seems to be specific to failures during boot, and mine has never failed during boot. Maybe I'm missing something though.
Oddly enough, I think this is caused by my NVIDIA display drivers. As I recall, this started shortly after installing them on Vista. In addition, with no video drivers my fresh Linux install ran for 24+ hours, but within 15 minutes of rebooting after installing the video drivers, I encountered the same problem. This little tidbit from my syslog file may help somebody:
Jun 30 15:11:18 me-desktop x-session-manager: WARNING: Application 'libcanberra-login-sound.desktop' failed to register before timeout
Jun 30 15:11:18 me-desktop pulseaudio: module-x11-xsmp.c: X11 session manager not running.
Jun 30 15:11:18 me-desktop pulseaudio: module.c: Failed to load module "module-x11-xsmp" (argument: ""): initialization failed.
Jun 30 15:16:00 me-desktop kernel: [ 332.744412] NVRM: failed to unregister from the ACPI subsystem!
Jun 30 15:16:00 me-desktop bonobo-activation-server (me-5650): could not associate with desktop session: Failed to connect to socket /tmp/dbus-HFlARn45hq: Connection refused
Jun 30 15:16:28 me-desktop kernel: [ 360.242351] nvidia 0000:02:00.0: setting latency timer to 64
Jun 30 15:16:28 me-desktop kernel: [ 360.242584] NVRM: loading NVIDIA UNIX x86 Kernel Module 185.18.14 Wed May 27 02:23:13 PDT 2009
Jun 30 15:16:59 me-desktop kernel: [ 391.887488] nvidia 0000:02:00.0: setting latency timer to 64
Jun 30 15:16:59 me-desktop kernel: [ 391.887968] NVRM: loading NVIDIA UNIX x86 Kernel Module 173.14.12 Thu Jul 17 18:11:36 PDT 2008
Jun 30 15:16:59 me-desktop kernel: [ 391.904126] NVRM: failed to register with the ACPI subsystem!
Jun 30 15:16:59 me-desktop kernel: [ 391.904142] NVRM: API mismatch: the client has the version 185.18.14, but
Jun 30 15:16:59 me-desktop kernel: [ 391.904144] NVRM: this kernel module has the version 173.14.12. Please
Jun 30 15:16:59 me-desktop kernel: [ 391.904145] NVRM: make sure that this kernel module and all NVIDIA driver
Jun 30 15:16:59 me-desktop kernel: [ 391.904146] NVRM: components have the same version.
Jun 30 15:16:59 me-desktop kernel: [ 391.904162] NVRM: failed to unregister from the ACPI subsystem!
Jun 30 15:17:01 me-desktop /USR/SBIN/CRON: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Jun 30 15:17:02 me-desktop kernel: [ 395.002088] NVRM: failed to register with the ACPI subsystem!
Jun 30 15:17:02 me-desktop kernel: [ 395.002105] NVRM: API mismatch: the client has the version 185.18.14, but
Jun 30 15:17:02 me-desktop kernel: [ 395.002106] NVRM: this kernel module has the version 173.14.12. Please
Jun 30 15:17:02 me-desktop kernel: [ 395.002107] NVRM: make sure that this kernel module and all NVIDIA driver
Jun 30 15:17:02 me-desktop kernel: [ 395.002108] NVRM: components have the same version.
Jun 30 15:17:02 me-desktop kernel: [ 395.002123] NVRM: failed to unregister from the ACPI subsystem!
Jun 30 15:17:06 me-desktop kernel: [ 398.102107] NVRM: failed to register with the ACPI subsystem!
Jun 30 15:17:06 me-desktop kernel: [ 398.102124] NVRM: API mismatch: the client has the version 185.18.14, but
Jun 30 15:17:06 me-desktop kernel: [ 398.102126] NVRM: this kernel module has the version 173.14.12. Please
Jun 30 15:17:06 me-desktop kernel: [ 398.102127] NVRM: make sure that this kernel module and all NVIDIA driver
Jun 30 15:17:06 me-desktop kernel: [ 398.102128] NVRM: components have the same version.
Jun 30 15:17:06 me-desktop kernel: [ 398.102143] NVRM: failed to unregister from the ACPI subsystem!
Jun 30 15:17:06 me-desktop gdm: CRITICAL: gdm_config_value_get_bool: assertion `value->type == GDM_CONFIG_VALUE_BOOL' failed
It looks to me like a version mismatch error... but I don't know much about the nitpicky stuff in the kernel.
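For anyone who wants to check their own log for the same thing, here is a minimal Python sketch that pulls the two version numbers out of the NVRM "API mismatch" lines. It only assumes the message wording shown in the excerpt above; the function name `find_nvrm_mismatches` and the `sample` list are made up for illustration and are not part of any NVIDIA tool.

```python
import re

# Patterns follow the wording of the NVRM messages in the syslog excerpt above.
CLIENT_RE = re.compile(r"NVRM: API mismatch: the client has the version (\d+(?:\.\d+)+)")
MODULE_RE = re.compile(r"NVRM: this kernel module has the version (\d+(?:\.\d+)+)")


def find_nvrm_mismatches(lines):
    """Return the set of (client_version, module_version) pairs NVRM reported.

    Hypothetical helper: it pairs each "client has the version" line with the
    next "kernel module has the version" line that follows it.
    """
    mismatches = set()
    client = None
    for line in lines:
        match = CLIENT_RE.search(line)
        if match:
            client = match.group(1)
            continue
        match = MODULE_RE.search(line)
        if match and client is not None:
            mismatches.add((client, match.group(1)))
            client = None
    return mismatches


# Two lines copied from the excerpt above.
sample = [
    "Jun 30 15:16:59 me-desktop kernel: [ 391.904142] NVRM: API mismatch: the client has the version 185.18.14, but",
    "Jun 30 15:16:59 me-desktop kernel: [ 391.904144] NVRM: this kernel module has the version 173.14.12. Please",
]
print(find_nvrm_mismatches(sample))  # {('185.18.14', '173.14.12')}
```

To run it against a real log, pass an open file object instead of `sample`. A non-empty result would mean the X driver components and the loaded kernel module come from different driver releases, which is what the log itself complains about.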
Have you considered that you might have a power supply that is being pressed to maximum (or beyond) capacity on one of the supply rail combinations?
Or possibly that the power supply has developed an intermittent fault?
Either might be the case.
The situation with the nVidia drivers might be a red herring.
I suggest that you re-estimate the current requirements and substitute a replacement PSU as a test.
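A rough way to do that re-estimate is to add up the expected current per rail and compare it with what the PSU label allows. The Python sketch below shows the arithmetic only. Every number in it is a hypothetical placeholder, not a measurement of any real part, and the 80% headroom factor is an assumption for illustration, so substitute the figures from your own component datasheets and PSU label.

```python
# Hypothetical per-component current draw in amps, keyed by rail.
# These are placeholders only; take real values from datasheets.
EXAMPLE_COMPONENTS = {
    "drive_a": {"12V": 2.0, "5V": 1.0},
    "drive_b": {"12V": 2.0, "5V": 1.0},
    "board":   {"12V": 8.0, "5V": 4.0, "3.3V": 3.0},
}

# Hypothetical rail limits in amps, as they might be read off a PSU label.
EXAMPLE_LIMITS = {"12V": 14.0, "5V": 20.0, "3.3V": 20.0}


def rail_load(components):
    """Sum the estimated current per rail across all components."""
    totals = {}
    for draw in components.values():
        for rail, amps in draw.items():
            totals[rail] = totals.get(rail, 0.0) + amps
    return totals


def overloaded_rails(components, rail_limits, headroom=0.8):
    """List rails whose estimated load exceeds headroom * rated limit.

    headroom=0.8 is an assumed safety margin, not a standard figure.
    """
    totals = rail_load(components)
    return sorted(
        rail
        for rail, amps in totals.items()
        if amps > headroom * rail_limits.get(rail, 0.0)
    )


print(rail_load(EXAMPLE_COMPONENTS))
print(overloaded_rails(EXAMPLE_COMPONENTS, EXAMPLE_LIMITS))
```

An estimate like this can only suggest that a rail is undersized. It says nothing about an intermittent fault, which is why substituting a known good PSU is still the real test.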
Thanks Chris, I actually had that thought myself recently. However, after doing a BIOS update, the problem appears to be solved... well, in Linux at least. 72+ hours without a shutdown so far. Vista has gotten much worse though, with a top run time of 10 minutes now. Is it possible that Vista would be drawing more power from increased hardware usage and causing that problem? It seems unlikely to me, but I'm not sure. Time for some more Google work. If anyone has any more ideas they would be greatly appreciated.
If the processing activity* of the PC is high, more current is drawn from the power supply. If the PSU is failing due to an inability to supply (or consistently supply) higher currents on the supply rails (or combinations of supply rails), then running an OS that uses the system hardware intensively may cause a fault similar to the one you describe.
(*Any component in the PC will draw more current if it is in intensive use, e.g. the CPU, hard disks reading/writing more often, temperature controlled fans switching on and off, floppy disk drives operating on and off, more frequent memory access, video card GPUs being intensively used, etc.)
Again the only way to be sure that the PSU is at the root of the problem is to change it out for a working spare.
You can also experiment using an old fashioned method.
Try running the PC in a small room with a higher room temperature (e.g. say 10 degrees Celsius above normal).
Raising the room temperature can be achieved simply by using an electric fan heater, or similar heater, to raise the air temperature in the room. If the electronics in the PSU are marginal, the higher operating temperature MAY cause the PSU to fail more rapidly or more regularly.
Note: Just heat up the air in the room. DON'T aim the heater into the PSU or PC! And DON'T leave the room unattended while you perform the tests, for obvious reasons (i.e. fire).
This test obviously doesn't cover every possible cause of electronic failure. But if the problem source is related to dried out capacitors, some instances of poor solder joints, or marginally operating electronic components, this test may give you a lead.
Remember, it may not be the PSU but other electronic components in the system that are causing the problem.
Personally I'd invest in another PSU. At best the fault is resolved. At worst you will have a known working spare PSU on the shelf for future use.
Without sounding too parental please remember not to open and work on the PSU yourself. PC PSUs are switch mode power supplies and CAN KILL YOU. Maintenance of PSUs should be left to skilled, experienced, and appropriately licensed technicians.
Hope that helps
Just a few more thoughts for your consideration:
* thoroughly de-dust the system
* thoroughly de-dust the CPU fan
* check the CPU fan for correct operation
* bundle the cables in the PC so that there is free airflow through the case
* remove and re-seat the CPU and any expansion cards
* separate the CPU from the heatsink and replace the thermal pad or thermal grease between the two
Thanks Chris, I'll test that as soon as I can; I don't have the time for that right now, but I'll post back as soon as I've tested it.
I tested for power issues, and everything checked out. However, I've been running off of my first hard drive, and I decided to order another SATA connector to see if the issue was with running an OS from the new drive for some reason. It runs perfectly fine... until you keep it active for somewhere between 5 and 15 minutes. I'm going to toy around with what modes it's in and see if that helps at all, but right now I'm thinking a return is my best option. Thanks everyone.
I've gotten it to work, but the terms are a bit confusing to me. In Linux it runs fine regardless of how it's plugged in. In Vista, however, I have to use a Molex-to-SATA converter, and I can't boot from it. I can't decide if it's overheating or the PSU is failing. I can run games and such from the hard drive for hours without it failing, as long as I'm not booting from it and I'm using the Molex-to-SATA connector. That makes me think it's the PSU, but then why would Linux have no trouble with it? Anyway, I'm getting a cooling rack for it for Christmas, so hopefully that'll clear things up.
It seems you have a Frankenstein computer, because I think you threw in a lot of junk to make a computer. One of the junk pieces was an old power supply. I agree you have an issue with your power supply. The following power supplies are what I suggest for your Frankenstein computer.
FSP Group Blue Storm II 400
SeaSonic SS-550HT 80plus 550W
SeaSonic S12D 750 Silver 750W
All of these power supplies can provide over 150 watts combined on the 3.3 volt and 5 volt rails. Power supplies under this rating may not keep an AMD system reliable for long periods, and the power supply will age faster than you think, even if it is a quality one.
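To check a given PSU against that 150 watt figure, multiply each rail's rated current by its voltage and add the two. Here is a minimal Python sketch of that arithmetic. The function names and the amp values in the example call are hypothetical. Many PSU labels also print a combined maximum for the 3.3 V and 5 V rails that is lower than the sum of the individual maxima, so if your label has one, use that number instead of this upper bound.

```python
def combined_low_rail_watts(amps_3v3, amps_5v):
    """Upper bound on combined 3.3 V + 5 V power, from P = V * I per rail."""
    return 3.3 * amps_3v3 + 5.0 * amps_5v


def meets_threshold(amps_3v3, amps_5v, threshold_watts=150.0):
    """True if the combined figure is over the threshold (150 W per the post above)."""
    return combined_low_rail_watts(amps_3v3, amps_5v) > threshold_watts


# Hypothetical label values, for illustration only: 20 A on each rail.
print(combined_low_rail_watts(20, 20))
print(meets_threshold(20, 20))
```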
Using the Molex-to-SATA power adapter probably makes you balance the drives across the cables, instead of putting all SATA drives on the one cable that has only SATA power connectors.
I suggest buying an AC outlet checker. This device checks whether the wiring is correct. If the AC wiring is configured wrong, computers can have a lot of problems that a quality power supply cannot fix.
Thanks for the advice, but this isn't really a "Frankenstein" computer. I have a couple of those, but this one is only a year or two old. The power situation makes sense, since the SATA connectors (all on one main power line, or whatever you want to call it) might not be able to handle that load. I might just buy another Molex-to-SATA connector so I can hook up my DVD-RW drive when in Windows. I was planning on buying a power supply in the future, but right now I think I'd rather use the Molex-to-SATA trick. Thanks again!