Crash: Suggestion Time! Which component is the culprit? -full description included-
Details about my system first -
AMD PhenomII x2 550 Black Edition
500gb HDD Western Digital
400w Corsair PSU
2GB OCZ DDR3 Memory (TESTED USING MEMTEST86+ PASSED WITH NO ERRORS)
Gigabyte GA-MA785GMT-UD2H Motherboard with On-Board HD4200
I ran Linux Mint 7.0
Then Linux Ubuntu 9.10
Then Windows XP with Service Pack 2
Then Ubuntu 8.04
My system started crashing about a week after I built it. It started with a crash every few days, usually either in Firefox or when I had multiple apps open. It started crashing on startup too, before the boot menu after GRUB loads. [Linux]
The crashes are in 2 different forms. Crashes when installing from CD or in the system - the screen freezes, keyboard and mouse unresponsive, screen has interference in the form of small blocks of horizontal bars, kind of like what you see on the output of a graphic equalizer or frequency bars on a fancy audio system. These are arranged in several columns on the screen. [XP and Linux]
The crash after GRUB but before the boot menu is weirder: a few dozen clumps of characters, like ÉÉÉ (mainly those, but also square boxes, normal characters, and characters from other languages) dancing over the screen. No pattern is perceivable, and the screen refreshes about every quarter second.
The crashes became so frequent that the system was unusable. I thought that it might be a Linux driver issue, or maybe the system was trying to address memory locations which weren't there, or maybe it formatted the HDD incorrectly (I've heard of problems with Windows selecting too big a partition for 500 GB HDDs, which caused disk formatting to hang at 100%). Maybe it could even be a power fluctuation with the PSU playing up.
One thing that I doubt is heat. The system has not gone anywhere near critical levels; the BIOS reported stable temps rarely above 45-50 degrees (when it was usable). Usually (about 95% of the time) it hovers around 40 for the processor.
After several install attempts, and re-formats with GParted and live CDs, the system does not successfully install any OS from a CD without crashing. This crash happens before you even get to formatting the disk. Or it crashes when formatting. Or it crashes when loading the files. Complete waste of space!
So. Before I start sending things back to manufacturers, I have one simple question.
With the above description, which component is the likely culprit and why???
It sounds to me like the video card. At least, whenever I have had a video card go on me, it was usually accompanied by artifacts on the screen (horizontal bars and weird characters). Another possibility is the RAM, which is a likely culprit when your system is crashing in various, seemingly unrelated places. You can test the RAM with most distros' install CDs. Look around in the boot menu for something called memtest or memtest86.
I'm guessing that the HD4200 is an onboard video device, yes? If so, it's not too likely that it is going bad yet (this build is fairly new???), but if the motherboard is brand new, maybe it's defective/flaky.
At the top of my list though, assuming non-defective motherboard, would be the power supply. Is it adequate? How old is the PSU? Maybe it's defective (too)?
Please read line 6 of the post, which mentions the memory test. Yes, I think it may be related to graphics too. But I'm not experienced enough to say for definite... Thanks for the advice, so you don't think it's the PSU?
In response to the last post, the PSU is powerful enough (so I've heard) and I specifically shelled out a lot of money, in comparison to the total cost of the system, for a good unit made by a good company.
Anyone think it's the HDD?
Ack, that was stupid of me! I guess my reading comprehension is a little shot today. My only experience with bad PSUs has been the machine not booting at all, but my experience is pretty limited in that area.
Bad hard drives are usually pretty easy to spot - the crashes are usually consistent. You can try booting a Linux rescue CD such as The Ultimate Boot CD and do a quick disk test. If it is the drive, Western Digital has an awesome RMA program.
The power supply is not adequate for AMD systems. The combined wattage for the 5 volt and 3.3 volt rails has to be 150 watts or more. Any less, and you could have issues. A Seasonic S12D is the only power supply these days that meets this power requirement.
I suggest running both memtest86 and memtest86+ again to test the memory. I also suggest switching to a discrete graphics card for a more thorough test.
It could be the video card drivers if you are using the ones from ATI. ATI does not write reliable and stable software for any OS, so I suggest using open source drivers instead. I suggest doing a test with vesa as your driver. If that works without any problems, I suggest using radeon as the driver for X11. Do not use fglrx as your driver.
You may have to use the framebuffer for the console (BASH), or you may have to remove it. If you are not using the framebuffer and still get those errors, add vga=ask to the kernel line in GRUB.
Running through the post, you say the power supply is not adequate for AMD but then later in the paragraph state that it needs to be 150 watts or more. My PSU is 400w, and I do not understand why it is not adequate?
The crashes happened both with fglrx and open source drivers. The crashes happened a lot less frequently when compiz was turned off and I was using the open source drivers, but the corruption in GRUB was still there. The crash happened twice in Windows XP too, and both times when the system was trying to install the drivers off the disk.
When I used GParted for reformatting the disk I used vesa as the driver but sometimes it still crashed during a reformat.
People, I thank you for trying to sort the system out but I do think that something is wrong. I think a rewording of the question might help.
I know the system is f****d, so, based on information provided, which part would you send back, and why?
All I am trying to do is get opinions on which part may be defective, to make me feel more confident about which part to send back.
I agree with the earlier posts regarding the video interface.
The fact that the system crashes/freezes showing video artifacts (blocks, lines, etc.) is consistent with a failure in the video circuitry. The fact that the system operates normally before this problem occurs is consistent with an overheating problem, most probably related to the video circuitry. You can simulate the same defect on a motherboard with an AGP video adapter by stopping the fan from cooling the video card circuitry. When the video circuitry fails to operate, the system bus locks up as a consequence, resulting in a system lockup/freeze.
If the time between system start up and lockup/freeze varies (when tested in a room with a stable environment), then the variation will most likely point to overheating.
I'm not familiar with the motherboard you are using. If the video is integrated, then I suggest that you re-test the motherboard in another machine. If you are using a separate AGP/PCI/PCI-X display adapter, then try replacing that as a first step.
On any system build it is a good idea to run memtest86 (check the web for a copy) to ensure that there are no defects in main memory, and that the memory modules are compatible with the motherboard (correctly recognized - compatible latency values, etc).
Whilst there may be other factors involved, on the basis of your description, this would seem to be the most likely answer.
If you have a copy of MS Windows I suggest a test install. If my suspicions are correct, you should see exactly the same fault occurring.
Your total wattage is 400W. Electro has a point about the combined wattage: 'The combined wattage for the 5 volt and 3.3 volt rails has to be 150 watts or more. Any less, and you could have issues. A Seasonic S12D is the only power supply these days that meets this power requirement.'
It does seem to be a power related problem.
What are the power ratings for your equipment?
Code:
AMD PhenomII x2 550 Black Edition
500gb HDD Western Digital
400w Corsair PSU
2GB OCZ DDR3 Memory (TESTED USING MEMTEST86+ PASSED WITH NO ERRORS)
Gigabyte GA-MA785GMT-UD2H Motherboard with On-Board HD4200
From the above links you should be able to ascertain that you do indeed have issues with how you have sized the system. I'll leave it to you to size the rest, but you should get the point.
What onebuck has described about total power requirements for 5 volts and 3.3 volts is what I said. In the past, AMD stated on their site a 150 watt total power requirement when adding up the power for 5 volts and 3.3 volts. Now AMD does not state the power requirement for these rails, but they say it is something to think about while building a computer or collecting parts for any computer.
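For anyone who wants to check the 150 watt figure against their own PSU label, here is a rough Python sketch of the arithmetic. The function names and the amp values in the example are made up for illustration and are not the ratings of any real unit; read the real numbers off the label. The 150 W threshold is just the figure quoted in this thread. The sketch also assumes the label may print a combined maximum for the 3.3 V and 5 V rails that is lower than the sum of the two individual maxima, which is common.

```python
# Rough check of combined 3.3 V + 5 V rail capacity against a threshold.
# The amp figures used in the example are hypothetical placeholders,
# NOT the ratings of any real PSU.


def rail_watts(volts, amps):
    """Maximum power of a single rail: P = V * I."""
    return volts * amps


def combined_minor_rail_watts(amps_3v3, amps_5v, label_combined_max=None):
    """Combined 3.3 V + 5 V capacity in watts.

    If the label prints a combined maximum for these two rails,
    pass it as label_combined_max and it caps the result.
    """
    total = rail_watts(3.3, amps_3v3) + rail_watts(5.0, amps_5v)
    if label_combined_max is not None:
        total = min(total, label_combined_max)
    return total


def meets_threshold(combined_watts, threshold=150.0):
    """True if the combined capacity reaches the threshold (150 W as quoted in this thread)."""
    return combined_watts >= threshold


if __name__ == "__main__":
    # Hypothetical label: 20 A on 3.3 V, 20 A on 5 V, 130 W combined maximum.
    combined = combined_minor_rail_watts(20.0, 20.0, label_combined_max=130.0)
    print("combined 3.3 V + 5 V capacity:", combined, "W")
    print("meets 150 W threshold:", meets_threshold(combined))
```

Note that this only compares label ratings. It says nothing about whether a particular unit is actually delivering stable power, which would need a multimeter or a swap test.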
If the Seasonic S12D-750 costs too much, you could go for the SeaSonic SS-550HT (550 watt) or the FSP Group Blue Storm II 500 (500 watt). These brands specialize in power supplies and build their own instead of contracting them out to another company.
Some people have suggested the video card, so yes, it could be the video card. This is easy to find out by adding a video card. You can use any video card, preferably a low-end one. After you insert the card, go into the BIOS and turn off the on-board video. It could also be the side port memory that is the cause, so you can try disabling the side port memory.