Issues with HighPoint Rocket 133SB Controller Card
I have been running an all-SCSI drive system (except for the DVD-ROM drive, which is IDE) since these systems were first built in 2002 with no problems.
I am undertaking a project with one of my clients which requires a fair amount of disk space, and we need to do this on a budget. So I bought a 300GB ATA133 Seagate drive and a HighPoint Rocket 133SB Controller Card (the on-board IDE does not support LBA48).
The drive comes up as /dev/hdg, and I've partitioned and formatted it with no problems. Then I decided to test the new drive under load to make sure it works properly, because once I deploy this server, I don't want there to be any problems. Alas, there is a problem. Under certain circumstances, the system locks up whenever I do operations on the new drive. The only way out is to either hit the reset switch or power off.
I've tried inserting the card into the 5 different PCI slots on the motherboard, and through some admitted riser-card trickery, I've settled on PCI slot 4 (I'll explain why later, and I'll also go into detail as to the "riser card trickery" I used if anyone is interested or feels it important).
Ideally I would like to have the controller card on its own IRQ. Unfortunately, with the way this motherboard is configured, I cannot swing that. No matter which slot I try, the card will always share an IRQ with another device. It usually shares with one of the AIC7899 devices (there are 2 on this motherboard), eth0 or eth1, and the only USB device. I've decided the lesser of three evils would be to have it share with eth1 and the USB device. While I do use eth1 in my hosting environment, it is a cross-link with another server, and sees much less net traffic than eth0, which is my outside link to the Internet. Also, I do not use any USB devices. This configuration coincides with the card being in PCI slot 4.
My load test consists of copying the contents of one of my SCSI drives to the new drive, while simultaneously creating a tar of that same drive, plus running "yum update". I also configured eth1 to be my link to the Internet temporarily while running "yum". I also copied files from another system on my LAN via eth1; I wanted to make absolutely sure that sharing the IRQ with eth1 would not be an issue.
After the tar of /dev/sdb completed, I would run "gzip -9". This is when my system hangs.
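For anyone who wants to reproduce a similar load test, here is a rough Python sketch of the sequence. The mount points and the archive path are placeholders chosen for illustration, not the paths used in the original test, and the exact cp/tar arguments are an assumption about how the copy and the tar were done. It only starts the commands side by side and waits for them to finish.

```python
import subprocess

# Placeholder paths: assumptions for illustration, not the real mount points.
SRC = "/mnt/sdb"            # where the SCSI drive's filesystem is mounted
DST = "/mnt/hdg"            # where the new ATA drive is mounted
ARCHIVE = DST + "/sdb.tar"  # tar file written to the new drive

# Commands meant to run at the same time during the load phase.
LOAD_COMMANDS = [
    ["cp", "-a", SRC + "/.", DST + "/copy"],
    ["tar", "-cf", ARCHIVE, "-C", SRC, "."],
    ["yum", "-y", "update"],
]

# Read-heavy step that runs once the load phase has finished.
COMPRESS_COMMAND = ["gzip", "-9", ARCHIVE]


def run_concurrently(commands):
    """Start every command at once, wait for all, return their exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]
```

Nothing runs until the function is called. Calling run_concurrently(LOAD_COMMANDS) followed by run_concurrently([COMPRESS_COMMAND]) as root would mirror the order described above.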
At first I thought I had identified a problem with the IRQ sharing, but I don't think this is the case. I actually performed a test doing everything else but the "gzip" test, and it all completed successfully. However, when I do "gzip", even if nothing else is going on in the system, the system hangs. I also get this problem no matter which slot I put the card into.
This leads me to believe that there is not an IRQ issue here, although this sure looks like an IRQ problem. Everything I've read about the Rocket 133SB indicates that this board should work out of the box with no special considerations with any modern Linux distro. It also occurs to me that the difference with the "gzip" test is that it is reading a lot of data from the drive, whereas the other tests are mostly writing to the drive.
Here is the plan I've come up with to narrow down and solve the problem, along with the rationale for each step:
1. Try another controller card. I bought 3 of the Rocket 133SB boards to go into three different servers. This would eliminate the card as the problem.
2. Try another drive. As with the controller cards, I bought 3 of these drives. This would eliminate a problem with the drive.
3. Move the drives around and try using the cable that came with the card. I bought and used a longer ATA100/133 cable (2') because the cable that came with the card is not long enough to reach from the card to the drive. If I moved the drives around I could probably use the included cable, but this is a hassle. This would eliminate cable length (or the cable?) as an issue.
4. Try an earlier version of Fedora Core. I currently use FC4 in the field, so I would probably try that. This would eliminate FC6 (a fairly recent release) as the issue.
5. Update the motherboard BIOS. This particular system uses version 2.10, and the latest is 2.14. This would eliminate the system BIOS as the issue.
6. Replace the motherboard. I have noted that IDE0 (the Primary IDE bus) seems to not work properly. Whenever I hook the DVD-ROM drive to IDE0, I get funny characters on the ID in the POST, and bootable DVDs don't boot. But when I use IDE1 (the Secondary IDE bus), it works fine. This could indicate a wider problem with the motherboard. This would eliminate a physical motherboard problem.
But, I'm wondering if anyone else has seen this type of behavior? I will obviously keep you all informed as to my progress, but I'm hoping someone else has seen this problem and has a solution. Obviously, the easiest solution would be to just get a 300GB SCSI drive, but that would cost major dollars that I and my client do not have.
This is what I get for not looking before I post...
I read the other thread, "Before you post", and realized I forgot to include other information that could be relevant. I'll provide this info later as A) the system is unplugged and not convenient to access at this time and B) it's 3:30AM and I need to hit the hay.
I guess it pays to follow the instructions on what information to provide when you post on this hardware forum. Not only will it increase your chances of getting a helpful answer, you just might solve your own problem, as I did with this one. I'm posting the solution in the hopes that it will be useful to someone else.
I looked at /var/log/messages and saw this (snipping out irrelevant stuff):
Feb 11 02:52:16 valiant kernel: AMD7411: 0000:00:07.1 (rev 01) UDMA100 controller
Feb 11 02:52:16 valiant kernel: ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:pio, hdb:pio
Feb 11 02:52:16 valiant kernel: ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:pio
Feb 11 02:52:16 valiant kernel: hdc: Pioneer DVD-ROM ATAPIModel DVD-116 0122, ATAPI CD/DVD-ROM drive
Feb 11 02:52:16 valiant kernel: ide1 at 0x170-0x177,0x376 on irq 15
Feb 11 02:52:16 valiant kernel: HPT302: IDE controller at PCI slot 0000:00:0b.0
Feb 11 02:52:16 valiant kernel: ACPI: PCI Interrupt 0000:00:0b.0[A] -> GSI 19 (level, low) -> IRQ 16
Feb 11 02:52:16 valiant kernel: HPT302: chipset revision 2
Feb 11 02:52:16 valiant kernel: HPT302: 100% native mode on irq 16
Feb 11 02:52:16 valiant kernel: ide2: BM-DMA at 0x1000-0x1007, BIOS settings: hde:pio, hdf:pio
Feb 11 02:52:16 valiant kernel: ide3: BM-DMA at 0x1008-0x100f, BIOS settings: hdg:DMA, hdh:pio
Feb 11 02:52:16 valiant kernel: hdg: ST3300620A, ATA DISK drive
Feb 11 02:52:16 valiant kernel: ide3 at 0x2418-0x241f,0x2416 on irq 16
Feb 11 02:52:16 valiant kernel: hdg: max request size: 512KiB
Feb 11 02:52:16 valiant kernel: hdg: 586072368 sectors (300069 MB) w/16384KiB Cache, CHS=36481/255/63, UDMA(100)
Notice that ACPI is using the same IRQ as the HPT302 card (IRQ 16). A-HA! I DID have an IRQ conflict.
I don't really need ACPI on this system since it is a server, and ACPI won't be of much use here. So I went into the BIOS settings and turned off ACPI, and I also went ahead and turned off USB, since I don't use it either.
Here is that same section of /var/log/messages after I turned those things off:
Feb 11 22:45:05 valiant kernel: AMD7411: 0000:00:07.1 (rev 01) UDMA100 controller
Feb 11 22:45:05 valiant kernel: ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:pio, hdb:pio
Feb 11 22:45:05 valiant kernel: ide1: BM-DMA at 0xf008-0xf00f, BIOS settings: hdc:DMA, hdd:pio
Feb 11 22:45:05 valiant kernel: hdc: Pioneer DVD-ROM ATAPIModel DVD-116 0122, ATAPI CD/DVD-ROM drive
Feb 11 22:45:05 valiant kernel: ide1 at 0x170-0x177,0x376 on irq 15
Feb 11 22:45:05 valiant kernel: HPT302: IDE controller at PCI slot 0000:00:0b.0
Feb 11 22:45:05 valiant kernel: HPT302: chipset revision 2
Feb 11 22:45:05 valiant kernel: HPT302: 100% native mode on irq 11
Feb 11 22:45:05 valiant kernel: ide2: BM-DMA at 0x1000-0x1007, BIOS settings: hde:pio, hdf:pio
Feb 11 22:45:05 valiant kernel: ide3: BM-DMA at 0x1008-0x100f, BIOS settings: hdg:DMA, hdh:pio
Feb 11 22:45:05 valiant kernel: hdg: ST3300620A, ATA DISK drive
Feb 11 22:45:05 valiant kernel: ide3 at 0x2418-0x241f,0x2416 on irq 11
Feb 11 22:45:05 valiant kernel: hdg: max request size: 512KiB
Feb 11 22:45:05 valiant kernel: hdg: 586072368 sectors (300069 MB) w/16384KiB Cache, CHS=36481/255/63, UDMA(100)
Good, no apparent IRQ conflicts.
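A quick way to compare the two boots is to pull the "on irq N" lines out of each log and list what moved. This is only a sketch: the regular expression is an assumption that fits the lines shown above, and the function names are made up for illustration. The sample text is a subset of the log lines quoted in this post.

```python
import re

# Matches kernel lines such as "kernel: ide3 at ... on irq 16" or
# "kernel: HPT302: 100% native mode on irq 16".
IRQ_LINE = re.compile(r"kernel: (\w+):? .*\bon irq (\d+)")


def irq_map(log_text):
    """Return {device: irq} for every 'on irq N' line in the log text."""
    result = {}
    for line in log_text.splitlines():
        match = IRQ_LINE.search(line)
        if match:
            result[match.group(1)] = int(match.group(2))
    return result


def irq_changes(before, after):
    """Return {device: (old_irq, new_irq)} for devices whose IRQ changed."""
    old, new = irq_map(before), irq_map(after)
    return {dev: (old[dev], new[dev])
            for dev in old if dev in new and old[dev] != new[dev]}


BOOT_WITH_ACPI = """\
Feb 11 02:52:16 valiant kernel: ide1 at 0x170-0x177,0x376 on irq 15
Feb 11 02:52:16 valiant kernel: HPT302: 100% native mode on irq 16
Feb 11 02:52:16 valiant kernel: ide3 at 0x2418-0x241f,0x2416 on irq 16
"""

BOOT_WITHOUT_ACPI = """\
Feb 11 22:45:05 valiant kernel: ide1 at 0x170-0x177,0x376 on irq 15
Feb 11 22:45:05 valiant kernel: HPT302: 100% native mode on irq 11
Feb 11 22:45:05 valiant kernel: ide3 at 0x2418-0x241f,0x2416 on irq 11
"""

print(irq_changes(BOOT_WITH_ACPI, BOOT_WITHOUT_ACPI))
# prints {'HPT302': (16, 11), 'ide3': (16, 11)}
```

The on-board ide1 stays on IRQ 15 in both boots, so only the HPT302 card and its ide3 channel show up as changed.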
Actually, when I do "cat /proc/interrupts", it turns out that IRQ11 is used by both the HPT302 and one of the ethernet ports. But I ran a stress test for almost 9 hours with no problems.
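Spotting shared lines in /proc/interrupts by eye is easy to get wrong, so here is a small Python sketch that lists every IRQ with more than one device on it. The sample text is made up for illustration and is not the actual output from this server. The parsing is a heuristic: it assumes the devices are the comma-separated list at the end of each line and that device names contain no spaces.

```python
def shared_irqs(text):
    """Return {irq: [devices]} for numbered IRQs used by 2+ devices."""
    lines = text.strip("\n").splitlines()
    ncpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    shared = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        label = label.strip()
        if not label.isdigit():
            continue                       # skip NMI, LOC, ERR and similar rows
        fields = rest.split(None, ncpus)   # per-CPU counts, then the remainder
        if len(fields) <= ncpus:
            continue                       # nothing after the counts
        parts = [p.strip() for p in fields[ncpus].split(",")]
        first = parts[0].split()           # controller type, then first device
        if len(first) < 2:
            continue                       # no device name on this line
        devices = [first[-1]] + parts[1:]
        if len(devices) > 1:
            shared[int(label)] = devices
    return shared


# Illustrative sample only, NOT the real output from this machine.
SAMPLE = """\
           CPU0       CPU1
  0:     123456          0    XT-PIC  timer
 11:      98765          0    XT-PIC  ide2, ide3, eth1
 15:       4321          0    XT-PIC  ide1
NMI:          0          0
"""

print(shared_irqs(SAMPLE))
# prints {11: ['ide2', 'ide3', 'eth1']}
```

On a live system the same function can be fed the real file with shared_irqs(open("/proc/interrupts").read()).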
Also, I mentioned in my original post that gzip seems to trigger the hang. That turns out to be a coincidence. At the stage where I would begin using gzip in my stress test, the ACPI would trigger an event and cause the hang.