LinuxQuestions.org

justanotherkv 07-30-2008 11:54 PM

Random grub errors {16, 17, 18} on boot
 
Lately, I've been getting grub errors upon rebooting my local server, specifically errors 16, 17, and 18 (which one shows up is random). Reinstalling grub from a livecd solves the error for 1 or 2 reboots, but it eventually comes back. I've reinstalled the OS (first was Debian stable, second is Ubuntu), which was okay for around 10 reboots, but now it's starting again. The filesystem (Reiserfs) seems okay, or at least badblocks doesn't turn up anything, and I don't have another disk that I can try out (aside from the terribly old ones).

Any ideas? Should I just shell out for a new disk to try, since the problem doesn't appear to be on the software side? Or is there something I missed trying?

ronlau9 07-31-2008 03:27 AM

It could be that the boot sector is just bad.
As a last resort, delete the primary sector; that mostly deletes everything.
Then create a new primary sector and use gparted to set the boot flag to on again.

storkus 07-31-2008 04:15 AM

Agreed. While logged in as root, use hdparm or smartctl to run some low-level drive tests (see the relevant man pages). You may also want to see if smartd is running, as it is not enabled by default on all distributions (Slackware, for instance, though I frankly don't understand why).
If SMART reports problems, BACK UP IMMEDIATELY and plan to swap out the drive. Ditto if any of those drive-level tests reports serious errors.
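
For anyone scripting these checks: besides its text output, smartctl reports what it found through its exit status, which is a bitmask described in the smartctl man page. Below is a minimal Python sketch of decoding it. The helper names and the choice of the `-a` switch are illustrative assumptions, not something from this thread, and `check_drive` has to be run as root against your own device path.

```python
import subprocess

# Meaning of each bit in smartctl's exit status, per the smartctl man page.
SMARTCTL_STATUS_BITS = {
    0: "command line did not parse",
    1: "device open failed or device is in low-power mode",
    2: "a SMART/ATA command failed or a SMART data checksum was wrong",
    3: "SMART status check returned DISK FAILING",
    4: "prefail attributes are at or below threshold",
    5: "some attributes were at or below threshold in the past",
    6: "device error log contains errors",
    7: "device self-test log contains errors",
}

def decode_smartctl_status(status):
    """Return the list of problems encoded in a smartctl exit status."""
    return [text for bit, text in SMARTCTL_STATUS_BITS.items()
            if status & (1 << bit)]

def check_drive(device):
    """Run `smartctl -a` on a device (hypothetical helper; needs root)."""
    result = subprocess.run(["smartctl", "-a", device],
                            capture_output=True, text=True)
    return decode_smartctl_status(result.returncode)

# Example, as root (the device path is an assumption):
#   for problem in check_drive("/dev/sda"):
#       print(problem)
```

An empty list means smartctl raised none of its status bits. Any of bits 3 to 7 is a reason to back up first and investigate afterwards.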

The one other possibility is that something is wrong with the GRUB configuration (wrong addressing type, etc). AFAIK, this kind of thing is rare these days as everything is fairly standardized on LBA, but you never know.

Hope this helps,

Mike

justanotherkv 07-31-2008 08:14 PM

okay, I have to leave soon, so a few questions that might be obvious with simple searches:

1. suppose that I have to resort to the last ditch erase the primary sector method: should I just use dd for that?
2. Hmm, hdparm gives off
HDIO_GET_UNMASKINTR failed: Inappropriate ioctl for device
HDIO_GET_DMA failed: Inappropriate ioctl for device
HDIO_GET_KEEPSETTINGS failed: Inappropriate ioctl for device
with default switches. Are those bad? smartctl hasn't turned up anything bad yet: I've enabled smartd to start at next reboot once the long smartctl test is done.

Another note: the grub config's worked for at least around a year, it's just been tripping out lately. Still not sure why.

Thanks!

storkus 08-03-2008 02:26 AM

Another possibility I just remembered from a different thread: bad memory. You may want to run memtest86. I don't think that's the problem, but it doesn't hurt to make sure it's ruled out.

As for the IOCTL problems: is this a SATA drive? If so, there's a problem in older tools because SATA is really the ATA or ATAPI protocol riding on top of SCSI, but the tools don't understand that. For instance, only the latest Slackware (12.1) has a new enough version. Otherwise, you'll need to upgrade. However, if the drive is PATA this doesn't apply and we're back to step 1.

Erasing the primary sector: this will make your drive unbootable as the first chain loader lives there (it's where the BIOS goes to find it). What you can do, if you're desperate, is back up the entire drive (preferably onto another known good drive of equal or greater size) using a drive imager and then perform a low level destructive test on the drive (where data is written on the drive and the old data is not preserved). If this works, and smartctl reports nothing, then your drive is probably sane and you'll have to look elsewhere.
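
Before overwriting the first sector with dd or anything else, it is cheap to keep a copy and look at what is in it. Here is a rough Python sketch, assuming a classic MBR (MS-DOS) partition table, where the 512-byte first sector holds 446 bytes of boot code, four 16-byte partition entries, and the 0x55 0xAA signature. The function names and file paths are made up for illustration, and reading a real device such as /dev/sda needs root.

```python
SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446   # boot code occupies bytes 0..445
PARTITION_ENTRY_SIZE = 16
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR

def read_first_sector(path):
    """Read the first sector of a disk or image file (read-only)."""
    with open(path, "rb") as disk:
        sector = disk.read(SECTOR_SIZE)
    if len(sector) != SECTOR_SIZE:
        raise ValueError("could not read a full 512-byte sector")
    return sector

def inspect_mbr(sector):
    """Report signature validity and which primary entries have the boot flag."""
    bootable = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + index * PARTITION_ENTRY_SIZE
        entry = sector[start:start + PARTITION_ENTRY_SIZE]
        if entry[0] == 0x80:           # status byte 0x80 = boot flag set
            bootable.append(index + 1)
    return {
        "signature_ok": sector[510:512] == BOOT_SIGNATURE,
        "bootable_partitions": bootable,
    }

def backup_first_sector(device, backup_path):
    """Save a copy of the first sector, then return what inspect_mbr sees."""
    sector = read_first_sector(device)
    with open(backup_path, "wb") as backup:
        backup.write(sector)
    return inspect_mbr(sector)

# Example, as root (both paths are assumptions):
#   print(backup_first_sector("/dev/sda", "mbr-backup.bin"))
```

Note that the saved sector contains the partition table as well as the boot code, so writing the whole backup back later also restores the old partition layout.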

Mike

justanotherkv 08-03-2008 03:47 AM

Hmm, I've done memtest86, but I almost always got impatient and didn't let the test run all the way. Maybe I should try it again...

Hmm, it's a pata drive. Bah.

Smartctl didn't turn up anything bad (the expected lifetime of the drive appeared good), and the rest of the info looks okay. Oh boy, I'm sleepy. I'll have to look into this tomorrow...

checkmate3001 08-03-2008 05:06 AM

This may sound stupid... but check all the connections on the drives.
Perhaps one has wiggled loose a little bit?

ronlau9 08-04-2008 02:12 AM

Quote:

Originally Posted by storkus (Post 3234906)
Erasing the primary sector: this will make your drive unbootable as the first chain loader lives there (it's where the BIOS goes to find it).

Mike

Yes, that is why I wrote before that, after deleting the primary sector,
you create a new one and put the boot flag to on again.
The result: everything is new and bootable again.
I have done it several times and it works well.
Mostly I do it with the install CD of Kubuntu: just before it actually
starts installing, I switch off the main power, and my HD is clean and bootable again.
Most likely gparted will do the job too.


All times are GMT -5.