Daws 09-07-2006 08:08 AM

SOLVED:linux-kernel-headers: I think problem boils down to : Unknown Error 990
Right, today has not been a good day.

This morning, a seemingly innocent apt-get dist-upgrade has become a catastrophe. The more I tried to fix it, the worse things got.

Among the packages upgraded was linux-kernel-headers, which turned out to be the problem package. A few packages upgraded fine, then it got to linux-kernel-headers. I cannot remember the exact error for linux-kernel-headers (so long ago...) but the remaining packages failed and it ended with:


E: Sub-process /usr/bin/dpkg returned an error code (1)
I'd seen this error before, so it didn't worry me too much. I installed the other half-upgraded packages and nothing went wrong. I then tried to install linux-kernel-headers again using apt (redownloaded the package in case it was corrupted) and got a similar error.

In retrospect perhaps this is when I should have come here or elsewhere on the net for a solution...but no...

I decided to play with dpkg itself. dpkg reported that linux-kernel-headers was in a poor state and should be reinstalled which I tried... dpkg then segfaults. From this point on every command I type produces:


bash: command: Input/output error
even reboot and halt. eep...

I restart the computer (using the button, something I haven't had to do in years)

To my surprise everything seems to be working ok ... until I try to use apt-get or dpkg, at which point I get the Input/output error again and have to reset. So I start browsing the dpkg man-page for a solution, but no luck. I try reinstalling again, this time piping it through less:

Cannot remove file /usr/include/linux/version.h.dpkg.tmp : Unknown Error 990
(or something similar, again my memory is hazy, I was so far past the point of no return...)

I took it upon myself to see what was going on


root:/# cd /usr/include
root:/usr/include# cd linux
bash: cd: linux: Unknown Error 990

That was an hour ago and I haven't had any luck since. I don't know whether this caused the problem with dpkg or I caused this problem by messing with dpkg.

Anyone got any ideas?

Edit: Just thought I'd say that everything important is backed up so any potentially fatal suggestions are welcome too. Cheers

Daws 09-07-2006 09:47 AM

Ok. Getting somewhere now. Came across this in the dmesg output from knoppix:


Filesystem "hdc5": XFS internal error xfs_da_do_buf(1) at line 2119 of file fs/xfs/xfs_da_btree.c.  Caller 0xe105a98d
 <e105a2c9> xfs_da_do_buf+0x225/0x86c [xfs]  <e105a98d> xfs_da_read_buf+0x25/0x2c [xfs]
 <e10615b2> xfs_dir2_data_make_free+0x1ce/0x390 [xfs]  <e105a98d> xfs_da_read_buf+0x25/0x2c [xfs]
 <e106528b> xfs_dir2_node_removename+0x267/0x4e4 [xfs]  <e106528b> xfs_dir2_node_removename+0x267/0x4e4 [xfs]
 <e105f2f6> xfs_dir2_removename+0x10a/0x110 [xfs]  <e1096371> kmem_zone_zalloc+0x21/0x48 [xfs]
 <e10937ef> xfs_remove+0x283/0x400 [xfs]  <c0176522> __link_path_walk+0x8a/0xff4
 <e109cc67> xfs_vn_unlink+0x1f/0x48 [xfs]  <e108fb67> xfs_access+0x2b/0x34 [xfs]
 <c0137ac2> debug_mutex_add_waiter+0x2e/0x94  <c032b57a> __mutex_lock_slowpath+0x1a2/0x3f4
 <c01758a5> vfs_unlink+0x65/0xdc  <c032b5b8> __mutex_lock_slowpath+0x1e0/0x3f4
 <c01758a5> vfs_unlink+0x65/0xdc  <c01758fb> vfs_unlink+0xbb/0xdc
 <c0177da8> do_unlinkat+0xac/0x134  <c010336b> syscall_call+0x7/0xb
Filesystem "hdc5": XFS internal error xfs_trans_cancel at line 1150 of file fs/xfs/xfs_trans.c.  Caller 0xe109383d
 <e108a173> xfs_trans_cancel+0xef/0x110 [xfs]  <e109383d> xfs_remove+0x2d1/0x400 [xfs]
 <e109383d> xfs_remove+0x2d1/0x400 [xfs]  <c0176522> __link_path_walk+0x8a/0xff4
 <e109cc67> xfs_vn_unlink+0x1f/0x48 [xfs]  <e108fb67> xfs_access+0x2b/0x34 [xfs]
 <c0137ac2> debug_mutex_add_waiter+0x2e/0x94  <c032b57a> __mutex_lock_slowpath+0x1a2/0x3f4
 <c01758a5> vfs_unlink+0x65/0xdc  <c032b5b8> __mutex_lock_slowpath+0x1e0/0x3f4
 <c01758a5> vfs_unlink+0x65/0xdc  <c01758fb> vfs_unlink+0xbb/0xdc
 <c0177da8> do_unlinkat+0xac/0x134  <c010336b> syscall_call+0x7/0xb
xfs_force_shutdown(hdc5,0x8) called from line 1151 of file fs/xfs/xfs_trans.c.  Return address = 0xe108a18a
Filesystem "hdc5": Corruption of in-memory data detected.  Shutting down filesystem: hdc5
Please umount the filesystem, and rectify the problem(s)

"hdc5" is /usr, I am beginning to think this problem is bigger than dpkg. So can anyone help me "rectify the problem(s)"?
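For anyone else digging through a long dmesg, here is a rough sketch of pulling out just the device names XFS has shut down. The function name xfs_shutdown_devices is made up for illustration, and the pattern is only based on the xfs_force_shutdown line format shown in the log above; other kernel versions may word it differently.

```shell
# xfs_shutdown_devices is a hypothetical helper: it reads kernel log text on
# stdin and prints each device named in an xfs_force_shutdown(...) line.
# The pattern assumes the line format seen in the log above.
xfs_shutdown_devices() {
    sed -n 's/^.*xfs_force_shutdown(\([^,]*\),.*$/\1/p' | sort -u
}
```

Something like `dmesg | xfs_shutdown_devices` should then list the affected partitions, one per line.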

Daws 09-07-2006 02:28 PM

Problem solved, turns out hdc5 was quite badly corrupted and had been for quite a while.

For future reference, if you ever run into Unknown Error 990:

xfs_repair /dev/hdc5
sorted the problem. (Eventually; I had to update xfsprogs on knoppix 5.01 because I ran into bug 631.)
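A slightly more careful version of the repair step, as a sketch rather than a recipe. It assumes you have booted from a live CD so the partition can be left unmounted. check_then_repair and the MOUNTS_FILE variable are names invented here; only xfs_repair and its -n (no-modify) option come from xfsprogs.

```shell
# check_then_repair is a hypothetical helper, not part of xfsprogs.
# MOUNTS_FILE is an assumed override for testing; it defaults to /proc/mounts.
check_then_repair() {
    dev="$1"
    mounts="${MOUNTS_FILE:-/proc/mounts}"

    # xfs_repair must not run on a mounted filesystem, so bail out early.
    if awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"; then
        echo "$dev is still mounted; umount it first" >&2
        return 1
    fi

    # -n is no-modify mode: it only reports, and exits 1 if it finds corruption.
    if xfs_repair -n "$dev"; then
        echo "$dev looks clean, nothing to do"
    else
        xfs_repair "$dev"
    fi
}
```

In my case that would have been `check_then_repair /dev/hdc5`. The dry run first means you get to see what xfs_repair intends to fix before it touches anything.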

farslayer 09-07-2006 02:29 PM

This looks like it could be related....
What kernel version were/are you running?

Daws 09-07-2006 02:48 PM

Thanks farslayer, looks like this was the problem. I am running the stock Debian 2.6.17 kernel. Looks like that particular patch is not incorporated into Debian's kernels yet, or was only included recently.

It really was a very nasty bug: lost+found now contains hundreds of entries. I suppose it could have been worse; it only destroyed the linux includes and a few other easily replaced dirs.

Well, let this be a solemn warning to anyone using xfs and a 2.6.17 (.6 or earlier) kernel.
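If you want a quick check of whether a box falls in that range, something along these lines might do. is_affected_kernel is a made-up name, and the range (2.6.17 up to 2.6.17.6) is taken from the warning above rather than from a kernel changelog, so treat it as an assumption. A distro kernel can also carry the fix under an older version string, so a match only means "go and check".

```shell
# is_affected_kernel is a hypothetical helper. The affected range
# (2.6.17 through 2.6.17.6) is an assumption taken from this thread.
is_affected_kernel() {
    ver="${1%%-*}"              # drop a distro suffix such as "-2-686"
    case "$ver" in
        2.6.17)       return 0 ;;
        2.6.17.[0-6]) return 0 ;;   # exactly one digit, 0 to 6
        *)            return 1 ;;
    esac
}

if is_affected_kernel "$(uname -r)"; then
    echo "kernel $(uname -r) is in the 2.6.17 range mentioned above"
fi
```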

