LinuxQuestions.org
11-07-2009, 05:10 AM   #1
aquabubble
LQ Newbie
 
Registered: Nov 2009
Posts: 1
Thanked: 0
RAID5 fs errors after new install


I've got a filesystem error on one of my RAID5 arrays that seems to have occurred in the course of a new install of Karmic.

My machine has two RAID5 arrays, one of 6x250GB HDDs (md0) and another of 4x300GB HDDs (md1). These had previously been created under EVMS though not as EVMS volumes.

A few months ago I had a system drive failure and it's taken me until now to get around to fixing things. To first test that all other hardware was working properly, I booted the SystemRescueCD, which was able to assemble both of my RAID arrays as md0 and md1. Further examination revealed that one drive from md0 had failed though the array was still functioning in degraded mode. Thankfully, I was able to mount them and browse their filesystems.

sudo mdadm --detail /dev/md0

Code:
 
/dev/md0:
        Version : 00.90
  Creation Time : Sat Mar  4 18:31:44 2006
     Raid Level : raid5
     Array Size : 1220992000 (1164.43 GiB 1250.30 GB)
  Used Dev Size : 244198400 (232.89 GiB 250.06 GB)
   Raid Devices : 6
  Total Devices : 5
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Fri Nov  6 08:58:20 2009
          State : clean, degraded
 Active Devices : 5
Working Devices : 5
 Failed Devices : 0
  Spare Devices : 0
         Layout : left-asymmetric
     Chunk Size : 128K
           UUID : ce9f5c43:5d44a130:2e63d74e:ed30a123
         Events : 0.6778878
    Number   Major   Minor   RaidDevice State
       0       8       80        0      active sync   /dev/sdf
       1       8        0        1      active sync   /dev/sda
       2       8       16        2      active sync   /dev/sdb
       3       0        0        3      removed
       4       8       32        4      active sync   /dev/sdc
       5       8       96        5      active sync   /dev/sdg
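
As a quick cross-check of the figures above (a sketch, not part of the original diagnosis): RAID5 keeps one member's worth of parity, so the usable size should be (Raid Devices - 1) x Used Dev Size. mdadm reports both sizes in KiB. The constant and function names below are only labels for values copied from the output.

```python
# Values copied from the mdadm --detail output (sizes are in KiB).
RAID_DEVICES = 6
USED_DEV_SIZE_KIB = 244198400
REPORTED_ARRAY_SIZE_KIB = 1220992000


def raid5_usable_kib(raid_devices, used_dev_size_kib):
    """Usable RAID5 capacity: one member's worth of space goes to parity."""
    return (raid_devices - 1) * used_dev_size_kib


usable_kib = raid5_usable_kib(RAID_DEVICES, USED_DEV_SIZE_KIB)
usable_bytes = usable_kib * 1024

print("usable KiB:  ", usable_kib)
print("matches mdadm:", usable_kib == REPORTED_ARRAY_SIZE_KIB)
print("usable bytes:", usable_bytes)
```

The byte figure works out to 1250295808000, the same capacity the kernel and fdisk report for md0 further down, so the array geometry itself looks consistent even in degraded mode.
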
Time to start rebuilding the server: I downloaded Ubuntu 9.10 Server Edition and set about reinstalling on a fresh 80GB HDD. The install went well, though on reboot the boot stalled at the "Loading GRUB" message for a few minutes (that's a different problem, though).

On logging in, I checked that both arrays had been assembled with the expected members. md1 mounted okay but md0 failed to mount.

sudo mount /dev/md0 /mnt/raid_array_0

Code:
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so
Oh dear. What's this?

dmesg | tail

Code:
[ 1008.730588]  --- rd:6 wd:5
[ 1008.730592]  disk 0, o:1, dev:sdf
[ 1008.730596]  disk 1, o:1, dev:sda
[ 1008.730600]  disk 2, o:1, dev:sdb
[ 1008.730603]  disk 4, o:1, dev:sdc
[ 1008.730607]  disk 5, o:1, dev:sdg
[ 1008.730681] md0: detected capacity change from 0 to 1250295808000
[ 1008.731251]  md0: unknown partition table
[ 2179.042477] EXT2-fs error (device md0): ext2_check_descriptors: Block bitmap for group 3968 not in group (block 126903275)!
[ 2179.042514] EXT2-fs: group descriptors corrupted!
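
To see what the kernel is complaining about here: the ext2 driver expects each group's block bitmap to sit inside that group's own block range. The sketch below reproduces that range check for group 3968. The 4 KiB block size, the 32768 blocks per group, and a first data block of 0 are assumptions (common mke2fs defaults for a volume this size); the post does not show the real superblock values.

```python
# Assumed filesystem geometry (NOT shown in the post): typical mke2fs defaults.
BLOCK_SIZE = 4096                  # bytes per filesystem block (assumption)
BLOCKS_PER_GROUP = 8 * BLOCK_SIZE  # one bitmap block has 8 bits per byte
FIRST_DATA_BLOCK = 0               # assumption: 0 for block sizes above 1 KiB


def group_block_range(group):
    """Return (first_block, last_block) covered by a full block group."""
    first = FIRST_DATA_BLOCK + group * BLOCKS_PER_GROUP
    return first, first + BLOCKS_PER_GROUP - 1


def bitmap_in_group(group, bitmap_block):
    """Mirror of the sanity check: is the bitmap inside its own group?"""
    first, last = group_block_range(group)
    return first <= bitmap_block <= last


# Numbers taken from the EXT2-fs error line.
GROUP = 3968
REPORTED_BITMAP_BLOCK = 126903275

first, last = group_block_range(GROUP)
print("group", GROUP, "covers blocks", first, "to", last)
print("bitmap in group:", bitmap_in_group(GROUP, REPORTED_BITMAP_BLOCK))
```

Under those assumptions group 3968 covers blocks 130023424 to 130056191, so a descriptor pointing at block 126903275 fails the check, which would be consistent with the "group descriptors corrupted" message.
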
Argh! How could this be? Time to delve deeper...

sudo fsck -n /dev/md0

Code:
 
fsck from util-linux-ng 2.16
e2fsck 1.41.9 (22-Aug-2009)
fsck.ext2: Group descriptors look bad... trying backup blocks...
Superblock has an invalid journal (inode 8).
Clear? no
fsck.ext2: Illegal inode number while checking ext3 journal for raid5_arr0_vol0
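
The "trying backup blocks" line refers to the spare copies of the superblock and group descriptors that ext2/ext3 keeps at the start of certain block groups. Below is a rough sketch of where those copies would live, assuming the sparse_super feature (copies in group 1 and in groups that are powers of 3, 5 and 7), 4 KiB blocks and 32768 blocks per group. None of these parameters is confirmed by the output above, and the filesystem may be smaller than the device. The byte size is the md0 capacity from the dmesg output.

```python
ARRAY_BYTES = 1250295808000  # md0 capacity as reported by the kernel
BLOCK_SIZE = 4096            # assumption, not confirmed by the post
BLOCKS_PER_GROUP = 32768     # assumption, not confirmed by the post


def group_count(total_bytes):
    """Number of block groups, rounding the final partial group up."""
    total_blocks = total_bytes // BLOCK_SIZE
    return -(-total_blocks // BLOCKS_PER_GROUP)  # ceiling division


def backup_groups(groups):
    """Groups holding backup superblocks under the sparse_super layout."""
    found = {1} if groups > 1 else set()
    for base in (3, 5, 7):
        g = base
        while g < groups:
            found.add(g)
            g *= base
    return sorted(found)


def backup_superblock_blocks(total_bytes):
    """Filesystem block numbers where each backup superblock starts."""
    return [g * BLOCKS_PER_GROUP for g in backup_groups(group_count(total_bytes))]


blocks = backup_superblock_blocks(ARRAY_BYTES)
print("block groups:", group_count(ARRAY_BYTES))
print("first backup superblocks:", blocks[:5])
```

A block number from this list is the kind of value that e2fsck's -b option expects (with -B giving the block size); combining it with -n keeps the check read-only.
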
Crikes. Any more info available I wonder?

sudo debugfs -c /dev/md0

Code:
debugfs 1.41.9 (22-Aug-2009)
/dev/md0: catastrophic mode - not reading inode or group bitmaps
debugfs:  ncheck 8
Inode   Pathname
ncheck: Can't read next inode while doing inode scan
Well, that didn't tell me anything, apart from the fact that I'm in trouble. I wonder what fdisk will tell me?

sudo fdisk -l

Code:
 
Disk /dev/sda: 250.1 GB, 250059350016 bytes
8 heads, 1 sectors/track, 61049646 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Disk identifier: 0x00000000
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *       16633       16882         995+  c7  Syrinx
/dev/sda2               1           1           0    0  Empty
Partition 2 does not end on cylinder boundary.
/dev/sda3       268452089   268452338         995+  c7  Syrinx
/dev/sda4               1           1           0    0  Empty
Partition 4 does not end on cylinder boundary.

Disk /dev/sdb: 250.1 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Disk /dev/sdb doesn't contain a valid partition table

Disk /dev/sdd: 320.1 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Disk /dev/sdd doesn't contain a valid partition table

Disk /dev/sdc: 250.1 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Disk /dev/sdc doesn't contain a valid partition table

Disk /dev/sde: 320.1 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Disk /dev/sde doesn't contain a valid partition table

Disk /dev/sdf: 250.1 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Disk /dev/sdf doesn't contain a valid partition table

Disk /dev/sdg: 250.1 GB, 250059350016 bytes
255 heads, 63 sectors/track, 30401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Disk /dev/sdg doesn't contain a valid partition table

Disk /dev/sdh: 320.1 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Disk /dev/sdh doesn't contain a valid partition table

Disk /dev/sdj: 80.0 GB, 80026361856 bytes
255 heads, 63 sectors/track, 9729 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000001
   Device Boot      Start         End      Blocks   Id  System
/dev/sdj1               1        9698    77899153+  8e  Linux LVM
/dev/sdj2            9699        9729      249007+   5  Extended
/dev/sdj5            9699        9729      248976   83  Linux

Disk /dev/sdi: 320.1 GB, 320072933376 bytes
255 heads, 63 sectors/track, 38913 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00000000
Disk /dev/sdi doesn't contain a valid partition table

Disk /dev/md1: 960.2 GB, 960218529792 bytes
2 heads, 4 sectors/track, 234428352 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Disk identifier: 0x00000000
Disk /dev/md1 doesn't contain a valid partition table

Disk /dev/md0: 1250.3 GB, 1250295808000 bytes
2 heads, 4 sectors/track, 305248000 cylinders
Units = cylinders of 8 * 512 = 4096 bytes
Disk identifier: 0x00000000
Disk /dev/md0 doesn't contain a valid partition table
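
For context on the odd /dev/sda listing: fdisk prints a partition table for any disk whose first sector ends in the 0x55 0xAA boot signature, and it decodes the four 16-byte entries at offset 446 whatever they happen to contain. The sketch below does the same decoding on a 512-byte sector. The sample sector is synthetic (built in memory with made-up start and length values), not data read from /dev/sda, and the function name is hypothetical. Type 0xc7 is the ID that fdisk labels Syrinx.

```python
import struct

MBR_TABLE_OFFSET = 446   # partition table starts here in sector 0
MBR_ENTRY_SIZE = 16      # four entries of 16 bytes each
MBR_SIGNATURE = b"\x55\xaa"


def parse_mbr(sector):
    """Decode the four MBR partition entries, or return None if unsigned."""
    if len(sector) < 512 or sector[510:512] != MBR_SIGNATURE:
        return None
    entries = []
    for i in range(4):
        start = MBR_TABLE_OFFSET + i * MBR_ENTRY_SIZE
        raw = sector[start:start + MBR_ENTRY_SIZE]
        # status, 3 CHS bytes (skipped), type, 3 CHS bytes (skipped),
        # then little-endian start LBA and sector count.
        status, ptype, lba_start, sectors = struct.unpack("<B3xB3xII", raw)
        entries.append({
            "bootable": status == 0x80,
            "type": ptype,
            "lba_start": lba_start,
            "sectors": sectors,
        })
    return entries


# Synthetic example sector: illustrative values only, not from the post.
sample = bytearray(512)
sample[446:462] = struct.pack("<B3xB3xII", 0x80, 0xC7, 2048, 1991)
sample[510:512] = MBR_SIGNATURE

for number, entry in enumerate(parse_mbr(bytes(sample)), start=1):
    print(number, hex(entry["type"]), entry)
```

Reading the real first sector would mean opening the device read-only as root, for example with open("/dev/sda", "rb").read(512), and passing the result to the same function.
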
Hang on... what's this Syrinx stuff doing on my /dev/sda device? I've not seen that there before, but I can't be sure because it's a while since I looked at the system.

Has the install of 9.10 killed my RAID5 array? How could it? Is it related to the GRUB problem? Is it related to the strange partition table on /dev/sda? More importantly, does anybody know how I can recover from this, apart from launching into fsck -y with reckless abandon?

I'd be very grateful for any assistance.

Last edited by aquabubble; 11-07-2009 at 05:51 AM. Reason: Added better introduction to the post
All times are GMT -5.
