Old 03-06-2012, 10:49 AM   #1
stj5353
LQ Newbie
 
Registered: Mar 2012
Posts: 2

Rep: Reputation: Disabled
mdadm is not rebuilding! Seems hung. How to restart??


OK, so I have a hotplug backplane and mdadm managing 3 drives in a RAID 5 array.

It all works fine about 50% of the time. When it fails it needs a reboot, and I don't know why!

If I pull a drive and re-insert it, it gets detected and starts rebuilding, but most times the rebuild gets hung:


root@iomega-array1:/etc# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
md2 : active raid5 sdd2[1](F) sde2[2] sdc2[0]
1911562240 blocks super 1.0 level 5, 512k chunk, algorithm 2 [3/2] [U_U]
[>....................] recovery = 1.8% (17464448/955781120) finish=29315.0min speed=533K/sec

md1 : active raid1 sda2[0] sdb2[1]
467405488 blocks super 1.0 [2/2] [UU]

md0 : active raid1 sdd1[3](F) sda1[0] sde1[4] sdc1[2] sdb1[1]
20980816 blocks super 1.0 [5/4] [UUU_U]
resync=DELAYED

unused devices: <none>
root@iomega-array1:/etc#


It will stay at 1.8% until I reboot.
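The 533K/sec figure is also way below md's normal rebuild floor. The per-device throttle is readable through /proc; a minimal check, assuming the standard interface is present:

# cat /proc/sys/dev/raid/speed_limit_min
# cat /proc/sys/dev/raid/speed_limit_max

The usual defaults are 1000 and 200000 KB/sec per device, so a recovery stuck well below speed_limit_min that never advances points at blocked I/O rather than throttling.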

mdadm shows that the array is rebuilding, but it's hung. It will not move.


root@iomega-array1:/etc# mdadm --detail /dev/sdb
mdadm: /dev/sdb does not appear to be an md device
root@iomega-array1:/etc# mdadm --detail /dev/md2
/dev/md2:
Version : 1.00
Creation Time : Mon Mar 5 18:03:59 2012
Raid Level : raid5
Array Size : 1911562240 (1823.01 GiB 1957.44 GB)
Used Dev Size : 955781120 (911.50 GiB 978.72 GB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent

Update Time : Tue Mar 6 10:04:05 2012
State : active, degraded, recovering
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 512K

Rebuild Status : 1% complete

Name : iomega-array1:2 (local to host iomega-array1)
UUID : 9264acfb:af15b0a8:70a55835:3baea6ae
Events : 24

Number Major Minor RaidDevice State
0 8 34 0 active sync /dev/sdc2
1 8 50 1 faulty spare rebuilding /dev/sdd2
2 8 66 2 active sync /dev/sde2
root@iomega-array1:/etc#



I have tried all of these to force a rebuild, but they all failed!

root@iomega-array1:/etc# mdadm --fail /dev/sdd2 /dev/md2
mdadm: error opening /dev/sdd2: No such device or address
root@iomega-array1:/etc# mdadm --remove /dev/md2 /dev/sdd2
mdadm: hot remove failed for /dev/sdd2: Device or resource busy
root@iomega-array1:/etc# mdadm --fail /dev/md2 /dev/sdd2
mdadm: set /dev/sdd2 faulty in /dev/md2
root@iomega-array1:/etc# mdadm --remove /dev/md2 /dev/sdd2
mdadm: hot remove failed for /dev/sdd2: Device or resource busy
root@iomega-array1:/etc# mdadm --stop /dev/md2
mdadm: failed to stop array /dev/md2: Device or resource busy
Perhaps a running process, mounted filesystem or active volume group?
root@iomega-array1:/etc#



I don't get what gives. How in the world can I initiate whatever normally gets initiated on a reboot? It will recover on reboot; I just don't want to bring down my other arrays when I do a drive swap.

Thanks for any insight.
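For reference, one thing that can sometimes kick a stalled sync without a reboot, on kernels that expose md through sysfs; a rough sketch, assuming the stuck array is /dev/md2 and that interface exists on this firmware:

# cat /sys/block/md2/md/sync_action
# echo idle > /sys/block/md2/md/sync_action
# cat /proc/mdstat

Writing idle should abort the current pass; if the array still needs recovery, md will normally restart it on its own. If even that write hangs, the md or disk layer is probably wedged, and dmesg is the next place to look.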
 
Old 03-06-2012, 01:40 PM   #2
ba.page
Member
 
Registered: Feb 2012
Location: Canada
Distribution: Scientific,Debian
Posts: 35

Rep: Reputation: 7
given that /dev/sdd is also in use in your /dev/md0 array, taking this drive out WILL degrade both arrays.
that said, I would personally not try to have a drive be a member of multiple md arrays.
also, seeing as this happens often and hangs, maybe you just have a bad drive?

moving on, try this:
# mdadm /dev/md2 -f /dev/sdd2
# mdadm /dev/md2 -r /dev/sdd2
# mdadm --zero-superblock /dev/sdd2
# mdadm /dev/md2 -a /dev/sdd2
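since sdd1 is also a member of md0, it may be worth confirming nothing still claims the disk before zeroing anything; a quick check, assuming the standard mdadm and /proc interfaces:

# grep sdd /proc/mdstat
# mdadm --examine /dev/sdd2

--examine reads the superblock on the partition itself, so it shows whether md still thinks sdd2 belongs to an array; --zero-superblock should only be run once nothing does.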
 
Old 03-06-2012, 01:48 PM   #3
stj5353
LQ Newbie
 
Registered: Mar 2012
Posts: 2

Original Poster
Rep: Reputation: Disabled
Oh I'm 100% on board with you WRT multiple partitions in arrays. The problem is that this is a SAN array from iomega, and they have a requirement to create md0 across all drives. I don't know why. It seems crazy to me, but they say it's just for storage and backup of their firmware and should not cause many (if any) IOs during normal operation.

Since I use their GUI to create and manage the arrays, I'm using mdadm on the backend to see what's going on.

mdadm is still mdadm; their GUI is just a wrapper. So I'm debugging why drives get stuck rebuilding, since iomega support is not really primed to work low-level issues out with customers. They would rather you return the box than figure out why this is happening. Actually, they would be happiest if I just rebooted, but that's not something I'm willing to accept...

But... I digress...

I've tested the drive in an Ubuntu box and it seems fine and dandy. I can't really tie this problem to a particular drive, since it pretty much happens on any rebuild, not necessarily on a failure of a given drive.
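If smartmontools happens to be installed on the appliance, a SMART read is another quick data point; a sketch, assuming the suspect disk shows up as /dev/sdd:

# smartctl -H /dev/sdd
# smartctl -l error /dev/sdd

-H reports the drive's overall health assessment and -l error dumps its logged errors, which helps separate a flaky disk from a flaky backplane slot.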

I already tried those suggestions, just using the long switches. Forcing didn't seem to help:
root@iomega-array1:/etc# mdadm --fail /dev/sdd2 /dev/md2
mdadm: error opening /dev/sdd2: No such device or address
root@iomega-array1:/etc# mdadm --remove /dev/md2 /dev/sdd2
mdadm: hot remove failed for /dev/sdd2: Device or resource busy
root@iomega-array1:/etc# mdadm --fail /dev/md2 /dev/sdd2
mdadm: set /dev/sdd2 faulty in /dev/md2
root@iomega-array1:/etc# mdadm --remove /dev/md2 /dev/sdd2
mdadm: hot remove failed for /dev/sdd2: Device or resource busy
root@iomega-array1:/etc# mdadm --stop /dev/md2
mdadm: failed to stop array /dev/md2: Device or resource busy
Perhaps a running process, mounted filesystem or active volume group?
root@iomega-array1:/etc#
 
Old 03-06-2012, 01:59 PM   #4
ba.page
Member
 
Registered: Feb 2012
Location: Canada
Distribution: Scientific,Debian
Posts: 35

Rep: Reputation: 7
your problem is syntax

"# mdadm --fail /dev/sdd2 /dev/md2" is not a valid command
syntax should be: mdadm <raiddevice> [options] <component-devices>
(see man mdadm for details)

please try my suggestions explicitly:
# mdadm /dev/md2 -f /dev/sdd2
# mdadm /dev/md2 -r /dev/sdd2
# mdadm --zero-superblock /dev/sdd2
# mdadm /dev/md2 -a /dev/sdd2
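after the re-add, a quick check should show the disk come back as a rebuilding member rather than faulty; assuming watch is available (plain cat works too):

# mdadm --detail /dev/md2
# watch cat /proc/mdstat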
 
Old 03-14-2012, 10:58 PM   #5
Atari911
LQ Newbie
 
Registered: Sep 2003
Location: California, USA
Distribution: Slackware 13.1
Posts: 12

Rep: Reputation: 1
sounds like you may have a process that is using the disk when you are attempting to rebuild the degraded array.
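A few ways to see what might still be holding the array or the member disk; a sketch, assuming lsof and fuser are installed and the array in question is /dev/md2:

# fuser -vm /dev/md2
# lsof /dev/md2 /dev/sdd2
# grep md2 /proc/mounts
# ls /sys/block/md2/holders

fuser -vm lists processes using a filesystem mounted from the device, lsof shows anything with the device node itself open, /proc/mounts shows whether it is mounted at all, and the holders directory reveals device-mapper or LVM volumes stacked on top, any of which can produce "Device or resource busy" when trying to stop the array.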
 
  

