Old 08-13-2013, 01:38 PM   #1
PeteLindsey
LQ Newbie
 
Registered: Aug 2013
Posts: 15

Rep: Reputation: Disabled
Cannot add replacement drive (mdadm) not large enough to join array


I'm using mdadm (v2.6.4, 19th October 2007) on Openfiler (kernel 2.6.29.6-0.24.smp.gcc3.4.x86.i686).

I replaced a faulty 4 TB Hitachi drive with another of the exact same model, and adding it to the array fails with the error "not large enough to join array".

Steps:
I installed the new drive, used parted to create a label and a partition identical to those on the existing drives, used partprobe to re-read the partition table, then tried to add the disk with: mdadm --add /dev/md1 /dev/sdb1

The troubleshooting so far:

blockdev --getsz
got same results for all 3

hdparm -g
got same results for all 3


I even tweaked the numbers in parted to give me a few blocks more so that there was no possibility of there being not enough space.

After changing command from:
mdadm --add /dev/md1 /dev/sdb1
to:
mdadm -vv --add --force /dev/md1 /dev/sdb1

I finally found what may be the root cause (mdadm: set device faulty failed for /dev/sdb1: No such device)

But the drive is fine, there is no indication from any other system or software that says there is a problem with the drive.

I even tried copying the first 1000 blocks off the bad drive to get the first copy of the GPT data off the drive, but nothing changes the results.

Can anyone give me any advice or directions? -- Very frustrating

TIA-

Cheers
 
Old 08-13-2013, 02:36 PM   #2
Ser Olmy
Senior Member
 
Registered: Jan 2012
Distribution: Slackware
Posts: 2,404

Rep: Reputation: Disabled
Did you remove the old drive with mdadm --manage /dev/md1 --remove /dev/sdb1 (assuming the old drive was /dev/sdb) before trying to add the new /dev/sdb1 partition to the array?

What does mdadm --detail /dev/md1 say?
 
Old 08-13-2013, 02:51 PM   #3
PeteLindsey
Follow up information

First of all, thank you very much for reviewing and contributing, this problem is really causing me stress...

Yes, the remove command was issued (sorry that I didn't mention that previously). Here is the output from mdadm --detail /dev/md1:

/dev/md1:
Version : 01.02.03
Creation Time : Sun Dec 2 21:16:26 2012
Raid Level : raid5
Array Size : 7814034688 (7452.04 GiB 8001.57 GB)
Used Dev Size : 7814034688
Raid Devices : 3
Total Devices : 2
Preferred Minor : 1
Persistence : Superblock is persistent

Update Time : Tue Aug 13 13:50:30 2013
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

Layout : left-symmetric
Chunk Size : 128K

Name : 1
UUID : 36ab8587:89ff7d55:cae2e38b:eb115cf8
Events : 323410

Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
1 0 0 1 removed
2 8 49 2 active sync /dev/sdd1
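
For anyone reading along, here is a minimal sketch of pulling the empty slot out of a device table like the one above. It assumes the column order shown here (Number, Major, Minor, RaidDevice, State) and reads the text on standard input. The function name removed_slots is made up for illustration and is not part of mdadm.

```shell
#!/bin/sh
# Print the RaidDevice number of every slot reported as "removed".
# Assumes the column order: Number Major Minor RaidDevice State.
removed_slots() {
    awk 'NF == 5 && $5 == "removed" && $1 ~ /^[0-9]+$/ { print $4 }'
}

# Sample rows copied from the mdadm --detail output above.
detail_rows='0 8 1 0 active sync /dev/sda1
1 0 0 1 removed
2 8 49 2 active sync /dev/sdd1'

printf '%s\n' "$detail_rows" | removed_slots
```

On a live system the same function could probably be fed directly with: mdadm --detail /dev/md1 | removed_slots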


Cheers
 
Old 08-13-2013, 03:06 PM   #4
Ser Olmy
OK, that looks good. (Please use [code] tags around information like that, as it makes it much more readable.)

Could you post the output from the following commands:
Code:
parted /dev/sda unit s print

parted /dev/sdb unit s print
 
Old 08-13-2013, 03:20 PM   #5
PeteLindsey
Well, for whatever reason parted doesn't respect the unit command (this is GNU parted version 1.6.22; perhaps an Openfiler issue).

Code:
parted /dev/sda print
Disk geometry for /dev/sda: 0.000-3815447.835 megabytes
Disk label type: gpt
Minor    Start       End     Filesystem  Name                  Flags
1          0.017 3815446.836

parted /dev/sda print-fdisk
Disk /dev/sda: 4000.8 GB, 4000787030016 bytes
255 heads, 63 sectors/track, 486401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1      486402  3907017544+   0  (Not managed by Openfiler)

parted /dev/sdb print
Disk geometry for /dev/sdb: 0.000-3815447.835 megabytes
Disk label type: gpt
Minor    Start       End     Filesystem  Name                  Flags
1          0.017 3815447.819

parted /dev/sdb print-fdisk


Disk /dev/sdb: 4000.8 GB, 4000787030016 bytes
255 heads, 63 sectors/track, 486401 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1      486402  3907018551+   0  (Not managed by Openfiler)
You'll note that sdb1 shows a few extra blocks. I did that while trying to overcome the "not large enough" problem. I now think that mdadm somehow sees /dev/sdb1 as corrupt, but I can find nothing wrong with it and no log to support that...
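
A rough sanity check of the sizes, written as shell arithmetic. It assumes the "Blocks" column above is in 1 KiB units (ignoring the trailing "+"), takes Array Size and Raid Devices from the earlier mdadm --detail output, and uses the RAID5 rule that usable capacity is (members - 1) times the per-member data size. The function name is illustrative only, and the check ignores the small amount of space the md superblock needs on each member.

```shell
#!/bin/sh
# Per-member data size for RAID5: array size divided by (members - 1).
raid5_member_kib() {
    # $1 = array size in KiB, $2 = number of raid devices
    echo $(( $1 / ($2 - 1) ))
}

array_kib=7814034688      # "Array Size" from mdadm --detail /dev/md1
raid_devices=3            # "Raid Devices" from the same output
sda1_kib=3907017544       # "Blocks" for /dev/sda1 (existing member)
sdb1_kib=3907018551       # "Blocks" for /dev/sdb1 (replacement)

need_kib=$(raid5_member_kib "$array_kib" "$raid_devices")

echo "per-member data size: $need_kib KiB"
echo "sda1 surplus: $(( sda1_kib - need_kib )) KiB"
echo "sdb1 surplus: $(( sdb1_kib - need_kib )) KiB"
echo "sdb1 minus sda1: $(( sdb1_kib - sda1_kib )) KiB"
```

By this arithmetic sdb1 is 1007 KiB larger than sda1, which already works as an array member, so the partition itself does not look too small.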

Thanks again!
 
Old 08-13-2013, 03:31 PM   #6
Ser Olmy
I'm running GNU parted 3.1 with a 2012 copyright. 1.6.22 must be positively ancient; it isn't even available from the official FTP server.

You're right about the partition size. The error message is probably misleading, and the real error may be something else entirely. Does the /dev/sdb1 device node even exist?

On my system, messages from "md" are written to /var/log/messages. Could you try running tail -f /var/log/messages or tail -f /var/log/syslog in one session window while you attempt to add /dev/sdb1 to the array in another?
 
Old 08-13-2013, 03:38 PM   #7
PeteLindsey
Yes, and strangely enough nothing shows up (when I assemble I see that it kicks the drive but still doesn't offer a reason why)

Code:
mdadm --add /dev/md1 /dev/sdb1

mdadm: set device faulty failed for /dev/sdb1:  No such device
from a previous assemble attempt
Code:
mdadm --assemble --force /dev/md1 /dev/sd[abd]1
Aug 13 11:11:18 filer kernel: [ 3493.090650] md: md1 stopped.
Aug 13 11:11:18 filer kernel: [ 3493.090662] md: unbind<sda1>
Aug 13 11:11:18 filer kernel: [ 3493.100066] md: export_rdev(sda1)
Aug 13 11:11:18 filer kernel: [ 3493.100122] md: unbind<sdd1>
Aug 13 11:11:18 filer kernel: [ 3493.116042] md: export_rdev(sdd1)
Aug 13 11:11:51 filer kernel: [ 3526.496360] md: md1 stopped.
Aug 13 11:11:51 filer kernel: [ 3526.518747] md: bind<sdb1>
Aug 13 11:11:51 filer kernel: [ 3526.518902] md: bind<sdd1>
Aug 13 11:11:51 filer kernel: [ 3526.518980] md: bind<sda1>
Aug 13 11:11:51 filer kernel: [ 3526.518999] md: kicking non-fresh sdb1 from array!
Aug 13 11:11:51 filer kernel: [ 3526.519043] md: unbind<sdb1>
Aug 13 11:11:51 filer kernel: [ 3526.528538] md: export_rdev(sdb1)
Aug 13 11:11:51 filer kernel: [ 3526.537870] raid5: device sda1 operational as raid disk 0
Aug 13 11:11:51 filer kernel: [ 3526.537872] raid5: device sdd1 operational as raid disk 2
Aug 13 11:11:51 filer kernel: [ 3526.538219] raid5: allocated 3176kB for md1
Aug 13 11:11:51 filer kernel: [ 3526.538222] raid5: raid level 5 set md1 active with 2 out of 3 devices, algorithm 2
Aug 13 11:11:51 filer kernel: [ 3526.538290] RAID5 conf printout:
Aug 13 11:11:51 filer kernel: [ 3526.538324]  --- rd:3 wd:2
Aug 13 11:11:52 filer kernel: [ 3526.538358]  disk 0, o:1, dev:sda1
Aug 13 11:11:52 filer kernel: [ 3526.538392]  disk 2, o:1, dev:sdd1
Aug 13 11:11:52 filer kernel: [ 3526.541166]  md1: unknown partition table
 
Old 08-13-2013, 03:42 PM   #8
PeteLindsey
obviously the last entry there, showing md1: unknown partition table is what it believes, but I can't figure why it feels the partition is no good?!?
 
Old 08-13-2013, 03:47 PM   #9
Ser Olmy
That error message is a red herring. /dev/md1 doesn't have a partition table at all, as it contains a file system. It's just the kernel checking every new device for the existence of a partition table and getting confused when it doesn't see one.

I'm more interested in the "no such device" message. Please run ls -l /dev/sd* and post the output.
 
Old 08-13-2013, 03:54 PM   #10
PeteLindsey
I wish I could see where the system has a problem accessing the drive... I can access the drive and its partition. I had used dd at one point in the troubleshooting to try to trick the system into believing the drive was the old one that had failed.

Code:
ls -l /dev/sd*
brw-r-----  1 root disk 8,  0 Aug 13 08:24 /dev/sda
brw-r-----  1 root disk 8,  1 Aug 13 08:24 /dev/sda1
brw-r-----  1 root disk 8, 16 Aug 13 08:24 /dev/sdb
brw-r-----  1 root disk 8, 17 Aug 13 08:24 /dev/sdb1
brw-r-----  1 root disk 8, 32 Aug 13 08:24 /dev/sdc
brw-r-----  1 root disk 8, 33 Aug 13 13:25 /dev/sdc1
brw-r-----  1 root disk 8, 34 Aug 13 08:24 /dev/sdc2
brw-r-----  1 root disk 8, 48 Aug 13 08:24 /dev/sdd
brw-r-----  1 root disk 8, 49 Aug 13 08:24 /dev/sdd1
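
As a cross-check on the listing above, here is a small sketch of the conventional numbering for SCSI disk nodes under major 8: each whole disk gets a block of 16 minors, and partition N is the disk's base minor plus N. It only covers sda through sdp with partitions 1 to 15, and the function name sd_minor is hypothetical.

```shell
#!/bin/sh
# Expected minor number for an sd device name such as "sdb1" or "sdd".
# Covers only sda..sdp (major 8) and partitions 1-15.
sd_minor() {
    rest=${1#sd}                 # e.g. "b1"
    part=${rest#?}               # partition number, empty for a whole disk
    letter=${rest%"$part"}       # disk letter, e.g. "b"
    # POSIX printf: a leading quote yields the character's numeric code.
    code=$(printf '%d' "'$letter")
    echo $(( (code - 97) * 16 + ${part:-0} ))
}

for dev in sda1 sdb1 sdc2 sdd1; do
    echo "$dev -> 8, $(sd_minor "$dev")"
done
```

The computed minors agree with the ls output, so the device nodes themselves look correctly numbered. On a live system, /sys/block/sdb/sdb1/dev should hold the major:minor pair the kernel actually uses for the partition.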
 
Old 08-13-2013, 04:08 PM   #11
Ser Olmy
You shouldn't try to copy anything from an old array component to a new one unless the array is non-functional. You'd be copying the RAID metadata, and there's no way that will contain valid information unless you actually manage to copy the entire partition without errors.

Perhaps the md driver picked up the partition and considered it part of a foreign array. What does mdadm --examine /dev/sdb1 return? Do you by any chance have any unexpected md devices in /dev? (ls -l /dev/md* should tell you)
 
Old 08-13-2013, 04:14 PM   #12
PeteLindsey
Because I had done the dd of the first 1000 blocks (I was trying to grab the GPT blocks), there was metadata from the old drive on the new one. After that didn't work, I ran mdadm --zero-superblock /dev/sdb1 to go back to normal.
Output of the first command:
Code:
mdadm --examine /dev/sdb1
mdadm: No md superblock detected on /dev/sdb1.
As far as additional md devices go, I don't think there is anything rogue here:
Code:
ls -l /dev/md*
brw-r-----  1 root disk 9, 1 Aug 13 08:24 /dev/md1
fyi the mdadm.conf is here:
Code:
cat /etc/mdadm.conf
ARRAY /dev/md1 level=raid5 num-devices=3 name=1 UUID=36ab8587:89ff7d55:cae2e38b:eb115cf8
   devices=/dev/sda1,/dev/sdb1,/dev/sdd1
 
Old 08-13-2013, 04:20 PM   #13
Ser Olmy
If you run these commands in sequence, what feedback do you get and what (if anything) appears in the log?
Code:
mdadm --manage /dev/md1 --fail /dev/sdb1

mdadm --manage /dev/md1 --remove /dev/sdb1

mdadm --add /dev/md1 --add /dev/sdb1
 
Old 08-13-2013, 04:27 PM   #14
PeteLindsey
Code:
mdadm --manage /dev/md1 --fail /dev/sdb1
mdadm: set device faulty failed for /dev/sdb1:  No such device
mdadm --manage /dev/md1 --remove /dev/sdb1
mdadm: hot remove failed for /dev/sdb1: No such device or address
mdadm --add /dev/md1 --add /dev/sdb1
mdadm: /dev/sdb1 not large enough to join array
output from /var/log/messages had no events
Code:
tail /var/log/messages
Aug 13 15:00:01 filer crond(pam_unix)[5445]: session opened for user root by (uid=0)
Aug 13 15:00:02 filer crond(pam_unix)[5445]: session closed for user root
Aug 13 15:00:02 filer crond(pam_unix)[5446]: session closed for user openfiler
Aug 13 15:01:01 filer crond(pam_unix)[5456]: session opened for user root by (uid=0)
Aug 13 15:01:01 filer crond(pam_unix)[5456]: session closed for user root
Aug 13 15:10:01 filer crond(pam_unix)[5478]: session opened for user root by (uid=0)
Aug 13 15:10:02 filer crond(pam_unix)[5478]: session closed for user root
Aug 13 15:20:01 filer crond(pam_unix)[5503]: session opened for user root by (uid=0)
Aug 13 15:20:01 filer crond(pam_unix)[5503]: session closed for user root
Aug 13 15:25:58 filer ntpd[3163]: synchronized to 50.116.27.42, stratum 2
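
To make "no events" easier to confirm, here is a sketch of a filter that keeps only kernel lines from the md and raid drivers, matching the message prefixes seen in the earlier assemble log. The function name md_events is illustrative, and the pattern is an assumption based on this thread's log format, not a complete list of md messages.

```shell
#!/bin/sh
# Keep only kernel log lines that look like md/raid driver messages.
# Reads log text on standard input; prints nothing if there are none.
md_events() {
    grep -E 'kernel: \[ *[0-9.]+\] +(md[0-9]*|raid[0-9]+|RAID[0-9]+)[: ]' || true
}

# Two lines taken from the logs quoted in this thread.
sample='Aug 13 11:11:51 filer kernel: [ 3526.518999] md: kicking non-fresh sdb1 from array!
Aug 13 15:25:58 filer ntpd[3163]: synchronized to 50.116.27.42, stratum 2'

printf '%s\n' "$sample" | md_events
```

Against the log itself this could be run as: tail -n 200 /var/log/messages | md_events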
 
Old 08-13-2013, 04:40 PM   #15
Ser Olmy
At least the source of the problem is clear: as far as mdadm is concerned, /dev/sdb1 simply does not exist or is invalid.

I ran the same commands on one of my servers, and running the "--fail" or "--remove" commands multiple times still results in the same "set /dev/xxx faulty" and "hot removed /dev/xxx" messages. You should not be getting a "No such device" message.

If blockdev --report /dev/sdb1 returns valid data, I must say I'm out of ideas right now. I guess you could try re-reading all partition tables with partprobe but other than that, all I can suggest is a reboot. Which I suspect you've already tried.

Last edited by Ser Olmy; 08-13-2013 at 04:46 PM.
 
  


All times are GMT -5.