LinuxQuestions.org
Old 08-08-2016, 07:31 AM   #1
davejones
LQ Newbie
 
Registered: Aug 2016
Posts: 13

Rep: Reputation: Disabled
Raid1 problem


I have recently bought a dedicated server. I could not afford a managed server and I'm willing to learn, but this problem has come too quickly for me, so excuse my ignorance.

Within a few weeks of getting the server, it's got a faulty drive in a 2-drive RAID. After what seems like days of reading I've managed to a) remove the wrong drive from the array and b) re-add it :-) I've now got part way to removing the faulty drive, but this won't work. Here is the printout; can anyone give me any help on what's wrong here? I need to remove drive sda as that's the faulty drive.


Filesystem Size Used Avail Use% Mounted on
/dev/md2 1008G 20G 937G 3% /
tmpfs 16G 0 16G 0% /dev/shm
/dev/md1 496M 35M 436M 8% /boot
/dev/md3 1.7T 5.3G 1.6T 1% /home

[root@svr1 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[2]
16777088 blocks super 1.0 [2/1] [_U]

md3 : active raid1 sdb5[1]
1839216960 blocks super 1.0 [2/1] [_U]
bitmap: 11/14 pages [44KB], 65536KB chunk

md2 : active raid1 sdb3[1]
1073741632 blocks super 1.0 [2/1] [_U]
bitmap: 6/8 pages [24KB], 65536KB chunk

md1 : active raid1 sda2[0]
524224 blocks [2/1] [U_]

unused devices: <none>
[root@svr1 ~]# mdadm --manage /dev/md1 --fail /dev/sda2
mdadm: set device faulty failed for /dev/sda2: Device or resource busy
[root@svr1 ~]# mdadm --manage /dev/md1 --remove /dev/sda2
mdadm: hot remove failed for /dev/sda2: Device or resource busy
[root@svr1 ~]# mdadm --manage /dev/md1 --stop
mdadm: Cannot get exclusive access to /dev/md1:Perhaps a running process, mounted filesystem or active volume group?
 
Old 08-08-2016, 08:02 AM   #2
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
None of your devices has more than one disk... you can't fail the last disk of a raid1. You have to install a new disk, partition it appropriately, and then add the appropriate partition to the raid1. Once you have two disks in a raid1 you can then fail one of the two out. If in the past you DID have more than two disks installed, the system has already removed the faulty one (though I thought that was a manual operation and not automatic; but I can see the possibility that it happened during a boot - I didn't test for that).

/dev/sda2 is in use by md1 - and there are no other disks in use.

md0, md2, and md3 are all on the SAME disk (/dev/sdb), so you have no redundancy anywhere. Lose that disk and you lose all three filesystems, and with no recovery possible.
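
To see at a glance which arrays are running on a single member, something like the following might help. It is only a sketch: the function name degraded_arrays is made up for illustration, and it assumes the usual /proc/mdstat layout where the status line ends in a bracketed map such as [UU] or [_U] (an underscore marks a missing member).

```shell
#!/bin/sh
# Hypothetical helper: list md arrays that are missing a member.
# Reads /proc/mdstat-style text on stdin and prints "<array> <member map>"
# for every array whose map contains "_".
degraded_arrays() {
    awk '
        # Remember the array name from lines like "md1 : active raid1 sda2[0]"
        /^md/ && $2 == ":" { name = $1 }
        # The status line holds the member map, e.g. "[2/1] [_U]"
        /blocks/ && match($0, /\[[U_]+\]/) {
            flags = substr($0, RSTART + 1, RLENGTH - 2)
            if (index(flags, "_") > 0) print name, flags
        }
    '
}

# Typical use on a live system:
#   degraded_arrays < /proc/mdstat
```

Any array it prints has no redundancy at the moment, so the one remaining member of that array must not be failed or removed.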
 
Old 08-08-2016, 08:31 AM   #3
davejones
LQ Newbie
 
Registered: Aug 2016
Posts: 13

Original Poster
Rep: Reputation: Disabled
I really don't understand raid at all do I :-)

This is how the raid was before I attempted to remove the faulty drive.


Software RAID:
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
16777088 blocks super 1.0 [2/2] [UU]

md3 : active raid1 sdb5[1]
1839216960 blocks super 1.0 [2/1] [_U]
bitmap: 11/14 pages [44KB], 65536KB chunk

md2 : active raid1 sdb3[1]
1073741632 blocks super 1.0 [2/1] [_U]
bitmap: 6/8 pages [24KB], 65536KB chunk

md1 : active raid1 sda2[0] sdb2[1]
524224 blocks [2/2] [UU]

unused devices:
Partition info:
Filesystem Size Used Avail Use% Mounted on
/dev/md2 1008G 20G 937G 3% /
tmpfs 16G 0 16G 0% /dev/shm
/dev/md1 496M 35M 436M 8% /boot
/dev/md3 1.7T 5.3G 1.6T 1% /home


From there I attempted to remove sdb by mistake.
I then added it back and it did rebuild.
The last thing I tried to do was remove sda and add grub to sdb.
I don't really know where I am now. Are you saying I can now get the host to physically replace the damaged drive sda?

Bear in mind that for other reasons I've not slept for 36 hours now :-)
 
Old 08-08-2016, 09:56 AM   #4
suicidaleggroll
LQ Guru
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 5,573

Rep: Reputation: 2142
Even in that output you have two "raid" arrays with only one partition in each, md2 and md3. Who set up this system originally? Is it possible sda had already ejected itself from those two arrays before you got that output? Do you have the mdadm status from when everything was working correctly?

If you want to remove sda, you'll need to add sdb2 back into md1, let it rebuild and sync, and then you can remove sda2 from md1. At that point, sda will not be in use by any arrays and can be removed from the system. If you were to remove sda now, you would lose md1 which contains your /boot partition.
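
The order of operations above could be sketched roughly as below. The helper names run and swap_out_member are invented for this example, and DRY_RUN defaults to 1 so nothing is executed unless you deliberately switch it off. Marking the old member faulty before removing it is an assumption added here, since mdadm generally will not hot-remove a member that is still active.

```shell
#!/bin/sh
# Hypothetical wrapper: print commands instead of running them
# unless DRY_RUN=0 is set explicitly.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# swap_out_member ARRAY GOOD_PARTITION BAD_PARTITION
swap_out_member() {
    array=$1
    good=$2
    bad=$3
    run mdadm --manage "$array" --add "$good"     # re-add the healthy partition
    run mdadm --wait "$array"                     # wait for any resync to finish
    run mdadm --manage "$array" --fail "$bad"     # mark the old member faulty
    run mdadm --manage "$array" --remove "$bad"   # then hot-remove it
}

# Example (prints the four commands, runs nothing):
#   swap_out_member /dev/md1 /dev/sdb2 /dev/sda2
```

Check /proc/mdstat between steps rather than trusting the script blindly; the wait step only helps if a resync was actually started by the add.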

Last edited by suicidaleggroll; 08-08-2016 at 09:58 AM.
 
Old 08-08-2016, 10:05 AM   #5
davejones
LQ Newbie
 
Registered: Aug 2016
Posts: 13

Original Poster
Rep: Reputation: Disabled
As I said, this is an unmanaged server, my first, so I installed the OS from the given image in the host admin area.

I don't have the details of the system before the HD went faulty, unless this is it:

Software RAID:
Personalities : [raid1]
md0 : active raid1 sda1[0] sdb1[1]
16777088 blocks super 1.0 [2/2] [UU]

md3 : active raid1 sdb5[1]
1839216960 blocks super 1.0 [2/1] [_U]
bitmap: 11/14 pages [44KB], 65536KB chunk

md2 : active raid1 sdb3[1]
1073741632 blocks super 1.0 [2/1] [_U]
bitmap: 6/8 pages [24KB], 65536KB chunk

md1 : active raid1 sda2[0] sdb2[1]
524224 blocks [2/2] [UU]

unused devices:
Partition info:
Filesystem Size Used Avail Use% Mounted on
/dev/md2 1008G 20G 937G 3% /
tmpfs 16G 0 16G 0% /dev/shm
/dev/md1 496M 35M 436M 8% /boot
/dev/md3 1.7T 5.3G 1.6T 1% /home





It's possible that I messed it up in my efforts.

I'll do as you suggest; this help is very much appreciated. Once I have this sorted I think I will set up some old hardware I have and really nail my understanding of RAID. Then perhaps set this server up correctly.

Last edited by davejones; 08-08-2016 at 10:07 AM.
 
Old 08-08-2016, 10:08 AM   #6
suicidaleggroll
LQ Guru
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 5,573

Rep: Reputation: 2142
No need for dedicated hardware for testing, just use virtual machines. Give your VM two disks, and then inside the VM you can partition and RAID them however you like. If you screw something up, just restore from a backup or snapshot.
 
Old 08-08-2016, 11:16 AM   #7
davejones
LQ Newbie
 
Registered: Aug 2016
Posts: 13

Original Poster
Rep: Reputation: Disabled
Still can not remove sda2


[root@svr1 ~]# mdadm --manage /dev/md1 --add /dev/sdb2

[root@svr1 ~]# cat /proc/mdstat

Personalities : [raid1]
md0 : active raid1 sdb1[2]
16777088 blocks super 1.0 [2/1] [_U]

md3 : active raid1 sdb5[1]
1839216960 blocks super 1.0 [2/1] [_U]
bitmap: 11/14 pages [44KB], 65536KB chunk

md2 : active raid1 sdb3[1]
1073741632 blocks super 1.0 [2/1] [_U]
bitmap: 6/8 pages [24KB], 65536KB chunk

md1 : active raid1 sdb2[1] sda2[0]
524224 blocks [2/2] [UU]

unused devices: <none>
[root@svr1 ~]# mdadm --manage /dev/md1 --remove /dev/sda2
mdadm: hot remove failed for /dev/sda2: Device or resource busy
[root@svr1 ~]#
 
Old 08-08-2016, 11:52 AM   #8
suicidaleggroll
LQ Guru
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 5,573

Rep: Reputation: 2142
Use "mdadm --detail /dev/md1" to see the current status of md1. It's likely using sda2 to rebuild sdb2. As I mentioned in my steps before:
Quote:
you'll need to add sdb2 back into md1, let it rebuild and sync, and then you can remove sda2 from md1
This is the output from a sync'd and clean raid 1:
Code:
# mdadm --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Fri Oct 17 10:00:28 2014
     Raid Level : raid1
     Array Size : 1953381184 (1862.89 GiB 2000.26 GB)
  Used Dev Size : 1953381184 (1862.89 GiB 2000.26 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Mon Aug  8 10:52:28 2016
          State : clean 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : gauss:0  (local to host gauss)
           UUID : 7b746352:9db52f98:c2b0b38e:fb041063
         Events : 5416

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1
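
If you would rather script that check than read the output by eye, a rough sketch follows. The function name array_in_sync is made up, and it assumes the "Raid Devices" and "active sync" wording shown in the output above.

```shell
#!/bin/sh
# Hypothetical check: read `mdadm --detail` output on stdin and succeed
# only when the number of "active sync" members equals "Raid Devices".
array_in_sync() {
    awk -F' *: *' '
        /Raid Devices/ { want = $2 + 0 }   # e.g. "Raid Devices : 2"
        /active sync/  { have++ }          # one line per fully synced member
        END { exit !(want > 0 && have == want) }
    '
}

# Example on a live system:
#   mdadm --detail /dev/md1 | array_in_sync && echo "all members in sync"
```

A member that is still rebuilding shows up as "spare rebuilding" instead of "active sync", so the check fails until the rebuild is done.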

Last edited by suicidaleggroll; 08-08-2016 at 11:54 AM.
 
Old 08-08-2016, 11:59 AM   #9
davejones
LQ Newbie
 
Registered: Aug 2016
Posts: 13

Original Poster
Rep: Reputation: Disabled
Filesystem Size Used Avail Use% Mounted on
/dev/md2 1008G 20G 937G 3% /
tmpfs 16G 0 16G 0% /dev/shm
/dev/md1 496M 35M 436M 8% /boot
/dev/md3 1.7T 5.3G 1.6T 1% /home

[root@svr1 ~]# mdadm --detail /dev/md1
/dev/md1:
Version : 0.90
Creation Time : Tue Jul 5 14:29:48 2016
Raid Level : raid1
Array Size : 524224 (511.94 MiB 536.81 MB)
Used Dev Size : 524224 (511.94 MiB 536.81 MB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 1
Persistence : Superblock is persistent

Update Time : Mon Aug 8 16:58:09 2016
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0

UUID : c3cfb83a:5b7936f1:776c2c25:004bd7b2
Events : 0.90

Number Major Minor RaidDevice State
0 8 2 0 active sync /dev/sda2
1 8 18 1 active sync /dev/sdb2
[root@svr1 ~]# mdadm --manage /dev/md1 --remove /dev/sda2
mdadm: hot remove failed for /dev/sda2: Device or resource busy
[root@svr1 ~]#
 
Old 08-08-2016, 12:01 PM   #10
suicidaleggroll
LQ Guru
 
Registered: Nov 2010
Location: Colorado
Distribution: OpenSUSE, CentOS
Posts: 5,573

Rep: Reputation: 2142
How about if you add the fail flag:
Code:
mdadm /dev/md1 --fail /dev/sda2 --remove /dev/sda2
 
Old 08-08-2016, 12:12 PM   #11
davejones
LQ Newbie
 
Registered: Aug 2016
Posts: 13

Original Poster
Rep: Reputation: Disabled
Ha ha ! that's got it thanks.

[root@svr1 ~]# cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb1[2]
16777088 blocks super 1.0 [2/1] [_U]

md3 : active raid1 sdb5[1]
1839216960 blocks super 1.0 [2/1] [_U]
bitmap: 11/14 pages [44KB], 65536KB chunk

md2 : active raid1 sdb3[1]
1073741632 blocks super 1.0 [2/1] [_U]
bitmap: 6/8 pages [24KB], 65536KB chunk

md1 : active raid1 sdb2[1]
524224 blocks [2/1] [_U]

unused devices: <none>

Now I need to contact the host to replace the drive. Is there anything else I should do first?

When they hand it back I THINK I have to do this. Is this OK?

sfdisk -d /dev/sdb | sfdisk --force /dev/sda

then add parts

mdadm /dev/md0 -a /dev/sda1
mdadm /dev/md1 -a /dev/sda2
mdadm /dev/md2 -a /dev/sda3
mdadm /dev/md3 -a /dev/sda5

grub-install /dev/sda
 
Old 08-08-2016, 03:08 PM   #12
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,912

Rep: Reputation: 1513
Yes, but make sure the partitioning is correct first (as in, verify the resulting partitions). The reason for verifying is that SOMETIMES (due to physical disk geometry) sizes will not match. Also there can be other things that affect it, such as a 4 KB block vs. a 512-byte block. A new disk with 4 KB blocks will not perform very well when it is treated as a 512-byte block device. It should work, but will be much slower than a real 512-byte block device. What happens is that the disk has to read the 4 KB block, update the appropriate 512-byte section, then write the entire 4 KB block back.

For a raid 1 (mirroring) they have to be at least the same size as the active partition.
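
One way to do that verification is to dump both partition tables with sfdisk -d and compare partition numbers and sizes. The sketch below assumes the dump lines look like "/dev/sdb1 : start= 2048, size= 33554432, ..."; the helper names partition_sizes and same_layout are invented for this example. It checks for identical sizes, which is stricter than the "at least the same size" rule.

```shell
#!/bin/sh
# Hypothetical helper: from an `sfdisk -d` dump on stdin, print
# "<partition number> <size in sectors>" for each partition line.
partition_sizes() {
    sed -n 's/^[^ ]*[^0-9]\([0-9][0-9]*\) *:.*size= *\([0-9][0-9]*\).*/\1 \2/p'
}

# same_layout OLD_DUMP NEW_DUMP
# Succeeds when both dumps list the same partitions with the same sizes.
same_layout() {
    [ "$(partition_sizes < "$1")" = "$(partition_sizes < "$2")" ]
}

# Example on a live system (file names are arbitrary):
#   sfdisk -d /dev/sdb > sdb.dump
#   sfdisk -d /dev/sda > sda.dump
#   same_layout sdb.dump sda.dump && echo "partition sizes match"
```

Only start adding partitions to the arrays once the comparison passes; if it does not, look at the two dumps side by side before going any further.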

NORMALLY, for something like this I would have expected a single raid 1 device, which is then partitioned for use.

It makes it simpler when a disk fails - only one raid device has to be dealt with as all partitions would be processed simultaneously.

The way it is, each raid device has to be handled separately, which increases the possibility of error.

Last edited by jpollard; 08-08-2016 at 03:27 PM.
 
Old 08-08-2016, 03:25 PM   #13
davejones
LQ Newbie
 
Registered: Aug 2016
Posts: 13

Original Poster
Rep: Reputation: Disabled
Got this to reference https://www.youtube.com/watch?v=jZp2IP27pcQ

Thanks for your help, I'll just wait for the drive to be replaced.

Last edited by davejones; 08-08-2016 at 03:29 PM.
 
  

