LinuxQuestions.org


stitcho84 01-19-2019 10:51 AM

degraded CentOS 7 RAID 1 server rebuild fails
 
Hi everybody!

I'm happy to have found such a good forum, and I hope you can help me with my problem...

I am responsible for an important server, but it is now in degraded mode. Not good... :(

Here is a screenshot of the mdstat and the lsblk output too.

https://www.directupload.net/file/d/...6zsr88_png.htm

When trying to start the rebuild I just get the message: Cannot open sda1: Device or resource busy...

I hope you can help me. Thanks a lot!!!

Greetings

Stitcho

EDIT: The picture is now in higher resolution...

Rickkkk 01-19-2019 10:55 AM

Hi stitcho84,

Welcome to LQ.

Could you upload a larger screenshot, please? The one you uploaded is so small that, even with zooming, it is unreadable.

Thanks.

dc.901 01-19-2019 04:33 PM

Please look at syslog and dmesg to see if they show anything about sda.
And have you looked at the smartctl output?
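
A minimal sketch of those checks (run as root; /dev/sda taken from the thread):

Code:

dmesg | grep -i sda                  # kernel messages mentioning the disk
grep -i sda /var/log/messages        # CentOS 7 keeps syslog here
smartctl -H /dev/sda                 # overall SMART health verdict
smartctl -a /dev/sda                 # full attribute and error-log dump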

stitcho84 01-19-2019 05:09 PM

Hi, Thanks for your answer!

In the dmesg output I found this:

Quote:

[ 1.854629] sd 2:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)
[ 1.854703] sd 2:0:0:0: [sda] Write Protect is off
[ 1.854704] sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 1.854727] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1.858682] sda: sda1 sda2
[ 1.953687] sd 2:0:0:0: [sda] Attached SCSI disk
and

Quote:

[ 17.299919] XFS (sda1): Mounting V5 Filesystem
[ 18.497296] XFS (sda1): Ending clean mount
SMART output of sda:
https://s15.directupload.net/images/190120/22sm577t.png

Greetings and thanks a lot!

syg00 01-19-2019 07:40 PM

According to the first listing, you are running off /dev/sda. Are you trying to create a new RAID environment?
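
A quick way to confirm which disk the system is actually running from:

Code:

findmnt /                             # device backing the root filesystem
lsblk -o NAME,TYPE,FSTYPE,MOUNTPOINT  # tree of disks, partitions, md devices
cat /proc/mdstat                      # state of any assembled md arrays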

stitcho84 01-20-2019 01:35 AM

Quote:

Are you trying to create a new RAID environment ?
No

The system was a RAID 1, weeks or even months ago... And now I have realised that the RAID is in degraded mode.

I believe the boot order was changed in the BIOS (accidentally), and now we have this situation...

Do I understand you right that the whole system is booting from sda? So theoretically I could remove sdb, because that drive only holds an old version of the system?

Maybe this helps:

https://s15.directupload.net/images/190120/sky8oiob.png

So somehow I have the mirror of sda on sdb? And sdb is still part of the RAID array?
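
One way to check whether sdb still carries RAID metadata (partition names are an assumption based on the screenshots):

Code:

# --examine prints the md superblock stored on a member partition, if any
mdadm --examine /dev/sdb1
mdadm --examine /dev/sdb2
# for comparison, sda should report no md superblock if it was never added
mdadm --examine /dev/sda1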

syg00 01-21-2019 10:58 PM

I've been thinking about how you got into this situation, and the only thing I can come up with is that your boot loader on /dev/sda had never been updated for the RAID, so you were always booting off /dev/sdb for the RAID support.
But I would expect lsblk to show /dev/sda as containing mdadm metadata if the arrays had been running successfully. And it doesn't explain why the initrd and fstab weren't complaining about missing md devices (md0 and md1). It's also possible you were always running degraded even when on mdadm - no way I/we can know that.

Basically I reckon you are going to have to proceed as if setting up the RAID1 from the beginning: copy the running system to /dev/sdb, boot it, then add /dev/sda to the arrays.
This is best done from a liveCD, with an outage window that allows for a couple of reboots and the data copy.
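
A hedged sketch of that sequence with mdadm, for orientation only - the partition names are guesses based on the thread's dmesg/lsblk output, a BIOS/MBR layout is assumed, and every step assumes a verified backup:

Code:

# 1. Copy sda's partition layout to sdb, then build degraded RAID1 arrays with
#    sdb as the only member ("missing" reserves the second slot for sda later)
sfdisk -d /dev/sda | sfdisk /dev/sdb       # MBR assumed; use sgdisk -R for GPT
mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdb1
mdadm --create /dev/md1 --level=1 --raid-devices=2 missing /dev/sdb2

# 2. Make a filesystem, mount it, and copy the running system across
#    (from a live CD, or with services quiesced)
mkfs.xfs /dev/md0
mount /dev/md0 /mnt
rsync -aAXH --exclude="/proc/*" --exclude="/sys/*" --exclude="/dev/*" \
      --exclude="/run/*" --exclude="/mnt/*" / /mnt/

# 3. Record the arrays, rebuild the initramfs so it can assemble them at boot,
#    and put a boot loader on sdb (run the chrooted steps with /dev, /proc and
#    /sys bind-mounted into /mnt)
mdadm --detail --scan >> /mnt/etc/mdadm.conf
# inside "chroot /mnt":  edit /etc/fstab to reference /dev/md0, then
#                        dracut -f  &&  grub2-install /dev/sdb

# 4. Boot from sdb, verify everything, then add the original disk to the arrays
mdadm --add /dev/md0 /dev/sda1
mdadm --add /dev/md1 /dev/sda2
cat /proc/mdstat                           # watch the resync

If the box boots UEFI rather than BIOS, the boot loader step is different (an EFI system partition per disk instead of grub2-install to the MBR).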

Note my sigline.

stitcho84 01-22-2019 02:51 AM

Dear syg00,

thanks for your answer.

I was now able to speak with the person who set up the system, and now I know that the RAID was never working correctly...

So that didn't help much :) But at least now we know...

My plan is: first I will make a clone of sda with Clonezilla. Then I will remove both HDDs from the computer and try to boot the clone. This should work.

Then I will use this clone and try to set up a new RAID 1 system from that HDD. I think doing it like that is the safest way...

Now I need some help with how to migrate a whole CentOS system to RAID 1. Can you recommend a good, working step-by-step manual?

I found several that start from a fresh installation, but I want to avoid that...

Greetings

Stitcho

syg00 01-22-2019 07:08 AM

With CentOS 7, I would advise using the built-in RAID features of LVM rather than using mdadm directly. Works a treat, and you can define policies to handle device failures automatically. After all this, it shouldn't be too hard to convince your boss to pay for a spare disk (or two) that can be brought in automatically in the event of a failure.
It's all documented in the RHEL LVM Administration guide, which you can download freely.

Edit: I haven't tried a RAID1 /boot under LVM ... might be interesting.
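
A minimal sketch of the LVM route (the volume group and LV names here are made up, not taken from this system):

Code:

# create a mirrored LV from scratch...
lvcreate --type raid1 -m 1 -L 100G -n data vg_system

# ...or add the second disk to the VG and convert an existing linear LV
vgextend vg_system /dev/sdb2
lvconvert --type raid1 -m 1 vg_system/root

# watch the mirror sync
lvs -a -o name,copy_percent,devices vg_system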

stitcho84 01-23-2019 01:12 AM

But LVM RAID 1 works when you set it up during the installation of CentOS 7. That is an easy way to configure a RAID 1, but of course you have to reinstall the operating system...

At the moment I am making a new backup of the system. Afterwards I will boot the system with only sda and check whether everything is working correctly. If that is the case, I will continue setting up the RAID 1.

I found the RHEL Administration guide, but it is really not that easy to set up the RAID 1 with it... Chapter 4.4.3 seems to be the correct one...

Greetings

Stitcho

dobradude45 01-23-2019 05:17 AM

Hi folks
Can you install (if it's not already installed) system-storage-manager, which provides the ssm command -- you might need to add the epel-release repository though.

Then, as root, post the output of ssm list.

The RAID devices should show up as /dev/mdX, depending on how you built the arrays.
Personally, unless you have decent hardware RAID (most consumer RAID cards are junk), I'd use the built-in software RAID (mdadm) and build the arrays again from scratch.

The output also shows the devices making up the array.
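
A hedged example of that (the EPEL step may not be needed, depending on which repos the box already has enabled):

Code:

yum install -y epel-release             # only if ssm isn't in the enabled repos
yum install -y system-storage-manager   # package providing the ssm command
ssm list                                # pools, volumes, and the devices behind any /dev/mdX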


You could try taking the devices offline and running xfs_repair.
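
For example (the filesystem must be unmounted first; the md device name is illustrative):

Code:

umount /dev/md0
xfs_repair -n /dev/md0    # -n: no-modify mode, only report problems
xfs_repair /dev/md0       # real repair once the dry run looks sane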

Personally I'd use mdadm manually when setting up the arrays -- there's plenty of easy-to-follow info on Google. The actual Red Hat documentation isn't always the best.

I'm also not a big fan of LVM - it's almost impossible to repair if it breaks - however YMMV -- LVM is fairly resilient so long as the underlying file system is XFS.

Note: if you are rebuilding the OS manually on a UEFI system, don't forget to create, during the install process, an EFI system partition in addition to /boot - the drop-down will give you the correct details. You only need to make these partitions 500 MB each.
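
If you end up partitioning by hand instead of through the installer drop-downs, a rough equivalent (GPT and illustrative device names assumed, sizes as suggested above):

Code:

sgdisk -n 1:0:+500M -t 1:ef00 -c 1:"EFI System" /dev/sda   # /boot/efi (ESP)
sgdisk -n 2:0:+500M -t 2:8300 -c 2:"boot" /dev/sda         # /boot
mkfs.vfat -F 32 /dev/sda1
mkfs.xfs /dev/sda2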

Cheers
-D

stitcho84 05-22-2019 03:53 AM

Hi folks!

Just to inform you: the final solution was to completely reinstall the whole system... OK, this was a lot of work, but finally the server is now running fine :)

Thanks for your help!!!

Greetings

Stitcho

