LinuxQuestions.org
Welcome to the most active Linux Forum on the web.
Old 08-11-2018, 07:37 AM   #1
LQParsons
Member
 
Registered: Feb 2010
Location: North of Boston, Mass (North Shore/Cape Ann)
Distribution: CentOS 7.0 (and kvm/qemu)
Posts: 65

Rep: Reputation: 11
CentOS Software RAID:1 disk replacement


Hi.
I have two 2TiB drives. When I installed CentOS on my new machine oh so many years ago, I chose to let it set up a software RAID, and then to let LVM handle everything. Both were set up using the usual incantations; I didn't make a lot of choices.

The RAID is checked occasionally, and comes up clean.
However, logwatch is giving me lots of UC (uncorrectable) errors on my sdb.
So perhaps I should change it out.
This link seems to describe a clear and simple procedure to follow.
Code:
https://linuxadminonline.com/replace-faulty-hard-disk-software-raid-1-centos-7/
My question is, does my use of LVM complicate matters?
Is there something else I need to do before or after?

My mdadm output follows:
Code:
$ sudo mdadm --detail /dev/md127

/dev/md127:
           Version : 1.2
     Creation Time : Mon Jun 15 20:18:05 2015
        Raid Level : raid1
        Array Size : 1952870400 (1862.40 GiB 1999.74 GB)
     Used Dev Size : 1952870400 (1862.40 GiB 1999.74 GB)
      Raid Devices : 2
     Total Devices : 2
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Mon Aug  6 09:33:52 2018
             State : clean 
    Active Devices : 2
   Working Devices : 2
    Failed Devices : 0
     Spare Devices : 0

Consistency Policy : bitmap

              Name : localhost:pv00
              UUID : 3683bca4:d82b68ff:fa27cb84:d69dd1b1
            Events : 698350

    Number   Major   Minor   RaidDevice State
       0       8        2        0      active sync   /dev/sda2
       1       8       17        1      active sync   /dev/sdb1
Attached Files
File Type: txt mdadm.txt (903 Bytes, 6 views)
File Type: txt fdisk.txt (1.7 KB, 6 views)
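The `mdadm --detail` report above shows both members active. As a rough sketch, that check can be scripted by comparing the device counts in the report. `md_summary` is a hypothetical helper name, not part of mdadm, and the field labels are assumed to match the output shown above. A clean array only says the mirror is in sync; on its own it does not rule out the drive-level errors that logwatch reports.

```shell
# Sketch: summarise an md array from the text printed by `mdadm --detail`.
# Reads that text on stdin. md_summary is a hypothetical name, not an mdadm tool.
md_summary() {
    awk -F' : ' '
        /Raid Devices/   { raid = $2 + 0 }     # members the array expects
        /Active Devices/ { active = $2 + 0 }   # members currently in sync
        /Failed Devices/ { failed = $2 + 0 }   # members marked faulty
        END {
            if (raid == 0)
                print "unknown: no device counts found"
            else if (active == raid && failed == 0)
                print "healthy: " active "/" raid " members active"
            else
                print "degraded: " active "/" raid " members active, " failed " failed"
        }'
}
```

Usage would be along the lines of `sudo mdadm --detail /dev/md127 | md_summary`.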
 
Old 08-11-2018, 10:51 AM   #2
jlinkels
LQ Guru
 
Registered: Oct 2003
Location: Bonaire, Leeuwarden
Distribution: Debian /Jessie/Stretch/Sid, Linux Mint DE
Posts: 5,186

Rep: Reputation: 1037
You should check that your physical volume is on the md device (/dev/md127 in your output), just to be sure. The normal way is to create LVM on top of RAID, not the other way around. Use pvdisplay.
It also seems that your boot partition is on /dev/sda1. The RAID comprises sda2 and sdb1, so it seems that you can safely replace sdb.
Note that mdadm is very resilient. If you fail one disk and things go wrong with the new disk, you can re-install and re-add the old disk, and that is surprisingly successful.
As always, I recommend creating a VM and simulating the complete process in a test environment before doing this in production.
I don't see a boot partition on /dev/sdb. That means you cannot boot if /dev/sda fails. This should have been covered when the RAID was set up during installation, but apparently it was not.

jlinkels
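A minimal sketch of the member swap discussed above, written as a dry run that only prints the commands. `replace_member_plan` and the default path for the saved partition layout are hypothetical. It assumes an MBR (dos) partition table (for GPT a different tool such as sgdisk would be needed) and that the new disk shows up under the same device name. If the LVM physical volume sits on the md device, the swap happens below LVM, so the sketch contains no LVM steps.

```shell
# Dry run: prints the commands for replacing one RAID1 member, runs nothing.
# replace_member_plan is a hypothetical helper name.
# Assumptions: MBR (dos) partition table; new disk reuses the old device name.
replace_member_plan() {
    array=$1                           # md device, e.g. /dev/md127
    member=$2                          # member partition, e.g. /dev/sdb1
    disk=$3                            # whole disk, e.g. /dev/sdb
    table=${4:-/root/old-disk.parts}   # hypothetical file for the saved layout

    cat <<EOF
sfdisk -d $disk > $table
mdadm --manage $array --fail $member
mdadm --manage $array --remove $member
# power off, swap the physical disk, boot again
sfdisk $disk < $table
mdadm --manage $array --add $member
cat /proc/mdstat
EOF
}
```

Calling `replace_member_plan /dev/md127 /dev/sdb1 /dev/sdb` prints the sequence for the array in this thread; each printed command would still need checking against the real system before being run as root.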
 
1 members found this post helpful.
Old 08-12-2018, 03:20 PM   #3
LQParsons
Member
 
Registered: Feb 2010
Location: North of Boston, Mass (North Shore/Cape Ann)
Distribution: CentOS 7.0 (and kvm/qemu)
Posts: 65

Original Poster
Rep: Reputation: 11
Hi.
Thanks.
I can't be sure of the order in which I built things; it was so long ago.
I suspect the installer does it correctly, since neophytes at building a CentOS system, like myself at the time, would just follow its line of questioning.

As to /dev/md, this is what I get, and it's the only 'md' device.

Code:
sudo pvdisplay
  --- Physical volume ---
  PV Name               /dev/md127
  VG Name               centos
  PV Size               <1.82 TiB / not usable 4.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              476774
  Free PE               31
  Allocated PE          476743
  PV UUID               mq3sH4-h8wr-ciOf-BsQ1-Tnh5-6wNL-qVufCT
The boot worries me as well.
The "fdisk.txt" attached to the original post shows /dev/sda2 equivalent in size to /dev/sdb1, so those two are mirrored. That leads me to believe I can easily replace my second 2TiB drive (which, fortunately, is the one giving me problems at the moment), but I'm S*OuttaLuck if I need to replace my first drive, because the second one has no "boot" partition. I was hoping I was misreading something, but it seems you've confirmed my fears.
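The pvdisplay check can be sketched as a small filter. `pv_backing_devices` and `pe_to_mib` are hypothetical helper names, and the parsing assumes the `PV Name` label format shown above. The second helper is plain arithmetic: 476774 extents of 4 MiB come to 1907096 MiB, which is the array size reported by mdadm (1952870400 KiB = 1907100 MiB) minus the 4.00 MiB that pvdisplay lists as not usable.

```shell
# pv_backing_devices: hypothetical helper. Reads `pvdisplay` text on stdin and
# prints each PV name, tagged by whether it looks like an md (software RAID) device.
pv_backing_devices() {
    awk '$1 == "PV" && $2 == "Name" {
        kind = ($3 ~ /^\/dev\/md/) ? "md-raid" : "other"
        print $3, kind
    }'
}

# pe_to_mib: hypothetical helper. Total MiB for <extent count> extents of <MiB each>.
pe_to_mib() {
    echo $(( $1 * $2 ))
}
```

For example, `sudo pvdisplay | pv_backing_devices` should list `/dev/md127 md-raid` on a layout like the one above, and `pe_to_mib 476774 4` gives the extent total in MiB.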
 
Old 08-12-2018, 03:38 PM   #4
LQParsons
Member
 
Registered: Feb 2010
Location: North of Boston, Mass (North Shore/Cape Ann)
Distribution: CentOS 7.0 (and kvm/qemu)
Posts: 65

Original Poster
Rep: Reputation: 11
I'd love to set up a VM and test before I go live, but my 'raid' is my physical system.
Essentially the only thing I do when I physically boot the system, other than check root's email and do a
Code:
# yum update
weekly, is run the KVM (initiates at boot) then with
Code:
virt-manager
access the VMs that I use for work-stations.

If it's of any interest/use, my disk in my Linux VM looks like this:

Code:
sudo fdisk -l

Disk /dev/vda: 34.4 GB, 34359738368 bytes, 67108864 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x0008c089

   Device Boot      Start         End      Blocks   Id  System
/dev/vda1   *        2048     1026047      512000   83  Linux
/dev/vda2         1026048    67108863    33041408   8e  Linux LVM

Disk /dev/mapper/centos_gamgee-root: 30.3 GB, 30349983744 bytes, 59277312 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/mapper/centos_gamgee-swap: 3435 MB, 3435134976 bytes, 6709248 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Looks like I'm sunk if my primary TiB drive goes bad.
 
Old 08-12-2018, 07:46 PM   #5
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 17,157

Rep: Reputation: 2626
It's only a boot partition - the data should be safe. And of course, you could always restore your backup.
It is interesting that anaconda would build a system like that when a boot partition was allocated. Let's see this.
Code:
lsblk -f
 
2 members found this post helpful.
Old 08-12-2018, 08:19 PM   #6
jlinkels
LQ Guru
 
Registered: Oct 2003
Location: Bonaire, Leeuwarden
Distribution: Debian /Jessie/Stretch/Sid, Linux Mint DE
Posts: 5,186

Rep: Reputation: 1037
As syg00 said, it is only a boot sector that is missing. It is just annoying: if the boot disk fails, you'd have more work and it would take more time to get a running system again. Fortunately, "not booting" is a small problem in Linux.

As for the VM, you could build one on this very system, the one with the failing disk. With 10GB of space you have plenty to install CentOS and a few RAID disks. It is all in the VDI or VMX file, remember?

jlinkels
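A sketch of preparing small virtual disks for such a practice VM. VDI and VMX are VirtualBox and VMware formats; on a KVM host like the one in this thread the images would more likely be qcow2 files, which is what this assumes. `practice_disks_plan`, the `raidtest-disk` file names, and the sizes are all hypothetical, and the function only prints the `qemu-img` commands rather than running them.

```shell
# Dry run: prints qemu-img commands for N small qcow2 disks, runs nothing.
# practice_disks_plan and the raidtest-disk names are hypothetical.
practice_disks_plan() {
    dir=$1            # directory for the images
    count=${2:-2}     # number of disks, default 2 for a RAID1 pair
    size=${3:-4G}     # virtual size of each disk (illustrative)
    i=1
    while [ "$i" -le "$count" ]; do
        echo "qemu-img create -f qcow2 $dir/raidtest-disk$i.qcow2 $size"
        i=$((i + 1))
    done
}
```

The printed images could then be attached to a test guest in virt-manager and used to rehearse failing, removing, and re-adding a member.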
 
Old 08-13-2018, 10:10 AM   #7
LQParsons
Member
 
Registered: Feb 2010
Location: North of Boston, Mass (North Shore/Cape Ann)
Distribution: CentOS 7.0 (and kvm/qemu)
Posts: 65

Original Poster
Rep: Reputation: 11
Code:
sudo lsblk -f
[sudo] password for petc: 
NAME              FSTYPE            LABEL          UUID                                   MOUNTPOINT
sda                                                                                       
├─sda1            xfs                              5ce4e9c8-6a9f-49d8-b82d-a40a26c39383   /boot
└─sda2            linux_raid_member localhost:pv00 3683bca4-d82b-68ff-fa27-cb84d69dd1b1   
  └─md127         LVM2_member                      mq3sH4-h8wr-ciOf-BsQ1-Tnh5-6wNL-qVufCT 
    ├─centos-swap swap                             b5da9226-c983-48e0-90cc-1fa7f6a62600   [SWAP]
    ├─centos-root xfs                              2eca4343-7f4b-411c-a260-f96fe60c0d1b   /
    └─centos-home xfs                              23c9f0a4-ebf0-4b12-8dbc-599020e6e4b8   /home
sdb                                                                                       
└─sdb1            linux_raid_member localhost:pv00 3683bca4-d82b-68ff-fa27-cb84d69dd1b1   
  └─md127         LVM2_member                      mq3sH4-h8wr-ciOf-BsQ1-Tnh5-6wNL-qVufCT 
    ├─centos-swap swap                             b5da9226-c983-48e0-90cc-1fa7f6a62600   [SWAP]
    ├─centos-root xfs                              2eca4343-7f4b-411c-a260-f96fe60c0d1b   /
    └─centos-home xfs                              23c9f0a4-ebf0-4b12-8dbc-599020e6e4b8   /home
sr0
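The same information can be pulled out of lsblk in a script-friendly form. This is a sketch that assumes the key="value" pairs format printed by `lsblk -P -o NAME,FSTYPE,MOUNTPOINT`, with the columns in that order; `raid_members` and `boot_mounts` are hypothetical helper names.

```shell
# raid_members: hypothetical helper. Reads `lsblk -P -o NAME,FSTYPE,MOUNTPOINT`
# output on stdin and prints the names of partitions that are md members.
raid_members() {
    sed -n 's/^NAME="\([^"]*\)" FSTYPE="linux_raid_member".*/\1/p'
}

# boot_mounts: hypothetical helper. Prints the names of devices mounted at /boot.
boot_mounts() {
    sed -n 's/^NAME="\([^"]*\)".* MOUNTPOINT="\/boot"[[:space:]]*$/\1/p'
}
```

On the layout above, `lsblk -P -o NAME,FSTYPE,MOUNTPOINT | raid_members` would be expected to list sda2 and sdb1, while `boot_mounts` would list only sda1, which is the gap discussed in this thread.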
 
Old 08-13-2018, 10:15 AM   #8
jlinkels
LQ Guru
 
Registered: Oct 2003
Location: Bonaire, Leeuwarden
Distribution: Debian /Jessie/Stretch/Sid, Linux Mint DE
Posts: 5,186

Rep: Reputation: 1037
Quote:
Originally Posted by syg00
Code:
lsblk -f
Woooow... that is a nice command!

jlinkels
 
1 members found this post helpful.
Old 08-13-2018, 10:19 AM   #9
LQParsons
Member
 
Registered: Feb 2010
Location: North of Boston, Mass (North Shore/Cape Ann)
Distribution: CentOS 7.0 (and kvm/qemu)
Posts: 65

Original Poster
Rep: Reputation: 11
Thanks.

Code:
Like syg00 it is only a boot sector which is missing. 
It is just annoying if the boot disk fails you'd have more work and it takes more time to 
get a running system again. 
Fortunately "not booting" is a small problem in Linux.

As for the VM, you'd be able to build a VM on this very system on which you have the 
failing disk problem. 
With 10GB space you have plenty to install Centos and a few RAID disks. 
It is all in the VDI or VMX file, remember?

jlinkels
I'll keep a link to this discussion in my notes.
So far, the uncorrectable errors on the 'spare' disks are annoying.
When I'm ready, later in the Fall, I'll start doing the step-by-step as you recommend.

Thank you for your help, counsel and advice.
I'll NOT mark this 'solved', as it won't be solved until I actually do it -- I may have further questions later.
(Unless you'd rather I do otherwise.)

Enjoy.
-d
 
Old 08-13-2018, 09:09 PM   #10
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 17,157

Rep: Reputation: 2626
It's not just the MBR code that would need replacing. The boot partition will need creating, and the grub package itself re-installed to get the code installed on the second disk as well. Then grub2-install (I assume; Fedora uses that naming), then grub2-mkconfig.

Presumably there is sufficient free space on that second disk that you could allocate a partition for /boot to be mirrored to. It's not trivial, but you could arrange a RAID1 set yourself for /boot after that; the MBR would still need updating on that (second) disk as well. Not sure about dracut for the booting. There must be some documentation on the web somewhere.

Last edited by syg00; 08-14-2018 at 02:05 AM. Reason: hopefully add some clarity
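A rough sketch of the last two steps named above, as a dry run that only prints the commands. It assumes a BIOS/MBR boot and the usual CentOS 7 config path `/boot/grub2/grub.cfg`, and it only makes sense once a copy or mirror of /boot already exists on the second disk, since the boot code alone cannot start the system without it. `second_disk_boot_plan` is a hypothetical helper name.

```shell
# Dry run: prints the GRUB 2 commands for making a second disk bootable.
# second_disk_boot_plan is a hypothetical name.
# Assumptions: BIOS/MBR boot, config at /boot/grub2/grub.cfg, and /boot
# already copied or mirrored onto the second disk.
second_disk_boot_plan() {
    disk=$1    # whole disk that should receive the boot code, e.g. /dev/sdb
    cat <<EOF
grub2-install $disk
grub2-mkconfig -o /boot/grub2/grub.cfg
EOF
}
```

`second_disk_boot_plan /dev/sdb` prints the two commands for the disk in this thread; partitioning and mirroring /boot are deliberately left out of the sketch.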
 
  


All times are GMT -5.