LinuxQuestions.org


thund3rstruck 05-03-2020 08:13 AM

System Problem Detected On Login [RAID]
 
Hi all,

Approx one week ago I built a new home file server, assembling 3x 10 TB disks into a single RAID5 mdadm volume (/mnt/md0). Array has been working great for the week and I transferred my ~15TB of data from the old server to this one.

This morning I am getting an error on login 'System program problem detected'.

Per the screenshot above, it seems the mdcheck_start.service is failing due to MD array scrubbing.

What does this mean exactly?

UPDATE

It seems this mdcheck_start daemon is trying to launch a script in the path /usr/share/mdadm/mdcheck. Ubuntu 20.04 does not ship with this file and the mdadm packages do not install this file.

Is this service even valid? Should I disable it?

Code:

systemctl status mdcheck_start.service
● mdcheck_start.service - MD array scrubbing
    Loaded: loaded (/lib/systemd/system/mdcheck_start.service; static; vendor preset: enabled)
    Active: failed (Result: exit-code) since Sun 2020-05-03 09:18:05 EDT; 5min ago
TriggeredBy: ● mdcheck_start.timer
    Process: 196602 ExecStart=/usr/share/mdadm/mdcheck --duration $MDADM_CHECK_DURATION (code=exited, status=203/EXEC)
  Main PID: 196602 (code=exited, status=203/EXEC)

May 03 09:18:05 BAILEYFS02 systemd[1]: Starting MD array scrubbing...
May 03 09:18:05 BAILEYFS02 systemd[196602]: mdcheck_start.service: Failed to execute command: No such file or directory
May 03 09:18:05 BAILEYFS02 systemd[196602]: mdcheck_start.service: Failed at step EXEC spawning /usr/share/mdadm/mdcheck: No such file or directory
May 03 09:18:05 BAILEYFS02 systemd[1]: mdcheck_start.service: Main process exited, code=exited, status=203/EXEC
May 03 09:18:05 BAILEYFS02 systemd[1]: mdcheck_start.service: Failed with result 'exit-code'.
May 03 09:18:05 BAILEYFS02 systemd[1]: Failed to start MD array scrubbing.
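
For readers hitting the same log: status=203/EXEC is how systemd reports that it could not execute the ExecStart command at all, which fits the "No such file or directory" lines above. A minimal sketch for confirming this from the unit file follows. The helper name check_exec_path is hypothetical and not part of mdadm or systemd, and it only looks at the first ExecStart= line.

```shell
#!/bin/sh
# check_exec_path (hypothetical helper): read the first ExecStart= line of a
# unit file and report whether the program it names exists and is executable.
check_exec_path() {
    unit_file=$1
    # Drop the "ExecStart=" key, any systemd prefix characters (-, @, +, !, :),
    # and everything after the first space (the arguments).
    exe=$(sed -n 's/^ExecStart=[-@+!:]*\([^ ]*\).*/\1/p' "$unit_file" | head -n 1)
    if [ -z "$exe" ]; then
        echo "no ExecStart found"
        return 2
    fi
    if [ -x "$exe" ]; then
        echo "ok: $exe"
    else
        echo "missing or not executable: $exe"
        return 1
    fi
}
```

With the unit path shown in the status output, the call would be check_exec_path /lib/systemd/system/mdcheck_start.service.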


shruggy 05-03-2020 09:34 AM

Looks like Ubuntu just copied this patch from OpenSUSE without much thought. The mdadm.spec from OpenSUSE contains this line:
Code:

install -m 755 misc/mdcheck %{buildroot}/usr/share/mdadm/mdcheck
Moreover, while I see the file SUSE-mdadm_env.sh in Ubuntu source, it doesn't seem to be installed. OpenSUSE package has another patch for this. And another bunch of patches for mdcheck (1, 2, 3, 4).

thund3rstruck 05-03-2020 09:47 AM

Quote:

Originally Posted by shruggy (Post 6118748)
Looks like Ubuntu just copied this patch from OpenSUSE without much thought. The mdadm.spec from OpenSUSE contains this line:
Code:

install -m 755 misc/mdcheck %{buildroot}/usr/share/mdadm/mdcheck
Moreover, while I see the file SUSE-mdadm_env.sh in Ubuntu source, it doesn't seem to be installed. OpenSUSE package has another patch for this. And another bunch of patches for mdcheck (1, 2, 3, 4).

Thanks for the detailed response!

What should/can I do on my end to resolve this or is this something that I have to wait for Ubuntu to fix on their end?

I'm guessing it's kind of important for mdadm to be periodically scrubbing my array to keep it healthy, which is not currently happening...
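
Until the timer works, a scrub can also be requested by hand through the kernel's md sysfs interface: writing "check" to an array's sync_action file starts a read-only consistency check. The sketch below assumes the array is named md0 (the thread only gives the mount point /mnt/md0, so the device name is a guess). start_scrub and the SYSFS_ROOT override are hypothetical names used for illustration.

```shell
#!/bin/sh
# start_scrub (hypothetical helper): ask the kernel to run a "check" pass on
# one md array. SYSFS_ROOT is an illustration-only override so the function
# can be tried against a fake directory tree; it defaults to /sys.
start_scrub() {
    md=$1
    action_file="${SYSFS_ROOT:-/sys}/block/$md/md/sync_action"
    if [ ! -w "$action_file" ]; then
        echo "cannot write $action_file (wrong array name, or not root?)" >&2
        return 1
    fi
    state=$(cat "$action_file")
    # Do not interrupt a resync, recovery, or check that is already running.
    if [ "$state" != "idle" ]; then
        echo "array $md is busy: $state"
        return 0
    fi
    echo check > "$action_file"
    echo "check requested on $md"
}
```

Run as root, start_scrub md0 would kick off the check, cat /proc/mdstat shows its progress, and /sys/block/md0/md/mismatch_cnt holds the mismatch count once it finishes.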

shruggy 05-03-2020 10:24 AM

Well, nothing prevents you from downloading the OpenSUSE package, extracting the relevant files (those would be two shell scripts, /usr/lib/mdadm/mdadm_env.sh and /usr/share/mdadm/mdcheck as well as /etc/sysconfig/mdadm), adjusting them to your liking (e.g. changing /etc/sysconfig to /etc/default as is customary on Debian-based distros) and using them.
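
A rough sketch of that approach, for illustration only. The package file name in the comment is a placeholder, and debianize_paths is a hypothetical helper. It only rewrites the config path mentioned above, and the scripts may need other adjustments that this does not cover.

```shell
#!/bin/sh
# An rpm can be unpacked into the current directory without installing it.
# The file name below is a placeholder, not a real package version:
#   rpm2cpio mdadm-VERSION.rpm | cpio -idm

# debianize_paths (hypothetical helper): print a copy of a script with the
# SUSE config location replaced by the Debian-style one.
debianize_paths() {
    sed 's|/etc/sysconfig/mdadm|/etc/default/mdadm|g' "$1"
}
```

For example, debianize_paths usr/share/mdadm/mdcheck > mdcheck.debian writes an adjusted copy and leaves the extracted original untouched for comparison.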

thund3rstruck 05-03-2020 12:53 PM

Quote:

Originally Posted by shruggy (Post 6118775)
Well, nothing prevents you from downloading the OpenSUSE package, extracting the relevant files (those would be two shell scripts, /usr/lib/mdadm/mdadm_env.sh and /usr/share/mdadm/mdcheck as well as /etc/sysconfig/mdadm), adjusting them to your liking (e.g. changing /etc/sysconfig to /etc/default as is customary on Debian-based distros) and using them.

This RAID contains 20 years of vital family and business data and I'm not willing to experiment with various untested solutions. I need something stable (verified and tested), which is what I thought long-term support meant.

Thanks again for the responses!

Update

I found the script they forgot to include here.

I made it executable, copied it to the appropriate directory, and started the service. The service now starts, though I am still getting this stupid error dialog on login. Whatever is causing it isn't related to this broken service, as the systemctl --failed command shows no failed services now.
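
For anyone repeating these steps, they can be sketched roughly as follows. The destination is the path from the ExecStart line in the status output above. install_mdcheck and the DESTDIR override are hypothetical names for illustration, and the source file is whatever copy of mdcheck you downloaded.

```shell
#!/bin/sh
# install_mdcheck (hypothetical helper): copy a downloaded mdcheck script to
# the path the unit expects, with mode 755. DESTDIR is an optional prefix for
# a dry run into a scratch directory; leave it unset for a real install.
install_mdcheck() {
    script=$1
    dest="${DESTDIR:-}/usr/share/mdadm/mdcheck"
    mkdir -p "$(dirname "$dest")" &&
    install -m 755 "$script" "$dest" &&
    echo "installed $dest"
}
```

After running install_mdcheck ./mdcheck as root, systemctl start mdcheck_start.service followed by systemctl --failed should show whether the unit still fails.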

syg00 05-03-2020 07:13 PM

Open a bug against Ubuntu so others don't get bitten. As for the annoying pop-up, read this.


All times are GMT -5.