LinuxQuestions.org


quantox 09-20-2020 12:04 PM

What does a dm_integrity mdadm RAID1 error look like?
 
I've recently started running a dm_integrity mdadm RAID1 array on my Debian 10 system. It seems to be working OK, with no obvious errors.

If it does detect an error, what will it look like? I just want to know what sort of thing to look for in my logs. When I start it up and scrub the array I see a pair of messages like this in /var/log/kern.log:

Sep 20 10:53:44 kernel: [ 359.689681] md: data-check of RAID array md127
Sep 20 11:20:37 kernel: [ 1972.998119] md: md127: data-check done.

I'm guessing that if the scrub detected an error, some sort of message would appear between these lines, so I could look for lines containing "md:" and/or "md127". Or should I look for those strings *anywhere* in kern.log, since presumably an error might be detected outside of the active scrub? It would be really useful to have something specific to search for, to avoid the distraction of general info messages.

smallpond 10-02-2020 07:58 AM

The low-level error will usually be detected first by the disk driver, then it will percolate up to md. So something like:

Code:

kernel: end_request: I/O error, dev sdb, sector 2045799
kernel: raid1: Disk failure on sdb5, disabling device
Operation continuing on 1 devices

Set up mdadm in monitor mode to send a message when an event occurs. I hear all the time from people who set up RAID and their system fails. When you look in the logs, the first drive failed 6 months ago, but they weren't checking.
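
For anyone who wants to grep for messages like the two quoted above rather than read the whole log, here is a minimal Python sketch. The patterns are copied from the sample lines in this post only; kernels word these messages differently across versions, so treat the list as an assumption to adjust, not a complete catalogue. The function name find_error_lines and the log path in the usage note are illustrative.

```python
import re

# Patterns derived from the sample messages in this thread. They are an
# assumption: other kernel versions may phrase these messages differently.
ERROR_PATTERNS = [
    re.compile(r"I/O error, dev \w+, sector \d+"),
    re.compile(r"Disk failure on \w+, disabling device"),
]


def find_error_lines(lines):
    """Return the log lines that match any of the error patterns."""
    return [
        line.rstrip("\n")
        for line in lines
        if any(pattern.search(line) for pattern in ERROR_PATTERNS)
    ]


if __name__ == "__main__":
    sample = [
        "kernel: end_request: I/O error, dev sdb, sector 2045799",
        "kernel: raid1: Disk failure on sdb5, disabling device",
        "Operation continuing on 1 devices",
    ]
    for hit in find_error_lines(sample):
        print(hit)
```

To run it against a real log, pass an open file instead of the sample list, for example find_error_lines(open("/var/log/kern.log")).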

quantox 10-02-2020 10:04 AM

Thanks smallpond, that's exactly the sort of thing I was hoping for.

lvm_ 10-04-2020 05:06 AM

The "data-check of RAID array" message doesn't indicate an error of any kind; it means that an array self-check was started, probably triggered from crontab. mdadm schedules a checkarray run when installed.

quantox 10-04-2020 05:32 AM

I probably wasn't clear enough in my original post. I know the data-check messages are not errors; I just wondered what sort of error messages I might expect to see in the half-hour or so between those data-check messages, if the scrub detected an error.
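
To look only at that window, something like the following Python sketch would pull out whatever was logged between the start and end of a scrub. The two marker strings are taken from the kern.log lines in my first post; the function name lines_during_check is made up for illustration, and the matching is a plain substring test, so it assumes the array name (md127 here) is not a prefix of another array's name.

```python
def lines_during_check(lines, array="md127"):
    """Collect log lines that appear between a data-check start and its end.

    The marker strings mirror the kern.log messages quoted in this thread;
    other kernel versions may word them differently.
    """
    start_marker = f"md: data-check of RAID array {array}"
    end_marker = f"md: {array}: data-check done"

    inside = False
    collected = []
    for line in lines:
        if start_marker in line:
            inside = True       # scrub started; begin collecting
            continue
        if end_marker in line:
            inside = False      # scrub finished; stop collecting
            continue
        if inside:
            collected.append(line.rstrip("\n"))
    return collected


if __name__ == "__main__":
    with open("/var/log/kern.log") as log:
        for entry in lines_during_check(log):
            print(entry)
```

This returns every line in the window, including unrelated messages, so it still needs filtering by eye or by a search for the device names involved.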


All times are GMT -5.