LinuxOnly 02-11-2016 10:38 AM

fsck aborts: "/dev/sdc1 is in use" but not mounted
 
I think I have hardware problems that prevent Kubuntu 14.04 from running normally; see my post
http://www.linuxquestions.org/questi...ad-4175571478/
and I want to check my disks, but I can't because fsck claims they are "in use". This is happening under 3 different Linux systems booted from the optical drive: Knoppix, Ubuntu, and SystemRescueCD, all downloaded and burned recently. I'm writing this on the SystemRescueCD. The hardware is a Gigabyte motherboard and an AMD A6 CPU, both less than a year old, with 3 SATA disks. The disks are reported "in use" before I ever make any move to mount anything. This is what I tried:

root@sysresccd /etc % e2fsck /dev/sdc1
e2fsck 1.42.13 (17-May-2015)
/dev/sdc1 is in use.
e2fsck: Cannot continue, aborting.

root@sysresccd /etc % mount
udev on /dev type devtmpfs (rw,nosuid,relatime,size=10240k,nr_inodes=882330,mode=755)
tmpfs on /livemnt/boot type tmpfs (rw,relatime,size=2097152k)
/dev/loop0 on /livemnt/squashfs type squashfs (ro,relatime)
tmpfs on /livemnt/memory type tmpfs (rw,relatime)
none on / type aufs (rw,noatime,si=144c061a3e47e95d)
tmpfs on /livemnt/tftpmem type tmpfs (rw,relatime,size=524288k)
none on /tftpboot type aufs (rw,relatime,si=144c06180df5c95d)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /run type tmpfs (rw,nodev,relatime,size=708580k,mode=755)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
shm on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)
tmpfs on /tmp type tmpfs (rw,relatime)

root@sysresccd /etc % lsof -n |grep sdc
(no output)

root@sysresccd /etc % ps aux |grep sdc
root 4078 0.0 0.0 3944 1292 pts/2 S+ 06:00 0:00 grep sdc

From research on the web I take it this means the kernel has my disks open - but why, for heaven's sake, in a rescue system? And how can I wrest control of my disks from it?
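
For completeness, the only other places I know of to look for whatever is holding a partition open are device-mapper and sysfs (I'm not sure I'm reading them right, but roughly):

dmsetup ls          # device-mapper devices that may have claimed the disks
dmsetup table       # which underlying devices each dm device is built on
ls -l /sys/block/sdc/holders /sys/block/sdc/sdc1/holders    # the kernel's view of holders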

One strange thing I notice in the output of fdisk -l: /dev/sdc appears twice under different names (excerpt):
Disk /dev/sdc: 465.8 GiB, 500107862016 bytes, 976773168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x1c511c51

Device Boot Start End Sectors Size Id Type
/dev/sdc1 * 63 976773167 976773105 465.8G 83 Linux


Disk /dev/mapper/nvidia_geahifbj: 465.8 GiB, 500107860992 bytes, 976773166 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x1c511c51

Device Boot Start End Sectors Size Id Type
/dev/mapper/nvidia_geahifbj1 * 63 976773167 976773105 465.8G 83 Linux
where /dev/mapper/nvidia_geahifbj1 is a symlink to block device /dev/dm-1
But I can't fsck any of the 3 disks.

Thank you for any pointers.

rknichols 02-11-2016 12:34 PM

What is the output from "lsblk --fs /dev/sdc" ?

LinuxOnly 02-11-2016 06:04 PM

root@sysresccd /etc % lsblk --fs /dev/sdc
NAME FSTYPE LABEL UUID MOUNTPOINT
sdc nvidia_raid_member
├─sdc1 ext3 MAX501 580076cf-9aef-4880-b3d6-2bd29d8733fe
└─nvidia_geahifbj

But sdc is not a member of any RAID. I have two RAID1s, /dev/md0 = sda6+sdb6 and /dev/md1 = sda7+sdb7. They are not recognized correctly by lsblk:

root@sysresccd /etc % lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 465.8G 0 disk
├─sda1 8:1 0 64G 0 part
├─sda5 8:5 0 486.3M 0 part
├─sda6 8:6 0 152.8G 0 part
│ └─md0 9:0 0 152.8G 0 raid1
├─sda7 8:7 0 246.7G 0 part
│ └─md1 9:1 0 246.7G 0 raid1
└─sda8 8:8 0 1.9G 0 part
sdb 8:16 0 465.8G 0 disk
├─sdb1 8:17 0 64G 0 part
├─sdb2 8:18 0 1K 0 part
├─sdb5 8:21 0 486.3M 0 part
├─sdb6 8:22 0 152.8G 0 part
├─sdb7 8:23 0 246.7G 0 part
├─sdb8 8:24 0 1.9G 0 part
└─nvidia_geahifbj 253:1 0 465.8G 0 dmraid
sdc 8:32 0 465.8G 0 disk
├─sdc1 8:33 0 465.8G 0 part
└─nvidia_geahifbj 253:1 0 465.8G 0 dmraid
sr0 11:0 1 459M 0 rom
loop0 7:0 0 337.6M 0 loop /livemnt/squashfs

Those are line-drawing characters at the beginnings of many lines.
This all used to work with the Ubuntu /boot partition on sda5, a copy of it on sdb5, and the root partition (/) on md0. Do you think the SATA controller is bad? Should I try to switch SATA ports? I think I have 8 of them.

Thanks

rknichols 02-11-2016 08:00 PM

It seems awfully unlikely that a bad SATA port would make a disk partition appear to be a RAID member. Switching ports would be the simplest thing to try. That would affect the order in which the drives are detected, so /dev/sdc might not be the same disk any more. Perhaps that's part of the problem.
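
If you do swap the cables, something like blkid or lsblk with explicit columns should tell you which physical disk ended up with which sdX name, since the filesystem UUIDs and labels move with the disk rather than the port (just a quick sanity check, adjust to taste):

blkid
lsblk -o NAME,SIZE,MODEL,UUID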

I'm afraid I have to defer to others to help with RAID issues. I just have no experience with that.

syg00 02-11-2016 08:57 PM

Different symptoms, but a while back I ran into a race condition that seemed to be mdadm fighting udev. Have a read of this - it helped me, and it can't hurt to try.

LinuxOnly 02-11-2016 10:29 PM

Thank you for the RAID suggestion. It said that, to avoid a race condition, one should turn off udev before trying to assemble a RAID; alas, it did not help:

sysresccd etc # mdadm -A /dev/md0 /dev/sd[ab]6
mdadm: /dev/sda6 is busy - skipping
mdadm: /dev/sdb6 is busy - skipping
sysresccd etc # udevadm control --stop-exec-queue
sysresccd etc # mdadm -A /dev/md0 /dev/sd[ab]6
mdadm: /dev/sda6 is busy - skipping
mdadm: /dev/sdb6 is busy - skipping
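
One thing I suppose is worth checking when mdadm calls the members busy is whether the live system already auto-assembled the arrays under another name, or whether device-mapper grabbed the disks (I haven't gone through this systematically, so treat it as a guess):

cat /proc/mdstat          # any auto-assembled arrays, e.g. an md127?
dmsetup ls                # any device-mapper / dmraid devices holding the partitions?
mdadm --stop /dev/md127   # only if an unwanted auto-assembled array shows up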

Still, there is a smell of race condition to this. Under Knoppix I could assemble a degraded md0 of just sda6 but it called sdb6 busy. And the Ubuntu live system didn't make device files for about half of the partitions until I ran partprobe. Does that ring a bell? Am I fighting 2 different battles here, one with my hardware and one with the rescue systems?

I will try moving the SATA plugs next.

I did also think of my power supply; it's 8 or 10 years old. But don't desktop hard drives spin all the time, so booting doesn't increase the power draw? And the optical drive only spins when needed, so the rescue CDs should have more trouble. Or not?

LinuxOnly 02-12-2016 01:13 PM

I moved all the SATA plugs to other ports and also unplugged sdc, and now RescueCD correctly identifies all other partitions and even assembles my two RAIDs for me. I verified that I can mount and list everything.

This gave me high hopes that I had found my hardware problem and all was well, but no: when I boot off the hard drives into the installed Ubuntu system again, KDE and Gnome are both broken in exactly the same way as before; see my post http://www.linuxquestions.org/questi...ad-4175571478/

I'll restore from backup into the now fully functional RAID root partition one more time but I'm not optimistic.

Any other ideas? - Thanks

UPDATE:
After restoring from a month-old backup set, nothing at all has changed: KDE and Gnome are still broken in the same way as before. This backup reflects a system that I used for a whole month without problems.

Forgot to mention: The vmlinuz and initrd files are on their own small non-RAID partition which fsck finds clean.

???

LinuxOnly 02-15-2016 07:20 PM

dmraid is the clue
 
[QUOTE=LinuxOnly;5498884]
root@sysresccd /etc % lsblk --fs /dev/sdc
NAME FSTYPE LABEL UUID MOUNTPOINT
sdc nvidia_raid_member
├─sdc1 ext3 MAX501 580076cf-9aef-4880-b3d6-2bd29d8733fe
└─nvidia_geahifbj

But sdc is not a member of any RAID. I have two RAID1s, /dev/md0 = sda6+sdb6 and /dev/md1 = sda7+sdb7. They are not recognized correctly by lsblk:

root@sysresccd /etc % lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 465.8G 0 disk
├─sda1 8:1 0 64G 0 part
├─sda5 8:5 0 486.3M 0 part
├─sda6 8:6 0 152.8G 0 part
│ └─md0 9:0 0 152.8G 0 raid1
├─sda7 8:7 0 246.7G 0 part
│ └─md1 9:1 0 246.7G 0 raid1
└─sda8 8:8 0 1.9G 0 part
sdb 8:16 0 465.8G 0 disk
├─sdb1 8:17 0 64G 0 part
├─sdb2 8:18 0 1K 0 part
├─sdb5 8:21 0 486.3M 0 part
├─sdb6 8:22 0 152.8G 0 part
├─sdb7 8:23 0 246.7G 0 part
├─sdb8 8:24 0 1.9G 0 part
└─nvidia_geahifbj 253:1 0 465.8G 0 dmraid
sdc 8:32 0 465.8G 0 disk
├─sdc1 8:33 0 465.8G 0 part
└─nvidia_geahifbj 253:1 0 465.8G 0 dmraid
sr0 11:0 1 459M 0 rom
loop0 7:0 0 337.6M 0 loop /livemnt/squashfs
[/QUOTE]

OK, at least I could clear up that mystery; the "dmraid" at the ends of the "nvidia..." lines should have been the clue. It turns out that for some reason the system decided at boot that I had a dmraid RAID set of nvidia format made up of parts or all (I'm not sure which) of sdb and sdc. After reading the dmraid manpage I ran
dmraid -an    # -a n = "activate: no", i.e. deactivate the RAID set
and got
ERROR: dos: partition address past end of RAID device
RAID set "nvidia_geahifbj" is not active.
So apparently some random data bits were interpreted as a RAID signature.

Now I can fsck the device.
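
For the record, my understanding of the longer-term fix is to erase the stale fakeraid metadata so the phantom set doesn't get activated again at the next boot. This is a sketch from the manpage rather than something I have run end to end, and -rE wipes the metadata for good, so double-check the device first:

dmraid -r                  # list the on-disk fakeraid metadata dmraid still sees
dmraid -an                 # deactivate the set (what I ran above)
dmraid -rE /dev/sdc        # erase the stale nvidia metadata from the disk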

