LinuxQuestions.org


mattp52 03-16-2008 04:06 AM

xfs partitioned disk unmountable (superblock errors) - any hope of rescuing data?
 
For reasons unknown, my Ubuntu upgrade was interrupted mid-install by an automated shutdown of my machine. On restart I get these errors shortly after Grub loads:

Code:

[ 27.466347] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)

Kernel alive
kernel direct mapping tables up to 10000000 @ 8000-4000

I booted off a Live CD and tried to mount the hard drive. On boot I noticed a bunch of console lines like this when the OS was trying to mount file systems...
Code:

Buffer I/O error on device sda1, logical block 0
Buffer I/O error on device sda1, logical block 1
...

Once loaded I tried to mount the disk manually...

$mount -t xfs /dev/sda /mnt/temp

But got a superblock read failed error.

Similarly with:

$xfs_repair /dev/sda
Code:

  - creating 4 worker thread(s)
  Phase 1 - find and verify superblock
  superblock read failed, offset 0, size 524288, ag 0, rval 0
 
  fatal error -- Invalid argument

I don't think it's an xfs issue though, because, following other threads, I tried running this test:

$dd if=/dev/hdd1 bs=512 count=1 | hexdump

But got read errors instead of the expected hex values.

The HD is a Seagate Barracuda, less than a year old. It seems unlikely there's been a hardware failure - particularly after that shutdown mid-install. I opened the case and isolated the drive - it's definitely spinning and not making any odd noises.

What are my options here? As a relative noob, it seems to me the disk geometry or partition map is corrupted. Is there any possibility of repairing the disk to a point where I can at least mount it and retrieve my data? Does the SystemRescue disk provide tools to do this?

Thanks for your help!

raskin 03-16-2008 09:54 AM

First, run
Code:

fdisk -l
to see the partition list. You seem to be using /dev/sda, which is most likely the entire disk (your HDD), instead of /dev/sda1 or whichever partition you need. A partition is a slice of the disk where the filesystem normally resides. Probably just repeating everything you tried at the beginning, but with the correct partition argument, will do the trick. /dev/hdd1 returned errors because it, well, doesn't exist. That name would be something like the IDE secondary slave (the fourth disk on an old ATA data bus system).
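
A rough sketch of that disk-versus-partition distinction, assuming a reasonably recent Linux kernel with sysfs mounted at /sys. The function name is_partition and its optional second argument (an alternative sysfs root, only there so it can be tried without real hardware) are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeed if NAME (e.g. sda1) is a partition, fail if it
# is a whole disk (e.g. sda) or is not known to the kernel at all.
# The kernel exposes a "partition" file under /sys/class/block/<name>/
# only for partitions, never for whole disks.
is_partition() {
    name=${1#/dev/}          # accept either "sda1" or "/dev/sda1"
    sysfs_root=${2:-/sys}    # assumption: overridable root, for testing only
    [ -e "$sysfs_root/class/block/$name/partition" ]
}

# Example use:
#   is_partition /dev/sda1 && echo "partition" || echo "whole disk or missing"
```

If the check fails for every sdaN name, the kernel has not seen any partitions on the disk, which is a different problem from picking the wrong device node.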

mattp52 03-16-2008 04:11 PM

Thanks for the reply. I did also try the commands with /dev/sda1 through /dev/sda3 but with the same result. The second example was a typo, sorry. The correct commands and output were:

Code:

$dd if=/dev/sda bs=512 count=1 | hexdump
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0 seconds, Infinity B/s

Code:

$dd if=/dev/sda1 bs=512 count=1 | hexdump
0+0 records in
0+0 records out
0 bytes (0 B) copied, 0 seconds, Infinity B/s

I had also tried fdisk but the command returns no output at all (just returns a new command prompt).

$fdisk -l
$fdisk -l /dev/sda
$fdisk -l /dev/sda1
etc.
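
For anyone repeating the dd read test above, it can be wrapped in a small check that does not depend on reading the "records in" lines by eye. This is only a sketch: the function name first_sector_readable is invented here, and on a real device it needs root.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a full 512-byte sector can be read
# from the start of the given device or file.
first_sector_readable() {
    # dd writes its statistics to stderr, so only the data reaches wc.
    bytes=$(dd if="$1" bs=512 count=1 2>/dev/null | wc -c)
    [ "$bytes" -eq 512 ]
}

# Example use:
#   first_sector_readable /dev/sda || echo "cannot read sector 0 of /dev/sda"
```

A result of "0+0 records in", as in the output above, corresponds to this check failing: nothing at all came back from the device.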

syg00 03-16-2008 04:43 PM

With Ubuntu you need sudo - even for fdisk

jailbait 03-16-2008 04:59 PM

Quote:

Originally Posted by mattp52 (Post 3090115)
Once loaded I tried to mount the disk manually...

$mount -t xfs /dev/sda /mnt/temp



$xfs_repair /dev/sda
Code:

  - creating 4 worker thread(s)
  Phase 1 - find and verify superblock
  superblock read failed, offset 0, size 524288, ag 0, rval 0
 
  fatal error -- Invalid argument


You have to mount a partition, not an entire hard drive.

You have to run xfs_repair against a partition, not an entire hard drive.

So what partition is your filesystem on? If it is on /dev/sda1 then run:

xfs_repair /dev/sda1

-----------------
Steve Stites

Electro 03-16-2008 05:33 PM

This assumes the xfs module is built into the kernel or is loaded from initrd or initramfs, and that the module for the IDE/SCSI/SATA controller is loaded. Try running xfs_check. If that comes up with some problems, run the hard drive utility from the hard drive manufacturer to scan the sectors and make sure they are not corrupted. IMHO, Seagate is poor at this. I prefer Hitachi because their hard drive utility is very thorough. Then run xfs_repair with the -n option to find out what it would do. If all else fails, try the -L option to zero out the metadata log. If the drive is good, there is a higher chance of getting the data back. However, the files will be stored in lost+found.
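
The order described above (inspect read-only first, zero the log only as a last resort) could be sketched like this. The -n and -L options are real xfs_repair options; the wrapper names run, inspect_xfs and force_zero_log, and the DRY_RUN variable, are assumptions made up for this example:

```shell
#!/bin/sh
# Hypothetical wrapper: with DRY_RUN set, print the command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: read-only inspection. -n makes xfs_repair report problems
# without modifying the filesystem.
inspect_xfs() {
    run xfs_repair -n "$1"
}

# Last resort only: -L zeroes the metadata log, which can throw away
# recent metadata changes that were still sitting in the log.
force_zero_log() {
    run xfs_repair -L "$1"
}

# Example use:
#   DRY_RUN=1 inspect_xfs /dev/sda1    # prints: xfs_repair -n /dev/sda1
```

Keeping the destructive step in its own function makes it harder to run -L by accident while still experimenting with -n.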

Check the logs to make sure the drive is being detected at startup.

If no partitions are found, the partition table is corrupted and needs to be recreated. I suggest using gpart to guess the partitions and where they start and end. Changing the partition table will not affect the data.

I strongly recommend making an image of the drive before doing any data recovery. This gives you a chance to repeat the steps or do the steps differently.
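
One possible way to take such an image is plain dd, sketched below. The function name image_device and the example destination path are hypothetical; the destination has to be on a different disk with enough free space. A purpose-built tool such as GNU ddrescue may cope better with a drive that throws many read errors.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC to DEST block by block.
# conv=noerror keeps going after a read error; conv=sync pads short or failed
# reads with zeros so that later data stays at the right offset in the image.
# Side effect of sync: the image is padded up to a multiple of the block size.
image_device() {
    src=$1
    dest=$2
    dd if="$src" of="$dest" bs=4096 conv=noerror,sync
}

# Example use (path is only an illustration):
#   image_device /dev/sda /mnt/backup/sda.img
```

All later experiments (xfs_repair, TestDisk and so on) can then be pointed at a copy of the image instead of the original drive.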

TIP:
Always include the kernel option ro (read only) for any file system. This way the boot loader will not corrupt the data. The distribution should remount it as read-write to continue the boot process.

mattp52 03-16-2008 06:32 PM

Thanks for all your tips. Syg00, I have been running these commands under sudo, left that out for brevity - sorry should have mentioned that.

Running either the mount or the xfs_repair commands against a partition also fails with the same output, i.e.:

$mount -t xfs /dev/sda1 /mnt/temp
$xfs_repair /dev/sda1

Electro wrote:
Quote:

Assuming the xfs module is built into the kernel or is loaded from initrd or initramfs. Also assuming the module for IDE/SCSI/SATA controller is loaded. Try running xfs_check
I'm assuming this is the case, as the live CD is Ubuntu 7.10 with an up-to-date kernel and includes tools for manipulating xfs disks. Unfortunately, with xfs_check I get errors similar to what I'm getting for xfs_repair - the tools cannot even read the device.

So I've built a disk from the latest SystemRescueCD as I wanted to try running TestDisk on it to rebuild the partition map. Unfortunately, during boot-up it cycles through various SATA link speed and UDMA settings trying to mount the device. I'm getting this kind of output on each cycle (with small variations with each setting tried):

Code:

ata4: SATA link up x.x Gbps (SStatus 123 SControl 3xx) (attempts: 300, 310 etc.)
ata4.00: configured for UDMA/1xx (attempts: 133, 100 etc.)
ata4: EH complete
ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata4.00: cmd c8/00:08:00:00:00/00:00:00:00:00/e0 tag 0 dma 4096 in
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata4.00: status: { DRDY }
ata4: soft resetting link

Occasionally I also get Buffer I/O errors showing up. None are successful in mounting the disk. Eventually the screen goes black and I have to reboot.
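
Since the messages scroll past quickly during boot, one option is to count the timeouts from the kernel log afterwards. A minimal sketch, assuming the log lines look like the ones quoted above; the function name count_ata_timeouts is made up:

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print the number of
# lines that report an ATA command timeout.
count_ata_timeouts() {
    # grep -c prints the count even when it is 0, but exits non-zero in that
    # case, so "|| true" keeps the helper's exit status at 0.
    grep -c 'Emask 0x4 (timeout)' || true
}

# Example use:
#   dmesg | count_ata_timeouts
```

A count that keeps growing across link speed changes suggests the drive is not answering commands at all, rather than a filesystem problem.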


I'm going to try and hunt out a tool from the disk manufacturer and see if I can get anywhere with that, otherwise it's not looking good!

mattp52 03-16-2008 10:40 PM

Ok, so I ran Seagate's diagnostic tool (SeaGateTools v2.07), which runs on FreeDOS. The software scanned for devices and found the disk. It said SMART was activated and hadn't been tripped. So I ran the short and long tests and the drive passed both.

Then I downloaded the 7.04 build of Rescubuntu... a live CD with rescue tools on it. On boot I got the same kind of startup errors (Buffer I/O errors etc.) as I got on other Linux Live CDs. Unfortunately no tools, including testdisk, seem to be able to read the drive.

This is really frustrating as I'm confident the data is still intact on the disk but I cannot get anything to mount the thing!

Electro 03-17-2008 12:00 AM

S.M.A.R.T. is dumb. It is not accurate at predicting failure. Running the short and long tests is not enough. To check if the drive is failing, listen for noise. A sector-by-sector scan is a better check of the quality of the platters.

IMHO, Ubuntu just sucks because it is not reliable. I suggest Knoppix or System Rescue CD.


Good luck.

mattp52 03-17-2008 03:25 PM

OK, success! Basically I went into the BIOS set-up and reset the ACPI control. The system booted up into console mode. From there I was able to recover the partially installed upgrade with:

$dpkg --configure -a

The package manager completed its configuration and the system now reboots cleanly into GDM with all services running. Not sure what tripped the ACPI controller - I did a BIOS flash upgrade a couple of months ago but experienced no problems after it.

Ah, well. Good to have the drive back - thanks again for all your help!


All times are GMT -5.