PLEASE HELP - Smart, DD, FSCK, millions of unattached inodes
Need help. Short story:
* smartctl started showing errors on a 4TB drive
* I decided to clone the drive with dd
* dd ran for 28 hours; at the end it showed Not Enough Space but printed that (4.0T) had been copied
* After rebooting from the cloned drive, it asked to run fsck
* I ran fsck with the -y option to auto-fix all problems
* fsck has been running for more than 40 hours
with the following output:
Unattached inode xxxxxxxxx
Connect to /lost+found? yes
- It looks like it is going through millions of those inodes.
- I don't know how long it will take to complete, as I ran fsck without the -C option.
The hard drive held about 700G of data in millions of small files (the total size of the drive is 4TB).
Please tell me whether it is worth waiting for fsck to complete, or whether it looks more like the new cloned drive is corrupted and it is better to cancel fsck, install the OS on the new drive and copy the files over directly. I was really hoping to clone the drive so I wouldn't have to reinstall many small programs and redo the necessary configuration.
Thank you very much for any feedback or suggestions.
I suggest that you format the partition and copy across the files that you want to put on the drive. The file system is completely buggered, which is fairly common when you use dd. When fsck ends you will have a jumble of directories in lost+found which would take you weeks to sort out by hand.
If SMART data shows the disk is failing then you had better use ddrescue or dd_rescue (see their respective manual pages for more info) or, as jailbait suggested, just try to copy over the files you can salvage.
Quote:
Originally Posted by deriklogov1983
DD was working for 28 hours, at the end it shows Not Enough Space but print out that (4.0T) copied
Depending on the file system used it may, or may not, "just" affect the tail of the partition. As with the "showing errors" part: the more verbose you are and the more exact info you share, the better.
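Since dd stopped with a space error, one thing worth ruling out is a destination disk that is slightly smaller than the source (two "4TB" drives need not have the same byte count). Below is a minimal sketch of such a check, assuming Linux with util-linux `blockdev` and GNU `stat`. The function names `size_of` and `fits` and the device names in the usage comment are made up for illustration.

```shell
# size_of PATH: print the size in bytes of a block device or a regular file.
size_of() {
    if [ -b "$1" ]; then
        blockdev --getsize64 "$1"   # block device: usually needs root
    else
        stat -c %s "$1"             # regular file, e.g. a disk image
    fi
}

# fits SRC DST: succeed only if DST is at least as large as SRC.
fits() {
    src_bytes=$(size_of "$1") || return 2
    dst_bytes=$(size_of "$2") || return 2
    if [ "$dst_bytes" -ge "$src_bytes" ]; then
        echo "OK: destination ($dst_bytes bytes) can hold source ($src_bytes bytes)"
    else
        echo "TOO SMALL: destination is $((src_bytes - dst_bytes)) bytes short"
        return 1
    fi
}

# Usage (hypothetical device names, run as root):
#   fits /dev/sda /dev/sdb
```

If the destination turns out to be short, whatever lived at the very end of the source disk did not make it across.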
Quote:
Originally Posted by deriklogov1983
I was really hoping to do clone drive so I dont have to reinstall many small programs and do necessary configurations.
If the data was unique and valuable then I'd go through the motions of salvaging whatever is possible, but if it's only installations and configuration then, with all due respect, I wouldn't. Anyway, now you've found one reason why people make backups.
S.M.A.R.T Errors on /dev/sda
From Command: /usr/sbin/smartctl -q errorsonly -H -l selftest -l error /dev/sda
ATA Error Count: 36 (device log contains only the most recent five errors)
Error 36 occurred at disk power-on lifetime: 11544 hours (481 days + 0 hours)
Error 35 occurred at disk power-on lifetime: 11544 hours (481 days + 0 hours)
Error 34 occurred at disk power-on lifetime: 11544 hours (481 days + 0 hours)
Error 33 occurred at disk power-on lifetime: 11544 hours (481 days + 0 hours)
Error 32 occurred at disk power-on lifetime: 11544 hours (481 days + 0 hours)
At first the count was 1, then it started increasing every couple of days.
So what do you think is happening with fsck? Why did it find so many inodes on a brand new drive? It also looks like the inode number in the error is increasing by 1:
Inode 7216654 ref count is 2, should be 1.
and next
Inode 7216655 ref count is 2, should be 1.
So the inode number goes up by 1 each time, which basically means every inode is affected.
Why is that, and what is happening?
Using "-q errorsonly" effectively ensures no information is shown that could help us help you. With all due respect, if you don't know what switches do or cause, please read the manual first or don't use them.
Quote:
Originally Posted by deriklogov1983
So what do you think happens with FSCK ? why it found so many inodes on brand new drive ? And as it looks like that error with inode number is increasing by 1, Inode 7216654 ref count is 2, should be1. and next Inode 7216655 ref count is 2, should be1. So that inode number is increasing by 1 , so basically every inode. Why is that, and what is happening ?
What happens during a file system check depends roughly on what you cloned (whole disk or partition) and what parts went missing, when you cloned it (live system with open files in use, or powered down), the type (journaling or not) and state ("dirty" flag set or not) of the file system, and whether it can check its integrity using its (backup) metadata.
The simplest way to proceed would be to power the rig down, then run fsck on the source disk just to make sure, use a disk that's the same (brand, type and) size or larger, boot a Live CD, and, if the source disk is fsck / bad blocks / SMART OK(-ish), try cloning it then. Then compare the images using the piecewise mode of 'md5deep' in, say, 100m or 1g blocks.
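To illustrate the piecewise comparison idea, here is a rough sketch that uses only GNU `dd`, `md5sum` and `diff` instead of md5deep itself, so it is a stand-in for md5deep's piecewise mode and not the same tool. The function names `piecewise_md5` and `compare_chunks` are hypothetical, and the chunk size is given in MiB.

```shell
# piecewise_md5 PATH [CHUNK_MIB]: print "chunk_index md5" for each chunk.
piecewise_md5() {
    path=$1
    chunk_mib=${2:-1024}
    if [ -b "$path" ]; then
        total=$(blockdev --getsize64 "$path")
    else
        total=$(stat -c %s "$path")
    fi
    chunk_bytes=$((chunk_mib * 1024 * 1024))
    chunks=$(( (total + chunk_bytes - 1) / chunk_bytes ))   # round up
    i=0
    while [ "$i" -lt "$chunks" ]; do
        # Read chunk i in 1 MiB blocks and hash it.
        sum=$(dd if="$path" bs=1M skip=$((i * chunk_mib)) count="$chunk_mib" \
                 2>/dev/null | md5sum | cut -d' ' -f1)
        echo "$i $sum"
        i=$((i + 1))
    done
}

# compare_chunks A B [CHUNK_MIB]: diff the two hash lists.
# Exit status 0 means every chunk matched; diff output names the chunks that did not.
compare_chunks() {
    list_a=$(mktemp)
    list_b=$(mktemp)
    piecewise_md5 "$1" "${3:-1024}" > "$list_a"
    piecewise_md5 "$2" "${3:-1024}" > "$list_b"
    diff "$list_a" "$list_b"
    status=$?
    rm -f "$list_a" "$list_b"
    return "$status"
}

# Usage (hypothetical device names, 1 GiB chunks):
#   compare_chunks /dev/sda /dev/sdb 1024
```

The point of hashing in chunks is that a mismatch tells you roughly where on the disk the copy went wrong, which a single whole-disk checksum cannot do.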
That smartctl output is copy-pasted from a cPanel email notification.
The hard drive was cloned 1 to 1, so I cloned the whole drive, not just a partition.
I cloned the drive from a Live CD, so no drives were in use during cloning.
All partitions (/sda1, /sda2, /sda3) were present after cloning.
So the question is still the same: why are so many inodes unattached?
Simply put, common file systems store metadata centrally. Once the mapping between a file and its (backup) metadata is gone, the file may still exist but the file system can't "place" it properly. Think of a file system like a tree: cut off one of the lower branches and everything attached to it will go as well. In that case running 'fsck' is not like trying to glue the structure back together but rather like trying to pin all the leaves in the same location.
So with so many inodes unattached, does it mean the structure is corrupted and all files will be corrupted as well? Or has it "found some old tree with old leaves"? Is that a normal number of unattached inodes after cloning a drive?
I want to understand whether fsck running for so many hours with so many unattached inodes is normal, or whether I should stop wasting time and start from another angle.
I access the server through IPMI using iKVM; when I said the drives were not in use, I meant that they were not mounted.
Quote:
Originally Posted by deriklogov1983
So because of so many inodes unattached, does it mean that structure corrupted and all files would be corrupted as well ? or "it found some old tree with old leaves" ? Is that normal number of unattached inodes after cloning drive ?
I see I shouldn't use analogies ;-p and no it is not normal.
Quote:
Originally Posted by deriklogov1983
I want to understand if that fsck working for so many hours with so many inodes unattached , is that normal stuff or should i stop waste time and start from another side ?
Yes, like I said stop the fsck and start over again.
Quote:
Originally Posted by deriklogov1983
* Smartctl start showing errors on 4TB drive
* I decide to do clone drive with DD
Exactly what options did you use with dd? If there are I/O errors from the drive, dd will give up after the first one. That might tempt you to use the "conv=noerror" (continue after read errors) option, but that is the wrong thing to do and will result in a massively corrupted filesystem image at the destination.
For copying from a drive that has I/O errors, ddrescue is the proper tool. It will deal intelligently with those errors.
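As a rough sketch of what a ddrescue run could look like (GNU ddrescue assumed, not dd_rescue, which takes different options): a fast first pass that skips the slow scraping of bad areas, then a second pass that goes back to the bad areas with direct disc access and a few retries. The wrapper name `rescue_clone` and the `RUN` variable are made up for this example. It only prints the commands unless `RUN=1` is set, because `-f` overwrites the destination device.

```shell
# rescue_clone SRC DST MAPFILE: two-pass GNU ddrescue clone.
# Prints the commands by default; set RUN=1 to actually execute them.
rescue_clone() {
    src=$1
    dst=$2
    map=$3
    if [ "${RUN:-0}" = 1 ]; then runner=""; else runner="echo"; fi

    # Pass 1: copy everything that reads cleanly, skip scraping bad areas (-n).
    # -f is required when the output is a device and it OVERWRITES it.
    $runner ddrescue -f -n "$src" "$dst" "$map" || return

    # Pass 2: revisit the bad areas with direct disc access (-d),
    # retrying up to 3 times (-r3). The mapfile lets ddrescue resume
    # where pass 1 left off instead of starting over.
    $runner ddrescue -f -d -r3 "$src" "$dst" "$map"
}

# Usage (hypothetical device names; check them twice before setting RUN=1):
#   rescue_clone /dev/sda /dev/sdb rescue.map
#   RUN=1 rescue_clone /dev/sda /dev/sdb rescue.map
```

Keep the mapfile on a third disk, not on the source or the destination, so an interrupted run can be resumed.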