My NAS died yesterday and I'm a little bit worried about which data can be recovered. I put the hard drive in an external USB case and started inspecting (I'm waiting for a new hard drive to make a copy before doing any operation).
Context:
The data is in a LUKS container stored in a file.
Both filesystems (the one holding the container file and the one inside the container) are ext4.
I know the passphrase and have a backup of the LUKS header.
Issue:
I think the disk started dying, because the file is normally 1.2 TB and is now reported as 1.2 GB (with the hard drive mounted in the USB case). As a result I can't open the LUKS container ("is not a valid LUKS device"). And of course, my NAS can't boot.
Questions:
How can I recover encrypted data?
Can I launch fsck on the filesystem containing the file and hope this will restore its real size?
Otherwise, can I force the filesystem to set the file to its correct size (the file is non-sparse), restore the LUKS header if needed, and then mount the container and recover as much data as I can, even if some of it is corrupted?
I'm not sure; SMART reports some "Pre-fail" entries.
It seems the disk is not dying yet, but is not far from it.
Since I'm not very familiar with SMART, here are some parts of the SMART report:
Code:
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x84) Offline data collection activity
was suspended by an interrupting command from host.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 121) The previous self-test completed having
the read element of the test failed.
Total time to complete Offline
data collection: (39180) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 378) minutes.
Conveyance self-test routine
recommended polling time: ( 5) minutes.
SCT capabilities: (0x3035) SCT Status supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   242   163   021    Pre-fail  Always       -       2858
  4 Start_Stop_Count        0x0032   074   074   000    Old_age   Always       -       26474
  5 Reallocated_Sector_Ct   0x0033   197   197   140    Pre-fail  Always       -       71
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   051   051   000    Old_age   Always       -       36448
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       448
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       290
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       609088
194 Temperature_Celsius     0x0022   120   081   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   169   169   000    Old_age   Always       -       31
197 Current_Pending_Sector  0x0032   200   199   000    Old_age   Always       -       292
198 Offline_Uncorrectable   0x0030   200   199   000    Old_age   Offline      -       247
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   198   196   000    Old_age   Offline      -       541
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     36448         34728
# 2  Short offline       Completed: read failure       10%     36446         34730
# 3  Short offline       Completed without error       00%     32641         -
# 4  Short offline       Completed without error       00%     32623         -
# 5  Short offline       Completed without error       00%     31128         -
# 6  Short offline       Completed: read failure       60%     30649         4157514
# 7  Short offline       Completed: read failure       10%     30649         4157618
# 8  Short offline       Completed: read failure       10%     24416         814220443
# 9  Short offline       Completed: read failure       50%     24416         814220440
How can I tell whether it's a software error, and how can I repair it?
Thanks!
If you can't open the LUKS container, you can try a professional data recovery service. If the data is recoverable, it will be a lot of specialized work. But if the disk is OK--not a failing disk--you could try to recover the missing section of the file.
One thing you could try is to make a luks container on some other drive. Read it with 'dd'. Observe the header and footer bytes. Use dd to search the drive with the truncated luks container for the proper header and footer bytes, and then use 'dd' to copy the entire luks container to a different drive.
edit, you're correct, the disk appears to possibly be near failure, but not yet.
Last edited by AwesomeMachine; 07-21-2018 at 10:57 AM.
Ignore the "Pre-fail" indications. The "TYPE" column only tells you what kind of failure an attribute would indicate if something showed up in the "WHEN_FAILED" column, and you have nothing there.
What you do have are 292 current pending sectors. Those are unreadable sectors that will cause an I/O error when the OS tries to read them. These are in addition to the 71 reallocated sectors, which are failed sectors that have already been reallocated to spare sectors.
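As a quick way to pull out just the counters discussed here, a small awk filter over `smartctl -A` style output could look like the sketch below. The sample rows are copied from the report earlier in the thread; the function name `bad_sector_counts` is made up for illustration.

```shell
# Print the raw value of the SMART attributes that count bad sectors.
# Reads `smartctl -A`-style text on stdin; the raw value is the 10th column.
bad_sector_counts() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 { print $2 "=" $10 }'
}

# Sample rows copied from the SMART report earlier in this thread.
smart_sample='  5 Reallocated_Sector_Ct   0x0033   197   197   140    Pre-fail  Always       -       71
197 Current_Pending_Sector  0x0032   200   199   000    Old_age   Always       -       292
198 Offline_Uncorrectable   0x0030   200   199   000    Old_age   Offline      -       247'

printf '%s\n' "$smart_sample" | bad_sector_counts
```

On a live system the same function could be fed from `smartctl -A /dev/sdX`, where `/dev/sdX` is a placeholder for the actual disk.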
Conclusion, the disk is in very bad shape. You will need to use ddrescue to copy as much as can be recovered to a new disk, and then try to repair the filesystem. Ideally, you should make two copies and attempt the repairs on one. Sometimes the repair efforts can actually make the filesystem damage much worse for forensic recovery.
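A rough sketch of the ddrescue step, assuming GNU ddrescue and copying to an image file on the new disk. The device name, image name and mapfile name are placeholders, and the script only prints the commands unless it is run with DRY_RUN=0.

```shell
# Two-pass GNU ddrescue copy of a failing disk to an image file.
# The paths passed at the bottom are placeholders; nothing is executed
# unless DRY_RUN=0 is set in the environment.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"        # only show what would be run
    else
        "$@"
    fi
}

rescue_disk() {
    src=$1; img=$2; map=$3
    # Pass 1: grab everything that reads easily, skip the slow scraping phase.
    run ddrescue -n "$src" "$img" "$map"
    # Pass 2: return to the bad areas with direct disk access and 3 retry passes.
    run ddrescue -d -r3 "$src" "$img" "$map"
}

rescue_disk /dev/sdX nas-disk.img nas-disk.map
```

Because both passes share the same mapfile, the copy can be interrupted and resumed, and the second pass only revisits the areas the first one could not read. The second working copy can then be made from the image rather than from the failing disk.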
OK, thanks for your help and advice.
I'll play it safe: make a first copy using ddrescue, and then try AwesomeMachine's suggestions on a second copy.
I'm not sure I'll have enough time to do this before the holidays, but I will keep you informed as soon as I have news.
Can I launch fsck on the filesystem containing the file and hope this will restore its real size?
As it's ext4 and you've mounted it, fsck has already been run. fsck fixes filesystem consistency, not (necessarily) the files within. Truncation is quite common. Do you have anything in the lost+found directory of that filesystem? You'll need root/sudo to check.
Actually, I use fuseext2 to mount the partition, since the block size is 65536 (NAS specification). I'm not sure whether fsck is launched in this case.
I've checked and the lost+found directory is empty. But I know my file is non-sparse, so as previously said, I can try using dd to recover the non-truncated file and deal with it. This makes me think I'll probably need to do a little math (and read man pages) to use dd with two disks using different block sizes.
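For the block-size math, dd counts `skip=` and `count=` in units of `bs=`, so the same range can be expressed in 64 KiB filesystem blocks or in 512-byte sectors. A small sketch on a scratch file (not a real disk; the helper name `block_to_bytes` is just for illustration):

```shell
# dd counts skip= and count= in units of bs=, so a range given in
# 64 KiB filesystem blocks can be read directly with bs=65536,
# whatever the block size of the destination is.
FS_BLOCK=65536

# Convert a filesystem block number to a byte offset.
block_to_bytes() {
    echo $(( $1 * FS_BLOCK ))
}

# Demo on a scratch file instead of a real disk: 4 blocks of 64 KiB.
scratch=$(mktemp)
head -c $(( 4 * FS_BLOCK )) /dev/urandom > "$scratch"

# Read blocks 2..3 once with 64 KiB units and once with 512-byte units.
a=$(dd if="$scratch" bs=$FS_BLOCK skip=2 count=2 2>/dev/null | cksum)
b=$(dd if="$scratch" bs=512 skip=$(( $(block_to_bytes 2) / 512 )) \
       count=$(( $(block_to_bytes 2) / 512 )) 2>/dev/null | cksum)

[ "$a" = "$b" ] && echo "same data"
rm -f "$scratch"
```

The block size used for reading does not have to match anything on the destination: the output is a plain byte stream either way.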
Thanks!
Since I'm still waiting for my new hard drive, I played a little with LUKS containers in a file.
Code:
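# Copy the container, then pad it with 500 MiB of junk to mimic copying too much
dd if=file.luks of=test
dd if=/dev/urandom bs=1M count=500 >> test
# Open the padded copy and mount the filesystem inside it
cryptsetup luksOpen test test
mount /dev/mapper/test /mnt/test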
It turns out everything works and /mnt/test contains the decrypted content of file.luks.
I also tried copying only a portion of file.luks. cryptsetup still lets me open that file (as long as the portion is larger than the LUKS header), but mounting the filesystem then reports an error (e.g. EXT4-fs (dm-4): bad geometry: block count 523776 exceeds size of device (25088 blocks)).
This makes me think that instead of searching for a LUKS footer (I found nothing useful about that on the Internet), I could just dd from the beginning of my LUKS file to the end of the drive (since I know my file is non-sparse). The ext4 filesystem should then work out by itself how much of that it needs in order to be mounted.
Last edited by Kallys; 07-23-2018 at 11:10 AM.
Reason: add precision
This makes me think that instead of searching for a LUKS footer (I found nothing useful about that on the Internet), I could just dd from the beginning of my LUKS file to the end of the drive (since I know my file is non-sparse). The ext4 filesystem should then work out by itself how much of that it needs in order to be mounted.
There is no LUKS footer. There is also no problem with copying too much data. It is quite OK to have a filesystem that is smaller than its container. The excess will be ignored.
You should also note that in an ext2/3/4 filesystem it is impossible for a really big file to be contiguous on the disk. It might be contiguous in terms of filesystem block numbers, but the space in the filesystem is divided into block groups (formerly called "cylinder groups"), and each block group has a header with the group descriptor and the inodes for that group. I don't know how large a block group is for your filesystem with a 64K block size, but for regular ext4 with a 4K block size it's 32768 blocks or 128 MiB. If your filesystem also uses 32768 blocks per group, that gives (32768 * 65536) = 2 GiB per block group. You should be able to see the block size and blocks per group in the output from "tune2fs -l".
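The arithmetic above can be checked in the shell. The 32768 blocks per group for the 64K case is the same assumption as in the text; the real values are the "Block size" and "Blocks per group" fields of "tune2fs -l".

```shell
# Size of one ext2/3/4 block group in bytes = blocks per group * block size.
group_bytes() {
    echo $(( $1 * $2 ))
}

group_bytes 32768 4096     # regular ext4: 134217728 bytes = 128 MiB
group_bytes 32768 65536    # 64 KiB blocks, same group length: 2147483648 bytes = 2 GiB
```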
footer bytes
Quote:
Originally Posted by rknichols
There is no LUKS footer.
I think he meant LUKS footer bytes. Certain types of files have common footer bytes which can be useful to find the end of the file. I'm not sure whether a LUKS container has common footer bytes.
I think he meant LUKS footer bytes. Certain types of files have common footer bytes which can be useful to find the end of the file. I'm not sure whether a LUKS container has common footer bytes.
It does not. The LUKS container is the size of the file/partition/LV that holds it. You can temporarily change the kernel's notion of the size of an open LUKS volume (the same as you can for any active cryptsetup mapping), but that change affects only kernel memory and is not recorded anywhere else. The next time you open that volume it will be the size of the file/partition/LV again.
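Since the container is simply as large as whatever holds it, a carved-out copy can be attached and opened like any other. A cautious sketch, opening everything read-only and taking the key material from the header backup via cryptsetup's `--header` option instead of writing to the copy. All file names, the mapping name and the mount point are placeholders, and the script only prints the commands unless run with DRY_RUN=0.

```shell
# Open a carved-out LUKS copy read-only, using a header backup file.
# Nothing is executed unless DRY_RUN=0 is set in the environment.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

open_readonly() {
    img=$1; hdr=$2; name=$3; mnt=$4
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ losetup --find --show --read-only $img"
        loop=/dev/loopN    # placeholder; the real name is printed by losetup
    else
        loop=$(losetup --find --show --read-only "$img")
    fi
    # Read the LUKS metadata from the backup, the data from the loop device.
    run cryptsetup luksOpen --readonly --header "$hdr" "$loop" "$name"
    # noload keeps ext4 from trying to replay its journal on a read-only device.
    run mount -o ro,noload "/dev/mapper/$name" "$mnt"
}

open_readonly luks-copy.img luks-header.bak rescued /mnt/rescued
```

If the header in the copy itself is intact, the `--header` option can be dropped.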
Since I'm still waiting for my new hard drive, I played a little with LUKS containers in a file.
Good - saves me looking for a spare machine to play on ...
I shall watch with interest.
My concern is that the NAS won't boot - so the filesystem itself is (still) broken. Even if you can carve out the LUKS container, are all the data blocks readable? No amount of fiddling with the LUKS metadata will bring back lost blocks.
Gotta try, but also gotta be realistic - the real solution is in my sigline.
That's what I supposed. I guess luksOpen simply decrypts whatever bytes it is given, regardless of how many there are, which is why the LUKS header contains nothing about size.
Quote:
Originally Posted by AwesomeMachine
I think he meant luks footer bytes.
Correct.
Quote:
Originally Posted by syg00
My concern is that the NAS won't boot - so the filesystem itself is (still) broken. Even if you can carve out the LUKS container, are all the data blocks readable? No amount of fiddling with the LUKS metadata will bring back lost blocks.
Gotta try, but also gotta be realistic - the real solution is in my sigline.
I'm not really worried about the NAS; it's old and buggy: each time there's a power cut, I have to restore the OS partition.
I'm more worried that I tried to boot the NAS before noticing something had gone wrong. That boot may have tried, and certainly failed, to clean up the filesystem...
Since I have a backup of my LUKS header, I can assume the LUKS metadata is safe and I should be able to decrypt the encrypted bytes. I don't know how many bytes become undecryptable for a single lost sector, but it's worth a try to recover as much as I can.
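On the bytes-per-lost-sector question: dm-crypt, which LUKS is built on, encrypts each sector independently (512 bytes per sector for LUKS1), so an unreadable disk sector should only cost the plaintext at that same position rather than spreading further. A rough estimate of the raw bytes at risk, assuming all 292 pending sectors from the SMART report fall inside the container:

```shell
# Rough estimate of lost plaintext: bad sectors * sector size.
lost_bytes() {
    echo $(( $1 * $2 ))
}

lost_bytes 292 512     # 149504 bytes if SMART counts 512-byte sectors
lost_bytes 292 4096    # 1196032 bytes if it counts 4 KiB physical sectors
```

This only counts raw bytes. A bad sector that lands in filesystem metadata, in either ext4 layer, can make more data than that hard to reach.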