LinuxQuestions.org
Old 07-21-2018, 04:39 AM   #1
Kallys
LQ Newbie
 
Registered: Jul 2018
Posts: 6

Rep: Reputation: Disabled
Backup a LUKS container in file on dying disk


Hi!

My NAS died yesterday and I'm a little bit worried about which data can be recovered. I put the hard drive in an external USB case and started inspecting (I'm waiting for a new hard drive to make a copy before doing any operation).

Context:
The data is in a LUKS container stored in a file.
Both filesystems (the one holding the container file and the one inside the container) are ext4.
I know the passphrase and have a backup of the LUKS header.

Issue:
I think the disk has started dying, because the file is normally 1.2 TB but is now reported as 1.2 GB (with the hard drive mounted via the USB case). As a result I'm unable to open the LUKS container ("is not a valid LUKS device"). And of course, my NAS can't boot.

Questions:
How can I recover encrypted data?
Can I launch fsck on the filesystem containing the file and hope this will restore its real size?
Otherwise, can I force the filesystem to set the file to its right size (since the file is non-sparse), restore the LUKS header if necessary, and then mount the container and recover as much data as I can, even if some of it is corrupted?

Thanks a lot for your help!
 
Old 07-21-2018, 08:11 AM   #2
smallpond
Senior Member
 
Registered: Feb 2011
Location: Massachusetts, USA
Distribution: CentOS 6 & 7
Posts: 2,974

Rep: Reputation: 795
Are there disk errors reported by smartctl or in the logs? If not, it's more likely a software error that truncated your file.
 
Old 07-21-2018, 10:29 AM   #3
Kallys
LQ Newbie
 
Registered: Jul 2018
Posts: 6

Original Poster
Rep: Reputation: Disabled
I'm not sure; SMART reports some "Pre-fail" entries.
It seems that the disk is not dying yet, but is not too far from it.
Since I'm not very familiar with SMART, here are some parts of the SMART report:

Code:
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)	Offline data collection activity
					was suspended by an interrupting command from host.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 121)	The previous self-test completed having
					the read element of the test failed.
Total time to complete Offline 
data collection: 		(39180) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 378) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   242   163   021    Pre-fail  Always       -       2858
  4 Start_Stop_Count        0x0032   074   074   000    Old_age   Always       -       26474
  5 Reallocated_Sector_Ct   0x0033   197   197   140    Pre-fail  Always       -       71
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   051   051   000    Old_age   Always       -       36448
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       448
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       290
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       609088
194 Temperature_Celsius     0x0022   120   081   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   169   169   000    Old_age   Always       -       31
197 Current_Pending_Sector  0x0032   200   199   000    Old_age   Always       -       292
198 Offline_Uncorrectable   0x0030   200   199   000    Old_age   Offline      -       247
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   198   196   000    Old_age   Offline      -       541

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     36448         34728
# 2  Short offline       Completed: read failure       10%     36446         34730
# 3  Short offline       Completed without error       00%     32641         -
# 4  Short offline       Completed without error       00%     32623         -
# 5  Short offline       Completed without error       00%     31128         -
# 6  Short offline       Completed: read failure       60%     30649         4157514
# 7  Short offline       Completed: read failure       10%     30649         4157618
# 8  Short offline       Completed: read failure       10%     24416         814220443
# 9  Short offline       Completed: read failure       50%     24416         814220440
How can I tell whether it's a software error, and how do I repair it?
Thanks!
 
Old 07-21-2018, 10:53 AM   #4
AwesomeMachine
LQ Guru
 
Registered: Jan 2005
Location: USA and Italy
Distribution: Debian testing/sid; OpenSuSE; Fedora; Mint
Posts: 5,321

Rep: Reputation: 965
If you can't open the LUKS container, you can try a professional data recovery service. If the data is recoverable, it will be a lot of specialized work. But if the disk is OK--not a failing disk--you could try to recover the missing section of the file.

One thing you could try is to make a luks container on some other drive. Read it with 'dd'. Observe the header and footer bytes. Use dd to search the drive with the truncated luks container for the proper header and footer bytes, and then use 'dd' to copy the entire luks container to a different drive.
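A minimal sketch of the header search, assuming the container uses the LUKS1 on-disk format, whose header begins with the six magic bytes "LUKS" 0xBA 0xBE. The function name find_luks_offsets is made up for illustration, and -o, -b and -a are GNU grep options. Only the start of the container can be located this way, and it is safer to run the search on a rescued image than on the failing disk itself.

```shell
# Print the byte offset of every LUKS1 magic string in a file or image.
# Assumption: LUKS1 headers begin with "LUKS" followed by 0xBA 0xBE.
find_luks_offsets() {
    # Build the 6-byte pattern; printf octal escapes are portable.
    magic=$(printf 'LUKS\272\276')
    # -a: treat binary as text, -b with -o: offset of each match,
    # -F: fixed string. LC_ALL=C keeps the matching byte-oriented.
    LC_ALL=C grep -obaF -- "$magic" "$1" | LC_ALL=C cut -d: -f1
}
```

Usage would look like `find_luks_offsets /path/to/rescued.img` (the path is a placeholder). Because file data starts on a filesystem block boundary, a genuine header should sit at an offset that is a multiple of the filesystem block size; other hits are probably coincidences or stale copies.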

Edit: you're correct, the disk appears to be close to failure, but not there yet.

Last edited by AwesomeMachine; 07-21-2018 at 10:57 AM.
 
Old 07-21-2018, 12:34 PM   #5
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 3,963

Rep: Reputation: 1738
Ignore the "Pre-fail" indications. All the "TYPE" column tells you is what it would mean if something showed up in the "WHEN_FAILED" column, and you have nothing there.

What you do have are 292 current pending sectors. Those are unreadable sectors that will cause an I/O error when the OS tries to read them. These are in addition to the 71 reallocated sectors, which are failed sectors that have already been reallocated to spare sectors.
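To pull just those counters out of a SMART report, something like the following should work. It is a sketch that assumes the ten-column attribute table layout shown in the report above; the function name bad_sector_counts is made up for illustration.

```shell
# Print the raw values of the three attributes that count bad sectors.
# Input: the attribute table as printed by "smartctl -A", on stdin.
# Column 1 is the attribute ID, column 2 the name, column 10 the raw value.
bad_sector_counts() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 { print $2 "=" $10 }'
}
```

Typical use would be `smartctl -A /dev/sdX | bad_sector_counts`, with /dev/sdX standing in for the real device. Any non-zero value for Current_Pending_Sector means reads of those sectors will fail.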

Conclusion, the disk is in very bad shape. You will need to use ddrescue to copy as much as can be recovered to a new disk, and then try to repair the filesystem. Ideally, you should make two copies and attempt the repairs on one. Sometimes the repair efforts can actually make the filesystem damage much worse for forensic recovery.
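A rough sketch of that workflow with GNU ddrescue. The helper only prints the commands so the device and file names can be double-checked before anything touches the failing disk. The name print_rescue_plan, the three retry passes and the ".work" suffix are arbitrary choices for illustration, not recommendations from the ddrescue manual.

```shell
# Print (do not run) a two-pass GNU ddrescue plan for review.
# $1 = failing source device, $2 = image file on the new disk, $3 = mapfile.
print_rescue_plan() {
    src=$1
    img=$2
    map=$3
    # Pass 1: copy everything that reads cleanly, skip the slow scraping phase.
    echo "ddrescue -n $src $img $map"
    # Pass 2: return to the bad areas with direct disc access, 3 retry passes.
    echo "ddrescue -d -r3 $src $img $map"
    # Attempt repairs on a second copy so the first image stays untouched.
    echo "cp $img $img.work"
}
```

For example, `print_rescue_plan /dev/sdX nas.img nas.map` prints the three commands with placeholder names filled in. Reusing the same mapfile on the second pass is what lets ddrescue resume instead of starting over.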
 
Old 07-21-2018, 03:51 PM   #6
Kallys
LQ Newbie
 
Registered: Jul 2018
Posts: 6

Original Poster
Rep: Reputation: Disabled
OK, thanks for your help and advice.
I'll play it safe, make a first copy using ddrescue, and then try AwesomeMachine's suggestions on a second copy.
I'm not sure I'll have enough time to do this before the holidays, but I will keep you informed as soon as I have news.
 
Old 07-21-2018, 07:03 PM   #7
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 16,837

Rep: Reputation: 2507
Quote:
Originally Posted by Kallys
Can I launch fsck on the filesystem containing the file and hope this will restore its real size?
As it's ext4 and you've mounted it, fsck has already been run. fsck fixes filesystem consistency, not (necessarily) the files within. Truncation is quite common. Do you have anything in the lost+found directory of that filesystem? You'll need root/sudo to check.
 
Old 07-22-2018, 05:27 AM   #8
Kallys
LQ Newbie
 
Registered: Jul 2018
Posts: 6

Original Poster
Rep: Reputation: Disabled
Actually, I use fuseext2 to mount the partition, since its block size is 65536 (a NAS specification). I'm not sure whether fsck is launched in this case.
I've checked, and the lost+found directory is empty. But I know my file is non-sparse, so, as previously said, I can try using dd to recover the non-truncated file and work with that. This makes me think I'll probably need to do a little math (and read man pages) to use dd with two disks that use different block sizes.
Thanks!
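A small sketch of that math. dd only cares about its own bs, so one approach is to set bs to the source filesystem's block size and give skip as a filesystem block number; the block size of the destination disk does not matter when writing to a regular file. The function name block_to_offsets is made up for illustration.

```shell
# Convert a filesystem block number to a byte offset and to 512-byte sectors.
# $1 = filesystem block number, $2 = filesystem block size in bytes.
block_to_offsets() {
    bytes=$(( $1 * $2 ))
    echo "bytes=$bytes sectors=$(( bytes / 512 ))"
}
```

For instance, `block_to_offsets 10 65536` prints `bytes=655360 sectors=1280`, which corresponds to `dd bs=65536 skip=10` (block 10 is only an example). Offsets are relative to the start of the partition, not of the whole disk.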
 
Old 07-23-2018, 11:07 AM   #9
Kallys
LQ Newbie
 
Registered: Jul 2018
Posts: 6

Original Poster
Rep: Reputation: Disabled
Since I'm still waiting for my new hard drive, I played a little with LUKS containers in a file.
Code:
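# Copy the container, then pad the copy with 500 MiB of random data
dd if=file.luks of=test
dd if=/dev/urandom bs=1M count=500 >> test
# Open the padded copy and mount the filesystem inside it
cryptsetup luksOpen test test
mount /dev/mapper/test /mnt/test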
It turns out everything works, and /mnt/test contains the decrypted content of file.luks.
I also tried copying only a portion of file.luks. cryptsetup still lets me open that file (as long as the portion is larger than the LUKS header), but mounting the filesystem then reports an error (e.g. EXT4-fs (dm-4): bad geometry: block count 523776 exceeds size of device (25088 blocks)).

This makes me think that instead of searching for a LUKS footer (I found nothing interesting about that on the Internet), I could just dd from the beginning of my LUKS file to the end of the drive (since I know my file is non-sparse). The ext4 filesystem inside should then work out for itself how much of that space it needs when mounted.

Last edited by Kallys; 07-23-2018 at 11:10 AM. Reason: add precision
 
Old 07-23-2018, 01:00 PM   #10
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 3,963

Rep: Reputation: 1738
Quote:
Originally Posted by Kallys
This makes me think that instead of searching for a LUKS footer (I found nothing interesting about that on the Internet), I could just dd from the beginning of my LUKS file to the end of the drive (since I know my file is non-sparse). The ext4 filesystem inside should then work out for itself how much of that space it needs when mounted.
There is no LUKS footer. There is also no problem with copying too much data. It is quite OK to have a filesystem that is smaller than its container. The excess will be ignored.
 
Old 07-23-2018, 01:36 PM   #11
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 3,963

Rep: Reputation: 1738
You should also note that in an ext2/3/4 filesystem it is impossible for a really big file to be contiguous on the disk. It might be contiguous in terms of filesystem block numbers, but the space in the filesystem is divided into block groups (formerly called "cylinder groups"), and each block group has a header with the group descriptor and the inodes for that group. I don't know how large a block group is for your filesystem with a 64K block size, but for regular ext4 with a 4K block size it's 32768 blocks, or 128 MiB. If your filesystem also uses 32768 blocks per group, that is (32768 * 65536) = 2 GiB per block group. You should be able to see the block size and blocks per group in the output from "tune2fs -l".
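The arithmetic above can be scripted. This is a sketch that assumes the usual "Block size:" and "Blocks per group:" labels in the tune2fs listing; the function name block_group_size is made up for illustration.

```shell
# Read "tune2fs -l" output on stdin and print the size of one block group.
block_group_size() {
    awk -F: '
        /^Block size:/       { bs  = $2 + 0 }   # bytes per filesystem block
        /^Blocks per group:/ { bpg = $2 + 0 }   # blocks in each block group
        END { printf "%.0f bytes (%.0f MiB)\n", bs * bpg, bs * bpg / 1048576 }
    '
}
```

It would be used as `tune2fs -l /dev/sdXN | block_group_size`, with /dev/sdXN a placeholder for the partition. With a 4096-byte block size and 32768 blocks per group it prints `134217728 bytes (128 MiB)`; with 65536-byte blocks and the same group count it prints `2147483648 bytes (2048 MiB)`.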
 
Old 07-23-2018, 03:03 PM   #12
AwesomeMachine
LQ Guru
 
Registered: Jan 2005
Location: USA and Italy
Distribution: Debian testing/sid; OpenSuSE; Fedora; Mint
Posts: 5,321

Rep: Reputation: 965
footer bytes

Quote:
Originally Posted by rknichols
There is no LUKS footer.
I think he meant luks footer bytes. Certain types of files have common footer bytes which can be useful to find the end of the file. I'm not sure if luks container has common footer bytes.
 
Old 07-23-2018, 05:26 PM   #13
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 3,963

Rep: Reputation: 1738
Quote:
Originally Posted by AwesomeMachine
I think he meant luks footer bytes. Certain types of files have common footer bytes which can be useful to find the end of the file. I'm not sure if luks container has common footer bytes.
It does not. The LUKS container is the size of the file/partition/LV that holds it. You can temporarily change the kernel's notion of the size of an open LUKS volume (the same as you can for any active cryptsetup mapping), but that change affects only kernel memory and is not recorded anywhere else. The next time you open that volume it will be the size of the file/partition/LV again.
 
Old 07-23-2018, 06:10 PM   #14
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 16,837

Rep: Reputation: 2507
Quote:
Originally Posted by Kallys
Since I'm still waiting for my new hard drive, I played a little with LUKS containers in a file.
Good - saves me looking for a spare machine to play on ...
I shall watch with interest.

My concern is that the NAS won't boot, so the filesystem itself is (still) broken. Even if you can carve out the LUKS container, are all the data blocks readable? No amount of fiddling with the LUKS metadata will bring back lost blocks.
Gotta try, but also gotta be realistic - the real solution is in my sigline.
 
Old 07-24-2018, 05:55 AM   #15
Kallys
LQ Newbie
 
Registered: Jul 2018
Posts: 6

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by rknichols
There is no LUKS footer.
That's what I supposed. I guess luksOpen simply decrypts whatever bytes it is given, regardless of the container's size, which is why the LUKS header stores nothing about size.

Quote:
Originally Posted by AwesomeMachine
I think he meant luks footer bytes.
Correct.

Quote:
Originally Posted by syg00
My concern is that the NAS won't boot, so the filesystem itself is (still) broken. Even if you can carve out the LUKS container, are all the data blocks readable? No amount of fiddling with the LUKS metadata will bring back lost blocks.
Gotta try, but also gotta be realistic - the real solution is in my sigline.
I'm not really worried about the NAS; it's old and buggy, and each time there is a power cut I have to restore the OS partition.
I'm more worried about the fact that I tried to boot the NAS before noticing something had gone wrong. That boot may have tried, and certainly failed, to clean up the filesystem...
Since I have a backup of my LUKS header, I can assume the LUKS metadata is safe, and I should be able to decrypt the encrypted bytes. I don't know how many bytes will be undecryptable for a single lost sector, but it's worth a try to recover as much as I can.
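On the per-sector question: as far as I understand, dm-crypt encrypts each 512-byte sector independently for LUKS1 volumes, so an unreadable ciphertext sector should cost only the matching plaintext sector and nothing after it. A rough estimate, assuming the drive counts pending sectors in 512-byte logical sectors (the SMART report does not say); the function name lost_bytes is made up for illustration.

```shell
# Rough upper bound on the bytes lost to unreadable sectors.
# $1 = number of bad sectors, $2 = sector size in bytes (512 if omitted).
lost_bytes() {
    echo $(( $1 * ${2:-512} ))
}
```

With the 292 pending sectors from the SMART report, `lost_bytes 292` gives 149504 bytes, or 1196032 bytes if the count turns out to be in 4096-byte sectors (`lost_bytes 292 4096`). How much that hurts depends on whether the bad sectors hold file data or ext4 metadata.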


Thanks guys!
 
  

