LinuxQuestions.org

goncalopp 04-22-2009 01:00 PM

file recovery / data carving
 
I accidentally reformatted an NTFS partition, believing it was empty, but I had forgotten a file there: a truecrypt volume file.
Now, I've done a bit of data recovery before (undeletion in windows, and data carving with photorec and foremost). The problem here is that, since the partition was formatted, all the inode info (the MFT entries, on NTFS) was probably deleted (right?), and encrypted volumes are supposed to be undetectable: they don't have an identifiable header or footer, so they don't show up in the data carving reports.

Now, I believe the file must be at the beginning of the partition, since it was the first file I copied there.

Any ideas?
I'd rather not have to 'hexdump /dev/sdb1 | less' until I find it :P

pljvaldez 04-22-2009 01:03 PM

Did you try using testdisk to repair the filesystem? If you only formatted it and haven't written any data to the drive, you may just be able to restore the partition table and then go browse for your file.

goncalopp 04-22-2009 01:21 PM

I reformatted the partition in place, that is, I just ran mkfs.ntfs, so the old master file table was probably wiped clean. I'll try a deep search anyway, thanks for the tip.

**EDIT**
As I suspected, testdisk did not find the old partition (though it found several older ones at different places).

Is there any kind of graphical disk editor? I was trying bless, but unfortunately it seems to only work with regular files.

goncalopp 04-26-2009 04:14 PM

Come on, people! No one here has ever had to edit device files interactively? :(

H_TeXMeX_H 04-27-2009 03:32 AM

I don't think there is any plausible solution. Without a header no data carver can find it, and I doubt you could either, even with hexdump.

If you really wanted to find it, you should dump the image to another HDD and meticulously use a process of elimination to find it. So try to carve out all files and then search the data left behind ... depending on the size of the HDD this could take maybe a few decades.

All of this applies only if you could not recover the partitions using testdisk, which is the only plausible solution.

goncalopp 04-30-2009 09:05 AM

Quote:

If you really wanted to find it, you should dump the image to another HDD and meticulously use a process of elimination to find it. So try to carve out all files and then search the data left behind
I was thinking along those lines, yes. The hdd is quite big, but there's lots of old data on it, so maybe it'll work.
The bigger problem will probably be guessing the exact beginning and ending of the file... I'll probably look into the truecrypt command-line and do some scripting.
Thanks

H_TeXMeX_H 05-01-2009 06:29 AM

Hey, I found something that might help:


Forensics Tool Finds Headerless Encrypted Files

http://it.slashdot.org/article.pl?sid=09/04/30/201222

Quote:

"Forensics Innovations claims to have for sale a product that detects headerless encrypted files, such as TrueCrypt Dynamic files."
Unfortunately it's a window$ program, no Linux support, and it's not free.

goncalopp 05-01-2009 08:30 AM

Quote:

Unfortunately it's a window$ program, no Linux support, and it's not free.
I have an XP virtual machine, so I tried the trial version.
It is somehow able to identify truecrypt volumes as "encrypted data (headerless)"; unfortunately, it only works on files. That is, it's a file identifier, not a data carving utility.
It's things like this that make me wish more companies released FOSS, or at least the sources... It'd probably be easy to include the algorithm in photorec.
Damn.

Still, good find

H_TeXMeX_H 05-01-2009 08:43 AM

Well, that's too bad. If you knew the algorithm that they used, you might be able to get programs like foremost to carve it out (it supports user-defined types). Oh well, I was hoping it included some type of file recovery / carving feature, but I guess not.

goncalopp 05-01-2009 12:01 PM

So, I did try the carving, and there seem to be only a couple of places where the file may be "hiding" - it's a large file (several GiB).

Meanwhile, I remembered a program I had used some years ago on windows, to test data for true randomness - ENT (http://www.fourmilab.ch/random/)

I compiled that (it's cross platform), and this is the output from a 100MB truecrypt volume:
Code:

goncalopp@will:~/Desktop$ cat tc_volume.dat | ent
Entropy = 7.999998 bits per byte.

Optimum compression would reduce the size
of this 104857600 byte file by 0 percent.

Chi square distribution for 104857600 samples is 272.73, and randomly
would exceed this value 25.00 percent of the times.

Arithmetic mean value of data bytes is 127.5024 (127.5 = random).
Monte Carlo value for Pi is 3.141283613 (error 0.01 percent).
Serial correlation coefficient is -0.000016 (totally uncorrelated = 0.0).

It seems truecrypt volumes are nearly "perfect random data", so, I wrote a little script:

Code:

for i in $(seq 0 1024 238000)
do
    echo 'OFFSET' $i MB >> log
    echo 'OFFSET' $i MB
    dd if=/media/backup/external_hdd.img bs=1M count=1024 skip=$i | ent >> log
done

I'll keep you guys updated.

H_TeXMeX_H 05-01-2009 01:03 PM

That's interesting, I didn't know about this.

goncalopp 05-05-2009 02:49 PM

So... Unfortunately, I was not able to recover the volume. I have come to the conclusion that I must have overwritten the data.
The method I mentioned proved useful and solid, and I will elaborate on it, in case this happens to someone else.


First, as H_TeXMeX_H said, it's recommended (though not strictly necessary, if you don't have enough disk space) to make an image of the lost partition.
Code:

dd if=/dev/sdX of=/home/user/hdd_image.bin bs=1M
Then proceed to get ENT. If you have Debian/Ubuntu - and maybe other distributions - it's in the repositories; otherwise, download and compile from http://www.fourmilab.ch/random/

Make sure to read the manpage.

I used the following script to try to locate the truecrypt volume.

Code:

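# Scan the image in 1024 MiB chunks; skip= is counted in bs-sized (1 MiB) blocks.
for i in $(seq 0 1024 238000)
do
    echo "OFFSET $i MB"
    # -t: terse CSV output; -b: treat the input as a stream of bits
    dd if=/media/backup/external_hdd.img bs=1M count=1024 skip="$i" 2>/dev/null | ent -b -t >> log
done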

Obviously, you'll want to modify it to match your case.
Particularly, if your lost volume has X bytes, you'll want to make dd output chunks of X/2 bytes or less, so you're sure at least one of the chunks is entirely made of data from the truecrypt volume. (I'm sure there's a mathematical proof for that, but I won't bother, it seems obvious)
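
To make the X/2 rule concrete, here is a minimal sketch, assuming sizes are given in whole MiB to match the bs=1M used in the loop. The function names chunk_mib_for_volume and chunk_offsets are made up for illustration. The reasoning: the volume may start just after a chunk boundary, so it has to span almost two full chunks before one of them is guaranteed to fall entirely inside it.

```shell
#!/bin/bash
# Hypothetical helpers (names are illustrative, not part of any tool).

# Largest chunk size in MiB such that at least one chunk on a fixed grid
# lies entirely inside a volume of volume_mib MiB, wherever it starts.
chunk_mib_for_volume() {
    local volume_mib=$1
    local chunk=$(( volume_mib / 2 ))
    # dd is run with bs=1M, so 1 MiB is the smallest usable chunk here
    # (the guarantee no longer holds for volumes under 2 MiB).
    if [ "$chunk" -lt 1 ]; then chunk=1; fi
    echo "$chunk"
}

# Start offsets (in MiB) of every chunk in an image of image_mib MiB.
chunk_offsets() {
    local image_mib=$1 chunk_mib=$2
    seq 0 "$chunk_mib" $(( image_mib - 1 ))
}
```

For a 4 GiB volume, chunk_mib_for_volume 4096 gives 2048; the loop would then iterate over $(chunk_offsets 238000 2048) and pass count=2048 to dd.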

If you're not familiar with bash loops, you may want to check http://www.cyberciti.biz/faq/bash-for-loop/

That script may take a while. On my desktop P4 machine, ENT processed roughly 6 MB/s (measured with pv). That's about 12h for a 250GB disk.

You may want to re-run the script with a finer chunk size once you have the rough location of the volume (you don't need to process the entire image again, use the "skip" dd parameter). Do that over and over again, until you have a good estimate of where the file begins and ends.
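
A possible way to pick the offsets for such a refinement pass, as a sketch only: refine_offsets is a hypothetical helper that lists finer offsets covering one suspicious coarse chunk plus one coarse chunk of margin on each side. All values are in MiB, matching bs=1M.

```shell
#!/bin/bash
# Hypothetical helper: finer chunk offsets (MiB) around a coarse hit.
#   hit_mib    - start offset of the coarse chunk that looked random
#   coarse_mib - chunk size used in the previous pass
#   fine_mib   - chunk size for this pass
refine_offsets() {
    local hit_mib=$1 coarse_mib=$2 fine_mib=$3
    local start=$(( hit_mib - coarse_mib ))
    # Do not go before the beginning of the image.
    if [ "$start" -lt 0 ]; then start=0; fi
    local end=$(( hit_mib + 2 * coarse_mib - 1 ))
    seq "$start" "$fine_mib" "$end"
}
```

For example, after a hit at offset 2048 in a 1024 MiB pass, the next pass could loop over $(refine_offsets 2048 1024 64) with count=64 and skip=$i.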
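I suggest you graph your data in OpenOffice Calc or the like. ENT option "-t" outputs comma-separated values (CSV), which are easily imported. (Note that "-b" switches ENT to bit mode, so the entropy is then reported per bit, with a maximum of 1, instead of per byte.)
For a ±1MB estimate, you may also look for it manually in a hex editor.
I used Bless (http://home.gna.org/bless)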
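
If you'd rather not graph anything, the log can also be filtered directly. This is a sketch under an assumption: that each ENT run with "-t" writes one header row starting with 0 and one data row starting with 1, with the entropy in the third column (check your own log first). flag_random_chunks is a made-up name; it prints the offset of every chunk whose entropy reaches the threshold, relying on the rows being in the same order as the loop.

```shell
#!/bin/bash
# Hypothetical helper: read an ENT "-t" log on stdin and print the start
# offset (MiB) of each chunk whose entropy is at or above the threshold.
# Assumed row layout: 1,<size>,<entropy>,<chi-square>,<mean>,<pi>,<corr>
#   chunk_mib - chunk size used by the scan (offset of row n is n * chunk_mib)
#   threshold - e.g. 7.99 in byte mode; use a value close to 1 with -b
flag_random_chunks() {
    local chunk_mib=$1 threshold=$2
    awk -F, -v c="$chunk_mib" -v t="$threshold" '
        $1 == "1" {                      # data rows only, skip the headers
            if ($3 + 0 >= t) print n * c
            n++
        }
    '
}
```

Usage would be something like: flag_random_chunks 1024 7.99 < log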

After you know where your file begins and ends, just use dd to extract it.
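
A minimal sketch of that extraction, assuming GNU dd (the skip_bytes and count_bytes input flags are GNU extensions) and byte offsets taken from the hex editor. extract_range is a hypothetical name, and the end offset is exclusive.

```shell
#!/bin/bash
# Hypothetical helper: copy bytes [start, end) of an image to a new file.
# Assumes GNU dd; with iflag=skip_bytes,count_bytes the skip= and count=
# values are interpreted as bytes while bs=1M keeps the copy fast.
extract_range() {
    local image=$1 start=$2 end=$3 out=$4
    local length=$(( end - start ))
    if [ "$length" -le 0 ]; then
        echo "end must be greater than start" >&2
        return 1
    fi
    dd if="$image" of="$out" bs=1M iflag=skip_bytes,count_bytes \
       skip="$start" count="$length" 2>/dev/null
}
```

For example: extract_range /home/user/hdd_image.bin START END recovered.tc, where START and END are the byte offsets you found.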

Note: in my case, I noticed files are usually surrounded by 00s, so it is easy to identify them. What follows applies only if you are unable to determine the *exact* location of the file.

I've read on some webpage that truecrypt only uses the first 512 bytes of a volume to check a password. I've confirmed it for v6.1a: if you feed it only the first 512 bytes of a volume, it gives different error messages for a wrong and a right password. I *believe* you may use this to identify the location of the lost volume:
dump several 512-byte chunks, each offset 1 byte from the previous one, in the interval where you believe the beginning of the file is. Then, write a shell script using the truecrypt CLI to mount each of these files, until you get an ioctl error message, as opposed to a password error message.
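
The dumping half of that idea could look like the sketch below. dump_header_candidates and the cand_OFFSET.bin file names are made up for illustration, and the truecrypt mount step is deliberately left out, since the exact CLI options depend on your version (check its help output).

```shell
#!/bin/bash
# Hypothetical helper: write one 512-byte candidate header per byte offset
# in [first, last] to outdir/cand_<offset>.bin.
# With bs=1, skip= and count= are both counted in bytes.
dump_header_candidates() {
    local image=$1 first=$2 last=$3 outdir=$4
    local off
    mkdir -p "$outdir"
    for off in $(seq "$first" "$last"); do
        dd if="$image" of="$outdir/cand_$off.bin" bs=1 \
           skip="$off" count=512 2>/dev/null
    done
}
```

Each cand_*.bin file can then be handed to the truecrypt CLI in a second loop, watching for the change in error message described above.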
Good luck, and remember to post your improvements ;)


All times are GMT -5.