LinuxQuestions.org
Old 10-25-2012, 05:46 PM   #1
linuxfordummies
LQ Newbie
 
Registered: Oct 2012
Posts: 5

Rep: Reputation: Disabled
Recovering files with unique header and known file size


Hi Everyone,

I have been trying to recover files using foremost and photorec, but so far have been unsuccessful. The drive is mounted read-only, so hopefully no more of the deleted data will be overwritten.

The deleted files have no footer, but the following header: ISOIdwR'COIMSOFT filename

Each deleted file is the same size: 659715948 bytes.

Foremost:
I modified the foremost.conf file by adding the following line: raw n 659715948 ISOIdwR'COIMSOFT

Using foremost I tried the following:
foremost -dv -c /path/to/foremost.conf -i /dev/sda6 -o /path/to/outputfolder

I was getting many files written to the output folder. Files had headers starting with ISOIdWR'COIMSOFT, but the file sizes varied and all were much smaller than 659715948 bytes (usually by an order of magnitude).

I tried the following. I found the inode number and block group for a given file, generated a .dat file using blkls (formerly known as dls) with sleuth kit and then ran foremost on this .dat file.

blkls -e -f ext3 /dev/sda6 39976960-40008507 > /output/path/block.dat

foremost -dv -c /path/to/foremost.conf -i /path/to/block.dat -o /path/to/outputfolder


Again, the header had ISOIdWR'COIMSOFT, but the rest of the header showed a different file name (e.g., t_159.0100 instead of t_159.0300) and the file size was much too small. I also did this for a file that has not been deleted and is allocated, and it produced the same results, header with ISOIdWR'COIMSOFT but different file name and small file size.
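A quick sanity check on the block range used above may help here. The sketch below only does arithmetic; it assumes a 4096-byte block size, which is common for ext3 but is not stated anywhere in this thread (tune2fs -l or fsstat on the partition would show the real value).

```shell
#!/bin/sh
# Assumption: 4096-byte file system blocks (not confirmed in the thread).
block_size=4096
file_size=659715948          # size of each deleted file, from the first post
first_block=39976960         # block range passed to blkls
last_block=40008507

# Number of blocks in the range and the bytes they can hold.
range_blocks=$(( last_block - first_block + 1 ))
range_bytes=$(( range_blocks * block_size ))

# Data blocks needed for one file, rounded up.
needed_blocks=$(( (file_size + block_size - 1) / block_size ))

echo "range: $range_blocks blocks = $range_bytes bytes"
echo "one file needs at least $needed_blocks data blocks"
# prints:
# range: 31548 blocks = 129220608 bytes
# one file needs at least 161064 data blocks
```

Under that assumption the range holds roughly 129 MB, so block.dat cannot contain a complete 659715948-byte file no matter how it is carved. This does not by itself explain the even smaller files foremost produced.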


Photorec:
I made a custom signature file for photorec, and fidentify recognizes the file type when it uses that signature. When running photorec I picked /dev/sda, no partition (whole disk), ext3, and the custom signature file only. Is this correct? Nothing found yet.

It says " pass 0 - reading sector xxxx/xxxx, 0/10 headers found, elapsed time xxxx - estimated time to completion xxx"


My questions:

1. Using foremost, why are the file sizes much smaller than the byte size I designated in the configuration file given that there is no footer? And why is it picking a different file when I give it the block range?

Thank you!

Last edited by linuxfordummies; 10-25-2012 at 08:11 PM.
 
Old 10-25-2012, 07:46 PM   #2
linuxfordummies
LQ Newbie
 
Registered: Oct 2012
Posts: 5

Original Poster
Rep: Reputation: Disabled
follow-up

I also tried deleting a jpg file and immediately trying to recover it using foremost. I was not able to recover it using foremost on the entire drive or on a block range. Foremost recovered other files, but none of the size of the jpg file that I deleted.

Last edited by linuxfordummies; 10-25-2012 at 07:48 PM.
 
Old 10-25-2012, 08:50 PM   #3
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,331
Blog Entries: 55

Rep: Reputation: 3529
First of all welcome to LQ, hope you like it here. Secondly I suggest you have your handle changed, I mean this isn't really a linux-for-dummies question, is it? ;-p

What's IMHO missing from your OP is an account or guesstimate of the time between deletion and recovery (meaning write operations to free space on the same partition), and an awareness of the problems journaling causes wrt any guarantee of recovery (let alone a successful one). Wrt Photorec signatures: its Wiki has a page on adding one, and I'd guess you'd have to recompile (AFAIK it doesn't use plugins). Wrt the size of recovered data: by omitting the footer you basically enable a sort of brute force / best guess mode where the only thing that stops or completes a carve op is encountering another file header or the end of the file system / partition / disk. What also works against you is the way allocation of primary, secondary and tertiary blocks works: simply put, deleting a file severs the links in that chain, which makes "walking the tree" difficult if not impossible.
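To make the "known header, known size, no footer" idea concrete, here is a minimal sketch that does the carve by hand instead of through foremost. It assumes GNU grep (for -a, -b, -o) and a head/tail that accept -c; the function names are made up for this example. It also assumes every file is stored contiguously, which is optimistic: on ext3, indirect blocks are typically interleaved with the data of a large file, so even an unfragmented file carved this way may contain extra blocks.

```shell
#!/bin/sh
# header_offsets IMAGE MAGIC
# Print the byte offset of every occurrence of the literal string MAGIC in IMAGE.
header_offsets() {
    LC_ALL=C grep -a -b -o -F -- "$2" "$1" | cut -d: -f1
}

# carve_fixed IMAGE OFFSET SIZE OUT
# Copy SIZE bytes of IMAGE, starting at 0-based byte OFFSET, into OUT.
carve_fixed() {
    tail -c +"$(( $2 + 1 ))" "$1" | head -c "$3" > "$4"
}

# carve_all IMAGE MAGIC SIZE OUTDIR
# Carve one fixed-size file per header hit, named after its offset.
carve_all() {
    header_offsets "$1" "$2" | while read -r off; do
        carve_fixed "$1" "$off" "$3" "$4/carved_$off.raw"
    done
}
```

With the values from this thread the call would look like: carve_all /path/to/block.dat "ISOIdwR'COIMSOFT" 659715948 /path/to/outputfolder. Run it against an image file on another disk, not against the partition being recovered, and expect grep to be slow on large inputs.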

One other approach could be to work on a copy of the disk or partition: piece-wise hash it (md5deep), then either look for the blocks you know (-M), or exclude the known ones (-X) and zero them out, and carve what remains.


Quote:
Originally Posted by linuxfordummies
I also tried deleting a jpg file and immediately trying to recover it using foremost.
I sure hope you didn't do that on the partition the ISO is supposed to still reside on.

Last edited by unSpawn; 10-25-2012 at 08:52 PM. Reason: //More *is* more
 
Old 10-26-2012, 01:18 AM   #4
linuxfordummies
LQ Newbie
 
Registered: Oct 2012
Posts: 5

Original Poster
Rep: Reputation: Disabled
response

Thanks for your feedback.

Some of these files were deleted a few days ago and some even longer back.

I am a beginner for Linux, so not sure if I fully understand, but it sounds like the tree may have been broken and it might be hard to walk the tree to recover the complete file. If the tree is broken would recovery of the file (even if full size) be corrupt?

Do you think that photorec and foremost should have worked if the files were not corrupted?

An issue that I had with blkls/foremost is that I could not extract the raw binary data of unallocated blocks using blkls (formerly known as dls). Running blkls /dev/sda6 39976960-40008507 > /output/path/block.dat produced a 0-byte block.dat file, so I had to use -e, which outputs every block. -A, which selects unallocated blocks only, also produced a 0-byte block.dat.



I read this about md5deep: http://md5deep.sourceforge.net/start-md5deep.html

So if I understand correctly, I can do the following with md5deep:

1. md5deep inputfilename (this input file should be a file like the one I am trying to recover, but that has not been deleted?)

I will get something like this: b08b18e0a3d2440feb0b321ea8080b36 /path/to/inputfilename

2. make a text file that contains "b08b18e0a3d2440feb0b321ea8080b36 inputfilename"

3. md5deep -M textfile.txt *

I will get all the files that match the hashes in the textfile.txt.

I don't know what I do with this information.

Last edited by linuxfordummies; 10-26-2012 at 02:20 AM.
 
Old 10-26-2012, 10:46 AM   #5
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,331
Blog Entries: 55

Rep: Reputation: 3529
Quote:
Originally Posted by linuxfordummies
Some of these files were deleted a few days ago and some even longer back. (..) If the tree is broken would recovery of the file (even if full size) be corrupt? Do you think that photorec and foremost should have worked if the files were not corrupted?
Because of disk I/O cost, file system specifics and other factors, the kernel (generally speaking) doesn't write a raw data stream straight to a disk device but goes through an abstraction layer (VFS) for integrity, efficiency and performance reasons. In practice this means you can't predict that, for example, the last 500 megs of a file were written in the order or at the exact location you would expect. (If you want to read more about that, find yourself a copy of "Understanding the Linux Kernel", 3rd edition.)
Less theoretically speaking (if there's such a thing): depending on partition layout and use of the system, a file system may see any number of write operations. Simplifying things, you could say there are "hot data" areas that see regular, "involuntary" writes for system or user purposes, like /var and /home; areas that see incidental involuntary writes, like /tmp or /var/tmp; and "cold data" areas that see few writes, like a deliberately and maybe only temporarily mounted storage partition, or practically none at all, like /usr, which should only see writes on system update (or if one runs applications that don't give a fsck about the FSSTND / FHS).

Condensing what I posted now and before:
- there is no guarantee of recovery, let alone of recovering usable data.
- journaling greatly diminishes your chances of recovery.
- the more post-deletion writes and the longer the time between deletion and file system "freeze", the smaller your chance of recovery. It drops off fast.


Quote:
Originally Posted by linuxfordummies
I read this about md5deep
Well done. Unlike what a lot of people think, reading first is a good step.


Quote:
Originally Posted by linuxfordummies
I don't know what I do with this information.
*Note these commands should be run from a Live CD so you don't inadvertently hash /proc, /sys or /dev.
Two modes basically:
- if you have a copy of the ISO you could use inclusion mode:
Code:
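# hash the known-good ISO piece by piece (stderr kept out of the hash list)
md5deep -p [blocksize]k /path/ISO > /externaldevice/hashes.md5
# report the pieces of the partition that match those hashes
md5deep -p [blocksize]k -m /externaldevice/hashes.md5 /dev/partition > /externaldevice/result.log 2>&1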
result.log then shows the blocks the data resides at which you could then extract from your disk image file to a new one.

- if you don't have a copy of the ISO you could use exclusion mode:
Code:
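# hash every existing file piece by piece (-r recurses into directories)
md5deep -r -p [blocksize]k /rootmountpoint > /externaldevice/knownfiles.md5
# report the pieces of the partition that do NOT match any known file
md5deep -p [blocksize]k -x /externaldevice/knownfiles.md5 /dev/partition > /externaldevice/result.log 2>&1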
result.log then shows the blocks that do not belong to any existing file; everything else is known data which you could then zero out in your disk image file.
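What piecewise matching does can be sketched without md5deep. The functions below are a slow illustration only, not a replacement for it: they assume md5sum from GNU coreutils is available, the names piece_hashes and matching_pieces are invented for this example, and they are meant for regular image files rather than device nodes.

```shell
#!/bin/sh
# piece_hashes FILE BS
# Print "<piece index> <md5>" for each BS-byte piece of FILE (last piece may be short).
piece_hashes() {
    ph_size=$(( $(wc -c < "$1") ))
    ph_i=0
    while [ $(( ph_i * $2 )) -lt "$ph_size" ]; do
        ph_sum=$(dd if="$1" bs="$2" skip="$ph_i" count=1 2>/dev/null | md5sum | cut -d' ' -f1)
        echo "$ph_i $ph_sum"
        ph_i=$(( ph_i + 1 ))
    done
}

# matching_pieces KNOWN IMAGE BS
# Print the index of every piece of IMAGE whose hash equals that of some piece of KNOWN.
matching_pieces() {
    mp_list=$(mktemp) || return 1
    piece_hashes "$1" "$3" | cut -d' ' -f2 | sort -u > "$mp_list"
    piece_hashes "$2" "$3" | while read -r mp_idx mp_sum; do
        if grep -q -x -F -- "$mp_sum" "$mp_list"; then
            echo "$mp_idx"
        fi
    done
    rm -f "$mp_list"
}
```

A piece index multiplied by the piece size gives the byte offset in the image. Matches are only meaningful when the piece size equals the file system block size, and all-zero or otherwise repetitive blocks will match in many places.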

This will not produce the end result, only a "cleaner" image to work on. I emphasize there is no guarantee wrt recovery. You may find the prospects discouraging and the effort involved disproportionate. So if you can buy the ISO from the vendor, download it in a way that does not violate any laws, or restore it from backups, then exercise those options; else, if the data is invaluable, consider a professional recovery service.
 
Old 10-26-2012, 04:26 PM   #6
linuxfordummies
LQ Newbie
 
Registered: Oct 2012
Posts: 5

Original Poster
Rep: Reputation: Disabled
photorec custom file carving

Before I try that method, I want to give photorec a fair shot.

I tried a very simple photorec.sig file that it uses to search for the files I am trying to restore.

The photorec.sig has the following: raw 0 "ISOId"

Basically, the string "ISOId" is found in the header of the files that I am trying to recover. However, this signature file (as far as I know) cannot be used to specify a file size or a footer (in my case the same as the header). The output was many files, some of which are much larger than the actual file size. It is recognizing the header correctly. fidentify also recognizes the files correctly using this signature file.

I'd like to give photorec more inputs so that it can carve out the data better, so I read this about modifying the photorec code: http://www.cgsecurity.org/wiki/Devel...at_to_PhotoRec

Looking at the different file type codes (gif, mov, gz) I am a bit at a loss as to how to simply specify the string header "ISOId", the maximum file size, and a footer "ISOId". Any thoughts as to which code I can use as a template?

Thank you!
 
Old 10-26-2012, 04:40 PM   #7
unSpawn
Moderator
 
Registered: May 2001
Posts: 29,331
Blog Entries: 55

Rep: Reputation: 3529
Quote:
Originally Posted by linuxfordummies
Any thoughts as to which code I can use as a template?
No but Photorec / Testdisk has a mailing list. I suggest you ask for support there.
 
Old 10-27-2012, 02:03 AM   #8
linuxfordummies
LQ Newbie
 
Registered: Oct 2012
Posts: 5

Original Poster
Rep: Reputation: Disabled
photorec

Ah yes... I will post that question on the photorec forum. Thanks!
 
  

