Sorry, but that command line is a bit incomprehensible to me at this stage. I was just saying to Charles that maybe sed might be better. I think that looks 'between the cracks' too.
Okay guys, well thanks for the suggestions. Never mind if nothing comes to mind. There's a very long thread called 'understanding the dd command' by some chap called Awesome Machine where I'm pretty sure it's explained how to do this with sed. I'll check it out before bothering you further.
Thanks again, all. :-)
Where reinventing the wheel equals efficiency and
"using the wrong tool for the right job" not misperception,
there "forensics" is more than just a label.
OK, have had time to test. The problem was using seek (skip output blocks) instead of skip (skip input blocks).
Thanks to Peter for the smarter grep options and the sanity of hexdumping the output.
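The difference between skip and seek can be seen on a small scratch file. This is a minimal sketch, not part of the script in this thread; the temporary files and the 4-byte block size are assumptions chosen to keep the result easy to read.

```shell
#!/bin/bash
# Hypothetical scratch files stand in for the real partition.
tmp=$(mktemp -d)
printf 'AAAABBBBCCCC' > "$tmp/in"     # three 4-byte "blocks"

# skip=1 passes over one block of INPUT, then copies one block: the second block.
skipped=$(dd if="$tmp/in" bs=4 skip=1 count=1 2>/dev/null)

# seek=1 passes over one block of OUTPUT before writing; input is read from its start.
dd if="$tmp/in" of="$tmp/out" bs=4 seek=1 count=1 2>/dev/null
out_size=$(wc -c < "$tmp/out")

echo "skip=1 read: $skipped"
echo "seek=1 wrote a file of $out_size bytes"
rm -rf "$tmp"
```

With skip the read position moves through the input, which is what a scan of a partition needs. With seek the input is always read from its start and only the write position changes.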
Some gotchas ...
What sort of file was the email address in? If it was a word processor file or a double-byte character set file or ... anything but a plain text file, then the grep match expression is not going to be easy. It may be a good idea to set up a similar file and test the grep match expression on it first.
How big is the partition? A huge amount of output could be produced if the match expression is loose. It may be prudent to introduce an outer loop to switch output files, or to start with a limited number of blocks to see how it goes before the final run.
dd reports what it did on stderr -- best lost to /dev/null.
It's going to take a long time to run, so it is nice to have an indication of activity. Maybe increase count= so it runs a bit faster.
Which leads to
Code:
echo "" > out
i=0
#while true # For final run
while [[ $i -lt 300 ]] # While testing
do
block="$(dd if=/dev/sda4 skip=$i count=4 2>/dev/null)"
[[ $? -ne 0 ]] && exit
echo "$block" | grep -iaC 5 <match expression> | hexdump -C >> out
echo -n '.'
let i++
done
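The first gotcha above, testing the match expression on a file of the same kind before scanning the partition, might look like the following sketch. The address me@x.org and both sample files are made up for illustration, and the "double-byte" file is only a crude imitation (a NUL byte after each character), not the output of a real word processor.

```shell
#!/bin/bash
# Hypothetical sample data; me@x.org is a placeholder address.
tmp=$(mktemp -d)

# Plain text: one byte per character.
printf 'contact: me@x.org\n' > "$tmp/plain.txt"

# Crude imitation of a double-byte file: a NUL byte follows each character.
printf 'm\0e\0@\0x\0.\0o\0r\0g\0' > "$tmp/wide.bin"

# -a treats binary data as text, -i ignores case, -c counts matching lines.
# grep exits non-zero when nothing matches, hence the "|| true".
plain_hits=$(grep -iac 'me@x\.org' "$tmp/plain.txt" || true)
wide_hits=$(grep -iac 'me@x\.org' "$tmp/wide.bin" || true)

echo "plain text matches:  $plain_hits"
echo "double-byte matches: $wide_hits"
rm -rf "$tmp"
```

The same expression that finds the address in the plain text file finds nothing in the double-byte imitation, because the characters are no longer adjacent on disk.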
Hello again Charles,
Thanks for your follow-up. I shall run your code a little later. I'm just a bit concerned that this statement of yours, "what sort of file was the email address in?", _could_ imply that the code only searches within existing files rather than the entire space of the partition in question (50 GB). Just a thought. I'll report back on the results later....
Last edited by Completely Clueless; 07-31-2009 at 06:31 AM.
The code searches the whole 50 GB (50 GB! That's going to take a while to search!). The dd command is very simple-minded: it just ploughs through the whole of its input and copies it to output. By giving /dev/sda4 as input it's reading the raw blocks, not paying any attention to file system structures.
I hope you've unmounted /dev/sda4, or later operations may overwrite the block(s) you want.
Aren't Firefox cache files compressed? If so, there is no grep match expression that will find email addresses in them.
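To illustrate the point that dd reads raw bytes regardless of file system structures, here is a small sketch on a scratch image. It is a single-pass variant that pipes dd straight into grep, not the loop posted earlier; the image file, the planted text, and the address xyz@example.com are all assumptions for the demonstration, and the -o and -b options are used as GNU grep documents them.

```shell
#!/bin/bash
# A small scratch file stands in for /dev/sda4 (hypothetical test data).
tmp=$(mktemp -d)
img="$tmp/disk.img"

# 1 MiB of zeros, with a text fragment planted at byte offset 5000.
# The fragment belongs to no file: it is just bytes in the image.
dd if=/dev/zero of="$img" bs=1024 count=1024 2>/dev/null
printf 'mail me at xyz@example.com' |
    dd of="$img" bs=1 seek=5000 conv=notrunc 2>/dev/null

# Stream the raw bytes into grep.
# -a: treat binary as text, -o: print only the match, -b: prefix its byte offset.
hits=$(dd if="$img" bs=4096 2>/dev/null | grep -a -o -b 'xyz@example\.com')
echo "$hits"
rm -rf "$tmp"
```

The match is found even though the image contains no file system at all. The same would not hold if the bytes on disk were compressed, as noted above.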
Depending on how you've set it up, you may have several days' data in your FF cache. Have you tried entering about:cache in your FF address bar and looking at which sites are in your disk cache?
And, if the cache file you need is not there, I suspect that you'll be out of luck since, as catkin noted, the cache files are compressed, so a simple string search will not work.
You might be able to recover some cache files with tools like foremost, but getting them back into your cache is not simple. (You'll probably need to recover the _CACHE_ control file(s) as well.) And, since FF will have unlinked the files, they are most probably unrecoverable.
Note that, if you had fully described the problem you were trying to solve in your first post, instead of just asking about one possible solution, you would have probably received the advice about using FF's built-in cache exploration tool (the about:cache address) soon enough to be helpful.
Hi Charles,
Well, PTrenholme reckons they are. Bummer. Fortunately, the people I was needing to contact have emailed ME, so the urgency is no longer a problem.
Now I've tried your revised code, as it strikes me this would still be a useful tool to have. It now runs, although not for long. It prints a few screenfuls of dots to stdout and then exits (without error!). So something is still not right. In your suggested script, you use something like <search expression>, which I've edited into 'xyz@yahoo.com' (not the real address, for obvious reasons, but within single quotes instead of '<>'). I'm just wondering if it's correct to enclose the string in single quotes? Could that be the problem?
Glad you got that important email address!
If you ran my script as posted, the behaviour you describe is expected. To scan the whole partition you need to uncomment #while true and comment out while [[ $i -lt 300 ]]. The output is in the file "out", which you can watch grow (or not!) using tail -f out (from another terminal unless you background the script). The single quotes may not be essential but they are good practice, in case the match expression matches something in the current directory.
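The remark about single quotes can be illustrated with a short sketch. The scratch directory and the file name notes.txt are made up; the point is only how the shell treats an unquoted pattern that happens to match a file name.

```shell
#!/bin/bash
# Hypothetical scratch directory with one file that a glob pattern can match.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch notes.txt

# Unquoted: the shell expands *.txt to matching file names before the command runs.
unquoted=$(echo *.txt)

# Quoted: the pattern reaches the command unchanged.
quoted=$(echo '*.txt')

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
cd / && rm -rf "$tmp"
```

An address such as 'xyz@yahoo.com' contains no glob characters, so the quotes there are harmless rather than strictly necessary. They start to matter once the expression contains characters such as * or ?.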
Oh right, thanks! I didn't spot the comments aspect and apart from the search term just ran it as it stood. I'll have another try later and report back in due course....
Okay. Sorry for the delay. I've changed the code according to your instructions and have arrived at:
Code:
#!/bin/bash
echo "" > out
i=0
while true # For final run
#while [[ $i -lt 300 ]]
do
block="$(dd if=/dev/sda4 skip=$i count=4 2>/dev/null)"
[[ $? -ne 0 ]] && exit
echo "$block" | grep -iaC 5 'xyz@hotmail.com' | hexdump -C >> out
echo -n '.'
let i++
done
But it still won't run in any meaningful sense. It creates a file named 'out' consisting of one byte and then promptly exits! I've not yet read up on while loops in Bash, but even to me it still doesn't look right. :-/
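The thread does not establish why the script exits at once. One possibility, offered as an assumption rather than a diagnosis, is that dd itself fails on the first pass (for example, if the script lacks permission to read the device). Because stderr goes to /dev/null, the [[ $? -ne 0 ]] && exit line would then end the script silently, leaving only the single newline written by echo "" > out. A sketch of how to make such a failure visible, using a deliberately missing path as a stand-in for an unreadable device:

```shell
#!/bin/bash
# Hypothetical stand-in: a path that cannot be opened plays the role
# of a device the script is not allowed to read.
tmp=$(mktemp -d)
dev="$tmp/no-such-device"

# Keep dd's stderr in a log instead of discarding it.
dd if="$dev" count=4 2>"$tmp/dd.err" >/dev/null
status=$?

if [[ $status -ne 0 ]]; then
    echo "dd failed with status $status:"
    cat "$tmp/dd.err"
fi
err_text=$(cat "$tmp/dd.err")
rm -rf "$tmp"
```

Redirecting stderr to a log file while testing, and to /dev/null only for the final run, keeps the quiet output without hiding the reason for an early exit.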