LinuxQuestions.org
Old 03-13-2009, 04:24 PM   #1
Completely Clueless
Member
 
Registered: Mar 2008
Location: Marbella, Spain
Distribution: Many and various...
Posts: 899

Rep: Reputation: 70
whole-disk text-string scanning utility


Hi guys,

Can anyone recommend a Linux utility to scan an entire physical disk (only 12 GB) for selected text strings, one which searches not just files and folders but also cluster tips and unused space? Ideally it would list every hit along with its location, and preferably offer a "search and replace xyz with abc" facility. Many thanks!

CC.
 
Old 03-14-2009, 12:39 AM   #2
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,356

Rep: Reputation: 2751
I think you probably want PhotoRec: http://www.cgsecurity.org/wiki/PhotoRec_Step_By_Step. It is primarily for recovering corrupt or deleted files.
I'm not sure it does replace; there's usually no point. You recover first, then fix up if possible.
For extant files, try a loop with find & sed.
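
A minimal sketch of such a find & sed loop, for files that still exist in the filesystem. The demo directory and its contents are hypothetical, 'xyz' and 'abc' are the placeholder strings from the question, and sed -i is assumed to be GNU sed:

```shell
#!/bin/sh
# Hypothetical demo tree; point "root" at a real directory to use this for real.
root=$(mktemp -d)
printf 'call xyz here\n' > "$root/a.txt"
mkdir "$root/sub"
printf 'nothing to see\n' > "$root/sub/b.txt"

# List regular files containing the literal string, then edit each in place.
# grep -l prints only the names of matching files; -F treats xyz as fixed text.
find "$root" -type f -exec grep -lF 'xyz' {} + |
while IFS= read -r f; do
    echo "replacing in: $f"
    sed -i 's/xyz/abc/g' "$f"    # GNU sed assumed; BSD sed needs -i ''
done
```

This only touches live files; it does not see deleted data or unused space, and file names containing newlines would confuse the loop.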
 
Old 03-14-2009, 12:45 AM   #3
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 21,120

Rep: Reputation: 4120
If you want to hit unused space et al, you'll probably need a full-on forensic tool.
Been discussed plenty of times - there are even forensic liveCDs.
 
Old 03-14-2009, 11:48 AM   #4
Completely Clueless
Member
 
Registered: Mar 2008
Location: Marbella, Spain
Distribution: Many and various...
Posts: 899

Original Poster
Rep: Reputation: 70
Quote:
Originally Posted by syg00
If you want to hit unused space et al, you'll probably need a full-on forensic tool.
Been discussed plenty of times - there are even forensic liveCDs.

Well, I have the Knoppix DVD, which has a comprehensive Forensic Toolkit on it and a baffling array of other utilities, so I may well have something to do the job already. But all of those program names mean nothing to me; I need a specific pointer to a particular piece of software which will do the job. I need a program name to search for.
 
Old 03-14-2009, 11:55 AM   #5
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
You can always search this way:

dd if=/dev/sda bs=512 skip=<START> count=<RANGE> | hexdump -C | grep <keyword>

Replace <START> with the number of the first sector to search,
<RANGE> with the number of sectors to search, and
<keyword> with the string to look for.
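
To make the sector arithmetic concrete, here is a small sketch that uses a scratch image file in place of /dev/sda, so it can be tried without root. The file, the 'MARKER' string and the sector numbers are all made up for illustration; with bs=512, sector N starts at byte N * 512:

```shell
#!/bin/sh
img=$(mktemp)                       # scratch file standing in for /dev/sda
dd if=/dev/zero of="$img" bs=512 count=8 2>/dev/null

# Plant a marker at the start of sector 3 (byte offset 3 * 512 = 1536).
printf 'MARKER' | dd of="$img" bs=512 seek=3 conv=notrunc 2>/dev/null

start=2   # first sector to read
range=2   # number of sectors to read (sectors 2 and 3)
echo "byte range: $((start * 512)) to $(( (start + range) * 512 - 1 ))"

# grep -a treats binary input as text; -c counts matching lines.
hits=$(dd if="$img" bs=512 skip="$start" count="$range" 2>/dev/null |
       grep -a -c 'MARKER')
echo "hits in range: $hits"
```

Here grep reads the raw bytes directly, which is a deliberate substitution for the hexdump stage in the command above.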
 
Old 03-14-2009, 12:55 PM   #6
H_TeXMeX_H
LQ Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1301
Quote:
Originally Posted by pixellany
You can always search this way:

dd if=/dev/sda bs=512 skip=<START> count=<RANGE> | hexdump -C | grep <keyword>

Replace <START> with the number of the 1st sector to search
<RANGE> with the number of sectors to search
<keyword> with the string to look for

Sorry, but I have to disagree. It would seem to work, but the output of hexdump -C looks something like this:

Code:
000076b0  2e 66 72 69 68 6f 73 74  2e 63 6f 6d 2f 22 20 63  |.frihost.com/" c|
000076c0  6c 61 73 73 3d 22 62 6f  74 74 6f 6d 5f 6c 69 6e  |lass="bottom_lin|
000076d0  6b 73 22 3e 46 72 69 68  6f 73 74 3c 2f 61 3e 2c  |ks">Frihost</a>,|
000076e0  20 3c 61 20 68 72 65 66  3d 22 68 74 74 70 3a 2f  | <a href="http:/|
What if I were to grep "class" from this ... it wouldn't work because it's split between lines.
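
The split can be reproduced on a small sample file. The sketch below is only an illustration: the sample text is made up so that "class" straddles a 16-byte hexdump row, and the alternative shown (GNU grep with -a, -b and -o on the raw bytes) is one possible workaround, not something proposed earlier in the thread:

```shell
#!/bin/sh
sample=$(mktemp)
# hexdump -C prints 16 bytes per row; here the "c" of "class" is byte 15,
# so the word is split across the first and second rows.
printf '0123456789abcd class="bottom"\n' > "$sample"

if hexdump -C "$sample" | grep -q 'class'; then
    via_hexdump=found
else
    via_hexdump=missed
fi

# GNU grep on the raw bytes: -a treats binary as text, -b prints the byte
# offset, -o prints only the matched part (so the offset is of the match).
via_grep=$(grep -a -b -o 'class' "$sample")

echo "hexdump | grep: $via_hexdump"
echo "grep -a -b -o:  $via_grep"
```

On a real disk the same grep would be pointed at the device (as root), though long runs of data without newlines can make grep use a lot of memory.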

Probably the best solution is to write a C program and use the image of the whole disk, but I'm not sure why anyone would do this.

You could also use:

Code:
find / | grep whatever
and foremost (a file-carving tool). Note that find / | grep only matches file names, not file contents.
 
Old 03-15-2009, 10:21 PM   #7
pixellany
LQ Veteran
 
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809

Rep: Reputation: 743
Touché!

In this particular brute-force method, you would need to try several different key words until you established where the file was. Some different hexdump options might help also....
 
Old 03-16-2009, 09:18 AM   #8
Completely Clueless
Member
 
Registered: Mar 2008
Location: Marbella, Spain
Distribution: Many and various...
Posts: 899

Original Poster
Rep: Reputation: 70
I wonder if a hex editor would do the job satisfactorily? Presumably this kind of program can 'see' *everything* on a disk?
 
Old 03-16-2009, 10:06 AM   #9
farslayer
LQ Guru
 
Registered: Oct 2005
Location: Northeast Ohio
Distribution: linuxdebian
Posts: 7,249
Blog Entries: 5

Rep: Reputation: 191
http://www.forensicswiki.org/wiki/The_Sleuth_Kit
The Sleuth Kit can search for keywords.

If that doesn't work for you, check out some of the other forensics tools available:
http://www.forensicswiki.org/wiki/Main_Page
 
Old 03-16-2009, 12:02 PM   #10
H_TeXMeX_H
LQ Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1301
Hey, neat, I didn't know about this kit. Now if only they supported more filesystems.
 
Old 03-16-2009, 12:08 PM   #11
Completely Clueless
Member
 
Registered: Mar 2008
Location: Marbella, Spain
Distribution: Many and various...
Posts: 899

Original Poster
Rep: Reputation: 70
Quote:
Originally Posted by farslayer
http://www.forensicswiki.org/wiki/The_Sleuth_Kit
The Sleuth Kit can search for keywords..

If that doesn't work for you check out some of the other Forensics tools available..
http://www.forensicswiki.org/wiki/Main_Page

Thanks, Farslayer. You really are a most helpful guy. I'd tip you another "thanks" but it might start to look as if you're paying me, or we're related in some way. ;-)
 
Old 03-16-2009, 01:00 PM   #12
farslayer
LQ Guru
 
Registered: Oct 2005
Location: Northeast Ohio
Distribution: linuxdebian
Posts: 7,249
Blog Entries: 5

Rep: Reputation: 191
No worries. The check is in the mail.
 
Old 03-17-2009, 06:42 AM   #13
Kenhelm
Member
 
Registered: Mar 2008
Location: N. W. England
Distribution: Mandriva
Posts: 360

Rep: Reputation: 170
The 'strings' command extracts text from binary data.
The following code scans /dev/sda for strings containing '.jpg'.
It has to be run as root. Press Ctrl+C to stop the command.
Code:
dd if=/dev/sda | strings -n 4 -t d | grep  '\.jpg'

3005553932 ElectronicsCapacitorscapacitor_codes_filestop_img6.jpg
3112021438 Sunset2.jpg, and Sunset3.jpg.
3112022948 the pictures are saved as Sunset1.jpg, Sunset2.jpg
3119203911 http://www.perl.com/graphics/perlhome_header.jpg</

# 'grep -C 2'  adds 2 lines of context before and after
dd if=/dev/sda | strings -n 4 -t d | grep -C 2 '\.jpg'
--
3005442040 Bashlinuxcommand.orghtml_textsizeof.html
3005442088 Bashlinuxcommand.orghtml_textsizeof.README.html
3005443248 Bashlinuxcommand.orgimagesxterm.jpg
3005443292 Bashlinuxcommand.orgman_pagesa2p1.html
3005443340 Bashlinuxcommand.orgman_pagesa2ps1.html
--
3005552864 ElectronicsCapacitorscapacitor_codes_filesactuators.gif
3005552928 ElectronicsCapacitorscapacitor_codes_filesArticles.gif
3005552992 ElectronicsCapacitorscapacitor_codes_filesback_green.jpg
3005553056 ElectronicsCapacitorscapacitor_codes_filesback_stone.jpg
3005553120 ElectronicsCapacitorscapacitor_codes_filesBasics.gif
3005553180 ElectronicsCapacitorscapacitor_codes_filescp51.gif
--
-n 4 means only extract strings of 4 or more characters.
-t d means precede each extracted string with the decimal offset of its first character.
(This isn't the offset of '.jpg' unless it's at the start of the string.)
I'm using the version of 'strings' supplied with Mandriva.
The version supplied with Puppy 4.1.1 does not support '-t d' for decimal offset.
It only has '-o' which gives the offset in octal.
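
As a rough sketch of the two offset points above: the snippet below works out the offset of '.jpg' itself from the string's starting offset, and converts an octal offset (as printed by '-o') to decimal. The sample file and the octal value 17 are made up for illustration, and 'strings' is assumed to support '-t d':

```shell
#!/bin/sh
sample=$(mktemp)
# Three non-printable bytes, then a file name, then a NUL terminator.
printf '\001\002\003Sunset2.jpg\000\004' > "$sample"

# strings reports where the whole string starts (offset 3 here).
# awk strips the offset column and adds the position of ".jpg" in the string.
match_off=$(strings -n 4 -t d "$sample" |
    awk '/\.jpg/ { off = $1; s = $0; sub(/^ *[0-9]+ /, "", s);
                   print off + index(s, ".jpg") - 1 }')
echo "offset of .jpg: $match_off"

# Octal to decimal: printf reads a number with a leading 0 as octal.
oct=17
dec=$(printf '%d\n' "0$oct")
echo "octal $oct = decimal $dec"
```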

n.b. The dd command is dangerous; typing 'of=$device' instead of 'if=$device' can destroy the $device file system.
 
Old 03-17-2009, 02:59 PM   #14
Completely Clueless
Member
 
Registered: Mar 2008
Location: Marbella, Spain
Distribution: Many and various...
Posts: 899

Original Poster
Rep: Reputation: 70
Quote:
Originally Posted by Kenhelm
n.b. The dd command is dangerous; typing 'of=$device' instead of 'if=$device' can destroy the $device file system.

Good point about the dangers of transposing your input and output files, and another good reason why 'dd' should be rewritten to become rather more 'intelligent'.

Thanks for the 'strings' command suggestion. I've never heard of it but will certainly check it out.

CC.
 
Old 03-17-2009, 03:01 PM   #15
H_TeXMeX_H
LQ Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1301
Quote:
Originally Posted by Completely Clueless
Good point about the dangers of transposing your input and output files and another good reason why 'dd' should be re-written to become rather more 'intelligent.'

Heh heh, I doubt it. But you could probably write a wrapper script if you knew what you wanted to protect against.
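
One possible shape for such a wrapper, as a sketch only. The function name safe_dd and the ALLOW_BLOCK_WRITE override are invented for this example; the rule it enforces (refuse any of= target that is a block device) is just one guess at what you might want to protect against:

```shell
#!/bin/sh
# safe_dd: hypothetical wrapper that refuses to write to a block device
# unless ALLOW_BLOCK_WRITE=1 is set in the environment.
safe_dd() {
    for arg in "$@"; do
        case $arg in
            of=*)
                target=${arg#of=}
                # [ -b ] is true only if the target is a block device.
                if [ -b "$target" ] && [ "${ALLOW_BLOCK_WRITE:-0}" != 1 ]; then
                    echo "safe_dd: refusing to write to block device $target" >&2
                    return 1
                fi
                ;;
        esac
    done
    dd "$@"
}

# Demo on ordinary files, which the wrapper lets through unchanged.
src=$(mktemp)
dst=$(mktemp)
printf 'hello\n' > "$src"
safe_dd if="$src" of="$dst" 2>/dev/null && echo "copied: $(cat "$dst")"
```

With this in place, a slip such as safe_dd of=/dev/sda would be rejected before dd ever runs, while reads like safe_dd if=/dev/sda still work.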
 
  

