Need the right command for this operation...

pan64 · 01-19-2015, 09:45 AM

No, that is not true. If you go byte by byte you may find occurrences that a file-level search misses, for example at track boundaries, or at the end of files (partly inside the file and partly outside it, but on the same track). If you search at the filesystem level you will never find these, but you will find others that a raw scan misses (when parts of a file are not stored next to each other).
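To make the difference concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the 8-byte block size, the file names, and the block layout are assumptions, not how any real filesystem stores data. It only shows that a raw scan and a file-level scan of the same bytes can report different hits.

```python
BLOCK = 8  # hypothetical block size in bytes, tiny on purpose

# A toy "disk" of four blocks. The layout below is invented for illustration.
disk = bytearray(4 * BLOCK)
disk[0:8] = b"call: Ja"     # first half of frag.txt
disk[8:16] = b"ok\nJane!"   # short.txt is only "ok\n"; "Jane!" is stale slack
disk[16:24] = b"ne, bye."   # second half of frag.txt, not adjacent to the first

# Toy file table: name -> (list of block numbers, file length in bytes)
files = {
    "frag.txt": ([0, 2], 16),
    "short.txt": ([1], 3),
}


def find_all(data, pattern):
    """Return every offset at which pattern occurs in data."""
    hits = []
    start = data.find(pattern)
    while start != -1:
        hits.append(start)
        start = data.find(pattern, start + 1)
    return hits


def read_file(name):
    """Reassemble a file from its blocks, trimmed to its recorded length."""
    blocks, length = files[name]
    raw = b"".join(bytes(disk[b * BLOCK:(b + 1) * BLOCK]) for b in blocks)
    return raw[:length]


# Byte-by-byte scan of the raw disk: sees the slack, misses the split match.
raw_hits = find_all(bytes(disk), b"Jane")

# File-level scan: sees the split match, never looks at the slack.
file_hits = {}
for name in files:
    offsets = find_all(read_file(name), b"Jane")
    if offsets:
        file_hits[name] = offsets

print("raw scan offsets:", raw_hits)
print("file-level hits:", file_hits)
```

The raw scan reports only the leftover copy sitting in the slack after short.txt, while the file-level scan reports only the copy inside frag.txt that is split across two non-adjacent blocks.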

rtmistler · 01-19-2015, 09:56 AM

Quote:

Originally Posted by Completely Clueless

So as not to miss any instances that may have 'slipped through the cracks' as it were - deleted files, cluster tips, that sort of thing.
But it seems it's necessary to do two sweeps, then. One byte-by-byte and another at file system level. These things are always, it seems, more complicated than at first sight!

That still doesn't make any sense. You either care about a certain file or you don't. Per Astrogeek's point, you may find innocuous matches of a pattern in binary files as well as text files. The correct thing, or shall I say the normal thing, would be to find files based on a file type filter and then see which ones meet your criteria. But you're specifying open criteria, as if to say that literally every file containing a particular sequence meets those criteria. The point being made is that by not properly qualifying which files you target for modification, you could end up destabilizing your system.
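As a rough sketch of that "normal" approach (filter by file type first, then test each candidate), something like the following would do. The function name files_matching and its parameters are my own invention for illustration, and it reads each file fully into memory, which is fine for a demo but not for huge files.

```python
from pathlib import Path


def files_matching(root, suffixes, needle):
    """Return files under root whose suffix is in suffixes and which contain needle.

    root     -- directory to search (hypothetical example parameter)
    suffixes -- set of lowercase suffixes such as {".txt"}
    needle   -- byte sequence to look for, e.g. b"Jane"
    """
    matches = []
    for path in sorted(Path(root).rglob("*")):
        # Step 1: qualify the file by type before looking inside it.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        # Step 2: only now check whether it meets the criteria.
        if needle in path.read_bytes():
            matches.append(path)
    return matches
```

For example, files_matching("/home/me/letters", {".txt"}, b"Jane") would list only text files containing the name (the directory here is just a placeholder), and would never touch binaries or system files that happen to contain the same four bytes.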

Academically, if you're looking to find literally every single sequence which matches, then you should not be doing this on your system disk but on a data disk, where you neither know nor care about the content of the file system, only the simple fact that there's data on the disk. And you should do it with the expectation that once you're done with your substitutions, the end result will likely be that the file system is no longer intact.

And a final thought: suppose you do detect every possible instance of Jane, contiguous or not, and replace each one with Mary. So what? Really, so what? You've managed to obliterate any semblance of the term Jane from a particular disk, but now you have exactly the same number of instances of the term Mary on that disk. You've done really nothing but identify and replace. And if you intend to do this to obliterate a term but keep your file system intact, then you can't do it via a byte-by-byte search. If you need to obliterate a term, say a really bad swear word has permeated everything and you wish to remove that word entirely from your system, well: take that disk, remove the partition information, write repeatedly using dd to set the entire disk to all zeros, then all ones, then data from /dev/random, and then finally partition the disk, format it, and put only clean data onto it. Even then, I'm betting Astrogeek's point holds: you can still find innocuous instances of a sequence somewhere on the disk, but they're inadvertent and/or unintentional.
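For what it's worth, the overwrite passes described above can be sketched in Python against a regular file such as a disk image. This is an illustration only, not a replacement for dd: the function name overwrite_passes and its parameters are made up here, os.urandom stands in for reading /dev/random, and os.path.getsize will not give a useful size for a real block device. It is destructive by design, so only point it at something you mean to erase.

```python
import os


def zeros(n):
    return b"\x00" * n


def ones(n):
    return b"\xff" * n


def overwrite_passes(path, fills=None, chunk_size=1024 * 1024):
    """Overwrite an existing regular file in place, once per fill function.

    fills defaults to three passes: all zeros, all ones, then random bytes.
    The file keeps its original size.
    """
    if fills is None:
        fills = [zeros, ones, os.urandom]
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for fill in fills:
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(fill(n))
                remaining -= n
            # Push this pass out to storage before starting the next one.
            f.flush()
            os.fsync(f.fileno())
```

After this, the image holds nothing of the old content, and you would still need to partition and format before putting clean data back.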