LinuxQuestions.org
Old 04-28-2014, 08:59 PM   #1
Automatic
Member
 
Registered: Mar 2013
Posts: 42

Rep: Reputation: Disabled
Corrupt X percent of a file?


I want to test rsync's ability to detect and replace a 'corrupted' file. Is there an easy way to corrupt, say, 10% of a file? I want to keep the base file largely the same (otherwise rsync will just delete the file and transfer it from scratch) but modify 10% of it, which is why I can't simply overwrite the whole thing from /dev/urandom or something.

If at all possible, it'd be great to also be able to choose between grouping the corruption together (i.e. 30% valid, 10% corrupt, 60% valid) and having random bits flipped all over the entire file.

EDIT: I suppose I could do something like this for the grouped corruption (to corrupt 10 MiB starting 500 MiB into the file):
Code:
# conv=notrunc is needed, or dd truncates the file right after the corrupted region
dd if=/dev/urandom of=mySoonToBeCorrupted.file bs=1M count=10 seek=500 conv=notrunc
I haven't tested it yet, but logically it seems like it should work. Unfortunately, dd doesn't really help for non-grouped corruption, unless I want some insane loop of inefficiency.

Last edited by Automatic; 04-28-2014 at 09:19 PM.
 
Old 04-29-2014, 03:13 AM   #2
TenTenths
Senior Member
 
Registered: Aug 2011
Location: Dublin
Distribution: Centos 5 / 6 / 7
Posts: 2,166

Rep: Reputation: 751
Quote:
Originally Posted by Automatic View Post
insane loop of inefficiency.
I've just found my new band name!


On topic though, why not just run a loop 10 times and start at a random place in the file each time?
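A minimal sketch of that loop, assuming GNU coreutils (for `stat -c` and dd's `status=none`); the filename and sizes are just demo placeholders:

```shell
#!/bin/sh
# Overwrite 10 random 1 KiB regions of a test file in place.
f=/tmp/corrupt-demo.bin
dd if=/dev/zero of="$f" bs=1M count=2 status=none    # 2 MiB file of zeros to play with
size=$(stat -c %s "$f")                              # file length in bytes (GNU stat)

i=0
while [ "$i" -lt 10 ]; do
    # random offset that leaves room for a full 1 KiB write
    r=$(od -An -N4 -tu4 /dev/urandom | tr -d ' ')
    offset=$(( r % (size - 1024) ))
    # conv=notrunc keeps the rest of the file intact; without it,
    # dd truncates the file at the end of the write
    dd if=/dev/urandom of="$f" bs=1 seek="$offset" count=1024 conv=notrunc status=none
    i=$((i + 1))
done
echo "corrupted 10 x 1 KiB regions of $f"
```

The file's length never changes, so only content-based checks (not size checks) will notice.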
 
Old 04-29-2014, 03:28 AM   #3
Doc CPU
Senior Member
 
Registered: Jun 2011
Location: Stuttgart, Germany
Distribution: Mint, Debian, Gentoo, Win 2k/XP
Posts: 1,099

Rep: Reputation: 343
Hi there,

Quote:
Originally Posted by Automatic View Post
I want to test rsync's ability to detect and replace a 'corrupted' file. Is there an easy way to corrupt, say, 10% of a file? I want to keep the base file largely the same (otherwise rsync will just delete the file and transfer it from scratch) but modify 10% of it, which is why I can't simply overwrite the whole thing from /dev/urandom or something.
I don't know all that much about how rsync really works - but in general, there are two ways to detect if a file has been altered.
  1. By checking the timestamp against a reference:
    When a file is modified, its timestamp is usually updated. However, it is possible to change the timestamp independently from the file contents, so a malevolent guy could change the file, then set the timestamp back to the original value. You wouldn't notice, and neither would programs that rely entirely on timestamp. That's why I hope rsync doesn't rely on that.
  2. By checking the actual file contents:
    This is of course the safer method, but it's much more effort. Whether you compare the original and the possibly modified file byte by byte, or compute a hash value and compare that - either way, you have to read both files entirely.

Whatever method a program uses - it doesn't make a difference whether there's just one byte of difference somewhere inside the file, or if the two files in question are totally different. So I don't quite understand your intention of corrupting a certain percentage. Tainting one single byte must be enough.

Code:
dd if=/dev/urandom of=mySoonToBeCorrupted.file bs=1M count=10 seek=500 conv=notrunc
That's basically what came to my mind, too.

[X] Doc CPU
 
Old 04-29-2014, 03:36 AM   #4
TenTenths
Senior Member
 
Registered: Aug 2011
Location: Dublin
Distribution: Centos 5 / 6 / 7
Posts: 2,166

Rep: Reputation: 751
Quote:
Originally Posted by Doc CPU View Post
Tainting one single byte must be enough.
Indeed, however that doesn't take into account the 1-in-256 chance that the byte gets "corrupted" with the very value it originally held - unless, of course, you read the value first and make sure you're not overwriting it with the same thing.

Each additional "taint" exponentially shrinks the chance of leaving the file unchanged, so the odds of actually corrupting it quickly approach certainty.
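To put numbers on that: with n independently overwritten bytes, the chance that every one happens to land on its original value is (1/256)^n. A quick check with awk:

```shell
# Chance that n single-byte overwrites from /dev/urandom all happen
# to reproduce the original byte values: (1/256)^n
for n in 1 2 4 8; do
    awk -v n="$n" 'BEGIN { printf "n=%d  p(no change)=%.3g\n", n, (1/256)^n }'
done
```

Already at n=4 the failure probability is down around 10^-10, so a handful of taints is plenty.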

Of course an easy way to corrupt a file would be to append anything to it, if you're not worried about the length changing.

Code:
echo "Oooh, I Broke This File, I'm a BAD Admin" >> testfile
 
Old 04-29-2014, 08:38 AM   #5
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,599

Rep: Reputation: 1241
You can easily use dd to corrupt a single byte, though it takes two dd commands.

The keys are:

The record length has to be one
The offset position is in bytes

The first dd command reads the byte, which you store in a shell variable (remember to convert the byte to a number).

Add one to it (this is the corruption) and mask it back to a byte (AND with 255, or equivalently subtract 256 if the result is >= 256).

Then echo that byte (remember to format it as a raw byte again) into a second dd using the same record length and offset position.

I know I'm lazy - this is more complicated using bash than a very short C program...
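A sketch of that two-dd approach, assuming GNU coreutils; the filename and offset here are demo placeholders:

```shell
#!/bin/sh
# Flip one byte of a file using two dd commands, as described above.
f=/tmp/flip-demo.bin
printf 'hello world' > "$f"
offset=6                       # corrupt the 7th byte (the 'w')

# dd #1: read the single byte at the offset, convert it to a number
orig=$(dd if="$f" bs=1 skip="$offset" count=1 status=none | od -An -tu1 | tr -d ' ')

# the "corruption": add one, wrap around so the result stays a byte
new=$(( (orig + 1) % 256 ))

# dd #2: write the byte back at the same offset;
# conv=notrunc leaves the rest of the file alone
printf "\\$(printf '%03o' "$new")" |
    dd of="$f" bs=1 seek="$offset" count=1 conv=notrunc status=none

cat "$f"; echo               # -> hello xorld
```

Reading the value first also sidesteps the 1-in-256 collision mentioned above, since old+1 is guaranteed to differ from old.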
 
Old 04-29-2014, 09:48 AM   #6
Beryllos
Member
 
Registered: Apr 2013
Location: Massachusetts
Distribution: Debian
Posts: 304

Rep: Reputation: 121
Quote:
Originally Posted by Automatic View Post
I want to test rsync's ability to detect, and, replace, a 'corrupted' file...
Quoting from man rsync:
Rsync finds files that need to be transferred using a "quick check"
algorithm (by default) that looks for files that have changed in size
or in last-modified time...
and
-c, --checksum

This changes the way rsync checks if the files have been changed
and are in need of a transfer. Without this option, rsync uses
a "quick check" that (by default) checks if each file’s size and
time of last modification match between the sender and receiver.
This option changes this to compare a 128-bit checksum for each
file that has a matching size...
Based on that, I expect rsync will fail to detect data corruption in its default mode (as long as the file's size and modification time are unchanged), but will detect corruption, even a single byte, in checksum mode.

Well, almost always. There is a small probability that two files with different contents will generate the same hash. I wouldn't worry much about that, because there is another problem: if the hashes don't match, that alone cannot tell which copy of the file was corrupted. In that situation, rsync will simply copy the source file to the destination. If you need protection against random changes to files, you need something more sophisticated. I think md5deep might be able to help there.
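One way to tell which side went bad is to keep a reference checksum from when the file was known good, then verify each copy against it. A sketch with plain md5sum (md5deep works along the same lines); the filenames are demo placeholders:

```shell
#!/bin/sh
# Keep a reference checksum from when the file was known good,
# then verify against it to learn which copy is corrupted.
cd /tmp
printf 'important data' > file.bin
md5sum file.bin > file.md5     # reference, taken while the file was good

# later, one byte gets flipped (simulated corruption)
printf 'X' | dd of=file.bin bs=1 seek=0 count=1 conv=notrunc status=none

# verification now fails, so we know *this* copy is the bad one
if md5sum -c --quiet file.md5 2>/dev/null; then
    echo "file intact"
else
    echo "corruption detected"
fi
```

Running the same check against the other replica tells you which copy to restore from.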
 
  

