Linux - General: This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.
The fastest of my external USB drives seems to max out at 24 MB/s; the slower one is approximately 18 MB/s.
The _most_ important item affecting transfer speed with dd is the block size (the bs parameter). Some HDs are devastated by the default of 512 bytes. Increasing bs to 2048 makes a huge difference, doubling it to 4096 usually gives some additional increase, and 8192 starts to be in the "did I notice anything?" range.
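As a quick self-contained illustration (the /tmp paths are placeholders; on real hardware you would read from a device node and time the difference):

```shell
# Create a 16 MiB test file, then copy it with a small and a large block
# size. On real hardware, timing these two copies shows the difference
# the bs parameter makes; the resulting files are byte-identical.
dd if=/dev/zero of=/tmp/bs_test.src bs=1M count=16 2>/dev/null
time dd if=/tmp/bs_test.src of=/tmp/bs_test.small bs=512  2>/dev/null
time dd if=/tmp/bs_test.src of=/tmp/bs_test.large bs=4096 2>/dev/null
cmp /tmp/bs_test.src /tmp/bs_test.large && echo "copy OK"
```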
In my experience the parameters referred to above (conv=notrunc,noerror) do not impact the writing speed. They are not about error checking; they are about error handling. Error checking is done no matter what; the parameters determine what to do when an error is detected.
If you are creating a clone of a partition, I strongly suggest using the conv=sync,noerror parameters with dd.
The reason being: if dd experiences a read error and cannot read a block, it will write an empty (null) block to that same spot on the destination. Otherwise it will not write anything at all to that spot, and everything after the point of the error will be off by one block.
Examples
dd without conv=sync,noerror
Code:
block 1 2 3 4 5
Input AA BB error DD EE
output AA BB DD EE
dd conv=sync,noerror
Code:
block 1 2 3 4 5
Input AA BB error DD EE
output AA BB null DD EE
If everything is off by 1 block (sector), the filesystem is fubar after that point. I learned that the hard way (failing HD, cloned without the sync parameter).
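You can see the padding behaviour without a failing drive: conv=sync pads every input block shorter than bs with NULs, which is exactly what keeps the copy block-aligned after an error. A minimal sketch on an ordinary file (the /tmp paths are placeholders):

```shell
# A 5000-byte input copied with bs=4096 and conv=sync: the first read
# fills a whole 4096-byte block, the second (904 bytes) is padded with
# NULs up to 4096, so the output is exactly two blocks = 8192 bytes.
printf '%5000s' ' ' > /tmp/sync_test.src
dd if=/tmp/sync_test.src of=/tmp/sync_test.dst bs=4096 conv=sync,noerror 2>/dev/null
stat -c %s /tmp/sync_test.dst    # prints: 8192
```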
As for the suggested notrunc parameter: Why would one need that?
I also assume that you are reformatting the external hdd between running the dd commands in your first post - otherwise you are trying to overwrite existing data, which while not usually a disaster, does mean that the process will be slower than writing to a clean partition.
Why would that be? In the OP dd was writing to a file so was not overwriting any data.
Could that be significant? The reported dd behaviour is unusual and so is changing the file system cache ... ? I do understand that it should (TM) be irrelevant but there is something unusual going on here. OTOH few people send dd USR1 signals to see what's going on; maybe this behaviour is normal; I'm just looking for ~100 GB space on a USB HDD to test it.
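For anyone who wants to watch a transfer this way: GNU dd prints its statistics to stderr when it receives SIGUSR1. A harmless self-contained sketch (in real use you would send the signal to your actual dd PID):

```shell
# Start a long-running dd in the background, then poke it with SIGUSR1.
# GNU dd responds by printing records in/out and the transfer rate so
# far to stderr, without interrupting the copy.
dd if=/dev/zero of=/dev/null bs=1M count=1000000 2>/tmp/dd_progress.log &
DD_PID=$!
sleep 1
kill -USR1 "$DD_PID"          # progress report appears in the log
sleep 1
kill "$DD_PID" 2>/dev/null    # stop the demo transfer
grep copied /tmp/dd_progress.log
```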
Drive write rates for SATA vary according to how full the drive is. I don't know why this is, but I've seen it in test after test after test. I think it has to do with the relatively simple-minded way the drive decides where to write next and as there is more stuff on the drive, the time spent figuring out where to write next increases.
My *guess* is that the dominant effect here is a SATA drive filling up and therefore slowing its writes.
I can't vouch for the first two, as the last one is what I use. The first two don't work for me, nor are they listed in the syntax of my kill command; check 'man kill' for the syntax.
kill is a bash builtin with slightly different syntax from the external command and hence the man page.
For convenience
Code:
c@CW8:~$ type kill
kill is a shell builtin
c@CW8:~$ help kill
kill: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
Send the processes named by PID (or JOBSPEC) the signal SIGSPEC. If
SIGSPEC is not present, then SIGTERM is assumed. An argument of `-l'
lists the signal names; if arguments follow `-l' they are assumed to
be signal numbers for which names should be listed. Kill is a shell
builtin for two reasons: it allows job IDs to be used instead of
process IDs, and, if you have reached the limit on processes that
you can create, you don't have to start a process to kill another one.
c@CW8:~$ /bin/kill --help
usage: kill [ -s signal | -p ] [ -a ] pid ...
kill -l [ signal ]
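One place the builtin's extra convenience shows up is the signal-name lookup mentioned in the help text above:

```shell
# The builtin's -l option translates between signal numbers and names
# in both directions.
kill -l 9     # prints: KILL
kill -l 15    # prints: TERM
kill -l TERM  # prints: 15
```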
Quote:
My *guess* is that the dominant effect here is a SATA drive filling up and therefore slowing its writes.
Promising theory! Perhaps it's not so much the SATA drive as the NTFS file system, which is prone to fragmentation. In this post kooee answered pixellany's question "How much free space is on the target drive?" with 300GB.
@kooee: I'd like to develop pixellany's question. How big is the target file system, how much free space does it have before starting dd and how fragmented is it?
I suspect that this is actually in ntfs-3g. I saw this happening last weekend when I was resizing an NTFS file system from a ~40GB special image created using ntfsclone. I was reading the special image from an NTFS format external USB drive and then writing the full image back to the same drive. Simply reading the special image from the drive was rapid, but writing back to the drive became slower as the file grew larger. The 'kill -USR1 $PID' command showed that the write rate was falling, and this was reflected in the LED light on the drive: a longer pause between bursts of write activity was observed. I put it down to the way that ntfs-3g handles the index data in an NTFS file system.
I have not seen the same behaviour on the ext3 partition of the same external USB drive.
@catkin regarding the MRU cache: no, no, I have implemented an MRU cache in C for a simulated software environment before, not for a hard drive.
Regarding the target size and fragmentation: it is a 400GB hard drive with 300GB of free space. As for fragmentation, I am unsure.
I did find that the speed issue is largely due to the external hard drive's capabilities, and most likely fragmentation on the drive, as catkin has stated.
I did run this command on the internal hard drive I wanted to wipe.
Code:
dd if=/dev/zero of=/dev/sdc
and I got a fairly reasonable transfer rate of 20 MB/s which declined to 15 MB/s over the total 160GB drive... so there is still a decline, just not in the rapid linear fashion...
So now it is down to the target and source, as others have mentioned as a possibility?
Quote:
I suspect that this is actually in ntfs-3g. I saw this happening last weekend when I was resizing an NTFS file system from a ~40GB special image created using ntfsclone. I was reading the special image from an NTFS format external USB drive and then writing the full image back to the same drive. Simply reading the special image from the drive was rapid, but writing back to the drive became slower as the file grew larger. The 'kill -USR1 $PID' command showed that the write rate was falling, and this was reflected in the LED light on the drive: a longer pause between bursts of write activity was observed. I put it down to the way that ntfs-3g handles the index data in an NTFS file system.
I have not seen the same behaviour on the ext3 partition of the same external USB drive.
Humm... this is interesting. I will dig deeper into ntfs-3g; I honestly do not have a deep knowledge of this file system. Although, I still do see declining transfer rates for ext3 and NTFS file systems... not as drastic, but still interesting. Clearly, as the drives become more crowded with data it will take longer to write to them. But I was under the impression that the effect would be fairly minuscule, especially with SSDs, as opposed to the 20%+ declines in transfer rates (SSDs) and 80%+ for the mechanical drives... I will test some more when I get home from work.
Quote:
If you are creating a clone of a partition, I strongly suggest using the conv=sync,noerror parameters with dd.
The reason being: if dd experiences a read error and cannot read a block, it will write an empty (null) block to that same spot on the destination. Otherwise it will not write anything at all to that spot, and everything after the point of the error will be off by one block.
Although this post is more than two years old, the advice is dangerous enough to merit correction.
The problem is that conv=noerror makes dd ignore read errors. That is useful (in conjunction with sync) when copying from a failing drive, but it is dangerous when cloning a good partition, as per the topic of this thread. If there is a read error on the input, using conv=noerror results in a defective copy without any indication of error.
To add weight to my opinion, here's what syg00 wrote in this LQ post:
Quote:
Originally Posted by syg00
"dd" is absolutely unequivocally the worst option for backup. Especially with "noerror".