LinuxQuestions.org
Linux - Newbie: This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!

Old 08-15-2016, 08:44 AM   #1
hack3rcon
Senior Member
 
Registered: Jan 2015
Posts: 1,034

Rep: Reputation: Disabled
How can I determine the "bs" parameter of the "dd" command?


Hello.
How can I determine the "bs" parameter of the "dd" command?

Tnx.
 
Old 08-15-2016, 08:55 AM   #2
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,661

Rep: Reputation: 1256
You don't have to.

The "bs" option specifies the size used for block I/O: basically, just a buffer size. If you are copying disks you can use "bs=1M" and things go fairly fast. If you have lots of memory, use "bs=10M".

Getting it "wrong" only causes a short-record message at the end (for either reading or writing).

The only time the bs option is really of any use is if you have to do odd things like convert data. Sometimes you have to specify the "ibs" (input block/buffer size) as different from the "obs" (output block/buffer size). But these are rarely used these days (I used to use it to convert EBCDIC to ASCII now and then; the block sizes were used to determine the size of the records to be processed, and that adds a "cbs" option for the conversion).

By default, specifying the "bs" option causes dd to use the same value for "ibs" and "obs".

If you omit the block size, dd will use 512 bytes, which tends to make copies rather slow.
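As a minimal sketch of that (using throwaway files under /tmp rather than real devices):

```shell
# Create an 8 MiB test file, then copy it with a 1 MiB buffer.
# With bs=1M, dd issues one read and one write per mebibyte instead of
# one per 512 bytes, which is where the speedup comes from.
dd if=/dev/zero of=/tmp/dd_src.img bs=1M count=8
dd if=/tmp/dd_src.img of=/tmp/dd_dst.img bs=1M
cmp /tmp/dd_src.img /tmp/dd_dst.img && echo "copy OK"
rm -f /tmp/dd_src.img /tmp/dd_dst.img
```

For a real disk copy you would substitute device paths (e.g. /dev/sdX) for the file names.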

Last edited by jpollard; 08-15-2016 at 08:57 AM.
 
1 member found this post helpful.
Old 08-15-2016, 09:28 AM   #3
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,528

Rep: Reputation: 1056
The default block size is 512 (bytes). That will be pretty slow.

You can use the system BUFSIZ, 8192 bytes, which is the default size the system uses for buffering data. That will be pretty fast... well, faster than 512.

You can check what your BUFSIZ is -- it most likely is 8192 -- by
Code:
grep  BUFSIZ /usr/include/*.h
which, on a 64-bit system, will show you
Code:
grep BUFSIZ /usr/include/*.h
/usr/include/_G_config.h:#define _G_BUFSIZ 8192
/usr/include/expect.h:#ifndef BUFSIZ
/usr/include/ldap.h:#define LDAP_OPT_X_SASL_MAXBUFSIZE		0x6109
/usr/include/libdevmapper.h:#define DM_FORMAT_DEV_BUFSIZE	13	/* Minimum bufsize to handle worst case. */
/usr/include/libio.h:#define _IO_BUFSIZ _G_BUFSIZ
/usr/include/pi-error.h:	PI_ERR_DLP_BUFSIZE		= -300,	/**< provided buffer is not big enough to store data */
/usr/include/pi-palmpix.h:   /* This callback should read record #RECNO into BUFFER and BUFSIZE, and
/usr/include/stdio.h:#ifndef BUFSIZ
/usr/include/stdio.h:# define BUFSIZ _IO_BUFSIZ
/usr/include/stdio.h:   Else make it use buffer BUF, of size BUFSIZ.  */
(the /usr/include/_G_config.h line defining _G_BUFSIZ 8192 is the one).

You can experiment and double or triple the BUFSIZ value and use that in dd, but you may hit diminishing returns; best to try it and see what happens. You can set it to some megabytes, but you can overdo it.
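For instance (a sketch; the 4x multiple of BUFSIZ is an arbitrary choice, and the /tmp file is just for testing):

```shell
# BUFSIZ on glibc is typically 8192 bytes; try a multiple of it as the dd buffer.
# 8192 * 4 = 32768 bytes per block; 512 blocks = 16 MiB written.
dd if=/dev/zero of=/tmp/bufsiz_test.img bs=32768 count=512
rm -f /tmp/bufsiz_test.img
```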

Hope this helps some.

Last edited by tronayne; 08-15-2016 at 10:03 AM. Reason: Fumble finger 512 into 5120. Arrgghh!
 
1 member found this post helpful.
Old 08-15-2016, 09:40 AM   #4
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 3,245

Rep: Reputation: 1403
Quote:
Originally Posted by tronayne View Post
The default block size is 5120 (bytes). That will be pretty slow.
It is 512 bytes. The system call overhead with that is a killer. A long time ago, when drives and machines were a lot slower, I found that there was little improvement in increasing the blocksize above 64K. These days, with Gigabytes of memory available, I typically use 256K, 512K, or 1M, as suits my mood at the time (i.e., for no particular reason).

Unless you are accessing a tape drive or other device where the blocksize directly affects the I/O operation, it really doesn't matter except for the performance penalty at small block sizes.
 
Old 08-15-2016, 10:06 AM   #5
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,528

Rep: Reputation: 1056
Quote:
Originally Posted by rknichols View Post
It is 512 bytes.
Yes, indeedy-do, 'tis 512, not 5120.

I hate being old and half blind and fumble-fingered.

Thanks for pointing that out.
 
Old 08-15-2016, 02:58 PM   #6
jefro
Moderator
 
Registered: Mar 2008
Posts: 15,876

Rep: Reputation: 2302
There are a few reasons one might wish to know the block size.
One is a type of media where you want exactly the block size the device itself uses; some devices won't copy properly unless you do a block-for-block copy.
Another is to speed up a dd operation.
Another is using dd to extract a small portion of the media as opposed to the entire drive.
 
Old 08-16-2016, 04:02 AM   #7
hack3rcon
Senior Member
 
Registered: Jan 2015
Posts: 1,034

Original Poster
Rep: Reputation: Disabled
OK. So it is kind of optional.
For an external HDD with 1TB capacity, what "bs" size is good?
 
Old 08-16-2016, 05:57 AM   #8
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,661

Rep: Reputation: 1256
That depends on how much memory you have and on your I/O controllers.

1MB works fairly well; 8MB can work better. Why? Lower controller contention when reading large blocks. But for your specific platform, experimentation is the best way to find out.
 
Old 08-16-2016, 06:28 AM   #9
hazel
Member
 
Registered: Mar 2016
Location: Harrow, UK
Distribution: Debian, Crux, LFS, AntiX, NuTyX
Posts: 839

Rep: Reputation: 335
According to the dd info page:

Quote:
Any block size you specify via ‘bs=’, ‘ibs=’, ‘obs=’, ‘cbs=’ should
not be too large—values larger than a few megabytes are generally
wasteful or (as in the gigabyte..exabyte case) downright
counterproductive or error-inducing.
 
1 member found this post helpful.
Old 08-16-2016, 07:45 AM   #10
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,661

Rep: Reputation: 1256
Again, "too large" is determined by the local hardware. If I'm copying a 32GB SD card (for a Raspberry Pi) I just might use 1GB, though 512MB works just fine. It isn't "too large"; after all, I have 8GB. But more than that? Nope, that WOULD be "too large".

It also helps to have only one device on that USB controller...
 
Old 08-16-2016, 09:02 AM   #11
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 3,245

Rep: Reputation: 1403
Quote:
Originally Posted by jpollard View Post
Again, "too large" is determined by the local hardware. If I'm copying a 32GB SD card (for a Raspberry Pi) I just might use 1GB,
That's getting into "counterproductive" territory. Why? Because writing to the destination cannot begin until that first 1GB block has been completely read from the source. You are losing some of the benefit of overlapped input and output operations, though that's not going to matter if both devices are on the same controller (no overlap possible) or if there is a vast difference in speed between the two devices (the slower device will dominate). For the case you mentioned, the difference isn't going to be huge (at most, 1/32 of the total time), but about the only benefit of going that large is that "Because I can" good feeling.
 
Old 08-16-2016, 09:03 AM   #12
hack3rcon
Senior Member
 
Registered: Jan 2015
Posts: 1,034

Original Poster
Rep: Reputation: Disabled
Thank you a lot. I understand.
 
Old 08-16-2016, 10:44 AM   #13
jpollard
Senior Member
 
Registered: Dec 2012
Location: Washington DC area
Distribution: Fedora, CentOS, Slackware
Posts: 4,661

Rep: Reputation: 1256
Quote:
Originally Posted by rknichols View Post
That's getting into "counterproductive" territory. Why? Because writing to the destination cannot begin until that first 1GB block has been completely read from the source. You are losing some of the benefit of overlapped input and output operations, though that's not going to matter if both devices are on the same controller (no overlap possible) or if there is a vast difference in speed between the two devices (the slower device will dominate). For the case you mentioned, the difference isn't going to be huge (at most, 1/32 of the total time), but about the only benefit of going that large is that "Because I can" good feeling.
Actually, no. The first GB does have to be read... but after that the reads/writes are overlapped, and with less I/O interaction. Since the SATA controller is faster than the USB, you now get to keep the USB device busy.

But again, it depends on the controllers. If you think it is going slow with a big block size, use a smaller one.

The elapsed time remains about the same, but the system overhead gets bigger with the smaller block sizes.

For most controllers, 512MB is sufficient, and the overhead doesn't matter when you are the only user.
 
Old 08-16-2016, 12:49 PM   #14
rknichols
Senior Member
 
Registered: Aug 2009
Distribution: CentOS
Posts: 3,245

Rep: Reputation: 1403
Quote:
Originally Posted by jpollard View Post
Actually, no. The first GB does have to be read... but after that the reads/writes are overlapped
I think that's pretty much what I said.

Going from SATA to a much slower USB device qualifies as a vast difference in speed. Delaying the start of output by the 5 to 10 seconds it takes to read that first 1GB block is insignificant. And BTW, unless you use "iflag=direct" and "oflag=direct" the kernel is going to be buffering all your data anyway.
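For reference, a sketch of those flags (GNU dd on Linux; /dev/sdX and /dev/sdY are placeholder device names, and O_DIRECT is not supported on every filesystem):

```shell
# Bypass the kernel page cache on both the read and the write side,
# so the bs= value is the size actually handed to each device.
dd if=/dev/sdX of=/dev/sdY bs=1M iflag=direct oflag=direct
```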

Last edited by rknichols; 08-16-2016 at 12:52 PM.
 
Old 08-16-2016, 02:50 PM   #15
jefro
Moderator
 
Registered: Mar 2008
Posts: 15,876

Rep: Reputation: 2302
There are a few web pages out there that have a quick way to test bs sizes and give you a fairly good number to use. Or you could wing it from guesses based on some quick tests: run the command with a small bs, wait 60 seconds, stop it, and see how much it moved.
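A quick sketch of such a test, assuming GNU dd; conv=fdatasync forces the data to disk so the timing is honest, and each bs/count pair writes the same 64 MiB:

```shell
# Time the same 64 MiB write at three block sizes and compare the
# throughput figure dd prints on its final status line.
for spec in "4K 16384" "64K 1024" "1M 64"; do
    set -- $spec
    echo "bs=$1:"
    dd if=/dev/zero of=/tmp/dd_bench.img bs=$1 count=$2 conv=fdatasync 2>&1 | tail -n 1
done
rm -f /tmp/dd_bench.img
```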
 