Linux - General
This Linux forum is for general Linux questions and discussion.
If it is Linux related and doesn't seem to fit in any other forum, then this is the place.
Using pipes, I dd a meg of data from the start of a partition, compress it with bzip2, and then save it to a file.
With all this piping going on, it takes ages (compared to how long something this simple should take), and on top of that it reads and compresses the entire 1 MB. The compression is not as good as it could be.
I can remember reading about an intelligent dd a while back.
One that skips zeros or something. Does anyone know about it?
What is the fastest method, with the best compression, for doing something like this?
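For reference, here is a minimal sketch of the pipeline being described, plus the closest thing I know of to a "skips zeros" dd: GNU dd's `conv=sparse` option. The device name is hypothetical (substitute your own partition), and note that `conv=sparse` only writes zero blocks as holes in the output file; it does not speed up the read itself:

```shell
# Hypothetical device -- replace /dev/sda1 with your actual partition.
# Read 1 MiB from the start of the partition and bzip2-compress it:
dd if=/dev/sda1 bs=1M count=1 | bzip2 -9 > start.bz2

# GNU dd can punch holes where the input is all zeros (sparse output),
# which saves space on disk but still reads every byte:
dd if=/dev/sda1 of=start.img bs=1M count=1 conv=sparse
```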
Quote: "Using pipes I dd a meg of data from the start of a partition, bz2 compress it, and then save it to a file."
You may use: dd if=/dev/xxxx bs=xxx count=xxxx | gzip -v9 > whatever.gz
... Looks the same to me.
This is what my initial question was based on, and it's exactly what I'm complaining about. What if I want to dd 10 MB? It takes almost 30 seconds, because of the piping and so on.
I'm looking for a faster and better way.
I doubt it's the piping that's making it take so long. bzip2 compresses at a higher ratio, but in return it's also a lot slower. I also doubt a smarter dd that strips out zeros would make the result smaller: a compressor gets its best ratio when it finds long repeating patterns, so a long run of zeros would compress very well anyway.
But on a filesystem, deleting a file doesn't zero out its blocks; it just frees the inode entries, and the data stays physically on the drive. So you'll always have fairly random data on the disk, which is hard to compress at a good ratio.
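A standard workaround for this (not mentioned in the thread, just a common trick) is to overwrite the free space with zeros before imaging, so the leftover deleted data becomes compressible runs of 0x00. The mount point below is hypothetical:

```shell
# /mnt/target is a made-up mount point for the filesystem in question.
# Fill free space with zeros; dd exits nonzero when the disk is full,
# which is expected here, so we tolerate the failure:
dd if=/dev/zero of=/mnt/target/zerofill bs=1M 2>/dev/null || true
sync
# Remove the filler file, leaving the freed blocks zeroed:
rm -f /mnt/target/zerofill
```

Note this briefly fills the filesystem, so avoid it on a live system that other processes are writing to.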
Instead of bzip2 you could try gzip, as Mara suggested. It's faster, but you lose some compression ratio. You'll just have to run a few tests to see which speed/compression trade-off suits you.
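A quick way to run that comparison is to grab one sample and feed it to both compressors, timing each. The source device is hypothetical; the `time` output and `ls -l` sizes tell you the trade-off:

```shell
# Grab one 1 MiB sample (replace /dev/sda1 with your partition):
dd if=/dev/sda1 bs=1M count=1 2>/dev/null > /tmp/sample.img

# Time both compressors at maximum level on the same input:
time gzip  -9 -c /tmp/sample.img > /tmp/sample.gz
time bzip2 -9 -c /tmp/sample.img > /tmp/sample.bz2

# Compare the resulting sizes:
ls -l /tmp/sample.gz /tmp/sample.bz2
```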
Granted, there is overhead in piping data from program to program, but it's a tiny fraction of the time spent reading from the disk or compressing the data. I've moved tens of gigs through pipes with no measurable loss of speed, i.e. it went as fast as the disk could handle.
Make sure you use a reasonable block size. I'd recommend at least 16k.
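You can see the block-size effect yourself by reading the same file with a tiny block size versus 16k; the small-block run makes far more read/write syscalls. The paths below are illustrative:

```shell
# Build a 10 MiB scratch file to read back:
dd if=/dev/zero of=/tmp/bs_demo bs=1M count=10 2>/dev/null

# Same data, two block sizes -- compare the timings:
time dd if=/tmp/bs_demo of=/dev/null bs=512   # many small syscalls
time dd if=/tmp/bs_demo of=/dev/null bs=16k   # far fewer syscalls

rm -f /tmp/bs_demo
```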
There's a program called mbuffer that does much of what dd does, but it shows the transfer rate as it runs.
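mbuffer drops into the middle of a pipeline, buffering the stream and reporting throughput. A sketch, assuming mbuffer is installed and using a hypothetical device; check your version's man page for the exact flags:

```shell
# -m sets the in-memory buffer size; mbuffer prints the transfer
# rate to stderr as the pipeline runs. /dev/sda1 is a placeholder.
dd if=/dev/sda1 bs=1M count=10 | mbuffer -m 64M | bzip2 -9 > start.bz2
```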