Linux - General
This Linux forum is for general Linux questions and discussion.
If it is Linux related and doesn't seem to fit in any other forum, then this is the place.
Using pipes, I dd a meg of data from the start of a partition, compress it with bzip2, and then save it to a file.
With all this piping going on, it takes ages (compared to how long something this simple should take), and on top of that it reads and compresses the entire 1 MB. The compression is not as good as it could be.
I can remember reading about an intelligent dd a while back.
One that skips zeros or something. Does anyone know about it?
What is the fastest method, with the best compression, for doing something like this?
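For reference, here is a minimal sketch of the pipeline being described, plus the closest thing I know of to a "skips zeros" dd: GNU dd's `conv=sparse` option. The device name is hypothetical (substitute your own partition), and note that `conv=sparse` only writes zero blocks as holes in the output file; it does not speed up the read itself:

```shell
# Hypothetical device -- replace /dev/sda1 with your actual partition.
# Read 1 MiB from the start of the partition and bzip2-compress it:
dd if=/dev/sda1 bs=1M count=1 | bzip2 -9 > start.bz2

# GNU dd can punch holes where the input is all zeros (sparse output),
# which saves space on disk but still reads every byte:
dd if=/dev/sda1 of=start.img bs=1M count=1 conv=sparse
```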
Quote: "Using pipes I dd a meg of data from the start of a partition, bz2 compress it, and then save it to a file."
You may use: dd if=/dev/xxxx bs=xxx count=xxxx | gzip -v9 > whatever.gz
... Looks the same to me.
This is what my initial question was based on, and it's exactly what I'm complaining about. What if I want to dd 10 MB? It takes almost 30 seconds, because of the piping and so on.
I'm looking for a faster and better way.
I doubt it's the piping that's making it take so long. bzip2 compresses at a higher ratio, but in return it's also a lot slower. I also doubt a smarter dd that strips out zeros would make the result smaller: a compressor gets its best ratio when it finds long repeating patterns, so a long run of zeros would compress very well anyway.
But on a filesystem, deleting a file doesn't zero out its blocks; it just frees the inode entries, and the data stays physically on the drive. So you'll always have fairly random data on the disk, which is hard to compress at a good ratio.
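A standard workaround for this (not mentioned in the thread, just a common trick) is to overwrite the free space with zeros before imaging, so the leftover deleted data becomes compressible runs of 0x00. The mount point below is hypothetical:

```shell
# /mnt/target is a made-up mount point for the filesystem in question.
# Fill free space with zeros; dd exits nonzero when the disk is full,
# which is expected here, so we tolerate the failure:
dd if=/dev/zero of=/mnt/target/zerofill bs=1M 2>/dev/null || true
sync
# Remove the filler file, leaving the freed blocks zeroed:
rm -f /mnt/target/zerofill
```

Note this briefly fills the filesystem, so avoid it on a live system that other processes are writing to.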
Instead of bzip2 you could try gzip, as Mara suggested. It's faster, but you lose some compression ratio. You'll just have to run a few tests to see which speed/compression trade-off suits you.
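A quick way to run that comparison is to grab one sample and feed it to both compressors, timing each. The source device is hypothetical; the `time` output and `ls -l` sizes tell you the trade-off:

```shell
# Grab one 1 MiB sample (replace /dev/sda1 with your partition):
dd if=/dev/sda1 bs=1M count=1 2>/dev/null > /tmp/sample.img

# Time both compressors at maximum level on the same input:
time gzip  -9 -c /tmp/sample.img > /tmp/sample.gz
time bzip2 -9 -c /tmp/sample.img > /tmp/sample.bz2

# Compare the resulting sizes:
ls -l /tmp/sample.gz /tmp/sample.bz2
```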
Granted, there is overhead in piping data from program to program, but it's a tiny fraction of the time spent reading from the disk or compressing the data. I've moved tens of gigs through pipes with no measurable loss of speed, i.e. it went as fast as the disk could handle.
Make sure you use a reasonable block size. I'd recommend at least 16k.
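You can see the block-size effect yourself by reading the same file with a tiny block size versus 16k; the small-block run makes far more read/write syscalls. The paths below are illustrative:

```shell
# Build a 10 MiB scratch file to read back:
dd if=/dev/zero of=/tmp/bs_demo bs=1M count=10 2>/dev/null

# Same data, two block sizes -- compare the timings:
time dd if=/tmp/bs_demo of=/dev/null bs=512   # many small syscalls
time dd if=/tmp/bs_demo of=/dev/null bs=16k   # far fewer syscalls

rm -f /tmp/bs_demo
```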
There's a program called mbuffer that does much of what dd does, but it shows the transfer rate as it runs.
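mbuffer drops into the middle of a pipeline, buffering the stream and reporting throughput. A sketch, assuming mbuffer is installed and using a hypothetical device; check your version's man page for the exact flags:

```shell
# -m sets the in-memory buffer size; mbuffer prints the transfer
# rate to stderr as the pipeline runs. /dev/sda1 is a placeholder.
dd if=/dev/sda1 bs=1M count=10 | mbuffer -m 64M | bzip2 -9 > start.bz2
```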