I see that mysqldump outputs the databases as SQL statements that re-create them. That makes sense. But I wonder whether it scales. If I have several terabytes of data, and therefore several terabytes of SQL dumped, and need to restore it, can that be done without needing several terabytes of RAM to cache the SQL, or several terabytes of extra disk space beyond the space the re-created databases will occupy?
I guess if it can handle the SQL as a serialized data stream, which it just deserializes as it goes, that could work. SQL just doesn't seem like a language that lends itself to that very easily. It seems to me that a serialization format like JSON (not necessarily JSON exactly) would be a better fit.
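What I'm picturing is something like this (mydb and the file name are just placeholders), with the dump and the restore both running as plain streams:
Code:
# mysqldump writes the CREATE/INSERT statements to stdout as it reads
# each table, so the dump itself is a stream.
mysqldump mydb > mydb.sql
# On restore, the mysql client reads and executes statements from its
# input one at a time, so (I assume) the whole file never has to sit in RAM.
mysql mydb < mydb.sql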
It will probably kill your system on I/O with that much data being read by MySQL, but it should complete (maybe next year) as long as you are doing it via the mysql CLI over SSH and not trying to use something like phpMyAdmin.
I have never heard of anyone with that much data inside of MySQL; that's pretty nuts.
You can also look at the 'quick' option for the mysql CLI: it reads one row of data at a time instead of buffering the entire result set in memory (the MySQL default).
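Something like this, for example (mydb and big_table are just placeholder names):
Code:
# mysql client: print each row as it is received instead of buffering
# the whole result set in memory first.
mysql --quick -e "SELECT * FROM big_table" mydb
# mysqldump has the same flag: fetch rows from the server one at a time
# rather than buffering each whole table before writing it out.
mysqldump --quick mydb > mydb.sql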
I think you can tune mysqldump to create multi-row inserts, saving disk space. I don't think it puts everything in memory and then writes it to disk; I think it streams in and out, to and from disk.
For storage, most people pipe the dump through gzip or bzip2.
Also, consider mysqlhotcopy for MyISAM or ARCHIVE tables; otherwise you might need MySQL Enterprise Backup for InnoDB: https://dev.mysql.com/doc/mysql-back...p-methods.html
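Roughly like this (database and file names are placeholders); --extended-insert is what produces the multi-row INSERT statements, and the pipe means the dump only ever hits disk compressed:
Code:
# Multi-row INSERTs (--extended-insert) make the dump smaller and much
# faster to reload; it is normally on by default as part of --opt.
mysqldump --extended-insert --quick mydb | gzip > mydb.sql.gz
# bzip2 is a drop-in alternative, slower but usually a bit smaller:
mysqldump mydb | bzip2 > mydb.sql.bz2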
I do use "xz -9" on my dumps. Saves more space than gzip or bzip2 do.
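In practice that just means something like this (database and file names are placeholders):
Code:
# Compress the dump stream with xz at its highest level.
mysqldump mydb | xz -9 > mydb.sql.xz
# Restore by decompressing back into a stream for the mysql client.
xz -dc mydb.sql.xz | mysql mydb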
I would like to see a tool that is the SQL equivalent of rsync. It would be nice if it could work across different database types (for example, to do a one-time sync of all the tables of one database in MySQL over to a specified database in Postgres).
I think finding an existing tool for that is unlikely (but not impossible; try sourceforge).
However, I have written stuff like that in the past for data take-ons etc. using Perl.
You can open multiple DB handles to different databases/schemas simultaneously, then ETL (Extract, Transform, Load) the data from one to the other.
I've even had ones that run continuously, sort of like an rsync, if you like...
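Even without Perl, a crude one-off version of the MySQL-to-Postgres case can be sketched at the shell level (table, column and database names here are made up, and NULL handling and escaping would need real attention):
Code:
# Extract: -B gives tab-separated batch output, -N drops the header row.
# Load: psql's \copy reads tab-separated rows from stdin by default.
mysql -B -N -e "SELECT id, name, price FROM products" mydb |
  psql -d targetdb -c "\copy products FROM STDIN"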
Quote:
I do use "xz -9" on my dumps. Saves more space than gzip or bzip2 do.
Just out of curiosity, did you compare like with like, i.e. gzip -9, bzip2 -9, xz -9? The results tend to differ depending on the exact data.
I did only a simple compression test. xz did best on size, though it was slower. From experience with these compression methods I expected xz to win on space and gzip to win on time, and that is how it turned out. Most cases are like that.
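For what it's worth, a like-with-like test can be as simple as this (dump.sql is a placeholder):
Code:
# Compress the same dump with each tool at level 9 and compare
# wall-clock time and output size.
for tool in gzip bzip2 xz; do
    echo "== $tool -9 =="
    time $tool -9 -c dump.sql > dump.sql.$tool
done
ls -lh dump.sql dump.sql.*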