Linux - Software
This forum is for Software issues. Having a problem installing a new program? Want to know which application is best for the job? Post your question in this forum.
I need to update our local MySQL DB server with data from the live server on a daily basis ...
The simplest approach is to use the mysqldump utility on the live server to create a dump file and then to import that dump file locally. This can be done using a simple shell script.
I could also go down the road of database replication using master-slave.
There are many articles on how best to do this ...
Basically we have two DB servers - the local one needs to be kept in sync with the live one ...
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,541
Rep:
The two methods you've talked about are perfectly usable and most likely will work just fine.
One other way you might consider -- and this depends heavily on your data -- is to unload only what has changed or been added periodically from the live data base to the local data base.
That requires selecting from a table with time stamps, a serial number or some other column that would indicate an addition (a serial number) or a change (a time stamp) plus rows from any tables that join to that one.
What this accomplishes is small files that can be quickly copied and loaded (or updated) rather than copying the entire live data base and loading that into the local data base. It would probably be a good idea to periodically do a full dump and copy "just in case," say at 0300 on a Sunday. As your data base gets larger, the time to do a full unload-copy-load will increase, and this may be a quicker, easier way to do what you need to do.
Again, this sort of method depends upon your data base and whether serial numbers or time stamps are used and available in the data. Essentially, you know the last serial number that went into the local data base and you can simply select all rows greater than that number from the live data base, copy those and load them. Ditto for time stamps.
It'll take a little work to figure out what you need to get but it's doable without too much hassle.
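A minimal sketch of that high-water-mark idea, assuming a table called orders with an auto-increment id column (every name, host and credential here is an invented example, not something from this thread):

```shell
#!/bin/sh
# Copy only rows added since the last run, using a stored "last serial
# number" as the high-water mark. All names here are hypothetical.
STATE=/var/tmp/orders.last_id
LAST=$(cat "$STATE" 2>/dev/null || echo 0)

# Select only the new rows from the live server, tab-separated...
mysql -h live.example.com -u sync -psecret -N -e \
    "SELECT * FROM mydb.orders WHERE id > $LAST" > /tmp/orders.txt

# ...append them to the local table (mysqlimport maps the file name
# 'orders' to the table 'orders')...
mysqlimport --local -u root -psecret mydb /tmp/orders.txt

# ...and remember the new high-water mark for the next run.
mysql -u root -psecret -N -e "SELECT COALESCE(MAX(id),0) FROM mydb.orders" > "$STATE"
```

The same shape works with a time stamp column: store the last time stamp seen instead of the last serial number.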
Again, this sort of method depends upon your data base and whether serial numbers are used or time stamps are used and available in the data. Essentially, you know the last serial number that went in the local data base and you can simply select all row greater than that number from the live data base, copy those and load them.
Thanks for getting back.
Certainly a great idea, but if you have many tables with many relations where only some tables have been updated, this approach could become complex and error prone. It's probably best suited to small databases.
Do you know if database replication using master-slave uses this approach, or does it do a full 'copy and paste'?
Well, actually, I've done just that with multi-terabyte data bases with, sometimes, upwards of 15 tables involved (all, of course, having related keys); no biggie, saved tons of time. And I continue to do much the same with MySQL and PostgreSQL data bases.
Although I'm familiar with the mysqldump and mysqlimport methods, I don't know for sure about using master-slave replication, so I'm probably not the proper person to comment -- I can point you to Section 15.1.1 of the MySQL 5.1.x Reference Manual (your version may vary) at http://dev.mysql.com/doc/refman/5.1/...ion-howto.html which provides a pretty good guide. I suspect that trying it will be the best way to decide.
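For reference, the setup that manual section walks through boils down to giving each server a distinct ID and enabling the binary log on the master; the minimal my.cnf pieces look something like this (check the manual for your version before relying on them):

```ini
# master's my.cnf -- enable the binary log and give the server an ID
[mysqld]
log-bin=mysql-bin
server-id=1

# slave's my.cnf -- just needs a distinct ID
[mysqld]
server-id=2
```

The slave is then pointed at the master with CHANGE MASTER TO and started with START SLAVE, as described in the manual.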
It depends on whether your backup DB needs to be absolutely in sync (completely up to date at all times), in which case master/slave is the way to go, OR whether a delay of, e.g., a few hours is OK, in which case mysqldump then reload is simpler.
NB: a possible downside of master/slave is that if something goes wrong at the master, that problem will be replicated into the slave immediately... sometimes a delay is better.
Well, actually, I've done just that with multi-terabyte data bases with, sometimes, upwards of 15 tables involved (all, of course, having related keys); no biggie, saved tons of time. And I continue to do much the same with MySQL and PostgreSQL data bases.
Hi Tronayne,
I was wondering if you have access to some resources so that I can get to grips with this approach.
If not, could you offer some pointers on how to go about it?
The way I design tables is to, wherever possible, use a unique numeric value as the first column; that would normally be a serial number in the case of a primary table, and that number would be inserted in every row of one or more subsidiary tables -- it would also be the index value. This would cover things like purchase orders, invoices and other similar transactions, plus things like vendors, customers and the like.
I've more or less adopted Universal Product Codes (UPC) for inventory items (they're numeric), which covers manufacturers and their UPC codes for the things they make (this includes publications too; books have both ISBNs and UPCs, as do optical media and pretty much everything else). You can download a vendor's or manufacturer's product list from various sources.
Where possible I try to avoid non-numeric index values (obvious exceptions are the names, addresses and the like for individuals, where you want an alphanumeric index key, but I try to keep those to an absolute minimum). I also try to avoid, wherever possible, multi-column indexes; that doesn't mean that I won't use them, but I'll use them sparingly. One typical multi-column index would be the serial number and date (or datetime); a sales order would index the serial number, date and client identification number, and that's about it.
Over-indexing is possibly the greatest waste of resources -- storage space as well as processing -- one can do. I have seen data base tables where multiple indexes occupy more space than the blasted data does, occasionally five or more times as much. Too, it's not worth indexing a table with fewer than, oh, 1,000 rows (possibly more) -- queries tend to run faster without an index on small tables.
I think the key thing to remember is that it's a relational data base; i.e., things in it should relate to each other in some way. Lots of folks seem to think that an RDBMS is just a bigger spread sheet and forget to only put unique data in a table; joins do work pretty well.
That sort of stuff is called normalization (of course there's a fancy name); essentially it means don't ever have redundant data in more than one table. Does that mean more tables? Sure. So what? Joins work quite well.
So, practically, how do you do it? Well, say you're buying stuff. You're going to cut an order to a vendor. So your order table would contain your order number (serial), the vendor number (the serial number from your vendor table), the date, maybe the employee number (from the employee table) of whoever is making the order and maybe a column for the total of the order (or maybe not). The stuff you're ordering would go in a table containing the PO number (from the order table), the UPC or the vendor's part number, and the quantity ordered. As you go along, hopefully using PHP on a web page, descriptions, prices and all the other display stuff would get selected from the various tables, but would not get recorded anywhere. Keep in mind that data base "engines" know how to do arithmetic: you don't need to store totals, you don't need to store the number of items multiplied by item prices; you might want to calculate them when you're going to print them, but you don't need to store them.
That's pretty simplified (you might want terms, delivery schedules and other information included in the order table), but, really, that's all you need. What gets printed gets selected from tables as needed but does not get recorded in any other table. And, essentially, the same thing happens when you sell stuff only you use customer data rather than vendor data.
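As an illustration of the layout just described (every table and column name here is invented for the example, not a schema from this thread), the two order tables might be created like so:

```shell
#!/bin/sh
# Create the order tables sketched above. All names and credentials are
# hypothetical examples of the serial-number/join layout being described.
mysql -u root -psecret mydb <<'EOF'
CREATE TABLE orders (
    order_no    INT NOT NULL AUTO_INCREMENT PRIMARY KEY,  -- your serial number
    vendor_no   INT NOT NULL,      -- serial number from the vendor table
    employee_no INT NOT NULL,      -- who placed the order
    order_date  DATE NOT NULL
);
CREATE TABLE order_items (
    order_no INT NOT NULL,         -- joins back to orders.order_no
    upc      BIGINT NOT NULL,      -- UPC or vendor part number
    quantity INT NOT NULL
    -- no price*quantity column: totals are computed at query time
);
EOF
```

Note there is no stored total anywhere: a join plus SUM(quantity * price) at query time produces it.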
It's usually a good idea not to do transactions directly in a master data base (depending upon the size and number of transactions per working day) but to hold transactions in a working data base, then upload them to the master periodically (hourly, half-day, end-of-day, whatever). Having those serial numbers makes it pretty easy to unload transactions to an ASCII file (like what you get with mysqldump), upload those to the master data base (as with mysqlimport), then simply reset the working data base for the next day.
You don't need to transfer everything, only those things that have changed (like orders, sales, shipping, invoices and all the other activities done during a working period).
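An end-of-day version of that unload/upload cycle could be as simple as the following (the working/master database names, table list and credentials are placeholders):

```shell
#!/bin/sh
# Move the day's transactions from the working DB to the master, then
# clear the working tables. All names and credentials are hypothetical.
for TABLE in orders order_items; do
    # Unload today's rows from the working data base to an ASCII file...
    mysql -u app -psecret -N -e "SELECT * FROM working.$TABLE" > "/tmp/$TABLE.txt"
    # ...append them to the master (file name 'orders' maps to table 'orders')...
    mysqlimport --local -u app -psecret master "/tmp/$TABLE.txt"
    # ...and reset the working table for the next day.
    mysql -u app -psecret -e "TRUNCATE TABLE working.$TABLE"
done
```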
If you want to be really slick, you can select all the information you need from the master data base (and not have any data tables in the working data base at all) but that depends upon the volume, your intranet speed and your confidence in uptime. You'd probably want a master/slave just to act as the master to avoid downtime (stuff does break, after all). If you're doing point of sale, that hardware comes into play as well and adds a layer of complexity (like, when do you dump the cash register to the working data base, update your inventory and the like, if the register has UPC data in its memory, when do you update that, those and other considerations come into play).
The bottom line, though, is that whatever you do must fit your organization's needs. If you can normalize your data bases (eliminate redundancy) you can, pretty easily, write a couple of shell programs to do the unloading and uploading of changes periodically. Using serial data, dates, and other numeric keys in your tables makes life a little simpler.
Anyway, that's what I do and why I do it that way -- I tend to make stored data as small as possible and get embarrassed if I find something redundant. It's a hangover from the bad old days when disk drives cost thousands of dollars and were only 50MB, you learn to keep it down as much as possible if you know what I mean. Offhand I can't think of any sources for advice on data base design and engineering; most of the stuff I see nowadays is theory rather than practice. There must be something out there about sensible data base design but I just don't know what it may be or where to find it.
The advice I can give you is to always use defaults: integer (never mind sizing; an integer is the platform default, usually topping out at 2147483647), never smallint (it's a short, 32767, and I guarantee that someday somebody will overflow it and crash), and varchar is better than character. If you're working with money, decimal(10,2) is pretty good -- that's eight significant digits and cents are rounded fairly. If you're working with billions (or, if you're the US government, trillions), you might want to make that larger, say, decimal(12,2). Eight significant digits is just under 100,000,000. Don't overdo it -- check the data base documentation for the defaults and use them where possible. And never, ever, copy stuff from one data base table into another -- look it up instead.
It depends on whether your backup DB needs to be absolutely in sync (completely up to date at all times), in which case master/slave is the way to go, OR whether a delay of, e.g., a few hours is OK, in which case mysqldump then reload is simpler.
How do you sync 2 databases which can both change ? Is that possible ?
Replication enables data from one MySQL database server (the master) to be replicated to one or more MySQL database servers (the slaves). Replication is asynchronous - slaves need not be connected permanently to receive updates from the master. This means that updates can occur over long-distance connections and even over temporary or intermittent connections such as a dial-up service. Depending on the configuration, you can replicate all databases, selected databases, or even selected tables within a database.
For answers to some questions often asked by those who are new to MySQL Replication, see Section B.13, “MySQL 5.0 FAQ: Replication”.
The target uses for replication in MySQL include:
Scale-out solutions - spreading the load among multiple slaves to improve performance. In this environment, all writes and updates must take place on the master server. Reads, however, may take place on one or more slaves. This model can improve the performance of writes (since the master is dedicated to updates), while dramatically increasing read speed across an increasing number of slaves.
Data security - because data is replicated to the slave, and the slave can pause the replication process, it is possible to run backup services on the slave without corrupting the corresponding master data.
Analytics - live data can be created on the master, while the analysis of the information can take place on the slave without affecting the performance of the master.
Long-distance data distribution - if a branch office would like to work with a copy of your main data, you can use replication to create a local copy of the data for their use without requiring permanent access to the master.
Replication in MySQL features support for one-way, asynchronous replication, in which one server acts as the master, while one or more other servers act as slaves. This is in contrast to the synchronous replication which is a characteristic of MySQL Cluster (see Chapter 16, MySQL Cluster).
There are a number of solutions available for setting up replication between two servers, but the best method to use depends on the presence of data and the engine types you are using. For more information on the available options, see Section 15.1.1, “How to Set Up Replication”.
Replication is controlled through a number of different options and variables. These control the core operation of the replication, timeouts, and the databases and filters that can be applied on databases and tables. For more information on the available options, see Section 15.1.2, “Replication and Binary Logging Options and Variables”.
You can use replication to solve a number of different problems, including problems with performance, supporting the backup of different databases, and as part of a larger solution to alleviate system failures. For information on how to address these issues, see Section 15.3, “Replication Solutions”.
For notes and tips on how different data types and statements are treated during replication, including details of replication features, version compatibility, upgrades, and problems and their resolution, including an FAQ, see Section 15.4, “Replication Notes and Tips”.
For detailed information on the implementation of replication, how replication works, the process and contents of the binary log, background threads and the rules used to decide how statements are recorded and replicated, see Section 15.2, “Replication Implementation”.
The above is something I've been aware of but, well, I just haven't had a reason to try it (been busy with other stuff). Might be a good idea to free up a couple of servers and experiment, though, so it's on my to-do list. I suspect, though, that this is what you're looking to do and it might be worth your time to grab a couple of servers (you know, a pair of old unused boxes would do; it doesn't have to be exotic, it just has to prove the method), set them up and give it a shot (I'm going to fiddle with a couple of laptops that aren't doing anything when I get some time).
I'm sure there are other folks around that have actually implemented master/slave replication and it sure would be nice if somebody could speak up about the experience; what the heck, Google might be a good source too. I just queried Google with "mysql replication master/slave howto" and got some interesting things to look at, starting with http://dev.mysql.com/doc/refman/5.0/...ion-howto.html. Might be worth a look-see.
How do you sync 2 databases which can both change ? Is that possible ?
That's called master-master replication, and it is possible. Basically, both databases act as master of each other, so they are slaves at the same time. There are some details to be looked at so the keys will not mix up, but this can be handled. Have a look: http://www.howtoforge.com/mysql_mast...er_replication
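One of those details "so the keys will not mix up" is MySQL's pair of auto-increment settings: by interleaving the serial numbers each master generates, the two servers can never hand out the same id. A common sketch:

```ini
# server A's my.cnf -- generates ids 1, 3, 5, ...
[mysqld]
server-id=1
auto_increment_increment=2
auto_increment_offset=1

# server B's my.cnf -- generates ids 2, 4, 6, ...
[mysqld]
server-id=2
auto_increment_increment=2
auto_increment_offset=2
```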