LinuxQuestions.org
Review your favorite Linux distribution.
Go Back   LinuxQuestions.org > Forums > Linux Forums > Linux - Software
User Name
Password
Linux - Software This forum is for Software issues.
Having a problem installing a new program? Want to know which application is best for the job? Post your question in this forum.

Notices


Reply
  Search this Thread
Old 01-24-2012, 04:41 AM   #1
ksmatthews
Member
 
Registered: Aug 2008
Posts: 86

Rep: Reputation: 15
Question mysql replication


Hi All,

I need to update our local Mysql DB server with data from the live server
on a daily basis ...

The simplest approach is to use the mysqldump utility on the live server to create a dump file and then to import that dump file locally. This can be done using a simple shell script.

I could also go down the road of database replication using master - slave.
There are many articles on how to best do this ...

Basically we have 2 DB server - the local one needs to be kept in sync with the live one ...

What is the best way to do this ?

regards,

Steven M
 
Old 01-24-2012, 07:32 AM   #2
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,541

Rep: Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065
The two methods you've talked about are perfectly usable and most likely will work just fine.

One other way you might consider -- and this depends heavily on your data -- is to unload only what has changed or been added periodically from the live data base to the local data base.

That requires selecting from a table with time stamps, a serial number or some other column that would indicate an addition (a serial number) or a change (a time stamp) plus rows from any tables that join to that one.

What this accomplishes is small files that can be quickly copied and loaded (or updated) rather than copying the entire live data base and loading that into the local data base. It would probably be a good idea to periodically do a full dump and copy "just in case," say at 0300 on a Sunday. As your data base get larger the time to do a full unload-copy-load will increase and this may be a quicker, easier way to do what you need to do.

Again, this sort of method depends upon your data base and whether serial numbers are used or time stamps are used and available in the data. Essentially, you know the last serial number that went in the local data base and you can simply select all row greater than that number from the live data base, copy those and load them. Ditto for time stamps.

It'll take a little work to figure out what you need to get but it's doable without too much hassle.

Hope this helps some.
 
Old 01-24-2012, 10:52 AM   #3
ksmatthews
Member
 
Registered: Aug 2008
Posts: 86

Original Poster
Rep: Reputation: 15
mysql replication

Quote:
Originally Posted by tronayne View Post

Again, this sort of method depends upon your data base and whether serial numbers are used or time stamps are used and available in the data. Essentially, you know the last serial number that went in the local data base and you can simply select all row greater than that number from the live data base, copy those and load them.
Thanks for getting back.
Certainly a great idea but if you have many tables with many relations where only some tables have been updated
this approach could become complex and error prone. It's probably best suited to small databases.

Do you know if database replication using master - slave uses this approach OR does it do a full 'copy and paste' ?

Steven M
 
Old 01-24-2012, 01:28 PM   #4
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,541

Rep: Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065
Well, actually, I've done just that with multi terabyte data bases with, sometime, upwards of 15 tables involved (all, of course, having related keys); no biggie, saved tons of time. And I continue to do much the same with MySQL and PostgreSQL data bases.

Although I'm familiar with mysqldump and mysqlimport methods, I don't know for sure about using master slave replication so I'm probably not the proper person to comment -- I can point you to Section 15.1.1 of the MySQL 5.1.x Reference Manual (your version may vary) at http://dev.mysql.com/doc/refman/5.1/...ion-howto.html which provides a pretty good guide. I suspect that trying it will be the best way to decide.

Hope this helps some.
 
Old 01-24-2012, 10:58 PM   #5
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 7.7 (?), Centos 8.1
Posts: 18,165

Rep: Reputation: 2680Reputation: 2680Reputation: 2680Reputation: 2680Reputation: 2680Reputation: 2680Reputation: 2680Reputation: 2680Reputation: 2680Reputation: 2680Reputation: 2680
It depends on whether your backup DB needs to be absolutely in-sync (completely up to date at all times) in which case Master/slave is the way to go, OR whether a delay of eg a few hrs is ok, in which case mysqldump then reload is simpler.
NB: a possible downside of master/slave is that if something goes wrong at the master, that problem will be replicated into the slave immediately.... sometimes a delay is better
 
Old 01-26-2012, 09:46 AM   #6
ksmatthews
Member
 
Registered: Aug 2008
Posts: 86

Original Poster
Rep: Reputation: 15
Question

Quote:
Originally Posted by tronayne View Post
Well, actually, I've done just that with multi terabyte data bases with, sometime, upwards of 15 tables involved (all, of course, having related keys); no biggie, saved tons of time. And I continue to do much the same with MySQL and PostgreSQL data bases.
Hi Tronayne,

I was wondering if you have access to some resources so that I get to grips with this approach.
If not could you offer some pointers on how to go about this ?

regards,

steven Matthews
 
Old 01-26-2012, 01:24 PM   #7
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,541

Rep: Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065
The way I design tables is to, wherever possible, use a unique numeric value as the first column; that would normally be a serial number in the case of a primary table and that number would be inserted in every table row in one or more subsidiary table(s) -- it would also be the index value. This would cover things like purchase orders, invoices and other similar transactions plus things like vendors, customers and the like.

I've more or less adopted Universal Product Codes (UPC) for inventory items (they're numeric) which covers manufacturers and their UPC codes for the things they make (this includes publications too, books have both ISBNs and UPCs) as do optical media and pretty much everything else). You can download a vendor's or manufacturer's product list from various sources.

Where possible I try to avoid non-numeric index values (obvious exceptions are the names, address and the like for individuals where you want a alphanumeric index key but I try to keep those to an absolute minimum). I also try to avoid wherever possible multi-column indexes; that doesn't mean that I won't use them, but I'll use them sparingly. One typical multi-column index would be the serial number and date (or datetime); a sale order would index the serial number, date and client identification number and that's about it.

Over indexing is possibly the greatest waste of resources -- storage space as well as processing -- one can do. I have seen data base tables where multiple indexes occupy more space that the blasted data does, occasionally five or more times. Too, it's not worth indexing a table with less than, oh, 1,000 rows (possibly more) -- queries tend to run faster without an index on small tables.

I think the key thing to remember is that it's a relational data base; i.e., things in it should relate to each other in some way. Lots of folks seem to think that an RDBMS is just a bigger spread sheet and forget to only put unique data in a table; joins do work pretty well.

That sort of stuff is called normalization (of course there's a fancy name); essentially it means don't ever have redundant data in more than one table. Does that mean more tables? Sure. So what? Joins work quite well.

So, practically, how do you do it? Well, say you're buying stuff. You're going to cut an order to a vendor. So your order table would contain your order number (serial), the vendor number (the serial number from your vendor table)), the date, maybe the employee number (from the employee table) making the order and maybe a column for the total of the order (or maybe not). The stuff you're ordering would contain the PO number (from the order table), UPC or the vendor custom part number and the quantity ordered. As you go along, hopefully using PHP on a web page, descriptions, prices and all the other display stuff would get selected from the various tables, but would not get recorded anywhere. Keep in mind that data base "engines" know how to do arithmetic, you don't need to store totals, you don't need to store number of items multiplied by item prices; you might want to calculate them when you're going to print them, but you don't need to store them.

That's pretty simplified (you might want terms, delivery schedules and other information included in the order table), but, really, that's all you need. What gets printed gets selected from tables as needed but does not get recorded in any other table. And, essentially, the same thing happens when you sell stuff only you use customer data rather than vendor data.

It's usually a good idea to not do transactions directly in a master data base (depending upon the size and number of transactions per working day) but to hold transactions in a working data base then upload them to a master periodically (hourly, half-day, end-of-day, whatever). Having those serial numbers makes it pretty easy to unload transactions to an ASCII file (like what you get with mysqldump) and upload those to a master data base (as with mysqlimport) the simply reset the working data base for the next day.

You don't need to transfer everything, only those things that have changed (like orders, sales, shipping, invoices and all the other activities done during a working period).

If you want to be really slick, you can select all the information you need from the master data base (and not have any data tables in the working data base at all) but that depends upon the volume, your intranet speed and your confidence in uptime. You'd probably want a master/slave just to act as the master to avoid downtime (stuff does break, after all). If you're doing point of sale, that hardware comes into play as well and adds a layer of complexity (like, when do you dump the cash register to the working data base, update your inventory and the like, if the register has UPC data in its memory, when do you update that, those and other considerations come into play).

The bottom line, though, is that whatever you do must fit your organization's needs. If you can normalize your data bases (eliminate redundancy) you can, pretty easily, write a couple of shell programs to do the unloading and uploading of changes periodically. Using serial data, dates, and other numeric keys in your tables makes life a little simpler.

Anyway, that's what I do and why I do it that way -- I tend to make stored data as small as possible and get embarrassed if I find something redundant. It's a hangover from the bad old days when disk drives cost thousands of dollars and were only 50MB, you learn to keep it down as much as possible if you know what I mean. Offhand I can't think of any sources for advice on data base design and engineering; most of the stuff I see nowadays is theory rather than practice. There must be something out there about sensible data base design but I just don't know what it may be or where to find it.

The advice I can give you is to always use defaults; integer (never mind sizing, an integer is the platform default, usually 2147483647), never use smallint (it a short, 32767, and I guarantee that someday somebody will overflow it and you crash), varchar is better than character, if you're working with money 10.2 is pretty good -- that's eight significant digits and cents are rounded fairly. If you're working with billions (or, if you're the US government, trillions), you might want to make that larger, say, 12.2. Eight significant digits is just under 100,000,000. Don't over do it -- check the data base documentation for the defaults and use them where possible. And never, ever, copy stuff from one data base table into another -- look it up instead.

Hope this helps some.

Last edited by tronayne; 01-26-2012 at 01:27 PM.
 
1 members found this post helpful.
Old 01-26-2012, 02:58 PM   #8
ksmatthews
Member
 
Registered: Aug 2008
Posts: 86

Original Poster
Rep: Reputation: 15
Question master - slave

Quote:
Originally Posted by chrism01 View Post
It depends on whether your backup DB needs to be absolutely in-sync (completely up to date at all times) in which case Master/slave is the way to go, OR whether a delay of eg a few hrs is ok, in which case mysqldump then reload is simpler.
How do you sync 2 databases which can both change ? Is that possible ?

regards,

Steven M
 
Old 01-27-2012, 08:52 AM   #9
tronayne
Senior Member
 
Registered: Oct 2003
Location: Northeastern Michigan, where Carhartt is a Designer Label
Distribution: Slackware 32- & 64-bit Stable
Posts: 3,541

Rep: Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065Reputation: 1065
I'm no expert on replication (using master/slave) but there is this explanation from this link: http://dev.mysql.com/doc/refman/5.0/en/replication.html. In part,
Quote:
Replication enables data from one MySQL database server (the master) to be replicated to one or more MySQL database servers (the slaves). Replication is asynchronous - slaves need not be connected permanently to receive updates from the master. This means that updates can occur over long-distance connections and even over temporary or intermittent connections such as a dial-up service. Depending on the configuration, you can replicate all databases, selected databases, or even selected tables within a database.

For answers to some questions often asked by those who are new to MySQL Replication, see Section B.13, “MySQL 5.0 FAQ: Replication”.

The target uses for replication in MySQL include:

Scale-out solutions - spreading the load among multiple slaves to improve performance. In this environment, all writes and updates must take place on the master server. Reads, however, may take place on one or more slaves. This model can improve the performance of writes (since the master is dedicated to updates), while dramatically increasing read speed across an increasing number of slaves.

Data security - because data is replicated to the slave, and the slave can pause the replication process, it is possible to run backup services on the slave without corrupting the corresponding master data.

Analytics - live data can be created on the master, while the analysis of the information can take place on the slave without affecting the performance of the master.

Long-distance data distribution - if a branch office would like to work with a copy of your main data, you can use replication to create a local copy of the data for their use without requiring permanent access to the master.

Replication in MySQL features support for one-way, asynchronous replication, in which one server acts as the master, while one or more other servers act as slaves. This is in contrast to the synchronous replication which is a characteristic of MySQL Cluster (see Chapter 16, MySQL Cluster).

There are a number of solutions available for setting up replication between two servers, but the best method to use depends on the presence of data and the engine types you are using. For more information on the available options, see Section 15.1.1, “How to Set Up Replication”.

Replication is controlled through a number of different options and variables. These control the core operation of the replication, timeouts, and the databases and filters that can be applied on databases and tables. For more information on the available options, see Section 15.1.2, “Replication and Binary Logging Options and Variables”.

You can use replication to solve a number of different problems, including problems with performance, supporting the backup of different databases, and as part of a larger solution to alleviate system failures. For information on how to address these issues, see Section 15.3, “Replication Solutions”.

For notes and tips on how different data types and statements are treated during replication, including details of replication features, version compatibility, upgrades, and problems and their resolution, including an FAQ, see Section 15.4, “Replication Notes and Tips”.

For detailed information on the implementation of replication, how replication works, the process and contents of the binary log, background threads and the rules used to decide how statements are recorded and replication, see Section 15.2, “Replication Implementation”.
The above is something I'm been aware of but, well, I just haven't had a reason to try it (been busy with other stuff). Might be a good idea to free up a couple of servers and experiment, though, so it's on my to-do list. I suspect, though, that this is what you're looking to do and it might be worth your time to grab a couple of servers (you know, a pair of old unused boxes would do, doesn't have to be exotic, just has to prove the method), set them up and give it a shot (I'm going to fiddle with a couple of laptops that aren't doing anything when I get some time).

I'm sure there are other folks around that have actually implemented master/slave replication and it sure would be nice if somebody could speak up about the experience; what the heck, Google might be a good source too. I just queried Google with "mysql replication master/slave howto" and got some interesting things to look at, starting with http://dev.mysql.com/doc/refman/5.0/...ion-howto.html. Might be worth a look-see.

Hope this helps some.

Last edited by tronayne; 01-27-2012 at 09:01 AM.
 
Old 01-30-2012, 11:50 AM   #10
eehmke
LQ Newbie
 
Registered: Aug 2011
Distribution: Debian, Gentoo
Posts: 24

Rep: Reputation: 0
Quote:
Originally Posted by ksmatthews View Post
How do you sync 2 databases which can both change ? Is that possible ?
That's called master-master replication, and it is possible. Basically, both databases act as master of each other, so they are slaves at the same time. There are some details to be looked at so the keys will not mix up, but this can be handled. Have a look:
http://www.howtoforge.com/mysql_mast...er_replication

Last edited by eehmke; 01-30-2012 at 11:51 AM.
 
  


Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



Similar Threads
Thread Thread Starter Forum Replies Last Post
Mysql replication failed after mysql reboot ajayan Linux - Newbie 8 09-08-2010 11:00 AM
MySQL replication Padawan.AVT Linux - Server 4 06-16-2009 09:27 PM
mysql replication lord-fu *BSD 1 04-03-2007 09:23 PM
mysql replication lord-fu Linux - Server 0 04-02-2007 08:38 AM
Mysql replication pk21 Linux - Software 0 08-28-2003 05:00 AM

LinuxQuestions.org > Forums > Linux Forums > Linux - Software

All times are GMT -5. The time now is 02:54 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration