Database Design Help

msound · 12-08-2006, 09:58 AM

Hey Hey! I've decided that I would like to begin development of an open source online accounting system. Basically just something that could track inventory use, purchases, receivables, sales, etc. etc. The PHP portion of it should be easy since it's really only going to consist of simply DB queries and some basic arithmetic.

The only thing I'm unsure of is how to store the data in MySQL. Let's assume for a minute that a single company has several different locations. Each location should have it's own accounting information that is unaffected by the others.

Assuming I create tables like: Inventory, Purchases, Receivables, CashSales, etc.
Do I a: just create a Table Field named "location_id" and make that a foreign key that corresponds with the primary key of a Locations table? Or b: Simply give each location it's own Database, and then each database would share an identical table structure?

I know best practice would probably say to keep all of the data in one DB and use a location identifier to sort through the data, but one of my goals is to make location completely independent. If one of the MySQL tables becomes corrupt, all of the locations would then be unable to submit their reports.

If I use a separate database for each location, then a corrupt table would only have an impact on the database it's under. The rest of the company would be able to continue working as usual.

So what do you think?

lenky0401 · 12-08-2006, 10:18 AM

wow,if u want to make the location completely independent,the b is the best choice,i think.
but if this company has another different location some day,were u create a new database for it?

msound · 12-08-2006, 10:27 AM

Does that make sense though? What is the usual practice in a situation like this? I've only seen 2 corrupt db tables and they both occurred in a high volume phpbb forum. From the php standpoint i could easily make both configurations work. My real concern is that you're always going to have companies/locations that do a considerably higher volume than another. Once you start to get several, or several hundred, high volume accounts than the tables could start to reach tens of thousands of entries. By breaking the data down into account/location specific database tables, wouldn't that save a bit of performance that would be lost once the tables start to hit a high level of entries?

On the flip side, would establishing connections with multiple databases simultaneous hinder performance more so than working out of a single DB with a ton of entries?

The overall goal of the project is going to be giving businesses an open source option that is fast, scalable, and stable. The last company I worked for used to run its mysql db and web server off of a server with a 1.2GHz celeron. So it's important to keep performance in mind along with reliability.

So putting all of this into consideration, would option b still be the best design for what I'd like to do?

climbingmerlin · 12-08-2006, 11:45 AM

personally I would go for option A, it is good practice to keep everything in at least on table, because if you start adding tables per site it could become a massive mess. MySQL should be able to handle hundred of thousands of rows, without any major dramas.

The thing that I would be looking at as you have said yourself, is data integrity. If I was doing what you are planning on doing, I would use the InnoDB engine, as it is designed to be transaction-safe and able to handle large data volumes. Or to quote MySQL:

Quote:

InnoDB provides MySQL with a transaction-safe (ACID compliant) storage engine that has commit, rollback, and crash recovery capabilities. InnoDB does locking on the row level and also provides an Oracle-style consistent non-locking read in SELECT statements. These features increase multi-user concurrency and performance. There is no need for lock escalation in InnoDB because row-level locks fit in very little space. InnoDB also supports FOREIGN KEY constraints. You can freely mix InnoDB tables with tables from other MySQL storage engines, even within the same statement.

InnoDB has been designed for maximum performance when processing large data volumes. Its CPU efficiency is probably not matched by any other disk-based relational database engine

The key that I have found when handling large amounts of data on any database is to make sure that you have indexes on the right columns to help with lookup times. If you don't have indexes on tables where you will frequently do lookups or updates on, queries will take an age to run, use up a lot of CPU time and make your application run like a dog.

Also have a look at PostgreSQL, which is another very good OS database, which works just as well with high data volumes. It like MySQL is very good at being transaction-safe and can be scaled to loadbalance. (Not sure about MYSQL as never had to do it)

However if you need high availability, with things such as failover boxes or distributed servers. I have found that replication with MySQL very easy to implement, and very reliable.

So I personally would go with option A and look at making sure that your database schema/design has good indexes on columns that you are going to be referencing a lot, and that you make sure that you use transactions when handling the data.

well that's my

worth anyway.

msound · 12-08-2006, 11:51 AM

Great answer. Thanks for the help you two! I'll go ahead and store everything in a single Database, and then, when the time comes, look into replication options. Thanks again!

chrism01 · 12-10-2006, 05:11 PM

See also V5 of MySQL which does clustering.
BTW, I would definitely go with (a), unless you plan on giving each location it's own DB server.
Just ensure you use SQL txns and do frequent backups.

climbingmerlin · 12-11-2006, 04:33 AM

Never had to do clustering in MySQL is it any good? As it is in memory clustering what down sides are there as there must be some performance hit? Or am I getting it wrong?

chrism01 · 12-11-2006, 04:51 PM

V5.0 is all in RAM, so theoretically should be quick, but you need a LOT of RAM in each system; enough for DB code & DB ie application data.
V5.1 is disk backed, like a normal DB.
http://dev.mysql.com/doc/refman/5.1/...daitem-28-10-2