Refactoring db access and tweaking MySQL performance...need help
I'm working on refactoring a site I built when my PHP and database knowledge was caveman-primitive. While it has been running fabulously and profitably for years, traffic is growing and the site is starting to creak under the load. I need to refactor a variety of things, but the database access is probably the most important. Yesterday the DB server (an Amazon RDS instance) had its CPU maxed out, and it stays that way essentially all day long, M-F, during business hours. I tweaked some MySQL parameters to increase the memory available for caching, and the server is much happier now that it's making use of its copious memory. Still, the DB is working hard.
I've got some basic questions: 1) Is there some way to alter my database access class to recognize which queries might benefit from an index?
I know from the MySQL status vars (see below) that some running queries result in table scans and the like, and that adding some indexes might help. The problem is that I have hundreds of queries and hundreds of tables, and I'm not sure where to start. All queries go through a centralized DB class, but I don't want to introduce any monitoring changes that might severely hamper performance. Is there some easy way to report queries that are not properly using indexes? I had imagined pattern matching of some kind on EXPLAIN output, but that sounds tricky and bad for performance.
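One low-overhead option, assuming you can change server settings (on RDS this goes through a DB parameter group rather than SET GLOBAL), is to let the server itself flag unindexed queries via the slow query log. A sketch; values are examples, not recommendations:

```sql
-- Have the server log slow queries AND queries that use no index,
-- regardless of how fast they run.
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;                  -- seconds
SET GLOBAL log_queries_not_using_indexes = 'ON';

-- On RDS the log is often written to a table (log_output = TABLE),
-- in which case the worst offenders can be ranked directly:
SELECT start_time, query_time, rows_examined, sql_text
FROM mysql.slow_log
ORDER BY rows_examined DESC
LIMIT 20;
```

This keeps the detection work inside the server, so the PHP DB class needs no changes at all.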
2) What changes might I make to anticipate the use of a cluster instead of a single machine?
Given our rate of growth, I expect we will soon outgrow a single-machine DB server configuration and we'll need to move to a cluster. I've looked into this a bit and vaguely understand that when you have a master-slave configuration, there is typically some (hopefully short) period during which master and slaves are out of sync. I also understand (perhaps incorrectly) that it's easier to set up a single master and multiple replicating slaves rather than a multiple-master cluster where the db nodes somehow manage to keep data in sync. That said, it seems clear that I'll probably want to distinguish my write queries from my read queries so I can route writes to the master and reads to a slave. Also, I might need to alter my code to keep in mind the lag time between master and slaves. Lastly, it seems like it might be wise to delegate busy session-related tables to the slaves and try to make sure that a) users are consistently served by a single web server and b) web servers consistently connect to the same DB slave. Help! I need some input here.
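For what it's worth, the read/write split plus replication-lag concern described above is often handled with a small routing layer in the DB class. A minimal sketch in Python for illustration (the class names and the pinning scheme are made up for the example; a PHP version would have the same shape):

```python
import re

# Statements whose leading keyword means "this must go to the master".
WRITE_PREFIXES = ("insert", "update", "delete", "replace",
                  "create", "alter", "drop", "truncate")

def is_write_query(sql):
    """True if the statement's leading keyword (after optional
    whitespace and /* */ comments) is a write."""
    stripped = re.sub(r"^\s*(?:/\*.*?\*/\s*)*", "", sql, flags=re.S)
    parts = stripped.split(None, 1)
    return bool(parts) and parts[0].lower() in WRITE_PREFIXES

class RoutingDB:
    """Route writes to the master and reads to a replica. After a
    write, pin the next few reads to the master so the user sees
    their own writes even if the replica is lagging."""

    def __init__(self, master, replica, pin_reads=5):
        self.master = master      # any object with an execute(sql) method
        self.replica = replica
        self.pin_reads = pin_reads
        self._pinned = 0          # reads still forced to the master

    def query(self, sql):
        if is_write_query(sql):
            self._pinned = self.pin_reads
            return self.master.execute(sql)
        if self._pinned > 0:
            self._pinned -= 1
            return self.master.execute(sql)
        return self.replica.execute(sql)
```

The pin counter is a crude stand-in for "wait out the replication lag"; real setups tie it to session state or to checking the slave's replication position.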
3) Given the following detail from phpMyAdmin, any thoughts on my performance levels and how I might improve them?
Please note that I spent several hours trying to increase table cache sizes and so on in response to these values, and I've seen the server use more memory, but I'm reluctant to make more drastic changes lest I introduce instability. Any advice on specific settings would be much appreciated.
Code:
// salient details from phpmyadmin's "status" tab:
This MySQL server has been running for 0 days, 23 hours, 26 minutes and 39 seconds.
Since its startup, 4,625,274 queries have been sent to the server.
// statement     # since startup   ø per hour     %
   select        641 k             27.342 k       14.68 %
   update        280 k             11.960 k        6.42 %
   set option    259 k             11.032 k        5.92 %
   change db     258 k             11.000 k        5.91 %
   insert        169 k              7.216 k        3.87 %
   delete         20 k             840.038         0.45 %
// items flagged in red to indicate potential problems / areas for improvement:
Slow_queries 244 The number of queries that have taken more than long_query_time seconds.
Innodb_buffer_pool_reads 78 The number of logical reads that InnoDB could not satisfy from buffer pool and had to do a single-page read.
Handler_read_rnd 7,456 k The number of requests to read a row based on a fixed position. This is high if you are doing a lot of queries that require sorting of the result. You probably have a lot of queries that require MySQL to scan whole tables or you have joins that don't use keys properly.
Handler_read_rnd_next 79 G The number of requests to read the next row in the data file. This is high if you are doing a lot of table scans. Generally this suggests that your tables are not properly indexed or that your queries are not written to take advantage of the indexes you have.
Slow_launch_threads 2 The number of threads that have taken more than slow_launch_time seconds to create.
Created_tmp_disk_tables 35 k The number of temporary tables on disk created automatically by the server while executing statements. If Created_tmp_disk_tables is big, you may want to increase the tmp_table_size value to cause temporary tables to be memory-based instead of disk-based.
Select_full_join 7,460 The number of joins that do not use indexes. If this value is not 0, you should carefully check the indexes of your tables.
Select_range_check 3 The number of joins without keys that check for key usage after each row. (If this is not 0, you should carefully check the indexes of your tables.)
Sort_merge_passes 11 k The number of merge passes the sort algorithm has had to do. If this value is large, you should consider increasing the value of the sort_buffer_size system variable.
Opened_tables 7,089 The number of tables that have been opened. If opened tables is big, your table cache value is probably too small.
Table_locks_waited 57 k The number of times that a table lock could not be acquired immediately and a wait was needed. If this is high, and you have performance problems, you should first optimize your queries, and then either split your table or tables or use replication.
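Several of the flagged counters map directly onto server variables. A sketch of the corresponding knobs (sizes are illustrative, not recommendations; on RDS these belong in a DB parameter group rather than SET GLOBAL):

```sql
-- Created_tmp_disk_tables high -> allow bigger in-memory temp tables.
-- Both variables must be raised together; the smaller one wins.
SET GLOBAL tmp_table_size      = 64 * 1024 * 1024;
SET GLOBAL max_heap_table_size = 64 * 1024 * 1024;

-- Opened_tables high -> enlarge the table cache
-- (table_open_cache in 5.1+, table_cache in older versions).
SET GLOBAL table_open_cache = 1024;

-- Sort_merge_passes high -> bigger sort buffer. Careful: this is
-- allocated per connection, so keep it modest.
SET GLOBAL sort_buffer_size = 4 * 1024 * 1024;
```

That said, Select_full_join, Handler_read_rnd_next and Table_locks_waited all point at missing indexes and unindexed joins first; tuning buffers only papers over those.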
A couple of things occur to me (apologies if you know this stuff already).
1. Indexes
Adding indexes will probably give you a quick gain.
You should know that an index is generally only used if the indexed column(s) appear in the WHERE clause (or join conditions) of a SQL statement, and are not wrapped in a function or expression.
Note that for compound indexes (i.e. multi-column, e.g. an index on col_a, col_b, col_c), MySQL reads the leading edge of the index first, i.e. you can't just put cols b & c in the WHERE clause; the index will not be used unless you mention col_a as well.
It also used to be true (and may still be, apart from the index merge optimisation added in 5.0) that MySQL uses only one index per table in a SQL statement, even if more than one could be used... and it may or may not pick the 'best' one.
You can give the optimiser a hint https://dev.mysql.com/doc/refman/5.0...dex-hints.html
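To make the two points above concrete, a small example with made-up table and column names:

```sql
CREATE INDEX idx_abc ON orders (col_a, col_b, col_c);

-- Can use idx_abc: the leading column col_a is constrained.
EXPLAIN SELECT * FROM orders WHERE col_a = 1 AND col_b = 2;

-- Cannot use idx_abc: col_a is absent, so the leading edge of the
-- index is unconstrained and MySQL falls back to scanning.
EXPLAIN SELECT * FROM orders WHERE col_b = 2 AND col_c = 3;

-- Hinting, for when the optimiser picks the wrong index:
SELECT * FROM orders USE INDEX (idx_abc) WHERE col_a = 1;
```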
Incidentally, Databases really love to have a lot of RAM for caching purposes.
2. Master-Slave
This is replication, effectively giving you a hot standby. If you do master-master you get something similar to a cluster, but it is not a cluster.
A cluster, in the classic shared-disk sense, is multiple 'instances' literally sharing the same (physical) data, i.e. disks; note that MySQL Cluster (NDB) works differently and partitions the data across nodes.
You probably want to read up on these 2 technologies and decide which you want to use.
1) I'm familiar with indexes and with using EXPLAIN to try to optimize things. The problem is that I have hundreds of queries throughout the site and need some way to identify the problematic ones. I'm in the process of turning on the slow query log (a bit trickier with an Amazon RDS instance), but I expect that much of the CPU effort is not in queries that are slow, but rather in ones that result in a table scan yet are not slow enough to get written to the slow query log. Thus my original question: how do we identify queries that fail to use an index?
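One way to answer that without pattern-matching EXPLAIN output yourself: if the server is (or can be upgraded to) MySQL 5.6+, the performance_schema digest table aggregates every normalized statement, including the fast-but-scanning ones that never reach the slow log:

```sql
-- Rank statement shapes that ran without using an index,
-- by total rows examined (MySQL 5.6+ only).
SELECT digest_text,
       count_star        AS execs,
       sum_no_index_used AS no_index,
       sum_rows_examined AS rows_examined
FROM performance_schema.events_statements_summary_by_digest
WHERE sum_no_index_used > 0
ORDER BY sum_rows_examined DESC
LIMIT 20;
```

On older servers, log_queries_not_using_indexes plus the slow log is the usual fallback.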
2) I have a basic familiarity with master/slave setups and vaguely recall that setting up master/slave is a lot easier than any kind of master/master configuration. I further understand that one's application needs to distinguish between read and write queries in a master/slave situation because writes must go to the master and reads to the slave.
1. The logs should show you which ones are doing full table scans.
If you use your knowledge of the system, you may be able to pick a few up that way as well.
For a serious system like this, I'd highly recommend creating use/test cases for each SQL statement (at least major ones).
May take some time, but as you've seen, you're now in the serious player arena and need to start doing stuff like that.
You can re-use them each time you make a change; google 'regression testing'.
2. Link: https://dev.mysql.com/doc/refman/5.1/en/ (my earlier link had a ';' on the end that got picked up by the HTML link parser here on LQ; I should have added a space in front of it).