Support LQ: Use code LQ3 and save $3 on Domain Registration
Go Back > Forums > Linux Forums > Linux - Server
User Name
Linux - Server This forum is for the discussion of Linux Software used in a server related context.


  Search this Thread
Old 05-22-2013, 02:44 PM   #1
Registered: Dec 2004
Posts: 931

Rep: Reputation: 65
Refactoring db access and tweaking MySQL performance...need help

I'm working on refactoring a site I built when my PHP and database knowledge was caveman-primitive. While it has been running fabulously and profitably for years now, traffic is growing and it's starting to creak a bit under the growing load. I need to refactor a variety of things, but the database access is probably the most important. Yesterday, the DB server (an Amazon RDS instance) had its CPU maxed out and it looked pretty much like the CPU was maxed out pretty much all day long M-F during business hours. I tweaked some MySQL parameters to increase memory available for caching and it's much happier now that it's making use of the copious memory in the db server. Still, the DB is working hard.

I've got some basic questions:
1) Is there some way to alter my database access class to recognize which queries might benefit from an index?
I know from the MySQL status vars (see below) that some queries are running that result in table scans and such and that adding some indexes might help. The problem is that I have hundreds of queries and hundreds of tables and I'm not sure where to start. All queries are accomplished via a centralized DB class, but I don't want to introduce any changes for monitoring that might severely hamper performance. Is there some easy way to report queries that are not properly using indexes? I was imagining using pattern matching of some kind on explain statement but that sounds tricky and bad for performance.

2) What changes might I make to anticipate the user of a cluster instead of a single machine?
Given our rate of growth, I expect we will soon outgrow a single-machine DB server configuration and we'll need to move to a cluster. I've looked into this a bit and vaguely understand that when you have a master-slave configuration, there is typically some (hopefully short) period during which master and slaves are out of sync. I also understand (perhaps incorrectly) that it's easier to set up a single master and multiple replicating slaves rather than a multiple-master cluster where the db nodes somehow manage to keep data in sync. That said, it seems clear that I'll probably want to distinguish my write queries from my read queries so I can route writes to the master and reads to a slave. Also, I might need to alter my code to keep in mind the lag time between master and slaves. Lastly, it seems like it might be wise to delegate busy session-related tables to the slaves and try to make sure that a) users are consistently served by a single web server and b) web servers consistently connect to the same DB slave. Help! I need some input here.

3) Given the following detail from phpMyAdmin, any thoughts on my performance levels and how I might improve them?
Please note that I spent several hours trying to increase table cache sizes and so on response to these values and I've seen the server use more memory, but I'm reluctant to make more drastic changes lest I introduce instability. Any advice on specific settings would be much appreciated.
// salient details from phpmyadmin's "status" tab:
This MySQL server has been running for 0 days, 23 hours, 26 minutes and 39 seconds.
Since its startup, 4,625,274 queries have been sent to the server.

select 	641 k 	27.342 k 	14.68%
update 	280 k 	11.960 k 	6.42%
set option 	259 k 	11.032 k 	5.92%
change db 	258 k 	11.000 k 	5.91%
insert 	169 k 	7.216 k 	3.87%
delete 	20 k 	840.038 	0.45%

// items flagged in red to indicate potential problems / areas for improvment:
Slow_queries 	244 	The number of queries that have taken more than long_query_time seconds.
Innodb_buffer_pool_reads 	78 	The number of logical reads that InnoDB could not satisfy from buffer pool and had to do a single-page read. 
Handler_read_rnd 	7,456 k 	The number of requests to read a row based on a fixed position. This is high if you are doing a lot of queries that require sorting of the result. You probably have a lot of queries that require MySQL to scan whole tables or you have joins that don't use keys properly. 
Handler_read_rnd_next 	79 G 	The number of requests to read the next row in the data file. This is high if you are doing a lot of table scans. Generally this suggests that your tables are not properly indexed or that your queries are not written to take advantage of the indexes you have. 
Slow_launch_threads 	2 	The number of threads that have taken more than slow_launch_time seconds to create.
Created_tmp_disk_tables 	35 k 	The number of temporary tables on disk created automatically by the server while executing statements. If Created_tmp_disk_tables is big, you may want to increase the tmp_table_size value to cause temporary tables to be memory-based instead of disk-based.
Select_full_join 	7,460 	The number of joins that do not use indexes. If this value is not 0, you should carefully check the indexes of your tables. 
Select_range_check 	3 	The number of joins without keys that check for key usage after each row. (If this is not 0, you should carefully check the indexes of your tables.) 
Sort_merge_passes 	11 k 	The number of merge passes the sort algorithm has had to do. If this value is large, you should consider increasing the value of the sort_buffer_size system variable. 
Opened_tables 	7,089 	The number of tables that have been opened. If opened tables is big, your table cache value is probably too small. 
Table_locks_waited 	57 k 	The number of times that a table lock could not be acquired immediately and a wait was needed. If this is high, and you have performance problems, you should first optimize your queries, and then either split your table or tables or use replication.
Old 05-23-2013, 02:12 AM   #2
LQ Guru
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,198

Rep: Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310
A couple of things occur to me (apologies if you know this stuff already).

1. Indexes
Adding indexes will probably give you a quick gain.
You should know that indexes are only used if the indexed column(s) are explicitly mentioned in the WHERE clause of a SQL statement.
Note that for compound indexes (ie multi-column eg index on col_a, col_b, col_c), its reads the leading edge of the index first ie you can't just put cols b & c in the WHERE clause, the index will not be used; you must mention col_a as well.
It also used to be true (& may still be), that MySQL only uses one index per table in a SQL statement, even if more than one could be used... it may or may not use the 'best' one.
You can give the optimiser a hint

Incidentally, Databases really love to have a lot of RAM for caching purposes.

2. Master-Slave
This is Replication, effectively giving you a hot standby. If you do master-master you get similar thing to a Cluster, but this is not a Cluster.
A Cluster is multiple 'instances' literally sharing the same (physical) data ie disks.
You probably want to read up on these 2 technologies and decide which you want to use.

In general have a good read of the MySQL website eg; just make sure you read the right version of the doc.

HTH (at least a little)

PS I assume you have a (separate!) dev/test system to try stuff out on.

Last edited by chrism01; 05-23-2013 at 02:14 AM.
Old 05-23-2013, 03:30 PM   #3
Registered: Dec 2004
Posts: 931

Original Poster
Rep: Reputation: 65
Thanks for your response, chrism01.

1) I'm familiar with indexes and the explain statement to try and optimize things. The problem is that I have hundreds of queries throughout the site and need some way to identify the problematic ones. I'm in the process of turning on the slow query log (a bit trickier with an Amazon RDS instance) but I expect that much of the CPU effort is not in queries that are slow but rather in ones that result in a table scan but are not slow enough to get written to the slow query log. Thus my original question, "how do we identify queries that fail to use an index?"

2) I have a basic familiarity with master/slave setups and vaguely recall that setting up master/slave is a lot easier than any kind of master/master configuration. I further understand that one's application needs to distinguish between read and write queries in a master/slave situation because writes must go to the master and reads to the slave.

Sadly, your second link did not work
Old 05-23-2013, 08:32 PM   #4
LQ Guru
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.8, Centos 5.10
Posts: 17,198

Rep: Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310Reputation: 2310
1. The logs should show you which ones are doing full table scans.
If you use your knowledge of the system, you may be able to pick a few up that way as well.

For a serious system like this, I'd highly recommend creating use/test cases for each SQL statement (at least major ones).
May take some time, but as you've seen, you're now in the serious player arena and need to start doing stuff like that.
You can re-use them each time you do an change; google 'regression testing'

2. link : the above link had a';' on the end that got picked up by the html link parser here (LQ), I should have added a space in front of it.


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off

Similar Threads
Thread Thread Starter Forum Replies Last Post
Tweaking wine for better performance Jayla Linux - Games 7 09-04-2009 11:10 AM
Tweaking xorg for better game performance?? stratotak Debian 1 12-12-2008 01:12 AM
Tweaking Mepis Performance Memphis MEPIS 3 08-17-2006 09:57 AM
Tweaking raid 0 performance jackj Linux - General 3 07-16-2006 11:25 AM
Tweaking NFSv4 Performance fortezza Linux - Newbie 2 08-14-2005 08:20 PM

All times are GMT -5. The time now is 12:44 AM.

Main Menu
Write for LQ is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration