HPM Note5, Query Performance Optimization

hideto

浏览: 2703150 次
性别:
来自: 北京

最近访客更多访客>>

_Mark24_

lzyfn123

leeyisoft

linziyuu

博主相关

博客

微博

相册

留言

关于我

文章分类

社区版块

存档分类

博客分类：

Database

performance MySQL SQL Server thread memcached

Slow Query Basics: Optimize Data Access
Analyze a poorly performing query in two steps:
1, Find out whether your application is retrieving more data than you need
2, Find out whether the MySQL server is analyzing more rows than it needs

Typical mistakes:
1, Fetching more rows than needed
2, Fetching all columns from a multitable join
3, Fetching all columsn

The simplest query cost metrics in MySQL(Slow query log):
1, Execution time
2, Number of rows examined
3, Number of rows returned

The ratio of rows examined to rows returned is usually small-say, between 1:1 and 10:1-but sometimes it can be orders of magnitude larger

The access methods apprear in the type column in EXPLAIN's output
The access types range from a full table scan to index scans, range scans, unique index lookups and constants

If you find that a huge number of rows were examined to produce relatively few rows in the result, you can try some more sophisticated fixes:
1, Use covering indexes, which store data so that the storage engine doesn't have to retrieve the complete rows
2, Change the schema. An example is using summary tables
3, Rewrite a complicated query so the MySQL optimizer is able to execute it optimally

Ways to Restructure Queries
Sometimes you can make a query more efficient by decomposing it and executing a few simple queries instead of one complex one

Chopping up the DELETE statement and using medium-size queries can improve performance and reduce replication lag considerably

// Bad
mysql > DELETE FROM messages WHERE created < DATE_SUB(NOW(), INTERVAL 3 MONTH);

// Good
rows_affected = 0
do {
  rows_affected = do_query (
    "DELETE FROM messages WHERE created < DATE_SUB(NOW(), INTERVAL 3 MONTH) LIMIT 10000"
  )
} while rows_affected > 0

Many high-performance web sites use join decomposition
You can decompose a join by running multiple single-table queries instead of a multitable join, and then performing the join in the application

// Bad
SELECT * FROM tag JOIN tag_post ON tag_post.tag_id = tag.id JOIN post ON tag_post.post_id = post.id WHERE tag.tag = 'mysql'

//Good
SELECT * FROM tag WHERE tag='mysql'
SELECT * FROM tag_post WHERE tag_id = 1234
SELECT * FROM post WHERE post.id in (123,456,567,9098,8904)

Doing joins in the application may be more efficient when:
1, You cache and reuse a lot of data from earlier queries
2, You use multiple MyISAM tables
3, You distribute data across multiple servers
4, You replace joins with IN() lists on large tables
5, A join refers to the same table multiple times

Query Execution Basics
What happens when you send MySQL a query:
1, The client sends the SQL statement to the server
2, The server checks the query cache
3, The server parses, preprocesses, and optimizes the SQL into a query execution plan
4, The query execution engine executes the plan by making calls to the storage engine API
5, The server sends the result to the client

The MySQL Client/Server Protocol is half duplex, which means that at any given time the MySQL server can be either sending or receiving messages, but not both
It also means there is no way to cut a message short
There's no flow control; once oneside sends a message, the other side must fetch the entire message before responding
When the server responds, the client has to receive the entire result set

You can use SHOW FULL PROCESSLIST to view MySQL connection or thread state:
1, Sleep
The thread is waiting for a new query from the client
2, Query
The thread is either executing the query or sending the result back to the client
3, Locked
The thread is waiting for a table lock to be granted at the server level
4, Analyzing and statistics
The thread is checking storage engine statistics and optimizing the query
5, Copying to tmp table [on disk]
The thread is processing the query and copying results to a temporary table, probably for a GROUP BY, for a filesort, or to satisfy a UNION
If the state ends with "on disk", MySQL is converting an in-memory table to an on-disk table
6, Sorting result
The thread is sorting a result set
7, Sending data
This can mean several things: the thread might be sending data between stages of the query, generating the result set, or returning the result set to the client

Some types of optimizations MySQL knows how to do:
1, Reordering joins
2, Converting OUTER JOINS to INNER JOINS
3, Applying algebraic equivalence rules
4, COUNT(), MIN(), and MAX() optimizations
5, Evaluating and reducing constant expressiosn
6, Covering indexes
7, Subquery optimization
8, Early termination
9, Equality propagation
10, IN() list comparisons

// to-do
// Have a thorough understanding of MySQL parser/optimizer

Optimizing Specific Types of Queries
COUNT

// Bad
SELECT COUNT(*) FROM world.City WHERE ID > 5;
// Good
SELECT (SELECT COUNT(*) FROM world.City) - COUNT(*) FROM world.City WHERE ID <= 5;

Use covering index, Summary tables or external caching system such as memcached

JOIN
1, Make sure there are indexes on the columns in the ON or USING clauses
2, Try to ensure that any GROUP BY or ORDER BY expression refers only to columns from a single table
3, Be careful when upgrading MySQL, because the join syntax, operator precedence, and other behaviors have changed at various times

Subqueries
You should usually prefer a join where possible

GROUP BY and DISTINCT
WITH ROLLUP or do superaggregation in your application

LIMIT and OFFSET
Covering index
Convert the limit to a positional query by precalculate and index a position column
Precomputed summaries
Sphinx

UNION
Always use UNION ALL

分享到：

轻量级开源BI简要设计 | HPM Note4, Schema Optimization and Index ...

2009-07-21 18:05
浏览 1522
评论(0)
查看更多

发表评论

您还没有登录,请您登录后再发表评论

最近访客更多访客>>

博主相关

文章分类

社区版块

存档分类

最新评论