2014년 8월 11일 월요일

MySQL LIMIT offset performance

http://devoluk.com/mysql-limit-offset-performance.html



MySQL LIMIT offset performance

Evey so often we need to SELECT a bunch of rows by date, category or keyword and show the result paginated. Pagination is almost always an implicit requirement. After all we have limited screen space and there's only so much of information we can read and comprehend at once. There's no need to transfer more information than we are going to show. We need to defer the additional data request until the user asks for it it by following the Next Page or More Info links or scrolling down towards the end of the content. Let's look at the MySQL side of such a trivial requirement as pagination - the LIMIT keyword. When used in conjunction with ORDER BY you can get a list of rows sorted by some criteria. And when used in conjunction with WHERE keyword you can filter out certail rows. 

Let's look at this simple table:
IDNameDate
1SampleeOct 1
2TestOct 2
..............

One SELECT using ORDER BY and LIMIT

Usually we'd order the products by date and want MySQL server to return just the rows for the particular page we are looking for. We'd use LIMIT and specify the page offset and number of rows we need. 

SELECT *
FROM   products
ORDER  BY date
LIMIT  0, 10 


The query selects all rows from our products table and orders them by date. We have created the appropriate INDEX so that ordering by Date should be fast. This query is easy and straightforward but it can cause a sever performance hit. MySQL LIMIT may return results way too slow when the offset is high. When used in a table with a lot of rows this query gets slower and slower the higher page offset gets. It seems like for some reason MySQL is operating on the whole table data instead of working only with the rows we'll actually need.

LIMIT is used for pagination very often. In the example above MySQL needs to sort the rows by Date (using an index) and then lookup the rows specified in the OFFSET, LIMIT keyword. 
We can rewrite this query so that we make it clear that only a fraction of the rows are required.
Time (sec)Select Page by OffsetPage IndexLIMITJOIN010k20k30k40k50k60k70k80k90k100k110k120k130k140k150k160k170k180k190k120.751.251.51.752.25Highcharts.com

An alternative approach

Here's a suggested ORDER BY ... LIMIT Performance Optimization

SELECT *
FROM   products
JOIN (SELECT id
      FROM   products
      ORDER  BY date
      LIMIT  0, 10) AS t ON t.id = products.id; 


Let's see what it does. First it calculates the IDs to be returned and then JOINs the rest of the column data to it. It clearly states that we only need 10 IDs and then bring in the rest of the column data. This query is slightly faster. What matters most is that it looks like increasing page offset doesn't slow the query down that much. The query returns fast enough even for huge page offsets. Users don't usually browse that deep in a website but bots do. And when they do the requests will get slower and slower as the page index increases. 

A probable explanation is late row lookup: For more information check out: MySQL ORDER BY / LIMIT performance: late row lookups 

댓글 없음:

댓글 쓰기