Our development team recently ran into quite an interesting issue – when joining around 20 tables together, the query performed outrageously slowly, taking in excess of 40 seconds to execute. This was using MySQL 5.7 at the time, but astonishingly, it ran in less than a second on MySQL 5.6. Further, we were returning only a handful of records. So what happened?
A bit of background
Before a query is executed, the database engine will attempt to find the optimum way to join all the queries together and access the data required. This is essential, as a bad decision in the join stage can tank the performance of the whole gig. The outcome of the optimisation stage is called a Query Execution Plan, and underpins the resulting query execution time.
To find the Query Execution Plan MySQL performs a backwards depth-first search algorithm to determine which methodology will result in the fastest performance. The challenge is the search space grows astronomically quickly – for 5 tables there are 120 combinations to explore. Adding just one more table raises this to 720, a 6-fold increase. In the scenario we had, 20 tables will result in a search space of 2,432,902,008,176,640,000 possible outcomes. Clearly MySQL cannot check all of these, so it uses a combination of heuristics to try and find the best query plan within an acceptable time limit. After all, this step is a trade off to reduce the actual execution stage of the query. When MySQL is doing all this, you’ll see the query will be placed in the STATISTICS state.
So what happened in our case?
It turns out that the time MySQL is allowed to spend finding the best Query Execution Plan is, in part, controlled by the optimizer_search_depth engine parameter. By default, this value was quite high. If you set this to zero, MySQL will choose the depth but also cap out at a reasonable value.
SET SESSION optimizer_search_depth = 0;
In essence, MySQL was stuck optimising the overall query and trying to find the ultimate Query Execution Plan, when in reality, it would be far quicker to just run it with whatever plan it is able to come up with just searching a few potential solutions