An example of IN() subquery performance degradation and how to optimize it (specific to the MySQL 5.5 release line)
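1. Query requirement:
Given that a user's gameid is 101190, look up that user's name. (The two columns live in two different tables, which are related through the id column.)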
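To make the setup concrete, here is a minimal sketch of what the two tables might look like. The column types, the VARCHAR length and the storage engine are assumptions; the original only tells us the table names, the column names id, name and gameid, and (from the EXPLAIN output, key_len 4) that both tables have a 4-byte primary key.

```sql
-- Hypothetical schema: types, lengths and engine are assumptions for illustration.
CREATE TABLE lee (
    id   INT NOT NULL,          -- 4-byte key, matching key_len = 4 in EXPLAIN
    name VARCHAR(32) NOT NULL,  -- length chosen arbitrarily
    PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE lee1 (
    id     INT NOT NULL,        -- same id as in lee, used to relate the tables
    gameid INT NOT NULL,        -- no index here, which is why lee1 can show type ALL
    PRIMARY KEY (id)
) ENGINE=InnoDB;
```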
With an IN subquery, the MySQL statement can be written like this:
mysql> select name from lee where id IN (select id from lee1 where gameid=101190);
+-------+
| name  |
+-------+
| lee19 |
+-------+
1 row in set (0.00 sec)
2. Analyze how many rows the statement touches:
mysql> explain select name from lee where id IN (select id from lee1 where gameid=101190);
+----+--------------------+-------+-----------------+---------------+---------+---------+------+------+-------------+
| id | select_type        | table | type            | possible_keys | key     | key_len | ref  | rows | Extra       |
+----+--------------------+-------+-----------------+---------------+---------+---------+------+------+-------------+
|  1 | PRIMARY            | lee   | ALL             | NULL          | NULL    | NULL    | NULL | 1198 | Using where |
|  2 | DEPENDENT SUBQUERY | lee1  | unique_subquery | PRIMARY       | PRIMARY | 4       | func |    1 | Using where |
+----+--------------------+-------+-----------------+---------------+---------+---------+------+------+-------------+
2 rows in set (0.00 sec)
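3. Analysis: the plan shows a full table scan on lee (type ALL, about 1198 rows). The subquery is marked DEPENDENT SUBQUERY: it is tied to the id of the outer table lee, so MySQL decides it cannot execute the subquery first. Instead it scans all of lee and runs the IN subquery once for each id that scan returns. On a very large table this query will perform poorly.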
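The DEPENDENT SUBQUERY row is consistent with the commonly described behavior of MySQL 5.5, which pushes the outer id into the subquery so that the statement behaves roughly like a correlated EXISTS. The form below is a sketch of that equivalent, not the optimizer's literal internal rewrite:

```sql
-- Approximate equivalent of the IN query as MySQL 5.5 evaluates it:
-- the inner SELECT runs once per row of lee, using that row's id.
SELECT name
FROM lee
WHERE EXISTS (
    SELECT 1
    FROM lee1
    WHERE lee1.gameid = 101190
      AND lee1.id = lee.id   -- correlation with the outer table
);
```

Running EXPLAIN on this form should make the per-row dependency on lee easier to see than the IN spelling does.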
4. Rewrite the query above as a join:
mysql> select name from lee left join lee1 on lee.id=lee1.id where gameid=101190;
+-------+
| name  |
+-------+
| lee19 |
+-------+
1 row in set (0.00 sec)
5. Analyze how many rows the statement touches:
mysql> explain select name from lee left join lee1 on lee.id=lee1.id where gameid=101190;
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+-------------+
| id | select_type | table | type   | possible_keys | key     | key_len | ref           | rows | Extra       |
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+-------------+
|  1 | SIMPLE      | lee1  | ALL    | PRIMARY       | NULL    | NULL    | NULL          |  200 | Using where |
|  1 | SIMPLE      | lee   | eq_ref | PRIMARY       | PRIMARY | 4       | cr_db.lee1.id |    1 |             |
+----+-------------+-------+--------+---------------+---------+---------+---------------+------+-------------+
2 rows in set (0.00 sec)
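6. Notes: in database theory a join is considered quite expensive (conceptually, the tables involved form a Cartesian product), but the MySQL optimizer handles joins well. The row counts in the plan show that lee1 is read first, with only about 200 rows examined (versus about 1198 for the scan of lee in the IN version), and each row that passes the gameid filter then drives a single primary-key lookup (eq_ref) on lee in the nested-loop join. Because the WHERE clause filters on gameid, a column of lee1, rows of lee without a match in lee1 are discarded anyway, so the LEFT JOIN behaves like an inner join here and the optimizer is free to start from lee1. What really affects join performance is the order in which the tables are joined (especially when more than two tables take part): the join optimizer tries to pick the lowest-cost order among the possible join orders when building the execution plan tree.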
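One way to see how much the join order matters is to compare the optimizer's own choice with a forced order. The sketch below assumes the inner-join spelling of the query above and uses MySQL's STRAIGHT_JOIN, which makes the left table be read before the right one. No output is shown because it depends on the actual data.

```sql
-- Let the optimizer choose the join order (inner-join form of the query above).
EXPLAIN
SELECT lee.name
FROM lee
JOIN lee1 ON lee.id = lee1.id
WHERE lee1.gameid = 101190;

-- Force lee to be read first, for comparison only.
EXPLAIN
SELECT lee.name
FROM lee
STRAIGHT_JOIN lee1 ON lee.id = lee1.id
WHERE lee1.gameid = 101190;
```

If the forced plan reports a larger rows estimate for its first table, that matches the point above: starting from the smaller, filtered table is the cheaper order.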