MySQL 隱式轉換必知必會

在生產環境中經常會有一些隱式類型轉換導致SQL索引失效,性能極差,進而影響影響集羣負載和業務的情況。本文總結了隱式轉換常見的場景,在生產中要儘量避免 SQL 隱式轉換的出現。

作者:張洛丹,熱衷於數據庫技術,不斷探索,期望未來能夠撰寫更有深度的文章,輸出更有價值的內容!

愛可生開源社區出品,原創內容未經授權不得隨意使用,轉載請聯繫小編並註明來源。

本文約 3000 字,預計閱讀需要 10 分鐘。

常見的 SQL 產生隱式轉換的場景有:

  1. 數據類型的隱式轉換
  2. 字符集的隱式轉換

其中,特別是在表連接場景和存儲過程中的字符集轉換很容易被忽略。

說明:字符集是針對字符類型數據的編碼規則,對於數值類型則不需要進行轉換字符集。

數據類型的隱式轉換

測試表結構

t1 表字段 a 爲 VARCHAR 類型,t2 表字段 a 爲 INT 類型。

mysql> show create database test1\G
*************************** 1. row ***************************
       Database: test1
Create Database: CREATE DATABASE `test1` /*!40100 DEFAULT CHARACTER SET utf8 */
1 row in set (0.00 sec)
mysql> show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `id` int(11) NOT NULL,
  `a` varchar(20) DEFAULT NULL,
  `b` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
mysql> show create table t2\G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `id` int(11) NOT NULL,
  `a` int(11) DEFAULT NULL,
  `b` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

單表示例

這裏需要說明的是,有以下兩種類型的轉換:

  1. 當字段類型爲字符串類型,參數爲整型時,會導致索引失效
  2. 而字段類型爲整型,傳入的參數爲字符串類型時,不會導致索引失效

這是因爲在字符串與數字進行比較時,MySQL 會將字符串類型轉換爲數字進行比較,因此當字段類型爲字符串時,會在字段上加函數,而導致索引失效。

官方文檔說明:Strings are automatically converted to numbers and numbers to strings as necessary.

-- 字段類型爲varchar,傳參爲整數,無法走到索引
mysql> explain select * from t1 where a=1000;
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | t1    | NULL       | ALL  | a             | NULL | NULL    | NULL | 498892 |    10.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 3 warnings (0.00 sec)
mysql> show warnings;
+---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                                           |
+---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1739 | Cannot use ref access on index 'a' due to type or collation conversion on field 'a'                                                               |
| Warning | 1739 | Cannot use range access on index 'a' due to type or collation conversion on field 'a'                                                             |
| Note    | 1003 | /* select#1 */ select `test1`.`t1`.`id` AS `id`,`test1`.`t1`.`a` AS `a`,`test1`.`t1`.`b` AS `b` from `test1`.`t1` where (`test1`.`t1`.`a` = 1000) |
+---------+------+---------------------------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)
-- 字段類型爲int,傳參爲字符串,可以走到索引
mysql> explain select * from t2 where a='1000';
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t2    | NULL       | ref  | a             | a    | 5       | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

至於爲什麼不能將數字轉換爲字符串進行比較呢?

下面的比較結果:

  • 字符串的比較是逐個比較字符串的大小,直到找到不同的字符,這樣的比較結果和數字的比較結果是不同的。
mysql> select '2000' <'250';
+---------------+
| '2000' <'250' |
+---------------+
|             1 |
+---------------+
1 row in set (0.00 sec)

表連接中的數據類型轉換

當兩個表的連接字段類型不一致時會導致隱式轉換(MySQL 內部增加 cast() 函數),無法走到連接字段索引,進而可能無法使用最優的表連接順序。

原本作爲被驅動表的表由於無法使用到索引,而可能作爲驅動表。

示例:

  • 如下,正常情況下會選擇 t2 表作爲驅動表,但由於數據類型不同,實際上執行的 SQL 是:select * from t1 join t2 on cast(t1.a as unsigned)=t2.a where t2.id<1000
  • 如果 t1 作爲被驅動表,則沒有辦法走到 t1.a 的索引,因此選擇 t1 表作爲驅動表
mysql> explain select * from t1 join t2 on t1.a=t2.a where t2.id<1000;
+----+-------------+-------+------------+------+---------------+------+---------+------------+--------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref        | rows   | filtered | Extra                 |
+----+-------------+-------+------------+------+---------------+------+---------+------------+--------+----------+-----------------------+
|  1 | SIMPLE      | t1    | NULL       | ALL  | a             | NULL | NULL    | NULL       | 498892 |   100.00 | Using where           |
|  1 | SIMPLE      | t2    | NULL       | ref  | PRIMARY,a     | a    | 5       | test1.t1.a |      1 |     5.00 | Using index condition |
+----+-------------+-------+------------+------+---------------+------+---------+------------+--------+----------+-----------------------+
2 rows in set, 2 warnings (0.00 sec)
mysql> show warnings;
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                                                                                                                                                                                                    |
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Warning | 1739 | Cannot use ref access on index 'a' due to type or collation conversion on field 'a'                                                                                                                                                                                                        |
| Note    | 1003 | /* select#1 */ select `test1`.`t1`.`id` AS `id`,`test1`.`t1`.`a` AS `a`,`test1`.`t1`.`b` AS `b`,`test1`.`t2`.`id` AS `id`,`test1`.`t2`.`a` AS `a`,`test1`.`t2`.`b` AS `b` from `test1`.`t1` join `test1`.`t2` where ((`test1`.`t2`.`id` < 1000) and (`test1`.`t1`.`a` = `test1`.`t2`.`a`)) |
+---------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
2 rows in set (0.01 sec)

字符集的隱式轉換

當參數字符集和字段字符集不同時,無法直接進行比較,而需要進行字符集轉換,則可能需要在轉換字段上加 convert() 函數來轉換字符集,導致索引失效。

測試表結構

  • 數據庫字符集是 UTF8MB4
  • t1 表字符集是 UTF8
  • t2 表字符集是 UTF8MB4
mysql> show create database test\G
*************************** 1. row ***************************
       Database: test
Create Database: CREATE DATABASE `test` /*!40100 DEFAULT CHARACTER SET utf8mb4 */
mysql> show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `id` int(11) NOT NULL,
  `a` varchar(20) DEFAULT NULL,
  `b` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
mysql> show create table t2\G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `id` int(11) NOT NULL,
  `a` varchar(20) DEFAULT NULL,
  `b` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.01 sec)

單表示例

-- 正常執行時,匹配字段的字符集(沒有單獨指定時繼承表的字符集)
mysql> explain select * from t1 where a='1000';
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t1    | NULL       | ref  | a             | a    | 63      | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

-- 將參數轉換不同的字符集,無法走到索引,而是全表掃描
mysql> explain select * from t1 where a=convert('1000' using utf8mb4);
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 2000 |   100.00 | Using where |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)


-- show warnings可以看到優化器進行了轉換,在t1.a上加了convert函數,從而無法走到索引
mysql> show warnings;
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                               |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select `test`.`t1`.`id` AS `id`,`test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b` from `test`.`t1` where (convert(`test`.`t1`.`a` using utf8mb4) = <cache>(convert('1000' using utf8mb4))) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

另外,需要注意的是:

MySQL 內部會優先將低級的字符集轉換爲更高級的字符集,例如將 UTF8 轉換爲 UTF8MB4。

在前面的示例中,convert() 函數加在 t1.a 上,而下面這個示例,convert() 函數加在參數上,而非 t2.a 字段上,這種情況則沒有導致性能變差:

mysql> show create table t2\G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `id` int(11) NOT NULL,
  `a` varchar(20) DEFAULT NULL,
  `b` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `a` (`a`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)
mysql> explain select * from t2 where a=convert('1000' using utf8);
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref   | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | t2    | NULL       | ref  | a             | a    | 83      | const |    1 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
mysql> show warnings;
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                   |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select `test`.`t2`.`id` AS `id`,`test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b` from `test`.`t2` where (`test`.`t2`.`a` = convert(convert('1000' using utf8) using utf8mb4)) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

綜上:

  • 在當表字段字符集爲更低級的字符集(如 UTF8),而傳入的值爲更高級的字符集(如 UTF8MB4),則此時會轉換表字段的字符集,相當於字段上使用了函數,索引失效。
  • 當表字段爲更高級的字符集(如 UTF8MB4),而傳入的值爲更低級的字符集(如 UTF8),則此時會將傳入的值進行字符集轉換,並不會導致索引失效。

但我們通常不會去手工使用 convert() 函數轉換參數的字符集,在後文兩種場景中可能會出現比較容易忽略的隱式類型轉換,引發生產問題。

表連接中的字符集轉換

當兩個表的連接字段字符集不一致時會導致隱式轉換(MySQL 內部增加 convert() 函數),無法走到連接字段索引,進而可能無法使用最優的表連接順序。

原本作爲被驅動表的表由於無法使用到索引,而可能作爲驅動表。

示例:

  • 正常情況下,MySQL 會優先小結果集的表作爲驅動表,在本例中即爲 t2 爲驅動表,t1 爲被驅動表。
  • 但是由於字符集不同,實際上執行的 SQL 爲 show warnings 看到的,對 t1.a 字段加了 convert() 函數進行轉換字符集,則無法走到 t1.a 字段的索引而不得不改變連接順序。
mysql> explain select * from t1 left join t2 on t1.a=t2.a where t2.id<1000;
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra                 |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-----------------------+
|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 498649 |   100.00 | NULL                  |
|  1 | SIMPLE      | t2    | NULL       | ref  | PRIMARY,a     | a    | 83      | func |      1 |     4.79 | Using index condition |
+----+-------------+-------+------------+------+---------------+------+---------+------+--------+----------+-----------------------+
2 rows in set, 1 warning (0.00 sec)
mysql> show warnings;
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                                                                |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select `test`.`t1`.`id` AS `id`,`test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`id` AS `id`,`test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b` from `test`.`t1` join `test`.`t2` where ((`test`.`t2`.`id` < 1000) and (convert(`test`.`t1`.`a` using utf8mb4) = `test`.`t2`.`a`)) |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)




-- 在下面示例中,雖然也發生了類型轉換,但是效率並沒有變差,因爲原本最優的連接順序就是t1作爲驅動表
mysql> explain select * from t1 left join t2 on t1.a=t2.a where t1.id<1000;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t1    | NULL       | range | PRIMARY       | PRIMARY | 4       | NULL |  999 |   100.00 | Using where |
|  1 | SIMPLE      | t2    | NULL       | ref   | a             | a       | 83      | func |    1 |   100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)


mysql> show warnings;
+-------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                                                                                                                                                                                                   |
+-------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select `test`.`t1`.`id` AS `id`,`test`.`t1`.`a` AS `a`,`test`.`t1`.`b` AS `b`,`test`.`t2`.`id` AS `id`,`test`.`t2`.`a` AS `a`,`test`.`t2`.`b` AS `b` from `test`.`t1` left join `test`.`t2` on((convert(`test`.`t1`.`a` using utf8mb4) = `test`.`t2`.`a`)) where (`test`.`t1`.`id` < 1000) |
+-------+------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

存儲過程中的字符集轉換

這也是比較容易忽略的一種場景,問題的發現是在生產環境存儲過程中根據主鍵更新,但卻需要執行 10s+。

存儲過程中變量的字符集默認繼承自 database 的字符集(也可以在創建時指定),當表字段字符集和 database 的字符集不一樣時,就會出現類似前面的隱式字符集類型轉換。

示例:

  • database 的字符集是 UTF8MB4
  • character_set_clientcollation_connection 是創建存儲過程時會話的 character_set_clientcollation_connection 的值
  • 經測試存儲過程中的變量的字符集是和數據庫級別的字符集一致
-- 存儲過程信息: Database Collation: utf8mb4_general_ci
mysql> show create procedure update_data\G
*************************** 1. row ***************************
           Procedure: update_data
            sql_mode: ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
    Create Procedure: CREATE DEFINER=`root`@`%` PROCEDURE `update_data`()
begin
  declare j int;
  declare n varchar(100);
   select charset(n);
  set j=1;
  while(j<=2000)do
set n = cast(j as char);
select 1,now();
    update t1 set b=concat(b,'1') where a=n;
select 2,now();
select sleep(1);
    set j=j+1;
  end while;
end
character_set_client: utf8mb4
collation_connection: utf8mb4_general_ci
  Database Collation: utf8mb4_general_ci
1 row in set (0.00 sec)
如下,在執行存儲過程後,看到打印的變量n的字符集是utf8mb4


mysql> call update_data();
+------------+
| charset(n) |
+------------+
| utf8mb4    |
+------------+
1 row in set (0.00 sec)

根據索引字段 a 更新的語句實際上是變成了下面這樣,走的是全表掃描(type:index,key:primary)。

mysql> explain update t1 set b=concat(b,'1') where a=convert('1000' using utf8mb4);
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
|  1 | UPDATE      | t1    | NULL       | index | NULL          | PRIMARY | 4       | NULL | 498649 |   100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
1 row in set (0.00 sec)


-- 而正常情況下,執行計劃爲:
mysql> explain update t1 set b=concat(b,'1') where a='1000';
+----+-------------+-------+------------+-------+---------------+------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type  | possible_keys | key  | key_len | ref   | rows | filtered | Extra       |
+----+-------------+-------+------------+-------+---------------+------+---------+-------+------+----------+-------------+
|  1 | UPDATE      | t1    | NULL       | range | a             | a    | 63      | const |    1 |   100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+------+---------+-------+------+----------+-------------+
1 row in set (0.00 sec)

更新時間也由 0.00sec 變爲 0.60sec,在表數據量很大的情況下,全表掃描將會對生產產生較大影響。

mysql> update t1 set b=concat(b,'1') where a='1000';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
mysql> update t1 set b=concat(b,'1') where a=convert('1000' using utf8mb4);
Query OK, 1 row affected (0.60 sec)
Rows matched: 1  Changed: 1  Warnings: 0

如何避免隱式轉換

對於數據類型的隱式轉換:

  1. 規範數據類型的選擇
  2. SQL 傳參與字段數據類型匹配

對於字符集的隱式轉換:客戶端字符集、服務器端字符集、數據庫字符集、表字符集、字段字符集保持一致。

更多技術文章,請訪問:https://opensource.actionsky.com/

關於 SQLE

SQLE 是一款全方位的 SQL 質量管理平臺,覆蓋開發至生產環境的 SQL 審覈和管理。支持主流的開源、商業、國產數據庫,爲開發和運維提供流程自動化能力,提升上線效率,提高數據質量。

SQLE 獲取

類型 地址
版本庫 https://github.com/actiontech/sqle
文檔 https://actiontech.github.io/sqle-docs/
發佈信息 https://github.com/actiontech/sqle/releases
數據審覈插件開發文檔 https://actiontech.github.io/sqle-docs/docs/dev-manual/plugins/howtouse
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章