MySQL Innodb 如何找出阻塞事務源頭 SQL

在MySQL數據庫中出現了阻塞問題,如何快速查找定位問題根源?在實驗開始前,我們先梳理一下有什麼工具或命令查看MySQL的阻塞,另外,我們也要一一對比其優劣,因爲有些命令可能在實際環境下可能並不適用。

show engine innodb status
Innotop工具
INNODB_TRX 等系統表
下面我們理論聯繫實際,通過實驗來測試總結一下這個問題。首先構造測試環境,數據庫測試環境爲( 5.7.21 MySQL Community Server 和5.6.20-enterprise-commercial,這兩個測試環境我都測試驗證過)

mysql> use MyDB;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
mysql> create table test_blocking(id int primary key, name varchar(12));
Query OK, 0 rows affected (0.05 sec)
mysql> insert into test_blocking
-> select 1, 'kerry' from dual;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> insert into test_blocking
-> select 2, 'jimmy' from dual;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
mysql> insert into test_blocking
-> select 3, 'kkk' from dual;
Query OK, 1 row affected (0.00 sec)
Records: 1 Duplicates: 0 Warnings: 0
準備好測試環境數據後,那麼我們接下來開始實驗,爲了實驗效果,我們先將參數innodb_lock_wait_timeout設置爲100。

mysql> show variables like 'innodb_lock_wait_timeout';
+--------------------------+-------+
| Variable_name | Value |
+--------------------------+-------+
| innodb_lock_wait_timeout | 50 |
+--------------------------+-------+
1 row in set (0.00 sec)
mysql> set global innodb_lock_wait_timeout=100 ;
Query OK, 0 rows affected (0.00 sec)
mysql> select connection_id() from dual;
+-----------------+
| connection_id() |
+-----------------+
| 8 |
+-----------------+
1 row in set (0.00 sec)
mysql> set session autocommit=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from test_blocking where id=1 for update;
+----+-------+
| id | name |
+----+-------+
| 1 | kerry |
+----+-------+
1 row in set (0.00 sec)
然後在第二個連接會話中執行更新腳本,構造被阻塞的案例

mysql> select connection_id() from dual;
+-----------------+
| connection_id() |
+-----------------+
| 9 |
+-----------------+
1 row in set (0.00 sec)
mysql> update test_blocking set name='kk' where id=1;
在第三個連接會話執行下面命令,查看TRANSACTIONS相關信息:
1
mysql> show engine innodb status\G;

使用show engine innodb status命令後,可以查看其輸出的TRANSACTIONS部分信息,如上截圖所示,找到類似TRX HAS BEEN WATING …部分的信息,
通過那部分信息,我們可以看到update test_blocking set name=’kk’ where id=1這個SQL語句被阻塞了14秒,一直在等待獲取X Lock。

TRANSACTIONS

Trx id counter 148281 #下一個事務ID
Purge done for trx's n:o < 148273 undo n:o < 0 state: running but idle
History list length 552
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 0, not started
MySQL thread id 15, OS thread handle 0x4cc64940, query id 261 localhost root cleaning up
---TRANSACTION 0, not started
MySQL thread id 14, OS thread handle 0x4cbe2940, query id 278 localhost root init
show engine innodb status
---TRANSACTION 148280, ACTIVE 24 sec
2 lock struct(s), heap size 360, 1 row lock(s)
MySQL thread id 8, OS thread handle 0x4cba1940, query id 276 localhost root cleaning up
---TRANSACTION 148279, ACTIVE 313 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 360, 1 row lock(s)
MySQL thread id 9, OS thread handle 0x4cc23940, query id 277 localhost root updating #線程ID爲9, 操作系統線程句柄爲0x4cc23940, 查詢ID爲277,賬號爲root的UPDATE操作
update test_blocking set name='kk' where id=1 #具體SQL語句
------- TRX HAS BEEN WAITING 14 SEC FOR THIS LOCK TO BE GRANTED: #TRX等待授予鎖已經有14秒了
RECORD LOCKS space id 337 page no 3 n bits 72 index PRIMARY of table MyDB.test_blocking trx id 148279 lock_mode X locks rec but not gap waiting
#在space id=337(test_blocking表的表空間),page no=3的頁上,表test_blocking上的主鍵索引在等待X鎖
Record lock, heap no 2 PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 4; hex 80000001; asc ;; #第一個字段是主鍵,制度按長爲4,值爲1
1: len 6; hex 000000024322; asc C";; #該字段爲6個字節的事務id,這個id表示最近一次被更新的事務id(對應十進制爲148258)
2: len 7; hex 9a000001f20110; asc ;; #該字段爲7個字節的回滾指針,用於mvcc
3: len 5; hex 6b65727279; asc kerry;; #該字段表示的是此記錄的第二個字段,長度爲5,值爲kerry(如果表有多個字段,那麼此處後面還有記錄)

mysql> select * from information_schema.INNODB_SYS_DATAFILES where space=337;
+-------+--------------------------+
| SPACE | PATH |
+-------+--------------------------+
| 337 | ./MyDB/test_blocking.ibd |
+-------+--------------------------+
1 row in set (0.00 sec)
mysql>

但是這種方式也有一些弊端,例如生產環境很複雜,尤其是有大量事務的情況下。諸多信息根本無法清晰判斷知道誰阻塞了誰;其次一點也不直觀; 另外,這個也無法定位blocker 的SQL語句。這種方式只能作爲輔助分析用途,通過查看取鎖的詳細信息,幫助進一步診斷問題。
2: Innotop工具

如下所示,Innotop工具很多情況下也不能定位到阻塞的語句(Blocking Query), 也僅僅能獲取一些鎖相關信息
3:通過查詢information_schema數據庫下與事務相關的幾個系統表
還是構造之前的測試案例,在第一個會話中使用SELECT FOR UPDATE鎖定其中一行記錄

mysql> use MyDB;
Database changed
mysql> set session autocommit=0;
Query OK, 0 rows affected (0.00 sec)
mysql> select connection_id() from dual;
+-----------------+
| connection_id() |
+-----------------+
| 17 |
+-----------------+
1 row in set (0.00 sec)
mysql> select * from test_blocking where id=1 for update;
+----+-------+
| id | name |
+----+-------+
| 1 | kerry |
+----+-------+
1 row in set (0.00 sec)
mysql>
然後在第二個連接會話中執行更新腳本,構造被阻塞的案例

mysql> use MyDB;
Database changed
mysql> select connection_id() from dual;
+-----------------+
| connection_id() |
+-----------------+
| 19 |
+-----------------+
1 row in set (0.00 sec)
mysql> update test_blocking set name='kk' where id=1;
此時阻我們在第三個連接會話查找誰被阻塞了

SELECT b.trx_mysql_thread_id AS 'blocked_thread_id'
,b.trx_query AS 'blocked_sql_text'
,c.trx_mysql_thread_id AS 'blocker_thread_id'
,c.trx_query AS 'blocker_sql_text'
,( Unix_timestamp() - Unix_timestamp(c.trx_started) )
AS 'blocked_time'
FROM information_schema.innodb_lock_waits a
INNER JOIN information_schema.innodb_trx b
ON a.requesting_trx_id = b.trx_id
INNER JOIN information_schema.innodb_trx c
ON a.blocking_trx_id = c.trx_id
WHERE ( Unix_timestamp() - Unix_timestamp(c.trx_started) ) > 4;
SELECT a.sql_text,
c.id,
d.trx_started
FROM performance_schema.events_statements_current a
join performance_schema.threads b
ON a.thread_id = b.thread_id
join information_schema.processlist c
ON b.processlist_id = c.id
join information_schema.innodb_trx d
ON c.id = d.trx_mysql_thread_id
where c.id=17
ORDER BY d.trx_started\G;
如下截圖所示,第一個SQL語句能夠查到線程19 被線程 17阻塞了, 被阻塞的SQL語句爲“update test_blocking set name=’kk’ where id=1;”, 能夠查到被阻塞了多長時間,但是無法查到源頭SQL語句。此時就需要第二個SQL語句登場,找到源頭語句。

但是不要太天真的認爲第二個SQL語句能夠獲取所有場景下的阻塞源頭SQL語句,實際業務場景,會話可能在執行一個存儲過程或複雜的業務,有可能它執行完阻塞源頭SQL後,繼續在執行其它SQL語句,此時,你抓取的是這個連接會話最後執行的SQL語句,如下所示,我簡單構造了一個例子。就能構造這樣的一個場景。這個我曾經寫過一篇博客“爲什麼數據庫有時候不能定位阻塞(Blocker)源頭的SQL語句”,分析SQL Server和ORACLE 定位查找阻塞源頭SQL語句,現在看來真是大道同源,殊途同歸。

mysql> select * from test_blocking where id=1 for update;
+----+-------+
| id | name |
+----+-------+
| 1 | kerry |
+----+-------+
1 row in set (0.00 sec)
mysql> delete from student where stu_id=1001;
Query OK, 1 row affected (0.00 sec)
mysql>

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章