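[Original] How to check at the database level whether two tables have identical content

Generally speaking, the need to check whether two tables have identical content arises mostly on a replica (slave), to make sure its data is consistent. There are really only two approaches: start from the database, or start from the application side. Here I list some ways to solve this kind of problem at the database level.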

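The first step, of course, is to check that the row counts match; if they don't, there is no need to try anything else.

Here we use two tables, t1_old and t1_new, for the demonstration.

Table structure: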
 CREATE TABLE t1_old (
  id int(11) NOT NULL,
  log_time timestamp DEFAULT NULL
) ;
 CREATE TABLE t1_new (
  id int(11) NOT NULL,
  log_time timestamp DEFAULT NULL
) ;
Both tables contain 100 rows.
mysql> select count(*) from t1_old;
+----------+
| count(*) |
+----------+
|      100 |
+----------+
1 row in set (0.31 sec)
mysql> select count(*) from t1_new;
+----------+
| count(*) |
+----------+
|      100 |
+----------+
1 row in set (0.00 sec)
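
As a rough illustration of this first step, the sketch below compares row counts before any deeper check. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for MySQL (an assumption, not part of the original setup); the helper name row_counts_match and the sample timestamp are made up for the example.

```python
import sqlite3

def row_counts_match(conn, table_a, table_b):
    """Return True when both tables hold the same number of rows."""
    # Table names cannot be bound as parameters, so only pass trusted names.
    count_a = conn.execute(f"SELECT COUNT(*) FROM {table_a}").fetchone()[0]
    count_b = conn.execute(f"SELECT COUNT(*) FROM {table_b}").fetchone()[0]
    return count_a == count_b

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1_old (id INTEGER NOT NULL, log_time TEXT)")
conn.execute("CREATE TABLE t1_new (id INTEGER NOT NULL, log_time TEXT)")

# 100 illustrative rows; the timestamp value is invented sample data.
rows = [(i, "2000-01-01 00:00:00") for i in range(100)]
conn.executemany("INSERT INTO t1_old VALUES (?, ?)", rows)
conn.executemany("INSERT INTO t1_new VALUES (?, ?)", rows)

counts_equal = row_counts_match(conn, "t1_old", "t1_new")
print(counts_equal)
```

If this returns False, the tables already differ and the more expensive comparisons can be skipped.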

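Method 1: add, then deduplicate.

Because UNION deduplicates the rows it combines from the two queries, this check is very simple: if the tables are identical, the deduplicated union has as many rows as either table.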
mysql> select count(*) from (select * from t1_old union select * from t1_new) as T;
+----------+
| count(*) |
+----------+
|      100 |
+----------+
1 row in set (0.06 sec)
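The row count here is 100, which tentatively suggests that the two tables have the same content. However, this method has a flaw: in some cases a matching count does not mean the result sets are identical.
For example: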
mysql> create table t1_old1 (id int);
Query OK, 0 rows affected (0.27 sec)
mysql> create table t1_new1(id int);
Query OK, 0 rows affected (0.09 sec)
mysql> insert into t1_old1 values (1),(2),(3),(5);
Query OK, 4 rows affected (0.15 sec)
Records: 4  Duplicates: 0  Warnings: 0
mysql> insert into t1_new1 values (2),(2),(3),(5);    
Query OK, 4 rows affected (0.02 sec)
Records: 4  Duplicates: 0  Warnings: 0
mysql> select * from t1_old1;
+------+
| id   |
+------+
|    1 |
|    2 |
|    3 |
|    5 |
+------+
4 rows in set (0.00 sec)
mysql> select * from t1_new1;
+------+
| id   |
+------+
|    2 |
|    2 |
|    3 |
|    5 |
+------+
4 rows in set (0.00 sec)
mysql> select count(*) from (select * from t1_old1 union select * from t1_new1) as T;
+----------+
| count(*) |
+----------+
|        4 |
+----------+
1 row in set (0.00 sec)
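Both tables hold 4 rows and the union also returns 4 rows, yet t1_old1 contains a 1 where t1_new1 contains a second 2. So in this respect the method is effectively useless.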
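
The failure case can be reproduced in a few lines. This is only a sketch: it runs the same UNION query against an in-memory SQLite database through Python's sqlite3 module, used here as an assumed stand-in for MySQL, and then compares the actual rows to show that the tables really do differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute("CREATE TABLE t1_old1 (id INTEGER)")
conn.execute("CREATE TABLE t1_new1 (id INTEGER)")
conn.executemany("INSERT INTO t1_old1 VALUES (?)", [(1,), (2,), (3,), (5,)])
conn.executemany("INSERT INTO t1_new1 VALUES (?)", [(2,), (2,), (3,), (5,)])

# UNION removes duplicates, so the duplicated 2 in t1_new1 collapses
# and the missing 1 is supplied by t1_old1.
union_rows = conn.execute(
    "SELECT COUNT(*) FROM "
    "(SELECT * FROM t1_old1 UNION SELECT * FROM t1_new1) AS t"
).fetchone()[0]

# Compare the sorted contents directly to see the real answer.
old_ids = [r[0] for r in conn.execute("SELECT id FROM t1_old1 ORDER BY id")]
new_ids = [r[0] for r in conn.execute("SELECT id FROM t1_new1 ORDER BY id")]
tables_equal = old_ids == new_ids

print(union_rows, tables_equal)  # prints: 4 False
```

The union count equals the row count of each table even though the contents differ, which is exactly why the count alone cannot be trusted.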

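Method 2: subtract down to zero.

MySQL had no set-difference operator when this was written (EXCEPT only arrived in MySQL 8.0.31), so we switch to PostgreSQL for this check.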
t_girl=# select count(*) from (select * from t1_old except select * from t1_new) as T;
 count 
-------
     0
(1 row)
Time: 1.809 ms
The result is 0, which shows that the two tables have the same content. We can now run the same check on the other case mentioned under Method 1:
t_girl=# select count(*) from (select * from t1_old1 except select * from t1_new1) as T;
 count 
-------
     1
(1 row)
Time: 9.837 ms

OK, the result here is not zero, so we can directly conclude that the two tables are inconsistent.
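
One detail worth noting is that EXCEPT is directional. The sketch below, which again uses Python's sqlite3 module as an assumed stand-in for PostgreSQL, runs the subtraction both ways on the same sample data; running it in both directions is my own suggestion rather than something the queries above do, and except_count is a hypothetical helper name.

```python
import sqlite3

def except_count(conn, left, right):
    """Count distinct rows present in `left` but absent from `right`."""
    # Table names are interpolated, so only pass trusted names.
    sql = (
        f"SELECT COUNT(*) FROM "
        f"(SELECT * FROM {left} EXCEPT SELECT * FROM {right}) AS t"
    )
    return conn.execute(sql).fetchone()[0]

conn = sqlite3.connect(":memory:")  # SQLite stand-in (assumption)
conn.execute("CREATE TABLE t1_old1 (id INTEGER)")
conn.execute("CREATE TABLE t1_new1 (id INTEGER)")
conn.executemany("INSERT INTO t1_old1 VALUES (?)", [(1,), (2,), (3,), (5,)])
conn.executemany("INSERT INTO t1_new1 VALUES (?)", [(2,), (2,), (3,), (5,)])

forward = except_count(conn, "t1_old1", "t1_new1")   # row 1 is only in t1_old1
backward = except_count(conn, "t1_new1", "t1_old1")  # nothing is only in t1_new1

print(forward, backward)  # prints: 1 0
```

Here only the old-minus-new direction exposes the difference, so checking a single direction could miss a mismatch if the tables were swapped.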

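Method 3: a full join across both tables. This is also the worst approach, at least when the tables hold a very large number of rows.

I'll demonstrate this one with PostgreSQL as well.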
t_girl=# select count(*) from t1_old as a full outer join t1_new as b using (id,log_time) where a.id is null or b.id is null; 
 count 
-------
     0
(1 row)
Time: 5.002 ms
The result is 0, which shows that the content is identical.
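
MySQL has no FULL OUTER JOIN, so on that side the same idea is usually approximated with two LEFT JOIN anti-joins combined by UNION ALL. The following is a minimal sketch of that substitute technique, not the query used above; it runs on SQLite through Python's sqlite3 module as an assumed stand-in, and the helper name unmatched_rows and the timestamps are invented for illustration.

```python
import sqlite3

def unmatched_rows(conn):
    """Count rows that exist in only one of t1_old / t1_new."""
    # Rows with a NULL log_time never match under '=', so this sketch
    # assumes log_time is always populated.
    sql = """
        SELECT COUNT(*) FROM (
            SELECT a.id FROM t1_old AS a
            LEFT JOIN t1_new AS b
              ON a.id = b.id AND a.log_time = b.log_time
            WHERE b.id IS NULL
            UNION ALL
            SELECT b.id FROM t1_new AS b
            LEFT JOIN t1_old AS a
              ON a.id = b.id AND a.log_time = b.log_time
            WHERE a.id IS NULL
        ) AS t
    """
    return conn.execute(sql).fetchone()[0]

conn = sqlite3.connect(":memory:")  # SQLite stand-in (assumption)
conn.execute("CREATE TABLE t1_old (id INTEGER NOT NULL, log_time TEXT)")
conn.execute("CREATE TABLE t1_new (id INTEGER NOT NULL, log_time TEXT)")
rows = [(i, "2000-01-01 00:00:00") for i in range(1, 4)]  # invented data
conn.executemany("INSERT INTO t1_old VALUES (?, ?)", rows)
conn.executemany("INSERT INTO t1_new VALUES (?, ?)", rows)

before = unmatched_rows(conn)  # identical tables
conn.execute("INSERT INTO t1_new VALUES (4, '2000-01-01 00:00:00')")
after = unmatched_rows(conn)   # t1_new now has one extra row

print(before, after)  # prints: 0 1
```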

Method 4: checksum verification.

For example, in MySQL, if two tables have the same checksum, their content is the same as well.
mysql> checksum table t1_old;
+---------------+----------+
| Table         | Checksum |
+---------------+----------+
| t_girl.t1_old | 60614552 |
+---------------+----------+
1 row in set (0.00 sec)
mysql> checksum table t1_new;
+---------------+----------+
| Table         | Checksum |
+---------------+----------+
| t_girl.t1_new | 60614552 |
+---------------+----------+
1 row in set (0.00 sec)
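But this method only works when the two tables have exactly the same structure. For example, if I change the type of a column in t1_old, its checksum changes as well.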
mysql> alter table t1_old modify id bigint;
Query OK, 100 rows affected (0.23 sec)
Records: 100  Duplicates: 0  Warnings: 0
mysql> checksum table t1_old;
+---------------+------------+
| Table         | Checksum   |
+---------------+------------+
| t_girl.t1_old | 3211623989 |
+---------------+------------+
1 row in set (0.00 sec)
mysql> checksum table t1_new;
+---------------+----------+
| Table         | Checksum |
+---------------+----------+
| t_girl.t1_new | 60614552 |
+---------------+----------+
1 row in set (0.00 sec)
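
The same sensitivity to column types shows up in a hand-rolled checksum. The sketch below is only an analogy and does not reproduce how MySQL computes CHECKSUM TABLE: it hashes the rows of each table with Python's hashlib on an in-memory SQLite database (both assumptions), and table_checksum and the table names t_a, t_b, t_c are hypothetical.

```python
import hashlib
import sqlite3

def table_checksum(conn, table):
    """Hash all rows of `table`, independent of physical row order."""
    rows = conn.execute(f"SELECT * FROM {table}").fetchall()
    digest = hashlib.sha256()
    # Sort by the textual form of each row so insertion order is irrelevant.
    for row in sorted(rows, key=repr):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()

conn = sqlite3.connect(":memory:")  # SQLite stand-in (assumption)
conn.execute("CREATE TABLE t_a (id INTEGER)")
conn.execute("CREATE TABLE t_b (id INTEGER)")
conn.execute("CREATE TABLE t_c (id TEXT)")  # same values, different column type
for table in ("t_a", "t_b", "t_c"):
    conn.executemany(f"INSERT INTO {table} VALUES (?)", [(1,), (2,), (3,)])

checksum_a = table_checksum(conn, "t_a")
checksum_b = table_checksum(conn, "t_b")
checksum_c = table_checksum(conn, "t_c")

# Same structure and data give the same checksum; the TEXT column stores
# the values as strings, so its checksum differs.
print(checksum_a == checksum_b, checksum_a == checksum_c)  # prints: True False
```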

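So, judging from the database-provided methods above, subtracting down to zero is comparatively reliable, while the other methods are better suited to checks in specific situations.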
