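The previous article covered how the normal flow works in LCN mode. This one looks at how the framework rolls back when an exception occurs. The call chain is again A > B > C.
We know that the doBusinessCode of an upstream module executes all of the logic of the module it calls. So the analysis proceeds recursively, from the back (the last module) to the front (the module that called it).
All of module C's code runs inside module B's doBusinessCode method, and all of module B's code runs inside module A's doBusinessCode method.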
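To make the nesting concrete, here is a minimal sketch of three modules whose business code wraps the next one. All names (NestedChainSketch, runModule, runChain) are made up for illustration and are not part of the framework; the sketch only shows the order in which a failure in C is observed on the way back up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this only models the nesting described above.
class NestedChainSketch {
    static final List<String> events = new ArrayList<>();

    interface Business {
        Object call() throws Throwable;
    }

    // Each module wraps its own business code; the downstream module runs inside it.
    static Object runModule(String name, Business business) throws Throwable {
        try {
            Object result = business.call();
            events.add(name + ":success");
            return result;
        } catch (Throwable t) {
            events.add(name + ":error"); // this module observes the failure
            throw t;                     // and the failure continues upstream
        }
    }

    static List<String> runChain(boolean failInC) {
        events.clear();
        try {
            runModule("A", () ->
                runModule("B", () ->
                    runModule("C", () -> {
                        if (failInC) {
                            throw new IllegalStateException("C: local SQL failed");
                        }
                        return "ok";
                    })));
        } catch (Throwable ignored) {
            // the caller of A would see the original exception here
        }
        return new ArrayList<>(events);
    }
}
```

With failInC set to true the recorded order is C, then B, then A, which is why the analysis below starts from module C.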
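Module C
Module C's business code is shown below (module B runs the same code in the same handler class).
1. This method throws exceptions of type Throwable.
2. It catches two kinds of exception, TransactionException and Throwable, and rethrows both.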
public Object transactionRunning(TxTransactionInfo info) throws Throwable {
    // 1. Get the transaction type
    String transactionType = info.getTransactionType();
    // 2. Resolve the transaction propagation state
    DTXPropagationState propagationState = propagationResolver.resolvePropagationState(info);
    // 2.1 If this unit does not take part in the distributed transaction, just run the business code
    if (propagationState.isIgnored()) {
        return info.getBusinessCallback().call();
    }
    // 3. Load the local distributed transaction controller
    DTXLocalControl dtxLocalControl = txLcnBeanHelper.loadDTXLocalControl(transactionType, propagationState);
    // 4. Weave in the transaction operations
    try {
        // 4.1 Record the transaction type in the transaction context
        Set<String> transactionTypeSet = globalContext.txContext(info.getGroupId()).getTransactionTypes();
        transactionTypeSet.add(transactionType);
        dtxLocalControl.preBusinessCode(info);
        // 4.2 Before the business code runs
        txLogger.txTrace(
                info.getGroupId(), info.getUnitId(), "pre business code, unit type: {}", transactionType);
        // 4.3 Run the business code
        Object result = dtxLocalControl.doBusinessCode(info);
        // 4.4 Business code succeeded
        txLogger.txTrace(info.getGroupId(), info.getUnitId(), "business success");
        dtxLocalControl.onBusinessCodeSuccess(info, result);
        return result;
    } catch (TransactionException e) {
        txLogger.error(info.getGroupId(), info.getUnitId(), "before business code error");
        throw e;
    } catch (Throwable e) {
        // 4.5 Business code failed
        txLogger.error(info.getGroupId(), info.getUnitId(), Transactions.TAG_TRANSACTION,
                "business code error");
        dtxLocalControl.onBusinessCodeError(info, e);
        throw e;
    } finally {
        // 4.6 Business code finished
        dtxLocalControl.postBusinessCode(info);
    }
}
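The two catch branches behave differently, which is easy to miss when reading the method top to bottom. The sketch below copies only the try/catch/finally shape and records which hooks fire. The class CatchOrderSketch, its Step interface and the nested TransactionException are stand-ins invented for this example, not the framework's types.

```java
import java.util.ArrayList;
import java.util.List;

class CatchOrderSketch {
    // Stand-in for the framework's TransactionException (hypothetical, checked).
    static class TransactionException extends Exception {
        TransactionException(String msg) {
            super(msg);
        }
    }

    interface Step {
        void run() throws Throwable;
    }

    // Mirrors the try/catch/finally shape of transactionRunning and records the hooks that fire.
    static List<String> run(Step preBusiness, Step business) {
        List<String> hooks = new ArrayList<>();
        try {
            try {
                preBusiness.run();
                business.run();
                hooks.add("onBusinessCodeSuccess");
            } catch (TransactionException e) {
                hooks.add("rethrow TransactionException"); // onBusinessCodeError is NOT called
                throw e;
            } catch (Throwable e) {
                hooks.add("onBusinessCodeError");          // local cleanup happens here
                throw e;
            } finally {
                hooks.add("postBusinessCode");             // runs on every path
            }
        } catch (Throwable outer) {
            hooks.add("caller sees " + outer.getClass().getSimpleName());
        }
        return hooks;
    }
}
```

A TransactionException skips onBusinessCodeError, any other Throwable goes through it, and postBusinessCode runs in both cases as well as on success.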
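Because module C is the last module, it calls no other service; its doBusinessCode only performs local database operations. doBusinessCode can throw a Throwable, so if C's local database operation fails, the error is caught and the following code runs: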
catch (Throwable e) {
    // 4.5 Business code failed
    txLogger.error(info.getGroupId(), info.getUnitId(), Transactions.TAG_TRANSACTION,
            "business code error");
    dtxLocalControl.onBusinessCodeError(info, e);
    throw e;
}

public void onBusinessCodeError(TxTransactionInfo info, Throwable throwable) {
    try {
        // Clean up the transaction, i.e. roll back the local database connection (state 0)
        transactionCleanTemplate.clean(info.getGroupId(), info.getUnitId(), info.getTransactionType(), 0);
    } catch (TransactionClearException e) {
        log.error("{} > clean transaction error.", Transactions.LCN);
    }
}
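If the local database operation succeeds, module C goes on to call joinGroup and joins the transaction group. (The asynchronous check is also part of the failure handling; it is covered later.)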
public void joinGroup(String groupId, String unitId, String transactionType, TransactionInfo transactionInfo)
        throws TransactionException {
    try {
        txLogger.txTrace(groupId, unitId, "join group > transaction type: {}", transactionType);
        reliableMessenger.joinGroup(groupId, unitId, transactionType, DTXLocalContext.transactionState(globalContext.dtxState(groupId)));
        txLogger.txTrace(groupId, unitId, "join group message over.");
        // Asynchronous check
        dtxChecking.startDelayCheckingAsync(groupId, unitId, transactionType);
        // Cache the participant's aspect information
        aspectLogger.trace(groupId, unitId, transactionInfo);
    } catch (RpcException e) {
        dtxExceptionHandler.handleJoinGroupMessageException(Arrays.asList(groupId, unitId, transactionType), e);
    } catch (LcnBusinessException e) {
        dtxExceptionHandler.handleJoinGroupBusinessException(Arrays.asList(groupId, unitId, transactionType), e);
    }
    txLogger.txTrace(groupId, unitId, "join group logic over");
}

public void joinGroup(String groupId, String unitId, String unitType, int transactionState) throws RpcException, LcnBusinessException {
    JoinGroupParams joinGroupParams = new JoinGroupParams();
    joinGroupParams.setGroupId(groupId);
    joinGroupParams.setUnitId(unitId);
    joinGroupParams.setUnitType(unitType);
    joinGroupParams.setTransactionState(transactionState);
    MessageDto messageDto = request(MessageCreator.joinGroup(joinGroupParams));
    // Joining the transaction group failed: throw an exception
    if (!MessageUtils.statusOk(messageDto)) {
        throw new LcnBusinessException(messageDto.loadBean(Throwable.class));
    }
}
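Two exceptions are caught here. The first is RpcException, meaning the connection to the server failed. The second is LcnBusinessException, which is thrown when joining the transaction group fails.
For RpcException the framework simply rethrows it, wrapped in a TransactionException: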
public void handleJoinGroupMessageException(Object params, Throwable ex) throws TransactionException {
    throw new TransactionException(ex);
}
For LcnBusinessException it first cleans up the local transaction and rolls back the connection, then throws:
public void handleJoinGroupBusinessException(Object params, Throwable ex) throws TransactionException {
    List paramList = (List) params;
    String groupId = (String) paramList.get(0);
    String unitId = (String) paramList.get(1);
    String unitType = (String) paramList.get(2);
    try {
        transactionCleanTemplate.clean(groupId, unitId, unitType, 0);
    } catch (TransactionClearException e) {
        txLogger.error(groupId, unitId, "join group", "clean [{}]transaction fail.", unitType);
    }
    throw new TransactionException(ex);
}
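To summarize module C:
1. A failed local database operation or a failed join of the transaction group rolls back the local database connection.
2. If connecting to or communicating with the server fails while joining the group, the exception is thrown directly (this is very unlikely unless every server is unavailable).
3. Whatever goes wrong in module C, the exception is always thrown as a Throwable and propagates to module B.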
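The difference between points 1 and 2 can be sketched as follows. The class JoinGroupFailureSketch and its nested exception types are stand-ins written for this example (the real ones live in the framework), and the actions list simply records what the handler would do.

```java
import java.util.ArrayList;
import java.util.List;

class JoinGroupFailureSketch {
    // Hypothetical stand-ins for the framework's exception types.
    static class RpcException extends Exception {}
    static class LcnBusinessException extends Exception {}
    static class TransactionException extends Exception {
        TransactionException(Throwable cause) {
            super(cause);
        }
    }

    interface JoinCall {
        void join() throws RpcException, LcnBusinessException;
    }

    final List<String> actions = new ArrayList<>();

    void joinGroup(JoinCall call) throws TransactionException {
        try {
            call.join();
            actions.add("joined");
        } catch (RpcException e) {
            // Server unreachable: no cleanup in this handler, the failure is only wrapped and rethrown.
            throw new TransactionException(e);
        } catch (LcnBusinessException e) {
            // Server refused the join: roll back the local connection (state 0), then rethrow.
            actions.add("clean(state=0)");
            throw new TransactionException(e);
        }
    }
}
```

In both failure cases the caller gets a TransactionException; only the business failure triggers the local rollback inside the handler.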
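Module B
Module B's code is identical to module C's. The only difference is that B's doBusinessCode contains the whole of module C's flow plus B's own local operations.
As noted above, any error in module C, including a failed local database operation, is caught by B's catch (Throwable) block. B handles it the same way C does: it cleans up the local transaction and rolls back the connection.
Like C, B also starts the asynchronous check, and it handles RpcException and LcnBusinessException in the same way.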
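Module A
Module A first creates the transaction group. Because the business code only runs after that, a failure during group creation just throws an exception, and after catching it A does nothing else.
All of module A's exception handling lives in the postBusinessCode method.
Module A creates the transaction group; if that fails, it throws TransactionException(e):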
@Override
public DTXContext create(String groupId) throws TransactionException {
    try {
        fastStorage.initGroup(groupId);
    } catch (FastStorageException e) {
        // idempotent processing
        if (e.getCode() != FastStorageException.EX_CODE_REPEAT_GROUP) {
            throw new TransactionException(e);
        }
    }
    return get(groupId);
}

@Override
public void postBusinessCode(TxTransactionInfo info) {
    // RPC close DTX group
    transactionControlTemplate.notifyGroup(
            info.getGroupId(), info.getUnitId(), info.getTransactionType(),
            DTXLocalContext.transactionState(globalContext.dtxState(info.getGroupId())));
}

public void notifyGroup(String groupId, String unitId, String transactionType, int state) {
    try {
        txLogger.txTrace(
                groupId, unitId, "notify group > transaction type: {}, state: {}.", transactionType, state);
        if (globalContext.isDTXTimeout()) {
            throw new LcnBusinessException("dtx timeout.");
        }
        state = reliableMessenger.notifyGroup(groupId, state);
        transactionCleanTemplate.clean(groupId, unitId, transactionType, state);
    } catch (TransactionClearException e) {
        txLogger.trace(groupId, unitId, Transactions.TE, "clean transaction fail.");
    } catch (RpcException e) {
        dtxExceptionHandler.handleNotifyGroupMessageException(Arrays.asList(groupId, state, unitId, transactionType), e);
    } catch (LcnBusinessException e) {
        // Failed to close the transaction group
        dtxExceptionHandler.handleNotifyGroupBusinessException(Arrays.asList(groupId, state, unitId, transactionType), e.getCause());
    }
    txLogger.txTrace(groupId, unitId, "notify group exception state {}.", state);
}
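Let's go through the cases one by one.
1. If modules A, B and C all execute correctly, the state parameter of notifyGroup is 1. If the call that asks the server to notify the participants fails, or the network is down (a request exception), reliableMessenger.notifyGroup throws RpcException and the catch logic runs: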
catch (RpcException e) {
    dtxExceptionHandler.handleNotifyGroupMessageException(Arrays.asList(groupId, state, unitId, transactionType), e);
}

public void handleNotifyGroupMessageException(Object params, Throwable ex) {
    // state == 0
    List paramList = (List) params;
    String groupId = (String) paramList.get(0);
    int state = (int) paramList.get(1);
    if (state == 0) {
        handleNotifyGroupBusinessException(params, ex);
        return;
    }
    // state == 1
    String unitId = (String) paramList.get(2);
    String transactionType = (String) paramList.get(3);
    try {
        // Clean up the local transaction
        transactionCleanTemplate.cleanWithoutAspectLog(groupId, unitId, transactionType, state);
    } catch (TransactionClearException e) {
        txLogger.error(groupId, unitId, "notify group", "{} > cleanWithoutAspectLog transaction error.", transactionType);
    }
    // Report to the Manager, retrying until it succeeds.
    tmReporter.reportTransactionState(groupId, null, TxExceptionParams.NOTIFY_GROUP_ERROR, state);
}

private MessageDto request(MessageDto messageDto, long timeout, String whenNonManagerMessage) throws RpcException {
    for (int i = 0; i < rpcClient.loadAllRemoteKey().size() + 1; i++) {
        try {
            String remoteKey = rpcClient.loadRemoteKey();
            MessageDto result = rpcClient.request(remoteKey, messageDto, timeout);
            log.debug("request action: {}. TM[{}]", messageDto.getAction(), remoteKey);
            return result;
        } catch (RpcException e) {
            if (e.getCode() == RpcException.NON_TX_MANAGER) {
                throw new RpcException(e.getCode(), whenNonManagerMessage + ". non tx-manager is alive.");
            }
        }
    }
    throw new RpcException(RpcException.NON_TX_MANAGER, whenNonManagerMessage + ". non tx-manager is alive.");
}
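A first commits its local transaction (state 1) and then contacts the server to record the transaction state. You might ask: if the server could not be reached a moment ago, how can this call succeed? In practice the server is deployed on several nodes. A distributed transaction picks one of them to drive the transaction, and if that node is not working, another one is chosen. The request method above tries the servers this client is connected to, one after another.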
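The failover idea can be reduced to a few lines. This is a simplified sketch, not the framework's request method: FailoverRequestSketch, Manager and the nested RpcException are invented for the example, and the managers are tried in list order rather than through rpcClient.loadRemoteKey().

```java
import java.util.List;

class FailoverRequestSketch {
    // Hypothetical stand-in for the framework's RpcException.
    static class RpcException extends Exception {
        RpcException(String msg) {
            super(msg);
        }
    }

    interface Manager {
        String handle(String message) throws RpcException;
    }

    // Try each known manager in turn; the first one that answers wins.
    static String request(List<Manager> managers, String message) throws RpcException {
        RpcException last = null;
        for (Manager manager : managers) {
            try {
                return manager.handle(message);
            } catch (RpcException e) {
                last = e; // remember the failure and move on to the next manager
            }
        }
        throw new RpcException("no tx-manager is alive"
                + (last == null ? "" : ": " + last.getMessage()));
    }
}
```

The call only fails when every manager fails, which matches the claim that a single unavailable server does not block the state report.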
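When the server receives the state-1 message, it inserts a row into the t_tx_exception table with state 1, meaning the transaction should be committed. But module A has already committed its local transaction while B and C have not. How is that resolved?
Remember the asynchronous check mentioned earlier?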
// Asynchronous check
dtxChecking.startDelayCheckingAsync(groupId, unitId, transactionType);

public void startDelayCheckingAsync(String groupId, String unitId, String transactionType) {
    txLogger.taskTrace(groupId, unitId, "start delay checking task");
    ScheduledFuture scheduledFuture = scheduledExecutorService.schedule(() -> {
        try {
            TxContext txContext = globalContext.txContext(groupId);
            if (Objects.nonNull(txContext)) {
                synchronized (txContext.getLock()) {
                    txLogger.taskTrace(groupId, unitId, "checking waiting for business code finish.");
                    txContext.getLock().wait();
                }
            }
            int state = reliableMessenger.askTransactionState(groupId, unitId);
            txLogger.taskTrace(groupId, unitId, "ask transaction state {}", state);
            if (state == -1) {
                txLogger.error(this.getClass().getSimpleName(), "delay clean transaction error.");
                onAskTransactionStateException(groupId, unitId, transactionType);
            } else {
                transactionCleanTemplate.clean(groupId, unitId, transactionType, state);
                aspectLogger.clearLog(groupId, unitId);
            }
        } catch (RpcException e) {
            onAskTransactionStateException(groupId, unitId, transactionType);
        } catch (TransactionClearException | InterruptedException e) {
            txLogger.error(this.getClass().getSimpleName(), "{} clean transaction error.", transactionType);
        }
    }, clientConfig.getDtxTime(), TimeUnit.MILLISECONDS);
    delayTasks.put(groupId + unitId, scheduledFuture);
}
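This scheduled task calls the server to query the state recorded in t_tx_exception, then commits or rolls back according to that state (here it commits). Note that in the code above it is scheduled once, with a delay of clientConfig.getDtxTime() milliseconds, rather than at a fixed period. All of this assumes MySQL itself is always available.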
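Stripped of logging and locking, the delayed check looks roughly like the sketch below. DelayCheckSketch and checkOnce are hypothetical names, askState stands in for reliableMessenger.askTransactionState, and the returned string replaces the real commit or rollback of the local connection. The state values 1, 0 and -1 follow the usage in the article.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

class DelayCheckSketch {
    // Schedules a single delayed check: ask for the final state, then act on it.
    static String checkOnce(IntSupplier askState, long delayMillis) throws Exception {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        try {
            ScheduledFuture<String> future = scheduler.schedule(() -> {
                int state = askState.getAsInt();
                if (state == -1) {
                    return "unknown"; // the real code falls back to its exception handler here
                }
                return state == 1 ? "commit" : "rollback";
            }, delayMillis, TimeUnit.MILLISECONDS);
            // Block until the delayed task has run (bounded, so a stuck task cannot hang the caller).
            return future.get(5, TimeUnit.SECONDS);
        } finally {
            scheduler.shutdownNow();
        }
    }
}
```

This is how B and C can still commit after A has committed: each of them asks for the recorded state on its own once the delay elapses.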
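If a business exception (LcnBusinessException) occurs instead, it means the server failed to notify clients B and C to commit. The server likewise writes state 1 (commit) to t_tx_exception, and client A then commits as well.
2. If module C fails, C and B have already rolled back. In this case, whatever the exception is, module A only needs to roll back.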
// Rollback on a request exception
public void handleNotifyGroupMessageException(Object params, Throwable ex) {
    // state == 0
    List paramList = (List) params;
    String groupId = (String) paramList.get(0);
    int state = (int) paramList.get(1);
    if (state == 0) {
        handleNotifyGroupBusinessException(params, ex);
        return;
    }
    // ... state == 1 branch as shown earlier
}

public void handleNotifyGroupBusinessException(Object params, Throwable ex) {
    List paramList = (List) params;
    String groupId = (String) paramList.get(0);
    int state = (int) paramList.get(1);
    String unitId = (String) paramList.get(2);
    String transactionType = (String) paramList.get(3);
    // Rollback forced by the user.
    if (ex instanceof UserRollbackException) {
        state = 0;
    }
    if ((ex.getCause() != null && ex.getCause() instanceof UserRollbackException)) {
        state = 0;
    }
    // Finish the transaction
    try {
        transactionCleanTemplate.clean(groupId, unitId, transactionType, state);
    } catch (TransactionClearException e) {
        txLogger.error(groupId, unitId, "notify group", "{} > clean transaction error.", transactionType);
    }
}
Rollback on a transaction (business) exception goes through the same handleNotifyGroupBusinessException method shown above: notifyGroup passes the LcnBusinessException's cause straight to it.
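3. If module B or C fails, they can only be rolled back by notifying B and C. If that notification fails, it simply fails; client A cannot handle it on its own.
Note: because the server side is highly available, the failures described above basically never occur.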
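As a recap of how module A reacts when closing the group fails, the two handlers can be condensed into a small decision sketch. NotifyFailureSketch and its methods are hypothetical; the strings name the framework calls that the real handlers make, and the null check on ex is an addition of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class NotifyFailureSketch {
    // Hypothetical stand-in for the framework's UserRollbackException.
    static class UserRollbackException extends RuntimeException {}

    // A user-forced rollback overrides whatever state was passed in.
    static int resolveState(int state, Throwable ex) {
        if (ex != null && (ex instanceof UserRollbackException
                || ex.getCause() instanceof UserRollbackException)) {
            return 0;
        }
        return state;
    }

    // Business failure: finish the local transaction with the resolved state.
    static List<String> onBusinessFailure(int state, Throwable ex) {
        List<String> actions = new ArrayList<>();
        actions.add("clean(state=" + resolveState(state, ex) + ")");
        return actions;
    }

    // Request failure: state 0 is handled like a business failure,
    // state 1 commits locally and reports the state to the manager.
    static List<String> onRpcFailure(int state, Throwable ex) {
        if (state == 0) {
            return onBusinessFailure(state, ex);
        }
        List<String> actions = new ArrayList<>();
        actions.add("cleanWithoutAspectLog(state=1)");
        actions.add("report NOTIFY_GROUP_ERROR");
        return actions;
    }
}
```

Only the request-failure path with state 1 leaves work for the server and the delayed check; every other path ends with a local cleanup.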