1.ANR的定義
ANR(Application Not Responding):應用無響應
即主線程在特定的時間內沒有完成特定的事情,就會產生ANR。
在Android當中有以下幾種ANR的類型:
- KeyDispatchTimeout,input事件在5秒內沒有處理完;
- ServiceTimeout,前臺service在20秒內,後臺service在200秒內沒有處理完;
- BroadcastTimeout,BroadcastReceiver的onReceiver,前臺廣播在10秒內,後臺廣播在60秒內沒有處理完;
- ProcessContentproviderPublishTimeoutLocked,ContentProvider publish在10秒內沒有處理完;
2.各種場景產生ANR的原因
我們分別針對這幾種場景來看看,系統是如何拋出ANR異常的
2.1 ServiceTimeout
首先我們來看下一個Service啓動的流程,如下圖所示
我們來看下ActiveServices.realStartServiceLocked的源碼,
private final void realStartServiceLocked(ServiceRecord r,
ProcessRecord app, boolean execInFg) throws RemoteException {
if (app.thread == null) {
throw new RemoteException();
}
if (DEBUG_MU)
Slog.v(TAG_MU, "realStartServiceLocked, ServiceRecord.uid = " + r.appInfo.uid
+ ", ProcessRecord.uid = " + app.uid);
r.app = app;
r.restartTime = r.lastActivity = SystemClock.uptimeMillis();
final boolean newService = app.services.add(r);
//1.啓動ANR監測
bumpServiceExecutingLocked(r, execInFg, "create");
mAm.updateLruProcessLocked(app, false, null);
updateServiceForegroundLocked(r.app, /* oomAdj= */ false);
mAm.updateOomAdjLocked();
boolean created = false;
try {
if (LOG_SERVICE_START_STOP) {
String nameTerm;
int lastPeriod = r.shortName.lastIndexOf('.');
nameTerm = lastPeriod >= 0 ? r.shortName.substring(lastPeriod) : r.shortName;
EventLogTags.writeAmCreateService(
r.userId, System.identityHashCode(r), nameTerm, r.app.uid, r.app.pid);
}
synchronized (r.stats.getBatteryStats()) {
r.stats.startLaunchedLocked();
}
mAm.notifyPackageUse(r.serviceInfo.packageName,
PackageManager.NOTIFY_PACKAGE_USE_SERVICE);
app.forceProcessStateUpTo(ActivityManager.PROCESS_STATE_SERVICE);
//2.啓動service
app.thread.scheduleCreateService(r, r.serviceInfo,
mAm.compatibilityInfoForPackageLocked(r.serviceInfo.applicationInfo),
app.repProcState);
r.postNotification();
created = true;
} catch (DeadObjectException e) {
Slog.w(TAG, "Application dead when creating service " + r);
mAm.appDiedLocked(app);
throw e;
} finally {
if (!created) {
// Keep the executeNesting count accurate.
final boolean inDestroying = mDestroyingServices.contains(r);
serviceDoneExecutingLocked(r, inDestroying, inDestroying);
// Cleanup.
if (newService) {
app.services.remove(r);
r.app = null;
}
// Retry.
if (!inDestroying) {
scheduleServiceRestartLocked(r, false);
}
}
}
......
}
在上面的代碼註釋2處,啓動了service,而在啓動service之前,執行了函數bumpServiceExecutingLocked,這個函數會調用scheduleServiceTimeoutLocked,啓動ANR的監測機制,具體如下所示
// 前臺service timeout的時間
static final int SERVICE_TIMEOUT = 20*1000;
// 後臺service Timeout的時間
static final int SERVICE_BACKGROUND_TIMEOUT = SERVICE_TIMEOUT * 10;
void scheduleServiceTimeoutLocked(ProcessRecord proc) {
if (proc.executingServices.size() == 0 || proc.thread == null) {
return;
}
Message msg = mAm.mHandler.obtainMessage(
ActivityManagerService.SERVICE_TIMEOUT_MSG);
msg.obj = proc;
mAm.mHandler.sendMessageDelayed(msg,
proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
}
可以很容易的看出,scheduleServiceTimeoutLocked其實就是往handler發送了一個延遲msg,如果是後臺service,delay的時間就是200秒,如果是前臺service,delay的時間就是20秒。
而當服務如果在限制的時間內就執行完成了,那麼執行ActivityManagerService.serviceDoneExecuting移除這個msg。
public void serviceDoneExecuting(IBinder token, int type, int startId, int res) {
synchronized(this) {
if (!(token instanceof ServiceRecord)) {
Slog.e(TAG, "serviceDoneExecuting: Invalid service token=" + token);
throw new IllegalArgumentException("Invalid service token");
}
mServices.serviceDoneExecutingLocked((ServiceRecord)token, type, startId, res);
}
}
2.2 BroadcastTimeout
BroadcastTimeout的ANR原理和Service的差不多,我們來看下BroadcastQueue.processNextBroadcastLocked的源碼,
final void processNextBroadcastLocked(boolean fromMsg, boolean skipOomAdj) {
......
do {
if (mOrderedBroadcasts.size() == 0) {
// No more broadcasts pending, so all done!
mService.scheduleAppGcsLocked();
if (looped) {
// If we had finished the last ordered broadcast, then
// make sure all processes have correct oom and sched
// adjustments.
mService.updateOomAdjLocked();
}
return;
}
r = mOrderedBroadcasts.get(0);
boolean forceReceive = false;
// Ensure that even if something goes awry with the timeout
// detection, we catch "hung" broadcasts here, discard them,
// and continue to make progress.
//
// This is only done if the system is ready so that PRE_BOOT_COMPLETED
// receivers don't get executed with timeouts. They're intended for
// one time heavy lifting after system upgrades and can take
// significant amounts of time.
int numReceivers = (r.receivers != null) ? r.receivers.size() : 0;
if (mService.mProcessesReady && r.dispatchTime > 0) {
long now = SystemClock.uptimeMillis();
if ((numReceivers > 0) &&
(now > r.dispatchTime + (2*mTimeoutPeriod*numReceivers))) {
Slog.w(TAG, "Hung broadcast ["
+ mQueueName + "] discarded after timeout failure:"
+ " now=" + now
+ " dispatchTime=" + r.dispatchTime
+ " startTime=" + r.receiverTime
+ " intent=" + r.intent
+ " numReceivers=" + numReceivers
+ " nextReceiver=" + r.nextReceiver
+ " state=" + r.state);
//1.啓動ANR監聽機制
broadcastTimeoutLocked(false); // forcibly finish this broadcast
forceReceive = true;
r.state = BroadcastRecord.IDLE;
}
}
if (r.state != BroadcastRecord.IDLE) {
if (DEBUG_BROADCAST) Slog.d(TAG_BROADCAST,
"processNextBroadcast("
+ mQueueName + ") called when not idle (state="
+ r.state + ")");
return;
}
if (r.receivers == null || r.nextReceiver >= numReceivers
|| r.resultAbort || forceReceive) {
// No more receivers for this broadcast! Send the final
// result if requested...
if (r.resultTo != null) {
try {
if (DEBUG_BROADCAST) Slog.i(TAG_BROADCAST,
"Finishing broadcast [" + mQueueName + "] "
+ r.intent.getAction() + " app=" + r.callerApp);
//2.處理廣播消息
performReceiveLocked(r.callerApp, r.resultTo,
new Intent(r.intent), r.resultCode,
r.resultData, r.resultExtras, false, false, r.userId);
// Set this to null so that the reference
// (local and remote) isn't kept in the mBroadcastHistory.
r.resultTo = null;
} catch (RemoteException e) {
r.resultTo = null;
Slog.w(TAG, "Failure ["
+ mQueueName + "] sending broadcast result of "
+ r.intent, e);
}
}
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST, "Cancelling BROADCAST_TIMEOUT_MSG");
cancelBroadcastTimeoutLocked();
if (DEBUG_BROADCAST_LIGHT) Slog.v(TAG_BROADCAST,
"Finished with ordered broadcast " + r);
// ... and on to the next...
addBroadcastToHistoryLocked(r);
if (r.intent.getComponent() == null && r.intent.getPackage() == null
&& (r.intent.getFlags()&Intent.FLAG_RECEIVER_REGISTERED_ONLY) == 0) {
// This was an implicit broadcast... let's record it for posterity.
mService.addBroadcastStatLocked(r.intent.getAction(), r.callerPackage,
r.manifestCount, r.manifestSkipCount, r.finishTime-r.dispatchTime);
}
mOrderedBroadcasts.remove(0);
r = null;
looped = true;
continue;
}
} while (r == null);
......
}
我們可以看到在上面的代碼註釋1處啓動了ANR的監聽機制,在註釋2處處理廣播信息。在註釋1處,broadcastTimeoutLocked最終會調用broadcastTimeoutLocked,如下面的代碼所示,而timeout的時間就等於接收到廣播的時間+mTimeoutPeriod。
final void broadcastTimeoutLocked(boolean fromMsg) {
......
if (fromMsg) {
if (!mService.mProcessesReady) {
// Only process broadcast timeouts if the system is ready. That way
// PRE_BOOT_COMPLETED broadcasts can't timeout as they are intended
// to do heavy lifting for system up.
return;
}
long timeoutTime = r.receiverTime + mTimeoutPeriod;
if (timeoutTime > now) {
if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST,
"Premature timeout ["
+ mQueueName + "] @ " + now + ": resetting BROADCAST_TIMEOUT_MSG for "
+ timeoutTime);
//發送延遲消息
setBroadcastTimeoutLocked(timeoutTime);
return;
}
}
......
}
final void setBroadcastTimeoutLocked(long timeoutTime) {
if (! mPendingBroadcastTimeoutMessage) {
Message msg = mHandler.obtainMessage(BROADCAST_TIMEOUT_MSG, this);
mHandler.sendMessageAtTime(msg, timeoutTime);
mPendingBroadcastTimeoutMessage = true;
}
}
mTimePeriod是在初始化的時候傳入的,如下所示。可以很清楚的看到前臺BroadcastReceiver的Timeout的時間是10秒,後臺BroadcastReceiver的時間是60秒
//BroadcastQueue代碼
BroadcastQueue(ActivityManagerService service, Handler handler,
String name, long timeoutPeriod, boolean allowDelayBehindServices) {
mService = service;
mHandler = new BroadcastHandler(handler.getLooper());
mQueueName = name;
mTimeoutPeriod = timeoutPeriod;
mDelayBehindServices = allowDelayBehindServices;
}
//ActivityManagerService代碼
static final int BROADCAST_FG_TIMEOUT = 10*1000;
static final int BROADCAST_BG_TIMEOUT = 60*1000;
mFgBroadcastQueue = new BroadcastQueue(this, mHandler,
"foreground", BROADCAST_FG_TIMEOUT, false);
mBgBroadcastQueue = new BroadcastQueue(this, mHandler,
"background", BROADCAST_BG_TIMEOUT, true);
2.3 ProcessContentproviderPublishTimeoutLocked
在APP進程啓動的時,ActivityThread的main函數會執行attach,而attach函數會調用ActivityManagerService.attachApplication,進而執行attachApplicationLocked函數,我們可以看到在註釋1處發送了延遲msg,啓動了anr監聽機制,而CONTENT_PROVIDER_PUBLISH_TIMEOUT的時間是10秒。
static final int CONTENT_PROVIDER_PUBLISH_TIMEOUT = 10*1000;
private final boolean attachApplicationLocked(IApplicationThread thread,
int pid, int callingUid, long startSeq) {
......
mHandler.removeMessages(PROC_START_TIMEOUT_MSG, app);
boolean normalMode = mProcessesReady || isAllowedWhileBooting(app.info);
List<ProviderInfo> providers = normalMode ? generateApplicationProvidersLocked(app) : null;
if (providers != null && checkAppInLaunchingProvidersLocked(app)) {
Message msg = mHandler.obtainMessage(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG);
msg.obj = app;
//1.啓動anr監聽機制
mHandler.sendMessageDelayed(msg, CONTENT_PROVIDER_PUBLISH_TIMEOUT);
}
checkTime(startTime, "attachApplicationLocked: before bindApplication");
......
}
而當ContentProvider成功發佈之後,ActivityManagerService會remove掉這個msg,如下面代碼註釋1處所示
public final void publishContentProviders(IApplicationThread caller,
List<ContentProviderHolder> providers) {
if (providers == null) {
return;
}
enforceNotIsolatedCaller("publishContentProviders");
synchronized (this) {
......
for (int i = 0; i < N; i++) {
ContentProviderHolder src = providers.get(i);
if (src == null || src.info == null || src.provider == null) {
continue;
}
ContentProviderRecord dst = r.pubProviders.get(src.info.name);
if (DEBUG_MU) Slog.v(TAG_MU, "ContentProviderRecord uid = " + dst.uid);
if (dst != null) {
ComponentName comp = new ComponentName(dst.info.packageName, dst.info.name);
mProviderMap.putProviderByClass(comp, dst);
String names[] = dst.info.authority.split(";");
for (int j = 0; j < names.length; j++) {
mProviderMap.putProviderByName(names[j], dst);
}
int launchingCount = mLaunchingProviders.size();
int j;
boolean wasInLaunchingProviders = false;
for (j = 0; j < launchingCount; j++) {
if (mLaunchingProviders.get(j) == dst) {
mLaunchingProviders.remove(j);
wasInLaunchingProviders = true;
j--;
launchingCount--;
}
}
if (wasInLaunchingProviders) {
//1.移除Timeout的msg
mHandler.removeMessages(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG, r);
}
synchronized (dst) {
dst.provider = src.provider;
dst.proc = r;
dst.notifyAll();
}
updateOomAdjLocked(r, true);
maybeUpdateProviderUsageStatsLocked(r, src.info.packageName,
src.info.authority);
}
}
Binder.restoreCallingIdentity(origId);
}
}
2.4 KeyDispatchTimeout
InpuntManagerService會起兩個線程InputReader線程和InputDispatcherThread,InputDispatcherThread線程會由reader線程wake,起來後就threadloop不斷循環讀取input事件。
具體流程如下
圖片來源於https://www.jianshu.com/p/fd376366e031?utm_campaign=maleskine&utm_content=note&utm_medium=seo_notes&utm_source=recommendation
這裏我們主要看一下handleTargetsNotReadyLocked函數,
如下注釋1處,當inputdispatcher收到一個事件之後,會將mInputTargetWaitTimeoutTime賦值爲當前時間+timeout,然後當下一個input事件來臨之時,用此時的當前時間和mInputTargetWaitTimeoutTime進行比對(註釋2處),如果當前時間大於mInputTargetWaitTimeoutTime則觸發ANR。
//默認5秒超時
constexpr nsecs_t DEFAULT_INPUT_DISPATCHING_TIMEOUT = 5000 * 1000000LL; // 5 sec
int32_t InputDispatcher::handleTargetsNotReadyLocked(
nsecs_t currentTime, const EventEntry* entry,
const sp<InputApplicationHandle>& applicationHandle,
const sp<InputWindowHandle>& windowHandle, nsecs_t* nextWakeupTime, const char* reason) {
if (applicationHandle == nullptr && windowHandle == nullptr) {
if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_SYSTEM_NOT_READY) {
#if DEBUG_FOCUS
ALOGD("Waiting for system to become ready for input. Reason: %s", reason);
#endif
mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_SYSTEM_NOT_READY;
mInputTargetWaitStartTime = currentTime;
mInputTargetWaitTimeoutTime = LONG_LONG_MAX;
mInputTargetWaitTimeoutExpired = false;
mInputTargetWaitApplicationToken.clear();
}
} else {
if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY) {
#if DEBUG_FOCUS
ALOGD("Waiting for application to become ready for input: %s. Reason: %s",
getApplicationWindowLabel(applicationHandle, windowHandle).c_str(), reason);
#endif
nsecs_t timeout;
if (windowHandle != nullptr) {
timeout = windowHandle->getDispatchingTimeout(DEFAULT_INPUT_DISPATCHING_TIMEOUT);
} else if (applicationHandle != nullptr) {
timeout =
applicationHandle->getDispatchingTimeout(DEFAULT_INPUT_DISPATCHING_TIMEOUT);
} else {
timeout = DEFAULT_INPUT_DISPATCHING_TIMEOUT;
}
mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY;
mInputTargetWaitStartTime = currentTime;
//1.當前時間+timeout
mInputTargetWaitTimeoutTime = currentTime + timeout;
mInputTargetWaitTimeoutExpired = false;
mInputTargetWaitApplicationToken.clear();
if (windowHandle != nullptr) {
mInputTargetWaitApplicationToken = windowHandle->getApplicationToken();
}
if (mInputTargetWaitApplicationToken == nullptr && applicationHandle != nullptr) {
mInputTargetWaitApplicationToken = applicationHandle->getApplicationToken();
}
}
}
if (mInputTargetWaitTimeoutExpired) {
return INPUT_EVENT_INJECTION_TIMED_OUT;
}
//2.觸發anr
if (currentTime >= mInputTargetWaitTimeoutTime) {
onANRLocked(currentTime, applicationHandle, windowHandle, entry->eventTime,
mInputTargetWaitStartTime, reason);
// Force poll loop to wake up immediately on next iteration once we get the
// ANR response back from the policy.
*nextWakeupTime = LONG_LONG_MIN;
return INPUT_EVENT_INJECTION_PENDING;
} else {
// Force poll loop to wake up when timeout is due.
if (mInputTargetWaitTimeoutTime < *nextWakeupTime) {
*nextWakeupTime = mInputTargetWaitTimeoutTime;
}
return INPUT_EVENT_INJECTION_PENDING;
}
}
3.總結
對比這幾種不同場景的anr,其實原理都是一樣,就是在執行之前先埋一顆定時炸彈,如果能夠在規定的時間之內執行完,那麼就可以及時的拆除炸彈,否則就會引爆炸彈,產生anr。
參考:https://www.jianshu.com/p/fd376366e031?utm_campaign=maleskine&utm_content=note&utm_medium=seo_notes&utm_source=recommendation