一文搞懂ANR

1.ANR的定義

ANR(Application Not Responding):應用無響應
即主線程在特定的時間內沒有完成特定的事情,就會產生ANR。
在Android當中有以下幾種ANR的類型:

  1. KeyDispatchTimeout,input事件在5秒內沒有處理完;
  2. ServiceTimeout,前臺service在20秒內,後臺service在200秒內沒有處理完;
  3. BroadcastTimeout,BroadcastReceiver的onReceiver,前臺廣播在10秒內,後臺廣播在60秒內沒有處理完;
  4. ProcessContentproviderPublishTimeoutLocked,ContentProvider publish在10秒內沒有處理完;

2.各種場景產生ANR的原因

我們分別針對這幾種場景來看看,系統是如何拋出ANR異常的

2.1 ServiceTimeout

首先我們來看下一個Service啓動的流程,如下圖所示
在這裏插入圖片描述
我們來看下ActiveServices.realStartServiceLocked的源碼,

    private final void realStartServiceLocked(ServiceRecord r,
            ProcessRecord app, boolean execInFg) throws RemoteException {
        if (app.thread == null) {
            throw new RemoteException();
        }
        if (DEBUG_MU)
            Slog.v(TAG_MU, "realStartServiceLocked, ServiceRecord.uid = " + r.appInfo.uid
                    + ", ProcessRecord.uid = " + app.uid);
        r.app = app;
        r.restartTime = r.lastActivity = SystemClock.uptimeMillis();

        final boolean newService = app.services.add(r);
        //1.啓動ANR監測
        bumpServiceExecutingLocked(r, execInFg, "create");
        mAm.updateLruProcessLocked(app, false, null);
        updateServiceForegroundLocked(r.app, /* oomAdj= */ false);
        mAm.updateOomAdjLocked();

        boolean created = false;
        try {
            if (LOG_SERVICE_START_STOP) {
                String nameTerm;
                int lastPeriod = r.shortName.lastIndexOf('.');
                nameTerm = lastPeriod >= 0 ? r.shortName.substring(lastPeriod) : r.shortName;
                EventLogTags.writeAmCreateService(
                        r.userId, System.identityHashCode(r), nameTerm, r.app.uid, r.app.pid);
            }
            synchronized (r.stats.getBatteryStats()) {
                r.stats.startLaunchedLocked();
            }
            mAm.notifyPackageUse(r.serviceInfo.packageName,
                                 PackageManager.NOTIFY_PACKAGE_USE_SERVICE);
            app.forceProcessStateUpTo(ActivityManager.PROCESS_STATE_SERVICE);
            //2.啓動service
            app.thread.scheduleCreateService(r, r.serviceInfo,
                    mAm.compatibilityInfoForPackageLocked(r.serviceInfo.applicationInfo),
                    app.repProcState);
            r.postNotification();
            created = true;
        } catch (DeadObjectException e) {
            Slog.w(TAG, "Application dead when creating service " + r);
            mAm.appDiedLocked(app);
            throw e;
        } finally {
            if (!created) {
                // Keep the executeNesting count accurate.
                final boolean inDestroying = mDestroyingServices.contains(r);
                serviceDoneExecutingLocked(r, inDestroying, inDestroying);

                // Cleanup.
                if (newService) {
                    app.services.remove(r);
                    r.app = null;
                }

                // Retry.
                if (!inDestroying) {
                    scheduleServiceRestartLocked(r, false);
                }
            }
        }
        ......
    }

在上面的代碼註釋2處,啓動了service,而在啓動service之前,執行了函數bumpServiceExecutingLocked,這個函數會調用scheduleServiceTimeoutLocked,啓動ANR的監測機制,具體如下所示

// 前臺service timeout的時間
static final int SERVICE_TIMEOUT = 20*1000;

// 後臺service Timeout的時間
static final int SERVICE_BACKGROUND_TIMEOUT = SERVICE_TIMEOUT * 10;
    
void scheduleServiceTimeoutLocked(ProcessRecord proc) {
    if (proc.executingServices.size() == 0 || proc.thread == null) {
        return;
    }
    Message msg = mAm.mHandler.obtainMessage(
            ActivityManagerService.SERVICE_TIMEOUT_MSG);
    msg.obj = proc;
    mAm.mHandler.sendMessageDelayed(msg,
            proc.execServicesFg ? SERVICE_TIMEOUT : SERVICE_BACKGROUND_TIMEOUT);
}

可以很容易的看出,scheduleServiceTimeoutLocked其實就是往handler發送了一個延遲msg,如果是後臺service,delay的時間就是200秒,如果是前臺service,delay的時間就是20秒。
而當服務如果在限制的時間內就執行完成了,那麼執行ActivityManagerService.serviceDoneExecuting移除這個msg。

public void serviceDoneExecuting(IBinder token, int type, int startId, int res) {
    synchronized(this) {
        if (!(token instanceof ServiceRecord)) {
            Slog.e(TAG, "serviceDoneExecuting: Invalid service token=" + token);
            throw new IllegalArgumentException("Invalid service token");
        }
        mServices.serviceDoneExecutingLocked((ServiceRecord)token, type, startId, res);
    }
}

2.2 BroadcastTimeout

BroadcastTimeout的ANR原理和Service的差不多,我們來看下BroadcastQueue.processNextBroadcastLocked的源碼,

final void processNextBroadcastLocked(boolean fromMsg, boolean skipOomAdj) {
    ......
    
    do {
        if (mOrderedBroadcasts.size() == 0) {
            // No more broadcasts pending, so all done!
            mService.scheduleAppGcsLocked();
            if (looped) {
                // If we had finished the last ordered broadcast, then
                // make sure all processes have correct oom and sched
                // adjustments.
                mService.updateOomAdjLocked();
            }
            return;
        }
        r = mOrderedBroadcasts.get(0);
        boolean forceReceive = false;

        // Ensure that even if something goes awry with the timeout
        // detection, we catch "hung" broadcasts here, discard them,
        // and continue to make progress.
        //
        // This is only done if the system is ready so that PRE_BOOT_COMPLETED
        // receivers don't get executed with timeouts. They're intended for
        // one time heavy lifting after system upgrades and can take
        // significant amounts of time.
        int numReceivers = (r.receivers != null) ? r.receivers.size() : 0;
        if (mService.mProcessesReady && r.dispatchTime > 0) {
            long now = SystemClock.uptimeMillis();
            if ((numReceivers > 0) &&
                    (now > r.dispatchTime + (2*mTimeoutPeriod*numReceivers))) {
                Slog.w(TAG, "Hung broadcast ["
                        + mQueueName + "] discarded after timeout failure:"
                        + " now=" + now
                        + " dispatchTime=" + r.dispatchTime
                        + " startTime=" + r.receiverTime
                        + " intent=" + r.intent
                        + " numReceivers=" + numReceivers
                        + " nextReceiver=" + r.nextReceiver
                        + " state=" + r.state);
                //1.啓動ANR監聽機制
                broadcastTimeoutLocked(false); // forcibly finish this broadcast
                forceReceive = true;
                r.state = BroadcastRecord.IDLE;
            }
        }

        if (r.state != BroadcastRecord.IDLE) {
            if (DEBUG_BROADCAST) Slog.d(TAG_BROADCAST,
                    "processNextBroadcast("
                    + mQueueName + ") called when not idle (state="
                    + r.state + ")");
            return;
        }

        if (r.receivers == null || r.nextReceiver >= numReceivers
                || r.resultAbort || forceReceive) {
            // No more receivers for this broadcast!  Send the final
            // result if requested...
            if (r.resultTo != null) {
                try {
                    if (DEBUG_BROADCAST) Slog.i(TAG_BROADCAST,
                            "Finishing broadcast [" + mQueueName + "] "
                            + r.intent.getAction() + " app=" + r.callerApp);
                    //2.處理廣播消息
                    performReceiveLocked(r.callerApp, r.resultTo,
                        new Intent(r.intent), r.resultCode,
                        r.resultData, r.resultExtras, false, false, r.userId);
                    // Set this to null so that the reference
                    // (local and remote) isn't kept in the mBroadcastHistory.
                    r.resultTo = null;
                } catch (RemoteException e) {
                    r.resultTo = null;
                    Slog.w(TAG, "Failure ["
                            + mQueueName + "] sending broadcast result of "
                            + r.intent, e);

                }
            }

            if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST, "Cancelling BROADCAST_TIMEOUT_MSG");
            cancelBroadcastTimeoutLocked();

            if (DEBUG_BROADCAST_LIGHT) Slog.v(TAG_BROADCAST,
                    "Finished with ordered broadcast " + r);

            // ... and on to the next...
            addBroadcastToHistoryLocked(r);
            if (r.intent.getComponent() == null && r.intent.getPackage() == null
                    && (r.intent.getFlags()&Intent.FLAG_RECEIVER_REGISTERED_ONLY) == 0) {
                // This was an implicit broadcast... let's record it for posterity.
                mService.addBroadcastStatLocked(r.intent.getAction(), r.callerPackage,
                        r.manifestCount, r.manifestSkipCount, r.finishTime-r.dispatchTime);
            }
            mOrderedBroadcasts.remove(0);
            r = null;
            looped = true;
            continue;
        }
    } while (r == null);

    ......
}

我們可以看到在上面的代碼註釋1處啓動了ANR的監聽機制,在註釋2處處理廣播信息。在註釋1處,broadcastTimeoutLocked最終會調用broadcastTimeoutLocked,如下面的代碼所示,而timeout的時間就等於接收到廣播的時間+mTimeoutPeriod。

final void broadcastTimeoutLocked(boolean fromMsg) {
    ......
    if (fromMsg) {
        if (!mService.mProcessesReady) {
            // Only process broadcast timeouts if the system is ready. That way
            // PRE_BOOT_COMPLETED broadcasts can't timeout as they are intended
            // to do heavy lifting for system up.
            return;
        }

        long timeoutTime = r.receiverTime + mTimeoutPeriod;
        if (timeoutTime > now) {
            if (DEBUG_BROADCAST) Slog.v(TAG_BROADCAST,
                    "Premature timeout ["
                    + mQueueName + "] @ " + now + ": resetting BROADCAST_TIMEOUT_MSG for "
                    + timeoutTime);
            //發送延遲消息
            setBroadcastTimeoutLocked(timeoutTime);
            return;
        }
    }
    ......
}

final void setBroadcastTimeoutLocked(long timeoutTime) {
    if (! mPendingBroadcastTimeoutMessage) {
        Message msg = mHandler.obtainMessage(BROADCAST_TIMEOUT_MSG, this);
        mHandler.sendMessageAtTime(msg, timeoutTime);
        mPendingBroadcastTimeoutMessage = true;
    }
}

mTimePeriod是在初始化的時候傳入的,如下所示。可以很清楚的看到前臺BroadcastReceiver的Timeout的時間是10秒,後臺BroadcastReceiver的時間是60秒

//BroadcastQueue代碼
BroadcastQueue(ActivityManagerService service, Handler handler,
        String name, long timeoutPeriod, boolean allowDelayBehindServices) {
    mService = service;
    mHandler = new BroadcastHandler(handler.getLooper());
    mQueueName = name;
    mTimeoutPeriod = timeoutPeriod;
    mDelayBehindServices = allowDelayBehindServices;
}

//ActivityManagerService代碼
static final int BROADCAST_FG_TIMEOUT = 10*1000;
static final int BROADCAST_BG_TIMEOUT = 60*1000;

mFgBroadcastQueue = new BroadcastQueue(this, mHandler,
        "foreground", BROADCAST_FG_TIMEOUT, false);
mBgBroadcastQueue = new BroadcastQueue(this, mHandler,
        "background", BROADCAST_BG_TIMEOUT, true);

2.3 ProcessContentproviderPublishTimeoutLocked

在APP進程啓動的時,ActivityThread的main函數會執行attach,而attach函數會調用ActivityManagerService.attachApplication,進而執行attachApplicationLocked函數,我們可以看到在註釋1處發送了延遲msg,啓動了anr監聽機制,而CONTENT_PROVIDER_PUBLISH_TIMEOUT的時間是10秒。


static final int CONTENT_PROVIDER_PUBLISH_TIMEOUT = 10*1000;

private final boolean attachApplicationLocked(IApplicationThread thread,
        int pid, int callingUid, long startSeq) {

    ......

    mHandler.removeMessages(PROC_START_TIMEOUT_MSG, app);

    boolean normalMode = mProcessesReady || isAllowedWhileBooting(app.info);
    List<ProviderInfo> providers = normalMode ? generateApplicationProvidersLocked(app) : null;

    if (providers != null && checkAppInLaunchingProvidersLocked(app)) {
        Message msg = mHandler.obtainMessage(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG);
        msg.obj = app;
        //1.啓動anr監聽機制
        mHandler.sendMessageDelayed(msg, CONTENT_PROVIDER_PUBLISH_TIMEOUT);
    }

    checkTime(startTime, "attachApplicationLocked: before bindApplication");

    ......
}

而當ContentProvider成功發佈之後,ActivityManagerService會remove掉這個msg,如下面代碼註釋1處所示

public final void publishContentProviders(IApplicationThread caller,
            List<ContentProviderHolder> providers) {
        if (providers == null) {
            return;
        }

        enforceNotIsolatedCaller("publishContentProviders");
        synchronized (this) {
            ......
            for (int i = 0; i < N; i++) {
                ContentProviderHolder src = providers.get(i);
                if (src == null || src.info == null || src.provider == null) {
                    continue;
                }
                ContentProviderRecord dst = r.pubProviders.get(src.info.name);
                if (DEBUG_MU) Slog.v(TAG_MU, "ContentProviderRecord uid = " + dst.uid);
                if (dst != null) {
                    ComponentName comp = new ComponentName(dst.info.packageName, dst.info.name);
                    mProviderMap.putProviderByClass(comp, dst);
                    String names[] = dst.info.authority.split(";");
                    for (int j = 0; j < names.length; j++) {
                        mProviderMap.putProviderByName(names[j], dst);
                    }

                    int launchingCount = mLaunchingProviders.size();
                    int j;
                    boolean wasInLaunchingProviders = false;
                    for (j = 0; j < launchingCount; j++) {
                        if (mLaunchingProviders.get(j) == dst) {
                            mLaunchingProviders.remove(j);
                            wasInLaunchingProviders = true;
                            j--;
                            launchingCount--;
                        }
                    }
                    if (wasInLaunchingProviders) {
                        //1.移除Timeout的msg
                        mHandler.removeMessages(CONTENT_PROVIDER_PUBLISH_TIMEOUT_MSG, r);
                    }
                    synchronized (dst) {
                        dst.provider = src.provider;
                        dst.proc = r;
                        dst.notifyAll();
                    }
                    updateOomAdjLocked(r, true);
                    maybeUpdateProviderUsageStatsLocked(r, src.info.packageName,
                            src.info.authority);
                }
            }

            Binder.restoreCallingIdentity(origId);
        }
    }

2.4 KeyDispatchTimeout

InpuntManagerService會起兩個線程InputReader線程和InputDispatcherThread,InputDispatcherThread線程會由reader線程wake,起來後就threadloop不斷循環讀取input事件。
具體流程如下
在這裏插入圖片描述
圖片來源於https://www.jianshu.com/p/fd376366e031?utm_campaign=maleskine&utm_content=note&utm_medium=seo_notes&utm_source=recommendation
這裏我們主要看一下handleTargetsNotReadyLocked函數,
如下注釋1處,當inputdispatcher收到一個事件之後,會將mInputTargetWaitTimeoutTime賦值爲當前時間+timeout,然後當下一個input事件來臨之時,用此時的當前時間和mInputTargetWaitTimeoutTime進行比對(註釋2處),如果當前時間大於mInputTargetWaitTimeoutTime則觸發ANR。

//默認5秒超時
constexpr nsecs_t DEFAULT_INPUT_DISPATCHING_TIMEOUT = 5000 * 1000000LL; // 5 sec

int32_t InputDispatcher::handleTargetsNotReadyLocked(
        nsecs_t currentTime, const EventEntry* entry,
        const sp<InputApplicationHandle>& applicationHandle,
        const sp<InputWindowHandle>& windowHandle, nsecs_t* nextWakeupTime, const char* reason) {
    if (applicationHandle == nullptr && windowHandle == nullptr) {
        if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_SYSTEM_NOT_READY) {
#if DEBUG_FOCUS
            ALOGD("Waiting for system to become ready for input.  Reason: %s", reason);
#endif
            mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_SYSTEM_NOT_READY;
            mInputTargetWaitStartTime = currentTime;
            mInputTargetWaitTimeoutTime = LONG_LONG_MAX;
            mInputTargetWaitTimeoutExpired = false;
            mInputTargetWaitApplicationToken.clear();
        }
    } else {
        if (mInputTargetWaitCause != INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY) {
#if DEBUG_FOCUS
            ALOGD("Waiting for application to become ready for input: %s.  Reason: %s",
                  getApplicationWindowLabel(applicationHandle, windowHandle).c_str(), reason);
#endif
            nsecs_t timeout;
            if (windowHandle != nullptr) {
                timeout = windowHandle->getDispatchingTimeout(DEFAULT_INPUT_DISPATCHING_TIMEOUT);
            } else if (applicationHandle != nullptr) {
                timeout =
                        applicationHandle->getDispatchingTimeout(DEFAULT_INPUT_DISPATCHING_TIMEOUT);
            } else {
                timeout = DEFAULT_INPUT_DISPATCHING_TIMEOUT;
            }

            mInputTargetWaitCause = INPUT_TARGET_WAIT_CAUSE_APPLICATION_NOT_READY;
            mInputTargetWaitStartTime = currentTime;
            //1.當前時間+timeout
            mInputTargetWaitTimeoutTime = currentTime + timeout;
            mInputTargetWaitTimeoutExpired = false;
            mInputTargetWaitApplicationToken.clear();

            if (windowHandle != nullptr) {
                mInputTargetWaitApplicationToken = windowHandle->getApplicationToken();
            }
            if (mInputTargetWaitApplicationToken == nullptr && applicationHandle != nullptr) {
                mInputTargetWaitApplicationToken = applicationHandle->getApplicationToken();
            }
        }
    }

    if (mInputTargetWaitTimeoutExpired) {
        return INPUT_EVENT_INJECTION_TIMED_OUT;
    }
    //2.觸發anr
    if (currentTime >= mInputTargetWaitTimeoutTime) {
        onANRLocked(currentTime, applicationHandle, windowHandle, entry->eventTime,
                    mInputTargetWaitStartTime, reason);

        // Force poll loop to wake up immediately on next iteration once we get the
        // ANR response back from the policy.
        *nextWakeupTime = LONG_LONG_MIN;
        return INPUT_EVENT_INJECTION_PENDING;
    } else {
        // Force poll loop to wake up when timeout is due.
        if (mInputTargetWaitTimeoutTime < *nextWakeupTime) {
            *nextWakeupTime = mInputTargetWaitTimeoutTime;
        }
        return INPUT_EVENT_INJECTION_PENDING;
    }
}

3.總結

對比這幾種不同場景的anr,其實原理都是一樣,就是在執行之前先埋一顆定時炸彈,如果能夠在規定的時間之內執行完,那麼就可以及時的拆除炸彈,否則就會引爆炸彈,產生anr。

參考:https://www.jianshu.com/p/fd376366e031?utm_campaign=maleskine&utm_content=note&utm_medium=seo_notes&utm_source=recommendation

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章