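LMKD Overview (3): New Features in Android Q (MTK Edition)

Original title: LMKD Overview (3): New Features in Android Q

Reason for the title change: I only got access to the QCOM baseline much later, so I had assumed all along that this overview applied to every Android Q platform...

Once the QCOM baseline arrived, the differences turned out to be huge, so I renamed this post to LMKD Overview (3): New Features in Android Q (MTK Edition).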

 

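Main text:

Android Q introduces a new mode based on PSI (Pressure Stall Information). lmkd registers monitors for PSI events and, when one fires, checks the watermark, free memory, free swap, and thrashing to get a fuller picture of current system pressure and kill processes in a targeted way.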

 

This mode depends on:

1. The kernel being built with CONFIG_PSI=y;

2. The property ro.lmk.use_psi not being false;

3. The property ro.config.low_ram being true, or ro.lmk.use_minfree_levels being false.

 

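This mode can be tuned with the following properties:

1. ro.lmk.swap_free_low_percentage

This property already existed in Android P and is used as a kill condition. It is an integer percentage: when free swap drops below this share of total swap, lmkd starts killing. The default is 10 when ro.config.low_ram=true, otherwise 20.

2. ro.lmk.psi_partial_stall_ms

This property is specific to PSI mode and only takes effect when PSI mode has been enabled successfully. It sets the reporting condition as an integer number of milliseconds: when the PSI monitor sees the "some" stall time exceed this many milliseconds within a one-second window, pressure level 1 (VMPRESS_LEVEL_MEDIUM) is reported.

The default is 200 when ro.config.low_ram=true, otherwise 70.

3. ro.lmk.psi_complete_stall_ms

This property is specific to PSI mode and only takes effect when PSI mode has been enabled successfully. It sets the reporting condition as an integer number of milliseconds: when the PSI monitor sees the "full" stall time exceed this many milliseconds within a one-second window, pressure level 2 (VMPRESS_LEVEL_CRITICAL) is reported.

The default is 700.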
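
To make the two stall thresholds concrete: according to the kernel PSI documentation, a monitor is registered by writing a string of the form "<some|full> <stall threshold in us> <window in us>" to /proc/pressure/memory and then polling that file descriptor. The sketch below only builds that string from a millisecond threshold and a one-second window. The helper name psi_trigger_string and the PSI_WINDOW_MS macro are hypothetical and are not taken from lmkd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define PSI_WINDOW_MS 1000 /* assumed one-second window, matching the text above */

/*
 * Hypothetical helper (not lmkd's code): build the trigger string that
 * would be written to /proc/pressure/memory.
 * stall_type is "some" or "full"; threshold_ms is the stall threshold.
 * Returns the string length, or -1 if the buffer is too small.
 */
static int psi_trigger_string(char *buf, size_t len, const char *stall_type,
                              int threshold_ms) {
    assert(threshold_ms > 0 && threshold_ms <= PSI_WINDOW_MS);
    /* The kernel interface expects both values in microseconds */
    int n = snprintf(buf, len, "%s %d %d", stall_type,
                     threshold_ms * 1000, PSI_WINDOW_MS * 1000);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

With the defaults listed above, the "some" monitor on a non-low-RAM device would be registered as "some 70000 1000000" and the "full" monitor as "full 700000 1000000".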

 

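The PSI "some" and "full" states are not covered in detail here; see the official Linux kernel documentation: https://www.kernel.org/doc/html/latest/accounting/psi.html

lmkd only uses the memory part of PSI. I will cover it in a separate post when time permits.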

 

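LMKD workflow:

1. After lmkd starts at boot, it runs its state checks, loads the properties, and evaluates them;

2. It registers the PSI monitors;

3. When PSI reports a pressure level to lmkd, lmkd goes through the following checks to decide whether a kill is needed:

    a. If the previous kill has not finished, if reading the current time fails, or if parsing vmstat/meminfo fails, no kill is performed;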

    /* Skip while still killing a process */
    if (is_kill_pending()) {
        /* TODO: replace this quick polling with pidfd polling if kernel supports */
        goto no_kill;
    }

    if (clock_gettime(CLOCK_MONOTONIC_COARSE, &curr_tm) != 0) {
        ALOGE("Failed to get current time");
        return;
    }

    if (vmstat_parse(&vs) < 0) {
        ALOGE("Failed to parse vmstat!");
        return;
    }

    if (meminfo_parse(&mi) < 0) {
        ALOGE("Failed to parse meminfo!");
        return;
    }

    b. If free swap is below the percentage defined by ro.lmk.swap_free_low_percentage, set swap_is_low = true;

    /* Check free swap levels */
    if (swap_free_low_percentage) {
        if (!swap_low_threshold) {
            swap_low_threshold = mi.field.total_swap * swap_free_low_percentage / 100;
        }
        if (mi.field.free_swap < swap_low_threshold) {
            swap_is_low = true;
        }
    }
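
As a worked example of step b, here is a minimal sketch of the same comparison as a standalone function. The name swap_below_threshold is hypothetical; the arithmetic follows the quoted snippet, with all sizes in one unit (pages in the quoted code).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the check in step b.
 * total_swap and free_swap must use the same unit (e.g. pages).
 * low_pct == 0 disables the check, as in the quoted code.
 */
static bool swap_below_threshold(int64_t total_swap, int64_t free_swap,
                                 int low_pct) {
    assert(low_pct >= 0 && low_pct <= 100);
    if (low_pct == 0) {
        return false;
    }
    /* Integer division: the threshold is rounded down */
    int64_t threshold = total_swap * low_pct / 100;
    return free_swap < threshold;
}
```

For example, with 2 GiB of swap and 4 KiB pages (524288 pages) and a percentage of 10, the threshold is 52428 pages (about 205 MiB), so swap counts as low once fewer than 52428 pages are free.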

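    c. Compare the pgscan_direct/pgscan_kswapd counters with their previous values to determine whether memory is being reclaimed by direct reclaim (DIRECT_RECLAIM) or by the kswapd background thread (KSWAPD_RECLAIM). If neither counter has grown (NO_RECLAIM), memory pressure is low and no kill is performed. Otherwise, on the first event of a reclaim cycle the baseline values are recorded, and on later events the thrashing value is computed as the percentage of the file-backed page cache that has refaulted since that baseline;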

    /* Identify reclaim state */
    if (vs.field.pgscan_direct > init_pgscan_direct) {
        init_pgscan_direct = vs.field.pgscan_direct;
        init_pgscan_kswapd = vs.field.pgscan_kswapd;
        reclaim = DIRECT_RECLAIM;
    } else if (vs.field.pgscan_kswapd > init_pgscan_kswapd) {
        init_pgscan_kswapd = vs.field.pgscan_kswapd;
        reclaim = KSWAPD_RECLAIM;
    }

    /* Skip if system is not reclaiming */
    if (reclaim == NO_RECLAIM) {
        in_reclaim = false;
        goto no_kill;
    }
    if (!in_reclaim) {
        /* Record file-backed pagecache size when entering reclaim cycle */
        base_file_lru = vs.field.nr_inactive_file + vs.field.nr_active_file;
        init_ws_refault = vs.field.workingset_refault;
        thrashing_limit = thrashing_limit_pct;
    } else {
        /* Calculate what % of the file-backed pagecache refaulted so far */
        thrashing = (vs.field.workingset_refault - init_ws_refault) * 100 / base_file_lru;
    }
    in_reclaim = true;
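
The thrashing value from step c can be sketched as a small function. The name thrashing_pct is hypothetical, and the guard against an empty baseline is my own addition; the quoted code divides by base_file_lru directly.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: percentage of the file-backed page cache (sized
 * when the reclaim cycle began) that has refaulted since then.
 */
static int64_t thrashing_pct(int64_t refault_now, int64_t refault_at_start,
                             int64_t base_file_lru) {
    assert(refault_now >= refault_at_start);
    if (base_file_lru <= 0) {
        /* Assumption: treat an empty baseline as "no thrashing" */
        return 0;
    }
    return (refault_now - refault_at_start) * 100 / base_file_lru;
}
```

For example, if the file LRU held 200000 pages when reclaim started and workingset_refault has grown from 100000 to 130000 since then, the thrashing value is 30000 * 100 / 200000 = 15. The value is not capped and can exceed 100 when pages refault repeatedly.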

    d. Parse zoneinfo and compute the min/low/high watermarks (refreshed at most once per minute);

    /* Refresh thresholds once per min in case user updated one of the margins */
    if (thresholds.high_wmark == 0 || get_time_diff_ms(&threshold_update_tm, &curr_tm) > 60000) {
        struct zoneinfo zi;

        /*
         * In unlikely case of failing we skip the update until the next opportunity
         * but still rate limiting the updates even as we skip one.
         */
        if (zoneinfo_parse(&zi) < 0) {
            ALOGE("Failed to parse zoneinfo!");
        } else {
            calc_zone_thresholds(&zi, &thresholds);
        }
        threshold_update_tm = curr_tm;
    }
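
The quoted snippet calls get_time_diff_ms without showing it. Below is a minimal sketch of how such a millisecond difference and the once-per-minute refresh test could look; time_diff_ms and thresholds_stale are hypothetical names, and the real lmkd helper may differ.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical stand-in for get_time_diff_ms: elapsed milliseconds from -> to */
static long time_diff_ms(const struct timespec *from, const struct timespec *to) {
    assert(from != NULL && to != NULL);
    return (long)(to->tv_sec - from->tv_sec) * 1000L +
           (to->tv_nsec - from->tv_nsec) / 1000000L;
}

/*
 * Mirrors the refresh condition in step d: recompute when the thresholds
 * were never calculated (high_wmark == 0) or are older than 60 seconds.
 */
static int thresholds_stale(long high_wmark, const struct timespec *last_update,
                            const struct timespec *now) {
    return high_wmark == 0 || time_diff_ms(last_update, now) > 60000;
}
```

Note that the quoted code updates threshold_update_tm even when zoneinfo_parse fails, so a failed parse is retried only after another minute rather than on every event.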

    e. Use the current meminfo data to determine which watermark has been breached;

    wmark = get_lowest_watermark(&mi, &thresholds);
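
A minimal sketch of what the watermark lookup in step e might do is shown below. The enum ordering is an assumption inferred from the comparisons in the quoted code (for instance, `wmark > WMARK_LOW` is described as "min"); the struct layout, the low_wmark and min_wmark field names, and the function name lowest_breached are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ordering: a larger value means a lower (more severe) watermark */
enum zone_watermark { WMARK_NONE = 0, WMARK_HIGH, WMARK_LOW, WMARK_MIN };

/* Hypothetical layout; only high_wmark appears in the quoted code */
struct zone_watermarks {
    int64_t high_wmark;
    int64_t low_wmark;
    int64_t min_wmark;
};

/* Return the most severe watermark that free_pages has fallen below */
static enum zone_watermark lowest_breached(int64_t free_pages,
                                           const struct zone_watermarks *w) {
    assert(w->min_wmark <= w->low_wmark && w->low_wmark <= w->high_wmark);
    if (free_pages < w->min_wmark) return WMARK_MIN;
    if (free_pages < w->low_wmark) return WMARK_LOW;
    if (free_pages < w->high_wmark) return WMARK_HIGH;
    return WMARK_NONE;
}
```

Under this ordering, `wmark > WMARK_HIGH` reads as "the low or min watermark is breached" and `wmark > WMARK_LOW` as "the min watermark is breached", which matches the log strings in step f.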

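    f. Evaluate several scenarios based on the watermark, the thrashing value, the pressure level, swap_is_low, and the reclaim mode, and record a different kill reason for each: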

    if (cycle_after_kill && wmark > WMARK_LOW) {
        /* Prevent kills not freeing enough memory */
        kill_reason = PRESSURE_AFTER_KILL;
        strncpy(kill_desc, "min watermark is breached even after kill", sizeof(kill_desc));
    } else if (level == VMPRESS_LEVEL_CRITICAL && events != 0) {
        /* Device is too busy during lowmem event (kill to prevent ANR) */
        kill_reason = NOT_RESPONDING;
        strncpy(kill_desc, "device is not responding", sizeof(kill_desc));
    } else if (swap_is_low && thrashing > thrashing_limit_pct) {
        /* Page cache is thrashing */
        kill_reason = LOW_SWAP_AND_THRASHING;
        snprintf(kill_desc, sizeof(kill_desc), "device is low on swap (%" PRId64
            "kB < %" PRId64 "kB) and thrashing (%" PRId64 "%%)",
            mi.field.free_swap * page_k, swap_low_threshold * page_k, thrashing);
    } else if (swap_is_low && wmark > WMARK_HIGH) {
        /* Both free memory and swap are low */
        kill_reason = LOW_MEM_AND_SWAP;
        snprintf(kill_desc, sizeof(kill_desc), "%s watermark is breached and swap is low (%"
            PRId64 "kB < %" PRId64 "kB)", wmark > WMARK_LOW ? "min" : "low",
            mi.field.free_swap * page_k, swap_low_threshold * page_k);
    } else if (wmark > WMARK_HIGH && thrashing > thrashing_limit) {
        /*
         * Record last time system was thrashing and cut thrashing limit by
         * thrashing_limit_decay_pct percentage of the current thrashing amount
         * until the system stops thrashing
         */
        thrashing_limit = (thrashing_limit * (100 - thrashing_limit_decay_pct)) / 100;
        kill_reason = LOW_MEM_AND_THRASHING;
        min_score_adj = 200;
        snprintf(kill_desc, sizeof(kill_desc), "%s watermark is breached and thrashing (%" PRId64 "%%)",
            wmark > WMARK_LOW ? "min" : "low", thrashing);
    } else if (reclaim == DIRECT_RECLAIM && thrashing > thrashing_limit) {
        /* Page cache is thrashing while in direct reclaim (mostly happens on lowram devices) */
        thrashing_limit = (thrashing_limit * (100 - thrashing_limit_decay_pct)) / 100;
        kill_reason = DIRECT_RECL_AND_THRASHING;
        min_score_adj = 200;
        snprintf(kill_desc, sizeof(kill_desc), "device is in direct reclaim and thrashing (%"
            PRId64 "%%)", thrashing);
    }
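
The last two branches shrink thrashing_limit every time they trigger a kill. The sketch below isolates that decay step; decay_thrashing_limit is a hypothetical name, and the values in the example are illustrative assumptions, not defaults taken from this post.

```c
#include <assert.h>

/*
 * Hypothetical helper: reduce the thrashing limit by decay_pct percent,
 * using the same integer arithmetic as the quoted code.
 */
static int decay_thrashing_limit(int limit, int decay_pct) {
    assert(decay_pct >= 0 && decay_pct <= 100);
    return limit * (100 - decay_pct) / 100;
}
```

Starting from an assumed limit of 100 with an assumed decay of 10, successive thrashing kills lower the limit to 90, 81, then 72, so each further kill needs less thrashing to trigger. As the quoted code shows, the limit goes back to thrashing_limit_pct when a new reclaim cycle begins or when no eligible process is found.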

    g. If any of the conditions above matched, perform the kill:

    /* Kill a process if necessary */
    if (kill_reason != NONE) {
        if (find_and_kill_process(min_score_adj, kill_desc) > 0) {
            killing = true;
            meminfo_log(&mi);
        } else {
            /* No eligible processes found, reset thrashing limit */
            thrashing_limit = thrashing_limit_pct;
        }
    }

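I am short on time, so I will stop here for now. Later updates will cover:

1. The kernel implementation of PSI, the meaning of its variables, and how to inspect its nodes;

2. Tuning approaches for specific phone configurations.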

 
