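Original title: LMKD Analysis (3): New Features in Android Q
Reason for the title change: I only got access to the QCOM baseline much later, so for a long time I assumed the analysis below applied to every Android Q platform.
Once the QCOM baseline arrived I found the differences were huge, so I renamed this post to LMKD Analysis (3): New Features in Android Q (MTK Edition).
Main text: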
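Android Q introduces a new mode based on PSI (Pressure Stall Information). lmkd registers monitors for PSI events and then evaluates the watermark, free memory, free swap, thrashing and other information to form a fuller picture of current system pressure and to kill processes in a targeted way.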
This mode depends on the following:
1. The kernel is built with CONFIG_PSI=y;
2. The property ro.lmk.use_psi must not be false;
3. The property ro.config.low_ram is true, or ro.lmk.use_minfree_levels is false;
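The three conditions above can be combined into a single check. The following is a minimal sketch, not the lmkd implementation: the struct and function names are hypothetical, and it assumes ro.lmk.use_psi defaults to true when unset.

```c
#include <stdbool.h>

/* Hypothetical container for the inputs listed above (not an lmkd type). */
struct lmkd_config {
    bool kernel_has_psi;      /* kernel built with CONFIG_PSI=y */
    bool use_psi;             /* ro.lmk.use_psi, assumed to default to true */
    bool low_ram;             /* ro.config.low_ram */
    bool use_minfree_levels;  /* ro.lmk.use_minfree_levels */
};

/* Returns true only when all three listed conditions hold. */
static bool psi_mode_enabled(const struct lmkd_config *cfg) {
    if (!cfg->kernel_has_psi) {
        return false;
    }
    if (!cfg->use_psi) {
        return false;
    }
    return cfg->low_ram || !cfg->use_minfree_levels;
}
```

For example, a device with ro.lmk.use_minfree_levels=true and ro.config.low_ram=false would not satisfy the third condition, so this sketch returns false for it.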
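This mode supports the following parameters:
1. ro.lmk.swap_free_low_percentage
This property was already introduced in Android P and serves as a kill condition. Its value is an integer percentage: lmkd starts killing once free swap drops below this share of total swap. The default is 10 when ro.config.low_ram=true, otherwise 20.
2. ro.lmk.psi_partial_stall_ms
This property is specific to PSI mode and only takes effect when PSI mode has been enabled successfully. It sets the reporting condition as an integer number of milliseconds: when the PSI monitor sees the "some" stall exceed this many milliseconds within a one-second window, pressure level 1 (VMPRESS_LEVEL_MEDIUM) is reported.
The default is 200 when ro.config.low_ram=true, otherwise 70.
3. ro.lmk.psi_complete_stall_ms
This property is specific to PSI mode and only takes effect when PSI mode has been enabled successfully. It sets the reporting condition as an integer number of milliseconds: when the PSI monitor sees the "full" stall exceed this many milliseconds within a one-second window, pressure level 2 (VMPRESS_LEVEL_CRITICAL) is reported.
The default is 700.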
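To make the two stall thresholds concrete: according to the kernel PSI documentation, a trigger is registered by writing a string of the form "<some|full> <stall amount in us> <time window in us>" to a pressure file such as /proc/pressure/memory. The sketch below only formats that string from the defaults quoted above; the helper names are hypothetical and it does not reproduce how lmkd itself registers its monitors.

```c
#include <stdbool.h>
#include <stdio.h>

#define PSI_WINDOW_US 1000000 /* the one-second window described above */

/* Defaults as quoted in the text; helper names are hypothetical. */
static int default_partial_stall_ms(bool low_ram) {
    return low_ram ? 200 : 70;
}

static int default_complete_stall_ms(void) {
    return 700;
}

/* Formats a PSI trigger string: "<some|full> <stall us> <window us>". */
static int format_psi_trigger(char *buf, size_t len, bool full, int stall_ms) {
    return snprintf(buf, len, "%s %d %d", full ? "full" : "some",
                    stall_ms * 1000, PSI_WINDOW_US);
}
```

With the non-low-RAM defaults this yields "some 70000 1000000" for the partial stall and "full 700000 1000000" for the complete stall.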
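The "some" and "full" PSI scenarios are not covered here; see the official Linux kernel documentation: https://www.kernel.org/doc/html/latest/accounting/psi.html
lmkd only uses the memory portion of PSI; I will cover it in a separate post when time allows.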
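LMKD workflow:
1. After lmkd starts at boot, it checks its state, loads the various properties and evaluates them;
2. It registers the PSI monitors;
3. When PSI reports a pressure level to lmkd, lmkd goes through the following checks to decide whether a kill is needed: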
a. Do not kill if the previous kill has not finished yet, if reading the current time fails, or if parsing vmstat/meminfo fails;
    /* Skip while still killing a process */
    if (is_kill_pending()) {
        /* TODO: replace this quick polling with pidfd polling if kernel supports */
        goto no_kill;
    }
    if (clock_gettime(CLOCK_MONOTONIC_COARSE, &curr_tm) != 0) {
        ALOGE("Failed to get current time");
        return;
    }
    if (vmstat_parse(&vs) < 0) {
        ALOGE("Failed to parse vmstat!");
        return;
    }
    if (meminfo_parse(&mi) < 0) {
        ALOGE("Failed to parse meminfo!");
        return;
    }
b. When free swap falls below the percentage defined by ro.lmk.swap_free_low_percentage, set swap_is_low = true;
    /* Check free swap levels */
    if (swap_free_low_percentage) {
        if (!swap_low_threshold) {
            swap_low_threshold = mi.field.total_swap * swap_free_low_percentage / 100;
        }
        if (mi.field.free_swap < swap_low_threshold) {
            swap_is_low = true;
        }
    }
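The swap check is plain integer arithmetic, so a standalone version is easy to try out. This is a minimal sketch with a hypothetical function name; unlike the excerpt it recomputes the threshold on every call instead of caching it.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when free swap is below pct percent of total swap.
 * A pct of 0 disables the check, mirroring the outer condition in the excerpt.
 * Both sizes must be in the same unit (pages here).
 */
static bool swap_is_low_sketch(int64_t total_swap, int64_t free_swap, int pct) {
    if (pct == 0) {
        return false;
    }
    int64_t threshold = total_swap * pct / 100; /* integer division truncates */
    return free_swap < threshold;
}
```

As an illustration, with 524288 pages of swap and a percentage of 10 the threshold is 52428 pages, so 52427 free pages counts as low while 52428 does not.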
c. Compare the pgscan_direct/pgscan_kswapd counters with their previous values to determine whether memory is being reclaimed directly (DIRECT_RECLAIM) or in the background by kswapd (KSWAPD_RECLAIM). If neither is the case (NO_RECLAIM), memory pressure is low and no kill is performed; otherwise compute the thrashing value (the percentage of the file-backed page cache that has refaulted);
    /* Identify reclaim state */
    if (vs.field.pgscan_direct > init_pgscan_direct) {
        init_pgscan_direct = vs.field.pgscan_direct;
        init_pgscan_kswapd = vs.field.pgscan_kswapd;
        reclaim = DIRECT_RECLAIM;
    } else if (vs.field.pgscan_kswapd > init_pgscan_kswapd) {
        init_pgscan_kswapd = vs.field.pgscan_kswapd;
        reclaim = KSWAPD_RECLAIM;
    }
    /* Skip if system is not reclaiming */
    if (reclaim == NO_RECLAIM) {
        in_reclaim = false;
        goto no_kill;
    }
    if (!in_reclaim) {
        /* Record file-backed pagecache size when entering reclaim cycle */
        base_file_lru = vs.field.nr_inactive_file + vs.field.nr_active_file;
        init_ws_refault = vs.field.workingset_refault;
        thrashing_limit = thrashing_limit_pct;
    } else {
        /* Calculate what % of the file-backed pagecache refaulted so far */
        thrashing = (vs.field.workingset_refault - init_ws_refault) * 100 / base_file_lru;
    }
    in_reclaim = true;
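The two calculations in this step can be isolated as pure functions. The following is a sketch under stated assumptions: the function names are hypothetical, the enum values are chosen for the example only, and the zero-divisor guard is an addition that the quoted code does not have.

```c
#include <stdint.h>

enum reclaim_state { NO_RECLAIM = 0, KSWAPD_RECLAIM, DIRECT_RECLAIM };

/* Direct reclaim takes precedence over kswapd reclaim, as in the excerpt. */
static enum reclaim_state classify_reclaim(int64_t direct_now, int64_t direct_prev,
                                           int64_t kswapd_now, int64_t kswapd_prev) {
    if (direct_now > direct_prev) {
        return DIRECT_RECLAIM;
    }
    if (kswapd_now > kswapd_prev) {
        return KSWAPD_RECLAIM;
    }
    return NO_RECLAIM;
}

/* Percentage of the file-backed page cache refaulted since reclaim began. */
static int64_t thrashing_pct(int64_t refault_now, int64_t refault_at_start,
                             int64_t base_file_lru) {
    if (base_file_lru <= 0) {
        return 0; /* guard added for this sketch only */
    }
    return (refault_now - refault_at_start) * 100 / base_file_lru;
}
```

For instance, if workingset_refault grew from 10000 to 15000 while the file LRU held 20000 pages at the start of the reclaim cycle, the thrashing value is 25.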
d. Parse zoneinfo and calculate the min/low/high watermarks;
    /* Refresh thresholds once per min in case user updated one of the margins */
    if (thresholds.high_wmark == 0 || get_time_diff_ms(&threshold_update_tm, &curr_tm) > 60000) {
        struct zoneinfo zi;
        /*
         * In unlikely case of failing we skip the update until the next opportunity
         * but still rate limiting the updates even as we skip one.
         */
        if (zoneinfo_parse(&zi) < 0) {
            ALOGE("Failed to parse zoneinfo!");
        } else {
            calc_zone_thresholds(&zi, &thresholds);
        }
        threshold_update_tm = curr_tm;
    }
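The refresh condition relies on a millisecond difference between two timespec values. The body of get_time_diff_ms is not shown in this article, so the version below is only a plausible sketch with hypothetical names, not the function from the source tree.

```c
#include <stdbool.h>
#include <time.h>

/* Millisecond difference between two timestamps (to - from). */
static long time_diff_ms(const struct timespec *from, const struct timespec *to) {
    return (long)(to->tv_sec - from->tv_sec) * 1000L +
           (to->tv_nsec - from->tv_nsec) / 1000000L;
}

/* True when thresholds were never computed or are older than one minute. */
static bool thresholds_stale(long high_wmark, const struct timespec *last_update,
                             const struct timespec *now) {
    return high_wmark == 0 || time_diff_ms(last_update, now) > 60000;
}
```

Because the timestamp is updated even when zoneinfo parsing fails, a failed parse is retried at most once per minute rather than on every PSI event.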
e. Use the current meminfo data to determine which watermark has been breached;
    wmark = get_lowest_watermark(&mi, &thresholds);
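The body of get_lowest_watermark is not quoted in this article, so the following is only a sketch of how such a lookup could work. The type and function names are hypothetical, and the enum ordering is an assumption: it is chosen so that a more severe breach compares greater, which is what comparisons such as wmark > WMARK_HIGH in the decision code imply.

```c
#include <stdint.h>

/* Assumed ordering: the more severe the breach, the larger the value. */
enum zone_watermark_sketch { WMARK_NONE = 0, WMARK_HIGH, WMARK_LOW, WMARK_MIN };

/* Hypothetical thresholds, in pages. */
struct zone_thresholds_sketch {
    int64_t min_wmark;
    int64_t low_wmark;
    int64_t high_wmark;
};

/* Returns the most severe watermark that the free page count falls below. */
static enum zone_watermark_sketch lowest_watermark(
        int64_t nr_free_pages, const struct zone_thresholds_sketch *t) {
    if (nr_free_pages < t->min_wmark) {
        return WMARK_MIN;
    }
    if (nr_free_pages < t->low_wmark) {
        return WMARK_LOW;
    }
    if (nr_free_pages < t->high_wmark) {
        return WMARK_HIGH;
    }
    return WMARK_NONE;
}
```

Under this ordering, a result greater than WMARK_HIGH means the low or min watermark is breached, and a result greater than WMARK_LOW means the min watermark is breached.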
f. Evaluate several scenarios based on the watermark, the thrashing value, the pressure level, swap_is_low, the reclaim mode and so on, and record a different kill reason for each:
    if (cycle_after_kill && wmark > WMARK_LOW) {
        /* Prevent kills not freeing enough memory */
        kill_reason = PRESSURE_AFTER_KILL;
        strncpy(kill_desc, "min watermark is breached even after kill", sizeof(kill_desc));
    } else if (level == VMPRESS_LEVEL_CRITICAL && events != 0) {
        /* Device is too busy during lowmem event (kill to prevent ANR) */
        kill_reason = NOT_RESPONDING;
        strncpy(kill_desc, "device is not responding", sizeof(kill_desc));
    } else if (swap_is_low && thrashing > thrashing_limit_pct) {
        /* Page cache is thrashing */
        kill_reason = LOW_SWAP_AND_THRASHING;
        snprintf(kill_desc, sizeof(kill_desc), "device is low on swap (%" PRId64
                 "kB < %" PRId64 "kB) and thrashing (%" PRId64 "%%)",
                 mi.field.free_swap * page_k, swap_low_threshold * page_k, thrashing);
    } else if (swap_is_low && wmark > WMARK_HIGH) {
        /* Both free memory and swap are low */
        kill_reason = LOW_MEM_AND_SWAP;
        snprintf(kill_desc, sizeof(kill_desc), "%s watermark is breached and swap is low (%"
                 PRId64 "kB < %" PRId64 "kB)", wmark > WMARK_LOW ? "min" : "low",
                 mi.field.free_swap * page_k, swap_low_threshold * page_k);
    } else if (wmark > WMARK_HIGH && thrashing > thrashing_limit) {
        /*
         * Record last time system was thrashing and cut thrashing limit by
         * thrashing_limit_decay_pct percentage of the current thrashing amount
         * until the system stops thrashing
         */
        thrashing_limit = (thrashing_limit * (100 - thrashing_limit_decay_pct)) / 100;
        kill_reason = LOW_MEM_AND_THRASHING;
        min_score_adj = 200;
        snprintf(kill_desc, sizeof(kill_desc), "%s watermark is breached and thrashing (%"
                 PRId64 "%%)", wmark > WMARK_LOW ? "min" : "low", thrashing);
    } else if (reclaim == DIRECT_RECLAIM && thrashing > thrashing_limit) {
        /* Page cache is thrashing while in direct reclaim (mostly happens on lowram devices) */
        thrashing_limit = (thrashing_limit * (100 - thrashing_limit_decay_pct)) / 100;
        kill_reason = DIRECT_RECL_AND_THRASHING;
        min_score_adj = 200;
        snprintf(kill_desc, sizeof(kill_desc), "device is in direct reclaim and thrashing (%"
                 PRId64 "%%)", thrashing);
    }
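The thrashing-limit decay in the last two branches is easier to follow with numbers. The sketch below isolates that one line in a hypothetical helper; the starting limit of 100 and decay of 10 used in the note afterwards are illustrative values, not defaults taken from this article.

```c
#include <stdint.h>

/* Cuts the limit by decay_pct percent, using the same integer math as above. */
static int64_t decay_thrashing_limit(int64_t limit, int64_t decay_pct) {
    return (limit * (100 - decay_pct)) / 100;
}
```

Starting from 100 with a decay of 10, successive thrashing kills lower the limit to 90, then 81, then 72 (integer division truncates 72.9). Each kill therefore makes the next thrashing kill slightly easier to trigger, until the limit is reset to thrashing_limit_pct.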
g. If any of the conditions is met, perform the kill:
    /* Kill a process if necessary */
    if (kill_reason != NONE) {
        if (find_and_kill_process(min_score_adj, kill_desc) > 0) {
            killing = true;
            meminfo_log(&mi);
        } else {
            /* No eligible processes found, reset thrashing limit */
            thrashing_limit = thrashing_limit_pct;
        }
    }
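I am short on time, so I will stop here for now. Later updates will cover:
1. The kernel implementation of PSI, the meaning of its variables, and how to inspect its nodes;
2. Tuning approaches for certain phone configurations;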