(Repost) A Deep Dive into OOM (Low-Memory Crashes) on iOS

English original: https://programmer.ink/think/learn-more-about-oom-low-memory-crash-in-ios.html

Chinese translation: https://www.taodudu.cc/news/show-5381.html

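During iOS development, or in user feedback, you often see an app crash in the middle of normal use, yet no crash log turns up when you look for the stack trace on the crash backend. In most cases the system killed the app for low memory, that is, an OOM (the other possibility is that the main thread hung and the watchdog killed the app). Low-memory logs usually have names that begin with JetsamEvent and contain fields such as the memory page size (pageSize) and CPU time (cpuTime).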

What is OOM

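OOM is short for out-of-memory: literally, memory use exceeded the limit. It is an "unconventional" crash caused by the Jetsam mechanism in iOS. Unlike an ordinary crash, an OOM event cannot be caught by crash monitoring that relies on signal handlers.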

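There is also the term FOOM, Foreground Out-Of-Memory: the system force-kills an app while it is in the foreground because it consumes too much memory. That is the subject of this article. An OOM in the background is not necessarily the app's own fault. Most often the app currently in the foreground is using too much memory, and the system clears out background apps to keep the foreground app running.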

What is the Jetsam mechanism

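Jetsam can be understood as the management mechanism the operating system uses to keep memory from being overused. It runs independently of any app (in XNU it is implemented as kernel threads, as the source code later in this article shows). Every process has a memory threshold, and as soon as a process exceeds it, Jetsam kills that process immediately.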

Why Jetsam was designed

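Device memory is finite, so it is a precious resource that system processes and the user's other apps all compete for. Because iOS has no swap space, once a low-memory event fires, Jetsam frees as much app memory as it can. As a result, when iOS runs short of memory, apps are terminated by the system.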

Swap space

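What can be done when physical memory runs out? Desktop operating systems usually provide swap space, which Windows calls virtual memory. When needed, part of physical memory is swapped out to disk, so disk space is used to extend memory.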

iOS does not support swap space

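But iOS does not support swap space, and neither do most mobile devices. Their mass storage is flash, which is far slower than RAM, so swapping to it would not improve performance. In addition, storage on mobile devices is often scarce and flash has a limited number of write cycles, so spending it on swap would be wasteful.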

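Note that quite a few articles online claim iOS has no virtual memory mechanism. What they actually mean is that iOS has no swap space mechanism.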

Typical app memory types

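When memory runs short, the system frees space according to certain policies. A common approach is to move some low-priority data to disk, which is called a page out. When that data is accessed again, the system brings it back into memory, which is called a page in.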

Clean Memory

Clean memory is memory that can be paged out: read-only memory-mapped files and the frameworks the app uses. Every framework has a __DATA_CONST segment, which is normally clean, but it becomes dirty if you swizzle with the runtime.

Dirty Memory

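Dirty memory is memory the app has written to, including all heap objects and image decode buffers. As with clean memory, the frameworks the app uses contribute too: every framework has __DATA and __DATA_DIRTY segments, and their memory is dirty.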

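Note that using a framework generates dirty memory. Keeping an eye on singletons and global initializers is a good way to reduce it, because a singleton stays in memory once it has been created, and global initializers run when the class is loaded.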

Compressed Memory

Because of the capacity and write-endurance limits of flash, iOS has no swap space and uses compressed memory instead.

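Under memory pressure, compressed memory shrinks memory that has not been accessed recently to less than half its original size and decompresses it again when it is needed. It saves memory while improving system responsiveness. Its characteristics are: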

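  • Shrinks memory usage: reduces the footprint of inactive memory
  • Improves power efficiency: compression avoids the cost of disk I/O
  • Minimizes CPU usage: compression and decompression are very fast, keeping CPU time overhead low
  • Is multicore aware: supports multicore operation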

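For example, suppose a Dictionary used as a cache occupies 3 pages of memory. While it is not being accessed it may be compressed to 1 page, and when it is used again it is decompressed back to 3 pages.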

In essence, compressed memory is also dirty memory.
Therefore memory footprint = dirty size + compressed size, and this is the memory usage we need to reduce and are able to reduce.
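
The footprint formula can be written out as a small C sketch. The struct and function names are hypothetical, and the 16384-byte page size used in the note below is simply the pageSize value from the JetsamEvent log shown later in this article. This illustrates the arithmetic only, not how the kernel does its accounting.

```c
#include <stdint.h>

/* Hypothetical page counts for one process (illustrative only). */
typedef struct {
    uint64_t dirty_pages;       /* pages the app has written to */
    uint64_t compressed_pages;  /* pages currently held by the compressor */
} page_counts_t;

/* memory footprint = dirty size + compressed size, in bytes. */
static uint64_t footprint_bytes(page_counts_t counts, uint64_t page_size) {
    return (counts.dirty_pages + counts.compressed_pages) * page_size;
}
```

With 3 dirty pages and 1 compressed page at a 16384-byte page size, the function returns 65536 bytes.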

Memory Warning

Memory warnings should be familiar to everyone: every UIViewController has a didReceiveMemoryWarning method.

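When memory use climbs gradually rather than exploding all at once, the system sends a memory warning to every running app before the critical point is reached, telling it to clean up its memory. A memory warning is not always caused by your own app.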

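Memory compression makes freeing memory more complicated. It is implemented at the operating-system level and is invisible to the process. Interestingly, when a process receives a memory warning and starts releasing a large amount of unneeded memory, touching too much compressed memory forces that memory to be decompressed first, which can actually raise memory pressure and end in an OOM kill.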

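We cache data to reduce pressure on the CPU, but too much cache takes up too much memory. Where caching is needed, consider using NSCache instead of NSDictionary. Memory allocated by NSCache is effectively purgeable memory that the system can release automatically. Effective Objective-C 2.0 also recommends this: combining NSCache with NSPurgeableData lets the system reclaim memory as needed and removes the corresponding objects when that memory is purged.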

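Is an OOM always preceded by a memory warning? Not necessarily. The app may request a large amount of memory in an instant while the main thread happens to be busy with something else, so the OOM occurs without a memory warning ever being handled. Conversely, even after several memory warnings, an OOM does not necessarily follow within seconds of the last one. When I was developing an extension, memory warnings were frequent but no OOM occurred. Only after another minute or two of use did the OOM happen, and no further memory warning appeared during that time.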

Still, releasing memory when a memory warning arrives can help avoid an OOM to some extent.

How to determine the OOM threshold

Experienced developers know that the OOM threshold differs from device to device. So how do we find it?

Method 1

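When Jetsam kills our app, the phone generates a system log. Under Settings > Privacy > Analytics on the device you can find logs whose names start with JetsamEvent, and they contain memory information about the app. For example, on the iPhone 8 (iOS 11.4.1) I am using, pageSize appears near the top of the log. Find an entry whose reason is per-process-limit (not every log has one, so look for one that does) and multiply that entry's rpages by pageSize to get the OOM threshold.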

{"bug_type":"298","timestamp":"2020-01-03 04:11:13.65 +0800","os_version":"iPhone OS 11.4.1 (15G77)","incident_id":"2723B2EA-7FB8-49A6-B2FC-49F10C748D8A"}
{
  "crashReporterKey" : "a6ad027ba01b1e66d0b3d8446aaef5dbd75dd732",
  "kernel" : "Darwin Kernel Version 17.7.0: Mon Jun 11 19:06:27 PDT 2018; root:xnu-4570.70.24~3\/RELEASE_ARM64_T8015",
  "product" : "iPhone10,1",
  "incident" : "2723B2EA-7FB8-49A6-B2FC-49F10C748D8A",
  "date" : "2020-01-03 04:11:13.65 +0800",
  "build" : "iPhone OS 11.4.1 (15G77)",
  "timeDelta" : 4,
  "memoryStatus" : {
  "compressorSize" : 39010,
  "compressions" : 2282594,
  "decompressions" : 1071238,
  "zoneMapCap" : 402653184,
  "largestZone" : "APFS_4K_OBJS",
  "largestZoneSize" : 35962880,
  "pageSize" : 16384,
  "uncompressed" : 105360,
  "zoneMapSize" : 118865920,
  "memoryPages" : {
    "active" : 39800,
    "throttled" : 0,
    "fileBacked" : 28778,
    "wired" : 19947,
    "anonymous" : 32084,
    "purgeable" : 543,
    "inactive" : 19877,
    "free" : 2935,
    "speculative" : 1185
  }
},
...
  {
    "uuid" : "a2f9f2db-a110-3896-a0ec-d82c156055ed",
    "states" : [
      "frontmost",
      "resume"
    ],
    "killDelta" : 11351,
    "genCount" : 0,
    "age" : 361742447,
    "purgeable" : 0,
    "fds" : 50,
    "coalition" : 2694,
    "rpages" : 89600,
    "reason" : "per-process-limit",
    "pid" : 2541,
    "cpuTime" : 1.65848,
    "name" : "MemoryTest",
    "lifetimeMax" : 24126
  },
...

So the memory threshold for this MemoryTest process is 16384 * 89600 / 1024 / 1024 = 1400 MB.
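
The same calculation as a minimal C sketch. The function name is hypothetical; its two parameters stand for the rpages and pageSize fields read from a JetsamEvent log.

```c
#include <stdint.h>

/* Convert a JetsamEvent entry's resident page count into whole megabytes.
 * rpages and page_size correspond to the "rpages" and "pageSize" log fields. */
static uint64_t jetsam_limit_mb(uint64_t rpages, uint64_t page_size) {
    return rpages * page_size / (1024ULL * 1024ULL);
}
```

Calling jetsam_limit_mb(89600, 16384) reproduces the 1400 MB figure from the log above.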

Method 2

Many people online have already compiled tables of OOM limits by device. Based on my own measurements, I tend to favor the following version.

I created one more list by sorting Jaspers list by device RAM (I made my own tests with Split's tool and fixed some results - check my comments in Jaspers thread).

device RAM: percent range to crash

256MB: 49% - 51%
512MB: 53% - 63%
1024MB: 57% - 68%
2048MB: 68% - 69%
3072MB: 63% - 66%
4096MB: 77%
6144MB: 81%

Special cases:

iPhone X (3072MB): 50%
iPhone XS/XS Max (4096MB): 55%
iPhone XR (3072MB): 63%
iPhone 11/11 Pro Max (4096MB): 54% - 55%

Device RAM can be read easily:

[NSProcessInfo processInfo].physicalMemory

From my experience it is safe to use 45% for 1GB devices, 50% for 2/3GB devices and 55% for 4GB devices. Percent for macOS can be a bit bigger.
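
The rule of thumb at the end of the quoted answer can be sketched in C as follows. The function names are hypothetical. Applying 45% to devices with less than 1 GB and 55% to everything above 3 GB is an assumption of this sketch, not something the quoted answer states.

```c
#include <stdint.h>

#define ONE_GB (1024ULL * 1024ULL * 1024ULL)

/* Percent of physical RAM treated as a safe budget, following the quoted
 * rule of thumb: 45% for 1 GB, 50% for 2/3 GB, 55% for 4 GB devices.
 * The handling of sizes outside those values is this sketch's assumption. */
static unsigned safe_percent(uint64_t physical_bytes) {
    if (physical_bytes <= 1 * ONE_GB) return 45;
    if (physical_bytes <= 3 * ONE_GB) return 50;
    return 55;
}

/* Budget in bytes; dividing first keeps the intermediate value small. */
static uint64_t safe_budget_bytes(uint64_t physical_bytes) {
    return physical_bytes / 100 * safe_percent(physical_bytes);
}
```

The input would be the value of [NSProcessInfo processInfo].physicalMemory. The thresholds use <= so that a device reporting slightly less than its nominal RAM still falls into the expected bucket.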

Method 3

First, we can read the memory currently used by the app with the following method.

- (int)usedSizeOfMemory {
    task_vm_info_data_t taskInfo;
    mach_msg_type_number_t infoCount = TASK_VM_INFO_COUNT;
    kern_return_t kernReturn = task_info(mach_task_self(), TASK_VM_INFO, (task_info_t)&taskInfo, &infoCount);

    if (kernReturn != KERN_SUCCESS) {
        return 0;
    }
    return (int)(taskInfo.phys_footprint / 1024 / 1024);
}

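Some code uses taskInfo.resident_size instead, but that value is not accurate. Comparing against the memory figure shown by the Xcode debugger, I found that taskInfo.phys_footprint essentially matches it. XNU's task.c also documents how the value is computed.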

/*
 * phys_footprint
 *   Physical footprint: This is the sum of:
 *     + (internal - alternate_accounting)
 *     + (internal_compressed - alternate_accounting_compressed)
 *     + iokit_mapped
 *     + purgeable_nonvolatile
 *     + purgeable_nonvolatile_compressed
 *     + page_table
 */
A local test showed:
On iOS 11, phys_footprint differs from the value shown by the Xcode debugger by less than 1 MB,
while on iOS 13 the two values match exactly.
Perfectionists can use
((taskInfo.internal + taskInfo.compressed - taskInfo.purgeable_volatile_pmap)) in place of phys_footprint on iOS 11.
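
To make the sum in the XNU comment concrete, here is a minimal C sketch that adds the terms up. The struct is hypothetical: it just holds one value per ledger entry named in that comment, and the real kernel keeps these in per-task ledgers rather than in a plain struct.

```c
#include <stdint.h>

/* Hypothetical snapshot of the ledger values named in the XNU comment (bytes). */
typedef struct {
    uint64_t internal;
    uint64_t internal_compressed;
    uint64_t alternate_accounting;
    uint64_t alternate_accounting_compressed;
    uint64_t iokit_mapped;
    uint64_t purgeable_nonvolatile;
    uint64_t purgeable_nonvolatile_compressed;
    uint64_t page_table;
} ledger_snapshot_t;

/* Sum the terms exactly as listed in the phys_footprint comment. */
static uint64_t phys_footprint_from(const ledger_snapshot_t *l) {
    return (l->internal - l->alternate_accounting)
         + (l->internal_compressed - l->alternate_accounting_compressed)
         + l->iokit_mapped
         + l->purgeable_nonvolatile
         + l->purgeable_nonvolatile_compressed
         + l->page_table;
}
```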

Once we can read this value, we can use a repeating timer to allocate 1 MB at a time until the first memory warning arrives and, finally, the OOM kill.

#import "ViewController.h"
#import <mach/mach.h>

#define kOneMB  1048576

@interface ViewController ()
{
    NSTimer *timer;

    int allocatedMB;
    Byte *p[10000];
    
    int physicalMemorySizeMB;
    int memoryWarningSizeMB;
    int memoryLimitSizeMB;
    BOOL firstMemoryWarningReceived;
}

@end

@implementation ViewController

- (void)viewDidLoad {
    [super viewDidLoad];
    // Do any additional setup after loading the view.
    physicalMemorySizeMB = (int)([[NSProcessInfo processInfo] physicalMemory] / kOneMB);
    firstMemoryWarningReceived = YES;
}

- (void)didReceiveMemoryWarning {
    [super didReceiveMemoryWarning];
        
    if (firstMemoryWarningReceived == NO) {
        return ;
    }
    memoryWarningSizeMB = [self usedSizeOfMemory];
    firstMemoryWarningReceived = NO;
}

- (IBAction)startTest:(UIButton *)button {
    [timer invalidate];
    timer = [NSTimer scheduledTimerWithTimeInterval:0.01 target:self selector:@selector(allocateMemory) userInfo:nil repeats:YES];
}

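- (void)allocateMemory {
    // Stop before overrunning the pointer table p[10000].
    if (allocatedMB >= 10000) {
        [timer invalidate];
        return;
    }

    p[allocatedMB] = malloc(kOneMB);
    if (p[allocatedMB] == NULL) {
        return;
    }
    // Write to every page so the allocation counts toward the footprint.
    memset(p[allocatedMB], 0, kOneMB);
    allocatedMB += 1;

    memoryLimitSizeMB = [self usedSizeOfMemory];
    if (memoryWarningSizeMB && memoryLimitSizeMB) {
        NSLog(@"----- memory warnning:%dMB, memory limit:%dMB", memoryWarningSizeMB, memoryLimitSizeMB);
    }
}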

- (int)usedSizeOfMemory {
    task_vm_info_data_t taskInfo;
    mach_msg_type_number_t infoCount = TASK_VM_INFO_COUNT;
    kern_return_t kernReturn = task_info(mach_task_self(), TASK_VM_INFO, (task_info_t)&taskInfo, &infoCount);

    if (kernReturn != KERN_SUCCESS) {
        return 0;
    }
    return (int)(taskInfo.phys_footprint / kOneMB);
}

@end

Run this under the debugger and look at the last log line in the console.

2020-01-03 11:52:26.353765+0800 MemoryTest[2561:599014] ----- memory warnning:1289MB, memory limit:1397MB
2020-01-03 11:52:26.363799+0800 MemoryTest[2561:599014] ----- memory warnning:1289MB, memory limit:1398MB
2020-01-03 11:52:26.373895+0800 MemoryTest[2561:599014] ----- memory warnning:1289MB, memory limit:1399MB

The memory warning arrived at 1289 MB and the last value logged before the OOM kill was 1399 MB, which indicates that the OOM threshold is 1400 MB.

Method 4 (iOS 13 and later)

iOS 13 adds a new API in os/proc.h that reports the memory currently available to the process.

#import <os/proc.h>

extern size_t os_proc_available_memory(void);

+ (CGFloat)availableSizeOfMemory {
    if (@available(iOS 13.0, *)) {
        return os_proc_available_memory() / 1024.0 / 1024.0;
    }
    // ...
}

With this value we can compute the app's memory limit. I tested on an iPhone XS Max. The limit obtained with method 1 was 134278 * 16384 / 1024 / 1024 = 2098 MB, and method 3 gave 2098 MB as well.

- (int)limitSizeOfMemory {
    if (@available(iOS 13.0, *)) {
        task_vm_info_data_t taskInfo;
        mach_msg_type_number_t infoCount = TASK_VM_INFO_COUNT;
        kern_return_t kernReturn = task_info(mach_task_self(), TASK_VM_INFO, (task_info_t)&taskInfo, &infoCount);

        if (kernReturn != KERN_SUCCESS) {
            return 0;
        }
        return (int)((taskInfo.phys_footprint + os_proc_available_memory()) / 1024.0 / 1024.0);
    }
    return 0;
}

This method also returns 2098 MB.

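Those are several ways to find the OOM threshold for an app. They work for both apps and app extensions. Extension limits are much stricter, though, and far below those of ordinary apps. For example, on an iPhone XS Max the app limit is 2098 MB, while a custom keyboard on the same device is limited to 66 MB (pitifully small).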

Exploring the source code

iOS and macOS share the XNU kernel, and XNU is open source, so we can look into Apple's Jetsam implementation in the published XNU source.

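The innermost layer of XNU is Mach. As a microkernel, Mach is a thin layer that provides only basic services such as processor management, scheduling, and IPC (inter-process communication). The second major part of XNU is the BSD layer, which can be seen as an outer ring around Mach. BSD provides the programming interfaces for end-user applications and is responsible for process management, file systems, and networking.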

The various Jetsam events in memory management are also generated by BSD, so we start from the bsd_init function to explore how it works.

bsd_init mostly initializes subsystems, such as virtual memory management.

BSD initialization: bsd_init

The memory-related steps are:

// 1. Initialize the BSD memory zone, which is built on the Mach kernel's zones
kmeminit();

// 2. iOS-only feature: resident monitoring thread for memory and process freezing
#if CONFIG_FREEZE
#ifndef CONFIG_MEMORYSTATUS
    #error "CONFIG_FREEZE defined without matching CONFIG_MEMORYSTATUS"
#endif
	/* Initialise background freezing */
	bsd_init_kprintf("calling memorystatus_freeze_init\n");
	memorystatus_freeze_init();
#endif

// 3. iOS-only: Jetsam, the resident monitoring thread for low-memory events
#if CONFIG_MEMORYSTATUS
	/* Initialize kernel memory status notifications */
	bsd_init_kprintf("calling memorystatus_init\n");
	memorystatus_init();
#endif /* CONFIG_MEMORYSTATUS */

Both memorystatus_freeze_init() and memorystatus_init() call interfaces exposed by kern_memorystatus.c. Their main job is to start highest-priority kernel threads that monitor the memory state of the whole system.

As for CONFIG_FREEZE: when this macro is enabled, the kernel freezes processes instead of killing them. The process-freezing code is outside the scope of this article.

Back to OOM crashes on iOS: we only need to look at memorystatus_init().

Background knowledge

  • The kernel keeps all processes distributed by priority in an array, where each element is a list of processes. The array has JETSAM_PRIORITY_MAX + 1 elements:
#define MEMSTAT_BUCKET_COUNT (JETSAM_PRIORITY_MAX + 1)

typedef struct memstat_bucket {
    TAILQ_HEAD(, proc) list;    //  TAILQ doubly linked list holding the processes at this priority
    int count;  //  number of processes
} memstat_bucket_t;

memstat_bucket_t memstat_bucket[MEMSTAT_BUCKET_COUNT];// priority queue (one bucket per priority level)
  • In kern_memorystatus.h we can find the value of JETSAM_PRIORITY_MAX and the definitions of the process priorities:
#define JETSAM_PRIORITY_REVISION                  2

#define JETSAM_PRIORITY_IDLE_HEAD                -2
/* The value -1 is an alias to JETSAM_PRIORITY_DEFAULT */
#define JETSAM_PRIORITY_IDLE                      0
#define JETSAM_PRIORITY_IDLE_DEFERRED		  1 /* Keeping this around till all xnu_quick_tests can be moved away from it.*/
#define JETSAM_PRIORITY_AGING_BAND1		  JETSAM_PRIORITY_IDLE_DEFERRED
#define JETSAM_PRIORITY_BACKGROUND_OPPORTUNISTIC  2
#define JETSAM_PRIORITY_AGING_BAND2		  JETSAM_PRIORITY_BACKGROUND_OPPORTUNISTIC
#define JETSAM_PRIORITY_BACKGROUND                3
#define JETSAM_PRIORITY_ELEVATED_INACTIVE	  JETSAM_PRIORITY_BACKGROUND
#define JETSAM_PRIORITY_MAIL                      4
#define JETSAM_PRIORITY_PHONE                     5
#define JETSAM_PRIORITY_UI_SUPPORT                8
#define JETSAM_PRIORITY_FOREGROUND_SUPPORT        9
#define JETSAM_PRIORITY_FOREGROUND               10
#define JETSAM_PRIORITY_AUDIO_AND_ACCESSORY      12
#define JETSAM_PRIORITY_CONDUCTOR                13
#define JETSAM_PRIORITY_HOME                     16
#define JETSAM_PRIORITY_EXECUTIVE                17
#define JETSAM_PRIORITY_IMPORTANT                18
#define JETSAM_PRIORITY_CRITICAL                 19

#define JETSAM_PRIORITY_MAX                      21

/* TODO - tune. This should probably be lower priority */
#define JETSAM_PRIORITY_DEFAULT                  18
#define JETSAM_PRIORITY_TELEPHONY                19

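The larger the value, the higher the priority. The background app priority JETSAM_PRIORITY_BACKGROUND is 3, lower than the foreground app priority JETSAM_PRIORITY_FOREGROUND at 10, while SpringBoard (the home screen process) sits at JETSAM_PRIORITY_HOME, which is 16.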
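
As a rough illustration of how a priority-indexed bucket array lets lower bands be considered first, here is a simplified C model. It is a sketch only: bucket_t is a hypothetical stand-in that keeps just the count field, and lowest_occupied_band is not an XNU function.

```c
#include <stddef.h>

#define PRIORITY_MAX  21                   /* mirrors JETSAM_PRIORITY_MAX */
#define BUCKET_COUNT  (PRIORITY_MAX + 1)   /* mirrors MEMSTAT_BUCKET_COUNT */

/* Simplified stand-in for memstat_bucket_t: only the process count is modeled. */
typedef struct {
    int count;
} bucket_t;

/* Return the lowest priority band that still holds a process, or -1 if
 * every band is empty. Scanning upward from band 0 means background (3)
 * is reached before foreground (10), and foreground before home (16). */
static int lowest_occupied_band(const bucket_t buckets[BUCKET_COUNT]) {
    for (int band = 0; band < BUCKET_COUNT; band++) {
        if (buckets[band].count > 0) {
            return band;
        }
    }
    return -1;
}
```

If only the foreground and home bands are occupied, the function returns 10. Once a background process is added, it returns 3.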

  • Reasons a Jetsam kill can occur:
//kern_memorystatus.h
/*
 * Jetsam exit reason definitions - related to memorystatus
 *
 * When adding new exit reasons also update:
 *	JETSAM_REASON_MEMORYSTATUS_MAX
 *	kMemorystatusKilled... Cause enum
 *	memorystatus_kill_cause_name[]
 */
#define JETSAM_REASON_INVALID								0
#define JETSAM_REASON_GENERIC								1
#define JETSAM_REASON_MEMORY_HIGHWATER						2
#define JETSAM_REASON_VNODE									3
#define JETSAM_REASON_MEMORY_VMPAGESHORTAGE					4
#define JETSAM_REASON_MEMORY_PROCTHRASHING					5
#define JETSAM_REASON_MEMORY_FCTHRASHING					6
#define JETSAM_REASON_MEMORY_PERPROCESSLIMIT				7
#define JETSAM_REASON_MEMORY_DISK_SPACE_SHORTAGE			8
#define JETSAM_REASON_MEMORY_IDLE_EXIT						9
#define JETSAM_REASON_ZONE_MAP_EXHAUSTION					10
#define JETSAM_REASON_MEMORY_VMCOMPRESSOR_THRASHING			11
#define JETSAM_REASON_MEMORY_VMCOMPRESSOR_SPACE_SHORTAGE	12

#define JETSAM_REASON_MEMORYSTATUS_MAX	JETSAM_REASON_MEMORY_VMCOMPRESSOR_SPACE_SHORTAGE

/*
 * Jetsam exit reason definitions - not related to memorystatus
 */
#define JETSAM_REASON_CPULIMIT			100

/* Cause */
enum {
	kMemorystatusInvalid							= JETSAM_REASON_INVALID,
	kMemorystatusKilled								= JETSAM_REASON_GENERIC,
	kMemorystatusKilledHiwat						= JETSAM_REASON_MEMORY_HIGHWATER,
	kMemorystatusKilledVnodes						= JETSAM_REASON_VNODE,
	kMemorystatusKilledVMPageShortage				= JETSAM_REASON_MEMORY_VMPAGESHORTAGE,
	kMemorystatusKilledProcThrashing				= JETSAM_REASON_MEMORY_PROCTHRASHING,
	kMemorystatusKilledFCThrashing					= JETSAM_REASON_MEMORY_FCTHRASHING,
	kMemorystatusKilledPerProcessLimit				= JETSAM_REASON_MEMORY_PERPROCESSLIMIT,
	kMemorystatusKilledDiskSpaceShortage			= JETSAM_REASON_MEMORY_DISK_SPACE_SHORTAGE,
	kMemorystatusKilledIdleExit						= JETSAM_REASON_MEMORY_IDLE_EXIT,
	kMemorystatusKilledZoneMapExhaustion			= JETSAM_REASON_ZONE_MAP_EXHAUSTION,
	kMemorystatusKilledVMCompressorThrashing		= JETSAM_REASON_MEMORY_VMCOMPRESSOR_THRASHING,
	kMemorystatusKilledVMCompressorSpaceShortage	= JETSAM_REASON_MEMORY_VMCOMPRESSOR_SPACE_SHORTAGE,
};

//kern_memorystatus.c
/* For logging clarity */
static const char *memorystatus_kill_cause_name[] = {
	""								,		/* kMemorystatusInvalid							*/
	"jettisoned"					,		/* kMemorystatusKilled							*/
	"highwater"						,		/* kMemorystatusKilledHiwat						*/
	"vnode-limit"					,		/* kMemorystatusKilledVnodes					*/
	"vm-pageshortage"				,		/* kMemorystatusKilledVMPageShortage			*/
	"proc-thrashing"				,		/* kMemorystatusKilledProcThrashing				*/
	"fc-thrashing"					,		/* kMemorystatusKilledFCThrashing				*/
	"per-process-limit"				,		/* kMemorystatusKilledPerProcessLimit			*/
	"disk-space-shortage"			,		/* kMemorystatusKilledDiskSpaceShortage			*/
	"idle-exit"						,		/* kMemorystatusKilledIdleExit					*/
	"zone-map-exhaustion"			,		/* kMemorystatusKilledZoneMapExhaustion			*/
	"vm-compressor-thrashing"		,		/* kMemorystatusKilledVMCompressorThrashing		*/
	"vm-compressor-space-shortage"	,		/* kMemorystatusKilledVMCompressorSpaceShortage	*/
};
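
The strings in this table are the ones that appear in the "reason" field of a JetsamEvent log entry, such as per-process-limit in the log shown for method 1. The following C sketch maps between reason codes and those strings. The two helper functions are hypothetical and exist only for illustration; the names in the table are copied from the listing above.

```c
#include <stddef.h>
#include <string.h>

/* Names copied from memorystatus_kill_cause_name[]; index = JETSAM_REASON_* value. */
static const char *const kill_cause_names[] = {
    "", "jettisoned", "highwater", "vnode-limit", "vm-pageshortage",
    "proc-thrashing", "fc-thrashing", "per-process-limit",
    "disk-space-shortage", "idle-exit", "zone-map-exhaustion",
    "vm-compressor-thrashing", "vm-compressor-space-shortage",
};

#define KILL_CAUSE_COUNT (sizeof(kill_cause_names) / sizeof(kill_cause_names[0]))

/* Hypothetical helper: reason code -> log string, "" when out of range. */
static const char *kill_cause_name(unsigned cause) {
    return cause < KILL_CAUSE_COUNT ? kill_cause_names[cause] : "";
}

/* Hypothetical helper: log string -> reason code, 0 (invalid) when unknown. */
static unsigned kill_cause_from_name(const char *name) {
    for (size_t i = 1; i < KILL_CAUSE_COUNT; i++) {
        if (strcmp(kill_cause_names[i], name) == 0) {
            return (unsigned)i;
        }
    }
    return 0;
}
```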

memorystatus_init: memory status initialization

Next, let us look at the key part of memorystatus_init() that initializes the Jetsam threads.

__private_extern__ void
memorystatus_init(void)
{
    ... 
	/* Initialize the jetsam_threads state array */
	jetsam_threads = kalloc(sizeof(struct jetsam_thread_state) * max_jetsam_threads);

	/* Initialize all the jetsam threads */
	for (i = 0; i < max_jetsam_threads; i++) {

		result = kernel_thread_start_priority(memorystatus_thread, NULL, 95 /* MAXPRI_KERNEL */, &jetsam_threads[i].thread);
		if (result == KERN_SUCCESS) {
			jetsam_threads[i].inited = FALSE;
			jetsam_threads[i].index = i;
			thread_deallocate(jetsam_threads[i].thread);
		} else {
			panic("Could not create memorystatus_thread %d", i);
		}
	}
}

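Here, depending on kernel boot arguments and device capability, max_jetsam_threads Jetsam threads are started (1 on low-end devices, 3 otherwise). They run at the highest priority the kernel can assign (95, MAXPRI_KERNEL), and each thread is given an index. (Note: the -2 to 19 range listed earlier is for process Jetsam priorities, whereas 95 here is a thread priority. XNU thread priorities range from 0 to 127.)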

memorystatus_thread: the memory status management thread

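The system has a dedicated thread for managing memory status. When memory status goes wrong or memory pressure is too high, it kills some apps according to a set policy in order to reclaim memory.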

Here is the code of memorystatus_thread, the memory status management thread:

static void
memorystatus_thread(void *param __unused, wait_result_t wr __unused)
{
	boolean_t post_snapshot = FALSE;
	uint32_t errors = 0;
	uint32_t hwm_kill = 0;
	boolean_t sort_flag = TRUE;
	boolean_t corpse_list_purged = FALSE;
	int	jld_idle_kills = 0;
	struct jetsam_thread_state *jetsam_thread = jetsam_current_thread();

	if (jetsam_thread->inited == FALSE) {
		/* 
		 * It's the first time the thread has run, so just mark the thread as privileged and block.
		 * This avoids a spurious pass with unset variables, as set out in <rdar://problem/9609402>.
		 */

		char name[32];
		thread_wire(host_priv_self(), current_thread(), TRUE);
		snprintf(name, 32, "VM_memorystatus_%d", jetsam_thread->index + 1);

		if (jetsam_thread->index == 0) {
			if (vm_pageout_state.vm_restricted_to_single_processor == TRUE) {
				thread_vm_bind_group_add();
			}
		}
		thread_set_thread_name(current_thread(), name);
		jetsam_thread->inited = TRUE;
		memorystatus_thread_block(0, memorystatus_thread);
	}
	
	KERNEL_DEBUG_CONSTANT(BSDDBG_CODE(DBG_BSD_MEMSTAT, BSD_MEMSTAT_SCAN) | DBG_FUNC_START,
			      memorystatus_available_pages, memorystatus_jld_enabled, memorystatus_jld_eval_period_msecs, memorystatus_jld_eval_aggressive_count,0);

	/*
	 * Jetsam aware version.
	 *
	 * The VM pressure notification thread is working it's way through clients in parallel.
	 *
	 * So, while the pressure notification thread is targeting processes in order of 
	 * increasing jetsam priority, we can hopefully reduce / stop it's work by killing 
	 * any processes that have exceeded their highwater mark.
	 *
	 * If we run out of HWM processes and our available pages drops below the critical threshold, then,
	 * we target the least recently used process in order of increasing jetsam priority (exception: the FG band).
	 */
	while (memorystatus_action_needed()) {
		boolean_t killed;
		int32_t priority;
		uint32_t cause;
		uint64_t jetsam_reason_code = JETSAM_REASON_INVALID;
		os_reason_t jetsam_reason = OS_REASON_NULL;

		cause = kill_under_pressure_cause;
		switch (cause) {
			case kMemorystatusKilledFCThrashing:
				jetsam_reason_code = JETSAM_REASON_MEMORY_FCTHRASHING;
				break;
			case kMemorystatusKilledVMCompressorThrashing:
				jetsam_reason_code = JETSAM_REASON_MEMORY_VMCOMPRESSOR_THRASHING;
				break;
			case kMemorystatusKilledVMCompressorSpaceShortage:
				jetsam_reason_code = JETSAM_REASON_MEMORY_VMCOMPRESSOR_SPACE_SHORTAGE;
				break;
			case kMemorystatusKilledZoneMapExhaustion:
				jetsam_reason_code = JETSAM_REASON_ZONE_MAP_EXHAUSTION;
				break;
			case kMemorystatusKilledVMPageShortage:
				/* falls through */
			default:
				jetsam_reason_code = JETSAM_REASON_MEMORY_VMPAGESHORTAGE;
				cause = kMemorystatusKilledVMPageShortage;
				break;
		}

		/* Highwater */
		boolean_t is_critical = TRUE;
		if (memorystatus_act_on_hiwat_processes(&errors, &hwm_kill, &post_snapshot, &is_critical)) {
			if (is_critical == FALSE) {
				/*
				 * For now, don't kill any other processes.
				 */
				break;
			} else {
				goto done;
			}
		}

		jetsam_reason = os_reason_create(OS_REASON_JETSAM, jetsam_reason_code);
		if (jetsam_reason == OS_REASON_NULL) {
			printf("memorystatus_thread: failed to allocate jetsam reason\n");
		}

		if (memorystatus_act_aggressive(cause, jetsam_reason, &jld_idle_kills, &corpse_list_purged, &post_snapshot)) {
			goto done;
		}

		/*
		 * memorystatus_kill_top_process() drops a reference,
		 * so take another one so we can continue to use this exit reason
		 * even after it returns
		 */
		os_reason_ref(jetsam_reason);

		/* LRU */
		killed = memorystatus_kill_top_process(TRUE, sort_flag, cause, jetsam_reason, &priority, &errors);
		sort_flag = FALSE;

		if (killed) {
			if (memorystatus_post_snapshot(priority, cause) == TRUE) {

        			post_snapshot = TRUE;
			}

			/* Jetsam Loop Detection */
			if (memorystatus_jld_enabled == TRUE) {
				if ((priority == JETSAM_PRIORITY_IDLE) || (priority == system_procs_aging_band) || (priority == applications_aging_band)) {
					jld_idle_kills++;
				} else {
					/*
					 * We've reached into bands beyond idle deferred.
					 * We make no attempt to monitor them
					 */
				}
			}

			if ((priority >= JETSAM_PRIORITY_UI_SUPPORT) && (total_corpses_count() > 0) && (corpse_list_purged == FALSE)) {
				/*
				 * If we have jetsammed a process in or above JETSAM_PRIORITY_UI_SUPPORT
				 * then we attempt to relieve pressure by purging corpse memory.
				 */
				task_purge_all_corpses();
				corpse_list_purged = TRUE;
			}
			goto done;
		}
		
		if (memorystatus_avail_pages_below_critical()) {
			/*
			 * Still under pressure and unable to kill a process - purge corpse memory
			 */
			if (total_corpses_count() > 0) {
				task_purge_all_corpses();
				corpse_list_purged = TRUE;
			}

			if (memorystatus_avail_pages_below_critical()) {
				/*
				 * Still under pressure and unable to kill a process - panic
				 */
				panic("memorystatus_jetsam_thread: no victim! available pages:%llu\n", (uint64_t)memorystatus_available_pages);
			}
		}
			
done:		

		/*
		 * We do not want to over-kill when thrashing has been detected.
		 * To avoid that, we reset the flag here and notify the
		 * compressor.
		 */
		if (is_reason_thrashing(kill_under_pressure_cause)) {
			kill_under_pressure_cause = 0;
#if CONFIG_JETSAM
			vm_thrashing_jetsam_done();
#endif /* CONFIG_JETSAM */
		} else if (is_reason_zone_map_exhaustion(kill_under_pressure_cause)) {
			kill_under_pressure_cause = 0;
		}

		os_reason_free(jetsam_reason);
	}

	kill_under_pressure_cause = 0;
	
	if (errors) {
		memorystatus_clear_errors();
	}

	if (post_snapshot) {
		proc_list_lock();
		size_t snapshot_size = sizeof(memorystatus_jetsam_snapshot_t) +
			sizeof(memorystatus_jetsam_snapshot_entry_t) * (memorystatus_jetsam_snapshot_count);
		uint64_t timestamp_now = mach_absolute_time();
		memorystatus_jetsam_snapshot->notification_time = timestamp_now;
		memorystatus_jetsam_snapshot->js_gencount++;
		if (memorystatus_jetsam_snapshot_count > 0 && (memorystatus_jetsam_snapshot_last_timestamp == 0 ||
				timestamp_now > memorystatus_jetsam_snapshot_last_timestamp + memorystatus_jetsam_snapshot_timeout)) {
			proc_list_unlock();
			int ret = memorystatus_send_note(kMemorystatusSnapshotNote, &snapshot_size, sizeof(snapshot_size));
			if (!ret) {
				proc_list_lock();
				memorystatus_jetsam_snapshot_last_timestamp = timestamp_now;
				proc_list_unlock();
			}
		} else {
			proc_list_unlock();
		}
	}

	KERNEL_DEBUG_CONSTANT(BSDDBG_CODE(DBG_BSD_MEMSTAT, BSD_MEMSTAT_SCAN) | DBG_FUNC_END,
		memorystatus_available_pages, 0, 0, 0, 0);

	memorystatus_thread_block(0, memorystatus_thread);
}

There is a lot of code, so let us go through it piece by piece.

The trigger condition

The core logic sits in the while (memorystatus_action_needed()) loop, and memorystatus_action_needed() is the key condition that triggers an OOM kill.

/* Does cause indicate vm or fc thrashing? */
static boolean_t 
is_reason_thrashing(unsigned cause)
{
	switch (cause) {
	case kMemorystatusKilledFCThrashing:
	case kMemorystatusKilledVMCompressorThrashing:
	case kMemorystatusKilledVMCompressorSpaceShortage:
		return TRUE;
	default:
		return FALSE;
	}
}

/* Is the zone map almost full? */
static boolean_t 
is_reason_zone_map_exhaustion(unsigned cause)
{
	if (cause == kMemorystatusKilledZoneMapExhaustion)
		return TRUE;
	return FALSE;
}

static boolean_t memorystatus_action_needed(void)
{
	return (is_reason_thrashing(kill_under_pressure_cause) ||
			is_reason_zone_map_exhaustion(kill_under_pressure_cause) ||
	       memorystatus_available_pages <= memorystatus_available_pages_pressure);
}

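This decides whether memory is currently tight, based on the memory pressure reported by the vm_pageout daemon (which is actually a thread). Memory can be tight because of thrashing (frequent paging in and out that consumes excessive CPU), because kernel virtual memory is running out (checked in the code as zone-map exhaustion; the example given is someone copying 1 TB of data from disk into a ZFS pool), or because the number of available pages has dropped to or below the threshold memorystatus_available_pages_pressure.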
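
The condition can be modeled in user-space C to make its three branches easy to check. This is only a sketch: the kernel globals become function parameters, the CAUSE_* names are hypothetical, and their numeric values are the JETSAM_REASON_* values listed earlier in this article.

```c
#include <stdbool.h>
#include <stdint.h>

/* Cause codes, using the JETSAM_REASON_* values from kern_memorystatus.h. */
enum {
    CAUSE_NONE                      = 0,
    CAUSE_FC_THRASHING              = 6,
    CAUSE_ZONE_MAP_EXHAUSTION       = 10,
    CAUSE_COMPRESSOR_THRASHING      = 11,
    CAUSE_COMPRESSOR_SPACE_SHORTAGE = 12,
};

/* Mirrors is_reason_thrashing(): three causes count as thrashing. */
static bool is_thrashing(unsigned cause) {
    return cause == CAUSE_FC_THRASHING
        || cause == CAUSE_COMPRESSOR_THRASHING
        || cause == CAUSE_COMPRESSOR_SPACE_SHORTAGE;
}

/* Simplified model of memorystatus_action_needed(). */
static bool action_needed(unsigned cause,
                          uint64_t available_pages,
                          uint64_t pressure_threshold) {
    return is_thrashing(cause)
        || cause == CAUSE_ZONE_MAP_EXHAUSTION
        || available_pages <= pressure_threshold;
}
```

Note that the page check uses <=, so action is needed as soon as the available page count reaches the pressure threshold, even when no thrashing or zone-map cause is set.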

high-water

Once the condition holds, meaning memory is currently tight, execution first reaches memorystatus_act_on_hiwat_processes.

/* Highwater */
boolean_t is_critical = TRUE;
if (memorystatus_act_on_hiwat_processes(&errors, &hwm_kill, &post_snapshot, &is_critical)) {
	if (is_critical == FALSE) {
		/*
		 * For now, don't kill any other processes.
		 */
		break;
	} else {
		goto done;
	}
}

This is the key function that triggers high-water OOM kills.

static boolean_t
memorystatus_act_on_hiwat_processes(uint32_t *errors, uint32_t *hwm_kill, boolean_t *post_snapshot, __unused boolean_t *is_critical)
{
	boolean_t purged = FALSE;
	boolean_t killed = memorystatus_kill_hiwat_proc(errors, &purged);

	if (killed) {
		*hwm_kill = *hwm_kill + 1;
		*post_snapshot = TRUE;
		return TRUE;
	} else {
		if (purged == FALSE) {
			/* couldn't purge and couldn't kill */
			memorystatus_hwm_candidates = FALSE;
		}
	}

#if CONFIG_JETSAM
	/* No highwater processes to kill. Continue or stop for now? */
	if (!is_reason_thrashing(kill_under_pressure_cause) &&
		!is_reason_zone_map_exhaustion(kill_under_pressure_cause) &&
	    (memorystatus_available_pages > memorystatus_available_pages_critical)) {
		/*
		 * We are _not_ out of pressure but we are above the critical threshold and there's:
		 * - no compressor thrashing
		 * - enough zone memory
		 * - no more HWM processes left.
		 * For now, don't kill any other processes.
		 */
	
		if (*hwm_kill == 0) {
			memorystatus_thread_wasted_wakeup++;
		}

		*is_critical = FALSE;

		return TRUE;
	}
#endif /* CONFIG_JETSAM */

	return FALSE;
}

memorystatus_act_on_hiwat_processes calls memorystatus_kill_hiwat_proc directly.

static boolean_t
memorystatus_kill_hiwat_proc(uint32_t *errors, boolean_t *purged)
{
	pid_t aPid = 0;
	proc_t p = PROC_NULL, next_p = PROC_NULL;
	boolean_t new_snapshot = FALSE, killed = FALSE, freed_mem = FALSE;
	unsigned int i = 0;
	uint32_t aPid_ep;
	os_reason_t jetsam_reason = OS_REASON_NULL;
	KERNEL_DEBUG_CONSTANT(BSDDBG_CODE(DBG_BSD_MEMSTAT, BSD_MEMSTAT_JETSAM_HIWAT) | DBG_FUNC_START,
		memorystatus_available_pages, 0, 0, 0, 0);
	
	jetsam_reason = os_reason_create(OS_REASON_JETSAM, JETSAM_REASON_MEMORY_HIGHWATER);
	if (jetsam_reason == OS_REASON_NULL) {
		printf("memorystatus_kill_hiwat_proc: failed to allocate exit reason\n");
	}

	proc_list_lock();
	
	next_p = memorystatus_get_first_proc_locked(&i, TRUE);
	while (next_p) {
		uint64_t footprint_in_bytes = 0;
		uint64_t memlimit_in_bytes  = 0;
		boolean_t skip = 0;

		p = next_p;
		next_p = memorystatus_get_next_proc_locked(&i, p, TRUE);
		
		aPid = p->p_pid;
		aPid_ep = p->p_memstat_effectivepriority;
		
		if (p->p_memstat_state  & (P_MEMSTAT_ERROR | P_MEMSTAT_TERMINATED)) {
			continue;
		}
		
		/* skip if no limit set */
		if (p->p_memstat_memlimit <= 0) {
			continue;
		}

		footprint_in_bytes = get_task_phys_footprint(p->task);
		memlimit_in_bytes  = (((uint64_t)p->p_memstat_memlimit) * 1024ULL * 1024ULL);	/* convert MB to bytes */
		skip = (footprint_in_bytes <= memlimit_in_bytes);

#if CONFIG_JETSAM && (DEVELOPMENT || DEBUG)
		if (!skip && (memorystatus_jetsam_policy & kPolicyDiagnoseActive)) {
			if (p->p_memstat_state & P_MEMSTAT_DIAG_SUSPENDED) {
				continue;
			}
		}
#endif /* CONFIG_JETSAM && (DEVELOPMENT || DEBUG) */

#if CONFIG_FREEZE
		if (!skip) {
			if (p->p_memstat_state & P_MEMSTAT_LOCKED) {
				skip = TRUE;
			} else {
				skip = FALSE;
			}				
		}
#endif

		if (skip) {
			continue;
		} else {

			if (memorystatus_jetsam_snapshot_count == 0) {
				memorystatus_init_jetsam_snapshot_locked(NULL,0);
				new_snapshot = TRUE;
			}
	
			if (proc_ref_locked(p) == p) {
				/*
				 * Mark as terminated so that if exit1() indicates success, but the process (for example)
				 * is blocked in task_exception_notify(), it'll be skipped if encountered again - see
				 * <rdar://problem/13553476>. This is cheaper than examining P_LEXIT, which requires the
				 * acquisition of the proc lock.
				 */
				p->p_memstat_state |= P_MEMSTAT_TERMINATED;

				proc_list_unlock();
			} else {
				/*
				 * We need to restart the search again because
				 * proc_ref_locked _can_ drop the proc_list lock
				 * and we could have lost our stored next_p via
				 * an exit() on another core.
				 */
				i = 0;
				next_p = memorystatus_get_first_proc_locked(&i, TRUE);
				continue;
			}
		
			freed_mem = memorystatus_kill_proc(p, kMemorystatusKilledHiwat, jetsam_reason, &killed); /* purged and/or killed 'p' */

			/* Success? */
			if (freed_mem) {
				if (killed == FALSE) {
					/* purged 'p'..don't reset HWM candidate count */
					*purged = TRUE;

					proc_list_lock();
					p->p_memstat_state &= ~P_MEMSTAT_TERMINATED;
					proc_list_unlock();
				}
				proc_rele(p);
				goto exit;
			}
			/*
			 * Failure - first unwind the state,
			 * then fall through to restart the search.
			 */
			proc_list_lock();
			proc_rele_locked(p);
			p->p_memstat_state &= ~P_MEMSTAT_TERMINATED;
			p->p_memstat_state |= P_MEMSTAT_ERROR;
			*errors += 1;

			i = 0;
			next_p = memorystatus_get_first_proc_locked(&i, TRUE);
		}
	}
	
	proc_list_unlock();
	
exit:
	os_reason_free(jetsam_reason);

	/* Clear snapshot if freshly captured and no target was found */
	if (new_snapshot && !killed) {
		proc_list_lock();
		memorystatus_jetsam_snapshot->entry_count = memorystatus_jetsam_snapshot_count = 0;
		proc_list_unlock();
	}
	
	KERNEL_DEBUG_CONSTANT(BSDDBG_CODE(DBG_BSD_MEMSTAT, BSD_MEMSTAT_JETSAM_HIWAT) | DBG_FUNC_END, 
			      memorystatus_available_pages, killed ? aPid : 0, 0, 0, 0);

	return killed;
}

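It first calls memorystatus_get_first_proc_locked(&i, TRUE) to take the lowest-priority process from the priority queue. If that process has no limit set, or its footprint is within its limit (footprint_in_bytes <= memlimit_in_bytes), it moves on to the next-lowest process via memorystatus_get_next_proc_locked, until it finds a process that is over its limit. That process is passed to memorystatus_kill_proc with cause kMemorystatusKilledHiwat (the kill itself is carried out by memorystatus_do_kill), and the loop ends.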
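
Stripped of locking, snapshots, and error handling, the selection logic of the scan looks roughly like the following C sketch. proc_model_t and find_hiwat_victim are hypothetical names, and the input array is assumed to be ordered from lowest to highest Jetsam priority, as the bucket walk would produce.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a process for this sketch. */
typedef struct {
    int      pid;
    int32_t  memlimit_mb;      /* <= 0 means no limit is set */
    uint64_t footprint_bytes;  /* current physical footprint */
} proc_model_t;

/* Return the index of the first process whose footprint exceeds its limit,
 * or -1 if none does. procs must be ordered from lowest to highest priority. */
static int find_hiwat_victim(const proc_model_t *procs, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (procs[i].memlimit_mb <= 0) {
            continue;  /* skip if no limit set */
        }
        /* convert MB to bytes, as the kernel code does */
        uint64_t limit_bytes = (uint64_t)procs[i].memlimit_mb * 1024ULL * 1024ULL;
        if (procs[i].footprint_bytes > limit_bytes) {
            return (int)i;
        }
    }
    return -1;
}
```

Because the scan stops at the first match, a low-priority process that is only slightly over its limit is chosen ahead of a higher-priority process that is far over its limit.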

normal kill

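The high-water threshold is fairly high, though, so it is not easily triggered. If the high-water path cannot end any process, execution moves on to memorystatus_act_aggressive(), which is where most OOM kills happen.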

static boolean_t
memorystatus_act_aggressive(uint32_t cause, os_reason_t jetsam_reason, int *jld_idle_kills, boolean_t *corpse_list_purged, boolean_t *post_snapshot)
{
	if (memorystatus_jld_enabled == TRUE) {

		boolean_t killed;
		uint32_t errors = 0;

		/* Jetsam Loop Detection - locals */
		memstat_bucket_t *bucket;
		int		jld_bucket_count = 0;
		struct timeval	jld_now_tstamp = {0,0};
		uint64_t 	jld_now_msecs = 0;
		int		elevated_bucket_count = 0;

		/* Jetsam Loop Detection - statics */
		static uint64_t  jld_timestamp_msecs = 0;
		static int	 jld_idle_kill_candidates = 0;	/* Number of available processes in band 0,1 at start */
		static int	 jld_eval_aggressive_count = 0;		/* Bumps the max priority in aggressive loop */
		static int32_t   jld_priority_band_max = JETSAM_PRIORITY_UI_SUPPORT;
		/*
		 * Jetsam Loop Detection: attempt to detect
		 * rapid daemon relaunches in the lower bands.
		 */
		
		microuptime(&jld_now_tstamp);

		/*
		 * Ignore usecs in this calculation.
		 * msecs granularity is close enough.
		 */
		jld_now_msecs = (jld_now_tstamp.tv_sec * 1000);

		proc_list_lock();
		switch (jetsam_aging_policy) {
		case kJetsamAgingPolicyLegacy:
			bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
			jld_bucket_count = bucket->count;
			bucket = &memstat_bucket[JETSAM_PRIORITY_AGING_BAND1];
			jld_bucket_count += bucket->count;
			break;
		case kJetsamAgingPolicySysProcsReclaimedFirst:
		case kJetsamAgingPolicyAppsReclaimedFirst:
			bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
			jld_bucket_count = bucket->count;
			bucket = &memstat_bucket[system_procs_aging_band];
			jld_bucket_count += bucket->count;
			bucket = &memstat_bucket[applications_aging_band];
			jld_bucket_count += bucket->count;
			break;
		case kJetsamAgingPolicyNone:
		default:
			bucket = &memstat_bucket[JETSAM_PRIORITY_IDLE];
			jld_bucket_count = bucket->count;
			break;
		}

		bucket = &memstat_bucket[JETSAM_PRIORITY_ELEVATED_INACTIVE];
		elevated_bucket_count = bucket->count;

		proc_list_unlock();

		/*
		 * memorystatus_jld_eval_period_msecs is a tunable
		 * memorystatus_jld_eval_aggressive_count is a tunable
		 * memorystatus_jld_eval_aggressive_priority_band_max is a tunable
		 */
		if ( (jld_bucket_count == 0) || 
		     (jld_now_msecs > (jld_timestamp_msecs + memorystatus_jld_eval_period_msecs))) {

			/* 
			 * Refresh evaluation parameters 
			 */
			jld_timestamp_msecs	 = jld_now_msecs;
			jld_idle_kill_candidates = jld_bucket_count;
			*jld_idle_kills		 = 0;
			jld_eval_aggressive_count = 0;
			jld_priority_band_max	= JETSAM_PRIORITY_UI_SUPPORT;
		}

		if (*jld_idle_kills > jld_idle_kill_candidates) {
			jld_eval_aggressive_count++;

#if DEVELOPMENT || DEBUG
			printf("memorystatus: aggressive%d: beginning of window: %lld ms, : timestamp now: %lld ms\n",
					jld_eval_aggressive_count,
					jld_timestamp_msecs,
					jld_now_msecs);
			printf("memorystatus: aggressive%d: idle candidates: %d, idle kills: %d\n",
					jld_eval_aggressive_count,
					jld_idle_kill_candidates,
					*jld_idle_kills);
#endif /* DEVELOPMENT || DEBUG */

			if ((jld_eval_aggressive_count == memorystatus_jld_eval_aggressive_count) &&
			    (total_corpses_count() > 0) && (*corpse_list_purged == FALSE)) {
				/*
				 * If we reach this aggressive cycle, corpses might be causing memory pressure.
				 * So, in an effort to avoid jetsams in the FG band, we will attempt to purge
				 * corpse memory prior to this final march through JETSAM_PRIORITY_UI_SUPPORT.
				 */
				task_purge_all_corpses();
				*corpse_list_purged = TRUE;
			}
			else if (jld_eval_aggressive_count > memorystatus_jld_eval_aggressive_count) {
				/* 
				 * Bump up the jetsam priority limit (eg: the bucket index)
				 * Enforce bucket index sanity.
				 */
				if ((memorystatus_jld_eval_aggressive_priority_band_max < 0) || 
				    (memorystatus_jld_eval_aggressive_priority_band_max >= MEMSTAT_BUCKET_COUNT)) {
					/*
					 * Do nothing.  Stick with the default level.
					 */
				} else {
					jld_priority_band_max = memorystatus_jld_eval_aggressive_priority_band_max;
				}
			}

			/* Visit elevated processes first */
			while (elevated_bucket_count) {

				elevated_bucket_count--;

				/*
				 * memorystatus_kill_elevated_process() drops a reference,
				 * so take another one so we can continue to use this exit reason
				 * even after it returns.
				 */

				os_reason_ref(jetsam_reason);
				killed = memorystatus_kill_elevated_process(
					cause,
					jetsam_reason,
					JETSAM_PRIORITY_ELEVATED_INACTIVE,
					jld_eval_aggressive_count,
					&errors);

				if (killed) {
					*post_snapshot = TRUE;
					if (memorystatus_avail_pages_below_pressure()) {
						/*
						 * Still under pressure.
						 * Find another pinned processes.
						 */
						continue;
					} else {
						return TRUE;
					}
				} else {
					/*
					 * No pinned processes left to kill.
					 * Abandon elevated band.
					 */
					break;
				}
			}

			/*
			 * memorystatus_kill_top_process_aggressive() allocates its own
			 * jetsam_reason so the kMemorystatusKilledProcThrashing cause
			 * is consistent throughout the aggressive march.
			 */
			killed = memorystatus_kill_top_process_aggressive(
				kMemorystatusKilledProcThrashing,
				jld_eval_aggressive_count, 
				jld_priority_band_max, 
				&errors);
				
			if (killed) {
				/* Always generate logs after aggressive kill */
				*post_snapshot = TRUE;
				*jld_idle_kills = 0;
				return TRUE;
			} 
		}

		return FALSE;
	}

	return FALSE;
}

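First there is jld_bucket_count, the number of low-priority processes that can be killed outright. jetsam_aging_policy determines which priority bands are counted (normally the very-low-priority processes plus those that can be reclaimed at any time: JETSAM_PRIORITY_IDLE, system_procs_aging_band, and applications_aging_band).

If memory pressure persists, memorystatus_kill_elevated_process kills background processes. After each kill, memory pressure is re-checked against memorystatus_available_pages (through memorystatus_avail_pages_below_pressure() in the code above). If the available pages are still below the threshold, the next process is killed.

If memory pressure remains after all the low-priority processes have been killed, memorystatus_kill_top_process_aggressive kills the lowest-priority process that is left. This is the key to an FOOM: if the foreground process is by now the lowest-priority process remaining, an FOOM occurs.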
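The "Jetsam Loop Detection" window at the top of memorystatus_act_aggressive() is easy to miss in the full listing, so here is a minimal user-space sketch of just that bookkeeping. The type and function names (`sketch_jld_state_t`, `sketch_jld_evaluate`) are hypothetical, and the sketch deliberately leaves out the corpse purging, the elevated band, and the priority-band bump.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the static jld_* variables in the kernel function. */
typedef struct {
    uint64_t window_start_msecs;   /* plays the role of jld_timestamp_msecs */
    int      idle_kill_candidates; /* idle-band processes at window start */
    int      idle_kills;           /* idle-band kills counted in this window */
    int      aggressive_count;     /* aggressive rounds taken in this window */
} sketch_jld_state_t;

/*
 * Returns true when the aggressive path should run: more idle-band processes
 * were killed inside the current evaluation window than existed when the
 * window opened, which suggests daemons are being relaunched and killed
 * in a loop.
 */
bool sketch_jld_evaluate(sketch_jld_state_t *s, uint64_t now_msecs,
                         int idle_bucket_count, uint64_t eval_period_msecs)
{
    assert(s != NULL);

    /* Open a fresh window if the idle bands are empty or the window expired. */
    if (idle_bucket_count == 0 ||
        now_msecs > s->window_start_msecs + eval_period_msecs) {
        s->window_start_msecs   = now_msecs;
        s->idle_kill_candidates = idle_bucket_count;
        s->idle_kills           = 0;
        s->aggressive_count     = 0;
    }

    if (s->idle_kills > s->idle_kill_candidates) {
        s->aggressive_count++;
        return true;
    }
    return false;
}
```

In the kernel the idle_kills counter is incremented elsewhere, by the LRU path shown in the next section, whenever the killed process came from one of the idle or aging bands.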

Killing the top process by LRU

If memorystatus_act_aggressive above did not kill any process, the first process in the Jetsam queue has to be killed via LRU.

/*
 * memorystatus_kill_top_process() drops a reference,
 * so take another one so we can continue to use this exit reason
 * even after it returns
 */
os_reason_ref(jetsam_reason);

/* LRU */
killed = memorystatus_kill_top_process(TRUE, sort_flag, cause, jetsam_reason, &priority, &errors);
sort_flag = FALSE;

if (killed) {
	if (memorystatus_post_snapshot(priority, cause) == TRUE) {

			post_snapshot = TRUE;
	}

	/* Jetsam Loop Detection */
	if (memorystatus_jld_enabled == TRUE) {
		if ((priority == JETSAM_PRIORITY_IDLE) || (priority == system_procs_aging_band) || (priority == applications_aging_band)) {
			jld_idle_kills++;
		} else {
			/*
			 * We've reached into bands beyond idle deferred.
			 * We make no attempt to monitor them
			 */
		}
	}

	if ((priority >= JETSAM_PRIORITY_UI_SUPPORT) && (total_corpses_count() > 0) && (corpse_list_purged == FALSE)) {
		/*
		 * If we have jetsammed a process in or above JETSAM_PRIORITY_UI_SUPPORT
		 * then we attempt to relieve pressure by purging corpse memory.
		 */
		task_purge_all_corpses();
		corpse_list_purged = TRUE;
	}
	goto done;
}

if (memorystatus_avail_pages_below_critical()) {
	/*
	 * Still under pressure and unable to kill a process - purge corpse memory
	 */
	if (total_corpses_count() > 0) {
		task_purge_all_corpses();
		corpse_list_purged = TRUE;
	}

	if (memorystatus_avail_pages_below_critical()) {
		/*
		 * Still under pressure and unable to kill a process - panic
		 */
		panic("memorystatus_jetsam_thread: no victim! available pages:%llu\n", (uint64_t)memorystatus_available_pages);
	}
}

Once all of these steps have run, some cleanup work follows.


		/*
		 * We do not want to over-kill when thrashing has been detected.
		 * To avoid that, we reset the flag here and notify the
		 * compressor.
		 */
		if (is_reason_thrashing(kill_under_pressure_cause)) {
			kill_under_pressure_cause = 0;
#if CONFIG_JETSAM
			vm_thrashing_jetsam_done();
#endif /* CONFIG_JETSAM */
		} else if (is_reason_zone_map_exhaustion(kill_under_pressure_cause)) {
			kill_under_pressure_cause = 0;
		}

		os_reason_free(jetsam_reason);
	}

	kill_under_pressure_cause = 0;
	
	if (errors) {
		memorystatus_clear_errors();
	}

	if (post_snapshot) {
		proc_list_lock();
		size_t snapshot_size = sizeof(memorystatus_jetsam_snapshot_t) +
			sizeof(memorystatus_jetsam_snapshot_entry_t) * (memorystatus_jetsam_snapshot_count);
		uint64_t timestamp_now = mach_absolute_time();
		memorystatus_jetsam_snapshot->notification_time = timestamp_now;
		memorystatus_jetsam_snapshot->js_gencount++;
		if (memorystatus_jetsam_snapshot_count > 0 && (memorystatus_jetsam_snapshot_last_timestamp == 0 ||
				timestamp_now > memorystatus_jetsam_snapshot_last_timestamp + memorystatus_jetsam_snapshot_timeout)) {
			proc_list_unlock();
			int ret = memorystatus_send_note(kMemorystatusSnapshotNote, &snapshot_size, sizeof(snapshot_size));
			if (!ret) {
				proc_list_lock();
				memorystatus_jetsam_snapshot_last_timestamp = timestamp_now;
				proc_list_unlock();
			}
		} else {
			proc_list_unlock();
		}
	}

	KERNEL_DEBUG_CONSTANT(BSDDBG_CODE(DBG_BSD_MEMSTAT, BSD_MEMSTAT_SCAN) | DBG_FUNC_END,
		memorystatus_available_pages, 0, 0, 0, 0);

	memorystatus_thread_block(0, memorystatus_thread);

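When thrashing has been detected, the system does not want to over-kill. To avoid that, it resets the flag here and notifies the compressor. If the current memory snapshot needs to be recorded, it is recorded, and then the current thread blocks until it is woken again by the next OOM condition.

How OOM detection is triggered

Right after memorystatus_thread is initialized, it runs one OOM check immediately.

In task.c, when physical memory reaches its limit, a callback fires and calls memorystatus_on_ledger_footprint_exceeded, which synchronously triggers a per-process-limit OOM.

Similar paths, such as memorystatus_kill_on_vnode_limit, are also triggered synchronously. They all end up calling memorystatus_kill_process_sync, which kills the given process directly; if pid is -1, it kills the process at the head of the queue.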

static boolean_t 
memorystatus_kill_process_sync(pid_t victim_pid, uint32_t cause, os_reason_t jetsam_reason) {
	boolean_t res;

	uint32_t errors = 0;

	if (victim_pid == -1) {
		/* No pid, so kill first process */
		res = memorystatus_kill_top_process(TRUE, TRUE, cause, jetsam_reason, NULL, &errors);
	} else {
		res = memorystatus_kill_specific_process(victim_pid, cause, jetsam_reason);
	}
	
	if (errors) {
		memorystatus_clear_errors();
	}

	if (res == TRUE) {
		/* Fire off snapshot notification */
		proc_list_lock();
		size_t snapshot_size = sizeof(memorystatus_jetsam_snapshot_t) + 
			sizeof(memorystatus_jetsam_snapshot_entry_t) * memorystatus_jetsam_snapshot_count;
		uint64_t timestamp_now = mach_absolute_time();
		memorystatus_jetsam_snapshot->notification_time = timestamp_now;
		if (memorystatus_jetsam_snapshot_count > 0 && (memorystatus_jetsam_snapshot_last_timestamp == 0 ||
				timestamp_now > memorystatus_jetsam_snapshot_last_timestamp + memorystatus_jetsam_snapshot_timeout)) {
			proc_list_unlock();
			int ret = memorystatus_send_note(kMemorystatusSnapshotNote, &snapshot_size, sizeof(snapshot_size));
			if (!ret) {
				proc_list_lock();
				memorystatus_jetsam_snapshot_last_timestamp = timestamp_now;
				proc_list_unlock();
			}
		} else {
			proc_list_unlock();
		}
	}

	return res;
}

memorystatus_kill_on_VM_compressor_space_shortage, memorystatus_kill_on_VM_compressor_thrashing, and memorystatus_kill_on_FC_thrashing are all triggered asynchronously; that is, they call memorystatus_kill_process_async.

static boolean_t 
memorystatus_kill_process_async(pid_t victim_pid, uint32_t cause) {
	/*
	 * TODO: allow a general async path
	 *
	 * NOTE: If a new async kill cause is added, make sure to update memorystatus_thread() to
	 * add the appropriate exit reason code mapping.
	 */
	if ((victim_pid != -1) ||
			(cause != kMemorystatusKilledVMPageShortage &&
			cause != kMemorystatusKilledVMCompressorThrashing &&
			cause != kMemorystatusKilledVMCompressorSpaceShortage &&
			cause != kMemorystatusKilledFCThrashing &&
			cause != kMemorystatusKilledZoneMapExhaustion)) {
		return FALSE;
	}
    
	kill_under_pressure_cause = cause;
	memorystatus_thread_wake();
	return TRUE;
}

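In other words, this ultimately wakes memorystatus_thread, which runs the flow we have just walked through in the source.

In memorystatus_kill_on_zone_map_exhaustion(pid_t pid), the asynchronous method is called if pid is -1; otherwise the synchronous method is called.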
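To keep the synchronous and asynchronous entry points straight, the classification given in this section can be written down as a small lookup. This is a sketch only: the enum and `sketch_trigger_is_async` are hypothetical names made up for illustration, and the function merely restates which path each trigger takes according to the text above.

```c
#include <stdbool.h>

/* Hypothetical labels for the kill triggers discussed in this section. */
typedef enum {
    SKETCH_TRIGGER_LEDGER_FOOTPRINT_EXCEEDED,    /* memorystatus_on_ledger_footprint_exceeded */
    SKETCH_TRIGGER_VNODE_LIMIT,                  /* memorystatus_kill_on_vnode_limit */
    SKETCH_TRIGGER_VM_COMPRESSOR_SPACE_SHORTAGE, /* memorystatus_kill_on_VM_compressor_space_shortage */
    SKETCH_TRIGGER_VM_COMPRESSOR_THRASHING,      /* memorystatus_kill_on_VM_compressor_thrashing */
    SKETCH_TRIGGER_FC_THRASHING,                 /* memorystatus_kill_on_FC_thrashing */
    SKETCH_TRIGGER_ZONE_MAP_EXHAUSTION           /* memorystatus_kill_on_zone_map_exhaustion */
} sketch_trigger_t;

/*
 * Returns true if the trigger only wakes memorystatus_thread (async path),
 * false if the victim is killed on the caller's thread (sync path).
 * pid is only consulted for zone map exhaustion.
 */
bool sketch_trigger_is_async(sketch_trigger_t trigger, int pid)
{
    switch (trigger) {
    case SKETCH_TRIGGER_LEDGER_FOOTPRINT_EXCEEDED:
    case SKETCH_TRIGGER_VNODE_LIMIT:
        return false;
    case SKETCH_TRIGGER_VM_COMPRESSOR_SPACE_SHORTAGE:
    case SKETCH_TRIGGER_VM_COMPRESSOR_THRASHING:
    case SKETCH_TRIGGER_FC_THRASHING:
        return true;
    case SKETCH_TRIGGER_ZONE_MAP_EXHAUSTION:
        return pid == -1; /* no specific victim: let the jetsam thread choose */
    }
    return false; /* unknown trigger: treated as sync in this sketch */
}
```

The zone-map case also matches the guard in memorystatus_kill_process_async shown above, which refuses any request that names a specific victim_pid.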

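Summary of the source logic

  1. The Jetsam thread finishes initializing and receives memory pressure from outside.
  2. If the pressure is that the current process's physical memory has reached its limit, a per-process-limit OOM is triggered synchronously and the flow ends.
  3. If the pressure is of any other type, the Jetsam thread is woken. It enters the OOM logic when kill_under_pressure_cause is kMemorystatusKilledVMThrashing, kMemorystatusKilledFCThrashing, or kMemorystatusKilledZoneMapExhaustion, or when the currently available memory memorystatus_available_pages is below the threshold memorystatus_available_pages_pressure.
  4. Starting from the lowest priority, each process is checked against its limit based on phys_footprint. If it is not over the limit, the search continues with the next-lowest-priority process; once one is found, a high-water OOM is triggered.
  5. At this point, lower-priority processes and processes that can normally be reclaimed at any time are reclaimed first, and the check in step 4 runs again.
  6. Once all low-priority processes and normally reclaimable processes have been killed, if memorystatus_available_pages is still below the threshold, background processes are killed first. After each kill, memorystatus_available_pages is checked again; as soon as it is no longer below the threshold, the thread blocks and waits to be woken.
  7. Once all background processes have been killed, memorystatus_kill_top_process_aggressive is called to kill the foreground process, then the thread blocks and waits to be woken.
  8. If memorystatus_kill_top_process_aggressive above did not kill any process, the first process in the Jetsam queue is killed via LRU, then the thread blocks and waits to be woken.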
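The ladder of steps above can be condensed into a single decision function. What follows is a rough sketch, not the kernel's control flow: every name in it (`sketch_inputs_t`, `sketch_stage_t`, `sketch_jetsam_stage`) is hypothetical, the outcome of each kill attempt is passed in as a boolean instead of being computed, and the strict "below threshold" comparison simply follows the wording of step 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Which stage ends up doing the kill in this simplified model. */
typedef enum {
    SKETCH_STAGE_NONE,              /* no pressure: nothing is killed */
    SKETCH_STAGE_PER_PROCESS_LIMIT, /* step 2: synchronous per-process-limit kill */
    SKETCH_STAGE_HIGH_WATER,        /* step 4: a process over its high-water limit */
    SKETCH_STAGE_AGGRESSIVE,        /* steps 5-7: memorystatus_act_aggressive() */
    SKETCH_STAGE_LRU_TOP_PROCESS    /* step 8: first process in the queue */
} sketch_stage_t;

/* Hypothetical snapshot of the conditions the jetsam thread looks at. */
typedef struct {
    bool     footprint_limit_exceeded;     /* ledger callback fired */
    bool     thrashing_or_zone_exhaustion; /* kill_under_pressure_cause is set */
    uint32_t available_pages;
    uint32_t pressure_threshold_pages;
    bool     hiwat_victim_found;           /* high-water scan found a victim */
    bool     aggressive_kill_succeeded;    /* aggressive path killed something */
} sketch_inputs_t;

sketch_stage_t sketch_jetsam_stage(const sketch_inputs_t *in)
{
    assert(in != NULL);

    if (in->footprint_limit_exceeded) {
        return SKETCH_STAGE_PER_PROCESS_LIMIT;
    }
    if (!in->thrashing_or_zone_exhaustion &&
        !(in->available_pages < in->pressure_threshold_pages)) {
        return SKETCH_STAGE_NONE;
    }
    if (in->hiwat_victim_found) {
        return SKETCH_STAGE_HIGH_WATER;
    }
    if (in->aggressive_kill_succeeded) {
        return SKETCH_STAGE_AGGRESSIVE;
    }
    return SKETCH_STAGE_LRU_TOP_PROCESS;
}
```

Reading the function top to bottom gives the same order as the list: each later stage is only reached when every earlier one failed to kill anything.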

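How to determine that an OOM occurred

Both Facebook and WeChat's Matrix use a process of elimination. When Matrix initializes, it calls the checkRebootType method to decide whether an OOM occurred. The flow is as follows:

  1. If the device is currently being debugged, return immediately and do nothing further.
  2. Did the previous launch end in an ordinary crash? If not, continue.
  3. During the previous launch, did the user quit the app deliberately (detected by observing UIApplicationWillTerminateNotification)? If not, continue.
  4. During the previous launch, was an exit-family function called (monitored through atexit)? If not, continue.
  5. During the previous launch, was the app suspended or running a background fetch? If so, and it was not killed by the watchdog, this is a kind of OOM that Matrix calls Suspend OOM. If not, continue.
  6. Did the app's uuid change? If not, continue.
  7. Was the system upgraded since the previous launch? If not, continue.
  8. Was the device rebooted since the previous launch? If not, continue.
  9. Was the app in the background during the previous launch? If so, a Background OOM occurred. If not, continue.
  10. Was the app in the foreground during the previous launch, and was the main thread stuck? If it was not stuck, a Foreground OOM occurred.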
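The elimination order above maps naturally onto a chain of early returns. The following is a minimal sketch in C, not Matrix's code: the struct fields, enum values, and `sketch_check_reboot_type` are hypothetical names, the inputs are assumed to have been persisted during the previous launch, and the results returned for "killed by the watchdog while suspended" and "main thread stuck" are labels chosen for this sketch, since the list only says those cases are not OOMs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    SKETCH_REBOOT_UNKNOWN,         /* debugging, or nothing matched */
    SKETCH_REBOOT_CRASH,
    SKETCH_REBOOT_QUIT_BY_USER,
    SKETCH_REBOOT_QUIT_BY_EXIT,
    SKETCH_REBOOT_SUSPEND_WATCHDOG, /* label assumed by this sketch */
    SKETCH_REBOOT_SUSPEND_OOM,
    SKETCH_REBOOT_APP_UPDATED,
    SKETCH_REBOOT_OS_UPDATED,
    SKETCH_REBOOT_DEVICE_REBOOTED,
    SKETCH_REBOOT_BACKGROUND_OOM,
    SKETCH_REBOOT_MAIN_THREAD_STUCK, /* label assumed by this sketch */
    SKETCH_REBOOT_FOREGROUND_OOM
} sketch_reboot_type_t;

/* Hypothetical facts recorded during the previous launch. */
typedef struct {
    bool debugger_attached;
    bool crashed;
    bool quit_by_user;          /* UIApplicationWillTerminateNotification seen */
    bool quit_by_exit;          /* atexit handler ran */
    bool suspended_or_bg_fetch;
    bool killed_by_watchdog;
    bool app_uuid_changed;
    bool os_version_changed;
    bool device_rebooted;
    bool was_in_background;
    bool was_in_foreground;
    bool main_thread_stuck;
} sketch_last_run_t;

sketch_reboot_type_t sketch_check_reboot_type(const sketch_last_run_t *r)
{
    assert(r != NULL);

    if (r->debugger_attached) return SKETCH_REBOOT_UNKNOWN;
    if (r->crashed)           return SKETCH_REBOOT_CRASH;
    if (r->quit_by_user)      return SKETCH_REBOOT_QUIT_BY_USER;
    if (r->quit_by_exit)      return SKETCH_REBOOT_QUIT_BY_EXIT;
    if (r->suspended_or_bg_fetch) {
        return r->killed_by_watchdog ? SKETCH_REBOOT_SUSPEND_WATCHDOG
                                     : SKETCH_REBOOT_SUSPEND_OOM;
    }
    if (r->app_uuid_changed)   return SKETCH_REBOOT_APP_UPDATED;
    if (r->os_version_changed) return SKETCH_REBOOT_OS_UPDATED;
    if (r->device_rebooted)    return SKETCH_REBOOT_DEVICE_REBOOTED;
    if (r->was_in_background)  return SKETCH_REBOOT_BACKGROUND_OOM;
    if (r->was_in_foreground) {
        return r->main_thread_stuck ? SKETCH_REBOOT_MAIN_THREAD_STUCK
                                    : SKETCH_REBOOT_FOREGROUND_OOM;
    }
    return SKETCH_REBOOT_UNKNOWN;
}
```

Because this is elimination rather than direct detection, the order of the checks matters: a Foreground OOM is only reported once every other explanation for the app's disappearance has been ruled out.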

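The OOMs we usually talk about are, in fact, mostly FOOMs. When our app is in the background its priority is low, so even if it does not use much memory itself, it may still be killed because the foreground app is using a lot. That is how the system works, and there is not much we can do about it.

So the main focus is FOOM. For FOOM, pay particular attention to dirty pages and IOKit mappings, and also keep an eye on the caches the system maintains, such as images and fonts. For monitoring and fixing OOM problems, the two open-source libraries Matrix and OOMDetector are worth consulting. OOM monitoring is still at an exploratory stage for us; once we have gained some experience in monitoring and handling OOM, we will share it.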

