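Out of memory: analyzing the OOM killer output

When an out-of-memory condition occurs, the out_of_memory function picks a process that the kernel judges "guilty" of allocating too much memory and kills it.

This is very likely to free a good number of pages, after which the kernel jumps back and retries the memory allocation.

We will not discuss the flow of out_of_memory or its policy for choosing a victim here.

We only discuss what the kernel output means when an out-of-memory condition occurs.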


1. OOM output

Below is a typical piece of out-of-memory kernel output:

<4>[12345.342532] systemd-journal invoked oom-killer: gfp_mask=0x800d0, order=0, oom_score_adj=0
<4>[12345.351216] CPU: 1 PID: 1371 Comm: systemd-journal Tainted: G           O 3.14.31-00017-g40fab71 #1
<4>[12345.360695] Backtrace:
<4>[12345.363263] [<c0012fcc>] (dump_backtrace) from [<c00131a4>] (show_stack+0x20/0x24)
<4>[12345.371192]  r6:00000000 r5:ffffffff r4:00000000 r3:bd943631
<4>[12345.377136] [<c0013184>] (show_stack) from [<c07bbe78>] (dump_stack+0x7c/0xc8)
<4>[12345.384710] [<c07bbdfc>] (dump_stack) from [<c07ba7e4>] (dump_header.isra.14+0x74/0x188)
<4>[12345.393184]  r6:000800d0 r5:00000000 r4:e8088000 r3:00000002
<4>[12345.399126] [<c07ba770>] (dump_header.isra.14) from [<c00f8a28>] (oom_kill_process+0x230/0x3e0)
<4>[12345.408234]  r10:00000000 r8:000800d0 r7:00000000 r6:c0b89aa8 r5:000800d0 r4:e9bb79c0
<4>[12345.416462] [<c00f87f8>] (oom_kill_process) from [<c00f90c8>] (out_of_memory+0x2f4/0x354)
<4>[12345.425024]  r10:00000000 r9:00000000 r8:000800d0 r7:00000000 r6:c0b89aa8 r5:c0b89d08
<4>[12345.433249]  r4:c0b89aa8
<4>[12345.435903] [<c00f8dd4>] (out_of_memory) from [<c00fd6c8>] (__alloc_pages_nodemask+0x93c/0x988)
<4>[12345.445011]  r10:00000000 r9:c0c38fc0 r8:c0b871d8 r7:e8088000 r6:c0c39bc0 r5:00000000
<4>[12345.453234]  r4:000800d0
<4>[12345.455887] [<c00fcd8c>] (__alloc_pages_nodemask) from [<c00fd734>] (__get_free_pages+0x20/0x3c)
<4>[12345.465087]  r10:e97d36a8 r9:00000063 r8:e8089f6c r7:00000063 r6:b6f79f68 r5:e97d36a8
<4>[12345.473311]  r4:00000000
<4>[12345.475965] [<c00fd714>] (__get_free_pages) from [<c0196878>] (proc_pid_readlink+0x68/0x110)
<4>[12345.484808] [<c0196810>] (proc_pid_readlink) from [<c013dcb8>] (SyS_readlinkat+0xf0/0x104)
<4>[12345.493461]  r7:bea40520 r6:ffffff9c r5:00004000 r4:00000000
<4>[12345.499402] [<c013dbc8>] (SyS_readlinkat) from [<c000eee0>] (ret_fast_syscall+0x0/0x34)
<4>[12345.507785]  r10:00000000 r9:e8088000 r8:c000f148 r7:0000014c r6:00000063 r5:b6f79f68
<4>[12345.516011]  r4:00000064
<4>[12345.518663] Mem-info:
<4>[12345.521049] Normal per-cpu:
<4>[12345.523969] CPU    0: hi:   42, btch:   7 usd:  23
<4>[12345.528979] CPU    1: hi:   42, btch:   7 usd:  25
<4>[12345.534004] HighMem per-cpu:
<4>[12345.537013] CPU    0: hi:  186, btch:  31 usd:  27
<4>[12345.542199] CPU    1: hi:  186, btch:  31 usd:  29
<4>[12345.547247] active_anon:21860 inactive_anon:14790 isolated_anon:0
<4>[12345.547247]  active_file:41585 inactive_file:10422 isolated_file:0
<4>[12345.547247]  unevictable:0 dirty:9 writeback:205 unstable:0
<4>[12345.547247]  free:285748 slab_reclaimable:2100 slab_unreclaimable:26286
<4>[12345.547247]  mapped:26079 shmem:14857 pagetables:687 bounce:0
<4>[12345.547247]  free_cma:57779
<4>[12345.581839] Normal free:233460kB min:2488kB low:3108kB high:3732kB active_anon:17312kB 
inactive_anon:10824kB active_file:128kB inactive_file:4kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:774144kB managed:387568kB mlocked:0kB dirty:16kB writeback:76kB 
mapped:3296kB shmem:10840kB slab_reclaimable:8400kB slab_unreclaimable:105144kB kernel_stack:1168kB 
pagetables:2748kB unstable:0kB bounce:0kB free_cma:231116kB writeback_tmp:0kB pages_scanned:1648 
all_unreclaimable? yes
<4>[12345.627014] lowmem_reserve[]: 0 10168 10168
<4>[12345.631565] HighMem free:909036kB min:512kB low:2604kB high:4696kB active_anon:70632kB 
inactive_anon:48336kB active_file:166212kB inactive_file:41684kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:1301504kB managed:1301504kB mlocked:0kB dirty:20kB writeback:744kB 
mapped:101020kB shmem:48588kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB 
pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 
all_unreclaimable? no
<4>[12345.675614] lowmem_reserve[]: 0 0 0
<4>[12345.679437] Normal: 1165*4kB (MRC) 1122*8kB (RC) 1119*16kB (RC) 1118*32kB (C) 1068*64kB (RC) 
748*128kB (C) 0*256kB 0*512kB 0*1024kB 1*2048kB (R) 0*4096kB 0*8192kB = 233460kB
<4>[12345.695797] HighMem: 99*4kB (M) 1148*8kB (UM) 1314*16kB (UM) 880*32kB (UM) 327*64kB (M) 
87*128kB (M) 34*256kB (M) 38*512kB (M) 12*1024kB (M) 10*2048kB (M) 3*4096kB (M) 91*8192kB (UMR) = 909516kB
<4>[12345.714293] 66770 total pagecache pages
<4>[12345.718309] 0 pages in swap cache
<4>[12345.724832] Swap cache stats: add 0, delete 0, find 0/0
<4>[12345.730308] Free swap  = 0kB
<4>[12345.733412] Total swap = 0kB
<4>[12345.747245] 520192 pages of RAM
<4>[12345.750577] 286253 free pages
<4>[12345.753778] 97924 reserved pages
<4>[12345.757258] 28061 slab pages
<4>[12345.760574] 115601 pages shared
<4>[12345.764283] 0 pages swap cached
<6>[12345.767572] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
<6>[12345.775906] [ 1366]     0  1366      459      125       3        0             0 sh
<6>[12345.785861] [ 1367]     0  1367      665      235       4        0             0 propertyd
<6>[12345.794802] [ 1368]     0  1368    26553     8835      58        0             0 seed
<6>[12345.803296] [ 1371]     0  1371     1648      772       5        0             0 systemd-journal
<6>[12345.812792] [ 1375]     0  1375      750      300       4        0         -1000 systemd-udevd
<6>[12345.822449] [ 2416]  1040  2416     3852      510       7        0             0 secd
<6>[12345.831341] [ 2419]     0  2419     6678      923       9        0             0 storagemanagerd
<6>[12345.840944] [ 2420]     0  2420     1267      497       5        0             0 connmand
<6>[12345.849566] [ 2422]     0  2422     4484      687       8        0             0 uuid
<6>[12345.857843] [ 2424]     0  2424     1161      358       5        0             0 connman-vpnd
<6>[12345.867271] [ 2427]  1000  2427     1593      461       6        0             0 logboxd
<6>[12345.875846] [ 2432]     0  2432     9483     1718      15        0             0 cmns
<6>[12345.884104] [ 2451]    81  2451     1355      474       4        0          -900 dbus-daemon
<6>[12345.893018] [ 2532]     0  2532    11794      246      10        0             0 adbd
<6>[12345.901304] [ 2535]     0  2535     1502      347       5        0             0 wpa_supplicant
<6>[12345.910473] [ 2536]     0  2536    12820      866      12        0             0 udisksd
<6>[12345.919119] [ 2537]     0  2537     1898      527       6        0             0 tyid
<6>[12345.927361] [ 2540]     0  2540    10076     2157      16        0             0 datamanagerd
<6>[12345.936349] [ 2554]     0  2554     5983      574       7        0             0 connectivityser
<6>[12345.945635] [ 2558]     0  2558    10604     5388      21        0             0 weston
<6>[12345.964101] [ 2589]     0  2589    14597     1917      17        0             0 pagemanagerd
<6>[12345.973272] [ 2590]     0  2590     3832      515       7        0             0 amt
<6>[12345.981730] [ 2593]     0  2593     6176     1343      12        0             0 weston-desktop-
<6>[12345.991046] [ 2599]     0  2599     7185      761      12        0             0 scim-launcher
<6>[12346.098925] [ 5580]     0  5580      458      116       3        0             0 sh
<6>[12346.107065] [ 5581]     0  5581      492      175       3        0             0 gzip
<3>[12346.115335] Out of memory: Kill process 5575 thread_x score 481 or sacrifice child
<3>[12346.124212] Killed process 5575 thread_x total-vm:106212kB, anon-rss:18036kB, file-rss:2704kB

2 Analyzing the OOM output

2.1

<4>[12345.342532] systemd-journal invoked oom-killer: gfp_mask=0x800d0, order=0, oom_score_adj=0
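systemd-journal: the current process is systemd-journal; its page allocation request triggered the oom-killer.

gfp_mask=0x800d0: the GFP flags passed to alloc_page. In this case they are ___GFP_RECLAIMABLE | ___GFP_FS | ___GFP_IO | ___GFP_WAIT, that is, GFP_KERNEL plus ___GFP_RECLAIMABLE with the flag values of this 3.14 kernel.

order=0: the order passed to alloc_page is 0, so only 2^0 = 1 page was requested.

oom_score_adj=0: the adjustment applied to this process's OOM score, which biases how likely it is to be killed. The range is -1000 (never kill) to 1000 (most likely to be killed); 0 means no adjustment.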
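To make the decoding of the first line concrete, here is a minimal sketch that splits the header into its fields and names the bits set in gfp_mask. The flag table is an assumption taken from the 3.14-era include/linux/gfp.h and lists only the low bits plus ___GFP_RECLAIMABLE; other kernel versions number these bits differently. The helper names decode_gfp_mask and parse_oom_header are made up for this example.

```python
import re

# Bit values as found in include/linux/gfp.h of 3.14-era kernels.
# Assumption: only the bits relevant to this log are listed, and newer
# kernels use different values for several of them.
GFP_FLAGS = {
    0x01: "___GFP_DMA",
    0x02: "___GFP_HIGHMEM",
    0x04: "___GFP_DMA32",
    0x08: "___GFP_MOVABLE",
    0x10: "___GFP_WAIT",
    0x20: "___GFP_HIGH",
    0x40: "___GFP_IO",
    0x80: "___GFP_FS",
    0x80000: "___GFP_RECLAIMABLE",
}

HEADER_RE = re.compile(
    r"(\S+) invoked oom-killer: gfp_mask=(0x[0-9a-fA-F]+), "
    r"order=(\d+), oom_score_adj=(-?\d+)"
)


def decode_gfp_mask(mask):
    """Return the names of the known flag bits set in mask, lowest bit first."""
    return [name for bit, name in sorted(GFP_FLAGS.items()) if mask & bit]


def parse_oom_header(line):
    """Split the 'invoked oom-killer' line into its fields."""
    m = HEADER_RE.search(line)
    if m is None:
        return None
    comm, mask, order, adj = m.groups()
    return {
        "comm": comm,
        "gfp_flags": decode_gfp_mask(int(mask, 16)),
        "pages_requested": 2 ** int(order),  # an order-n request asks for 2^n pages
        "oom_score_adj": int(adj),
    }


HEADER = ("<4>[12345.342532] systemd-journal invoked oom-killer: "
          "gfp_mask=0x800d0, order=0, oom_score_adj=0")
info = parse_oom_header(HEADER)
print(info)
```

For 0x800d0 the set bits are 0x10, 0x40, 0x80 and 0x80000, so ___GFP_HIGHMEM is absent and the request cannot be served from the HighMem zone.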


<4>[12345.351216] CPU: 1 PID: 1371 Comm: systemd-journal Tainted: G           O 3.14.31-00017-g40fab71 #1
<4>[12345.360695] Backtrace:
<4>[12345.363263] [<c0012fcc>] (dump_backtrace) from [<c00131a4>] (show_stack+0x20/0x24)
<4>[12345.371192]  r6:00000000 r5:ffffffff r4:00000000 r3:bd943631
<4>[12345.377136] [<c0013184>] (show_stack) from [<c07bbe78>] (dump_stack+0x7c/0xc8)
<4>[12345.384710] [<c07bbdfc>] (dump_stack) from [<c07ba7e4>] (dump_header.isra.14+0x74/0x188)
<4>[12345.393184]  r6:000800d0 r5:00000000 r4:e8088000 r3:00000002
<4>[12345.399126] [<c07ba770>] (dump_header.isra.14) from [<c00f8a28>] (oom_kill_process+0x230/0x3e0)
<4>[12345.408234]  r10:00000000 r8:000800d0 r7:00000000 r6:c0b89aa8 r5:000800d0 r4:e9bb79c0
<4>[12345.416462] [<c00f87f8>] (oom_kill_process) from [<c00f90c8>] (out_of_memory+0x2f4/0x354)
<4>[12345.425024]  r10:00000000 r9:00000000 r8:000800d0 r7:00000000 r6:c0b89aa8 r5:c0b89d08
<4>[12345.433249]  r4:c0b89aa8
<4>[12345.435903] [<c00f8dd4>] (out_of_memory) from [<c00fd6c8>] (__alloc_pages_nodemask+0x93c/0x988)
<4>[12345.445011]  r10:00000000 r9:c0c38fc0 r8:c0b871d8 r7:e8088000 r6:c0c39bc0 r5:00000000
<4>[12345.453234]  r4:000800d0
<4>[12345.455887] [<c00fcd8c>] (__alloc_pages_nodemask) from [<c00fd734>] (__get_free_pages+0x20/0x3c)
<4>[12345.465087]  r10:e97d36a8 r9:00000063 r8:e8089f6c r7:00000063 r6:b6f79f68 r5:e97d36a8
<4>[12345.473311]  r4:00000000
<4>[12345.475965] [<c00fd714>] (__get_free_pages) from [<c0196878>] (proc_pid_readlink+0x68/0x110)
<4>[12345.484808] [<c0196810>] (proc_pid_readlink) from [<c013dcb8>] (SyS_readlinkat+0xf0/0x104)
<4>[12345.493461]  r7:bea40520 r6:ffffff9c r5:00004000 r4:00000000
<4>[12345.499402] [<c013dbc8>] (SyS_readlinkat) from [<c000eee0>] (ret_fast_syscall+0x0/0x34)
<4>[12345.507785]  r10:00000000 r9:e8088000 r8:c000f148 r7:0000014c r6:00000063 r5:b6f79f68
<4>[12345.516011]  r4:00000064

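dump_header->dump_stack prints the call stack that led to the OOM. The call chain starts at ret_fast_syscall and ends in dump_backtrace.

From this output we can infer that systemd-journal made a readlink system call, which triggered a page allocation, and that allocation led to the OOM.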
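Reading the backtrace bottom-up by eye works, but the chain can also be extracted mechanically. The sketch below assumes the ARM backtrace format shown above, "(callee) from [<addr>] (caller+offset/size)", and skips the register dump lines; call_chain and SAMPLE are names invented for this example, and SAMPLE is a short excerpt of the log.

```python
import re

# Matches "(callee) from [<addr>] (caller+0xoff/0xsize)" as printed by
# the ARM dump_backtrace in this log. Other architectures print frames
# differently, so this pattern is an assumption tied to this format.
FRAME_RE = re.compile(
    r"\((\S+?)\) from \[<[0-9a-f]+>\] \((\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+\)"
)


def call_chain(lines):
    """Return function names ordered from outermost caller to innermost callee."""
    chain = []
    for line in lines:
        m = FRAME_RE.search(line)
        if m is None:
            continue  # register dump or unrelated line
        callee, caller = m.groups()
        if not chain:
            chain.append(callee)
        chain.append(caller)
    return list(reversed(chain))


SAMPLE = [
    "<4>[12345.475965] [<c00fd714>] (__get_free_pages) from [<c0196878>] (proc_pid_readlink+0x68/0x110)",
    "<4>[12345.484808] [<c0196810>] (proc_pid_readlink) from [<c013dcb8>] (SyS_readlinkat+0xf0/0x104)",
    "<4>[12345.493461]  r7:bea40520 r6:ffffff9c r5:00004000 r4:00000000",
    "<4>[12345.499402] [<c013dbc8>] (SyS_readlinkat) from [<c000eee0>] (ret_fast_syscall+0x0/0x34)",
]
chain = call_chain(SAMPLE)
print(" -> ".join(chain))
```

On the excerpt this yields the syscall entry first and the page allocator last, which is the order in which the calls actually happened.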


2.2

<4>[12345.518663] Mem-info:
<4>[12345.521049] Normal per-cpu:
<4>[12345.523969] CPU    0: hi:   42, btch:   7 usd:  23
<4>[12345.528979] CPU    1: hi:   42, btch:   7 usd:  25
<4>[12345.534004] HighMem per-cpu:
<4>[12345.537013] CPU    0: hi:  186, btch:  31 usd:  27
<4>[12345.542199] CPU    1: hi:  186, btch:  31 usd:  29
<4>[12345.547247] active_anon:21860 inactive_anon:14790 isolated_anon:0
<4>[12345.547247]  active_file:41585 inactive_file:10422 isolated_file:0
<4>[12345.547247]  unevictable:0 dirty:9 writeback:205 unstable:0
<4>[12345.547247]  free:285748 slab_reclaimable:2100 slab_unreclaimable:26286
<4>[12345.547247]  mapped:26079 shmem:14857 pagetables:687 bounce:0
<4>[12345.547247]  free_cma:57779
<4>[12345.581839] Normal free:233460kB min:2488kB low:3108kB high:3732kB active_anon:17312kB 
inactive_anon:10824kB active_file:128kB inactive_file:4kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:774144kB managed:387568kB mlocked:0kB dirty:16kB writeback:76kB 
mapped:3296kB shmem:10840kB slab_reclaimable:8400kB slab_unreclaimable:105144kB kernel_stack:1168kB 
pagetables:2748kB unstable:0kB bounce:0kB free_cma:231116kB writeback_tmp:0kB pages_scanned:1648 
all_unreclaimable? yes
<4>[12345.627014] lowmem_reserve[]: 0 10168 10168
<4>[12345.631565] HighMem free:909036kB min:512kB low:2604kB high:4696kB active_anon:70632kB 
inactive_anon:48336kB active_file:166212kB inactive_file:41684kB unevictable:0kB isolated(anon):0kB 
isolated(file):0kB present:1301504kB managed:1301504kB mlocked:0kB dirty:20kB writeback:744kB 
mapped:101020kB shmem:48588kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB 
pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
<4>[12345.675614] lowmem_reserve[]: 0 0 0
<4>[12345.679437] Normal: 1165*4kB (MRC) 1122*8kB (RC) 1119*16kB (RC) 1118*32kB (C) 1068*64kB (RC) 
748*128kB (C) 0*256kB 0*512kB 0*1024kB 1*2048kB (R) 0*4096kB 0*8192kB = 233460kB
<4>[12345.695797] HighMem: 99*4kB (M) 1148*8kB (UM) 1314*16kB (UM) 880*32kB (UM) 327*64kB (M) 
87*128kB (M) 34*256kB (M) 38*512kB (M) 12*1024kB (M) 10*2048kB (M) 3*4096kB (M) 91*8192kB (UMR) = 909516kB
<4>[12345.714293] 66770 total pagecache pages
<4>[12345.718309] 0 pages in swap cache
<4>[12345.724832] Swap cache stats: add 0, delete 0, find 0/0
<4>[12345.730308] Free swap  = 0kB
<4>[12345.733412] Total swap = 0kB
<4>[12345.747245] 520192 pages of RAM
<4>[12345.750577] 286253 free pages
<4>[12345.753778] 97924 reserved pages
<4>[12345.757258] 28061 slab pages
<4>[12345.760574] 115601 pages shared
<4>[12345.764283] 0 pages swap cached
dump_header->show_mem prints the current system memory information.

2.2.1

<4>[12345.521049] Normal per-cpu:
<4>[12345.523969] CPU    0: hi:   42, btch:   7 usd:  23
<4>[12345.528979] CPU    1: hi:   42, btch:   7 usd:  25
<4>[12345.534004] HighMem per-cpu:
<4>[12345.537013] CPU    0: hi:  186, btch:  31 usd:  27
<4>[12345.542199] CPU    1: hi:  186, btch:  31 usd:  29
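Each memory zone has a per-CPU page frame cache. Every per-CPU cache holds some pre-allocated page frames that are used to satisfy single-page requests from the local CPU.

CPU    0: hi:   42, btch:   7 usd:  23

means, for CPU 0:

hi: 42 is the upper limit; once the cache grows beyond this number, batch page frames are released back to the buddy system

btch: 7 is the chunk size, in page frames, used when adding frames to or removing them from the cache

usd: 23 is the number of page frames currently in the cache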
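As a small illustration of the three fields, the sketch below parses one per-CPU line and reports how many more frames the cache can take before it reaches hi. parse_pcp and headroom are hypothetical helper names, and the drain rule is simplified to a plain comparison against hi.

```python
import re

PCP_RE = re.compile(r"CPU\s+(\d+): hi:\s*(\d+), btch:\s*(\d+) usd:\s*(\d+)")


def parse_pcp(line):
    """Parse a 'CPU n: hi: .., btch: .. usd: ..' line into a dict of ints."""
    m = PCP_RE.search(line)
    if m is None:
        return None
    cpu, high, batch, used = map(int, m.groups())
    return {"cpu": cpu, "high": high, "batch": batch, "used": used}


def headroom(pcp):
    """Frames that can still be cached before the list reaches 'high'."""
    return max(pcp["high"] - pcp["used"], 0)


pcp = parse_pcp("<4>[12345.523969] CPU    0: hi:   42, btch:   7 usd:  23")
print(pcp, "headroom:", headroom(pcp))
```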

2.2.2

<4>[12345.547247] active_anon:21860 inactive_anon:14790 isolated_anon:0
<4>[12345.547247]  active_file:41585 inactive_file:10422 isolated_file:0
<4>[12345.547247]  unevictable:0 dirty:9 writeback:205 unstable:0
<4>[12345.547247]  free:285748 slab_reclaimable:2100 slab_unreclaimable:26286
<4>[12345.547247]  mapped:26079 shmem:14857 pagetables:687 bounce:0
<4>[12345.547247]  free_cma:57779

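active_anon: active anonymous pages. "Active" means recently accessed; "anonymous" means the mapping is not backed by any file

inactive_anon: inactive anonymous pages

isolated_anon: anonymous pages temporarily taken off the LRU lists, for example while they are being reclaimed or migrated

active_file: active file-backed pages, i.e. mappings associated with files on disk

inactive_file: inactive file-backed pages

isolated_file: file-backed pages temporarily taken off the LRU lists

unevictable: pages that cannot be reclaimed, such as mlocked pages

dirty: dirty pages, whose contents no longer match the original contents on the block device

writeback: pages currently being written back

unstable: NFS pages that have been sent to the server but not yet committed to stable storage

free: free pages

slab_reclaimable: reclaimable pages in the slab cache

slab_unreclaimable: unreclaimable pages in the slab cache

mapped: file-backed pages that are mapped into process page tables (the NR_FILE_MAPPED counter). Note that this count is separate from the anon and file counters above.

shmem: pages used for shared memory

pagetables: pages occupied by page tables

bounce: pages used for bounce buffers

free_cma: free pages belonging to the Contiguous Memory Allocator (CMA).

All of these values are page counts, not kB.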
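The global counters are in pages while the per-zone lines further down are in kB, so a quick conversion helps when cross-checking the two. The sketch assumes a 4kB page size, which matches the smallest block size in the buddy listing of this log; pages_to_kb is a name made up for this example.

```python
PAGE_SIZE_KB = 4  # assumption: 4kB pages, as on this ARM system


def pages_to_kb(pages):
    """Convert a page count from the global summary into kB."""
    return pages * PAGE_SIZE_KB


# Page counts copied from the global summary above.
global_pages = {
    "free_cma": 57779,
    "slab_reclaimable": 2100,
    "slab_unreclaimable": 26286,
    "pagetables": 687,
}

global_kb = {name: pages_to_kb(n) for name, n in global_pages.items()}
for name, kb in global_kb.items():
    print(f"{name}: {kb}kB")
```

The four results equal the free_cma, slab_reclaimable, slab_unreclaimable and pagetables values printed for the Normal zone, which shows that on this system all of that memory sits in Normal.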


2.2.3

<4>[12345.581839] Normal free:233460kB min:2488kB low:3108kB high:3732kB active_anon:17312kB inactive_anon:10824kB 
active_file:128kB inactive_file:4kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:774144kB 
managed:387568kB mlocked:0kB dirty:16kB writeback:76kB mapped:3296kB shmem:10840kB slab_reclaimable:8400kB 
slab_unreclaimable:105144kB kernel_stack:1168kB pagetables:2748kB unstable:0kB bounce:0kB free_cma:231116kB 
writeback_tmp:0kB pages_scanned:1648 all_unreclaimable? yes
<4>[12345.627014] lowmem_reserve[]: 0 10168 10168
<4>[12345.631565] HighMem free:909036kB min:512kB low:2604kB high:4696kB active_anon:70632kB inactive_anon:48336kB 
active_file:166212kB inactive_file:41684kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1301504kB 
managed:1301504kB mlocked:0kB dirty:20kB writeback:744kB mapped:101020kB shmem:48588kB slab_reclaimable:0kB 
slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB 
pages_scanned:0 all_unreclaimable? no
<4>[12345.675614] lowmem_reserve[]: 0 0 0

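Normal free: the free space in the Normal zone

min, low, high: the watermarks that drive page reclaim in the Normal zone

lowmem_reserve: the number of pages this zone holds back from allocations that could also have been served by a higher zone

present: the physical memory size of the zone

managed: the part of the present memory that is managed by the buddy system, managed = present - reserved

For the other values see section 2.2.2; the meaning is the same except that the numbers apply to the Normal zone only.

Note 1: some items are non-zero only for Normal here, such as kernel_stack, pagetables, free_cma, slab_reclaimable and slab_unreclaimable. This is because Normal zone pages are directly mapped, and these pages are used by the kernel itself.

HighMem is mainly used for anonymous mappings, file mappings, mapped pages and shared memory.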
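A zone line is just a long run of key:valuekB pairs, so it is easy to turn into a dictionary and query. The sketch below is one way to do that; parse_zone_line is a hypothetical helper, and ZONE_LINE is an abbreviated copy of the Normal line from the log with only a few fields kept.

```python
import re

PREFIX_RE = re.compile(r"^<\d+>\[\s*\d+\.\d+\]\s*")   # "<4>[12345.581839] "
FIELD_RE = re.compile(r"([A-Za-z_()]+):(\d+)kB")       # "free:233460kB"


def parse_zone_line(line):
    """Return the zone name and every 'key:NkB' field as an int (in kB)."""
    text = PREFIX_RE.sub("", line)
    zone = {"name": text.split()[0]}
    for key, value in FIELD_RE.findall(text):
        zone[key] = int(value)
    return zone


# Abbreviated copy of the Normal zone line from the log.
ZONE_LINE = ("<4>[12345.581839] Normal free:233460kB min:2488kB low:3108kB "
             "high:3732kB present:774144kB managed:387568kB free_cma:231116kB")

zone = parse_zone_line(ZONE_LINE)
reserved_kb = zone["present"] - zone["managed"]
print(zone["name"], "reserved:", reserved_kb, "kB")
```

Fields without a kB suffix, such as pages_scanned, are ignored by this pattern on purpose.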


2.2.4

<4>[12345.679437] Normal: 1165*4kB (MRC) 1122*8kB (RC) 1119*16kB (RC) 1118*32kB (C) 1068*64kB (RC) 
748*128kB (C) 0*256kB 0*512kB 0*1024kB 1*2048kB (R) 0*4096kB 0*8192kB = 233460kB
<4>[12345.695797] HighMem: 99*4kB (M) 1148*8kB (UM) 1314*16kB (UM) 880*32kB (UM) 327*64kB (M) 
87*128kB (M) 34*256kB (M) 38*512kB (M) 12*1024kB (M) 10*2048kB (M) 3*4096kB (M) 91*8192kB (UMR) = 909516kB
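Buddy system information; the orders range from 0 to 11.

M means movable

R means reserve

C means CMA

U means unmovable

E means reclaimable

1. (C) alone means the free list can only serve allocations that carry the ALLOC_CMA flag

2. HighMem has no C mark, because on this system the CMA area lies in the Normal zone

3. MRC means the free list contains CMA memory, reserve memory and movable memory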
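The per-order counts can be checked against the total at the end of each line. The following sketch parses the count*size (types) groups, expands the migrate-type letters listed above, and sums the blocks; parse_buddy is a name invented for this example, and NORMAL is the Normal line from the log with the wrapped halves joined and the trailing total removed.

```python
import re

MIGRATE_TYPES = {
    "U": "unmovable",
    "E": "reclaimable",
    "M": "movable",
    "R": "reserve",
    "C": "CMA",
}

BLOCK_RE = re.compile(r"(\d+)\*(\d+)kB(?: \(([A-Z]+)\))?")


def parse_buddy(line):
    """Return a list of (count, block_size_kb, [migrate types]) per order."""
    blocks = []
    for count, size, letters in BLOCK_RE.findall(line):
        types = [MIGRATE_TYPES.get(ch, ch) for ch in letters]
        blocks.append((int(count), int(size), types))
    return blocks


NORMAL = ("1165*4kB (MRC) 1122*8kB (RC) 1119*16kB (RC) 1118*32kB (C) "
          "1068*64kB (RC) 748*128kB (C) 0*256kB 0*512kB 0*1024kB "
          "1*2048kB (R) 0*4096kB 0*8192kB")

blocks = parse_buddy(NORMAL)
total_kb = sum(count * size for count, size, _ in blocks)
print(len(blocks), "orders,", total_kb, "kB free")
```

The sum comes to 233460kB, the same figure the kernel prints after the equals sign, and almost every non-empty free list in Normal carries a C or R mark.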


3 Who triggered OOM

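Several factors influence whether an OOM occurs:

1. The order of the allocation, and how the system treats that order

2. Which zone the allocation is made from

3. The zone's watermarks

4. The degree of memory fragmentation

5. Reportedly, continuously allocating address space can also lead to an OOM (not yet verified)


In the OOM output above, the system still has plenty of free space, 233460kB in the Normal zone alone, yet the OOM still occurred.

First, the order of the allocation is 0, so fragmentation is not involved. gfp_mask=0x800d0 shows that the allocation targets the Normal zone and that its type is Reclaimable; Reclaimable also means the memory cannot be taken from CMA.

Since the failure was not caused by a large order, it must be that the usable free space fell below the min watermark, which invoked the OOM killer.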

static bool __zone_watermark_ok(struct zone *z, unsigned int order,
            unsigned long mark, int classzone_idx, int alloc_flags,
            long free_pages)
{ 
     ...

#ifdef CONFIG_CMA
    /* If allocation can't use CMA areas don't use free CMA pages */
    if (!(alloc_flags & ALLOC_CMA))
        free_cma = zone_page_state(z, NR_FREE_CMA_PAGES);
#endif

    if (free_pages - free_cma <= min + z->lowmem_reserve[classzone_idx])
        return false;
    ...
} 
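Because the allocation type is Reclaimable, the free CMA space has to be subtracted from the free space: 233460kB - 231116kB = 2344kB, which is below the min watermark of 2488kB, so the system invokes the OOM killer.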
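The same comparison can be replayed with the numbers from the log. This is a simplified sketch of the check for an order-0 request only: it works in kB rather than pages, ignores the ALLOC_HIGH and ALLOC_HARDER adjustments to min, and takes lowmem_reserve as 0, which is the first entry of the Normal zone's lowmem_reserve[] line. The function name and parameters are made up for this example.

```python
def watermark_ok(free_kb, free_cma_kb, min_kb, lowmem_reserve_kb, can_use_cma):
    """Order-0 watermark test, mirroring 'free - free_cma <= min + reserve'."""
    usable_kb = free_kb if can_use_cma else free_kb - free_cma_kb
    return usable_kb > min_kb + lowmem_reserve_kb


# Values for the Normal zone, copied from the log.
FREE_KB = 233460
FREE_CMA_KB = 231116
MIN_KB = 2488
LOWMEM_RESERVE_KB = 0  # assumption: reserve entry that applies to a Normal request

usable_without_cma = FREE_KB - FREE_CMA_KB
reclaimable_ok = watermark_ok(FREE_KB, FREE_CMA_KB, MIN_KB, LOWMEM_RESERVE_KB,
                              can_use_cma=False)
movable_ok = watermark_ok(FREE_KB, FREE_CMA_KB, MIN_KB, LOWMEM_RESERVE_KB,
                          can_use_cma=True)
print(usable_without_cma, reclaimable_ok, movable_ok)
```

Under these assumptions a request that may use CMA pages would pass the check, while the Reclaimable request in this log does not, even though the zone reports more than 200MB free.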
