Linux x86_64 Kernel-Mode Page Fault Handling

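To mitigate the Meltdown vulnerability, Linux on x86 enables page table isolation (PTI). With PTI, each process's address space is described by two sets of PGDs, one used in user mode and one used in kernel mode. When the process traps into kernel mode, CR3 is switched to the process's kernel-mode PGD.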

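Suppose vmalloc is called in the context of a process. The allocation does establish page table mappings, but vmalloc maps the memory into the kernel's virtual address space (init_mm), not into the current process's kernel-mode page table.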

vmalloc->__vmalloc_node_range->__vmalloc_area_node->map_vm_area->vmap_page_range->vmap_page_range_noflush->pgd_offset_k // builds the kernel-space page table entries

#define pgd_offset_k(address) pgd_offset(&init_mm, (address))
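
To make the role of pgd_offset_k more concrete, here is a minimal user-space sketch of how a top-level entry is located from a virtual address. It assumes 4-level paging (a shift of 39 bits and 512 entries per table); struct toy_mm and every toy_ name are hypothetical stand-ins for illustration, not kernel definitions.

```c
#include <stdint.h>

/* Assumed constants for x86_64 with 4-level paging. */
#define TOY_PGDIR_SHIFT   39
#define TOY_PTRS_PER_PGD  512

/* Hypothetical stand-in for an mm_struct: only the top-level table. */
struct toy_mm {
	uint64_t pgd[TOY_PTRS_PER_PGD];
};

/* Index of the top-level entry covering the address (bits 47..39). */
static unsigned int toy_pgd_index(uint64_t address)
{
	return (unsigned int)((address >> TOY_PGDIR_SHIFT) &
			      (TOY_PTRS_PER_PGD - 1));
}

/* Models pgd_offset(mm, address): a pointer into mm's own table. */
static uint64_t *toy_pgd_offset(struct toy_mm *mm, uint64_t address)
{
	return &mm->pgd[toy_pgd_index(address)];
}
```

Under these assumptions, 0xffffc90000000000 (the default start of the vmalloc area with 4-level paging when memory randomization is off) falls on index 402. Two different toy_mm instances return different entry pointers for the same address, which is why filling in the init_mm table alone does not update the top-level table of a process.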

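So when a ring 3 process traps into kernel mode and accesses memory allocated by vmalloc, a kernel-mode page fault is raised if the page table that CR3 points to has not yet picked up that mapping.

Handling path: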

do_page_fault->__do_page_fault->do_kern_addr_fault->vmalloc_fault

vmalloc_fault copies the relevant init_mm page table entry into the page table that CR3 currently points to.
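
The copy step can be pictured with a small user-space model. This is only a sketch: the two arrays stand in for the init_mm top-level table and the table CR3 points to, a value of 0 means "no entry", and toy_sync_entry is a hypothetical name, not a kernel function.

```c
#include <stdint.h>

#define TOY_ENTRIES 512   /* entries per top-level table (assumed) */

/* 0 means "no entry"; any other value models a link to a lower-level table. */
typedef uint64_t toy_entry_t;

/*
 * cur models the table CR3 points to, ref models the init_mm table.
 * Returns 0 when the faulting access can simply be retried, -1 when the
 * fault cannot be repaired by copying.
 */
static int toy_sync_entry(toy_entry_t *cur, const toy_entry_t *ref,
			  unsigned int index)
{
	if (index >= TOY_ENTRIES)
		return -1;
	if (ref[index] == 0)
		return -1;		/* not mapped in the reference table either */
	if (cur[index] == 0)
		cur[index] = ref[index];	/* lazy copy of the missing entry */
	else if (cur[index] != ref[index])
		return -1;		/* mismatch; the kernel listing uses BUG_ON here */
	return 0;
}
```

Because the copied entry links to the same lower-level tables as the reference table, a single top-level copy is enough in this model; everything below it is shared rather than duplicated.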

static noinline int vmalloc_fault(unsigned long address)
{
	pgd_t *pgd, *pgd_k;
	p4d_t *p4d, *p4d_k;
	pud_t *pud;
	pmd_t *pmd;
	pte_t *pte;

	/* Make sure we are in vmalloc area: */
	if (!(address >= VMALLOC_START && address < VMALLOC_END))
		return -1;

	WARN_ON_ONCE(in_nmi());

	/*
	 * Copy kernel mappings over when needed. This can also
	 * happen within a race in page table update. In the later
	 * case just flush:
	 */
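	/* PGD entry in the table CR3 points to (the current process's kernel-mode PGD) */
	pgd = (pgd_t *)__va(read_cr3_pa()) + pgd_index(address);
	/* pgd_k: the matching entry in the kernel page table (init_mm) set up by vmalloc */
	pgd_k = pgd_offset_k(address);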
	if (pgd_none(*pgd_k))
		return -1;

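	/*
	 * Walk P4D -> PUD -> PMD -> PTE in turn. The kernel entry is copied
	 * at the P4D level; the lower levels are only checked for presence.
	 */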

	/* With 4-level paging, copying happens on the p4d level. */
	p4d = p4d_offset(pgd, address);
	p4d_k = p4d_offset(pgd_k, address);
	if (p4d_none(*p4d_k))
		return -1;

	if (p4d_none(*p4d) && !pgtable_l5_enabled()) {
		set_p4d(p4d, *p4d_k);
		arch_flush_lazy_mmu_mode();
	} else {
		BUG_ON(p4d_pfn(*p4d) != p4d_pfn(*p4d_k));
	}

	
	pud = pud_offset(p4d, address);
	if (pud_none(*pud))
		return -1;

	pmd = pmd_offset(pud, address);
	if (pmd_none(*pmd))
		return -1;

	pte = pte_offset_kernel(pmd, address);
	if (!pte_present(*pte))
		return -1;

	return 0;
}
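
The comment "With 4-level paging, copying happens on the p4d level" relies on the P4D level being folded into the PGD when 5-level paging is off: p4d_offset then hands back the PGD entry itself, so set_p4d effectively writes the top-level entry. A minimal user-space sketch of that folding follows; all toy_ names are hypothetical, and a plain flag stands in for pgtable_l5_enabled().

```c
#include <stdint.h>
#include <stddef.h>

typedef struct { uint64_t val; } toy_pgd_t;
/* Folded level: a P4D entry is just a PGD entry seen through another type. */
typedef struct { toy_pgd_t pgd; } toy_p4d_t;

/* Stand-in for pgtable_l5_enabled(); 0 models 4-level paging. */
static int toy_l5_enabled = 0;

/*
 * With 4-level paging there is no separate P4D table, so the PGD entry
 * pointer is returned as the P4D entry pointer. The 5-level branch is
 * not modelled in this sketch and returns NULL.
 */
static toy_p4d_t *toy_p4d_offset(toy_pgd_t *pgd)
{
	if (!toy_l5_enabled)
		return (toy_p4d_t *)pgd;
	return NULL;
}

/* Models set_p4d(): store a new value through the P4D pointer. */
static void toy_set_p4d(toy_p4d_t *p4d, toy_p4d_t value)
{
	*p4d = value;
}
```

In this model, writing through the pointer returned by toy_p4d_offset changes the PGD slot it was derived from, which mirrors why the listing can repair a missing top-level entry with set_p4d(p4d, *p4d_k).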

 
