From 4f06b0436b2ddbd3b67b10e77098a6862787b3eb Mon Sep 17 00:00:00 2001
From: Jeremy Fitzhardinge
Date: Wed, 11 Feb 2009 09:32:19 -0800
Subject: x86/cpa: make sure cpa is safe to call in lazy mmu mode

Impact: fix race leading to crash under KVM and Xen

The CPA code may be called while we're in lazy mmu update mode - for
example, when using DEBUG_PAGE_ALLOC and doing a slab allocation in an
interrupt handler which interrupted a lazy mmu update.  In this case,
the in-memory pagetable state may be out of date due to pending queued
updates.  We need to flush any pending updates before inspecting the
page table.  Similarly, we must explicitly flush any modifications CPA
may have made (which comes down to flushing queued operations when
flushing the TLB).

Signed-off-by: Jeremy Fitzhardinge
Acked-by: Marcelo Tosatti
Cc: Stable Kernel
Signed-off-by: Ingo Molnar
---
 arch/x86/mm/pageattr.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

(limited to 'arch/x86/mm/pageattr.c')

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 84ba74820ad..f664bc1c409 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -576,6 +576,13 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
 	else
 		address = *cpa->vaddr;
 
+	/*
+	 * If we're called with lazy mmu updates enabled, the
+	 * in-memory pte state may be stale. Flush pending updates to
+	 * bring them up to date.
+	 */
+	arch_flush_lazy_mmu_mode();
+
 repeat:
 	kpte = lookup_address(address, &level);
 	if (!kpte)
@@ -854,6 +861,13 @@ static int change_page_attr_set_clr(unsigned long *addr, int numpages,
 	} else
 		cpa_flush_all(cache);
 
+	/*
+	 * If we've been called with lazy mmu updates enabled, then
+	 * make sure that everything gets flushed out before we
+	 * return.
+	 */
+	arch_flush_lazy_mmu_mode();
+
 out:
 	return ret;
 }
--
cgit v1.2.2


From 7ad9de6ac83bd825996d2de98c92e0f425c31050 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner
Date: Thu, 12 Feb 2009 21:16:09 +0100
Subject: x86: CPA avoid repeated lazy mmu flush

Impact: Flush the lazy MMU only once

Pending mmu updates only need to be flushed once to bring the
in-memory pagetable state up to date.

Signed-off-by: Thomas Gleixner
---
 arch/x86/mm/pageattr.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

(limited to 'arch/x86/mm/pageattr.c')

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index f664bc1c409..8ca0d8566fc 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -575,14 +575,6 @@ static int __change_page_attr(struct cpa_data *cpa, int primary)
 		address = cpa->vaddr[cpa->curpage];
 	else
 		address = *cpa->vaddr;
-
-	/*
-	 * If we're called with lazy mmu updates enabled, the
-	 * in-memory pte state may be stale. Flush pending updates to
-	 * bring them up to date.
-	 */
-	arch_flush_lazy_mmu_mode();
-
 repeat:
 	kpte = lookup_address(address, &level);
 	if (!kpte)
@@ -819,6 +811,13 @@ static int change_page_attr_set_clr(unsigned long *addr, int numpages,
 
 	vm_unmap_aliases();
 
+	/*
+	 * If we're called with lazy mmu updates enabled, the
+	 * in-memory pte state may be stale. Flush pending updates to
+	 * bring them up to date.
+	 */
+	arch_flush_lazy_mmu_mode();
+
 	cpa.vaddr = addr;
 	cpa.numpages = numpages;
 	cpa.mask_set = mask_set;
--
cgit v1.2.2
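A minimal user-space sketch of the problem the two patches above address,
under the assumption that lazy mmu mode simply queues pte writes instead of
applying them (the names set_pte_lazy(), flush_lazy_mmu() and lookup() are
illustrative, not the kernel's API): code that walks the in-memory pagetable
sees stale entries until the queue is flushed, which is what the added
arch_flush_lazy_mmu_mode() call guarantees before CPA inspects the tables.

#include <stdio.h>

#define NPTES 4

typedef unsigned long pte_t;

static pte_t page_table[NPTES];			/* the in-memory "pagetable"   */

struct pending_update { int idx; pte_t val; };
static struct pending_update queue[NPTES];
static int nqueued;

/* In lazy mmu mode a pte write is only queued, not applied. */
static void set_pte_lazy(int idx, pte_t val)
{
	queue[nqueued].idx = idx;
	queue[nqueued].val = val;
	nqueued++;
}

/* Apply everything queued; models what arch_flush_lazy_mmu_mode() achieves. */
static void flush_lazy_mmu(void)
{
	for (int i = 0; i < nqueued; i++)
		page_table[queue[i].idx] = queue[i].val;
	nqueued = 0;
}

/* Models lookup_address(): reads the in-memory pagetable directly. */
static pte_t lookup(int idx)
{
	return page_table[idx];
}

int main(void)
{
	set_pte_lazy(2, 0x1000 | 0x3);	/* queued, not yet visible           */

	printf("before flush: pte[2] = %#lx\n", lookup(2));	/* stale: 0      */
	flush_lazy_mmu();		/* the flush the patches add to CPA  */
	printf("after  flush: pte[2] = %#lx\n", lookup(2));	/* up to date    */

	return 0;
}

Compiled and run, the first lookup returns the stale zero entry and the second
returns the queued value - the same reason change_page_attr_set_clr() must
flush once before walking the pagetables, and flush again (via the TLB flush
path) before returning, but not on every repeat of __change_page_attr().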
From 07a66d7c53a538e1a9759954a82bb6c07365eff9 Mon Sep 17 00:00:00 2001
From: Ingo Molnar
Date: Fri, 20 Feb 2009 08:04:13 +0100
Subject: x86: use the right protections for split-up pagetables

Steven Rostedt found a bug where, in his modified kernel, ftrace was
unable to modify the kernel text, due to the PMD itself having been
marked read-only as well in split_large_page().

The fix, suggested by Linus, is to not try to 'clone' the reference
protection of a huge-page, but to use the standard (and permissive)
page protection bits of KERNPG_TABLE.

The 'cloning' makes sense for the ptes but it's a confused and
incorrect concept at the page table level - because the pagetable
entry covers a whole set of ptes and hence cannot 'clone' any single
protection attribute - the ptes can be any mixture of protections.

With the permissive KERNPG_TABLE, even if the pte protections get
changed after this point (due to ftrace doing code-patching or other
similar activities like kprobes), the resulting combined protections
will still be correct and the pte's restrictive (or permissive)
protections will control it.

Also update the comment.

This bug was there for a long time but has not caused visible problems
before, as it needs a rather large read-only area to trigger.  Steve
possibly hacked his kernel with some really large arrays or so.
Anyway, the bug is definitely worth fixing.

[ Huang Ying also experienced problems in this area when writing
  the EFI code, but the real bug in split_large_page() was not
  realized back then. ]

Reported-by: Steven Rostedt
Reported-by: Huang Ying
Acked-by: Linus Torvalds
Signed-off-by: Ingo Molnar
---
 arch/x86/mm/pageattr.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

(limited to 'arch/x86/mm/pageattr.c')

diff --git a/arch/x86/mm/pageattr.c b/arch/x86/mm/pageattr.c
index 8ca0d8566fc..7be47d1a97e 100644
--- a/arch/x86/mm/pageattr.c
+++ b/arch/x86/mm/pageattr.c
@@ -508,18 +508,13 @@ static int split_large_page(pte_t *kpte, unsigned long address)
 #endif
 
 	/*
-	 * Install the new, split up pagetable. Important details here:
+	 * Install the new, split up pagetable.
 	 *
-	 * On Intel the NX bit of all levels must be cleared to make a
-	 * page executable. See section 4.13.2 of Intel 64 and IA-32
-	 * Architectures Software Developer's Manual).
-	 *
-	 * Mark the entry present. The current mapping might be
-	 * set to not present, which we preserved above.
+	 * We use the standard kernel pagetable protections for the new
+	 * pagetable protections, the actual ptes set above control the
+	 * primary protection behavior:
 	 */
-	ref_prot = pte_pgprot(pte_mkexec(pte_clrhuge(*kpte)));
-	pgprot_val(ref_prot) |= _PAGE_PRESENT;
-	__set_pmd_pte(kpte, address, mk_pte(base, ref_prot));
+	__set_pmd_pte(kpte, address, mk_pte(base, __pgprot(_KERNPG_TABLE)));
 	base = NULL;
 
 out_unlock:
--
cgit v1.2.2
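A small user-space sketch of why the permissive _KERNPG_TABLE protection at
the PMD level is safe.  The bit names and the effective() helper below are
illustrative assumptions, not x86 hardware definitions (real hardware is more
involved - NX, for instance, is a forbid bit combined the other way around),
but the intersection model captures the point: a level that refuses an access
refuses it for everything beneath it, so a read-only PMD 'cloned' from the
old huge page vetoes ftrace's code patching regardless of the pte, while a
permissive PMD lets the individual ptes decide.

#include <stdio.h>

#define P_WRITE	0x1	/* this level allows writes    */
#define P_EXEC	0x2	/* this level allows execution */

/*
 * Effective rights of a 4K page: the intersection of the rights granted
 * at every pagetable level on the walk down to the pte.
 */
static unsigned int effective(unsigned int pmd_prot, unsigned int pte_prot)
{
	return pmd_prot & pte_prot;
}

int main(void)
{
	unsigned int kernpg_table = P_WRITE | P_EXEC;	/* permissive PMD (the fix)     */
	unsigned int cloned_ro    = 0;			/* read-only PMD (the bug)      */

	unsigned int text_pte     = P_EXEC;		/* normal kernel text pte       */
	unsigned int patched_pte  = P_WRITE | P_EXEC;	/* pte made writable by ftrace  */

	printf("permissive PMD, patched pte: %#x (write allowed)\n",
	       effective(kernpg_table, patched_pte));
	printf("read-only PMD,  patched pte: %#x (write still refused)\n",
	       effective(cloned_ro, patched_pte));
	printf("permissive PMD, text pte:    %#x (pte alone keeps it read-only)\n",
	       effective(kernpg_table, text_pte));

	return 0;
}

The second case is the failure Steven Rostedt hit: no pte-level change can
grant an access that the split-up PMD already forbids, which is why the fix
installs the new pagetable with the standard, permissive _KERNPG_TABLE bits
and lets the ptes carry the restrictive protections.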