aboutsummaryrefslogtreecommitdiffstats
path: root/arch/x86/mm
Commit message (Collapse)AuthorAge
*-------------. Merge branches 'x86/alternatives', 'x86/cleanups', 'x86/commandline', ↵Ingo Molnar2008-10-06
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | 'x86/crashdump', 'x86/debug', 'x86/defconfig', 'x86/doc', 'x86/exports', 'x86/fpu', 'x86/gart', 'x86/idle', 'x86/mm', 'x86/mtrr', 'x86/nmi-watchdog', 'x86/oprofile', 'x86/paravirt', 'x86/reboot', 'x86/sparse-fixes', 'x86/tsc', 'x86/urgent' and 'x86/vmalloc' into x86-v28-for-linus-phase1
| | | | | | | | * i386: vmalloc size fixDave Young2008-08-21
| | | | | | |_|/ | | | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Booting kernel with vmalloc=[any size<=16m] will oops on my pc (i386/1G memory). BUG_ON in arch/x86/mm/init_32.c triggered: BUG_ON((unsigned long)high_memory > VMALLOC_START); It's due to the vm area hole. In include/asm-x86/pgtable_32.h: #define VMALLOC_OFFSET (8 * 1024 * 1024) #define VMALLOC_START (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) \ & ~(VMALLOC_OFFSET - 1)) There's several related point: 1. MAXMEM : (-__PAGE_OFFSET - __VMALLOC_RESERVE). The space after VMALLOC_END is included as well, I set it to (VMALLOC_END - PAGE_OFFSET - __VMALLOC_RESERVE) 2. VMALLOC_OFFSET is not considered in __VMALLOC_RESERVE fixed by adding VMALLOC_OFFSET to it. 3. VMALLOC_START : (((unsigned long)high_memory + 2 * VMALLOC_OFFSET - 1) & ~(VMALLOC_OFFSET - 1)) So it's not always 8M, bigger than 8M possible. I set it to ((unsigned long)high_memory + VMALLOC_OFFSET) 4. the VMALLOC_RESERVE is an unused macro, so remove it here. Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Cc: akpm@linux-foundation.org Cc: hidave.darkstar@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | | | | | * x86/paravirt: Remove duplicate paravirt_pagetable_setup_{start, done}()Alex Nixon2008-09-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They were already called once in arch/x86/kernel/setup.c - we don't need to call them again. Signed-off-by: Alex Nixon <alex.nixon@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | | | | * | x86: export set_memory_ro and set_memory_rwBruce Allan2008-09-30
| | | | |_|/ / | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Export set_memory_ro() and set_memory_rw() calls for use by drivers that need to have more debug information about who might be writing to memory space. this was initially developed for use while debugging a memory corruption problem with e1000e. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | x86: pgd_{c,d}tor() cleanupJan Beulich2008-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Giving pgd_ctor() a properly typed parameter allows eliminating a local variable. Adjust pgd_dtor() to match. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Jeremy Fitzhardinge <jeremy@goop.org> Cc: "Jeremy Fitzhardinge" <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | Merge branch 'x86/urgent' into x86/cleanupsIngo Molnar2008-08-25
| | |\ \ \ \ \ | | | | |/ / / | | | |/| | |
| | * | | | | x86: another user of PTE_FLAGS_MASKJeremy Fitzhardinge2008-08-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | Merge branch 'linus' into x86/cleanupsIngo Molnar2008-08-20
| | |\ \ \ \ \ | | | | |_|/ / | | | |/| | |
| | * | | | | Merge branch 'linus' into x86/cleanupsIngo Molnar2008-08-11
| | |\ \ \ \ \ | | | | |_|_|/ | | | |/| | |
| | * | | | | x86: convert discontig_32.c from round_up to roundupJoerg Roedel2008-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | x86: convert numa_64.c from round_up to roundupJoerg Roedel2008-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | x86: convert init_64.c from round_up to roundupJoerg Roedel2008-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | x86: convert pageattr.c from round_up to roundupJoerg Roedel2008-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | | Merge branch 'x86/prototypes' into x86-v28-for-linus-phase1Ingo Molnar2008-10-06
|\ \ \ \ \ \ \ | |_|_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/x86/kernel/process_32.c Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | Merge commit 'v2.6.27-rc3' into x86/prototypesIngo Molnar2008-08-14
| |\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: include/asm-x86/dma-mapping.h Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | x86: mm/ioremap.c declare early_ioremap_debug and early_ioremap_nested as staticJaswinder Singh2008-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
| * | | | | | x86: mm/fault.c declare do_page_fault before they get usedJaswinder Singh2008-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | declared do_page_fault() in asm-x86/trap.h for both X86_32 and X86_64 removed do_invalid_op declaration from mm/fault.c as it is already declared in asm-x86/trap.h Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
| * | | | | | x86: mm/init_XX.c declare functions before they get usedJaswinder Singh2008-07-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | included <asm/smp.h> in mm/init_32.c for zap_low_mappings() declared free_initmem() in asm-x86/page_XX.h Signed-off-by: Jaswinder Singh <jaswinder@infradead.org>
* | | | | | | x86/paravirt: Remove duplicate paravirt_pagetable_setup_{start, done}()Alex Nixon2008-09-14
| |_|_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | They were already called once in arch/x86/kernel/setup.c - we don't need to call them again. fixes: http://bugzilla.kernel.org/show_bug.cgi?id=11485 Signed-off-by: Alex Nixon <alex.nixon@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | x86: fix two modpost warnings in mm/init_64.cJan Beulich2008-08-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | early_io{re,un}map() are __init and hence can't be called from __meminit functions. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | x86: fix 1:1 mapping init on 64-bit (memory hotplug case)Jan Beulich2008-08-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While I don't have a hotplug capable system at hand, I think two issues need fixing: - pud_phys (in kernel_physical_ampping_init()) would remain uninitialized in the after_bootmem case - the locking done just around phys_pmd_{init,update}() would leave out pgd updates, and it was needlessly covering code portions that do allocations (perhaps using a more friendly gfp value in alloc_low_page() would then be possible) Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | devmem, x86: PAT Change /dev/mem mmap with O_SYNC to use UC_MINUSvenkatesh.pallipadi@intel.com2008-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All kernel mappings like ioremap(), etc uses UC_MINUS as the type. /dev/mem mappings with /dev/mem being opened with O_SYNC however was using UC, resulting in a conflict with /dev/mem mmap failing. This seems to be affecting some apps (one being flashrom) which are using O_SYNC and which were working before. Switch /dev/mem with O_SYNC also to UC_MINUS. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | x86: PAT proper tracking of set_memory_uc and friendsvenkatesh.pallipadi@intel.com2008-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Big thinko in pat memtype tracking code. reserve_memtype should be called with physical address and not virtual address. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | x86, mmiotrace: silence section mismatch warning - leave_uniprocessorMarcin Slusarz2008-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | WARNING: vmlinux.o(.text+0x180af): Section mismatch in reference from the function leave_uniprocessor() to the function .cpuinit.text:cpu_up() The function leave_uniprocessor() references the function __cpuinit cpu_up(). This is often because leave_uniprocessor lacks a __cpuinit annotation or the annotation of cpu_up is wrong. leave_uniprocessor calls cpu_up only when CONFIG_HOTPLUG_CPU is set, so it can be safely annotated as __ref Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Cc: Pekka Paalanen <pq@iki.fi> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Ingo Molnar <mingo@elte.hu> Cc: Pekka Paalanen <pq@iki.fi>
* | | | | | x86: use WARN() in arch/x86/mm/ioremap.cArjan van de Ven2008-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use WARN() instead of a printk+WARN_ON() pair; this way the message becomes part of the warning section for better reporting/collection. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Cc: akpm@linux-foundation.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | x86: fix Xorg startup/shutdown slowdown with PATVenki Pallipadi2008-08-20
| |_|_|_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rene Herman reported significant Xorg startup/shutdown slowdown due to PAT. It turns out that the memtype list has thousands of entries. Add cached_entry to list add routine, in order to speed up the lookup for sequential reserve_memtype calls. Reported-by: Rene Herman <rene.herman@keyaccess.nl> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2008-08-16
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (32 commits) x86: add MAP_STACK mmap flag x86: fix section mismatch warning - spp_getpage() x86: change init_gdt to update the gdt via write_gdt, rather than a direct write. x86-64: fix overlap of modules and fixmap areas x86, geode-mfgpt: check IRQ before using MFGPT as clocksource x86, acpi: cleanup, temp_stack is used only when CONFIG_SMP is set x86: fix spin_is_contended() x86, nmi: clean UP NMI watchdog failure message x86, NMI: fix watchdog failure message x86: fix /proc/meminfo DirectMap x86: fix readb() et al compile error with gcc-3.2.3 arch/x86/Kconfig: clean up, experimental adjustement x86: invalidate caches before going into suspend x86, perfctr: don't use CCCR_OVF_PMI1 on Pentium 4Ds x86, AMD IOMMU: initialize dma_ops after sysfs registration x86m AMD IOMMU: cleanup: replace LOW_U32 macro with generic lower_32_bits x86, AMD IOMMU: initialize device table properly x86, AMD IOMMU: use status bit instead of memory write-back for completion wait x86: silence mmconfig printk x86, msr: fix NULL pointer deref due to msr_open on nonexistent CPUs ...
| * | | | | x86: fix section mismatch warning - spp_getpage()Marcin Slusarz2008-08-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | WARNING: vmlinux.o(.text+0x17a3e): Section mismatch in reference from the function set_pte_vaddr_pud() to the function .init.text:spp_getpage() The function set_pte_vaddr_pud() references the function __init spp_getpage(). This is often because set_pte_vaddr_pud lacks a __init annotation or the annotation of spp_getpage is wrong. spp_getpage is called from __init (__init_extra_mapping) and non __init (set_pte_vaddr_pud) functions, so it can't be __init. Unfortunately it calls alloc_bootmem_pages which is __init, but does it only when bootmem allocator is available (after_bootmem == 0). So annotate it accordingly. Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: H. Peter Anvin <hpa@zytor.com>
| * | | | | x86: fix /proc/meminfo DirectMapHugh Dickins2008-08-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do we actually want these DirectMap lines in the x86 /proc/meminfo? I can see they're interesting to CPA developers and TLB optimizers, but they don't fit its usual "where has all my memory gone?" usage. If they are to stay, here are some fixes. 1. On x86_32 without PAE, they're not 2M but 4M pages: no need to mess with the internal enum, but show the right name to users. 2. Many machines can never show anything but 0 for DirectMap1G, so suppress that line unless direct_gbpages are really enabled. 3. The unit in /proc/meminfo is kB not number of pages: HugePages messed that up, but they're an example to regret not to follow. 4. Once we use kB, it's easy to see that 1GB has gone missing (which explains why CONFIG_CPA_DEBUG=y soon wraps DirectMap2M negative): because head_64.S's level2_ident_pgt entries were not counted. My fix is not ideal, but works for more and for less than 1G, and avoids interfering with early bootup pagetable contortions. Signed-off-by: Hugh Dickins <hugh@veritas.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86: don't call e820_regiter_active_regions if out of range on nodeYinghai Lu2008-08-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | so we don't get warning on 32bit system with 64g RAM or more Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | x86: use WARN() in arch/x86/mm/pageattr.cArjan van de Ven2008-08-13
| | |/ / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use WARN() instead of a printk+WARN_ON() pair; this way the message becomes part of the warning section for better reporting/collection. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: akpm@linux-foundation.org Cc: arjan@linux.intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
* / | | | x86: Fix ioremap off by one BUGAndi Kleen2008-08-15
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jean Delvare's machine triggered this BUG acpi_os_map_memory phys ffff0000 size 65535 ------------[ cut here ]------------ kernel BUG at arch/x86/mm/pat.c:233! with ACPI in the backtrace. Adding some debugging output showed that ACPI calls acpi_os_map_memory phys ffff0000 size 65535 And ioremap/PAT does this check in 32bit, so addr+size wraps and the BUG in reserve_memtype() triggers incorrectly. BUG_ON(start >= end); /* end is exclusive */ But reserve_memtype already uses u64: int reserve_memtype(u64 start, u64 end, so the 32bit truncation must happen in the caller. Presumably in ioremap when it passes this information to reserve_memtype(). This patch does this computation in 64bit. http://bugzilla.kernel.org/show_bug.cgi?id=11346 Signed-off-by: Andi Kleen <ak@linux.intel.com>
* | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linusLinus Torvalds2008-08-12
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: fix spinlock recursion in hvc_console stop_machine: remove unused variable modules: extend initcall_debug functionality to the module loader export virtio_rng.h lguest: use get_user_pages_fast() instead of get_user_pages() mm: Make generic weak get_user_pages_fast and EXPORT_GPL it lguest: don't set MAC address for guest unless specified
| * | | | mm: Make generic weak get_user_pages_fast and EXPORT_GPL itRusty Russell2008-08-12
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Out of line get_user_pages_fast fallback implementation, make it a weak symbol, get rid of CONFIG_HAVE_GET_USER_PAGES_FAST. Export the symbol to modules so lguest can use it. Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* / | | x86: work around gcc 3.4.x bugJeremy Fitzhardinge2008-08-11
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simon Horman reported that gcc-3.4.x crashes when compiling pgd_prepopulate_pmd() when PREALLOCATED_PMDS == 0 and CONFIG_DEBUG_INFO is enabled. Adding an extra check for PREALLOCATED_PMDS == 0 [which is compiled out by gcc] seems to avoid the problem. Reported-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | Fix 'get_user_pages_fast()' with non-page-aligned start addressLinus Torvalds2008-07-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Alexey Dobriyan reported trouble with LTP with the new fast-gup code, and Johannes Weiner debugged it to non-page-aligned addresses, where the new get_user_pages_fast() code would do all the wrong things, including just traversing past the end of the requested area due to 'addr' never matching 'end' exactly. This is not a pretty fix, and we may actually want to move the alignment into generic code, leaving just the core code per-arch, but Alexey verified that the vmsplice01 LTP test doesn't crash with this. Reported-and-tested-by: Alexey Dobriyan <adobriyan@gmail.com> Debugged-by: Johannes Weiner <hannes@saeurebad.de> Cc: Nick Piggin <npiggin@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | x86: use generic show_mem()Johannes Weiner2008-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove arch-specific show_mem() in favor of the generic version. This also removes the following redundant information display: - pages in swapcache, printed by show_swap_cache_info() - dirty pages, writeback pages, mapped pages, slab pages, pagetable pages, printed by show_free_areas() where show_mem() calls show_free_areas(), which calls show_swap_cache_info(). Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | x86: support 1GB hugepages with get_user_pages_lockless()Nick Piggin2008-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | x86: lockless get_user_pages_fast()Nick Piggin2008-07-26
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement get_user_pages_fast without locking in the fastpath on x86. Do an optimistic lockless pagetable walk, without taking mmap_sem or any page table locks or even mmap_sem. Page table existence is guaranteed by turning interrupts off (combined with the fact that we're always looking up the current mm, means we can do the lockless page table walk within the constraints of the TLB shootdown design). Basically we can do this lockless pagetable walk in a similar manner to the way the CPU's pagetable walker does not have to take any locks to find present ptes. This patch (combined with the subsequent ones to convert direct IO to use it) was found to give about 10% performance improvement on a 2 socket 8 core Intel Xeon system running an OLTP workload on DB2 v9.5 "To test the effects of the patch, an OLTP workload was run on an IBM x3850 M2 server with 2 processors (quad-core Intel Xeon processors at 2.93 GHz) using IBM DB2 v9.5 running Linux 2.6.24rc7 kernel. Comparing runs with and without the patch resulted in an overall performance benefit of ~9.8%. Correspondingly, oprofiles showed that samples from __up_read and __down_read routines that is seen during thread contention for system resources was reduced from 2.8% down to .05%. Monitoring the /proc/vmstat output from the patched run showed that the counter for fast_gup contained a very high number while the fast_gup_slow value was zero." (fast_gup is the old name for get_user_pages_fast, fast_gup_slow is a counter we had for the number of times the slowpath was invoked). The main reason for the improvement is that DB2 has multiple threads each issuing direct-IO. Direct-IO uses get_user_pages, and thus the threads contend the mmap_sem cacheline, and can also contend on page table locks. I would anticipate larger performance gains on larger systems, however I think DB2 uses an adaptive mix of threads and processes, so it could be that thread contention remains pretty constant as machine size increases. In which case, we stuck with "only" a 10% gain. The downside of using get_user_pages_fast is that if there is not a pte with the correct permissions for the access, we end up falling back to get_user_pages and so the get_user_pages_fast is a bit of extra work. However this should not be the common case in most performance critical code. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: Kconfig fix] [akpm@linux-foundation.org: Makefile fix/cleanup] [akpm@linux-foundation.org: warning fix] Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andi Kleen <andi@firstfloor.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Badari Pulavarty <pbadari@us.ibm.com> Cc: Zach Brown <zach.brown@oracle.com> Cc: Jens Axboe <jens.axboe@oracle.com> Reviewed-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | x86: add hugepagesz option on 64-bitAndi Kleen2008-07-24
| | | | | | | | | | | | | | | | | | | | | | | | Add an hugepagesz=... option similar to IA64, PPC etc. to x86-64. This finally allows to select GB pages for hugetlbfs in x86 now that all the infrastructure is in place. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | x86: support GB hugepages on 64-bitAndi Kleen2008-07-24
| | | | | | | | | | | | | | | | Acked-by: Adam Litke <agl@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | hugetlb: introduce pud_hugeAndi Kleen2008-07-24
| | | | | | | | | | | | | | | | | | | | | | | | Straight forward extensions for huge pages located in the PUD instead of PMDs. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | hugetlb: modular state for hugetlb page sizeAndi Kleen2008-07-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The goal of this patchset is to support multiple hugetlb page sizes. This is achieved by introducing a new struct hstate structure, which encapsulates the important hugetlb state and constants (eg. huge page size, number of huge pages currently allocated, etc). The hstate structure is then passed around the code which requires these fields, they will do the right thing regardless of the exact hstate they are operating on. This patch adds the hstate structure, with a single global instance of it (default_hstate), and does the basic work of converting hugetlb to use the hstate. Future patches will add more hstate structures to allow for different hugetlbfs mounts to have different page sizes. [akpm@linux-foundation.org: coding-style fixes] Acked-by: Adam Litke <agl@us.ibm.com> Acked-by: Nishanth Aravamudan <nacc@us.ibm.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | access_process_vm device memory infrastructureRik van Riel2008-07-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to be able to debug things like the X server and programs using the PPC Cell SPUs, the debugger needs to be able to access device memory through ptrace and /proc/pid/mem. This patch: Add the generic_access_phys access function and put the hooks in place to allow access_process_vm to access device or PPC Cell SPU memory. [riel@redhat.com: Add documentation for the vm_ops->access function] Signed-off-by: Rik van Riel <riel@redhat.com> Signed-off-by: Benjamin Herrensmidt <benh@kernel.crashing.org> Cc: Dave Airlie <airlied@linux.ie> Cc: Hugh Dickins <hugh@veritas.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnd Bergmann <arnd@arndb.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | mm: move bootmem descriptors definition to a single placeJohannes Weiner2008-07-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are a lot of places that define either a single bootmem descriptor or an array of them. Use only one central array with MAX_NUMNODES items instead. Signed-off-by: Johannes Weiner <hannes@saeurebad.de> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Tony Luck <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Kyle McMartin <kyle@parisc-linux.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: David S. Miller <davem@davemloft.net> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | x86: add PTE_FLAGS_MASKJeremy Fitzhardinge2008-07-22
| | | | | | | | | | | | | | | | PTE_PFN_MASK was getting lonely, so I made it a friend. Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | x86: rename PTE_MASK to PTE_PFN_MASKJeremy Fitzhardinge2008-07-22
|/ | | | | | | | | | | | | | | | | | Rusty, in his peevish way, complained that macros defining constants should have a name which somewhat accurately reflects the actual purpose of the constant. Aside from the fact that PTE_MASK gives no clue as to what's actually being masked, and is misleadingly similar to the functionally entirely different PMD_MASK, PUD_MASK and PGD_MASK, I don't really see what the problem is. But if this patch silences the incessent noise, then it will have achieved its goal (TODO: write test-case). Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* x86: convert Dprintk to pr_debugThomas Gleixner2008-07-21
| | | | | | | | There are a couple of places where (P)Dprintk is used which is an old compile time enabled printk wrapper. Convert it to the generic pr_debug(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
*-------------. Merge branches 'x86/urgent', 'x86/amd-iommu', 'x86/apic', 'x86/cleanups', ↵Ingo Molnar2008-07-21
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | 'x86/core', 'x86/cpu', 'x86/fixmap', 'x86/gart', 'x86/kprobes', 'x86/memtest', 'x86/modules', 'x86/nmi', 'x86/pat', 'x86/reboot', 'x86/setup', 'x86/step', 'x86/unify-pci', 'x86/uv', 'x86/xen' and 'xen-64bit' into x86/for-linus
| | | | | | | | * Merge branch 'linus' into xen-64bitIngo Molnar2008-07-17
| | | | | | | | |\ | | |_|_|_|_|_|_|/ | |/| | | | | | |