aboutsummaryrefslogtreecommitdiffstats
path: root/include/asm-i386
Commit message (Collapse)AuthorAge
* x86-64: introduce struct pci_sysdata to facilitate sharing of ->sysdataMuli Ben-Yehuda2007-07-21
| | | | | | | | | | | | | | This patch introduces struct pci_sysdata to x86 and x86-64, and converts the existing two users (NUMA, Calgary) to use it. This lays the groundwork for having other users of sysdata, such as the PCI domains work. The Calgary bits are tested, the NUMA bits just look ok. Signed-off-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Muli Ben-Yehuda <muli@il.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* i386: basic infrastructure support for AMD geode-class machinesAndres Salomon2007-07-21
| | | | | | | | | | | | | | | | This builds upon the existing geode infrastructure, but adds southbridge support, some GPIO functions, and a header file (asm-i386/geode.h) with some useful GX/LX detection tests. The majority of this code was written by Jordan Crouse. Signed-off-by: Jordan Crouse <jordan.crouse@amd.com> Signed-off-by: Andres Salomon <dilinger@debian.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: David Brownell <david-b@pacbell.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* i386: move PIT function declarations and constants to correct header fileThomas Gleixner2007-07-21
| | | | | | | | | | | | | | setup_pit_timer is declared in asm-i386/timer.h. Move it to the pit header file, so it can be used by x86_64 as well. Move also the PIT constants. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* i386: replace hard-coded constant with appropriate macro from kernel.hRobert P. J. Day2007-07-21
| | | | | | | Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* i386: add cpu_relax() to cmos_lock()Andreas Mohr2007-07-21
| | | | | | | | | | | | | Add cpu_relax() to cmos_lock() inline function for faster operation on SMT CPUs and less power consumption on others in case of lock contention (which probably doesn't happen too often, so admittedly this patch is not too exciting). [akpm@linux-foundation.org: Include the header file for cpu_relax()] Signed-off-by: Andreas Mohr <andi@lisas.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* i386: do not restore reserved memory after hibernationRafael J. Wysocki2007-07-21
| | | | | | | | | | | | | | | | | | | | | On some systems the ACPI NVS area is located in the first 1 MB of RAM and it is overwritten by the i386 code during the restore after hibernation. This confuses the ACPI platform firmware that doesn't update the AC adapter status appropriately as a result (http://bugzilla.kernel.org/show_bug.cgi?id=7995). The solution is to register the reserved memory in the first 1 MB as 'nosave', so that swsusp doesn't touch it during the restore. Also, this has been done on x86_64 for a long time now, so this patch makes the i386 restore code behave like the x86_64 one. [akpm@linux-foundation.org: build fix] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86: remove support for the Rise CPUAdrian Bunk2007-07-21
| | | | | | | | | | | | | | | | | | | | The Rise CPUs were only very short-lived, and there are no reports of anyone both owning one and running Linux on it. Googling for the printk string "CPU: Rise iDragon" didn't find any dmesg available online. If it turns out that against all expectations there are actually users reverting this patch would be easy. This patch will make the kernel images smaller by a few bytes for all i386 users. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Dave Jones <davej@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* i386: add reference to the argumentsAndrew Morton2007-07-21
| | | | | | | | | | | | Prevent stuff like this: mm/vmalloc.c: In function 'unmap_kernel_range': mm/vmalloc.c:75: warning: unused variable 'start' Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86: PM_TRACE supportNigel Cunningham2007-07-21
| | | | | | | | | | | Signed-off-by: Nigel Cunningham <nigel@nigel.suspend2.net> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Pavel Machek <pavel@ucw.cz> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* i386: fix machine rebootingTruxton Fulton2007-07-21
| | | | | | | | | | | | | | | Commit 59f4e7d572980a521b7bdba74ab71b21f5995538 fixed machine rebooting on Truxton's machine (when no keyboard was present). But it broke it on Lee's machine. The patch reinstates the old (pre-59f4e7d572980a521b7bdba74ab71b21f5995538) code and if that doesn't work out, try the new, post-59f4e7d572980a521b7bdba74ab71b21f5995538 code instead. Cc: Lee Garrett <lee-in-berlin@web.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* i386: minor nx handling adjustmentJan Beulich2007-07-21
| | | | | | | | Constrain __supported_pte_mask and NX handling to just the PAE kernel. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* x86: share hpet.h with i386Thomas Gleixner2007-07-21
| | | | | | | | | | | | | hpet.h in asm-i386 and asm-x86_64 contain tons of duplicated stuff. Consolidate into one shared header file. AK: Fix i386 compilation with !X86_IO_APIC Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* i386: remove pit_interrupt_hookChris Wright2007-07-21
| | | | | | | | | | Remove pit_interrupt_hook as it adds just an extra layer. Signed-off-by: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* i386: Move all simple string operations out of lineAndi Kleen2007-07-21
| | | | | | | | | | | | | | | | | The compiler generally generates reasonable inline code for the simple cases and for the rest it's better for code size for them to be out of line. Also there they can be potentially optimized more in the future. In fact they probably should be in a .S file because they're all pure assembly, but that's for another day. Also some code style cleanup on them while I was on it (this seems to be the last untouched really early Linux code) This saves ~12k text for a defconfig kernel with gcc 4.1. Signed-off-by: Andi Kleen <ak@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* NTP: move the cmos update code into ntp.cThomas Gleixner2007-07-21
| | | | | | | | | | | | | | | | i386 and sparc64 have the identical code to update the cmos clock. Move it into kernel/time/ntp.c as there are other architectures coming along with the same requirements. [akpm@linux-foundation.org: build fixes] Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Chris Wright <chrisw@sous-sol.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: john stultz <johnstul@us.ibm.com> Cc: David Miller <davem@davemloft.net> Cc: Roman Zippel <zippel@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* i386: Allow KVM on i386 nonpaeAvi Kivity2007-07-19
| | | | | | | | | | | | | | | Currently, CONFIG_X86_CMPXCHG64 both enables boot-time checking of the cmpxchg64b feature and enables compilation of the set_64bit() family. Since the option is dependent on PAE, and since KVM depends on set_64bit(), this effectively disables KVM on i386 nopae. Simplify by removing the config option altogether: the boot check is made dependent on CONFIG_X86_PAE directly, and the set_64bit() family is exposed without constraints. It is up to users to check for the feature flag (KVM does not as virtualiation extensions imply its existence). Signed-off-by: Avi Kivity <avi@qumranet.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] sched: sched_cacheflush is now unusedRalf Baechle2007-07-19
| | | | | | | | Since Ingo's recent scheduler rewrite which was merged as commit 0437e109e1841607f2988891eaa36c531c6aa6ac sched_cacheflush is unused. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* lguest: the host codeRusty Russell2007-07-19
| | | | | | | | | | | | | | | | This is the code for the "lg.ko" module, which allows lguest guests to be launched. [akpm@linux-foundation.org: update for futex-new-private-futexes] [akpm@linux-foundation.org: build fix] [jmorris@namei.org: lguest: use hrtimers] [akpm@linux-foundation.org: x86_64 build fix] Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: Andi Kleen <ak@suse.de> Cc: Eric Dumazet <dada1@cosmosbay.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* arch: personality independent stack topPeter Zijlstra2007-07-19
| | | | | | | | | | | | | | | | New arch macro STACK_TOP_MAX it gives the larges valid stack address for the architecture in question. It differs from STACK_TOP in that it will not distinguish between personalities but will always return the largest possible address. This is used to create the initial stack on execve, which we will move down to the proper location once the binfmt code has figured out where that is. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ollie Wild <aaw@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* define new percpu interface for shared dataFenghua Yu2007-07-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | per cpu data section contains two types of data. One set which is exclusively accessed by the local cpu and the other set which is per cpu, but also shared by remote cpus. In the current kernel, these two sets are not clearely separated out. This can potentially cause the same data cacheline shared between the two sets of data, which will result in unnecessary bouncing of the cacheline between cpus. One way to fix the problem is to cacheline align the remotely accessed per cpu data, both at the beginning and at the end. Because of the padding at both ends, this will likely cause some memory wastage and also the interface to achieve this is not clean. This patch: Moves the remotely accessed per cpu data (which is currently marked as ____cacheline_aligned_in_smp) into a different section, where all the data elements are cacheline aligned. And as such, this differentiates the local only data and remotely accessed data cleanly. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Christoph Lameter <clameter@sgi.com> Cc: <linux-arch@vger.kernel.org> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* jprobes: remove JPROBE_ENTRY()Michael Ellerman2007-07-19
| | | | | | | | | | | | | | | | AFAICT now that jprobe.entry is a void *, JPROBE_ENTRY doesn't do anything useful - so remove it .. I've left a do-nothing version so that out-of-tree jprobes code will still compile without modifications. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com> Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for_linus' of ↵Linus Torvalds2007-07-18
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: extent macros cleanup Fix compilation with EXT_DEBUG, also fix leXX_to_cpu conversions. ext4: remove extra IS_RDONLY() check ext4: Use is_power_of_2() Use zero_user_page() in ext4 where possible ext4: Remove 65000 subdirectory limit ext4: Expand extra_inodes space per the s_{want,min}_extra_isize fields ext4: Add nanosecond timestamps jbd2: Move jbd2-debug file to debugfs jbd2: Fix CONFIG_JBD_DEBUG ifdef to be CONFIG_JBD2_DEBUG ext4: Set the journal JBD2_FEATURE_INCOMPAT_64BIT on large devices ext4: Make extents code sanely handle on-disk corruption ext4: copy i_flags to inode flags on write ext4: Enable extents by default Change on-disk format to support 2^15 uninitialized extents write support for preallocated blocks fallocate support in ext4 sys_fallocate() implementation on i386, x86_64 and powerpc
| * sys_fallocate() implementation on i386, x86_64 and powerpcAmit Arora2007-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fallocate() is a new system call being proposed here which will allow applications to preallocate space to any file(s) in a file system. Each file system implementation that wants to use this feature will need to support an inode operation called ->fallocate(). Applications can use this feature to avoid fragmentation to certain level and thus get faster access speed. With preallocation, applications also get a guarantee of space for particular file(s) - even if later the the system becomes full. Currently, glibc provides an interface called posix_fallocate() which can be used for similar cause. Though this has the advantage of working on all file systems, but it is quite slow (since it writes zeroes to each block that has to be preallocated). Without a doubt, file systems can do this more efficiently within the kernel, by implementing the proposed fallocate() system call. It is expected that posix_fallocate() will be modified to call this new system call first and incase the kernel/filesystem does not implement it, it should fall back to the current implementation of writing zeroes to the new blocks. ToDos: 1. Implementation on other architectures (other than i386, x86_64, and ppc). Patches for s390(x) and ia64 are already available from previous posts, but it was decided that they should be added later once fallocate is in the mainline. Hence not including those patches in this take. 2. Changes to glibc, a) to support fallocate() system call b) to make posix_fallocate() and posix_fallocate64() call fallocate() Signed-off-by: Amit Arora <aarora@in.ibm.com>
* | xen: add the Xenbus sysfs and virtual device hotplug driverJeremy Fitzhardinge2007-07-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This communicates with the machine control software via a registry residing in a controlling virtual machine. This allows dynamic creation, destruction and modification of virtual device configurations (network devices, block devices and CPUS, to name some examples). [ Greg, would you mind giving this a review? Thanks -J ] Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Greg KH <greg@kroah.com>
* | xen: Core Xen implementationJeremy Fitzhardinge2007-07-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is a rollup of all the core pieces of the Xen implementation, including: - booting and setup - pagetable setup - privileged instructions - segmentation - interrupt flags - upcalls - multicall batching BOOTING AND SETUP The vmlinux image is decorated with ELF notes which tell the Xen domain builder what the kernel's requirements are; the domain builder then constructs the address space accordingly and starts the kernel. Xen has its own entrypoint for the kernel (contained in an ELF note). The ELF notes are set up by xen-head.S, which is included into head.S. In principle it could be linked separately, but it seems to provoke lots of binutils bugs. Because the domain builder starts the kernel in a fairly sane state (32-bit protected mode, paging enabled, flat segments set up), there's not a lot of setup needed before starting the kernel proper. The main steps are: 1. Install the Xen paravirt_ops, which is simply a matter of a structure assignment. 2. Set init_mm to use the Xen-supplied pagetables (analogous to the head.S generated pagetables in a native boot). 3. Reserve address space for Xen, since it takes a chunk at the top of the address space for its own use. 4. Call start_kernel() PAGETABLE SETUP Once we hit the main kernel boot sequence, it will end up calling back via paravirt_ops to set up various pieces of Xen specific state. One of the critical things which requires a bit of extra care is the construction of the initial init_mm pagetable. Because Xen places tight constraints on pagetables (an active pagetable must always be valid, and must always be mapped read-only to the guest domain), we need to be careful when constructing the new pagetable to keep these constraints in mind. It turns out that the easiest way to do this is use the initial Xen-provided pagetable as a template, and then just insert new mappings for memory where a mapping doesn't already exist. This means that during pagetable setup, it uses a special version of xen_set_pte which ignores any attempt to remap a read-only page as read-write (since Xen will map its own initial pagetable as RO), but lets other changes to the ptes happen, so that things like NX are set properly. PRIVILEGED INSTRUCTIONS AND SEGMENTATION When the kernel runs under Xen, it runs in ring 1 rather than ring 0. This means that it is more privileged than user-mode in ring 3, but it still can't run privileged instructions directly. Non-performance critical instructions are dealt with by taking a privilege exception and trapping into the hypervisor and emulating the instruction, but more performance-critical instructions have their own specific paravirt_ops. In many cases we can avoid having to do any hypercalls for these instructions, or the Xen implementation is quite different from the normal native version. The privileged instructions fall into the broad classes of: Segmentation: setting up the GDT and the GDT entries, LDT, TLS and so on. Xen doesn't allow the GDT to be directly modified; all GDT updates are done via hypercalls where the new entries can be validated. This is important because Xen uses segment limits to prevent the guest kernel from damaging the hypervisor itself. Traps and exceptions: Xen uses a special format for trap entrypoints, so when the kernel wants to set an IDT entry, it needs to be converted to the form Xen expects. Xen sets int 0x80 up specially so that the trap goes straight from userspace into the guest kernel without going via the hypervisor. sysenter isn't supported. Kernel stack: The esp0 entry is extracted from the tss and provided to Xen. TLB operations: the various TLB calls are mapped into corresponding Xen hypercalls. Control registers: all the control registers are privileged. The most important is cr3, which points to the base of the current pagetable, and we handle it specially. Another instruction we treat specially is CPUID, even though its not privileged. We want to control what CPU features are visible to the rest of the kernel, and so CPUID ends up going into a paravirt_op. Xen implements this mainly to disable the ACPI and APIC subsystems. INTERRUPT FLAGS Xen maintains its own separate flag for masking events, which is contained within the per-cpu vcpu_info structure. Because the guest kernel runs in ring 1 and not 0, the IF flag in EFLAGS is completely ignored (and must be, because even if a guest domain disables interrupts for itself, it can't disable them overall). (A note on terminology: "events" and interrupts are effectively synonymous. However, rather than using an "enable flag", Xen uses a "mask flag", which blocks event delivery when it is non-zero.) There are paravirt_ops for each of cli/sti/save_fl/restore_fl, which are implemented to manage the Xen event mask state. The only thing worth noting is that when events are unmasked, we need to explicitly see if there's a pending event and call into the hypervisor to make sure it gets delivered. UPCALLS Xen needs a couple of upcall (or callback) functions to be implemented by each guest. One is the event upcalls, which is how events (interrupts, effectively) are delivered to the guests. The other is the failsafe callback, which is used to report errors in either reloading a segment register, or caused by iret. These are implemented in i386/kernel/entry.S so they can jump into the normal iret_exc path when necessary. MULTICALL BATCHING Xen provides a multicall mechanism, which allows multiple hypercalls to be issued at once in order to mitigate the cost of trapping into the hypervisor. This is particularly useful for context switches, since the 4-5 hypercalls they would normally need (reload cr3, update TLS, maybe update LDT) can be reduced to one. This patch implements a generic batching mechanism for hypercalls, which gets used in many places in the Xen code. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Ian Pratt <ian.pratt@xensource.com> Cc: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Cc: Adrian Bunk <bunk@stusta.de>
* | xen: Add Xen interface header filesJeremy Fitzhardinge2007-07-18
| | | | | | | | | | | | | | | | | | | | | | | | | | Add Xen interface header files. These are taken fairly directly from the Xen tree, but somewhat rearranged to suit the kernel's conventions. Define macros and inline functions for doing hypercalls into the hypervisor. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Ian Pratt <ian.pratt@xensource.com> Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
* | Add a sched_clock paravirt_opJeremy Fitzhardinge2007-07-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tsc-based get_scheduled_cycles interface is not a good match for Xen's runstate accounting, which reports everything in nanoseconds. This patch replaces this interface with a sched_clock interface, which matches both Xen and VMI's requirements. In order to do this, we: 1. replace get_scheduled_cycles with sched_clock 2. hoist cycles_2_ns into a common header 3. update vmi accordingly One thing to note: because sched_clock is implemented as a weak function in kernel/sched.c, we must define a real function in order to override this weak binding. This means the usual paravirt_ops technique of using an inline function won't work in this case. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Zachary Amsden <zach@vmware.com> Cc: Dan Hecht <dhecht@vmware.com> Cc: john stultz <johnstul@us.ibm.com>
* | paravirt: helper to disable all IO spaceJeremy Fitzhardinge2007-07-18
| | | | | | | | | | | | | | | | | | | | | | | | In a virtual environment, device drivers such as legacy IDE will waste quite a lot of time probing for their devices which will never appear. This helper function allows a paravirt implementation to lay claim to the whole iomem and ioport space, thereby disabling all device drivers trying to claim IO resources. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org> Cc: Rusty Russell <rusty@rustcorp.com.au>
* | paravirt: make siblingmap functions visibleJeremy Fitzhardinge2007-07-18
| | | | | | | | | | | | Paravirt implementations need to set the sibling map on new cpus. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
* | paravirt: unstatic smp_store_cpu_infoJeremy Fitzhardinge2007-07-18
| | | | | | | | | | | | Paravirt implementations need to store cpu info when bringing up cpus. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
* | paravirt: unstatic leave_mmJeremy Fitzhardinge2007-07-18
| | | | | | | | | | | | | | | | Make globally leave_mm visible, specifically so that Xen can use it to shoot-down lazy uses of cr3. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Signed-off-by: Chris Wright <chrisw@sous-sol.org>
* | paravirt: increase IRQ limitJeremy Fitzhardinge2007-07-18
| | | | | | | | | | | | | | | | When running with CONFIG_PARAVIRT, we may want lots of IRQs even if there's no IO APIC. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
* | paravirt: add a hook for once the allocator is readyJeremy Fitzhardinge2007-07-18
| | | | | | | | | | | | | | | | | | Add a hook so that the paravirt backend knows when the allocator is ready. This is useful for the obvious reason that the allocator is available, but the other side-effect of having the bootmem allocator available is that each page now has an associated "struct page". Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
* | paravirt: add an "mm" argument to alloc_ptJeremy Fitzhardinge2007-07-18
|/ | | | | | | It's useful to know which mm is allocating a pagetable. Xen uses this to determine whether the pagetable being added to is pinned or not. Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
* fbdev: detect primary display deviceAntonino A. Daplas2007-07-17
| | | | | | | | | | | | | Add function helper, fb_is_primary_device(). Given struct fb_info, it will return a nonzero value if the device is the primary display. Currently, only the i386 is supported where the function checks for the IORESOURCE_ROM_SHADOW flag. Signed-off-by: Antonino Daplas <adaplas@gmail.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fbdev: move arch-specific bits to their respective subdirectoriesAntonino A. Daplas2007-07-17
| | | | | | | | | | | | Move arch-specific bits of fb_mmap() to their respective subdirectories [bob.picco@hp.com: efi_range_is_wc is referenced but not declared] [bunk@stusta.de: fix include/asm-m68k/fb.h] Signed-off-by: Antonino Daplas <adaplas@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Add __GFP_MOVABLE for callers to flag allocations from high memory that may ↵Mel Gorman2007-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | be migrated It is often known at allocation time whether a page may be migrated or not. This patch adds a flag called __GFP_MOVABLE and a new mask called GFP_HIGH_MOVABLE. Allocations using the __GFP_MOVABLE can be either migrated using the page migration mechanism or reclaimed by syncing with backing storage and discarding. An API function very similar to alloc_zeroed_user_highpage() is added for __GFP_MOVABLE allocations called alloc_zeroed_user_highpage_movable(). The flags used by alloc_zeroed_user_highpage() are not changed because it would change the semantics of an existing API. After this patch is applied there are no in-kernel users of alloc_zeroed_user_highpage() so it probably should be marked deprecated if this patch is merged. Note that this patch includes a minor cleanup to the use of __GFP_ZERO in shmem.c to keep all flag modifications to inode->mapping in the shmem_dir_alloc() helper function. This clean-up suggestion is courtesy of Hugh Dickens. Additional credit goes to Christoph Lameter and Linus Torvalds for shaping the concept. Credit to Hugh Dickens for catching issues with shmem swap vector and ramfs allocations. [akpm@linux-foundation.org: build fix] [hugh@veritas.com: __GFP_ZERO cleanup] Signed-off-by: Mel Gorman <mel@csn.ul.ie> Cc: Andy Whitcroft <apw@shadowen.org> Cc: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: remove ptep_test_and_clear_dirty and ptep_clear_flush_dirtyMartin Schwidefsky2007-07-17
| | | | | | | | | | Nobody is using ptep_test_and_clear_dirty and ptep_clear_flush_dirty. Remove the functions from all architectures. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: remove ptep_establish()Martin Schwidefsky2007-07-17
| | | | | | | | | | The last user of ptep_establish in mm/ is long gone. Remove the architecture primitive as well. Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* make seccomp zerocost in scheduleAndrea Arcangeli2007-07-16
| | | | | | | | | | | | This follows a suggestion from Chuck Ebbert on how to make seccomp absolutely zerocost in schedule too. The only remaining footprint of seccomp is in terms of the bzImage size that becomes a few bytes (perhaps even a few kbytes) larger, measure it if you care in the embedded. Signed-off-by: Andrea Arcangeli <andrea@cpushare.com> Cc: Andi Kleen <ak@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fix jvc cdrom drive lockupZhang, Yanmin2007-07-16
| | | | | | | | | | | | | | | | | Before calling init_hwif_default, ide_unregister gets lock ide_lock and disables irq. init_hwif_default calls ide_default_io_base which calls pci_get_device and later pci_get_subsys tries to apply for semaphore pci_bus_sem and goes to sleep. Mostly, pci_get_device should be called when irq is turned on. ide_default_io_base just needs find if list pci_devices is empty. Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Cc: Greg KH <greg@kroah.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* kill vmalloc_earlyreserveJan Beulich2007-07-16
| | | | | | | | This symbol got orphaned quite a while ago. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* page table handling cleanupJan Beulich2007-07-16
| | | | | | | | | | | | Kill pte_rdprotect(), pte_exprotect(), pte_mkread(), pte_mkexec(), pte_read(), pte_exec(), and pte_user() except where arch-specific code is making use of them. Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* serial: convert early_uart to earlycon for 8250Yinghai Lu2007-07-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Beacuse SERIAL_PORT_DFNS is removed from include/asm-i386/serial.h and include/asm-x86_64/serial.h. the serial8250_ports need to be probed late in serial initializing stage. the console_init=>serial8250_console_init=> register_console=>serial8250_console_setup will return -ENDEV, and console ttyS0 can not be enabled at that time. need to wait till uart_add_one_port in drivers/serial/serial_core.c to call register_console to get console ttyS0. that is too late. Make early_uart to use early_param, so uart console can be used earlier. Make it to be bootconsole with CON_BOOT flag, so can use console handover feature. and it will switch to corresponding normal serial console automatically. new command line will be: console=uart8250,io,0x3f8,9600n8 console=uart8250,mmio,0xff5e0000,115200n8 or earlycon=uart8250,io,0x3f8,9600n8 earlycon=uart8250,mmio,0xff5e0000,115200n8 it will print in very early stage: Early serial console at I/O port 0x3f8 (options '9600n8') console [uart0] enabled later for console it will print: console handover: boot [uart0] -> real [ttyS0] Signed-off-by: <yinghai.lu@sun.com> Cc: Andi Kleen <ak@suse.de> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Gerd Hoffmann <kraxel@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6Linus Torvalds2007-07-12
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (34 commits) PCI: Only build PCI syscalls on architectures that want them PCI: limit pci_get_bus_and_slot to domain 0 PCI: hotplug: acpiphp: avoid acpiphp "cannot get bridge info" PCI hotplug failure PCI: hotplug: acpiphp: remove hot plug parameter write to PCI host bridge PCI: hotplug: acpiphp: fix slot poweroff problem on systems without _PS3 PCI: hotplug: pciehp: wait for 1 second after power off slot PCI: pci_set_power_state(): check for PM capabilities earlier PCI: cpci_hotplug: Convert to use the kthread API PCI: add pci_try_set_mwi PCI: pcie: remove SPIN_LOCK_UNLOCKED PCI: ROUND_UP macro cleanup in drivers/pci PCI: remove pci_dac_dma_... APIs PCI: pci-x-pci-express-read-control-interfaces cleanups PCI: Fix typo in include/linux/pci.h PCI: pci_ids, remove double or more empty lines PCI: pci_ids, add atheros and 3com_2 vendors PCI: pci_ids, reorder some entries PCI: i386: traps, change VENDOR to DEVICE PCI: ATM: lanai, change VENDOR to DEVICE PCI: Change all drivers to use pci_device->revision ...
| * PCI: remove pci_dac_dma_... APIsJan Beulich2007-07-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Based on replies to a respective query, remove the pci_dac_dma_...() APIs (except for pci_dac_dma_supported() on Alpha, where this function is used in non-DAC PCI DMA code). Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Andi Kleen <ak@suse.de> Cc: Jesse Barnes <jesse.barnes@intel.com> Cc: Christoph Hellwig <hch@infradead.org> Acked-by: David Miller <davem@davemloft.net> Cc: Jeff Garzik <jeff@garzik.org> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
| * PCI: Use a weak symbol for the empty version of pcibios_add_platform_entries()Michael Ellerman2007-07-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I'm not sure if this is going to fly, weak symbols work on the compilers I'm using, but whether they work for all of the affected architectures I can't say. I've cc'ed as many arch maintainers/lists as I could find. But assuming they do, we can use a weak empty definition of pcibios_add_platform_entries() to avoid having an empty definition on every arch. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* | Make struct boot_params a real structure, and remove obsolete fieldsH. Peter Anvin2007-07-12
| | | | | | | | | | | | | | | | | | | | Make struct boot_params a real structure, and remove the handling of some obsolete fields, in particular hd*_info, which was only used by the ST-506 driver, and likely to be wrong for that driver on any modern BIOS. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Make definitions for struct e820entry and struct e820map consistentH. Peter Anvin2007-07-12
| | | | | | | | | | | | | | | | Make definitions for struct e820entry and struct e820map consistent between i386 and x86-64. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Use a new CPU feature word to cover features that are spread aroundVenki Pallipadi2007-07-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some Intel features are spread around in different CPUID leafs like 0x5, 0x6 and 0xA. Make this feature detection code common across i386 and x86_64. Display Intel Dynamic Acceleration feature in /proc/cpuinfo. This feature will be enabled automatically by current acpi-cpufreq driver. Refer to Intel Software Developer's Manual for more details about the feature. Thanks to hpa (H Peter Anvin) for the making the actual code detecting the scattered features data-driven. Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>