aboutsummaryrefslogtreecommitdiffstats
path: root/include/linux
Commit message (Collapse)AuthorAge
* tracing: Kernel TracepointsMathieu Desnoyers2008-10-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implementation of kernel tracepoints. Inspired from the Linux Kernel Markers. Allows complete typing verification by declaring both tracing statement inline functions and probe registration/unregistration static inline functions within the same macro "DEFINE_TRACE". No format string is required. See the tracepoint Documentation and Samples patches for usage examples. Taken from the documentation patch : "A tracepoint placed in code provides a hook to call a function (probe) that you can provide at runtime. A tracepoint can be "on" (a probe is connected to it) or "off" (no probe is attached). When a tracepoint is "off" it has no effect, except for adding a tiny time penalty (checking a condition for a branch) and space penalty (adding a few bytes for the function call at the end of the instrumented function and adds a data structure in a separate section). When a tracepoint is "on", the function you provide is called each time the tracepoint is executed, in the execution context of the caller. When the function provided ends its execution, it returns to the caller (continuing from the tracepoint site). You can put tracepoints at important locations in the code. They are lightweight hooks that can pass an arbitrary number of parameters, which prototypes are described in a tracepoint declaration placed in a header file." Addition and removal of tracepoints is synchronized by RCU using the scheduler (and preempt_disable) as guarantees to find a quiescent state (this is really RCU "classic"). The update side uses rcu_barrier_sched() with call_rcu_sched() and the read/execute side uses "preempt_disable()/preempt_enable()". We make sure the previous array containing probes, which has been scheduled for deletion by the rcu callback, is indeed freed before we proceed to the next update. It therefore limits the rate of modification of a single tracepoint to one update per RCU period. The objective here is to permit fast batch add/removal of probes on _different_ tracepoints. Changelog : - Use #name ":" #proto as string to identify the tracepoint in the tracepoint table. This will make sure not type mismatch happens due to connexion of a probe with the wrong type to a tracepoint declared with the same name in a different header. - Add tracepoint_entry_free_old. - Change __TO_TRACE to get rid of the 'i' iterator. Masami Hiramatsu <mhiramat@redhat.com> : Tested on x86-64. Performance impact of a tracepoint : same as markers, except that it adds about 70 bytes of instructions in an unlikely branch of each instrumented function (the for loop, the stack setup and the function call). It currently adds a memory read, a test and a conditional branch at the instrumentation site (in the hot path). Immediate values will eventually change this into a load immediate, test and branch, which removes the memory read which will make the i-cache impact smaller (changing the memory read for a load immediate removes 3-4 bytes per site on x86_32 (depending on mov prefixes), or 7-8 bytes on x86_64, it also saves the d-cache hit). About the performance impact of tracepoints (which is comparable to markers), even without immediate values optimizations, tests done by Hideo Aoki on ia64 show no regression. His test case was using hackbench on a kernel where scheduler instrumentation (about 5 events in code scheduler code) was added. Quoting Hideo Aoki about Markers : I evaluated overhead of kernel marker using linux-2.6-sched-fixes git tree, which includes several markers for LTTng, using an ia64 server. While the immediate trace mark feature isn't implemented on ia64, there is no major performance regression. So, I think that we don't have any issues to propose merging marker point patches into Linus's tree from the viewpoint of performance impact. I prepared two kernels to evaluate. The first one was compiled without CONFIG_MARKERS. The second one was enabled CONFIG_MARKERS. I downloaded the original hackbench from the following URL: http://devresources.linux-foundation.org/craiger/hackbench/src/hackbench.c I ran hackbench 5 times in each condition and calculated the average and difference between the kernels. The parameter of hackbench: every 50 from 50 to 800 The number of CPUs of the server: 2, 4, and 8 Below is the results. As you can see, major performance regression wasn't found in any case. Even if number of processes increases, differences between marker-enabled kernel and marker- disabled kernel doesn't increase. Moreover, if number of CPUs increases, the differences doesn't increase either. Curiously, marker-enabled kernel is better than marker-disabled kernel in more than half cases, although I guess it comes from the difference of memory access pattern. * 2 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 4.811 | 4.872 | +0.061 | +1.27 | 100 | 9.854 | 10.309 | +0.454 | +4.61 | 150 | 15.602 | 15.040 | -0.562 | -3.6 | 200 | 20.489 | 20.380 | -0.109 | -0.53 | 250 | 25.798 | 25.652 | -0.146 | -0.56 | 300 | 31.260 | 30.797 | -0.463 | -1.48 | 350 | 36.121 | 35.770 | -0.351 | -0.97 | 400 | 42.288 | 42.102 | -0.186 | -0.44 | 450 | 47.778 | 47.253 | -0.526 | -1.1 | 500 | 51.953 | 52.278 | +0.325 | +0.63 | 550 | 58.401 | 57.700 | -0.701 | -1.2 | 600 | 63.334 | 63.222 | -0.112 | -0.18 | 650 | 68.816 | 68.511 | -0.306 | -0.44 | 700 | 74.667 | 74.088 | -0.579 | -0.78 | 750 | 78.612 | 79.582 | +0.970 | +1.23 | 800 | 85.431 | 85.263 | -0.168 | -0.2 | -------------------------------------------------------------- * 4 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 2.586 | 2.584 | -0.003 | -0.1 | 100 | 5.254 | 5.283 | +0.030 | +0.56 | 150 | 8.012 | 8.074 | +0.061 | +0.76 | 200 | 11.172 | 11.000 | -0.172 | -1.54 | 250 | 13.917 | 14.036 | +0.119 | +0.86 | 300 | 16.905 | 16.543 | -0.362 | -2.14 | 350 | 19.901 | 20.036 | +0.135 | +0.68 | 400 | 22.908 | 23.094 | +0.186 | +0.81 | 450 | 26.273 | 26.101 | -0.172 | -0.66 | 500 | 29.554 | 29.092 | -0.461 | -1.56 | 550 | 32.377 | 32.274 | -0.103 | -0.32 | 600 | 35.855 | 35.322 | -0.533 | -1.49 | 650 | 39.192 | 38.388 | -0.804 | -2.05 | 700 | 41.744 | 41.719 | -0.025 | -0.06 | 750 | 45.016 | 44.496 | -0.520 | -1.16 | 800 | 48.212 | 47.603 | -0.609 | -1.26 | -------------------------------------------------------------- * 8 CPUs Number of | without | with | diff | diff | processes | Marker [Sec] | Marker [Sec] | [Sec] | [%] | -------------------------------------------------------------- 50 | 2.094 | 2.072 | -0.022 | -1.07 | 100 | 4.162 | 4.273 | +0.111 | +2.66 | 150 | 6.485 | 6.540 | +0.055 | +0.84 | 200 | 8.556 | 8.478 | -0.078 | -0.91 | 250 | 10.458 | 10.258 | -0.200 | -1.91 | 300 | 12.425 | 12.750 | +0.325 | +2.62 | 350 | 14.807 | 14.839 | +0.032 | +0.22 | 400 | 16.801 | 16.959 | +0.158 | +0.94 | 450 | 19.478 | 19.009 | -0.470 | -2.41 | 500 | 21.296 | 21.504 | +0.208 | +0.98 | 550 | 23.842 | 23.979 | +0.137 | +0.57 | 600 | 26.309 | 26.111 | -0.198 | -0.75 | 650 | 28.705 | 28.446 | -0.259 | -0.9 | 700 | 31.233 | 31.394 | +0.161 | +0.52 | 750 | 34.064 | 33.720 | -0.344 | -1.01 | 800 | 36.320 | 36.114 | -0.206 | -0.57 | -------------------------------------------------------------- Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Acked-by: Masami Hiramatsu <mhiramat@redhat.com> Acked-by: 'Peter Zijlstra' <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* Merge phase #5 (misc) of ↵Linus Torvalds2008-10-13
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip Merges oprofile, timers/hpet, x86/traps, x86/time, and x86/core misc items. * 'x86-core-v4-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (132 commits) x86: change early_ioremap to use slots instead of nesting x86: adjust dependencies for CONFIG_X86_CMOV dumpstack: x86: various small unification steps, fix x86: remove additional_cpus x86: remove additional_cpus configurability x86: improve UP kernel when CPU-hotplug and SMP is enabled dumpstack: x86: various small unification steps dumpstack: i386: make kstack= an early boot-param and add oops=panic dumpstack: x86: use log_lvl and unify trace formatting dumptrace: x86: consistently include loglevel, print stack switch dumpstack: x86: add "end" parameter to valid_stack_ptr and print_context_stack dumpstack: x86: make printk_address equal dumpstack: x86: move die_nmi to dumpstack_32.c traps: x86: finalize unification of traps.c traps: x86: make traps_32.c and traps_64.c equal traps: x86: various noop-changes preparing for unification of traps_xx.c traps: x86_64: use task_pid_nr(tsk) instead of tsk->pid in do_general_protection traps: i386: expand clear_mem_error and remove from mach_traps.h traps: x86_64: make io_check_error equal to the one on i386 traps: i386: use preempt_conditional_sti/cli in do_int3 ...
| *-. Merge branches 'oprofile-v2' and 'timers/hpet' into x86/core-v4Ingo Molnar2008-10-13
| |\ \
| | | * Merge commit 'v2.6.27' into timers/hpetIngo Molnar2008-10-10
| | | |\
| | | * \ Merge commit 'v2.6.27-rc6' into timers/hpetIngo Molnar2008-09-14
| | | |\ \
| | | * | | hpet: /dev/hpet - fixes and cleanupDavid Brownell2008-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Minor /dev/hpet updates and bugfixes: * Remove dead code, mostly remnants of an incomplete/unusable kernel interface ... noted when addressing "sparse" warnings: + hpet_unregister() and a routine it calls + hpet_task and all references, including hpet_task_lock + hpet_data.hd_flags (and HPET_DATA_PLATFORM) * Correct and improve boot message: + displays *counter* (shared between comparators) bit width, not *timer* bit widths (which are often mixed) + relabel "timers" as "comparators"; this is less confusing, they are not independent like normal timers are (sigh) + display MHz not Hz; it's never less than 10 MHz. * Tighten and correct the userspace interface code + don't accidentally program comparators in 64-bit mode using 32-bit values ... always force comparators into 32-bit mode + provide the correct bit definition flagging comparators with periodic capability ... the ABI is unchanged * Update Documentation/hpet.txt + be more correct and current + expand description a bit + don't mention that now-gone kernel interface Plus, add a FIXME comment for something that could cause big trouble on systems with more capable HPETs than at least Intel seems to ship. It seems that few folk use this userspace interface; it's not very usable given the general lack of HPET IRQ routing. I'm told that the only real point of it any more is to mmap for fast timestamps; IMO that's handled better through the gettimeofday() vsyscall. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | * | | Merge branch 'linus' into timers/hpetIngo Molnar2008-07-31
| | | |\ \ \
| | | * \ \ \ Merge branch 'linus' into timers/hpetIngo Molnar2008-06-16
| | | |\ \ \ \
| | | * | | | | x86: get irq for hpet timerKevin Hao2008-06-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | HPET timer's IRQ is 0 by default. So we have to select which irq will be used by these timers. We wait to set the timer's irq until we really open it in order to reduce the chance of conflicting with other device. Signed-off-by: Kevin Hao <kexin.hao@windriver.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | Merge branch 'linus' into oprofile-v2Ingo Molnar2008-10-13
| | |\ \ \ \ \ \ | | |/ / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/x86/kernel/apic_32.c arch/x86/oprofile/nmi_int.c include/linux/pci_ids.h
| | * | | | | | OProfile: add IBS code macrosRobert Richter2008-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: oprofile-list <oprofile-list@lists.sourceforge.net> Cc: Barry Kasindorf <barry.kasindorf@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | x86: add PCI IDs for AMD Barcelona PCI devicesRobert Richter2008-07-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Robert Richter <robert.richter@amd.com> Cc: oprofile-list <oprofile-list@lists.sourceforge.net> Cc: Barry Kasindorf <barry.kasindorf@amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | | | | | tty: some ICANON magic is in the wrong placesAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the set up on ldisc change into the ldisc Move the INQ/OUTQ cases into the driver not in shared ioctl code where it gives bogus answers for other ldisc values Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | Add an instance parameter devpts interfacesSukadev Bhattiprolu2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass-in 'inode' or 'tty' parameter to devpts interfaces. With multiple devpts instances, these parameters will be used in subsequent patches to identify the instance of devpts mounted. The parameters also help simplify devpts implementation. Changelog[v3]: - minor changes due to merge with ttydev updates - rename parameters to emphasize they are ptmx or pts inodes - pass-in tty_struct * to devpts_pty_kill() (this will help cleanup the get_node() call in a subsequent patch) Signed-off-by: Sukadev Bhattiprolu <sukadev@us.ibm.com> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | tty: extract the pty init time special casesAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The majority of the remaining init_dev code is pty special cases. We refactor this code into the driver->install method. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | tty: Finish fixing up the init_dev interface to use ERR_PTRAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Original suggestion and proposal from Sukadev Bhattiprolu. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | tty: More driver operationsAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have the lookup operation abstracted which is nice for pty cleanup but we really want to abstract the add/remove entries as well so that we can pull the pty code out of the tty core and create a clear defined interface for the tty driver table. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | tty: kref the tty driver objectAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | tty: Clean up the tty_init_dev changes furtherAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix up the naming, style and extract some bits of code into the driver specific code Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | tty: Remove more special casing and out of place codeAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Carry on pushing code out of tty_io when it belongs to other drivers. I'm not 100% happy with some of this and it will be worth revisiting some of the exports later when the restructuring work is done. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | tty: shutdown methodAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Right now there are various drivers that try to use tty->count to know when they get the final close. Aristeau Rozanski showed while debugging the vt sysfs race that this isn't entirely safe. Instead of driver side tricks to work around this introduce a shutdown which is called when the tty is being destructed. This also means that the shutdown method is tied into the refcounting. Use this to rework the console close/sysfs logic. Remove lots of special case code from the tty core code. The pty code can now have a shutdown() method that replaces the special case hackery in the tree free up paths. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | tty: the vhangup syscall is racyAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We now have the infrastructure to sort this out but rather than teaching the syscall tty lock rules we move the hard work into a tty helper Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | tty: usb-serial krefsAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use kref in the USB serial drivers so that we don't free tty structures from under the URB receive handlers as has historically been the case if you were unlucky. This also gives us a framework for general tty drivers to use tty_port objects and refcount. Contains two err->dev_err changes merged together to fix clashes in the -next tree. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | tty: Add termioxAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need a way to describe the various additional modes and flow control features that random weird hardware shows up and software such as wine wants to emulate as Windows supports them. TCGETX/TCSETX and the termiox ioctl are a SYS5 extension that we might as well adopt. This patches adds the structures and the basic ioctl interfaces when the TCGETX etc defines are added for an architecture. Drivers wishing to use this stuff need to add new methods. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | tty: Cris has a nice RS485 ioctl so we should steal itAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | JP Tosoni observed: "About a RS485 ioctl: could you consider the attached files which are already in the Linux kernel (in include/asm-cris). They define a TIOCSERSETRS485 (ioctl.h), and the data structure (rs485.h) with allows to specify timings. Sounds just like what we want ?" and he's right: sort of. Rework the structure to use flag bits and make the time delay a fixed sized field so we don't get 32/64bit problems. Add the ioctls to x86 so that people know what to add to their platform of choice. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | tty: Add a kref countAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a kref to the tty structure and use it to protect the tty->signal tty references. For now we don't introduce it for anything else. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | pps: Reserve a line discipline number for PPSAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new line discipline for "pulse per second" devices connected to a serial port. Signed-off-by: Rodolfo Giometti <giometti@linux.it> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | tty: split the buffering from tty_ioAlan Cox2008-10-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The two are basically independent chunks of code so lets split them up for readability and sanity. It also makes the API boundaries much clearer. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | serial: Make uart_port's ioport "unsigned long".David Miller2008-10-13
|/ / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise the top 32-bits of the resource value get chopped off on 64-bit systems, and the resulting I/O accesses go to random places. Thanks to testing and debugging by Josip Rodin, which helped track this down. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | Merge branch 'for_linus' of ↵Linus Torvalds2008-10-12
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix kconfig typo and extra whitespace ext4: fix build failure without procfs ext4: add an option to control error handling on file data jbd2: don't dirty original metadata buffer on abort ext4: add checks for errors from jbd2 jbd2: fix error handling for checkpoint io jbd2: abort when failed to log metadata buffers
| * | | | | | | ext4: add an option to control error handling on file dataHidehiro Kawai2008-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the journal doesn't abort when it gets an IO error in file data blocks, the file data corruption will spread silently. Because most of applications and commands do buffered writes without fsync(), they don't notice the IO error. It's scary for mission critical systems. On the other hand, if the journal aborts whenever it gets an IO error in file data blocks, the system will easily become inoperable. So this patch introduces a filesystem option to determine whether it aborts the journal or just call printk() when it gets an IO error in file data. If you mount an ext4 fs with data_err=abort option, it aborts on file data write error. If you mount it with data_err=ignore, it doesn't abort, just call printk(). data_err=ignore is the default. Here is the corresponding patch of the ext3 version: http://kerneltrap.org/mailarchive/linux-kernel/2008/9/9/3239374 Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
| * | | | | | | jbd2: fix error handling for checkpoint ioHidehiro Kawai2008-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a checkpointing IO fails, current JBD2 code doesn't check the error and continue journaling. This means latest metadata can be lost from both the journal and filesystem. This patch leaves the failed metadata blocks in the journal space and aborts journaling in the case of jbd2_log_do_checkpoint(). To achieve this, we need to do: 1. don't remove the failed buffer from the checkpoint list where in the case of __try_to_free_cp_buf() because it may be released or overwritten by a later transaction 2. jbd2_log_do_checkpoint() is the last chance, remove the failed buffer from the checkpoint list and abort the journal 3. when checkpointing fails, don't update the journal super block to prevent the journaled contents from being cleaned. For safety, don't update j_tail and j_tail_sequence either 4. when checkpointing fails, notify this error to the ext4 layer so that ext4 don't clear the needs_recovery flag, otherwise the journaled contents are ignored and cleaned in the recovery phase 5. if the recovery fails, keep the needs_recovery flag 6. prevent jbd2_cleanup_journal_tail() from being called between __jbd2_journal_drop_transaction() and jbd2_journal_abort() (a possible race issue between jbd2_log_do_checkpoint()s called by jbd2_journal_flush() and __jbd2_log_wait_for_space()) Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* | | | | | | | Merge branch 'x86-core-v2-for-linus' of ↵Linus Torvalds2008-10-12
|\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip This merges in: x86/build, x86/microcode, x86/spinlocks, x86/memory-corruption-check, x86/early-printk, x86/xsave, x86/quirks, x86/setup, x86/signal, core/signal, x86/urgent, x86/xen * 'x86-core-v2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (142 commits) x86: make processor type select depend on CONFIG_EMBEDDED x86: extend processor type select help text x86, amd-iommu: propagate PCI device enabling error warnings: fix arch/x86/kernel/io_apic_64.c warnings: fix arch/x86/kernel/early_printk.c x86, fpu: check __clear_user() return value x86: memory corruption check - cleanup x86: ioperm user_regset xen: do not reserve 2 pages of padding between hypervisor and fixmap. xen: use spin_lock_nest_lock when pinning a pagetable x86: xsave: set FP, SSE bits in the xsave header in the user sigcontext x86: xsave: fix error condition in save_i387_xstate() x86: SB450: deprioritize DMI quirks x86: SB450: skip IRQ0 override if it is not routed to INT2 of IOAPIC x86: replace a magic number with a named constant in the VESA boot code x86 setup: remove IMAGE_OFFSET x86 setup: remove DEF_INITSEG and DEF_SETUPSEG Revert "x86: fix ghost EDD devices in /sys again" x86 setup: fix ghost entries under /sys/firmware/edd take 3 x86: signal: remove indent in restore_sigcontext() ...
| | \ \ \ \ \ \ \
| | \ \ \ \ \ \ \
| | \ \ \ \ \ \ \
| | \ \ \ \ \ \ \
| | \ \ \ \ \ \ \
| | \ \ \ \ \ \ \
| | \ \ \ \ \ \ \
| | \ \ \ \ \ \ \
| | \ \ \ \ \ \ \
| | \ \ \ \ \ \ \
| | \ \ \ \ \ \ \
| | \ \ \ \ \ \ \
| | \ \ \ \ \ \ \
| | \ \ \ \ \ \ \
| *-------------. \ \ \ \ \ \ \ Merge branches 'x86/xen', 'x86/build', 'x86/microcode', 'x86/mm-debug-v2', ↵Ingo Molnar2008-10-12
| |\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ | | |_|_|_|_|_|_|_|/ / / / / / / | |/| | | | | | | | | | | | / / | | | | |_|_|_|_|_|_|_|_|_|/ / | | | |/| | | | | | | | | | | 'x86/memory-corruption-check', 'x86/early-printk', 'x86/xsave', 'x86/ptrace-v2', 'x86/quirks', 'x86/setup', 'x86/spinlocks' and 'x86/signal' into x86/core-v2
| | | | | | | | * | | | | | | x86: ioperm user_regsetRoland McGrath2008-10-12
| | |_|_|_|_|_|/ / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a user_regset type for the x86 io permissions bitmap. This makes it appear in core dumps (when ioperm has been used). It will also make it visible to debuggers in the future. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> [conflict resolutions: Signed-off-by: Ingo Molnar <mingo@elte.hu> ]
| | | | | | | * | | | | | | usb: move ehci reg defYinghai Lu2008-07-26
| | | | | | | | |/ / / / / | | | | | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | prepare x86: usb debug port early console move ehci struct def to linux/usrb/ehci_def.h from host/ehci.h Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: "Arjan van de Ven" <arjan@infradead.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: "Greg KH" <greg@kroah.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | | | | * | | | | | | x86: memory corruption check - cleanupIngo Molnar2008-10-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the prototypes from the generic kernel.h header to the more appropriate include/asm-x86/bios_ebda.h header file. Also, remove the check from the power management code - this is a pure x86 matter for now. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | | | | * | | | | | | Merge branch 'linus' into x86/memory-corruption-checkIngo Molnar2008-10-12
| | | | | | |\ \ \ \ \ \ \ | | |_|_|_|_|/ / / / / / / | |/| | | | | | | | | | |
| | | | | | * | | | | | | Merge commit 'v2.6.27-rc6' into x86/memory-corruption-checkIngo Molnar2008-09-16
| | | | | | |\ \ \ \ \ \ \ | | | | |_|_|/ / / / / / / | | | |/| | | | | | | | |
| | | | | | * | | | | | | x86: fix compile error with corruption checking disabledJeremy Fitzhardinge2008-09-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix compile error: arch/x86/mm/init_32.c: In function 'mem_init': arch/x86/mm/init_32.c:908: error: implicit declaration of function 'start_periodic_check_for_corruption' Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | | | | * | | | | | | x86: add periodic corruption checkHugh Dickins2008-09-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Perodically check for corruption in low phusical memory. Don't bother checking at fault time, since it won't show anything useful. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | | | | * | | | | | | x86: check for and defend against BIOS memory corruptionJeremy Fitzhardinge2008-09-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some BIOSes have been observed to corrupt memory in the low 64k. This change: - Reserves all memory which does not have to be in that area, to prevent it from being used as general memory by the kernel. Things like the SMP trampoline are still in the memory, however. - Clears the reserved memory so we can observe changes to it. - Adds a function check_for_bios_corruption() which checks and reports on memory becoming unexpectedly non-zero. Currently it's called in the x86 fault handler, and the powermanagement debug output. Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | | | * | | | | | | | x86, MM: virtual address debug, cleanupsIngo Molnar2008-06-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | | | | * | | | | | | | MM: virtual address debugJiri Slaby2008-06-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add some (configurable) expensive sanity checking to catch wrong address translations on x86. - create linux/mmdebug.h file to be able include this file in asm headers to not get unsolvable loops in header files - __phys_addr on x86_32 became a function in ioremap.c since PAGE_OFFSET, is_vmalloc_addr and VMALLOC_* non-constasts are undefined if declared in page_32.h - add __phys_addr_const for initializing doublefault_tss.__cr3 Tested on 386, 386pae, x86_64 and x86_64 numa=fake=2. Contains Andi's enable numa virtual address debug patch. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | Merge branch 'linus' into x86/xenIngo Molnar2008-10-12
| | |\ \ \ \ \ \ \ \ \ \ \ | | |/ / / / / / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/x86/kernel/cpu/common.c arch/x86/kernel/process_64.c arch/x86/xen/enlighten.c
| | * | | | | | | | | | | Merge branch 'timers/urgent' into x86/xenIngo Molnar2008-09-23
| | |\ \ \ \ \ \ \ \ \ \ \ | | | | |/ / / / / / / / / | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/x86/kernel/process_32.c arch/x86/kernel/process_64.c Manual merge: arch/x86/kernel/smpboot.c Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | Merge branch 'core/xen' into x86/xenIngo Molnar2008-09-10
| | |\ \ \ \ \ \ \ \ \ \ \
| | | * | | | | | | | | | | mm: define USE_SPLIT_PTLOCKS rather than repeating expressionJeremy Fitzhardinge2008-09-10
| | | | |/ / / / / / / / / | | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Define USE_SPLIT_PTLOCKS as a constant expression rather than repeating "NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS" all over the place. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | | | | | | | | Merge branch 'linus' into x86/xenIngo Molnar2008-08-25
| | |\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: arch/x86/kernel/paravirt.c Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * \ \ \ \ \ \ \ \ \ \ \ Merge branch 'linus' into x86/xenIngo Molnar2008-08-20
| | |\ \ \ \ \ \ \ \ \ \ \ \