litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	[PATCH] i386: Selectable Frequency of the Timer Interrupt	Christoph Lameter	2005-06-23
\| \| \| \| \| \| \| \| \| \| \|	Make the timer frequency selectable. The timer interrupt may cause bus and memory contention in large NUMA systems since the interrupt occurs on each processor HZ times per second. Signed-off-by: Christoph Lameter <christoph@lameter.com> Signed-off-by: Shai Fultheim <shai@scalex86.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] uml: make hw_controller_type->release exist only for archs needing it	Paolo 'Blaisorblade' Giarrusso	2005-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	With Chris Wedgwood <cw@f00f.org> As suggested by Chris, we can make the "just added" method ->release conditional to UML only (better: to archs requesting it, i.e. only UML currently), so that other archs don't get this unneeded crud, and if UML won't need it any more we can kill this. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> CC: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] uml: add and use generic hw_controller_type->release	Paolo 'Blaisorblade' Giarrusso	2005-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With Chris Wedgwood <cw@f00f.org> Currently UML must explicitly call the UML-specific free_irq_by_irq_and_dev() for each free_irq call it's done. This is needed because ->shutdown and/or ->disable are only called when the last "action" for that irq is removed. Instead, for UML shared IRQs (UML IRQs are very often, if not always, shared), for each dev_id some setup is done, which must be cleared on the release of that fd. For instance, for each open console a new instance (i.e. new dev_id) of the same IRQ is requested(). Exactly, a fd is stored in an array (pollfds), which is after read by a host thread and passed to poll(). Each event registered by poll() triggers an interrupt. So, for each free_irq() we must remove the corresponding host fd from the table, which we do via this -release() method. In this patch we add an appropriate hook for this, and remove all uses of it by pointing the hook to the said procedure; this is safe to do since the said procedure. Also some cosmetic improvements are included. This is heavily based on some work by Chris Wedgwood, which however didn't get the patch merged for something I'd call a "misunderstanding" (the need for this patch wasn't cleanly explained, thus adding the generic hook was felt as undesirable). Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> CC: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] dup_mmap: update comment on new vma	Hugh Dickins	2005-06-21
\| \| \| \| \| \| \| \| \| \|	Remove part of comment on linking new vma in dup_mmap: since anon_vma rmap came in, try_to_unmap_one knows the vma without needing find_vma. But add a comment to note that here vma is inserted without mmap_sem. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] Avoiding mmap fragmentation	Wolfgang Wander	2005-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ingo recently introduced a great speedup for allocating new mmaps using the free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and causes huge performance increases in thread creation. The downside of this patch is that it does lead to fragmentation in the mmap-ed areas (visible via /proc/self/maps), such that some applications that work fine under 2.4 kernels quickly run out of memory on any 2.6 kernel. The problem is twofold: 1) the free_area_cache is used to continue a search for memory where the last search ended. Before the change new areas were always searched from the base address on. So now new small areas are cluttering holes of all sizes throughout the whole mmap-able region whereas before small holes tended to close holes near the base leaving holes far from the base large and available for larger requests. 2) the free_area_cache also is set to the location of the last munmap-ed area so in scenarios where we allocate e.g. five regions of 1K each, then free regions 4 2 3 in this order the next request for 1K will be placed in the position of the old region 3, whereas before we appended it to the still active region 1, placing it at the location of the old region 2. Before we had 1 free region of 2K, now we only get two free regions of 1K -> fragmentation. The patch addresses thes issues by introducing yet another cache descriptor cached_hole_size that contains the largest known hole size below the current free_area_cache. If a new request comes in the size is compared against the cached_hole_size and if the request can be filled with a hole below free_area_cache the search is started from the base instead. The results look promising: Whereas 2.6.12-rc4 fragments quickly and my (earlier posted) leakme.c test program terminates after 50000+ iterations with 96 distinct and fragmented maps in /proc/self/maps it performs nicely (as expected) with thread creation, Ingo's test_str02 with 20000 threads requires 0.7s system time. Taking out Ingo's patch (un-patch available per request) by basically deleting all mentions of free_area_cache from the kernel and starting the search for new memory always at the respective bases we observe: leakme terminates successfully with 11 distinctive hardly fragmented areas in /proc/self/maps but thread creating is gringdingly slow: 30+s(!) system time for Ingo's test_str02 with 20000 threads. Now - drumroll ;-) the appended patch works fine with leakme: it ends with only 7 distinct areas in /proc/self/maps and also thread creation seems sufficiently fast with 0.71s for 20000 threads. Signed-off-by: Wolfgang Wander <wwc@rentec.com> Credit-to: "Richard Purdie" <rpurdie@rpsys.net> Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> (partly) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] VM: early zone reclaim	Martin Hicks	2005-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the core of the (much simplified) early reclaim. The goal of this patch is to reclaim some easily-freed pages from a zone before falling back onto another zone. One of the major uses of this is NUMA machines. With the default allocator behavior the allocator would look for memory in another zone, which might be off-node, before trying to reclaim from the current zone. This adds a zone tuneable to enable early zone reclaim. It is selected on a per-zone basis and is turned on/off via syscall. Adding some extra throttling on the reclaim was also required (patch 4/4). Without the machine would grind to a crawl when doing a "make -j" kernel build. Even with this patch the System Time is higher on average, but it seems tolerable. Here are some numbers for kernbench runs on a 2-node, 4cpu, 8Gig RAM Altix in the "make -j" run: wall user sys %cpu ctx sw. sleeps ---- ---- --- ---- ------ ------ No patch 1009 1384 847 258 298170 504402 w/patch, no reclaim 880 1376 667 288 254064 396745 w/patch & reclaim 1079 1385 926 252 291625 548873 These numbers are the average of 2 runs of 3 "make -j" runs done right after system boot. Run-to-run variability for "make -j" is huge, so these numbers aren't terribly useful except to seee that with reclaim the benchmark still finishes in a reasonable amount of time. I also looked at the NUMA hit/miss stats for the "make -j" runs and the reclaim doesn't make any difference when the machine is thrashing away. Doing a "make -j8" on a single node that is filled with page cache pages takes 700 seconds with reclaim turned on and 735 seconds without reclaim (due to remote memory accesses). The simple zone_reclaim syscall program is at http://www.bork.org/~mort/sgi/zone_reclaim.c Signed-off-by: Martin Hicks <mort@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] smp_processor_id() cleanup	Ingo Molnar	2005-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements a number of smp_processor_id() cleanup ideas that Arjan van de Ven and I came up with. The previous __smp_processor_id/_smp_processor_id/smp_processor_id API spaghetti was hard to follow both on the implementational and on the usage side. Some of the complexity arose from picking wrong names, some of the complexity comes from the fact that not all architectures defined __smp_processor_id. In the new code, there are two externally visible symbols: - smp_processor_id(): debug variant. - raw_smp_processor_id(): nondebug variant. Replaces all existing uses of _smp_processor_id() and __smp_processor_id(). Defined by every SMP architecture in include/asm-*/smp.h. There is one new internal symbol, dependent on DEBUG_PREEMPT: - debug_smp_processor_id(): internal debug variant, mapped to smp_processor_id(). Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new lib/smp_processor_id.c file. All related comments got updated and/or clarified. I have build/boot tested the following 8 .config combinations on x86: {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT} I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT. (Other architectures are untested, but should work just fine.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] sysfs: (rest) if show/store is missing return -EIO	Dmitry Torokhov	2005-06-20
\| \| \| \| \| \| \| \| \|	sysfs: fix the rest of the kernel so if an attribute doesn't implement show or store method read/write will return -EIO instead of 0 or -EINVAL or -EPERM. Signed-off-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
*	Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git	David Woodhouse	2005-06-18
\|\
\| *	[PATCH] timer exit cleanup	Ingo Molnar	2005-06-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Do all timer zapping in exit_itimers. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] cond_resched_lock() fix	Jan Kara	2005-06-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On one path, cond_resched_lock() fails to return true if it dropped the lock. We think this might be causing the crashes in JBD's log_do_checkpoint(). Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* \|	Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git	David Woodhouse	2005-06-02
\|\\|
\| *	[PATCH] flush icache in correct context	Roman Zippel	2005-05-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	flush_icache_range() is used in two different situation - in binfmt_elf.c & co for user space mappings and module.c for kernel modules. On m68k flush_icache_range() doesn't know which data to flush, as it has separate address spaces and the pointer argument can be valid in either address space. First I considered splitting flush_icache_range(), but this patch is simpler. Setting the correct context gives flush_icache_range() enough information to flush the correct data. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] drop note_interrupt() for per-CPU for proper scaling	John Hawkes	2005-05-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The "unhandled interrupts" catcher, note_interrupt(), increments a global desc->irq_count and grossly damages scaling of very large systems, e.g., >192p ia64 Altix, because of this highly contented cacheline, especially for timer interrupts. 384p is severely crippled, and 512p is unuseable. All calls to note_interrupt() can be disabled by booting with "noirqdebug", but this disables the useful interrupt checking for all interrupts. I propose eliminating note_interrupt() for all per-CPU interrupts. This was the behavior of linux-2.6.10 and earlier, but in 2.6.11 a code restructuring added a call to note_interrupt() for per-CPU interrupts. Besides, note_interrupt() is a bit racy for concurrent CPU calls anyway, as the desc->irq_count++ increment isn't atomic (which, if done, would make scaling even worse). Signed-off-by: John Hawkes <hawkes@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] cpuset exit NULL dereference fix	Paul Jackson	2005-05-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a race in the kernel cpuset code, between the code to handle notify_on_release, and the code to remove a cpuset. The notify_on_release code can end up trying to access a cpuset that has been removed. In the most common case, this causes a NULL pointer dereference from the routine cpuset_path. However all manner of bad things are possible, in theory at least. The existing code decrements the cpuset use count, and if the count goes to zero, processes the notify_on_release request, if appropriate. However, once the count goes to zero, unless we are holding the global cpuset_sem semaphore, there is nothing to stop another task from immediately removing the cpuset entirely, and recycling its memory. The obvious fix would be to always hold the cpuset_sem semaphore while decrementing the use count and dealing with notify_on_release. However we don't want to force a global semaphore into the mainline task exit path, as that might create a scaling problem. The actual fix is almost as easy - since this is only an issue for cpusets using notify_on_release, which the top level big cpusets don't normally need to use, only take the cpuset_sem for cpusets using notify_on_release. This code has been run for hours without a hiccup, while running a cpuset create/destroy stress test that could crash the existing kernel in seconds. This patch applies to the current -linus git kernel. Signed-off-by: Paul Jackson <pj@sgi.com> Acked-by: Simon Derr <simon.derr@bull.net> Acked-by: Dinakar Guniguntala <dino@in.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] sigkill priority fix	Kirill Korotaev	2005-05-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If SIGKILL does not have priority, we cannot instantly kill task before it makes some unexpected job. It can be critical, but we were unable to reproduce this easily until Heiko Carstens <Heiko.Carstens@de.ibm.com> reported this problem on LKML. Signed-Off-By: Kirill Korotaev <dev@sw.ru> Signed-Off-By: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] spin_unlock_bh() and preempt_check_resched()	Samuel Thibault	2005-05-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In _spin_unlock_bh(lock): do { \ _raw_spin_unlock(lock); \ preempt_enable(); \ local_bh_enable(); \ __release(lock); \ } while (0) there is no reason for using preempt_enable() instead of a simple preempt_enable_no_resched() Since we know bottom halves are disabled, preempt_schedule() will always return at once (preempt_count!=0), and hence preempt_check_resched() is useless here... This fixes it by using "preempt_enable_no_resched()" instead of the "preempt_enable()", and thus avoids the useless preempt_check_resched() just before re-enabling bottom halves. Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] cpusets+hotplug+preepmt broken	Paul Jackson	2005-05-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch removes the entwining of cpusets and hotplug code in the "No more Mr. Nice Guy" case of sched.c move_task_off_dead_cpu(). Since the hotplug code is holding a spinlock at this point, we cannot take the cpuset semaphore, cpuset_sem, as would seem to be required either to update the tasks cpuset, or to scan up the nested cpuset chain, looking for the nearest cpuset ancestor that still has some CPUs that are online. So we just punt and blast the tasks cpus_allowed with all bits allowed. This reverts these lines of code to what they were before the cpuset patch. And it updates the cpuset Doc file, to match. The one known alternative to this that seems to work came from Dinakar Guniguntala, and required the hotplug code to take the cpuset_sem semaphore much earlier in its processing. So far as we know, the increased locking entanglement between cpusets and hot plug of this alternative approach is not worth doing in this case. Signed-off-by: Paul Jackson <pj@sgi.com> Acked-by: Nathan Lynch <ntl@pobox.com> Acked-by: Dinakar Guniguntala <dino@in.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* \|	AUDIT: Record working directory when syscall arguments are pathnames	David Woodhouse	2005-05-27
\| \| \| \| \| \| \| \|	Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Defer freeing aux items until audit_free_context()	David Woodhouse	2005-05-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While they were all just simple blobs it made sense to just free them as we walked through and logged them. Now that there are pointers to other objects which need refcounting, we might as well revert to _only_ logging them in audit_log_exit(), and put the code to free them properly in only one place -- in audit_free_aux(). Signed-off-by: David Woodhouse <dwmw2@infradead.org> ----------------------------------------------------------
* \|	AUDIT: Escape comm when logging task info	David Woodhouse	2005-05-23
\| \| \| \| \| \| \| \| \| \| \| \|	It comes from the user; it needs to be escaped. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Unify auid reporting, put arch before syscall number	David Woodhouse	2005-05-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	These changes make processing of audit logs easier. Based on a patch from Steve Grubb <sgrubb@redhat.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Assign serial number to non-syscall messages	David Woodhouse	2005-05-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move audit_serial() into audit.c and use it to generate serial numbers on messages even when there is no audit context from syscall auditing. This allows us to disambiguate audit records when more than one is generated in the same millisecond. Based on a patch by Steve Grubb after he observed the problem. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Fix inconsistent use of loginuid vs. auid, signed vs. unsigned	Steve Grubb	2005-05-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The attached patch changes all occurrences of loginuid to auid. It also changes everything to %u that is an unsigned type. Signed-off-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Fix AVC_USER message passing.	Steve Grubb	2005-05-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original AVC_USER message wasn't consolidated with the new range of user messages. The attached patch fixes the kernel so the old messages work again. Signed-off-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Avoid sleeping function in SElinux AVC audit.	Stephen Smalley	2005-05-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch changes the SELinux AVC to defer logging of paths to the audit framework upon syscall exit, by saving a reference to the (dentry,vfsmount) pair in an auxiliary audit item on the current audit context for processing by audit_log_exit. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Honour audit_backlog_limit again.	David Woodhouse	2005-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The limit on the number of outstanding audit messages was inadvertently removed with the switch to queuing skbs directly for sending by a kernel thread. Put it back again. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	Merge with master.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux-2.6.git	David Woodhouse	2005-05-19
\|\\|
\| *	[PATCH] Driver Core: pm diagnostics update, check for errors	David Brownell	2005-05-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch includes various tweaks in the messaging that appears during system pm state transitions: * Warn about certain illegal calls in the device tree, like resuming child before parent or suspending parent before child. This could happen easily enough through sysfs, or in some cases when drivers use device_pm_set_parent(). * Be more consistent about dev_dbg() tracing ... do it for resume() and shutdown() too, and never if the driver doesn't have that method. * Say which type of system sleep state is being entered. Except for the warnings, these only affect debug messaging. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
\| *	[PATCH] profile.c: `schedule' parsing fix	William Lee Irwin III	2005-05-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	profile=schedule parsing is not quite what it should be. First, str[7] is 'e', not ',', but then even if it did fall through, prof_on = SCHED_PROFILING would be clobbered inside if (get_option(...)) So a small amount of rearrangement is done in this patch to correct it. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] add_preferred_console() build fix	Matt Mackall	2005-05-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move add_preferred_console out of CONFIG_PRINTK so serial console does the right thing. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
\| *	[PATCH] spurious interrupt fix	Zhang, Yanmin	2005-05-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On my IA64 machine, after kernel 2.6.12-rc3 boots, an edge-triggered interrupt (IRQ 46) keeps triggered over and over again. There is no IRQ 46 interrupt action handler. It has lots of impact on performance. Kernel 2.6.10 and its prior versions have no the problem. Basically, kernel 2.6.10 will mask the spurious edge interrupt if the interrupt is triggered for the second time and its status includes IRQ_DISABLE\|IRQ_PENDING. Originally, IA64 kernel has its own specific _irq_desc definitions in file arch/ia64/kernel/irq.c. The definition initiates _irq_desc[irq].status to IRQ_DISABLE. Since kernel 2.6.11, it was moved to architecture independent codes, i.e. kernel/irq/handle.c, but kernel/irq/handle.c initiates _irq_desc[irq].status to 0 instead of IRQ_DISABLE. Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* \|	AUDIT: Quis Custodiet Ipsos Custodes?	David Woodhouse	2005-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Nobody does. Really, it gets very silly if auditd is recording its own actions. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Send netlink messages from a separate kernel thread	David Woodhouse	2005-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	netlink_unicast() will attempt to reallocate and will free messages if the socket's rcvbuf limit is reached unless we give it an infinite timeout. So do that, from a kernel thread which is dedicated to spewing stuff up the netlink socket. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Clean up logging of untrusted strings	Steve Grubb	2005-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* If vsnprintf returns -1, it will mess up the sk buffer space accounting. This is fixed by not calling skb_put with bogus len values. * audit_log_hex was a loop that called audit_log_vformat with %02X for each character. This is very inefficient since conversion from unsigned character to Ascii representation is essentially masking, shifting, and byte lookups. Also, the length of the converted string is well known - it's twice the original. Fixed by rewriting the function. * audit_log_untrustedstring had no comments. This makes it hard for someone to understand what the string format will be. * audit_log_d_path was never fixed to use untrustedstring. This could mess up user space parsers. This was fixed to make a temp buffer, call d_path, and log temp buffer using untrustedstring. From: Steve Grubb <sgrubb@redhat.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Treat all user messages identically.	David Woodhouse	2005-05-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's silly to have to add explicit entries for new userspace messages as we invent them. Just treat all messages in the user range the same. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Capture sys_socketcall arguments and sockaddrs	David Woodhouse	2005-05-17
\| \| \| \| \| \| \| \|	Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: fix max_t thinko.	David Woodhouse	2005-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Der... if you use max_t it helps if you give it a type. Note to self: Always just apply the tested patches, don't try to port them by hand. You're not clever enough. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Fix some spelling errors	Steve Grubb	2005-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I'm going through the kernel code and have a patch that corrects several spelling errors in comments. From: Steve Grubb <sgrubb@redhat.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Add message types to audit records	Steve Grubb	2005-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds more messages types to the audit subsystem so that audit analysis is quicker, intuitive, and more useful. Signed-off-by: Steve Grubb <sgrubb@redhat.com> --- I forgot one type in the big patch. I need to add one for user space originating SE Linux avc messages. This is used by dbus and nscd. -Steve --- Updated to 2.6.12-rc4-mm1. -dwmw2 Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Round up audit skb expansion to AUDIT_BUFSIZ.	David Woodhouse	2005-05-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Otherwise, we will be repeatedly reallocating, even if we're only adding a few bytes at a time. Pointed out by Steve Grubb. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	Add audit_log_type	Chris Wright	2005-05-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add audit_log_type to allow callers to specify type and pid when logging. Convert audit_log to wrapper around audit_log_type. Could have converted all audit_log callers directly, but common case is default of type AUDIT_KERNEL and pid 0. Update audit_log_start to take type and pid values when creating a new audit_buffer. Move sequences that did audit_log_start, audit_log_format, audit_set_type, audit_log_end, to simply call audit_log_type directly. This obsoletes audit_set_type and audit_set_pid, so remove them. Signed-off-by: Chris Wright <chrisw@osdl.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	Move ifdef CONFIG_AUDITSYSCALL to header	Chris Wright	2005-05-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove code conditionally dependent on CONFIG_AUDITSYSCALL from audit.c. Move these dependencies to audit.h with the rest. Signed-off-by: Chris Wright <chrisw@osdl.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	Audit requires CONFIG_NET	Chris Wright	2005-05-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Audit now actually requires netlink. So make it depend on CONFIG_NET, and remove the inline dependencies on CONFIG_NET. Signed-off-by: Chris Wright <chrisw@osdl.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Properly account for alignment difference in nlmsg_len.	Chris Wright	2005-05-11
\| \| \| \| \| \| \| \| \| \|	Signed-off-by: Chris Wright <chrisw@osdl.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Fix abuse of va_args.	David Woodhouse	2005-05-10
\| \| \| \| \| \| \| \| \| \| \| \|	We're not allowed to use args twice; we need to use va_copy. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: pass size argument to audit_expand().	David Woodhouse	2005-05-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Let audit_expand() know how much it's expected to grow the buffer, in the case that we have that information to hand. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Fix reported length of audit messages.	Steve Grubb	2005-05-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were setting nlmsg_len to skb->len, but we should be subtracting the size of the header. From: Steve Grubb <sgrubb@redhat.com> Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: Honour gfp_mask in audit_buffer_alloc()	David Woodhouse	2005-05-06
\| \| \| \| \| \| \| \|	Signed-off-by: David Woodhouse <dwmw2@infradead.org>
* \|	AUDIT: buffer audit msgs directly to skb	Chris Wright	2005-05-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Drop the use of a tmp buffer in the audit_buffer, and just buffer directly to the skb. All header data that was temporarily stored in the audit_buffer can now be stored directly in the netlink header in the skb. Resize skb as needed. This eliminates the extra copy (and the audit_log_move function which was responsible for copying). Signed-off-by: Chris Wright <chrisw@osdl.org> Signed-off-by: David Woodhouse <dwmw2@infradead.org>