litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	[PATCH] Avoiding mmap fragmentation	Wolfgang Wander	2005-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ingo recently introduced a great speedup for allocating new mmaps using the free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and causes huge performance increases in thread creation. The downside of this patch is that it does lead to fragmentation in the mmap-ed areas (visible via /proc/self/maps), such that some applications that work fine under 2.4 kernels quickly run out of memory on any 2.6 kernel. The problem is twofold: 1) the free_area_cache is used to continue a search for memory where the last search ended. Before the change new areas were always searched from the base address on. So now new small areas are cluttering holes of all sizes throughout the whole mmap-able region whereas before small holes tended to close holes near the base leaving holes far from the base large and available for larger requests. 2) the free_area_cache also is set to the location of the last munmap-ed area so in scenarios where we allocate e.g. five regions of 1K each, then free regions 4 2 3 in this order the next request for 1K will be placed in the position of the old region 3, whereas before we appended it to the still active region 1, placing it at the location of the old region 2. Before we had 1 free region of 2K, now we only get two free regions of 1K -> fragmentation. The patch addresses thes issues by introducing yet another cache descriptor cached_hole_size that contains the largest known hole size below the current free_area_cache. If a new request comes in the size is compared against the cached_hole_size and if the request can be filled with a hole below free_area_cache the search is started from the base instead. The results look promising: Whereas 2.6.12-rc4 fragments quickly and my (earlier posted) leakme.c test program terminates after 50000+ iterations with 96 distinct and fragmented maps in /proc/self/maps it performs nicely (as expected) with thread creation, Ingo's test_str02 with 20000 threads requires 0.7s system time. Taking out Ingo's patch (un-patch available per request) by basically deleting all mentions of free_area_cache from the kernel and starting the search for new memory always at the respective bases we observe: leakme terminates successfully with 11 distinctive hardly fragmented areas in /proc/self/maps but thread creating is gringdingly slow: 30+s(!) system time for Ingo's test_str02 with 20000 threads. Now - drumroll ;-) the appended patch works fine with leakme: it ends with only 7 distinct areas in /proc/self/maps and also thread creation seems sufficiently fast with 0.71s for 20000 threads. Signed-off-by: Wolfgang Wander <wwc@rentec.com> Credit-to: "Richard Purdie" <rpurdie@rpsys.net> Signed-off-by: Ken Chen <kenneth.w.chen@intel.com> Acked-by: Ingo Molnar <mingo@elte.hu> (partly) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] Hugepage consolidation	David Gibson	2005-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A lot of the code in arch/*/mm/hugetlbpage.c is quite similar. This patch attempts to consolidate a lot of the code across the arch's, putting the combined version in mm/hugetlb.c. There are a couple of uglyish hacks in order to covert all the hugepage archs, but the result is a very large reduction in the total amount of code. It also means things like hugepage lazy allocation could be implemented in one place, instead of six. Tested, at least a little, on ppc64, i386 and x86_64. Notes: - this patch changes the meaning of set_huge_pte() to be more analagous to set_pte() - does SH4 need s special huge_ptep_get_and_clear()?? Acked-by: William Lee Irwin <wli@holomorphy.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] smp_processor_id() cleanup	Ingo Molnar	2005-06-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch implements a number of smp_processor_id() cleanup ideas that Arjan van de Ven and I came up with. The previous __smp_processor_id/_smp_processor_id/smp_processor_id API spaghetti was hard to follow both on the implementational and on the usage side. Some of the complexity arose from picking wrong names, some of the complexity comes from the fact that not all architectures defined __smp_processor_id. In the new code, there are two externally visible symbols: - smp_processor_id(): debug variant. - raw_smp_processor_id(): nondebug variant. Replaces all existing uses of _smp_processor_id() and __smp_processor_id(). Defined by every SMP architecture in include/asm-*/smp.h. There is one new internal symbol, dependent on DEBUG_PREEMPT: - debug_smp_processor_id(): internal debug variant, mapped to smp_processor_id(). Also, i moved debug_smp_processor_id() from lib/kernel_lock.c into a new lib/smp_processor_id.c file. All related comments got updated and/or clarified. I have build/boot tested the following 8 .config combinations on x86: {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT} I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT. (Other architectures are untested, but should work just fine.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[SPARC64]: Fix cmsg length checks in Solaris emulation layer.	David S. Miller	2005-06-21
\| \| \| \|	Signed-off-by: David S. Miller <davem@davemloft.net>
*	[SPARC64]: Refine PCI strbuf ctx-based flush.	David S. Miller	2005-05-31
\| \| \| \| \| \| \|	The initial peek read PIO of the match register is just a waste. Just do the flush writes first, as that is more efficient. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[SPARC64]: Fix streaming buffer flushing on PCI and SBUS.	David S. Miller	2005-05-31
\| \| \| \| \| \| \| \| \| \| \| \| \|	Firstly, if the direction is TODEVICE, then dirty data in the streaming cache is impossible so we can elide the flush-flag synchronization in that case. Next, the context allocator is broken. It is highly likely that contexts get used multiple times for different dma mappings, which confuses the strbuf flushing code and makes it run inefficiently. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[SPARC64]: Add boot option to force UltraSPARC-III P-Cache on.	David S. Miller	2005-05-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Older UltraSPARC-III chips have a P-Cache bug that makes us disable it by default at boot time. However, this does hurt performance substantially, particularly with memcpy(), and the bug is _incredibly_ obscure. I have never seen it triggered in practice, ever. So provide a "-P" boot option that forces the P-Cache on. It taints the kernel, so if it does trigger and cause some data corruption or OOPS, we will find out in the logs that this option was on when it happened. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[SPARC64]: Fix bad performance side effect of strbuf timeout changes.	David S. Miller	2005-05-20
\| \| \| \| \| \| \| \| \| \| \| \| \|	The recent change to add a timeout to strbuf flushing had a negative performance impact. The udelay()'s are too long, and they were done in the wrong order wrt. the register read checks. Fix both, and things are happy again. There are more possible improvements in this area. In fact, PCI streaming buffer flushing seems to be part of the bottleneck in network receive performance on my SunBlade1000 box. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[SPARC64]: Add timeouts to streaming buffer synchronization.	David S. Miller	2005-05-11
\| \| \| \| \| \| \| \|	If some hardware error occurs and the flush flag never updates, we will hang forever in these routines. Add a timeout, and print out a diagnostic if it is reached. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[SPARC]: Remove legacy stuff from cpu_idle().	Coywolf Qi Hunt	2005-05-05
\| \| \| \| \| \| \| \| \|	Currently sparc and sparc64's UP cpu_idle() checks current pid. This is old time legacy. Now it's paranoia. Signed-off-by: Coywolf Qi Hunt <coywolf@lovecn.org> Acked-by: William Irwin <wli@holomorphy.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	[SPARC64]: Kill useless __pte_alloc_one_kernel indirection	Christoph Hellwig	2005-05-05
\| \| \| \| \| \|	warning: untested, but it there's not too much chance for screwups Signed-off-by: David S. Miller <davem@davemloft.net>
*	[SPARC64]: Disable IRQ forwarding.	David S. Miller	2005-05-04
\| \| \| \| \| \| \| \| \| \|	There is some race whereby IRQs get stuck, the IRQ status is pending but no processor actually handles the IRQ vector and thus the interrupt. This is a temporary workaround. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[SPARC64]: Fix goal_cpu tracking in retarget_one_irq().	David S. Miller	2005-05-04
\| \| \| \| \| \| \|	We would never advance the goal_cpu counter like we should, so all IRQs would go to a single processor. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[PATCH] convert that currently tests _NSIG directly to use valid_signal()	Jesper Juhl	2005-05-01
\| \| \| \| \| \| \| \| \|	Convert most of the current code that uses _NSIG directly to instead use valid_signal(). This avoids gcc -W warnings and off-by-one errors. Signed-off-by: Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] mostek bogus sparse annotations fixed	Al Viro	2005-04-24
\| \| \| \| \| \| \| \| \| \|	void * __iomem foo is not a pointer to iomem - it's an iomem variable containing void . A pile of such guys in arch/sparc64/kernel/time.c, drivers/sbus/char/rtc.c and include/asm-sparc64/mostek.h turned into intended void __iomem . Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] missing dependency on sparc64	Al Viro	2005-04-24
\| \| \| \| \| \| \| \| \| \| \| \|	CONFIG_HW_CONSOLE selects vt.c; without the stuff pulled by CONFIG_VT it will not build. Normally we get both in drivers/char/Kconfig and there HW_CONSOLE depends on VT. sparc64 does not pull drivers/char/Kconfig and has that sutff in arch/sparc64/Kconfig instead. However, it forgets to add the same dependency. As the result, turning VT off [which is possible] will end up with broken build. For no good reason... Signed-off-by: Al Viro <viro@parcelfarce.linux.theplanet.co.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[SPARC]: Provide generic ioctls in Sparc RTC driver.	David S. Miller	2005-04-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Provide support for drivers/char/rtc.c ioctls in the Mostek rtc driver as well as the Sparc specific RTCGET and RTCSET. This allows userspace to be much less messy. Currently util-linux and other spots jump through hoops trying various ioctl variants until it hits the right one whatever driver actually being used supports. Eventually all of this should move over to the genrtc.c driver, but not today... While we are here, fix up the register types for sparse. Thanks to Frans Pop for helping point out this issue. Signed-off-by: David S. Miller <davem@davemloft.net>
*	[PATCH] sparc64: Fix stat	David S. Miller	2005-04-18
\| \| \| \| \| \| \| \| \| \| \| \| \|	Like Alpha, sparc64's struct stat was defined before we had the nanosecond et al. fields added. So like Alpha I have to cons up a struct stat64 to get this stuff. I'll work on the glibc bits soon. Also, we were forgetting to fill in the nanosecond fields in the sparc compat stat64 syscalls. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] sparc64: Fix copy_sigingo_to_user32()	Jurij Smakov	2005-04-17
\| \| \| \| \| \| \| \| \|	The compat routine to copy over this data structure was not handling SI_POLL correctly, breaking various fcntl() variants in compat tasks. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] sparc64: Reduce ptrace cache flushing	David S. Miller	2005-04-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were flushing the D-cache excessively for ptrace() processing and this makes debugging threads so slow as to be totally unusable. All process page accesses via ptrace() go via access_process_vm(). This routine, for each process page, uses get_user_pages(). That in turn does a flush_dcache_page() on the child pages before we copy in/out the ptrace request data. Therefore, all we need to do after the data movement is: 1) Flush the D-cache pages if the kernel maps the page to a different color than userspace does. 2) If we wrote to the page, we need to flush the I-cache on older cpus. Previously we just flushed the entire cache at the end of a ptrace() request, and that was beyond stupid. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] sparc: Fix PTRACE_CONT bogosity	David S. Miller	2005-04-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SunOS aparently had this weird PTRACE_CONT semantic which we copied. If the addr argument is something other than 1, it sets the process program counter to whatever that value is. This is different from every other Linux architecture, which don't do anything with the addr and data args. This difference in particular breaks the Linux native GDB support for fork and vfork tracing on sparc and sparc64. There is no interest in running SunOS binaries using this weird PTRACE_CONT behavior, so just delete it so we behave like other platforms do. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] sparc64: use message queue compat syscalls	David S. Miller	2005-04-17
\| \| \| \| \| \| \| \| \|	A couple message queue system call entries for compat tasks were not using the necessary compat_sys_*() functions, causing some glibc test cases to fail. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	[PATCH] sparc64: Do not flush dcache for ZERO_PAGE.	David S. Miller	2005-04-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This case actually can get exercised a lot during an ELF coredump of a process which contains a lot of non-COW'd anonymous pages. GDB has this test case which in partiaular creates near terabyte process full of ZERO_PAGEes. It takes forever to just walk through the page tables because of all of these spurious cache flushes on sparc64. With this change it takes only a second or so. Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
*	Linux-2.6.12-rc2v2.6.12-rc2	Linus Torvalds	2005-04-16
	Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!