aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* S2io: Multiqueue network device support - FIFO selection based on L4 portsSreenivasa Honnur2008-03-17
| | | | | | | | | | | | | - Resubmit #2 - Transmit fifo selection based on TCP/UDP ports. - Added tx_steering_type loadable parameter for transmit fifo selection. 0x0 NO_STEERING: Default FIFO is selected. 0x1 TX_PRIORITY_STEERING: FIFO is selected based on skb->priority. 0x2 TX_DEFAULT_STEERING: FIFO is selected based on L4 Ports. Signed-off-by: Surjit Reang <surjit.reang@neterion.com> Signed-off-by: Ramkrishna Vepa <ram.vepa@neterion.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* S2io: Multiqueue network device support implementationSreenivasa Honnur2008-03-17
| | | | | | | | | | | | | | | | | | | | | | | - Resubmit #3 Multiqueue netwrok device support implementation. - Added a loadable parameter "multiq" to enable/disable multiqueue support, by default it is disabled. - skb->queue_mapping is not used for queue/fifo selection. FIFO selection is based on skb->priority. - Added per FIFO flags FIFO_QUEUE_START and FIFO_QUEUE_STOP. Check this flag for starting and stopping netif queue and update the flags accordingly. - In tx_intr_handler added a check to ensure that we have free TXDs before wak- ing up the queue. - Added helper functions for queue manipulation(start/stop/wakeup) to invoke appropriate netif_ functions. - Calling netif_start/stop for link up/down case respectively. - As per Andi kleen's review comments, using skb->priority field for FIFO selection. Signed-off-by: Surjit Reang <surjit.reang@neterion.com> Signed-off-by: Ramkrishna Vepa <ram.vepa@neterion.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* qeth: remove old qeth filesFrank Blaschka2008-03-17
| | | | | | | Remove all obsolete qeth files. Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* qeth: new qeth device driverFrank Blaschka2008-03-17
| | | | | | | | | | | | | | List of major changes and improvements: no manipulation of the global ARP constructor clean code split into core, layer 2 and layer 3 functionality better exploitation of the ethtool interface better representation of the various hardware capabilities fix packet socket support (tcpdump), no fake_ll required osasnmpd notification via udev events coding style and beautification Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* ctc: removal of the old ctc driverPeter Tiedemann2008-03-17
| | | | | | | | | | | | ctc driver is replaced by a new ctcm driver. The ctcm driver supports the channel-to-channel connections of the old ctc driver plus an additional MPC protocol to provide SNA connectivity. This patch removes the functions of the old ctc driver. Signed-off-by: Peter Tiedemann <ptiedem@de.ibm.com> Signed-off-by: Ursula Braun <braunu@de.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* ctcm: infrastructure for replaced ctc driverPeter Tiedemann2008-03-17
| | | | | | | | | | | | ctcm driver supports the channel-to-channel connections of the old ctc driver plus an additional MPC protocol to provide SNA connectivity. This new ctcm driver replaces the existing ctc driver. Signed-off-by: Peter Tiedemann <ptiedem@de.ibm.com> Signed-off-by: Ursula Braun <braunu@de.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* drivers/s390/net: Kconfig brush upUrsula Braun2008-03-17
| | | | | | | | | adapt drivers/s390/net/Kconfig to current IBM wording and further cosmetics Signed-off-by: Peter Tiedemann <ptiedem@de.ibm.com> Signed-off-by: Ursula Braun <braunu@de.ibm.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* atl1: reduce forward declarationsJay Cliburn2008-03-17
| | | | | | | | | Rearrange functions to allow removal of some forward declarations. Make certain global functions static along the way. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* atl1: make functions staticJay Cliburn2008-03-17
| | | | | | | | | Make needlessly global functions static. In a couple of cases this requires removing forward declarations and reordering functions. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* atl1: print debug info if rrd errorJay Cliburn2008-03-17
| | | | | | | | | Add some debug printks if we encounter a potentially bad receive return descriptor. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* atl1: use netif_msgJay Cliburn2008-03-17
| | | | | | | | | | Use netif_msg_* for console messages emitted by the driver. Add a parameter to allow control of messaging at driver startup, and also add the ability to control it with ethtool. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* atl1: use csum_startJay Cliburn2008-03-17
| | | | | | | | | | Use skb->csum_start for tx checksum offload preparation. Also swap the variables css and cso so they hold the intended values of csum start and offset, respectively. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* atl1: simplify tx packet descriptorJay Cliburn2008-03-17
| | | | | | | | | | | | | | | | | | | The transmit packet descriptor consists of four 32-bit words, with word 3 upper bits overloaded depending upon the condition of its bits 3 and 4. The driver currently duplicates all word 2 and some word 3 register bit definitions unnecessarily and also uses a set of nested structures in its definition of the TPD without good cause. This patch adds a lengthy comment describing the TPD, eliminates duplicate TPD bit definitions, and simplifies the TPD structure itself. It also expands the TSO check to correctly handle custom checksum versus TSO processing using the revised TPD definitions. Finally, shorten some variable names in the transmit processing path to reduce line lengths, rename some variables to better describe their purpose (e.g., nseg versus m), and add a comment or two to better describe what the code is doing. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* atl1: add ethtool register dumpJay Cliburn2008-03-17
| | | | | | | | Add the ethtool register dump option to the atl1 driver. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* atl1: fix broken TSOJay Cliburn2008-03-17
| | | | | | | | | | | | | | | | The L1 tx packet descriptor expects TCP Header Length to be expressed as a number of 32-bit dwords. The atl1 driver uses tcp_hdrlen() to populate the field, but tcp_hdrlen() returns the header length in bytes, not in dwords. Add a shift to convert tcp_hdrlen() to dwords when we write it to the tpd. Also, some of our bit assignments are made to the wrong tpd words. Change those to the correct words. Finally, since all this fixes TSO, enable TSO by default. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* atl1: move common functions to atlx filesJay Cliburn2008-03-17
| | | | | | | | | | | | | | | | | | | The future atl2 driver and the existing atl1 driver can share certain functions and definitions. Move these shareable functions and definitions out of atl1-specific files and into atlx.c and atlx.h. Some transitory hackery will be present until atl2 is merged. Reduce the number of source files by moving ethtool, hw, and param functions from separate files into atl1_main.c, then rename it to just atl1.c. Move all atl1-specific definitions from atl1_hw.h to atl1.h. Finally, clean up to make checkpatch.pl happy. Signed-off-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* atl1: relocate atl1 driver to /drivers/net/atlxJay Cliburn2008-03-17
| | | | | | | | | | In preparation for a future Atheros L2 NIC driver (called atl2), relocate the atl1 driver into a new /drivers/net/atlx directory that will ultimately be shared with the future atl2 driver. Signed-off-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* remove the obsolete xircom_tulip_cb driverAdrian Bunk2008-03-17
| | | | | | | | | | | The xircom_tulip_cb driver has been replaced the xircom_cb driver, and since it depended on BROKEN_ON_SMP it e.g. was no longer present in many distribution kernels. This patch therefore removes it. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* sk98lin: remove obsolete driverStephen Hemminger2008-03-17
| | | | | | | | | | | All the hardware supported by this driver is now supported by the skge driver. The last remaining issue was support for ancient dual port SysKonnect fiber boards, and the skge driver now does these correctly (p.s. sk98lin was always broken on these old dual port boards anyway). Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* Linux 2.6.25-rc6v2.6.25-rc6Linus Torvalds2008-03-16
|
* Merge branch 'master' of ↵Linus Torvalds2008-03-16
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6: [PARISC] make ptr_to_pide() static [PARISC] head.S: section mismatch fixes [PARISC] add back Crestone Peak cpu [PARISC] futex: special case cmpxchg NULL in kernel space [PARISC] clean up show_stack [PARISC] add pa8900 CPUs to hardware inventory [PARISC] clean up include/asm-parisc/elf.h [PARISC] move defconfig to arch/parisc/configs/ [PARISC] add back AD1889 MAINTAINERS entry [PARISC] pdc_console: fix bizarre panic on boot [PARISC] dump_stack in show_regs [PARISC] pdc_stable: fix compile errors [PARISC] remove unused pdc_iodc_printf function [PARISC] bump __NR_syscalls [PARISC] unbreak pgalloc.h [PARISC] move VMALLOC_* definitions to fixmap.h [PARISC] wire up timerfd syscalls [PARISC] remove old timerfd syscall
| * [PARISC] make ptr_to_pide() staticFUJITA Tomonori2008-03-15
| | | | | | | | | | Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
| * [PARISC] head.S: section mismatch fixesHelge Deller2008-03-15
| | | | | | | | | | | | | | | | | | | | | | - move boot_args[] into the init section - move $global$ into the read_mostly section - fix the following two section mismatches: WARNING: vmlinux.o(.text+0x9c): Section mismatch: reference to .init.text:start_kernel (between '$pgt_fill_loop' and '$is_pa20') WARNING: vmlinux.o(.text+0xa0): Section mismatch: reference to .init.text:start_kernel (between '$pgt_fill_loop' and '$is_pa20') Signed-off-by: Helge Deller <deller@gmx.de> SIgned-off-by: Kyle McMartin <kyle@mcmartin.ca>
| * [PARISC] add back Crestone Peak cpuKyle McMartin2008-03-15
| | | | | | | | | | | | | | Crestone Peak Slow is the 800MHz PA-8800 cpu in the C8000. 0x88B is probably the Crestone Peak Fast. Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
| * [PARISC] futex: special case cmpxchg NULL in kernel spaceKyle McMartin2008-03-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit a0c1e9073ef7428a14309cba010633a6cd6719ea added code to futex.c to detect whether futex_atomic_cmpxchg_inatomic was implemented at run time: + curval = cmpxchg_futex_value_locked(NULL, 0, 0); + if (curval == -EFAULT) + futex_cmpxchg_enabled = 1; This is bogus on parisc, since page zero in kernel virtual space is the gateway page for syscall entry, and should not be read from the kernel. (That, and we really don't like the kernel faulting on its own address space...) Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
| * [PARISC] clean up show_stackKyle McMartin2008-03-15
| | | | | | | | | | | | | | | | When we show_regs, we obviously have a struct pt_regs of the calling frame. Use these in show_stack so we don't have the entire bogus call trace up to the show_stack call. Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
| * [PARISC] add pa8900 CPUs to hardware inventoryJames Bottomley2008-03-15
| | | | | | | | | | | | | | | | This patch adds the known pa8900 CPUs to the inventory list and removes the Crestone Peak one which apparently never escaped into the wild. Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
| * [PARISC] clean up include/asm-parisc/elf.hRandolph Chung2008-03-15
| | | | | | | | | | | | | | Cleanup some cruft. No functionality changes. Signed-off-by: Randolph Chung <tausq@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
| * [PARISC] move defconfig to arch/parisc/configs/Adrian Bunk2008-03-15
| | | | | | | | | | | | | | | | | | This patch moves the default parisc defconfig to arch/parisc/configs/generic_defconfig where it belongs and selects it as the default defconfig through KBUILD_DEFCONFIG. Signed-off-by: Adrian Bunk <adrian.bunk@movial.fi> Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
| * [PARISC] add back AD1889 MAINTAINERS entryThibaut VARENE2008-03-15
| | | | | | | | | | Signed-off-by: Thibaut VARENE <T-Bone@parisc-linux.org> Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
| * [PARISC] pdc_console: fix bizarre panic on bootKyle McMartin2008-03-15
| | | | | | | | | | | | | | | | | | | | | | | | | | Commit 721fdf34167580ff98263c74cead8871d76936e6 introduced a subtle bug by accidently removing the "static" from iodc_dbuf. This resulted in, what appeared to be, a trap without *current set to a task. Probably the result of a trap in real mode while calling firmware. Also do other misc clean ups. Since the only input from firmware is non blocking, share iodc_dbuf between input and output, and spinlock the only callers. Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
| * [PARISC] dump_stack in show_regsKyle McMartin2008-03-15
| | | | | | | | | | | | | | | | Originally, show_stack was used in BUG() output. However, a recent commit changed it to print register state (no idea what that's supposed to help, really...) and parisc was missing a backtrace because of it. Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
| * [PARISC] pdc_stable: fix compile errorsJoel Soete2008-03-15
| | | | | | | | | | Signed-off-by: Joel Soete <rubisher@scarlet.be> Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
| * [PARISC] remove unused pdc_iodc_printf functionKyle McMartin2008-03-15
| | | | | | | | Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
| * [PARISC] bump __NR_syscallsKyle McMartin2008-03-15
| | | | | | | | | | | | oops, forgot this in the previous commit. Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
| * [PARISC] unbreak pgalloc.hKyle McMartin2008-03-15
| | | | | | | | | | | | | | Commit 2f569afd9ced9ebec9a6eb3dbf6f83429be0a7b4 broke the compile rather spectacularly. Fix code errors. Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
| * [PARISC] move VMALLOC_* definitions to fixmap.hKyle McMartin2008-03-15
| | | | | | | | | | | | They make way more sense here, really... Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
| * [PARISC] wire up timerfd syscallsKyle McMartin2008-03-15
| | | | | | | | Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
| * [PARISC] remove old timerfd syscallKyle McMartin2008-03-15
| | | | | | | | Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
* | ACPI: Remove ACPI_CUSTOM_DSDT_INITRD optionLinus Torvalds2008-03-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This essentially reverts commit 71fc47a9adf8ee89e5c96a47222915c5485ac437 ("ACPI: basic initramfs DSDT override support"), because the code simply isn't ready. It did ugly things to the init sequence to populate the rootfs image early, but that just ended up showing other problems with the whole approach. The fact is, the VFS layer simply isn't initialized this early, and the relevant ACPI code should either run much later, or this shouldn't be done at all. For 2.6.25, we'll just pick the latter option. We can revisit this concept later if necessary. Cc: Dave Hansen <haveblue@us.ibm.com> Cc: Tilman Schmidt <tilman@imap.cc> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Renninger <trenn@suse.de> Cc: Eric Piel <eric.piel@tremplin-utc.net> Cc: Len Brown <len.brown@intel.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Markus Gaugusch <dsdt@gaugusch.at> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | tifm_sd: DATA_CARRY is not boolean in tifm_sd_transfer_data()Roel Kluin2008-03-15
| | | | | | | | | | | | | | | | DATA_CARRY is not boolean Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Signed-off-by: Pierre Ossman <drzeus@drzeus.cx> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds2008-03-15
|\ \ | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: [NET]: Fix tbench regression in 2.6.25-rc1
| * | [NET]: Fix tbench regression in 2.6.25-rc1Zhang Yanmin2008-03-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Comparing with kernel 2.6.24, tbench result has regression with 2.6.25-rc1. 1) On 2 quad-core processor stoakley: 4%. 2) On 4 quad-core processor tigerton: more than 30%. bisect located below patch. b4ce92775c2e7ff9cf79cca4e0a19c8c5fd6287b is first bad commit commit b4ce92775c2e7ff9cf79cca4e0a19c8c5fd6287b Author: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue Nov 13 21:33:32 2007 -0800 [IPV6]: Move nfheader_len into rt6_info The dst member nfheader_len is only used by IPv6. It's also currently creating a rather ugly alignment hole in struct dst. Therefore this patch moves it from there into struct rt6_info. Above patch changes the cache line alignment, especially member __refcnt. I did a testing by adding 2 unsigned long pading before lastuse, so the 3 members, lastuse/__refcnt/__use, are moved to next cache line. The performance is recovered. I created a patch to rearrange the members in struct dst_entry. With Eric and Valdis Kletnieks's suggestion, I made finer arrangement. 1) Move tclassid under ops in case CONFIG_NET_CLS_ROUTE=y. So sizeof(dst_entry)=200 no matter if CONFIG_NET_CLS_ROUTE=y/n. I tested many patches on my 16-core tigerton by moving tclassid to different place. It looks like tclassid could also have impact on performance. If moving tclassid before metrics, or just don't move tclassid, the performance isn't good. So I move it behind metrics. 2) Add comments before __refcnt. On 16-core tigerton: If CONFIG_NET_CLS_ROUTE=y, the result with below patch is about 18% better than the one without the patch; If CONFIG_NET_CLS_ROUTE=n, the result with below patch is about 30% better than the one without the patch. With 32bit 2.6.25-rc1 on 8-core stoakley, the new patch doesn't introduce regression. Thank Eric, Valdis, and David! Signed-off-by: Zhang Yanmin <yanmin.zhang@intel.com> Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | sched: simplify sched_slice()Ingo Molnar2008-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use the existing calc_delta_mine() calculation for sched_slice(). This saves a divide and simplifies the code because we share it with the other /cfs_rq->load users. It also improves code size: text data bss dec hex filename 42659 2740 144 45543 b1e7 sched.o.before 42093 2740 144 44977 afb1 sched.o.after Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
* | | sched: fix fair sleepersIngo Molnar2008-03-14
| | | | | | | | | | | | | | | | | | | | | | | | Fair sleepers need to scale their latency target down by runqueue weight. Otherwise busy systems will gain ever larger sleep bonus. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
* | | sched: fix overload performance: buddy wakeupsPeter Zijlstra2008-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we schedule to the leftmost task in the runqueue. When the runtimes are very short because of some server/client ping-pong, especially in over-saturated workloads, this will cycle through all tasks trashing the cache. Reduce cache trashing by keeping dependent tasks together by running newly woken tasks first. However, by not running the leftmost task first we could starve tasks because the wakee can gain unlimited runtime. Therefore we only run the wakee if its within a small (wakeup_granularity) window of the leftmost task. This preserves fairness, but does alternate server/client task groups. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | sched: fix calc_delta_mine()Ingo Molnar2008-03-14
| | | | | | | | | | | | | | | | | | | | | lw->weight can be 0 for a short time during bootup. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
* | | sched: fix update_load_add()/sub()Ingo Molnar2008-03-14
| | | | | | | | | | | | | | | | | | | | | | | | Clear the cached inverse value when updating load. This is needed for calc_delta_mine() to work correctly when using the rq load. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
* | | sched: min_vruntime fixPeter Zijlstra2008-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current min_vruntime tracking is incorrect and will cause serious problems when we don't run the leftmost task for some reason. min_vruntime does two things; 1) it's used to determine a forward direction when the u64 vruntime wraps, 2) it's used to track the leftmost vruntime to position newly enqueued tasks from. The current logic advances min_vruntime whenever the current task's vruntime advance. Because the current task may pass the leftmost task still waiting we're failing the second goal. This causes new tasks to be placed too far ahead and thus penalizes their runtime. Fix this by making min_vruntime the min_vruntime of the waiting tasks by tracking it in enqueue/dequeue, and compare against current's vruntime to obtain the absolute minimum when placing new tasks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | sched: fix race in schedule()Hiroshi Shimamoto2008-03-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a hard to trigger crash seen in the -rt kernel that also affects the vanilla scheduler. There is a race condition between schedule() and some dequeue/enqueue functions; rt_mutex_setprio(), __setscheduler() and sched_move_task(). When scheduling to idle, idle_balance() is called to pull tasks from other busy processor. It might drop the rq lock. It means that those 3 functions encounter on_rq=0 and running=1. The current task should be put when running. Here is a possible scenario: CPU0 CPU1 | schedule() | ->deactivate_task() | ->idle_balance() | -->load_balance_newidle() rt_mutex_setprio() | | --->double_lock_balance() *get lock *rel lock * on_rq=0, ruuning=1 | * sched_class is changed | *rel lock *get lock : | : ->put_prev_task_rt() ->pick_next_task_fair() => panic The current process of CPU1(P1) is scheduling. Deactivated P1, and the scheduler looks for another process on other CPU's runqueue because CPU1 will be idle. idle_balance(), load_balance_newidle() and double_lock_balance() are called and double_lock_balance() could drop the rq lock. On the other hand, CPU0 is trying to boost the priority of P1. The result of boosting only P1's prio and sched_class are changed to RT. The sched entities of P1 and P1's group are never put. It makes cfs_rq invalid, because the cfs_rq has curr and no leaf, but pick_next_task_fair() is called, then the kernel panics. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>