litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
...
\| * \|	atl1: reduce forward declarations	Jay Cliburn	2008-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Rearrange functions to allow removal of some forward declarations. Make certain global functions static along the way. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| * \|	atl1: make functions static	Jay Cliburn	2008-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make needlessly global functions static. In a couple of cases this requires removing forward declarations and reordering functions. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| * \|	atl1: print debug info if rrd error	Jay Cliburn	2008-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add some debug printks if we encounter a potentially bad receive return descriptor. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| * \|	atl1: use netif_msg	Jay Cliburn	2008-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use netif_msg_* for console messages emitted by the driver. Add a parameter to allow control of messaging at driver startup, and also add the ability to control it with ethtool. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| * \|	atl1: use csum_start	Jay Cliburn	2008-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use skb->csum_start for tx checksum offload preparation. Also swap the variables css and cso so they hold the intended values of csum start and offset, respectively. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| * \|	atl1: simplify tx packet descriptor	Jay Cliburn	2008-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The transmit packet descriptor consists of four 32-bit words, with word 3 upper bits overloaded depending upon the condition of its bits 3 and 4. The driver currently duplicates all word 2 and some word 3 register bit definitions unnecessarily and also uses a set of nested structures in its definition of the TPD without good cause. This patch adds a lengthy comment describing the TPD, eliminates duplicate TPD bit definitions, and simplifies the TPD structure itself. It also expands the TSO check to correctly handle custom checksum versus TSO processing using the revised TPD definitions. Finally, shorten some variable names in the transmit processing path to reduce line lengths, rename some variables to better describe their purpose (e.g., nseg versus m), and add a comment or two to better describe what the code is doing. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| * \|	atl1: add ethtool register dump	Jay Cliburn	2008-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the ethtool register dump option to the atl1 driver. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| * \|	atl1: fix broken TSO	Jay Cliburn	2008-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The L1 tx packet descriptor expects TCP Header Length to be expressed as a number of 32-bit dwords. The atl1 driver uses tcp_hdrlen() to populate the field, but tcp_hdrlen() returns the header length in bytes, not in dwords. Add a shift to convert tcp_hdrlen() to dwords when we write it to the tpd. Also, some of our bit assignments are made to the wrong tpd words. Change those to the correct words. Finally, since all this fixes TSO, enable TSO by default. Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Acked-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| * \|	atl1: move common functions to atlx files	Jay Cliburn	2008-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The future atl2 driver and the existing atl1 driver can share certain functions and definitions. Move these shareable functions and definitions out of atl1-specific files and into atlx.c and atlx.h. Some transitory hackery will be present until atl2 is merged. Reduce the number of source files by moving ethtool, hw, and param functions from separate files into atl1_main.c, then rename it to just atl1.c. Move all atl1-specific definitions from atl1_hw.h to atl1.h. Finally, clean up to make checkpatch.pl happy. Signed-off-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| * \|	atl1: relocate atl1 driver to /drivers/net/atlx	Jay Cliburn	2008-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In preparation for a future Atheros L2 NIC driver (called atl2), relocate the atl1 driver into a new /drivers/net/atlx directory that will ultimately be shared with the future atl2 driver. Signed-off-by: Chris Snook <csnook@redhat.com> Signed-off-by: Jay Cliburn <jacliburn@bellsouth.net> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| * \|	remove the obsolete xircom_tulip_cb driver	Adrian Bunk	2008-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The xircom_tulip_cb driver has been replaced the xircom_cb driver, and since it depended on BROKEN_ON_SMP it e.g. was no longer present in many distribution kernels. This patch therefore removes it. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
\| * \|	sk98lin: remove obsolete driver	Stephen Hemminger	2008-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All the hardware supported by this driver is now supported by the skge driver. The last remaining issue was support for ancient dual port SysKonnect fiber boards, and the skge driver now does these correctly (p.s. sk98lin was always broken on these old dual port boards anyway). Signed-off-by: Stephen Hemminger <shemminger@linux-foundation.org> Signed-off-by: Jeff Garzik <jeff@garzik.org>
* \| \|	[IPV4] route: use read_mostly	Stephen Hemminger	2008-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The route table parameters are set based on system memory and sysctl values that almost never change. Also the genid only changes every 10 minutes. RTprint is defined by never used. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[IPV4]: sk parameter is unused in ipv4_dst_blackhole.	Denis V. Lunev	2008-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Just remove it. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[NET]: NPROTO is redundant; it's equal to AF_MAX/PF_MAX.	Rusty Russell	2008-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	DaveM pointed out NPROTO exposed to userspace, so keep it around, just make sure it stays in sync. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[RAW]: Add raw_hashinfo member on struct proto.	Pavel Emelyanov	2008-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Sorry for the patch sequence confusion :\| but I found that the similar thing can be done for raw sockets easily too late. Expand the proto.h union with the raw_hashinfo member and use it in raw_prot and rawv6_prot. This allows to drop the protocol specific versions of hash and unhash callbacks. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[UDP]: Make full use of proto.h.udp_hash innovation.	Pavel Emelyanov	2008-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After this we have only udp_lib_get_port to get the port and two stubs for ipv4 and ipv6. No difference in udp and udplite except for initialized h.udp_hash member. I tried to find a graceful way to drop the only difference between udp_v4_get_port and udp_v6_get_port (i.e. the rcv_saddr comparison routine), but adding one more callback on the struct proto didn't appear such :( Maybe later. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[SOCK]: Add udp_hash member to struct proto.	Pavel Emelyanov	2008-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Inspired by the commit ab1e0a13 ([SOCK] proto: Add hashinfo member to struct proto) from Arnaldo, I made similar thing for UDP/-Lite IPv4 and -v6 protocols. The result is not that exciting, but it removes some levels of indirection in udpxxx_get_port and saves some space in code and text. The first step is to union existing hashinfo and new udp_hash on the struct proto and give a name to this union, since future initialization of tcpxxx_prot, dccp_vx_protinfo and udpxxx_protinfo will cause gcc warning about inability to initialize anonymous member this way. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[IPV4]: Always pass ip_options pointer into ip_options_compile.	Denis V. Lunev	2008-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This makes code a bit more uniform and straigthforward. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[IPV4]: Remove unused ip_options->is_data.	Denis V. Lunev	2008-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ip_options->is_data is assigned only and never checked. The structure is not a part of kernel interface to the userspace. So, it is safe to remove this field. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[IPV4]: Remove unnecessary check for opt->is_data in ip_options_compile.	Denis V. Lunev	2008-03-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is the only way to reach ip_options compile with opt != NULL: ip_options_get_finish opt->is_data = 1; ip_options_compile(opt, NULL) So, checking for is_data inside opt != NULL branch is not needed. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[TCP]: TCP_DEFER_ACCEPT updates - process as established	Patrick McManus	2008-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Change TCP_DEFER_ACCEPT implementation so that it transitions a connection to ESTABLISHED after handshake is complete instead of leaving it in SYN-RECV until some data arrvies. Place connection in accept queue when first data packet arrives from slow path. Benefits: - established connection is now reset if it never makes it to the accept queue - diagnostic state of established matches with the packet traces showing completed handshake - TCP_DEFER_ACCEPT timeouts are expressed in seconds and can now be enforced with reasonable accuracy instead of rounding up to next exponential back-off of syn-ack retry. Signed-off-by: Patrick McManus <mcmanus@ducksong.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[TCP]: TCP_DEFER_ACCEPT updates - dont retxmt synack	Patrick McManus	2008-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	a socket in LISTEN that had completed its 3 way handshake, but not notified userspace because of SO_DEFER_ACCEPT, would retransmit the already acked syn-ack during the time it was waiting for the first data byte from the peer. Signed-off-by: Patrick McManus <mcmanus@ducksong.com> Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[TCP]: TCP_DEFER_ACCEPT updates - defer timeout conflicts with max_thresh	Patrick McManus	2008-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	timeout associated with SO_DEFER_ACCEPT wasn't being honored if it was less than the timeout allowed by the maximum syn-recv queue size algorithm. Fix by using the SO_DEFER_ACCEPT value if the ack has arrived. Signed-off-by: Patrick McManus <mcmanus@ducksong.com> Acked-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	socket: SOCK_DEBUG type checking	Stephen Hemminger	2008-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the inline trick (same as pr_debug) to get checking of debug statements even if no code is generated. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[NET]: NULL pointer dereference and other nasty things in /proc/net/(tcp\|udp)[6]	Pavel Emelyanov	2008-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commits f40c81 ([NETNS][IPV4] tcp - make proc handle the network namespaces) and a91275 ([NETNS][IPV6] udp - make proc handle the network namespace) both introduced bad checks on sockets and tw buckets to belong to proper net namespace. I.e. when checking for socket to belong to given net and family the do { sk = sk_next(sk); } while (sk && sk->sk_net != net && sk->sk_family != family); constructions were used. This is wrong, since as soon as the sk->sk_net fits the net the socket is immediately returned, even if it belongs to other family. As the result four /proc/net/(udp\|tcp)[6] entries show wrong info. The udp6 entry even oopses when dereferencing inet6_sk(sk) pointer: static void udp6_sock_seq_show(struct seq_file seq, struct sock sp, int bucket) { ... struct ipv6_pinfo np = inet6_sk(sp); ... dest = &np->daddr; / will be NULL for AF_INET sockets */ ... seq_printf(... dest->s6_addr32[0], dest->s6_addr32[1], dest->s6_addr32[2], dest->s6_addr32[3], ... Fix it by converting && to \|\|. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	netlink: make socket filters work on netlink	Stephen Hemminger	2008-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make socket filters work for netlink unicast and notifications. This is useful for applications like Zebra that get overrun with messages that are then ignored. Note: netlink messages are in host byte order, but packet filter state machine operations are done as network byte order. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[NETNS][IPV6] tcp6 - make proc per namespace	Daniel Lezcano	2008-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Make the proc for tcp6 to be per namespace. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[NETNS][IPV6] udp6 - make proc per namespace	Daniel Lezcano	2008-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The proc init/exit functions take a new network namespace parameter in order to register/unregister /proc/net/udp6 for a namespace. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[NETNS][IPV4] tcp - make proc handle the network namespaces	Daniel Lezcano	2008-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch, like udp proc, makes the proc functions to take care of which namespace the socket belongs. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[NETNS][IPV6] tcp - assign the netns for timewait sockets	Daniel Lezcano	2008-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Copy the network namespace from the socket to the timewait socket. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[NETNS][IPV6] udp - make proc handle the network namespace	Daniel Lezcano	2008-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes the common udp proc functions to take care of which socket they should show taking into account the namespace it belongs. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[NETNS][IPV6] mcast - fix compilation warning when procfs is not compiled in	Daniel Lezcano	2008-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When CONFIG_PROC_FS=no, the out_sock_create label is not used because the code using it is disabled and that leads to a warning at compile time. This patch fix that by making a specific function to initialize proc for igmp6, and remove the annoying CONFIG_PROC_FS sections in init/exit function. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[NET]: Add per-connection option to set max TSO frame size	Peter P Waskiewicz Jr	2008-03-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update: My mailer ate one of Jarek's feedback mails... Fixed the parameter in netif_set_gso_max_size() to be u32, not u16. Fixed the whitespace issue due to a patch import botch. Changed the types from u32 to unsigned int to be more consistent with other variables in the area. Also brought the patch up to the latest net-2.6.26 tree. Update: Made gso_max_size container 32 bits, not 16. Moved the location of gso_max_size within netdev to be less hotpath. Made more consistent names between the sock and netdev layers, and added a define for the max GSO size. Update: Respun for net-2.6.26 tree. Update: changed max_gso_frame_size and sk_gso_max_size from signed to unsigned - thanks Stephen! This patch adds the ability for device drivers to control the size of the TSO frames being sent to them, per TCP connection. By setting the netdevice's gso_max_size value, the socket layer will set the GSO frame size based on that value. This will propogate into the TCP layer, and send TSO's of that size to the hardware. This can be desirable to help tune the bursty nature of TSO on a per-adapter basis, where one may have 1 GbE and 10 GbE devices coexisting in a system, one running multiqueue and the other not, etc. This can also be desirable for devices that cannot support full 64 KB TSO's, but still want to benefit from some level of segmentation offloading. Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	Merge branch 'master' of ↵	David S. Miller	2008-03-21
\|\ \ \ \| \| \|/ \| \|/\| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
\| * \|	[NET] ifb: set separate lockdep classes for queue locks	Jarek Poplawski	2008-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ 10.536424] ======================================================= [ 10.536424] [ INFO: possible circular locking dependency detected ] [ 10.536424] 2.6.25-rc3-devel #3 [ 10.536424] ------------------------------------------------------- [ 10.536424] swapper/0 is trying to acquire lock: [ 10.536424] (&dev->queue_lock){-+..}, at: [<c0299b4a>] dev_queue_xmit+0x175/0x2f3 [ 10.536424] [ 10.536424] but task is already holding lock: [ 10.536424] (&p->tcfc_lock){-+..}, at: [<f8a67154>] tcf_mirred+0x20/0x178 [act_mirred] [ 10.536424] [ 10.536424] which lock already depends on the new lock. lockdep warns of locking order while using ifb with sch_ingress and act_mirred: ingress_lock, tcfc_lock, queue_lock (usually queue_lock is at the beginning). This patch is only to tell lockdep that ifb is a different device (e.g. from eth) and has its own pair of queue locks. (This warning is a false-positive in common scenario of using ifb; yet there are possible situations, when this order could be dangerous; lockdep should warn in such a case.) (With suggestions by David S. Miller) Reported-and-tested-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	[IPV6] KCONFIG: Fix description about IPV6_TUNNEL.	YOSHIFUJI Hideaki	2008-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Based on notice from "Colin" <colins@sjtu.edu.cn>. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	[TCP]: Fix shrinking windows with window scaling	Patrick McHardy	2008-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When selecting a new window, tcp_select_window() tries not to shrink the offered window by using the maximum of the remaining offered window size and the newly calculated window size. The newly calculated window size is always a multiple of the window scaling factor, the remaining window size however might not be since it depends on rcv_wup/rcv_nxt. This means we're effectively shrinking the window when scaling it down. The dump below shows the problem (scaling factor 2^7): - Window size of 557 (71296) is advertised, up to 3111907257: IP 172.2.2.3.33000 > 172.2.2.2.33000: . ack 3111835961 win 557 <...> - New window size of 514 (65792) is advertised, up to 3111907217, 40 bytes below the last end: IP 172.2.2.3.33000 > 172.2.2.2.33000: . 3113575668:3113577116(1448) ack 3111841425 win 514 <...> The number 40 results from downscaling the remaining window: 3111907257 - 3111841425 = 65832 65832 / 2^7 = 514 65832 % 2^7 = 40 If the sender uses up the entire window before it is shrunk, this can have chaotic effects on the connection. When sending ACKs, tcp_acceptable_seq() will notice that the window has been shrunk since tcp_wnd_end() is before tp->snd_nxt, which makes it choose tcp_wnd_end() as sequence number. This will fail the receivers checks in tcp_sequence() however since it is before it's tp->rcv_wup, making it respond with a dupack. If both sides are in this condition, this leads to a constant flood of ACKs until the connection times out. Make sure the window is never shrunk by aligning the remaining window to the window scaling factor. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	netpoll: zap_completion_queue: adjust skb->users counter	Jarek Poplawski	2008-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	zap_completion_queue() retrieves skbs from completion_queue where they have zero skb->users counter. Before dev_kfree_skb_any() it should be non-zero yet, so it's increased now. Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bridge: use time_before() in br_fdb_cleanup()	Fabio Checconi	2008-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In br_fdb_cleanup() next_timer and this_timer are in jiffies, so they should be compared using the time_after() macro. Signed-off-by: Fabio Checconi <fabio@gandalf.sssup.it> Signed-off-by: Stephen Hemminger <stephen.hemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	[TG3]: Fix build warning on sparc32.	David S. Miller	2008-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Sparc MAC address support should be protected consistently with CONFIG_SPARC, but there was a stray CONFIG_SPARC64 case. Bump driver version and release date. Reported by Andrew Morton. Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	MAINTAINERS: bluez-devel is subscribers-only	Pavel Machek	2008-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Pavel Machek <pavel@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	audit: netlink socket can be auto-bound to pid other than current->pid (v2)	Pavel Emelyanov	2008-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From: Pavel Emelyanov <xemul@openvz.org> This patch is based on the one from Thomas. The kauditd_thread() calls the netlink_unicast() and passes the audit_pid to it. The audit_pid, in turn, is received from the user space and the tool (I've checked the audit v1.6.9) uses getpid() to pass one in the kernel. Besides, this tool doesn't bind the netlink socket to this id, but simply creates it allowing the kernel to auto-bind one. That's the preamble. The problem is that netlink_autobind() _does_not_ guarantees that the socket will be auto-bound to the current pid. Instead it uses the current pid as a hint to start looking for a free id. So, in case of conflict, the audit messages can be sent to a wrong socket. This can happen (it's unlikely, but can be) in case some task opens more than one netlink sockets and then the audit one starts - in this case the audit's pid can be busy and its socket will be bound to another id. The proposal is to introduce an audit_nlk_pid in audit subsys, that will point to the netlink socket to send packets to. It will most often be equal to audit_pid. The socket id can be got from the skb's netlink CB right in the audit_receive_msg. The audit_nlk_pid reset to 0 is not required, since all the decisions are taken based on audit_pid value only. Later, if the audit tools will bind the socket themselves, the kernel will have to provide a way to setup the audit_nlk_pid as well. A good side effect of this patch is that audit_pid can later be converted to struct pid, as it is not longer safe to use pid_t-s in the presence of pid namespaces. But audit code still uses the tgid from task_struct in the audit_signal_info and in the audit_filter_syscall. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Eric Paris <eparis@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	[NET]: Fix permissions of /proc/net	Andre Noll	2008-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	commit e9720ac ([NET]: Make /proc/net a symlink on /proc/self/net (v3)) broke ganglia and probably other applications that read /proc/net/dev. This is due to the change of permissions of /proc/net that was introduced in that commit. Before: dr-xr-xr-x 5 root root 0 Mar 19 11:30 /proc/net After: dr-xr--r-- 5 root root 0 Mar 19 11:29 /proc/self/net This patch restores the permissions to the old value which makes ganglia happy again. Pavel Emelyanov says: This also broke the postfix, as it was reported in bug #10286 and described in detail by Benjamin. Signed-off-by: Andre Noll <maan@systemlinux.org> Acked-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	[SCTP]: Fix a race between module load and protosw access	Vlad Yasevich	2008-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is a race is SCTP between the loading of the module and the access by the socket layer to the protocol functions. In particular, a list of addresss that SCTP maintains is not initialized prior to the registration with the protosw. Thus it is possible for a user application to gain access to SCTP functions before everything has been initialized. The problem shows up as odd crashes during connection initializtion when we try to access the SCTP address list. The solution is to refactor how we do registration and initialize the lists prior to registering with the protosw. Care must be taken since the address list initialization depends on some other pieces of SCTP initialization. Also the clean-up in case of failure now also needs to be refactored. Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com> Acked-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	[NETFILTER]: ipt_recent: sanity check hit count	Daniel Hokka Zakrisson	2008-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a rule using ipt_recent is created with a hit count greater than ip_pkt_list_tot, the rule will never match as it cannot keep track of enough timestamps. This patch makes ipt_recent refuse to create such rules. With ip_pkt_list_tot's default value of 20, the following can be used to reproduce the problem. nc -u -l 0.0.0.0 1234 & for i in `seq 1 100`; do echo $i \| nc -w 1 -u 127.0.0.1 1234; done This limits it to 20 packets: iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \ --rsource iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \ 60 --hitcount 20 --name test --rsource -j DROP While this is unlimited: iptables -A OUTPUT -p udp --dport 1234 -m recent --set --name test \ --rsource iptables -A OUTPUT -p udp --dport 1234 -m recent --update --seconds \ 60 --hitcount 21 --name test --rsource -j DROP With the patch the second rule-set will throw an EINVAL. Reported-by: Sean Kennedy <skennedy@vcn.com> Signed-off-by: Daniel Hokka Zakrisson <daniel@hozac.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	[NETFILTER]: nf_conntrack_h323: logical-bitwise & confusion in process_setup()	Roel Kluin	2008-03-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	logical-bitwise & confusion Signed-off-by: Roel Kluin <12o3l@tiscali.nl> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	[RT2X00] drivers/net/wireless/rt2x00/rt2x00dev.c: remove dead code, fix warning	Andrew Morton	2008-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	[NET]: Add debugging names to __RW_LOCK_UNLOCKED macros.	Robert P. J. Day	2008-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	Merge branch 'master' of ↵	David S. Miller	2008-03-18
\|\\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/rt2x00/rt2x00dev.c net/8021q/vlan_dev.c