litmus-rt.git - The LITMUS^RT kernel.

	Commit message	Author	Age
*	KS8851: Correct RX packet allocation	Eric Dumazet	2010-09-09
	Use netdev_alloc_skb_ip_align() helper and do correct allocation. Tested-by: Abraham Arce <x0066660@ti.com> Signed-off-by: Abraham Arce <x0066660@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	udp: add rehash on connect()	Eric Dumazet	2010-09-09
	commit 30fff923 introduced in linux-2.6.33 (udp: bind() optimisation) added a secondary hash on UDP, hashed on (local addr, local port). Problem is that following sequence: fd = socket(...) connect(fd, &remote, ...) not only selects remote end point (address and port), but also sets local address, while UDP stack stored in secondary hash table the socket while its local address was INADDR_ANY (or ipv6 equivalent). Sequence is: - autobind(): choose a random local port, insert socket in hash tables [while local address is INADDR_ANY] - connect(): set remote address and port, change local address to IP given by a route lookup. When an incoming UDP frame comes, if more than 10 sockets are found in primary hash table, we switch to secondary table, and fail to find socket because its local address changed. One solution to this problem is to rehash datagram socket if needed. We add a new rehash(struct socket *) method in "struct proto", and implement this method for UDP v4 & v6, using a common helper. This rehashing only takes care of secondary hash table, since primary hash (based on local port only) is not changed. Reported-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Krzysztof Piotr Oledzki <ole@ans.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: blackhole route should always be recalculated	Jianzhao Wang	2010-09-08
	Blackhole routes are used when xfrm_lookup() returns -EREMOTE (error triggered by IKE for example), hence this kind of route is always temporary and so we should check if a better route exists for next packets. Bug has been introduced by commit d11a4dc18bf41719c9f0d7ed494d295dd2973b92. Signed-off-by: Jianzhao Wang <jianzhao.wang@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv4: Suppress lockdep-RCU false positive in FIB trie (3)	Jarek Poplawski	2010-09-08
	Hi, Here is one more of these warnings and a patch below:
	Sep 5 23:52:33 del kernel: [46044.244833] ===================================================
	Sep 5 23:52:33 del kernel: [46044.269681] [ INFO: suspicious rcu_dereference_check() usage. ]
	Sep 5 23:52:33 del kernel: [46044.277000] ---------------------------------------------------
	Sep 5 23:52:33 del kernel: [46044.285185] net/ipv4/fib_trie.c:1756 invoked rcu_dereference_check() without protection!
	Sep 5 23:52:33 del kernel: [46044.293627]
	Sep 5 23:52:33 del kernel: [46044.293632] other info that might help us debug this:
	Sep 5 23:52:33 del kernel: [46044.293634]
	Sep 5 23:52:33 del kernel: [46044.325333]
	Sep 5 23:52:33 del kernel: [46044.325335] rcu_scheduler_active = 1, debug_locks = 0
	Sep 5 23:52:33 del kernel: [46044.348013] 1 lock held by pppd/1717:
	Sep 5 23:52:33 del kernel: [46044.357548] #0: (rtnl_mutex){+.+.+.}, at: [<c125dc1f>] rtnl_lock+0xf/0x20
	Sep 5 23:52:33 del kernel: [46044.367647]
	Sep 5 23:52:33 del kernel: [46044.367652] stack backtrace:
	Sep 5 23:52:33 del kernel: [46044.387429] Pid: 1717, comm: pppd Not tainted 2.6.35.4.4a #3
	Sep 5 23:52:33 del kernel: [46044.398764] Call Trace:
	Sep 5 23:52:33 del kernel: [46044.409596] [<c12f9aba>] ? printk+0x18/0x1e
	Sep 5 23:52:33 del kernel: [46044.420761] [<c1053969>] lockdep_rcu_dereference+0xa9/0xb0
	Sep 5 23:52:33 del kernel: [46044.432229] [<c12b7235>] trie_firstleaf+0x65/0x70
	Sep 5 23:52:33 del kernel: [46044.443941] [<c12b74d4>] fib_table_flush+0x14/0x170
	Sep 5 23:52:33 del kernel: [46044.455823] [<c1033e92>] ? local_bh_enable_ip+0x62/0xd0
	Sep 5 23:52:33 del kernel: [46044.467995] [<c12fc39f>] ? _raw_spin_unlock_bh+0x2f/0x40
	Sep 5 23:52:33 del kernel: [46044.480404] [<c12b24d0>] ? fib_sync_down_dev+0x120/0x180
	Sep 5 23:52:33 del kernel: [46044.493025] [<c12b069d>] fib_flush+0x2d/0x60
	Sep 5 23:52:33 del kernel: [46044.505796] [<c12b06f5>] fib_disable_ip+0x25/0x50
	Sep 5 23:52:33 del kernel: [46044.518772] [<c12b10d3>] fib_netdev_event+0x73/0xd0
	Sep 5 23:52:33 del kernel: [46044.531918] [<c1048dfd>] notifier_call_chain+0x2d/0x70
	Sep 5 23:52:33 del kernel: [46044.545358] [<c1048f0a>] raw_notifier_call_chain+0x1a/0x20
	Sep 5 23:52:33 del kernel: [46044.559092] [<c124f687>] call_netdevice_notifiers+0x27/0x60
	Sep 5 23:52:33 del kernel: [46044.573037] [<c124faec>] __dev_notify_flags+0x5c/0x80
	Sep 5 23:52:33 del kernel: [46044.586489] [<c124fb47>] dev_change_flags+0x37/0x60
	Sep 5 23:52:33 del kernel: [46044.599394] [<c12a8a8d>] devinet_ioctl+0x54d/0x630
	Sep 5 23:52:33 del kernel: [46044.612277] [<c12aabb7>] inet_ioctl+0x97/0xc0
	Sep 5 23:52:34 del kernel: [46044.625208] [<c123f6af>] sock_ioctl+0x6f/0x270
	Sep 5 23:52:34 del kernel: [46044.638046] [<c109d2b0>] ? handle_mm_fault+0x420/0x6c0
	Sep 5 23:52:34 del kernel: [46044.650968] [<c123f640>] ? sock_ioctl+0x0/0x270
	Sep 5 23:52:34 del kernel: [46044.663865] [<c10c3188>] vfs_ioctl+0x28/0xa0
	Sep 5 23:52:34 del kernel: [46044.676556] [<c10c38fa>] do_vfs_ioctl+0x6a/0x5c0
	Sep 5 23:52:34 del kernel: [46044.688989] [<c1048676>] ? up_read+0x16/0x30
	Sep 5 23:52:34 del kernel: [46044.701411] [<c1021376>] ? do_page_fault+0x1d6/0x3a0
	Sep 5 23:52:34 del kernel: [46044.714223] [<c10b6588>] ? fget_light+0xf8/0x2f0
	Sep 5 23:52:34 del kernel: [46044.726601] [<c1241f98>] ? sys_socketcall+0x208/0x2c0
	Sep 5 23:52:34 del kernel: [46044.739140] [<c10c3eb3>] sys_ioctl+0x63/0x70
	Sep 5 23:52:34 del kernel: [46044.751967] [<c12fca3d>] syscall_call+0x7/0xb
	Sep 5 23:52:34 del kernel: [46044.764734] [<c12f0000>] ? cookie_v6_check+0x3d0/0x630
	-------------->
	This patch fixes the warning:
	===================================================
	[ INFO: suspicious rcu_dereference_check() usage. ]
	---------------------------------------------------
	net/ipv4/fib_trie.c:1756 invoked rcu_dereference_check() without protection!
	other info that might help us debug this:
	rcu_scheduler_active = 1, debug_locks = 0
	1 lock held by pppd/1717:
	#0: (rtnl_mutex){+.+.+.}, at: [<c125dc1f>] rtnl_lock+0xf/0x20
	stack backtrace:
	Pid: 1717, comm: pppd Not tainted 2.6.35.4a #3
	Call Trace:
	[<c12f9aba>] ? printk+0x18/0x1e
	[<c1053969>] lockdep_rcu_dereference+0xa9/0xb0
	[<c12b7235>] trie_firstleaf+0x65/0x70
	[<c12b74d4>] fib_table_flush+0x14/0x170
	...
	Allow trie_firstleaf() to be called either under rcu_read_lock() protection or with RTNL held. The same annotation is added to node_parent_rcu() to prevent a similar warning a bit later. Followup of commits 634a4b20 and 4eaa0e3c. Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	niu: Fix kernel buffer overflow for ETHTOOL_GRXCLSRLALL	Ben Hutchings	2010-09-08
	niu_get_ethtool_tcam_all() assumes that its output buffer is the right size, and warns before returning if it is not. However, the output buffer size is under user control and ETHTOOL_GRXCLSRLALL is an unprivileged ethtool command. Therefore this is at least a local denial-of-service vulnerability. Change it to check before writing each entry and to return an error if the buffer is already full. Compile-tested only. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
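The check-before-write pattern described in the niu fix can be sketched in userspace C as follows. This is only an illustration under stated assumptions, not the driver code: `copy_rule_locs()`, its parameters, and the choice of `-EMSGSIZE` as the error value are hypothetical names and choices made for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: copy rule locations from src[] into a caller-sized
 * buffer out[]. The capacity is checked before every write, so a buffer
 * that is too small yields an error instead of an overflow.
 * Returns the number of entries written, or -EMSGSIZE (assumed error code).
 */
static int copy_rule_locs(const uint32_t *src, size_t n_src,
                          uint32_t *out, size_t out_cap)
{
    size_t cnt = 0;

    for (size_t i = 0; i < n_src; i++) {
        if (cnt == out_cap)      /* buffer already full: stop, do not write */
            return -EMSGSIZE;
        out[cnt++] = src[i];
    }
    return (int)cnt;
}
```

Calling it with three source entries and a two-entry buffer returns the error after two writes; the third entry is never stored.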
*	ipvs: fix active FTP	Julian Anastasov	2010-09-08
	- Do not create expectation when forwarding the PORT command to avoid blocking the connection. The problem is that nf_conntrack_ftp.c:help() tries to create the same expectation later in POST_ROUTING and drops the packet with "dropping packet" message after failure in nf_ct_expect_related. - Change ip_vs_update_conntrack to alter the conntrack for related connections from real server. If we do not alter the reply in this direction the next packet from client sent to vport 20 comes as NEW connection. We alter it but maybe some collision happens for both conntracks and the second conntrack gets destroyed immediately. The connection gets stuck too. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	gro: Re-fix different skb headrooms	Jarek Poplawski	2010-09-08
	The patch: "gro: fix different skb headrooms" in its part: "2) allocate a minimal skb for head of frag_list" is buggy. The copied skb has p->data set at the ip header at the moment, and skb_gro_offset is the length of ip + tcp headers. So, after the change the length of mac header is skipped. Later skb_set_mac_header() sets it into the NET_SKB_PAD area (if it's long enough) and ip header is misaligned at NET_SKB_PAD + NET_IP_ALIGN offset. There is no reason to assume the original skb was wrongly allocated, so let's copy it as it was. bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626 fixes commit: 3d3be4333fdf6faa080947b331a6a19bce1a4f57 Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> CC: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Tested-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
*	via-velocity: Turn scatter-gather support back off.	David S. Miller	2010-09-07
	It causes all kinds of DMA API debugging assertions and all straight-forward attempts to fix it have failed. So turn off SG, and we'll tackle making this work properly in net-next-2.6. Reported-by: Dave Jones <davej@redhat.com> Tested-by: Dave Jones <davej@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv4: Fix reverse path filtering with multipath routing.	David S. Miller	2010-09-07
	Actually iterate over the next-hops to make sure we have a device match. Otherwise RP filtering is always elided when the route matched has multiple next-hops. Reported-by: Igor M Podlesny <for.poige@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	UNIX: Do not loop forever at unix_autobind().	Tetsuo Handa	2010-09-07
	We assumed that unix_autobind() never fails if kzalloc() succeeded. But unix_autobind() allows only 1048576 names. If /proc/sys/fs/file-max is larger than 1048576 (e.g. systems with more than 10GB of RAM), a local user can consume all names using fork()/socket()/bind(). If all names are in use, those who call bind() with addr_len == sizeof(short) or connect()/sendmsg() with setsockopt(SO_PASSCRED) will continue the while (1) yield(); loop at unix_autobind() till a name becomes available. This patch adds a loop counter in order to give up after 1048576 attempts. Calling yield() once per 256 attempts may not be sufficient when many names are already in use, for __unix_find_socket_byname() can take a long time under such circumstances. Therefore, this patch also adds a cond_resched() call. Note that currently a local user can consume 2GB of kernel memory if the user is allowed to create and autobind 1048576 UNIX domain sockets. We should consider adding some restriction for autobind operation. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
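The loop-counter idea from the unix_autobind() fix can be sketched as a bounded search over a fixed name space. This is a userspace illustration under stated assumptions: `autobind_sketch()`, the `in_use[]` table, the tiny `NAME_SPACE` of 16 (standing in for the 1048576 names), and `-ENOSPC` as the give-up error are all hypothetical. The kernel's cond_resched() call has no equivalent here and is only mentioned in a comment.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define NAME_SPACE 16u  /* assumption: stands in for the 1048576 autobind names */

/*
 * Hypothetical sketch: look for a free name starting at `start`, wrapping
 * around the name space. The retry counter bounds the loop, so the search
 * gives up once every name has been tried instead of spinning forever.
 * Returns the chosen name, or -ENOSPC (assumed error code) if none is free.
 */
static int autobind_sketch(const bool in_use[NAME_SPACE], uint32_t start)
{
    uint32_t candidate = start % NAME_SPACE;

    for (uint32_t retries = 0; retries < NAME_SPACE; retries++) {
        if (!in_use[candidate])
            return (int)candidate;
        /* the kernel version would yield the CPU here (cond_resched()) */
        candidate = (candidate + 1) % NAME_SPACE;
    }
    return -ENOSPC;
}
```

With every name taken the function returns the error after exactly `NAME_SPACE` probes; freeing a single name makes the same call succeed.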
*	PATCH: b44 Handle RX FIFO overflow better (simplified)	Mark Lord	2010-09-07
	This patch is a simplified version of the original patch from James Courtier-Dutton. >From: James Courtier-Dutton >Subject: [PATCH] Fix b44 RX FIFO overflow recovery. >Date: Wednesday, June 30, 2010 - 1:11 pm > >This patch improves the recovery after a RX FIFO overflow on the b44 >Ethernet NIC. >Before it would do a complete chip reset, resulting in loss of link >for a few seconds. >This patch improves this to do recovery in about 20ms without loss of link. > >Signed off by: James@superbug.co.uk Signed-off-by: Mark Lord <mlord@pobox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	irda: off by one	Dan Carpenter	2010-09-07
	This is an off by one. We would go past the end when we NUL terminate the "value" string at end of the function. The "value" buffer is allocated in irlan_client_parse_response() or irlan_provider_parse_command(). CC: stable@kernel.org Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	3c59x: Fix deadlock in vortex_error()	Ben Hutchings	2010-09-07
	This fixes a bug introduced in commit de847272149365363a6043a963a6f42fb91566e2 "3c59x: Use fine-grained locks for MII and windowed register access". vortex_interrupt() holds vp->window_lock over multiple register accesses to reduce locking overhead. However it also needs to call vortex_error() sometimes, and that uses the regular functions for access to windowed registers, which will try to acquire window_lock again. Therefore, drop window_lock around the call to vortex_error() and set the window afterward reacquiring the lock. Since vortex_error() may call vortex_rx(), which does require its caller to hold window_lock, lift that call up into vortex_interrupt(). This also removes the potential for calling vortex_rx() on a later-generation NIC. Reported-and-tested-by: Jens Schüßler <jgs@trash.net> [in Debian's 2.6.32] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netfilter: discard overlapping IPv6 fragment	Nicolas Dichtel	2010-09-07
	RFC5722 prohibits reassembling IPv6 fragments when some data overlaps. Bug spotted by Zhang Zuotao <zuotao.zhang@6wind.com>. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv6: discard overlapping fragment	Nicolas Dichtel	2010-09-07
	RFC5722 prohibits reassembling fragments when some data overlaps. Bug spotted by Zhang Zuotao <zuotao.zhang@6wind.com>. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: fix tx queue selection for bridged devices implementing select_queue	Helmut Schaa	2010-09-07
	When a net device is implementing the select_queue callback and is part of a bridge, frames coming from the bridge already have a tx queue associated to the socket (introduced in commit a4ee3ce3293dc931fab19beb472a8bde1295aebe, "net: Use sk_tx_queue_mapping for connected sockets"). The call to sk_tx_queue_get will then return the tx queue used by the bridge instead of calling the select_queue callback. In case of mac80211 this broke QoS which is implemented by using the select_queue callback. Furthermore it introduced problems with rt2x00 because frames with the same TID and RA sometimes appeared on different tx queues which the hw cannot handle correctly. Fix this by always calling select_queue first if it is available and only afterwards use the socket tx queue mapping. Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bonding: Fix jiffies overflow problems (again)	Jiri Bohac	2010-09-07
	The time_before_eq()/time_after_eq() functions operate on unsigned long and only work if the difference between the two compared values is smaller than half the range of unsigned long (31 bits on i386). Some of the variables (slave->jiffies, dev->trans_start, dev->last_rx) used by bonding store a copy of jiffies and may not be updated for a long time. With HZ=1000, time_before_eq()/time_after_eq() will start giving bad results after ~25 days. jiffies will never be before slave->jiffies, dev->trans_start, dev->last_rx by more than possibly a couple ticks caused by preemption of this code. This allows us to detect/prevent these overflows by replacing time_before_eq()/time_after_eq() with time_in_range(). Signed-off-by: Jiri Bohac <jbohac@suse.cz> Signed-off-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
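The wraparound problem described for bonding can be reproduced with a 32-bit model of the jiffies comparisons. The sketch below is an assumption-laden illustration, not the driver change: the `*32` helpers mimic the kernel's time_after_eq()/time_before_eq()/time_in_range() on a 32-bit counter, and `recent_old()` / `recent_new()` are hypothetical names for a "was there activity within the last delta ticks" check written the old way and the range-based way.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * 32-bit model of wrapping tick comparisons (as on i386). The cast to
 * int32_t relies on the usual two's-complement conversion.
 */
static bool time_after_eq32(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) >= 0;
}

static bool time_before_eq32(uint32_t a, uint32_t b)
{
    return time_after_eq32(b, a);
}

/* true if a lies in the (wrapping) interval [b, c] */
static bool time_in_range32(uint32_t a, uint32_t b, uint32_t c)
{
    return time_after_eq32(a, b) && time_before_eq32(a, c);
}

/* Hypothetical old-style check: only an upper bound on `now`. */
static bool recent_old(uint32_t now, uint32_t last, uint32_t delta)
{
    return time_before_eq32(now, last + delta);
}

/* Hypothetical range-based check: `now` must also not be "before" last - delta. */
static bool recent_new(uint32_t now, uint32_t last, uint32_t delta)
{
    return time_in_range32(now, last - delta, last + delta);
}
```

When `now` is only 50 ticks past `last`, both checks agree. Once the stored timestamp is more than 2^31 ticks old (about 24.9 days at HZ=1000), `recent_old()` wrongly reports recent activity, while `recent_new()` rejects it because `now` falls outside the range.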
*	stmmac: fix sleep inside atomic	Giuseppe Cavallaro	2010-09-07
	We cannot use spinlock when kmalloc is invoked with GFP_KERNEL flag because it can sleep. So this patch reviews the usage of spinlock within the stmmac_resume function, avoiding this bug. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Reported-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	cls_cgroup: Fix rcu lockdep warning	Li Zefan	2010-09-03
	Dave reported an rcu lockdep warning on 2.6.35.4 kernel. task->cgroups and task->cgroups->subsys[i] are protected by RCU. So we avoid accessing invalid pointers here. This might happen, for example, when you are deref-ing those pointers while someone moves @task from one cgroup to another. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	be2net: remove a BUG_ON in be_cmds.c	Ajit Khaparde	2010-09-03
	Async notifications other than link status are possible in certain configurations. Remove the BUG_ON in the mcc completion processing path. Signed-off-by: Ajit Khaparde <ajitk@serverengines.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	be2net: fix a bug in UE detection logic	Ajit Khaparde	2010-09-03
	The ONLINE registers can return 0xFFFFFFFF on more than one occasion. On systems that care, reading these registers could lead to problems. So the new code decides that the ASIC has encountered an error by reading the UE_STATUS_LOW/HIGH registers. AND them with the mask values and a non-zero result indicates an error. Signed-off-by: Ajit Khaparde <ajitk@serverengines.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	be2net: fix net-snmp error because of wrong packet stats	Ajit Khaparde	2010-09-03
	Wrong packet statistics for multicast Rx was causing net-snmp error messages every 15 seconds. Instead of picking the multicast stats from hardware, now maintain it in the driver itself. Signed-off-by: Ajit Khaparde <ajitk@serverengines.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	pkt_sched: Fix lockdep warning on est_tree_lock in gen_estimator	Jarek Poplawski	2010-09-02
	This patch fixes a lockdep warning: [ 516.287584] ========================================================= [ 516.288386] [ INFO: possible irq lock inversion dependency detected ] [ 516.288386] 2.6.35b #7 [ 516.288386] --------------------------------------------------------- [ 516.288386] swapper/0 just changed the state of lock: [ 516.288386] (&qdisc_tx_lock){+.-...}, at: [<c12eacda>] est_timer+0x62/0x1b4 [ 516.288386] but this lock took another, SOFTIRQ-unsafe lock in the past: [ 516.288386] (est_tree_lock){+.+...} [ 516.288386] [ 516.288386] and interrupts could create inverse lock ordering between them. ... So, est_tree_lock needs BH protection because it's taken by qdisc_tx_lock, which is used both in BH and process contexts. (Full warning with this patch at netdev, 02 Sep 2010.) Fixes commit: ae638c47dc040b8def16d05dc6acdd527628f231 ("pkt_sched: gen_estimator: add a new lock") Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipvs: avoid oops for passive FTP	Julian Anastasov	2010-09-02
	Fix Passive FTP problem in ip_vs_ftp: - Do not oops in nf_nat_set_seq_adjust (adjust_tcp_sequence) when iptable_nat module is not loaded. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Revert "sky2: don't do GRO on second port"	David S. Miller	2010-09-02
	This reverts commit de6be6c1f77798c4da38301693d33aff1cd76e84. After some discussion with Jarek Poplawski and Eric Dumazet, we've decided that this change is incorrect. Signed-off-by: David S. Miller <davem@davemloft.net>
*	gro: fix different skb headrooms	Eric Dumazet	2010-09-01
	Packets entering GRO might have different headrooms, even for a given flow (because of implementation details in drivers, like copybreak). We can't force drivers to deliver packets with a fixed headroom. 1) fix skb_segment(): skb_segment() makes the false assumption that headrooms of fragments are the same as that of the head. When CHECKSUM_PARTIAL is used, this can give csum_start errors, and crash later in skb_copy_and_csum_dev(). 2) allocate a minimal skb for head of frag_list: skb_gro_receive() uses netdev_alloc_skb(headroom + skb_gro_offset(p)) to allocate a fresh skb. This adds NET_SKB_PAD to a padding already provided by netdevice, depending on various things, like copybreak. Use alloc_skb() to allocate an exact padding, to reduce cache line needs: NET_SKB_PAD + NET_IP_ALIGN. bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=16626 Many thanks to Plamen Petrov, testing many debugging patches! With help of Jarek Poplawski. Reported-by: Plamen Petrov <pvp-lsts@fs.uni-ruse.bg> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bridge: Clear INET control block of SKBs passed into ip_fragment().	David S. Miller	2010-09-01
	In a similar vein to commit 17762060c25590bfddd68cc1131f28ec720f405f ("bridge: Clear IPCB before possible entry into IP stack"). Any time we call into the IP stack we have to make sure the state there is as expected by the ipv4 code. With help from Eric Dumazet and Herbert Xu. Reported-by: Bandan Das <bandan.das@stratus.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	3c59x: Remove incorrect locking; correct documented lock hierarchy	Ben Hutchings	2010-09-01
	vortex_ioctl() was grabbing vortex_private::lock around its call to generic_mii_ioctl(). This is no longer necessary since there are more specific locks which the mdio_{read,write}() functions will obtain. Worse, those functions do not save and restore IRQ flags when locking the MII state, so interrupts will be enabled when generic_mii_ioctl() returns. Since there is currently no need for any function to call mdio_{read,write}() while holding another spinlock, do not change them to save and restore IRQ flags but remove the specification of ordering between vortex_private::lock and vortex_private::mii_lock. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
*	sky2: don't do GRO on second port	stephen hemminger	2010-09-01
	There's something very important I forgot to tell you. What? Don't cross the GRO streams. Why? It would be bad. I'm fuzzy on the whole good/bad thing. What do you mean, "bad"? Try to imagine all the Internet as you know it stopping instantaneously and every bit in every packet swapping at the speed of light. Total packet reordering. Right. That's bad. Okay. All right. Important safety tip. Thanks, Hubert The simplest way to stop this is just avoid doing GRO on the second port. Very few Marvell boards support two ports per ring, and GRO is just an optimization. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	ipv4: minor fix about RPF in help of Kconfig	Nicolas Dichtel	2010-09-01
	Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	xfrm_user: avoid a warning with some compiler	Nicolas Dichtel	2010-09-01
	Attached is a small patch to remove a warning ("warning: ISO C90 forbids mixed declarations and code" with gcc 4.3.2). Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net/sched/sch_hfsc.c: initialize parent's cl_cfmin properly in init_vf()	Michal Soltys	2010-09-01
	This patch fixes the init_vf() function, so on each new backlog period parent's cl_cfmin is properly updated (including further propagation towards the root), even if the activated leaf has no upperlimit curve defined. Signed-off-by: Michal Soltys <soltys@ziu.info> Signed-off-by: David S. Miller <davem@davemloft.net>
*	pxa168_eth: fix a mdiobus leak	Denis Kirjanov	2010-09-01
	mdiobus resources must be released on exit. Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org> Acked-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net sched: fix kernel leak in act_police	Jeff Mahoney	2010-09-01
	While reviewing commit 1c40be12f7d8ca1d387510d39787b12e512a7ce8, I audited other users of tc_action_ops->dump for information leaks. That commit covered almost all of them but act_police still had a leak. opt.limit and opt.capab aren't zeroed out before the structure is passed out. This patch uses the C99 initializers to zero everything unused out. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Acked-by: Jeff Mahoney <jeffm@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
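The effect of C99 designated initializers mentioned in the act_police fix can be shown with a small stand-in structure. This is a sketch under stated assumptions: `struct demo_police_opt` and `make_opt()` are hypothetical and do not reproduce the real kernel structure layout; only the field names `limit` and `capab` are taken from the commit text.

```c
#include <stdint.h>

/* Hypothetical stand-in for a dump structure handed out to userspace. */
struct demo_police_opt {
    uint32_t index;
    uint32_t limit;   /* never set by make_opt() */
    uint32_t capab;   /* never set by make_opt() */
    uint32_t mtu;
};

/*
 * With a designated initializer, every member that is not named is
 * initialized to zero, so no stale stack contents end up in limit/capab.
 */
static struct demo_police_opt make_opt(uint32_t index, uint32_t mtu)
{
    struct demo_police_opt opt = {
        .index = index,
        .mtu   = mtu,
    };
    return opt;
}
```

A caller that inspects the result sees `limit` and `capab` as 0 even though the function never assigns them. The stand-in uses only same-sized members, so it has no padding bytes to consider.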
*	vhost: stop worker only if created	Eric Dumazet	2010-09-01
	It's currently illegal to call kthread_stop(NULL). Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	MAINTAINERS: Add ehea driver as Supported	Breno Leitao	2010-09-01
	This change just adds the IBM eHEA 10Gb network drivers as supported. Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6	David S. Miller	2010-09-01
\| *	ath9k_hw: fix parsing of HT40 5 GHz CTLs	Luis R. Rodriguez	2010-08-31
	The 5 GHz CTL indexes were not being read for all hardware devices due to the masking out through the CTL_MODE_M mask being one bit too short. Without this the calibrated regulatory maximum values were not being picked up when devices operate on 5 GHz in HT40 mode. The final output power used for Atheros devices is the minimum between the calibrated CTL values and what CRDA provides. Cc: stable@kernel.org [2.6.27+] Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	ath9k_hw: Fix EEPROM uncompress block reading on AR9003	Luis R. Rodriguez	2010-08-31
	The EEPROM is compressed on AR9003, upon decompression the wrong upper limit was being used for the block which prevented the 5 GHz CTL indexes from being used, which are stored towards the end of the EEPROM block. This fix allows the actual intended regulatory limits to be used on AR9003 hardware. Cc: stable@kernel.org [2.6.36+] Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	wireless: register wiphy rfkill w/o holding cfg80211_mutex	John W. Linville	2010-08-31
	Otherwise lockdep complains... https://bugzilla.kernel.org/show_bug.cgi?id=17311
	[ INFO: possible circular locking dependency detected ]
	2.6.36-rc2-git4 #12
	-------------------------------------------------------
	kworker/0:3/3630 is trying to acquire lock:
	(rtnl_mutex){+.+.+.}, at: [<ffffffff813396c7>] rtnl_lock+0x12/0x14
	but task is already holding lock:
	(rfkill_global_mutex){+.+.+.}, at: [<ffffffffa014b129>] rfkill_switch_all+0x24/0x49 [rfkill]
	which lock already depends on the new lock.
	the existing dependency chain (in reverse order) is:
	-> #2 (rfkill_global_mutex){+.+.+.}:
	[<ffffffff81079ad7>] lock_acquire+0x120/0x15b
	[<ffffffff813ae869>] __mutex_lock_common+0x54/0x52e
	[<ffffffff813aede9>] mutex_lock_nested+0x34/0x39
	[<ffffffffa014b4ab>] rfkill_register+0x2b/0x29c [rfkill]
	[<ffffffffa0185ba0>] wiphy_register+0x1ae/0x270 [cfg80211]
	[<ffffffffa0206f01>] ieee80211_register_hw+0x1b4/0x3cf [mac80211]
	[<ffffffffa0292e98>] iwl_ucode_callback+0x9e9/0xae3 [iwlagn]
	[<ffffffff812d3e9d>] request_firmware_work_func+0x54/0x6f
	[<ffffffff81065d15>] kthread+0x8c/0x94
	[<ffffffff8100ac24>] kernel_thread_helper+0x4/0x10
	-> #1 (cfg80211_mutex){+.+.+.}:
	[<ffffffff81079ad7>] lock_acquire+0x120/0x15b
	[<ffffffff813ae869>] __mutex_lock_common+0x54/0x52e
	[<ffffffff813aede9>] mutex_lock_nested+0x34/0x39
	[<ffffffffa018605e>] cfg80211_get_dev_from_ifindex+0x1b/0x7c [cfg80211]
	[<ffffffffa0189f36>] cfg80211_wext_giwscan+0x58/0x990 [cfg80211]
	[<ffffffff8139a3ce>] ioctl_standard_iw_point+0x1a8/0x272
	[<ffffffff8139a529>] ioctl_standard_call+0x91/0xa7
	[<ffffffff8139a687>] T.723+0xbd/0x12c
	[<ffffffff8139a727>] wext_handle_ioctl+0x31/0x6d
	[<ffffffff8133014e>] dev_ioctl+0x63d/0x67a
	[<ffffffff8131afd9>] sock_ioctl+0x48/0x21d
	[<ffffffff81102abd>] do_vfs_ioctl+0x4ba/0x509
	[<ffffffff81102b5d>] sys_ioctl+0x51/0x74
	[<ffffffff81009e02>] system_call_fastpath+0x16/0x1b
	-> #0 (rtnl_mutex){+.+.+.}:
	[<ffffffff810796b0>] __lock_acquire+0xa93/0xd9a
	[<ffffffff81079ad7>] lock_acquire+0x120/0x15b
	[<ffffffff813ae869>] __mutex_lock_common+0x54/0x52e
	[<ffffffff813aede9>] mutex_lock_nested+0x34/0x39
	[<ffffffff813396c7>] rtnl_lock+0x12/0x14
	[<ffffffffa0185cb5>] cfg80211_rfkill_set_block+0x1a/0x7b [cfg80211]
	[<ffffffffa014aed0>] rfkill_set_block+0x80/0xd5 [rfkill]
	[<ffffffffa014b07e>] __rfkill_switch_all+0x3f/0x6f [rfkill]
	[<ffffffffa014b13d>] rfkill_switch_all+0x38/0x49 [rfkill]
	[<ffffffffa014b821>] rfkill_op_handler+0x105/0x136 [rfkill]
	[<ffffffff81060708>] process_one_work+0x248/0x403
	[<ffffffff81062620>] worker_thread+0x139/0x214
	[<ffffffff81065d15>] kthread+0x8c/0x94
	[<ffffffff8100ac24>] kernel_thread_helper+0x4/0x10
	Signed-off-by: John W. Linville <linville@tuxdriver.com> Acked-by: Johannes Berg <johannes@sipsolutions.net>
\| *	wireless extensions: fix kernel heap content leak	Johannes Berg	2010-08-30
	Wireless extensions have an unfortunate, undocumented requirement which requires drivers to always fill iwp->length when returning a successful status. When a driver doesn't do this, it leads to a kernel heap content leak when userspace offers a larger buffer than would have been necessary. Arguably, this is a driver bug, as it should, if it returns 0, fill iwp->length, even if it separately indicated that the buffer contents was not valid. However, we can also at least avoid the memory content leak if the driver doesn't do this by setting the iwp length to max_tokens, which then reflects how big the buffer is that the driver may fill, regardless of how big the userspace buffer is. To illustrate the point, this patch also fixes a corresponding cfg80211 bug (since this requirement isn't documented nor was ever pointed out by anyone during code review, I don't trust all drivers nor all cfg80211 handlers to implement it correctly). Cc: stable@kernel.org [all the way back] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	MAINTAINERS: change broken url for prism54	John W. Linville	2010-08-30
\| \| \| \| \| \| \| \| \| \|	Reported-by: Joe Perches <joe@perches.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	mac80211: delete work timer	Johannes Berg	2010-08-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The new workqueue changes helped me find this bug that's been lingering since the changes to the work processing in mac80211 -- the work timer is never deleted properly. Do that to avoid having it fire after all data structures have been freed. It can't be re-armed because all it will do, if running, is schedule the work, but that gets flushed later and won't have anything to do since all work items are gone by now (by way of interface removal). Cc: stable@kernel.org [2.6.34+] Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	p54: fix tx feedback status flag check	Christian Lamparter	2010-08-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Michael reported that p54* never really entered power save mode, even though it was enabled. It turned out that upon a power save mode change the firmware will set a special flag onto the last outgoing frame tx status (which in this case is almost always the designated PSM nullfunc frame). This flag confused the driver; it erroneously reported transmission failures to the stack, which then generated the next nullfunc, and so on... Cc: <stable@kernel.org> Reported-by: Michael Buesch <mb@bu3sch.de> Tested-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
\| *	ath5k: check return value of ieee80211_get_tx_rate	John W. Linville	2010-08-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This avoids a NULL pointer dereference as reported here: https://bugzilla.redhat.com/show_bug.cgi?id=625889 When the WARN condition is hit in ieee80211_get_tx_rate, it will return NULL. So, we need to check the return value and avoid dereferencing it in that case. Signed-off-by: John W. Linville <linville@tuxdriver.com> Cc: stable@kernel.org Acked-by: Bob Copeland <me@bobcopeland.com>
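The pattern described above, checking a lookup result before dereferencing it, might look roughly like this in isolation. The names `rate_sketch`, `get_tx_rate_sketch` and `tx_hw_rate` are hypothetical and only stand in for the real mac80211 and ath5k code; the `-1` error value is likewise an assumption.

```c
#include <stddef.h>

/* Hypothetical, reduced stand-in for a rate table entry. */
struct rate_sketch {
    int hw_value;
};

/* Returns NULL for an out-of-range index, mimicking a lookup that can fail. */
static const struct rate_sketch *get_tx_rate_sketch(const struct rate_sketch *table,
                                                    int count, int idx)
{
    if (idx < 0 || idx >= count)
        return NULL;
    return &table[idx];
}

/* Returns the hardware rate value, or -1 if the lookup failed. */
static int tx_hw_rate(const struct rate_sketch *table, int count, int idx)
{
    const struct rate_sketch *rate = get_tx_rate_sketch(table, count, idx);

    if (rate == NULL)       /* check before dereferencing */
        return -1;
    return rate->hw_value;
}
```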
\| *	libertas: if_sdio: fix buffer alignment in struct if_sdio_card	Mike Rapoport	2010-08-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The commit 886275ce41a9751117367fb387ed171049eb6148 (param: lock if_sdio's lbs_helper_name and lbs_fw_name against sysfs changes) introduced new fields into the if_sdio_card structure. It caused misalignment of the if_sdio_card.buffer field and failure at driver load time: ~# modprobe libertas_sdio [ 62.315124] libertas_sdio: Libertas SDIO driver [ 62.319976] libertas_sdio: Copyright Pierre Ossman [ 63.020629] DMA misaligned error with device 48 [ 63.025207] mmci-omap-hs mmci-omap-hs.1: unexpected dma status 800 [ 66.005035] libertas: command 0x0003 timed out [ 66.009826] libertas: Timeout submitting command 0x0003 [ 66.016296] libertas: PREP_CMD: command 0x0003 failed: -110 Adding an explicit alignment attribute for the if_sdio_card.buffer field fixes this problem. Signed-off-by: Mike Rapoport <mike@compulab.co.il> Acked-by: Marek Vasut <marek.vasut@gmail.com> Acked-by: Dan Williams <dcbw@redhat.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
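How a newly added field can shift a buffer off its alignment, and how an explicit attribute restores it, can be shown with two reduced structures. This is a sketch assuming GCC or Clang; the structure names, field sizes and the 4-byte alignment are illustrative assumptions, not the driver's actual layout.

```c
#include <stddef.h>
#include <stdint.h>

/* A 3-byte field placed before the buffer pushes it to an odd offset. */
struct card_unaligned {
    char    helper_name[3];
    uint8_t buffer[64];
};

/* Same layout, but the buffer carries an explicit alignment attribute. */
struct card_aligned {
    char    helper_name[3];
    uint8_t buffer[64] __attribute__((aligned(4)));
};

/* Offsets of the buffer field in each variant. */
static size_t unaligned_buffer_offset(void)
{
    return offsetof(struct card_unaligned, buffer);
}

static size_t aligned_buffer_offset(void)
{
    return offsetof(struct card_aligned, buffer);
}
```

In the first structure the buffer starts at offset 3; in the second the compiler inserts padding so that it starts on a 4-byte boundary regardless of which fields precede it.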
* \|	netlink: Make NETLINK_USERSOCK work again.	David S. Miller	2010-08-31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Once we started enforcing that a nl_table[] entry exists for a protocol, NETLINK_USERSOCK stopped working. Add a dummy table entry so that it works again. Reported-by: Thomas Voegtle <tv@lio96.de> Tested-by: Thomas Voegtle <tv@lio96.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	irda: Correctly clean up self->ias_obj on irda_bind() failure.	David S. Miller	2010-08-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If irda_open_tsap() fails, the irda_bind() code tries to destroy the ->ias_obj object by hand, but does so wrongly. In particular, it fails to a) release the hashbin attached to the object and b) reset the self->ias_obj pointer to NULL. Fix both problems by using irias_delete_object() and explicitly setting self->ias_obj to NULL, just as irda_release() does. Reported-by: Tavis Ormandy <taviso@cmpxchg8b.com> Signed-off-by: David S. Miller <davem@davemloft.net>
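The two parts of the fix, releasing everything the object owns through its proper destructor and clearing the pointer afterwards, can be sketched in userspace C. All names here (`ias_object_sketch`, `object_new`, `object_delete`, `open_tsap_sketch`, `bind_sketch`) are hypothetical stand-ins, and `live_allocs` exists only to make leaks visible.

```c
#include <stdlib.h>

static int live_allocs;   /* counts outstanding allocations in this sketch */

struct ias_object_sketch {
    int *attribs;          /* stands in for the hashbin attached to the object */
};

struct sock_sketch {
    struct ias_object_sketch *ias_obj;
};

static struct ias_object_sketch *object_new(void)
{
    struct ias_object_sketch *obj = malloc(sizeof(*obj));

    if (obj == NULL)
        return NULL;
    obj->attribs = malloc(sizeof(*obj->attribs));
    if (obj->attribs == NULL) {
        free(obj);
        return NULL;
    }
    live_allocs += 2;
    return obj;
}

/* Proper destructor: releases the attached table as well as the object. */
static void object_delete(struct ias_object_sketch *obj)
{
    free(obj->attribs);
    free(obj);
    live_allocs -= 2;
}

/* Always fails, to exercise the error path. */
static int open_tsap_sketch(void)
{
    return -1;
}

static int bind_sketch(struct sock_sketch *self)
{
    self->ias_obj = object_new();
    if (self->ias_obj == NULL)
        return -1;

    if (open_tsap_sketch() < 0) {
        object_delete(self->ias_obj);   /* frees the object and what it owns */
        self->ias_obj = NULL;           /* no stale pointer for later release */
        return -1;
    }
    return 0;
}
```

Clearing the pointer matters because a later release path that sees a non-NULL `ias_obj` would otherwise free the same object a second time.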
* \|	pcnet_cs: add new_id	Ken Kawasaki	2010-08-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	pcnet_cs: add new_id: "KENTRONICS KEP-230" 10Base-T PCMCIA card. Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	net/ipv4: Eliminate kstrdup memory leak	Julia Lawall	2010-08-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The string clone is only used as a temporary copy of the argument val within the while loop, and so it should be freed before leaving the function. The call to strsep, however, modifies clone, so a pointer to the front of the string is kept in saved_clone, to make it possible to free it. The semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r exists@ local idexpression x; expression E; identifier l; statement S; @@ x= \(kasprintf\\|kstrdup\)(...); ... if (x == NULL) S ... when != kfree(x) when != E = x if (...) { <... when != kfree(x) goto l; ...> * return ...; } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
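The saved-pointer idiom described above can be illustrated in portable userspace C. This is a sketch, not the patched kernel function: `next_token` is a hypothetical stand-in for `strsep`, `count_tokens` is an invented caller, and `malloc`/`free` replace `kstrdup`/`kfree`.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for strsep(): returns the next token and advances *strp. */
static char *next_token(char **strp, const char *delim)
{
    char *start = *strp;
    char *end;

    if (start == NULL)
        return NULL;
    end = strpbrk(start, delim);
    if (end != NULL) {
        *end = '\0';
        *strp = end + 1;
    } else {
        *strp = NULL;
    }
    return start;
}

/* Counts non-empty space-separated tokens in val; -1 on allocation failure. */
static int count_tokens(const char *val)
{
    size_t len = strlen(val) + 1;
    char *clone = malloc(len);
    char *saved_clone = clone;   /* front of the allocation, kept for free() */
    char *tok;
    int n = 0;

    if (clone == NULL)
        return -1;
    memcpy(clone, val, len);

    /* The tokenizer moves clone forward and finally sets it to NULL. */
    while ((tok = next_token(&clone, " ")) != NULL) {
        if (*tok != '\0')
            n++;
    }

    /* Freeing clone here would free NULL and leak; free the saved pointer. */
    free(saved_clone);
    return n;
}
```

For example, `count_tokens("reno cubic  bic")` walks four tokens, one of them empty, and releases the copy through `saved_clone` once the loop ends.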