aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* IPVS: Use global mutex in ip_vs_app.cSimon Horman2011-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of the work to make IPVS network namespace aware __ip_vs_app_mutex was replaced by a per-namespace lock, ipvs->app_mutex. ipvs->app_key is also supplied for debugging purposes. Unfortunately this implementation results in ipvs->app_key residing in non-static storage which at the very least causes a lockdep warning. This patch takes the rather heavy-handed approach of reinstating __ip_vs_app_mutex which will cover access to the ipvs->list_head of all network namespaces. [ 12.610000] IPVS: Creating netns size=2456 id=0 [ 12.630000] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP) [ 12.640000] BUG: key ffff880003bbf1a0 not in .data! [ 12.640000] ------------[ cut here ]------------ [ 12.640000] WARNING: at kernel/lockdep.c:2701 lockdep_init_map+0x37b/0x570() [ 12.640000] Hardware name: Bochs [ 12.640000] Pid: 1, comm: swapper Tainted: G W 2.6.38-kexec-06330-g69b7efe-dirty #122 [ 12.650000] Call Trace: [ 12.650000] [<ffffffff8102e685>] warn_slowpath_common+0x75/0xb0 [ 12.650000] [<ffffffff8102e6d5>] warn_slowpath_null+0x15/0x20 [ 12.650000] [<ffffffff8105967b>] lockdep_init_map+0x37b/0x570 [ 12.650000] [<ffffffff8105829d>] ? trace_hardirqs_on+0xd/0x10 [ 12.650000] [<ffffffff81055ad8>] debug_mutex_init+0x38/0x50 [ 12.650000] [<ffffffff8104bc4c>] __mutex_init+0x5c/0x70 [ 12.650000] [<ffffffff81685ee7>] __ip_vs_app_init+0x64/0x86 [ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff [ 12.660000] [<ffffffff811b1c33>] T.620+0x43/0x170 [ 12.660000] [<ffffffff811b1e9a>] ? register_pernet_subsys+0x1a/0x40 [ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff [ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff [ 12.660000] [<ffffffff811b1db7>] register_pernet_operations+0x57/0xb0 [ 12.660000] [<ffffffff81685a3b>] ? ip_vs_init+0x0/0xff [ 12.670000] [<ffffffff811b1ea9>] register_pernet_subsys+0x29/0x40 [ 12.670000] [<ffffffff81685f19>] ip_vs_app_init+0x10/0x12 [ 12.670000] [<ffffffff81685a87>] ip_vs_init+0x4c/0xff [ 12.670000] [<ffffffff8166562c>] do_one_initcall+0x7a/0x12e [ 12.670000] [<ffffffff8166583e>] kernel_init+0x13e/0x1c2 [ 12.670000] [<ffffffff8128c134>] kernel_thread_helper+0x4/0x10 [ 12.670000] [<ffffffff8128ad40>] ? restore_args+0x0/0x30 [ 12.680000] [<ffffffff81665700>] ? kernel_init+0x0/0x1c2 [ 12.680000] [<ffffffff8128c130>] ? kernel_thread_helper+0x0/0x1global0 Signed-off-by: Simon Horman <horms@verge.net.au> Cc: Ingo Molnar <mingo@elte.hu> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Julian Anastasov <ja@ssi.bg> Cc: Hans Schillstrom <hans@schillstrom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipvs: fix a typo in __ip_vs_control_init()Eric Dumazet2011-03-21
| | | | | | | | | Reported-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Simon Horman <horms@verge.net.au> Cc: Julian Anastasov <ja@ssi.bg> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* veth: Fix the byte countersEric W. Biederman2011-03-21
| | | | | | | | | | | | | | | | | | Commit 44540960 "veth: move loopback logic to common location" introduced a bug in the packet counters. I don't understand why that happened as it is not explained in the comments and the mut check in dev_forward_skb retains the assumption that skb->len is the total length of the packet. I just measured this emperically by setting up a veth pair between two noop network namespaces setting and attempting a telnet connection between the two. I saw three packets in each direction and the byte counters were exactly 14*3 = 42 bytes high in each direction. I got the actual packet lengths with tcpdump. So remove the extra ETH_HLEN from the veth byte count totals. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net ipv6: Fix duplicate /proc/sys/net/ipv6/neigh directory entries.Eric W. Biederman2011-03-21
| | | | | | | | | | | | | | | | | | When I was fixing issues with unregisgtering tables under /proc/sys/net/ipv6/neigh by adding a mount point it appears I missed a critical ordering issue, in the ipv6 initialization. I had not realized that ipv6_sysctl_register is called at the very end of the ipv6 initialization and in particular after we call neigh_sysctl_register from ndisc_init. "neigh" needs to be initialized in ipv6_static_sysctl_register which is the first ipv6 table to initialized, and definitely before ndisc_init. This removes the weirdness of duplicate tables while still providing a "neigh" mount point which prevents races in sysctl unregistering. This was initially reported at https://bugzilla.kernel.org/show_bug.cgi?id=31232 Reported-by: sunkan@zappa.cx Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* macvlan: Fix use after free of struct macvlan_port.Eric W. Biederman2011-03-21
| | | | | | | | | | | | | | | | | | | When the macvlan driver was extended to call unregisgter_netdevice_queue in 23289a37e2b127dfc4de1313fba15bb4c9f0cd5b, a use after free of struct macvlan_port was introduced. The code in dellink relied on unregister_netdevice actually unregistering the net device so it would be safe to free macvlan_port. Since unregister_netdevice_queue can just queue up the unregister instead of performing the unregiser immediately we free the macvlan_port too soon and then the code in macvlan_stop removes the macaddress for the set of macaddress to listen for and uses memory that has already been freed. To fix this add a reference count to track when it is safe to free the macvlan_port and move the call of macvlan_port_destroy into macvlan_uninit which is guaranteed to be called after the final macvlan_port_close. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: fix incorrect spelling in drop monitor protocolNeil Horman2011-03-21
| | | | | | | It was pointed out to me recently that my spelling could be better :) Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* can: c_can: Do basic c_can configuration _before_ enabling the interruptsJan Altenberg2011-03-21
| | | | | | | | | | | | | | | | | | | I ran into some trouble while testing the SocketCAN driver for the BOSCH C_CAN controller. The interface is not correctly initialized, if I put some CAN traffic on the line, _while_ the interface is being started (which means: the interface doesn't come up correcty, if there's some RX traffic while doing 'ifconfig can0 up'). The current implementation enables the controller interrupts _before_ doing the basic c_can configuration. I think, this should be done the other way round. The patch below fixes things for me. Signed-off-by: Jan Altenberg <jan@linutronix.de> Acked-by: Kurt Van Dijck <kurt.van.dijck@eia.be> Acked-by: Wolfgang Grandegger <wg@grandegger.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net/appletalk: fix atalk_release use after freeArnd Bergmann2011-03-21
| | | | | | | | | | | | The BKL removal in appletalk introduced a use-after-free problem, where atalk_destroy_socket frees a sock, but we still release the socket lock on it. An easy fix is to take an extra reference on the sock and sock_put it when returning from atalk_release. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipx: fix ipx_release()Eric Dumazet2011-03-21
| | | | | | | | | | | | | Commit b0d0d915d1d1a0 (remove the BKL) added a regression, because sock_put() can free memory while we are going to use it later. Fix is to delay sock_put() _after_ release_sock(). Reported-by: Ingo Molnar <mingo@elte.hu> Tested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* snmp: SNMP_UPD_PO_STATS_BH() always called from softirqEric Dumazet2011-03-21
| | | | | | | | | We dont need to test if we run from softirq context, we definitely are. This saves few instructions in ip_rcv() & ip_rcv_finish() Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* l2tp: fix possible oops on l2tp_eth module unloadJames Chapman2011-03-21
| | | | | | | | | | | | | | | | | | | | | | A struct used in the l2tp_eth driver for registering network namespace ops was incorrectly marked as __net_initdata, leading to oops when module unloaded. BUG: unable to handle kernel paging request at ffffffffa00ec098 IP: [<ffffffff8123dbd8>] ops_exit_list+0x7/0x4b PGD 142d067 PUD 1431063 PMD 195da8067 PTE 0 Oops: 0000 [#1] SMP last sysfs file: /sys/module/l2tp_eth/refcnt Call Trace: [<ffffffff8123dc94>] ? unregister_pernet_operations+0x32/0x93 [<ffffffff8123dd20>] ? unregister_pernet_device+0x2b/0x38 [<ffffffff81068b6e>] ? sys_delete_module+0x1b8/0x222 [<ffffffff810c7300>] ? do_munmap+0x254/0x318 [<ffffffff812c64e5>] ? page_fault+0x25/0x30 [<ffffffff812c6952>] ? system_call_fastpath+0x16/0x1b Signed-off-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* xfrm: Fix initialize repl field of struct xfrm_stateWei Yongjun2011-03-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 'xfrm: Move IPsec replay detection functions to a separate file' (9fdc4883d92d20842c5acea77a4a21bb1574b495) introduce repl field to struct xfrm_state, and only initialize it under SA's netlink create path, the other path, such as pf_key, ipcomp/ipcomp6 etc, the repl field remaining uninitialize. So if the SA is created by pf_key, any input packet with SA's encryption algorithm will cause panic. int xfrm_input() { ... x->repl->advance(x, seq); ... } This patch fixed it by introduce new function __xfrm_init_state(). Pid: 0, comm: swapper Not tainted 2.6.38-next+ #14 Bochs Bochs EIP: 0060:[<c078e5d5>] EFLAGS: 00010206 CPU: 0 EIP is at xfrm_input+0x31c/0x4cc EAX: dd839c00 EBX: 00000084 ECX: 00000000 EDX: 01000000 ESI: dd839c00 EDI: de3a0780 EBP: dec1de88 ESP: dec1de64 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Process swapper (pid: 0, ti=dec1c000 task=c09c0f20 task.ti=c0992000) Stack: 00000000 00000000 00000002 c0ba27c0 00100000 01000000 de3a0798 c0ba27c0 00000033 dec1de98 c0786848 00000000 de3a0780 dec1dea4 c0786868 00000000 dec1debc c074ee56 e1da6b8c de3a0780 c074ed44 de3a07a8 dec1decc c074ef32 Call Trace: [<c0786848>] xfrm4_rcv_encap+0x22/0x27 [<c0786868>] xfrm4_rcv+0x1b/0x1d [<c074ee56>] ip_local_deliver_finish+0x112/0x1b1 [<c074ed44>] ? ip_local_deliver_finish+0x0/0x1b1 [<c074ef32>] NF_HOOK.clone.1+0x3d/0x44 [<c074ef77>] ip_local_deliver+0x3e/0x44 [<c074ed44>] ? ip_local_deliver_finish+0x0/0x1b1 [<c074ec03>] ip_rcv_finish+0x30a/0x332 [<c074e8f9>] ? ip_rcv_finish+0x0/0x332 [<c074ef32>] NF_HOOK.clone.1+0x3d/0x44 [<c074f188>] ip_rcv+0x20b/0x247 [<c074e8f9>] ? ip_rcv_finish+0x0/0x332 [<c072797d>] __netif_receive_skb+0x373/0x399 [<c0727bc1>] netif_receive_skb+0x4b/0x51 [<e0817e2a>] cp_rx_poll+0x210/0x2c4 [8139cp] [<c072818f>] net_rx_action+0x9a/0x17d [<c0445b5c>] __do_softirq+0xa1/0x149 [<c0445abb>] ? __do_softirq+0x0/0x149 Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'vhost-net-next' of ↵David S. Miller2011-03-20
|\ | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
| * vhost-net: remove unlocked use of receive_queueMichael S. Tsirkin2011-03-13
| | | | | | | | | | | | | | | | | | Use of skb_queue_empty(&sock->sk->sk_receive_queue) without taking the sk_receive_queue.lock is unsafe or useless. Take it out. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: lock receive queue, not the socketJason Wang2011-03-13
| | | | | | | | | | | | | | | | | | vhost takes a sock lock to try and prevent the skb from being pulled from the receive queue after skb_peek. However this is not the right lock to use for that, sk_receive_queue.lock is. Fix that up. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost-net: Unify the code of mergeable and big buffer handlingJason Wang2011-03-13
| | | | | | | | | | | | | | | | | | | | | | | | Codes duplication were found between the handling of mergeable and big buffers, so this patch tries to unify them. This could be easily done by adding a quota to the get_rx_bufs() which is used to limit the number of buffers it returns (for mergeable buffer, the quota is simply UIO_MAXIOV, for big buffers, the quota is just 1), and then the previous handle_rx_mergeable() could be resued also for big buffers. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost-net: check the support of mergeable buffer outside the receive loopJason Wang2011-03-13
| | | | | | | | | | | | | | | | | | No need to check the support of mergeable buffer inside the recevie loop as the whole handle_rx()_xx is in the read critical region. So this patch move it ahead of the receiving loop. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: copy_from_user -> __copy_from_userMichael S. Tsirkin2011-03-08
| | | | | | | | | | | | | | | | copy_from_user is pretty high on perf top profile, replacing it with __copy_from_user helps. It's also safe because we do access_ok checks during setup. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
| * vhost: Cleanup vhost.c and net.cKrishna Kumar2011-03-08
| | | | | | | | | | | | | | Minor cleanup of vhost.c and net.c to match coding style. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* | Merge branch 'master' of ↵David S. Miller2011-03-20
|\ \ | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6
| * | netfilter: ipt_CLUSTERIP: fix buffer overflowVasiliy Kulikov2011-03-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 'buffer' string is copied from userspace. It is not checked whether it is zero terminated. This may lead to overflow inside of simple_strtoul(). Changli Gao suggested to copy not more than user supplied 'size' bytes. It was introduced before the git epoch. Files "ipt_CLUSTERIP/*" are root writable only by default, however, on some setups permissions might be relaxed to e.g. network admin user. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Acked-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: xtables: fix reentrancyEric Dumazet2011-03-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit f3c5c1bfd4308 (make ip_tables reentrant) introduced a race in handling the stackptr restore, at the end of ipt_do_table() We should do it before the call to xt_info_rdunlock_bh(), or we allow cpu preemption and another cpu overwrites stackptr of original one. A second fix is to change the underflow test to check the origptr value instead of 0 to detect underflow, or else we allow a jump from different hooks. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: ipset: fix checking the type revision at create commandJozsef Kadlecsik2011-03-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | The revision of the set type was not checked at the create command: if the userspace sent a valid set type but with not supported revision number, it'd create a loop. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | netfilter: ipset: fix address ranges at hash:*port* typesJozsef Kadlecsik2011-03-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | The hash:*port* types with IPv4 silently ignored when address ranges with non TCP/UDP were added/deleted from the set and used the first address from the range only. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Patrick McHardy <kaber@trash.net>
* | | niu: Rename NIU parent platform device name to fix conflict.David S. Miller2011-03-20
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the OF device driver bits were converted over to the platform device infrastructure in commit 74888760d40b3ac9054f9c5fa07b566c0676ba2d ("dt/net: Eliminate users of of_platform_{,un}register_driver") we inadvertantly created probing problems in the OF case. The NIU driver creates a dummy platform device to represent the board that contains one or more child NIU devices. Unfortunately we use the same name, "niu", as the OF device driver itself uses. The result is that we try to probe the dummy "niu" parent device we create, and since it has a NULL ofdevice pointer etc. everything explodes: [783019.128243] niu: niu.c:v1.1 (Apr 22, 2010) [783019.128810] Unable to handle kernel NULL pointer dereference [783019.128949] tsk->{mm,active_mm}->context = 000000000000039e [783019.129078] tsk->{mm,active_mm}->pgd = fffff803afc5a000 [783019.129206] \|/ ____ \|/ [783019.129213] "@'/ .. \`@" [783019.129220] /_| \__/ |_\ [783019.129226] \__U_/ [783019.129378] modprobe(2004): Oops [#1] [783019.129423] TSTATE: 0000000011001602 TPC: 0000000010052ff8 TNPC: 000000000061bbb4 Y: 00000000 Not tainted [783019.129542] TPC: <niu_of_probe+0x3c/0x2dc [niu]> [783019.129624] g0: 8080000000000000 g1: 0000000000000000 g2: 0000000010056000 g3: 0000000000000002 [783019.129733] g4: fffff803fc1da0c0 g5: fffff800441e2000 g6: fffff803fba84000 g7: 0000000000000000 [783019.129842] o0: fffff803fe7df010 o1: 0000000010055700 o2: 0000000000000000 o3: fffff803fbacaca0 [783019.129951] o4: 0000000000000080 o5: 0000000000777908 sp: fffff803fba866e1 ret_pc: 0000000010052ff4 [783019.130083] RPC: <niu_of_probe+0x38/0x2dc [niu]> [783019.130165] l0: fffff803fe7df010 l1: fffff803fbacafc0 l2: fffff803fbacaca0 l3: ffffffffffffffed [783019.130273] l4: 0000000000000000 l5: 000000007fffffff l6: fffff803fba86f40 l7: 0000000000000001 [783019.130382] i0: fffff803fe7df000 i1: fffff803fc20aba0 i2: 0000000000000000 i3: 0000000000000001 [783019.130490] i4: 0000000000000000 i5: 0000000000000000 i6: fffff803fba867a1 i7: 000000000062038c [783019.130614] I7: <platform_drv_probe+0xc/0x20> Fix by simply renaming the parent device to "niu-board". Signed-off-by: David S. Miller <davem@davemloft.net>
* | r8169: fix a bug in rtl8169_init_phy()Eric Dumazet2011-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | commit 54405cde7624 (r8169: support control of advertising.) introduced a bug in rtl8169_init_phy() Reported-by: Piotr Hosowicz <piotr@hosowicz.com> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Oliver Neukum <oliver@neukum.org> Cc: Francois Romieu <romieu@fr.zoreil.com> Tested-by: Anca Emanuel <anca.emanuel@gmail.com> Tested-by: Piotr Hosowicz <piotr@hosowicz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bonding: fix a typo in a commentNicolas de Pesloüan2011-03-19
| | | | | | | | | | Signed-off-by: Nicolas de Pesloüan <nicolas.2p.debian@free.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ftmac100: use resource_size()Dan Carpenter2011-03-19
| | | | | | | | | | | | | | | | | | | | The calculation is off-by-one. It should be "end - start + 1". This patch fixes it to use resource_size() instead. Oddly, the code already uses resource size correctly a couple lines earlier when it calls request_mem_region() for this memory. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | headers: use __aligned_xx types for userspaceMike Frysinger2011-03-18
| | | | | | | | | | | | | | | | Now that we finally have __aligned_xx exported to userspace, convert the headers that get exported over to the proper type. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | bridge: Reset IPCB when entering IP stack on NF_FORWARDHerbert Xu2011-03-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whenever we enter the IP stack proper from bridge netfilter we need to ensure that the skb is in a form the IP stack expects it to be in. The entry point on NF_FORWARD did not meet the requirements of the IP stack, therefore leading to potential crashes/panics. This patch fixes the problem. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | vlan: should take into account needed_headroomEric Dumazet2011-03-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit c95b819ad7 (gre: Use needed_headroom) made gre use needed_headroom instead of hard_header_len This uncover a bug in vlan code. We should make sure vlan devices take into account their real_dev->needed_headroom or we risk a crash in ipgre_header(), because we dont have enough room to push IP header in skb. Reported-by: Diddi Oscarsson <diddi@diddi.se> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Patrick McHardy <kaber@trash.net> Cc: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | ethtool: Compat handling for struct ethtool_rxnfcBen Hutchings2011-03-18
| | | | | | | | | | | | | | | | | | | | | | This structure was accidentally defined such that its layout can differ between 32-bit and 64-bit processes. Add compat structure definitions and an ioctl wrapper function. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Cc: stable@kernel.org [2.6.30+] Signed-off-by: David S. Miller <davem@davemloft.net>
* | ethtool: __ethtool_set_sg: check for function pointer before using itRoger Luethi2011-03-18
| | | | | | | | | | | | | | | | | | | | __ethtool_set_sg does not check if dev->ethtool_ops->set_sg is defined which can result in a NULL pointer dereference when ethtool is used to change SG settings for drivers without SG support. Signed-off-by: Roger Luethi <rl@hellgate.ch> Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | econet: 4 byte infoleak to the networkVasiliy Kulikov2011-03-18
| | | | | | | | | | | | | | | | | | | | | | | | | | struct aunhdr has 4 padding bytes between 'pad' and 'handle' fields on x86_64. These bytes are not initialized in the variable 'ah' before sending 'ah' to the network. This leads to 4 bytes kernel stack infoleak. This bug was introduced before the git epoch. Signed-off-by: Vasiliy Kulikov <segoon@openwall.com> Acked-by: Phil Blundell <philb@gnu.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | gianfar: Fall back to software tcp/udp checksum on older controllersAlex Dubov2011-03-18
| | | | | | | | | | | | | | | | | | | | | | As specified by errata eTSEC49 of MPC8548 and errata eTSEC12 of MPC83xx, older revisions of gianfar controllers will be unable to calculate a TCP/UDP packet checksum for some alignments of the appropriate FCB. This patch checks for FCB alignment on such controllers and falls back to software checksumming if the alignment is known to be bad. Signed-off-by: Alex Dubov <oakad@yahoo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'linux-next' of ↵Linus Torvalds2011-03-18
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: PCI: label: remove #include of ACPI header to avoid warnings PCI: label: Fix compilation error when CONFIG_ACPI is unset PCI: pre-allocate additional resources to devices only after successful allocation of essential resources. PCI: introduce reset_resource() PCI: data structure agnostic free list function PCI: refactor io size calculation code PCI: do not create quirk I/O regions below PCIBIOS_MIN_IO for ICH PCI hotplug: acpiphp: set current_state to D0 in register_slot PCI: Export ACPI _DSM provided firmware instance number and string name to sysfs PCI: add more checking to ICH region quirks PCI: aer-inject: Override PCIe AER Mask Registers PCI: fix tlan build when CONFIG_PCI is not enabled PCI: remove quirk for pre-production systems PCI: Avoid potential NULL pointer dereference in pci_scan_bridge PCI/lpc: irq and pci_ids patch for Intel DH89xxCC DeviceIDs PCI: sysfs: Fix failure path for addition of "vpd" attribute
| * | PCI: label: remove #include of ACPI header to avoid warningsShyam_Iyer@Dell.com2011-03-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I found that including acpi/apci_drivers.h is not necessary and introduces these warnings: In file included from drivers/pci/pci-label.c:32: include/acpi/acpi_drivers.h:103: warning: ‘struct acpi_device’ declared inside parameter list include/acpi/acpi_drivers.h:103: warning: its scope is only this definition or declaration, which is probably not what you want include/acpi/acpi_drivers.h:107: warning: ‘struct acpi_pci_root’ declared inside parameter list Signed-off-by: Shyam Iyer <shyam_iyer@dell.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: label: Fix compilation error when CONFIG_ACPI is unsetNarendra_K@Dell.com2011-03-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes compilation error descibed below introduced by the commit 6058989bad05b82e78baacce69ec14f27a11b5fd drivers/pci/pci-label.c: In function ‘pci_create_firmware_label_files’: drivers/pci/pci-label.c:366:2: error: implicit declaration of function ‘device_has_dsm’ Signed-off-by: Narendra K <narendra_k@dell.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: pre-allocate additional resources to devices only after successful ↵Ram Pai2011-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | allocation of essential resources. Linux tries to pre-allocate minimal resources to hotplug bridges. This works fine as long as there are enough resources to satisfy all other genuine resource requirements. However if enough resources are not available to satisfy any of these nice-to-have pre-allocations, the resource-allocator reports errors and returns failure. This patch distinguishes between must-have resource from nice-to-have resource. Any failure to allocate nice-to-have resources are ignored. This behavior can be particularly useful to trigger automatic reallocation when the OS discovers genuine allocation-conflicts or genuine unallocated-requests caused by buggy allocation behavior of the native BIOS/uEFI. https://bugzilla.kernel.org/show_bug.cgi?id=15960 captures the movitation behind the patch. This patch is verified to resolve the above bug. changelog v2: o fixed a bug where pci_assign_resource() was called on a resource of zero resource size. changelog v3: addressed Bjorn's comment o "Please don't indent and right-justify the changelog". o removed add_size from struct resource. The additional size is now tracked using a linked list. changelog v4: o moved freeing up of elements in head list from assign_requested_resources_sorted() to __assign_resources_sorted(). o removed a wrong reference to 'add_size' in pbus_size_mem(). o some code optimizations in adjust_resources_sorted() and assign_requested_resources_sorted() changelog v5: o moved freeing up of elements in head list from assign_requested_resources_sorted() to __assign_resources_sorted(). o removed a wrong reference to 'add_size' in pbus_size_mem(). o some code optimizations in adjust_resources_sorted() and assign_requested_resources_sorted() changelog v5: o factored out common code and made them into separate independent patches o added comments in kdoc format o added a BUG_ON in pci_assign_unassigned_resources() to catch for memory leak. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: introduce reset_resource()Ram Pai2011-03-04
| | | | | | | | | | | | | | | | | | | | | Introduce reset_resource() which factors out resource reset logic. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: data structure agnostic free list functionRam Pai2011-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | Replace free_failed_list() with a free_list() call. free_list() can handle 'resource_list_x', 'resource_list' and any linked list linked through ->next Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: refactor io size calculation codeRam Pai2011-03-04
| | | | | | | | | | | | | | | | | | | | | | | | Refactor code that calculates the io size in pbus_size_io() and pbus_mem_io() into separate functions. Signed-off-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: do not create quirk I/O regions below PCIBIOS_MIN_IO for ICHJiri Slaby2011-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some broken BIOSes on ICH4 chipset report an ACPI region which is in conflict with legacy IDE ports when ACPI is disabled. Even though the regions overlap, IDE ports are working correctly (we cannot find out the decoding rules on chipsets). So the only problem is the reported region itself, if we don't reserve the region in the quirk everything works as expected. This patch avoids reserving any quirk regions below PCIBIOS_MIN_IO which is 0x1000. Some regions might be (and are by a fast google query) below this border, but the only difference is that they won't be reserved anymore. They should still work though the same as before. The conflicts look like (1f.0 is bridge, 1f.1 is IDE ctrl): pci 0000:00:1f.1: address space collision: [io 0x0170-0x0177] conflicts with 0000:00:1f.0 [io 0x0100-0x017f] At 0x0100 a 128 bytes long ACPI region is reported in the quirk for ICH4. ata_piix then fails to find disks because the IDE legacy ports are zeroed: ata_piix 0000:00:1f.1: device not available (can't reserve [io 0x0000-0x0007]) References: https://bugzilla.novell.com/show_bug.cgi?id=558740 Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Renninger <trenn@suse.de> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI hotplug: acpiphp: set current_state to D0 in register_slotStefano Stabellini2011-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a device doesn't support power management (pm_cap == 0) but it is acpi_pci_power_manageable() because there is a _PS0 method declared for it and _EJ0 is also declared for the slot then nobody is going to set current_state = PCI_D0 for this device. This is what I think it is happening: pci_enable_device | __pci_enable_device_flags /* here we do not set current_state because !pm_cap */ | do_pci_enable_device | pci_set_power_state | __pci_start_power_transition | pci_platform_power_transition /* platform_pci_power_manageable() calls acpi_pci_power_manageable that * returns true */ | platform_pci_set_power_state /* acpi_pci_set_power_state gets called and does nothing because the * acpi device has _EJ0, see the comment "If the ACPI device has _EJ0, * ignore the device" */ at this point if we refer to the commit message that introduced the comment above (10b3dcae0f275e2546e55303d64ddbb58cec7599), it is up to the hotplug driver to set the state to D0. However AFAICT the pci hotplug driver never does, in fact drivers/pci/hotplug/acpiphp_glue.c:register_slot sets the slot flags to (SLOT_ENABLED | SLOT_POWEREDON) but it does not set the pci device current state to PCI_D0. So my proposed fix is also to set current_state = PCI_D0 in register_slot. Comments are very welcome. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: Export ACPI _DSM provided firmware instance number and string name to sysfsNarendra_K@Dell.com2011-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch exports ACPI _DSM (Device Specific Method) provided firmware instance number and string name of PCI devices as defined by 'PCI Firmware Specification Revision 3.1' section 4.6.7.( DSM for Naming a PCI or PCI Express Device Under Operating Systems) to sysfs. New files created are: /sys/bus/pci/devices/.../label which contains the firmware name for the device in question, and /sys/bus/pci/devices/.../acpi_index which contains the firmware device type instance for the given device. cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/acpi_index 1 cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/label Embedded Broadcom 5709C NIC 1 cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.1/acpi_index 2 cat /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.1/label Embedded Broadcom 5709C NIC 2 The ACPI _DSM provided firmware 'instance number' and 'string name' will be given priority if the firmware also provides 'SMBIOS type 41 device type instance and string'. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Jordan Hargrave <jordan_hargrave@dell.com> Signed-off-by: Narendra K <narendra_k@dell.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: add more checking to ICH region quirksJiri Slaby2011-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Per ICH4 and ICH6 specs, ACPI and GPIO regions are valid iff ACPI_EN and GPIO_EN bits are set to 1. Add checks for these bits into the quirks prior to the region creation. While at it, name the constants by macros. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Bjorn Helgaas <bjorn.helgaas@hp.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Thomas Renninger <trenn@suse.de> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: aer-inject: Override PCIe AER Mask RegistersPrarit Bhargava2011-03-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I have several systems which have the same problem: The PCIe AER corrected and uncorrected masks have all the error bits set. This results in the inablility to test with the aer_inject module & utility on those systems. Add the 'aer_mask_override' module parameter which will override the corrected or uncorrected masks for a PCI device. The mask will have the bit corresponding to the status passed into the aer_inject() function. After this patch it is possible to successfully use the aer_inject utility on those PCI slots. Successfully tested by me on a Dell and Intel whitebox which exhibited the mask problem. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: fix tlan build when CONFIG_PCI is not enabledRandy Dunlap2011-02-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_PCI is not enabled, tlan.c has a build error: drivers/net/tlan.c:503: error: implicit declaration of function 'pci_wake_from_d3' so add an inline function stub for this function to pci.h when PCI is not enabled, similar to other stubbed PCI functions. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Sakari Ailus <sakari.ailus@iki.fi> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: remove quirk for pre-production systemsBrandeburg, Jesse2011-02-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Revert commit 7eb93b175d4de9438a4b0af3a94a112cb5266944 Author: Yu Zhao <yu.zhao@intel.com> Date: Fri Apr 3 15:18:11 2009 +0800 PCI: SR-IOV quirk for Intel 82576 NIC If BIOS doesn't allocate resources for the SR-IOV BARs, zero the Flash BAR and program the SR-IOV BARs to use the old Flash Memory Space. Please refer to Intel 82576 Gigabit Ethernet Controller Datasheet section 7.9.2.14.2 for details. http://download.intel.com/design/network/datashts/82576_Datasheet.pdf Signed-off-by: Yu Zhao <yu.zhao@intel.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> This quirk was added before SR-IOV was in production and now all machines that originally had this issue alreayd have bios updates to correct the issue. The quirk itself is no longer needed and in fact causes bugs if run. Remove it. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> CC: Yu Zhao <yu.zhao@intel.com> CC: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
| * | PCI: Avoid potential NULL pointer dereference in pci_scan_bridgeJesper Juhl2011-02-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pci_add_new_bus() calls pci_alloc_child_bus() which calls pci_alloc_bus() that allocates memory dynamically with kzalloc(). The return value of kzalloc() is the pointer that's eventually returned from pci_add_new_bus(), so since kzalloc() can fail and return NULL so can pci_add_new_bus(). Thus we may end up dereferencing a NULL pointer in drivers/pci/probe.c::pci_scan_bridge(). Seems to me we should test for this and bail out if it happens rather than crashing. Also removed some trailing whitespace that bugged me while looking at this. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>