aboutsummaryrefslogtreecommitdiffstats
path: root/net/ipv6
Commit message (Collapse)AuthorAge
* net: Fix percpu counters deadlockHerbert Xu2008-12-30
| | | | | | | | | | | | | | | | When we converted the protocol atomic counters such as the orphan count and the total socket count deadlocks were introduced due to the mismatch in BH status of the spots that used the percpu counter operations. Based on the diagnosis and patch by Peter Zijlstra, this patch fixes these issues by disabling BH where we may be in process context. Reported-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6Linus Torvalds2008-12-28
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1429 commits) net: Allow dependancies of FDDI & Tokenring to be modular. igb: Fix build warning when DCA is disabled. net: Fix warning fallout from recent NAPI interface changes. gro: Fix potential use after free sfc: If AN is enabled, always read speed/duplex from the AN advertising bits sfc: When disabling the NIC, close the device rather than unregistering it sfc: SFT9001: Add cable diagnostics sfc: Add support for multiple PHY self-tests sfc: Merge top-level functions for self-tests sfc: Clean up PHY mode management in loopback self-test sfc: Fix unreliable link detection in some loopback modes sfc: Generate unique names for per-NIC workqueues 802.3ad: use standard ethhdr instead of ad_header 802.3ad: generalize out mac address initializer 802.3ad: initialize ports LACPDU from const initializer 802.3ad: remove typedef around ad_system 802.3ad: turn ports is_individual into a bool 802.3ad: turn ports is_enabled into a bool 802.3ad: make ntt bool ixgbe: Fix set_ringparam in ixgbe to use the same memory pools. ... Fixed trivial IPv4/6 address printing conflicts in fs/cifs/connect.c due to the conversion to %pI (in this networking merge) and the addition of doing IPv6 addresses (from the earlier merge of CIFS).
| * ipv6: fix the outgoing interface selection order in udpv6_sendmsg()Yang Hongyang2008-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1.When no interface is specified in an IPV6_PKTINFO ancillary data item, the interface specified in an IPV6_PKTINFO sticky optionis is used. RFC3542: 6.7. Summary of Outgoing Interface Selection This document and [RFC-3493] specify various methods that affect the selection of the packet's outgoing interface. This subsection summarizes the ordering among those in order to ensure deterministic behavior. For a given outgoing packet on a given socket, the outgoing interface is determined in the following order: 1. if an interface is specified in an IPV6_PKTINFO ancillary data item, the interface is used. 2. otherwise, if an interface is specified in an IPV6_PKTINFO sticky option, the interface is used. Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv6: fix the return interface index when get it while no message is receivedYang Hongyang2008-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When get receiving interface index while no message is received, the the value seted with setsockopt() should be returned. RFC 3542: Issuing getsockopt() for the above options will return the sticky option value i.e., the value set with setsockopt(). If no sticky option value has been set getsockopt() will return the following values: - For the IPV6_PKTINFO option, it will return an in6_pktinfo structure with ipi6_addr being in6addr_any and ipi6_ifindex being zero. Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv6: Add IPV6_PKTINFO sticky option support to setsockopt()Yang Hongyang2008-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are three reasons for me to add this support: 1.When no interface is specified in an IPV6_PKTINFO ancillary data item, the interface specified in an IPV6_PKTINFO sticky optionis is used. RFC3542: 6.7. Summary of Outgoing Interface Selection This document and [RFC-3493] specify various methods that affect the selection of the packet's outgoing interface. This subsection summarizes the ordering among those in order to ensure deterministic behavior. For a given outgoing packet on a given socket, the outgoing interface is determined in the following order: 1. if an interface is specified in an IPV6_PKTINFO ancillary data item, the interface is used. 2. otherwise, if an interface is specified in an IPV6_PKTINFO sticky option, the interface is used. 2.When no IPV6_PKTINFO ancillary data is received,getsockopt() should return the sticky option value which set with setsockopt(). RFC 3542: Issuing getsockopt() for the above options will return the sticky option value i.e., the value set with setsockopt(). If no sticky option value has been set getsockopt() will return the following values: 3.Make the setsockopt implementation POSIX compliant. Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'master' of ↵David S. Miller2008-12-15
| |\ | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/e1000e/ich8lan.c
| * | ip6mr: use goto to common label instead of opencodingIlpo Järvinen2008-12-15
| | | | | | | | | | | | | | | Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | ipv6/mcast: join error paths using gotoIlpo Järvinen2008-12-15
| | | | | | | | | | | | | | | Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | xfrm6_tunnel: join error paths using gotoIlpo Järvinen2008-12-15
| | | | | | | | | | | | | | | Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netns: ip6mr: enable namespace support in ipv6 multicast forwarding codeBenjamin Thery2008-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This last patch makes the appropriate changes to use and propagate the network namespace where needed in IPv6 multicast forwarding code. This consists mainly in replacing all the remaining init_net occurences with current netns pointer retrieved from sockets, net devices or mfc6_caches depending on the routines' contexts. Some routines receive a new 'struct net' parameter to propagate the current netns: * ip6mr_get_route * ip6mr_cache_report * ip6mr_cache_find * ip6mr_cache_unresolved * mif6_add/mif6_delete * ip6mr_mfc_add/ip6mr_mfc_delete * ip6mr_reg_vif All the IPv6 multicast forwarding variables moved to struct netns_ipv6 by the previous patches are now referenced in the correct namespace. Changelog: ========== * Take into account the net associated to mfc6_cache when matching entries in mfc_unres_queue list. * Call mroute_clean_tables() in ip6mr_net_exit() to free memory allocated per-namespace. * Call dev_net_set() in ip6mr_reg_vif() to initialize dev->nd_net correctly. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netns: ip6mr: declare ip6mr /proc/net entries per-namespaceBenjamin Thery2008-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Declare IPv6 multicast forwarding /proc/net entries per-namespace: /proc/net/ip6_mr_vif /proc/net/ip6_mr_cache Changelog ========= V2: * In routine ipmr_mfc_seq_idx(), only match entries belonging to current netns in mfc_unres_queue list. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netns: ip6mr: declare reg_vif_num per-namespaceBenjamin Thery2008-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Preliminary work to make IPv6 multicast forwarding netns-aware. Declare variable 'reg_vif_num' per-namespace, moves into struct netns_ipv6. At the moment, this variable is only referenced in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netns: ip6mr: declare mroute_do_assert and mroute_do_pim per-namespaceBenjamin Thery2008-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Preliminary work to make IPv6 multicast forwarding netns-aware. Declare IPv6 multicast forwarding variables 'mroute_do_assert' and 'mroute_do_pim' per-namespace in struct netns_ipv6. At the moment, these variables are only referenced in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netns: ip6mr: declare counter cache_resolve_queue_len per-namespaceBenjamin Thery2008-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Preliminary work to make IPv6 multicast forwarding netns-aware. Declare variable cache_resolve_queue_len per-namespace: moves it into struct netns_ipv6. This variable counts the number of unresolved cache entries queued in the list mfc_unres_queue. This list is kept global to all netns as the number of entries per namespace is limited to 10 (hardcoded in routine ip6mr_cache_unresolved). Entries belonging to different namespaces in mfc_unres_queue will be identified by matching the mfc_net member introduced previously in struct mfc6_cache. Keeping this list global to all netns, also allows us to keep a single timer (ipmr_expire_timer) to handle their expiration. In some places cache_resolve_queue_len value was tested for arming or deleting the timer. These tests were equivalent to testing mfc_unres_queue value instead and are replaced in this patch. At the moment, cache_resolve_queue_len is only referenced in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netns: ip6mr: dynamically allocate mfc6_cache_arrayBenjamin Thery2008-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Preliminary work to make IPv6 multicast forwarding netns-aware. Dynamically allocates IPv6 multicast forwarding cache, mfc6_cache_array, and moves it to struct netns_ipv6. At the moment, mfc6_cache_array is only referenced in init_net. Replace 'ARRAY_SIZE(mfc6_cache_array)' with mfc6_cache_array size: MFC6_LINES. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netns: ip6mr: store netns in struct mfc6_cacheBenjamin Thery2008-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch stores into struct mfc6_cache the network namespace each mfc6_cache belongs to. The new member is mfc6_net. mfc6_net is assigned at cache allocation and doesn't change during the rest of the cache entry life. This will help to retrieve the current netns around the IPv6 multicast forwarding code. At the moment, all mfc6_cache are allocated in init_net. Changelog: ========== * Use write_pnet()/read_pnet() to set and get mfc6_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netns: ip6mr: dynamically allocates vif6_tableBenjamin Thery2008-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Preliminary work to make IPv6 multicast forwarding netns-aware. Dynamically allocates interface table vif6_table and moves it to struct netns_ipv6, and updates MIF_EXISTS() macro. At the moment, vif6_table is only referenced in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | netns: ip6mr: allocate mroute6_socket per-namespace.Benjamin Thery2008-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Preliminary work to make IPv6 multicast forwarding netns-aware. Make IPv6 multicast forwarding mroute6_socket per-namespace, moves it into struct netns_ipv6. At the moment, mroute6_socket is only referenced in init_net. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net/ipv6/ip6mr.c: Use kmem_cache_zalloc, remove memsetJoe Perches2008-12-04
| | | | | | | | | | | | | | | Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: /proc/net/ip_mr_cache, display Iif as a signed shortBenjamin Thery2008-12-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Today, iproute2 fails to show multicast forwarding unresolved cache entries while scanning /proc/net/ip_mr_cache. Indeed, it expects to see -1 in 'Iif' column to identify unresolved entries but the kernel outputs 65535. It's a signed/unsigned issue: 'Iif', the source interface, is retrieved from member mfc_parent in struct mfc_cache. mfc_parent is a vifi_t: unsigned short, but is displayed in ipmr_mfc_seq_show() as "%-3d", signed integer. In unresolevd entries, the 65535 value (0xFFFF) comes from this define: #define ALL_VIFS ((vifi_t)(-1)) That may explains why the guy who added support for this in iproute2 thought a -1 should be expected. I don't know if this must be fixed in kernel or in iproute2. Who is right? What is the correct API? How was it designed originally? I let you decide if it should goes in the kernel or be fixed in iproute2. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: fix /proc/net/ip_mr_cache display - V2Benjamin Thery2008-12-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | /proc/net/ip_mr_cache and /proc/net/ip6_mr_cache displays garbage when showing unresolved mfc_cache entries. [root@qemu tests]# cat /proc/net/ip_mr_cache Group Origin Iif Pkts Bytes Wrong Oifs 014C00EF 010014AC 1 10 10050 0 2:1 3:1 024C00EF 010014AC 65535 514 2 -559067475 The first line is correct. It is a resolved cache entry, 10 packets used it... The second line represents an unresolved entry, and the columns Pkts(4th), Bytes(5th) and Wrong(6th) just show garbage. In struct mfc_cache, there's an union to store data for resolved and unresolved cases. And what ipmr_mfc_seq_show() is printing in these columns for the unresolved entries is some bytes from mfc_cache.mfc_un.res. Bad. (eg. In our case -559067475 is in fact 0xdead4ead which is the spinlock magic from mfc_cache.mfc_un.unres.unresolved.lock.magic). This patch replaces the garbage data written in these columns for the unresolved entries by '0' (zeros) which is more correct. This change doesn't break the ABI. Also, mfc->mfc_un.res.pkt, mfc->mfc_un.res.bytes, mfc->mfc_un.res.wrong_if are unsigned long. It applies on top of net-next-2.6. The patch for net-2.6 is slightly different because of the NIP6_FMT to %pI6 conversion that was made in the seq_printf. Changelog: ========== V2: * Instead of breaking the ABI by suppressing the columns that have no meaning for unresolved entries, fill them with 0 values. Signed-off-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge branch 'master' of ↵David S. Miller2008-11-28
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 Conflicts: net/netfilter/nf_conntrack_netlink.c
| | * | netfilter: nfmark IPV6 routing in OUTPUT, mangle, NFQUEUEEric Leblond2008-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch let nfmark to be evaluated for routing decision for OUTPUT packet, in mangle table, when process paquet in NFQUEUE. This patch is an IPv6 port of Laurent Licour IPv4 one. Signed-off-by: Eric Leblond <eric@inl.fr> Signed-off-by: Patrick McHardy <kaber@trash.net>
| | * | netfilter: ip6table_filter: merge LOCAL_IN and FORWARD hooksAlexey Dobriyan2008-11-20
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net>
| * | | net: Use a percpu_counter for sockets_allocatedEric Dumazet2008-11-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using one atomic_t per protocol, use a percpu_counter for "sockets_allocated", to reduce cache line contention on heavy duty network servers. Note : We revert commit (248969ae31e1b3276fc4399d67ce29a5d81e6fd9 net: af_unix can make unix_nr_socks visbile in /proc), since it is not anymore used after sock_prot_inuse_add() addition Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | netns xfrm: per-netns MIBsAlexey Dobriyan2008-11-25
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | netns xfrm: AH/ESP in netns!Alexey Dobriyan2008-11-25
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | netns xfrm: ->get_saddr in netnsAlexey Dobriyan2008-11-25
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | netns xfrm: ->dst_lookup in netnsAlexey Dobriyan2008-11-25
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | netns xfrm: KM reporting in netnsAlexey Dobriyan2008-11-25
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | netns xfrm: dst garbage-collecting in netnsAlexey Dobriyan2008-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass netns pointer to struct xfrm_policy_afinfo::garbage_collect() [This needs more thoughts on what to do with dst_ops] [Currently stub to init_net] Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | netns xfrm: lookup in netnsAlexey Dobriyan2008-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pass netns to xfrm_lookup()/__xfrm_lookup(). For that pass netns to flow_cache_lookup() and resolver callback. Take it from socket or netdevice. Stub DECnet to init_net. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | netns xfrm: state lookup in netnsAlexey Dobriyan2008-11-25
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | netns xfrm: add struct xfrm_state::xs_netAlexey Dobriyan2008-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To avoid unnecessary complications with passing netns around. * set once, very early after allocating * once set, never changes For a while create every xfrm_state in init_net. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | ipv6: fix warning in net/ipv6/ip6_flowlabel.cIngo Molnar2008-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | this warning: net/ipv6/ip6_flowlabel.c: In function ‘ipv6_flowlabel_opt’: net/ipv6/ip6_flowlabel.c:467: warning: ‘err’ may be used uninitialized in this function triggers because GCC does not recognize the (correct) error flow between fl_create() and 'err'. Annotate it. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | xfrm: remove useless forward declarationsAlexey Dobriyan2008-11-25
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | ah4/ah6: remove useless NULL assignmentsAlexey Dobriyan2008-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | struct will be kfreed in a moment, so... Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: fix tunnels in netns after ndo_ changesAlexey Dobriyan2008-11-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dev_net_set() should be the very first thing after alloc_netdev(). "ndo_" changes turned simple assignment (which is OK to do before netns assignment) into quite non-trivial operation (which is not OK, init_net was used). This leads to incomplete initialisation of tunnel device in netns. BUG: unable to handle kernel NULL pointer dereference at 00000004 IP: [<c02efdb5>] ip6_tnl_exit_net+0x37/0x4f *pde = 00000000 Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC last sysfs file: /sys/class/net/lo/operstate Pid: 10, comm: netns Not tainted (2.6.28-rc6 #1) EIP: 0060:[<c02efdb5>] EFLAGS: 00010246 CPU: 0 EIP is at ip6_tnl_exit_net+0x37/0x4f EAX: 00000000 EBX: 00000020 ECX: 00000000 EDX: 00000003 ESI: c5caef30 EDI: c782bbe8 EBP: c7909f50 ESP: c7909f48 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 Process netns (pid: 10, ti=c7908000 task=c7905780 task.ti=c7908000) Stack: c03e75e0 c7390bc8 c7909f60 c0245448 c7390bd8 c7390bf0 c7909fa8 c012577a 00000000 00000002 00000000 c0125736 c782bbe8 c7909f90 c0308fe3 c782bc04 c7390bd4 c0245406 c084b718 c04f0770 c03ad785 c782bbe8 c782bc04 c782bc0c Call Trace: [<c0245448>] ? cleanup_net+0x42/0x82 [<c012577a>] ? run_workqueue+0xd6/0x1ae [<c0125736>] ? run_workqueue+0x92/0x1ae [<c0308fe3>] ? schedule+0x275/0x285 [<c0245406>] ? cleanup_net+0x0/0x82 [<c0125ae1>] ? worker_thread+0x81/0x8d [<c0128344>] ? autoremove_wake_function+0x0/0x33 [<c0125a60>] ? worker_thread+0x0/0x8d [<c012815c>] ? kthread+0x39/0x5e [<c0128123>] ? kthread+0x0/0x5e [<c0103b9f>] ? kernel_thread_helper+0x7/0x10 Code: db e8 05 ff ff ff 89 c6 e8 dc 04 f6 ff eb 08 8b 40 04 e8 38 89 f5 ff 8b 44 9e 04 85 c0 75 f0 43 83 fb 20 75 f2 8b 86 84 00 00 00 <8b> 40 04 e8 1c 89 f5 ff e8 98 04 f6 ff 89 f0 e8 f8 63 e6 ff 5b EIP: [<c02efdb5>] ip6_tnl_exit_net+0x37/0x4f SS:ESP 0068:c7909f48 ---[ end trace 6c2f2328fccd3e0c ]--- Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: Convert TCP/DCCP listening hash tables to use RCUEric Dumazet2008-11-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the last step to be able to perform full RCU lookups in __inet_lookup() : After established/timewait tables, we add RCU lookups to listening hash table. The only trick here is that a socket of a given type (TCP ipv4, TCP ipv6, ...) can now flight between two different tables (established and listening) during a RCU grace period, so we must use different 'nulls' end-of-chain values for two tables. We define a large value : #define LISTENING_NULLS_BASE (1U << 29) So that slots in listening table are guaranteed to have different end-of-chain values than slots in established table. A reader can still detect it finished its lookup in the right chain. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: remove redundant argument commentsQinghuang Feng2008-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove redundant argument comments in files of net/* Signed-off-by: Qinghuang Feng <qhfeng.kernel@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: convert TCP/DCCP ehash rwlocks to spinlocksEric Dumazet2008-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now TCP & DCCP use RCU lookups, we can convert ehash rwlocks to spinlocks. /proc/net/tcp and other seq_file 'readers' can safely be converted to 'writers'. This should speedup writers, since spin_lock()/spin_unlock() only use one atomic operation instead of two for write_lock()/write_unlock() Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | ipv6: convert tunnels to net_device_opsStephen Hemminger2008-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Like IPV4, convert the tunnel virtual devices to use net_device_ops. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | ipmr: convert ipmr virtual interface to net_device_opsStephen Hemminger2008-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to new network device ops interface. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge branch 'master' of ↵David S. Miller2008-11-20
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/ixgbe/ixgbe_main.c include/net/mac80211.h net/phonet/af_phonet.c
| * | | | net: listening_hash get a spinlock per bucketEric Dumazet2008-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch prepares RCU migration of listening_hash table for TCP/DCCP protocols. listening_hash table being small (32 slots per protocol), we add a spinlock for each slot, instead of a single rwlock for whole table. This should reduce hold time of readers, and writers concurrency. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | ip: convert to net_device_ops for ioctlStephen Hemminger2008-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to net_device_ops function table pointer for ioctl. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | include/net net/ - csum_partial - remove unnecessary castsJoe Perches2008-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The first argument to csum_partial is const void * casts to char/u8 * are not necessary Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | Merge branch 'master' of ↵David S. Miller2008-11-19
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/isdn/i4l/isdn_net.c fs/cifs/connect.c
| * | | | | net: Convert TCP & DCCP hash tables to use RCU / hlist_nullsEric Dumazet2008-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RCU was added to UDP lookups, using a fast infrastructure : - sockets kmem_cache use SLAB_DESTROY_BY_RCU and dont pay the price of call_rcu() at freeing time. - hlist_nulls permits to use few memory barriers. This patch uses same infrastructure for TCP/DCCP established and timewait sockets. Thanks to SLAB_DESTROY_BY_RCU, no slowdown for applications using short lived TCP connections. A followup patch, converting rwlocks to spinlocks will even speedup this case. __inet_lookup_established() is pretty fast now we dont have to dirty a contended cache line (read_lock/read_unlock) Only established and timewait hashtable are converted to RCU (bind table and listen table are still using traditional locking) Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | udp: Use hlist_nulls in UDP RCU codeEric Dumazet2008-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a straightforward patch, using hlist_nulls infrastructure. RCUification already done on UDP two weeks ago. Using hlist_nulls permits us to avoid some memory barriers, both at lookup time and delete time. Patch is large because it adds new macros to include/net/sock.h. These macros will be used by TCP & DCCP in next patch. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>