aboutsummaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAge
* pppoatm: fix module_put() raceKrzysztof Mazur2012-11-27
| | | | | | | | | | | The pppoatm used module_put() during unassignment from vcc with hope that we have BKL. This assumption is no longer true. Now owner field in atmvcc is used to move this module_put() to vcc_destroy_socket(). Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* pppoatm: allow assign only on a connected socketKrzysztof Mazur2012-11-27
| | | | | | | | | | | | | The pppoatm does not check if used vcc is in connected state, causing an Oops in pppoatm_send() when vcc->send() is called on not fully connected socket. Now pppoatm can be assigned only on connected sockets; otherwise -EINVAL error is returned. Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Cc: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
* atm: add owner of push() callback to atmvccKrzysztof Mazur2012-11-27
| | | | | | | | | | | | | | The atm is using atmvcc->push(vcc, NULL) callback to notify protocol that vcc will be closed and protocol must detach from it. This callback is usually used by protocol to decrement module usage count by module_put(), but it leaves small window then module is still used after module_put(). Now the owner of push() callback is kept in atmvcc and module_put(atmvcc->owner) is called after the protocol is detached from vcc. Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Acked-by: Chas Williams <chas@cmf.nrl.navy.mil>
* atm: br2684: Fix excessive queue bloatDavid Woodhouse2012-11-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's really no excuse for an additional wmem_default of buffering between the netdev queue and the ATM device. Two packets (one in-flight, and one ready to send) ought to be fine. It's not as if it should take long to get another from the netdev queue when we need it. If necessary we can make the queue space configurable later, but I don't think it's likely to be necessary. cf. commit 9d02daf754238adac48fa075ee79e7edd3d79ed3 (pppoatm: Fix excessive queue bloat) which did something very similar for PPPoATM. Note that there is a tremendously unlikely race condition which may result in qspace temporarily going negative. If a CPU running the br2684_pop() function goes off into the weeds for a long period of time after incrementing qspace to 1, but before calling netdev_wake_queue()... and another CPU ends up calling br2684_start_xmit() and *stopping* the queue again before the first CPU comes back, the netdev queue could end up being woken when qspace has already reached zero. An alternative approach to coping with this race would be to check in br2684_start_xmit() for qspace==0 and return NETDEV_TX_BUSY, but just using '> 0' and '< 1' for comparison instead of '== 0' and '!= 0' is simpler. It just warranted a mention of *why* we do it that way... Move the call to atmvcc->send() to happen *after* the accounting and potentially stopping the netdev queue, in br2684_xmit_vcc(). This matters if the ->send() call suffers an immediate failure, because it'll call br2684_pop() with the offending skb before returning. We want that to happen *after* we've done the initial accounting for the packet in question. Also make it return an appropriate success/failure indication while we're at it. Tested by running 'ping -l 1000 bottomless.aaisp.net.uk' from within my network, with only a single PPPoE-over-BR2684 link running. And after setting txqueuelen on the nas0 interface to something low (5, in fact). Before the patch, we'd see about 15 packets being queued and a resulting latency of ~56ms being reached. After the patch, we see only about 8, which is fairly much what we expect. And a max latency of ~36ms. On this OpenWRT box, wmem_default is 163840. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> Reviewed-by: Krzysztof Mazur <krzysiek@podlesie.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* dsa: Hide core config options; make drivers select what they needBen Hutchings2012-11-26
| | | | | | | | | | | | | | | | | | | | | | Commit 82167cb8c6b2f8166d5c7532e5ef4b5e0cc46a72 ('net: dsa/slave: Fix compilation warnings') fixed one possible invalid configuration (NET_DSA enabled with no trailer formats) but added others: drivers can select NET_DSA without its dependencies being met. It's not very useful to make either the DSA core or the tagging formats manually selectable without a driver to use them, so: 1. Define a hidden HAVE_NET_DSA option and move the dependencies of NET_DSA to that. While we're at it, drop the deprecated EXPERIMENTAL dependency. 2. Make NET_DSA and the drivers dependent on HAVE_NET_DSA. 3. Hide the tagging format options again. 4. Make drivers select both NET_DSA and the appropriate tagging format option. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4/ipmr and ipv6/ip6mr: Convert int mroute_do_<foo> to boolJoe Perches2012-11-25
| | | | | | | | | | Save a few bytes per table by convert mroute_do_assert and mroute_do_pim from int to bool. Remove !! as the compiler does that when assigning int to bool. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: ipmr: various fixes and cleanupsEric Dumazet2012-11-25
| | | | | | | | | | | | | | | | 1) ip_mroute_setsockopt() & ip_mroute_getsockopt() should not access/set raw_sk(sk)->ipmr_table before making sure the socket is a raw socket, and protocol is IGMP 2) MRT_INIT should return -EINVAL if optlen != sizeof(int), not -ENOPROTOOPT 3) MRT_ASSERT & MRT_PIM should validate optlen 4) " (v) ? 1 : 0 " can be written as " !!v " Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: dsa/slave: Fix compilation warningsviresh kumar2012-11-25
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently when none of CONFIG_NET_DSA_TAG_DSA, CONFIG_NET_DSA_TAG_EDSA and CONFIG_NET_DSA_TAG_TRAILER is defined, we get following compilation warnings: net/dsa/slave.c:51:12: warning: 'dsa_slave_init' defined but not used [-Wunused-function] net/dsa/slave.c:60:12: warning: 'dsa_slave_open' defined but not used [-Wunused-function] net/dsa/slave.c:98:12: warning: 'dsa_slave_close' defined but not used [-Wunused-function] net/dsa/slave.c:116:13: warning: 'dsa_slave_change_rx_flags' defined but not used [-Wunused-function] net/dsa/slave.c:127:13: warning: 'dsa_slave_set_rx_mode' defined but not used [-Wunused-function] net/dsa/slave.c:136:12: warning: 'dsa_slave_set_mac_address' defined but not used [-Wunused-function] net/dsa/slave.c:164:12: warning: 'dsa_slave_ioctl' defined but not used [-Wunused-function] Earlier approach to fix this was discussed here: lkml.org/lkml/2012/10/29/549 This is another approach to fix it. This is done by some changes in config options, which make more sense than the earlier approach. As, atleast one tagging option must always be selected for using net/dsa/ infrastructure, this patch selects NET_DSA from tagging configs instead of having it as an selectable config. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: sched: enable CAN Identifier to be build into kernelMarc Kleine-Budde2012-11-25
| | | | | | | | This patch makes it possible to build the CAN Identifier into the kernel, even if the CAN support is build as a module. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2012-11-25
|\ | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/wireless/iwlwifi/pcie/tx.c Minor iwlwifi conflict in TX queue disabling between 'net', which removed a bogus warning, and 'net-next' which added some status register poking code. Signed-off-by: David S. Miller <davem@davemloft.net>
| * ipv4: do not cache looped multicastsJulian Anastasov2012-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Starting from 3.6 we cache output routes for multicasts only when using route to 224/4. For local receivers we can set RTCF_LOCAL flag depending on the membership but in such case we use maddr and saddr which are not caching keys as before. Additionally, we can not use same place to cache routes that differ in RTCF_LOCAL flag value. Fix it by caching only RTCF_MULTICAST entries without RTCF_LOCAL (send-only, no loopback). As a side effect, we avoid unneeded lookup for fnhe when not caching because multicasts are not redirected and they do not learn PMTU. Thanks to Maxime Bizon for showing the caching problems in __mkroute_output for 3.6 kernels: different RTCF_LOCAL flag in cache can lead to wrong ip_mc_output or ip_output call and the visible problem is that traffic can not reach local receivers via loopback. Reported-by: Maxime Bizon <mbizon@freebox.fr> Tested-by: Maxime Bizon <mbizon@freebox.fr> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'master' of git://1984.lsi.us.es/nfDavid S. Miller2012-11-22
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pablo Neira Ayuso says: ==================== The following patchset contains two Netfilter fixes: * Fix buffer overflow in the name of the timeout policy object in the cttimeout infrastructure, from Florian Westphal. * Fix a bug in the hash set in case that IP ranges are specified, from Jozsef Kadlecsik. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * netfilter: cttimeout: fix buffer overflowFlorian Westphal2012-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Chen Gang reports: the length of nla_data(cda[CTA_TIMEOUT_NAME]) is not limited in server side. And indeed, its used to strcpy to a fixed-sized buffer. Fortunately, nfnetlink users need CAP_NET_ADMIN. Reported-by: Chen Gang <gang.chen@asianux.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| | * netfilter: ipset: Fix range bug in hash:ip,port,netJozsef Kadlecsik2012-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to the missing ininitalization at adding/deleting entries, when a plain_ip,port,net element was the object, multiple elements were added/deleted instead. The bug came from the missing dangling default initialization. The error-prone default initialization is corrected in all hash:* types. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * | Merge branch 'master' of ↵David S. Miller2012-11-22
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== This pull request is intended for 3.7 and contains a single patch to fix the IPsec gc threshold value for ipv4. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | xfrm: Fix the gc threshold value for ipv4Steffen Klassert2012-11-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The xfrm gc threshold value depends on ip_rt_max_size. This value was set to INT_MAX with the routing cache removal patch, so we start doing garbage collecting when we have INT_MAX/2 IPsec routes cached. Fix this by going back to the static threshold of 1024 routes. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | Merge branch 'master' of ↵John W. Linville2012-11-21
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem John W. Linville says: ==================== This is a batch of fixes intended for 3.7... Included are two pulls. Regarding the mac80211 tree, Johannes says: "Please pull my mac80211.git tree (see below) to get two more fixes for 3.7. Both fix regressions introduced *before* this cycle that weren't noticed until now, one for IBSS not cleaning up properly and the other to add back the "wireless" sysfs directory for Fedora's startup scripts." Regarding the iwlwifi tree, Johannes says: "Please also pull my iwlwifi.git tree, I have two fixes: one to remove a spurious warning that can actually trigger in legitimate situations, and the other to fix a regression from when monitor mode was changed to use the "sniffer" firmware mode." Also included is an nfc tree pull. Samuel says: "We mostly have pn533 fixes here, 2 memory leaks and an early unlocking fix. Moreover, we also have an LLCP adapter linked list insertion fix." On top of that, a few more bits... Albert Pool adds a USB ID to rtlwifi. Bing Zhao provides two mwifiex fixes -- one to fix a system hang during a command timeout, and the other to properly report a suspend error to the MMC core. Finally, Sujith Manoharan fixes a thinko that would trigger an ath9k hang during device reset. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | NFC: Fix nfc_llcp_local chained list insertionThierry Escande2012-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | list_add was called with swapped parameters Signed-off-by: Thierry Escande <thierry.escande@linux.intel.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
| | * | | Merge branch 'for-john' of ↵John W. Linville2012-11-19
| | |\ \ \ | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
| | | * | | wireless: add back sysfs directoryJohannes Berg2012-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 35b2a113cb0298d4f9a1263338b456094a414057 broke (at least) Fedora's networking scripts, they check for the existence of the wireless directory. As the files aren't used, add the directory back and not the files. Also do it for both drivers based on the old wireless extensions and cfg80211, regardless of whether the compat code for wext is built into cfg80211 or not. Cc: stable@vger.kernel.org [3.6] Reported-by: Dave Airlie <airlied@gmail.com> Reported-by: Bill Nottingham <notting@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| | | * | | mac80211: deinitialize ibss-internals after emptiness checkSimon Wunderlich2012-11-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The check whether the IBSS is active and can be removed should be performed before deinitializing the fields used for the check/search. Otherwise, the configured BSS will not be found and removed properly. To make it more clear for the future, rename sdata->u.ibss to the local pointer ifibss which is used within the checks. This behaviour was introduced by f3209bea110cade12e2b133da8b8499689cb0e2e ("mac80211: fix IBSS teardown race") Cc: stable@vger.kernel.org Cc: Ignacy Gawedzki <i@lri.fr> Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
| * | | | | ipv6: fix inet6_csk_update_pmtu() return valueEric Dumazet2012-11-20
| | |_|_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case of error, inet6_csk_update_pmtu() should consistently return NULL. Bug added in commit 35ad9b9cf7d8a (ipv6: Add helper inet6_csk_update_pmtu().) Reported-by: Lluís Batlle i Rossell <viric@viric.name> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | Merge branch 'tipc_net-next' of ↵David S. Miller2012-11-23
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux Paul Gortmaker says: ==================== The most interesting thing here, at least from a user perspective, is the broadcast link fix -- where there was a corner case where two endpoints could get in a state where they disagree on where to start Rx and ack of broadcast packets. There is also the poll/wait changes which could also impact end users for certain use cases - the fixes there also better align tipc with the rest of the networking code. The rest largely falls into routine cleanup category, by getting rid of some unused routines, some Kconfig clutter, etc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | tipc: delete TIPC_ADVANCED Kconfig variablePaul Gortmaker2012-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There used to be a time when TIPC had lots of Kconfig knobs the end user could alter, but they have all been made automatic or obsolete, with the exception of CONFIG_TIPC_PORTS. This previously existing set of options was all hidden under the TIPC_ADVANCED setting, which does not exist in any code, but only in Kconfig scope. Having this now, just to hide the one remaining "advanced" option no longer makes sense. Remove it. Also get rid of the ifdeffery in the TIPC code that allowed for TIPC_PORTS to be possibly undefined. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * | | | | tipc: eliminate an unnecessary cast of node variableYing Xue2012-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the variable:node is currently defined to u32 type, it is unnecessary to cast its type to u32 again when using it. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * | | | | tipc: introduce message to synchronize broadcast linkJon Maloy2012-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upon establishing a first link between two nodes, there is currently a risk that the two endpoints will disagree on exactly which sequence number reception and acknowleding of broadcast packets should start. The following scenarios may happen: 1: Node A sends an ACTIVATE message to B, telling it to start acking packets from sequence number N. 2: Node A sends out broadcast N, but does not expect an acknowledge from B, since B is not yet in its broadcast receiver's list. 3: Node A receives ACK for N from all nodes except B, and releases packet N. 4: Node B receives the ACTIVATE, activates its link endpoint, and stores the value N as sequence number of first expected packet. 5: Node B sends a NAME_DISTR message to A. 6: Node A receives the NAME_DISTR message, and activates its endpoint. At this moment B is added to A's broadcast receiver's set. Node A also sets sequence number 0 as the first broadcast packet to be received from B. 7: Node A sends broadcast N+1. 8: B receives N+1, determines there is a gap in the sequence, since it is expecting N, and sends a NACK for N back to A. 9: Node A has already released N, so no retransmission is possible. The broadcast link in direction A->B is stale. In addition to, or instead of, 7-9 above, the following may happen: 10: Node B sends broadcast M > 0 to A. 11: Node A receives M, falsely decides there must be a gap, since it is expecting packet 0, and asks for retransmission of packets [0,M-1]. 12: Node B has already released these packets, so the broadcast link is stale in direction B->A. We solve this problem by introducing a new unicast message type, BCAST_PROTOCOL/STATE, to convey the sequence number of the next sent broadcast packet to the other endpoint, at exactly the moment that endpoint is added to the own node's broadcast receivers list, and before any other unicast messages are permitted to be sent. Furthermore, we don't allow any node to start receiving and processing broadcast packets until this new synchronization message has been received. To maintain backwards compatibility, we still open up for broadcast reception if we receive a NAME_DISTR message without any preceding broadcast sync message. In this case, we must assume that the other end has an older code version, and will never send out the new synchronization message. Hence, for mixed old and new nodes, the issue arising in 7-12 of the above may happen with the same probability as before. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * | | | | tipc: rename supported flag to recv_permittedYing Xue2012-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename the "supported" flag in bclink structure to "recv_permitted" to better reflect what it is used for. When this flag is set for a given node, we are permitted to receive and acknowledge broadcast messages from that node. Convert it to a bool at the same time, since it is not used to store any numerical values. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * | | | | tipc: remove supportable flag from bclink structureYing Xue2012-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "supportable" flag in bclink structure is a compatibility flag indicating whether a peer node is capable of receiving TIPC broadcast messages. However, all TIPC versions since tipc-1.5, and after the inclusion in the upstream Linux kernel in 2006, support this capability. It is highly unlikely that anybody is still using such an old version of TIPC, let alone that they want to mix it with TIPC-2.0 nodes. Therefore, we now remove the "supportable" flag. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * | | | | tipc: remove the bearer congestion mechanismYing Xue2012-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently at the TIPC bearer layer there is the following congestion mechanism: Once sending packets has failed via that bearer, the bearer will be flagged as being in congested state at once. During bearer congestion, all packets arriving at link will be queued on the link's outgoing buffer. When we detect that the state of bearer congestion has relaxed (e.g. some packets are received from the bearer) we will try our best to push all packets in the link's outgoing buffer until the buffer is empty, or until the bearer is congested again. However, in fact the TIPC bearer never receives any feedback from the device layer whether a send was successful or not, so it must always assume it was successful. Therefore, the bearer congestion mechanism as it exists currently is of no value. But the bearer blocking state is still useful for us. For example, when the physical media goes down/up, we need to change the state of the links bound to the bearer. So the code maintaing the state information is not removed. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * | | | | tipc: wake up all waiting threads at socket shutdownYing Xue2012-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a socket is shut down, we should wake up all thread sleeping on it, instead of just one of them. Otherwise, when several threads are polling the same socket, and one of them does shutdown(), the remaining threads may end up sleeping forever. Also, to align socket usage with common practice in other stacks, we use one of the common socket callback handlers, sk_state_change(), to wake up pending users. This is similar to the usage in e.g. inet_shutdown(). [net/ipv4/af_inet.c]. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * | | | | tipc: return POLLOUT for sockets in an unconnected stateErik Hugne2012-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an implied connect is attempted on a nonblocking STREAM/SEQPACKET socket during link congestion, the connect message will be discarded and sendmsg will return EAGAIN. This is normal behavior, and the application is expected to poll the socket until POLLOUT is set, after which the connection attempt can be retried. However, the POLLOUT flag is never set for unconnected sockets and poll() always returns a zero mask. The application is then left without a trigger for when it can make another attempt at sending the message. The solution is to check if we're polling on an unconnected socket and set the POLLOUT flag if the TIPC port owned by this socket is not congested. The TIPC ports waiting on a specific link will be marked as 'not congested' when the link congestion have abated. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
| * | | | | tipc: fix race/inefficiencies in poll/wait behaviourYing Xue2012-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When an application blocks at poll/select on a TIPC socket while requesting a specific event mask, both the filter_rcv() and wakeupdispatch() case will wake it up unconditionally whenever the state changes (i.e an incoming message arrives, or congestion has subsided). No mask is used. To avoid this, we populate sk->sk_data_ready and sk->sk_write_space with tipc_data_ready and tipc_write_space respectively, which makes tipc more in alignment with the rest of the networking code. These pass the exact set of possible events to the waker in fs/select.c hence avoiding waking up blocked processes unnecessarily. In doing so, we uncover another issue -- that there needs to be a memory barrier in these poll/receive callbacks, otherwise we are subject to the the same race as documented above wq_has_sleeper() [in commit a57de0b4 "net: adding memory barrier to the poll and receive callbacks"]. So we need to replace poll_wait() with sock_poll_wait() and use rcu protection for the sk->sk_wq pointer in these two new functions. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* | | | | | ipv6: adapt connect for repair moveAndrey Vagin2012-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is work the same as for ipv4. All other hacks about tcp repair are in common code for ipv4 and ipv6, so this patch is enough for repairing ipv6 connections. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Andrey Vagin <avagin@openvz.org> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | Merge branch 'master' of ↵David S. Miller2012-11-22
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== This pull request is intended for net-next and contains the following changes: 1) Remove a redundant check when initializing the xfrm replay functions, from Ulrich Weber. 2) Use a faster per-cpu helper when allocating ipcomt transforms, from Shan Wei. 3) Use a static gc threshold value for ipv6, simmilar to what we do for ipv4 now. 4) Remove a commented out function call. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | xfrm6: Remove commented out function call to xfrm6_input_finiSteffen Klassert2012-11-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | xfrm6_input_fini() is not in the tree since more than 10 years, so remove the commented out function call. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | | | | xfrm: Use a static gc threshold value for ipv6Steffen Klassert2012-11-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unlike ipv4 did, ipv6 does not handle the maximum number of cached routes dynamically. So no need to try to handle the IPsec gc threshold value dynamically. This patch sets the IPsec gc threshold value back to 1024 routes, as it is for non-IPsec routes. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | | | | net: xfrm: use __this_cpu_read per-cpu helperShan Wei2012-11-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | this_cpu_ptr/this_cpu_read is faster than per_cpu_ptr(p, smp_processor_id()) and can reduce memory accesses. The latter helper needs to find the offset for current cpu, and needs more assembler instructions which objdump shows in following. this_cpu_ptr relocates and address. this_cpu_read() relocates the address and performs the fetch. this_cpu_read() saves you more instructions since it can do the relocation and the fetch in one instruction. per_cpu_ptr(p, smp_processor_id()): 1e: 65 8b 04 25 00 00 00 00 mov %gs:0x0,%eax 26: 48 98 cltq 28: 31 f6 xor %esi,%esi 2a: 48 c7 c7 00 00 00 00 mov $0x0,%rdi 31: 48 8b 04 c5 00 00 00 00 mov 0x0(,%rax,8),%rax 39: c7 44 10 04 14 00 00 00 movl $0x14,0x4(%rax,%rdx,1) this_cpu_ptr(p) 1e: 65 48 03 14 25 00 00 00 00 add %gs:0x0,%rdx 27: 31 f6 xor %esi,%esi 29: c7 42 04 14 00 00 00 movl $0x14,0x4(%rdx) 30: 48 c7 c7 00 00 00 00 mov $0x0,%rdi Signed-off-by: Shan Wei <davidshan@tencent.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
| * | | | | | xfrm: remove redundant replay_esn checkUlrich Weber2012-11-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | x->replay_esn is already checked in if clause, so remove check and ident properly Signed-off-by: Ulrich Weber <ulrich.weber@sophos.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* | | | | | | batman-adv: Use packing of 2 for all headers before an ethernet headerSven Eckelmann2012-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All packet headers in front of an ethernet header have to be completely divisible by 2 but not by 4 to make the payload after the ethernet header again 4 bytes boundary aligned. A packing of 2 is necessary to avoid extra padding at the end of the struct caused by a structure member which is larger than two bytes. Otherwise the structure would not fulfill the previously mentioned rule to avoid the misalignment of the payload after the ethernet header. It may also lead to leakage of information when the padding it not initialized before sending. Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
* | | | | | | batman-adv: Start new development cycleSven Eckelmann2012-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
* | | | | | | batman-adv: Fix broadcast duplist for fragmentationSimon Wunderlich2012-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the skb is fragmented, the checksum must be computed on the individual fragments, just using skb->data may fail on fragmented data. Instead of doing linearizing the packet, use the new batadv_crc32 to do that more efficiently- it should not hurt replacing the old crc16 by the new crc32. Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
* | | | | | | batman-adv: Add function to calculate crc32c for the skb payloadSven Eckelmann2012-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
* | | | | | | batman-adv: Add wrapper to look up neighbor and send skbMartin Hundebøll2012-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | By adding batadv_send_skb_to_orig() in send.c, we can remove duplicate code that looks up the next hop and then calls batadv_send_skb_packet(). Furthermore, this prepares the upcoming new implementation of fragmentation, which requires the next hop to route packets. Please note that this doesn't entirely remove the next-hop lookup in routing.c and unicast.c, since it is used by the current fragmentation code. Also note that the next-hop info is removed from debug messages in translation-table.c, since it is looked up elsewhere. Signed-off-by: Martin Hundebøll <martin@hundeboll.net> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
* | | | | | | batman-adv: support array of debugfs general attributesAntonio Quartulli2012-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for an array of debugfs general (not soft_iface specific) attributes. With this change it will be possible to add more general attributes by simply appending them to the array without touching the rest of the code. Reported-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org> Acked-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
* | | | | | | batman-adv: fix bla compare functionSimon Wunderlich2012-11-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The address and the VLAN VID may not be packed in the respective structs. Fix this by comparing the elements individually. Reported-by: Marek Lindner <lindner_marek@yahoo.de> Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
* | | | | | | batman-adv: Mark best gateway in transtable_global debugfsSven Eckelmann2012-11-21
| |/ / / / / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The transtable_global debug file can show multiple entries for a single client when multiple gateways exist. The chosen gateway isn't marked in the list and therefore the user cannot easily debug the situation when there is a problem with the currently used gateway. The best gateway is now marked with "*" and secondary gateways are marked with "+". Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <lindner_marek@yahoo.de> Signed-off-by: Antonio Quartulli <ordex@autistici.org>
* | | | | | sctp: send abort chunk when max_retrans exceededNeil Horman2012-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the event that an association exceeds its max_retrans attempts, we should send an ABORT chunk indicating that we are closing the assocation as a result. Because of the nature of the error, its unlikely to be received, but its a nice clean way to close the association if it does make it through, and it will give anyone watching via tcpdump a clue as to what happened. Change notes: v2) * Removed erroneous changes from sctp_make_violation_parmlen Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Vlad Yasevich <vyasevich@gmail.com> CC: "David S. Miller" <davem@davemloft.net> CC: linux-sctp@vger.kernel.org Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | net: Remove redundant null check before kfree in dev.cSachin Kamat2012-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kfree on a null pointer is a no-op. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | caif: Remove redundant null check before kfree in cfctrl.cSachin Kamat2012-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | kfree on a null pointer is a no-op. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | sit: allow to configure 6rd tunnels via netlinkNicolas Dichtel2012-11-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch add the support of 6RD tunnels management via netlink. Note that netdev_state_change() is now called when 6RD parameters are updated. 6RD parameters are updated only if there is at least one 6RD attribute. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>