litmus-rt-ext-res.git - LITMUS^RT with extended reservations for Forbidden Zones paper @ RTAS'20

	Commit message (Collapse)	Author	Age
*	tipc: fix problem with parallel link synchronization mechanism	Jon Paul Maloy	2015-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, we try to accumulate arrived packets in the links's 'deferred' queue during the parallel link syncronization phase. This entails two problems: - With an unlucky combination of arriving packets the algorithm may go into a lockstep with the out-of-sequence handling function, where the synch mechanism is adding a packet to the deferred queue, while the out-of-sequence handling is retrieving it again, thus ending up in a loop inside the node_lock scope. - Even if this is avoided, the link will very often send out unnecessary protocol messages, in the worst case leading to redundant retransmissions. We fix this by just dropping arriving packets on the upcoming link during the synchronization phase, thus relying on the retransmission protocol to resolve the situation once the two links have arrived to a synchronized state. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: remove wrong use of NLM_F_MULTI	Nicolas Dichtel	2015-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact, it is sent only at the end of a dump. Libraries like libnl will wait forever for NLMSG_DONE. Fixes: 35b9dd7607f0 ("tipc: add bearer get/dump to new netlink api") Fixes: 7be57fc69184 ("tipc: add link get/dump to new netlink api") Fixes: 46f15c6794fb ("tipc: add media get/dump to new netlink api") CC: Richard Alpe <richard.alpe@ericsson.com> CC: Jon Maloy <jon.maloy@ericsson.com> CC: Ying Xue <ying.xue@windriver.com> CC: tipc-discussion@lists.sourceforge.net Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bridge/nl: remove wrong use of NLM_F_MULTI	Nicolas Dichtel	2015-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact, it is sent only at the end of a dump. Libraries like libnl will wait forever for NLMSG_DONE. Fixes: e5a55a898720 ("net: create generic bridge ops") Fixes: 815cccbf10b2 ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf") CC: John Fastabend <john.r.fastabend@intel.com> CC: Sathya Perla <sathya.perla@emulex.com> CC: Subbu Seetharaman <subbu.seetharaman@emulex.com> CC: Ajit Khaparde <ajit.khaparde@emulex.com> CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com> CC: intel-wired-lan@lists.osuosl.org CC: Jiri Pirko <jiri@resnulli.us> CC: Scott Feldman <sfeldma@gmail.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: bridge@lists.linux-foundation.org Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bridge/mdb: remove wrong use of NLM_F_MULTI	Nicolas Dichtel	2015-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact, it is sent only at the end of a dump. Libraries like libnl will wait forever for NLMSG_DONE. Fixes: 37a393bc4932 ("bridge: notify mdb changes via netlink") CC: Cong Wang <amwang@redhat.com> CC: Stephen Hemminger <stephen@networkplumber.org> CC: bridge@lists.linux-foundation.org Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: sched: act_connmark: don't zap skb->nfct	Florian Westphal	2015-04-29
\| \| \| \| \| \| \| \| \| \| \| \|	This action is meant to be passive, i.e. we should not alter skb->nfct: If nfct is present just leave it alone. Compile tested only. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	trivial: net: systemport: bcmsysport.h: fix 0x0x prefix	Antonio Ospite	2015-04-29
\| \| \| \| \| \| \| \| \| \| \|	Fix the 0x0x prefix in an integer constant. In this case, while at it, also fix a typo (s/unitcast/unicast/). Signed-off-by: Antonio Ospite <ao2@ao2.it> Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
*	trivial: net: atl1e: atl1e_hw.h: fix 0x0x prefix	Antonio Ospite	2015-04-29
\| \| \| \| \| \| \| \| \| \|	Fix the 0x0x prefix in an integer constant. Signed-off-by: Antonio Ospite <ao2@ao2.it> Cc: Jay Cliburn <jcliburn@gmail.com> Cc: Chris Snook <chris.snook@gmail.com> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'bnx2x'	David S. Miller	2015-04-29
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Michal Schmidt says: ==================== bnx2x: minor cleanups related to TPA bits I noticed some simplification possibilities while looking into the bug fixed by "bnx2x: really disable TPA if 'disable_tpa' is set'. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	bnx2x: remove {TPA,GRO}_ENABLE_FLAG	Michal Schmidt	2015-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These flags are redundant with dev->features. Remove them. Just make sure to set dev->features ourselves in bnx2x_set_features() before performing the reload of the card. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	bnx2x: merge fp->disable_tpa with fp->mode	Michal Schmidt	2015-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is simpler to have the TPA mode as one three-state variable. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	bnx2x: mark LRO as a fixed disabled feature if disable_tpa is set	Michal Schmidt	2015-04-29
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \|	If disable_tpa is set, remove NETIF_F_LRO from hw_features, so ethtool sees it as "off [fixed]". Note that setting the NETIF_F_LRO bit in dev->features in the 'else' branch is not needed, because the bit was already set by bnx2x_init_dev(). Then the check for disable_tpa in in bnx2x_fix_features() becomes unnecessary. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	hv_netvsc: introduce netif-msg into netvsc module	Simon Xiao	2015-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	1. Introduce netif-msg to netvsc to control debug logging output and keep msg_enable in netvsc_device_context so that it is kept persistently. 2. Only call dump_rndis_message() when NETIF_MSG_RX_ERR or above is specified in netvsc module debug param. In non-debug mode, in current code, dump_rndis_message() will not dump anything but it still initialize some local variables and process the switch logic which is unnecessary, especially in high network throughput situation. Signed-off-by: Simon Xiao <sixiao@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	route: Use ipv4_mtu instead of raw rt_pmtu	Herbert Xu	2015-04-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The commit 3cdaa5be9e81a914e633a6be7b7d2ef75b528562 ("ipv4: Don't increase PMTU with Datagram Too Big message") broke PMTU in cases where the rt_pmtu value has expired but is smaller than the new PMTU value. This obsolete rt_pmtu then prevents the new PMTU value from being installed. Fixes: 3cdaa5be9e81 ("ipv4: Don't increase PMTU with Datagram Too Big message") Reported-by: Gerd v. Egidy <gerd.von.egidy@intra2net.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf	David S. Miller	2015-04-27
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Fix a crash in nf_tables when dictionaries are used from the ruleset, due to memory corruption, from Florian Westphal. 2) Fix another crash in nf_queue when used with br_netfilter. Also from Florian. Both fixes are related to new stuff that got in 4.0-rc. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	netfilter: bridge: fix NULL deref in physin/out ifindex helpers	Florian Westphal	2015-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Might not have an outdev yet. We'll oops when iface goes down while skbs are still nfqueue'd: RIP: 0010:[<ffffffff81422a2f>] [<ffffffff81422a2f>] dev_cmp+0x4f/0x80 nfqnl_rcv_dev_event+0xe2/0x150 notifier_call_chain+0x53/0xa0 Fixes: c737b7c4510026 ("netfilter: bridge: add helpers for fetching physin/outdev") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
\| *	netfilter: nf_tables: fix wrong length for jump/goto verdicts	Florian Westphal	2015-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	NFT_JUMP/GOTO erronously sets length to sizeof(void *). We then allocate insufficient memory when such element is added to a vmap. Suggested-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* \|	bpf: fix 64-bit divide	Alexei Starovoitov	2015-04-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ALU64_DIV instruction should be dividing 64-bit by 64-bit, whereas do_div() does 64-bit by 32-bit divide. x64 and arm64 JITs correctly implement 64 by 64 unsigned divide. llvm BPF backend emits code assuming that ALU64_DIV does 64 by 64. Fixes: 89aa075832b0 ("net: sock: allow eBPF programs to be attached to sockets") Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	net: netcp: remove call to netif_carrier_(on/off) for MAC to Phy interface	Karicheri, Muralidharan	2015-04-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently when interface type is MAC to Phy, netif_carrier_(on/off) is called which is not needed as Phy lib already updates the carrier status to net stack. This is needed only for other interface types Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	2015-04-27
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull networking fixes from David Miller: 1) mlx4 doesn't check fully for supported valid RSS hash function, fix from Amir Vadai 2) Off by one in ibmveth_change_mtu(), from David Gibson 3) Prevent altera chip from reporting false error interrupts in some circumstances, from Chee Nouk Phoon 4) Get rid of that stupid endless loop trying to allocate a FIN packet in TCP, and in the process kill deadlocks. From Eric Dumazet 5) Fix get_rps_cpus() crash due to wrong invalid-cpu value, also from Eric Dumazet 6) Fix two bugs in async rhashtable resizing, from Thomas Graf 7) Fix topology server listener socket namespace bug in TIPC, from Ying Xue 8) Add some missing HAS_DMA kconfig dependencies, from Geert Uytterhoeven 9) bgmac driver intends to force re-polling but does so by returning the wrong value from it's ->poll() handler. Fix from Rafał Miłecki 10) When the creater of an rhashtable configures a max size for it, don't bark in the logs and drop insertions when that is exceeded. Fix from Johannes Berg 11) Recover from out of order packets in ppp mppe properly, from Sylvain Rochet * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits) bnx2x: really disable TPA if 'disable_tpa' option is set net:treewide: Fix typo in drivers/net net/mlx4_en: Prevent setting invalid RSS hash function mdio-mux-gpio: use new gpiod_get_array and gpiod_put_array functions netfilter; Add some missing default cases to switch statements in nft_reject. ppp: mppe: discard late packet in stateless mode ppp: mppe: sanity error path rework net/bonding: Make DRV macros private net: rfs: fix crash in get_rps_cpus() altera tse: add support for fixed-links. pxa168: fix double deallocation of managed resources net: fix crash in build_skb() net: eth: altera: Resolve false errors from MSGDMA to TSE ehea: Fix memory hook reference counting crashes net/tg3: Release IRQs on permanent error net: mdio-gpio: support access that may sleep inet: fix possible panic in reqsk_queue_unlink() rhashtable: don't attempt to grow when at max_size bgmac: fix requests for extra polling calls from NAPI tcp: avoid looping in tcp_send_fin() ...
\| * \|	bnx2x: really disable TPA if 'disable_tpa' option is set	Michal Schmidt	2015-04-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	bnx2x's 'disable_tpa=1' module option is not respected properly and TPA (transparent packet aggregation) remains enabled. Even though the module option causes LRO to be disabled, TPA is enabled in GRO mode. Additionally, disabling GRO via ethtool then has no effect. One can still observe tpa_* statistics increase and large packets being received in tcpdump. The bug was an unintended consequence of commit aebf6244cd39 "bnx2x: Be more forgiving toward SW GRO". Fix it by following the bp->disable_tpa flag when initializing fp's. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net:treewide: Fix typo in drivers/net	Masanari Iida	2015-04-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch fix spelling typo in printk. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net/mlx4_en: Prevent setting invalid RSS hash function	Amir Vadai	2015-04-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	mlx4_en_check_rxfh_func() was checking for hardware support before setting a known RSS hash function, but didn't do any check before setting unknown RSS hash function. Need to make it fail on such values. In this occasion, moved the actual setting of the new value from the check function into mlx4_en_set_rxfh(). Fixes: 947cbb0 ("net/mlx4_en: Support for configurable RSS hash function") Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	mdio-mux-gpio: use new gpiod_get_array and gpiod_put_array functions	Rojhalat Ibrahim	2015-04-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Use the new gpiod_get_array and gpiod_put_array functions (added to mainline in the v4.1 merge window) for obtaining and disposing of GPIO descriptors. Cc: David Miller <davem@davemloft.net> Cc: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Rojhalat Ibrahim <imr@rtschenk.de> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	netfilter; Add some missing default cases to switch statements in nft_reject.	David S. Miller	2015-04-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes: ==================== net/netfilter/nft_reject.c: In function ‘nft_reject_dump’: net/netfilter/nft_reject.c:61:2: warning: enumeration value ‘NFT_REJECT_TCP_RST’ not handled in switch [-Wswitch] switch (priv->type) { ^ net/netfilter/nft_reject.c:61:2: warning: enumeration value ‘NFT_REJECT_ICMPX_UNREACH’ not handled in switch [-Wswi\ tch] net/netfilter/nft_reject_inet.c: In function ‘nft_reject_inet_dump’: net/netfilter/nft_reject_inet.c:105:2: warning: enumeration value ‘NFT_REJECT_TCP_RST’ not handled in switch [-Wswi\ tch] switch (priv->type) { ^ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	Merge branch 'ppp_mppe_desync'	David S. Miller	2015-04-26
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Sylvain Rochet says: ==================== ppp: mppe: fixes MPPE desync on links which don't guarantee packet ordering I am currently having an issue with PPP over L2TP (UDP) and MPPE in stateless mode (default mode), UDP does not guarantee packet ordering so we might get out of order packet. MPPE needs to be continuously synched so we should drop late UDP packet. I added a printk on the number of time we rekeyed in MPPE decompressor, this is what we currently have if we receive a slightly out of order UDP packet: [1731001.049206] mppe_decompress[1]: ccount 1559 [1731001.049216] mppe_decompress[1]: rekeyed 1 times [1731001.049228] mppe_decompress[1]: ccount 1560 [1731001.049232] mppe_decompress[1]: rekeyed 1 times [1731001.050170] mppe_decompress[1]: ccount 1562 [1731001.050182] mppe_decompress[1]: rekeyed 2 times [1731001.050191] mppe_decompress[1]: ccount 1561 [1731001.062576] mppe_decompress[1]: rekeyed 4095 times ^^^^ This is obviously wrong, we missed packet 1561 and we already rekeyed 2 times for 1562 we previously received, we can't recover the decryption key we need for 1561, we should drop it instead of rekeying 4095 times. This patch series drop any packet with are not within the 4096/2 forward window. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \|	ppp: mppe: discard late packet in stateless mode	Sylvain Rochet	2015-04-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When PPP is used over a link which does not guarantee packet ordering, we might get late MPPE packets. This is a problem because MPPE must be kept synchronized and the current implementation does not drop them and rekey 4095 times instead of 0, which is wrong. In order to prevent rekeying about a whole count space times (~ 4095 times), drop packets which are not within the forward 4096/2 window and increase sanity error counter. Signed-off-by: Sylvain Rochet <sylvain.rochet@finsecur.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| * \|	ppp: mppe: sanity error path rework	Sylvain Rochet	2015-04-26
\| \|/ / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We are going to need sanity error path a little further, rework to be able to use the sanity error path anywhere in decompressor. Signed-off-by: Sylvain Rochet <sylvain.rochet@finsecur.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net/bonding: Make DRV macros private	Matan Barak	2015-04-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The bonding modules currently defines four macros with general names that pollute the global namespace: DRV_VERSION DRV_RELDATE DRV_NAME DRV_DESCRIPTION Fixing that by defining a private bonding_priv.h header files which includes those defines. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net: rfs: fix crash in get_rps_cpus()	Eric Dumazet	2015-04-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 567e4b79731c ("net: rfs: add hash collision detection") had one mistake : RPS_NO_CPU is no longer the marker for invalid cpu in set_rps_cpu() and get_rps_cpu(), as @next_cpu was the result of an AND with rps_cpu_mask This bug showed up on a host with 72 cpus : next_cpu was 0x7f, and the code was trying to access percpu data of an non existent cpu. In a follow up patch, we might get rid of compares against nr_cpu_ids, if we init the tables with 0. This is silly to test for a very unlikely condition that exists only shortly after table initialization, as we got rid of rps_reset_sock_flow() and similar functions that were writing this RPS_NO_CPU magic value at flow dismantle : When table is old enough, it never contains this value anymore. Fixes: 567e4b79731c ("net: rfs: add hash collision detection") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	altera tse: add support for fixed-links.	Andreas Oetken	2015-04-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for fixed-links in configurations without PHY. (e.g. connection to a switch, SGMII point to point, SFPs) Check: Documentation/devicetree/bindings/net/fixed-link.txt. Signed-off-by: Andreas Oetken <ennoerlangen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	pxa168: fix double deallocation of managed resources	Alexey Khoroshilov	2015-04-26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 43d3ddf87a57 ("net: pxa168_eth: add device tree support") starts to use managed resources by adding devm_clk_get() and devm_ioremap_resource(), but it leaves explicit iounmap() and clock_put() in pxa168_eth_remove() and in failure handling code of pxa168_eth_probe(). As a result double free can happen. The patch removes explicit resource deallocation. Also it converts clk_disable() to clk_disable_unprepare() to make it symmetrical with clk_prepare_enable(). Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net: fix crash in build_skb()	Eric Dumazet	2015-04-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When I added pfmemalloc support in build_skb(), I forgot netlink was using build_skb() with a vmalloc() area. In this patch I introduce __build_skb() for netlink use, and build_skb() is a wrapper handling both skb->head_frag and skb->pfmemalloc This means netlink no longer has to hack skb->head_frag [ 1567.700067] kernel BUG at arch/x86/mm/physaddr.c:26! [ 1567.700067] invalid opcode: 0000 [#1] PREEMPT SMP KASAN [ 1567.700067] Dumping ftrace buffer: [ 1567.700067] (ftrace buffer empty) [ 1567.700067] Modules linked in: [ 1567.700067] CPU: 9 PID: 16186 Comm: trinity-c182 Not tainted 4.0.0-next-20150424-sasha-00037-g4796e21 #2167 [ 1567.700067] task: ffff880127efb000 ti: ffff880246770000 task.ti: ffff880246770000 [ 1567.700067] RIP: __phys_addr (arch/x86/mm/physaddr.c:26 (discriminator 3)) [ 1567.700067] RSP: 0018:ffff8802467779d8 EFLAGS: 00010202 [ 1567.700067] RAX: 000041000ed8e000 RBX: ffffc9008ed8e000 RCX: 000000000000002c [ 1567.700067] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffffb3fd6049 [ 1567.700067] RBP: ffff8802467779f8 R08: 0000000000000019 R09: ffff8801d0168000 [ 1567.700067] R10: ffff8801d01680c7 R11: ffffed003a02d019 R12: ffffc9000ed8e000 [ 1567.700067] R13: 0000000000000f40 R14: 0000000000001180 R15: ffffc9000ed8e000 [ 1567.700067] FS: 00007f2a7da3f700(0000) GS:ffff8801d1000000(0000) knlGS:0000000000000000 [ 1567.700067] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1567.700067] CR2: 0000000000738308 CR3: 000000022e329000 CR4: 00000000000007e0 [ 1567.700067] Stack: [ 1567.700067] ffffc9000ed8e000 ffff8801d0168000 ffffc9000ed8e000 ffff8801d0168000 [ 1567.700067] ffff880246777a28 ffffffffad7c0a21 0000000000001080 ffff880246777c08 [ 1567.700067] ffff88060d302e68 ffff880246777b58 ffff880246777b88 ffffffffad9a6821 [ 1567.700067] Call Trace: [ 1567.700067] build_skb (include/linux/mm.h:508 net/core/skbuff.c:316) [ 1567.700067] netlink_sendmsg (net/netlink/af_netlink.c:1633 net/netlink/af_netlink.c:2329) [ 1567.774369] ? sched_clock_cpu (kernel/sched/clock.c:311) [ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273) [ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273) [ 1567.774369] sock_sendmsg (net/socket.c:614 net/socket.c:623) [ 1567.774369] sock_write_iter (net/socket.c:823) [ 1567.774369] ? sock_sendmsg (net/socket.c:806) [ 1567.774369] __vfs_write (fs/read_write.c:479 fs/read_write.c:491) [ 1567.774369] ? get_lock_stats (kernel/locking/lockdep.c:249) [ 1567.774369] ? default_llseek (fs/read_write.c:487) [ 1567.774369] ? vtime_account_user (kernel/sched/cputime.c:701) [ 1567.774369] ? rw_verify_area (fs/read_write.c:406 (discriminator 4)) [ 1567.774369] vfs_write (fs/read_write.c:539) [ 1567.774369] SyS_write (fs/read_write.c:586 fs/read_write.c:577) [ 1567.774369] ? SyS_read (fs/read_write.c:577) [ 1567.774369] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) [ 1567.774369] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2594 kernel/locking/lockdep.c:2636) [ 1567.774369] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42) [ 1567.774369] system_call_fastpath (arch/x86/kernel/entry_64.S:261) Fixes: 79930f5892e ("net: do not deplete pfmemalloc reserve") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net: eth: altera: Resolve false errors from MSGDMA to TSE	Chee Nouk Phoon	2015-04-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch resolves false errors from MSGDMA in TX mSGDMA MM to ST mode, and is a continuation of the patch recently submitted by Andrea Oetken. The MSGDMA had a logic bug that masked detection of this issue prior to Quartus 14.1/Build 164. When the MSGDMA logic bug was addressed in Quartus 14.1/Build 164, the driver problem was exposed. The problem is corrected by making sure MSGDMA_DESC_CTL_TR_ERR_IRQ is not set for any of the transmit DMA descriptors, and only used for receive descriptors. Fixes: 71cd26e altera tse: Error-Bit on tx-avalon-stream always set. Signed-off-by: Chee Nouk Phoon <cnphoon@altera.com> Signed-off-by: Vince Bridgers <vbridger@opensource.altera.com>a Cc: Andreas Oetken <ennoerlangen@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	ehea: Fix memory hook reference counting crashes	Michael Ellerman	2015-04-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The recent commit to only register the EHEA memory hotplug hooks on adapter probe has a few problems. Firstly the reference counting is wrong for multiple adapters, in that the hooks are registered multiple times. Secondly the check in the tear down path is backward. Finally the error path doesn't decrement the count. The multiple registration of the hooks is the biggest problem, as it leads to oopses when the system is rebooted, and/or errors during memory hotplug, eg: $ ./mem-on-off-test.sh -r 2 ... ehea: memory is going offline ehea: LPAR memory changed - re-initializing driver ehea: re-initializing driver complete ehea: memory is going offline ehea: LPAR memory changed - re-initializing driver ehea: opcode=26c ret=fffffffffffffffc arg1=8000000003000003 arg2=0 arg3=700000060000d600 arg4=3fded0000 arg5=200 arg6=0 arg7=0 ehea: register_rpage_mr failed ehea: registering mr failed ehea: register MR failed - driver inoperable! ehea: memory is going offline Fixes: aa183323312d ("ehea: Register memory hotplug, reboot and crash hooks on adapter probe") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net/tg3: Release IRQs on permanent error	Gavin Shan	2015-04-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When having permanent EEH error, the PCI device will be removed from the system. For this case, we shouldn't set pcierr_recovery to true wrongly, which blocks the driver to release the allocated interrupts and their handlers. Eventually, we can't disable MSI or MSIx successfully because of the MSI or MSIx interrupts still have associated interrupt actions, which is turned into following stack dump. Oops: Exception in kernel mode, sig: 5 [#1] : [c0000000003b76a8] .free_msi_irqs+0x80/0x1a0 (unreliable) [c00000000039f388] .pci_remove_bus_device+0x98/0x110 [c0000000000790f4] .pcibios_remove_pci_devices+0x9c/0x128 [c000000000077b98] .handle_eeh_events+0x2d8/0x4b0 [c0000000000782d0] .eeh_event_handler+0x130/0x1c0 [c000000000022bd4] .kernel_thread+0x54/0x70 Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Acked-by: Prashant Sreedharan <prashant@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	net: mdio-gpio: support access that may sleep	Vivien Didelot	2015-04-24
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some systems using mdio-gpio may use gpio on message based busses, which require sleeping (e.g. gpio from an I2C I/O expander). Since this driver does not use IRQ handler, it is safe to use the _cansleep suffixed gpio accessors. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	inet: fix possible panic in reqsk_queue_unlink()	Eric Dumazet	2015-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ 3897.923145] BUG: unable to handle kernel NULL pointer dereference at 0000000000000080 [ 3897.931025] IP: [<ffffffffa9f27686>] reqsk_timer_handler+0x1a6/0x243 There is a race when reqsk_timer_handler() and tcp_check_req() call inet_csk_reqsk_queue_unlink() on the same req at the same time. Before commit fa76ce7328b2 ("inet: get rid of central tcp/dccp listener timer"), listener spinlock was held and race could not happen. To solve this bug, we change reqsk_queue_unlink() to not assume req must be found, and we return a status, to conditionally release a refcount on the request sock. This also means tcp_check_req() in non fastopen case might or not consume req refcount, so tcp_v6_hnd_req() & tcp_v4_hnd_req() have to properly handle this. (Same remark for dccp_check_req() and its callers) inet_csk_reqsk_queue_drop() is now too big to be inlined, as it is called 4 times in tcp and 3 times in dccp. Fixes: fa76ce7328b2 ("inet: get rid of central tcp/dccp listener timer") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	rhashtable: don't attempt to grow when at max_size	Johannes Berg	2015-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The conversion of mac80211's station table to rhashtable had a bug that I found by accident in code review, that hadn't been found as rhashtable apparently managed to have a maximum hash chain length of one (!) in all our testing. In order to test the bug and verify the fix I set my rhashtable's max_size very low (4) in order to force getting hash collisions. At that point, rhashtable WARNed in rhashtable_insert_rehash() but didn't actually reject the hash table insertion. This caused it to lose insertions - my master list of stations would have 9 entries, but the rhashtable only had 5. This may warrant a deeper look, but that WARN_ON() just shouldn't happen. Fix this by not returning true from rht_grow_above_100() when the rhashtable's max_size has been reached - in this case the user is explicitly configuring it to be at most that big, so even if it's now above 100% it shouldn't attempt to resize. This fixes the "lost insertion" issue and consequently allows my code to display its error (and verify my fix for it.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	bgmac: fix requests for extra polling calls from NAPI	Rafał Miłecki	2015-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After d75b1ade567f ("net: less interrupt masking in NAPI") polling function has to return whole budget when it wants NAPI to call it again. Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Cc: Felix Fietkau <nbd@openwrt.org> Fixes: eb64e2923a886 ("bgmac: leave interrupts disabled as long as there is work to do") Acked-by: Felix Fietkau <nbd@openwrt.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	tcp: avoid looping in tcp_send_fin()	Eric Dumazet	2015-04-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Presence of an unbound loop in tcp_send_fin() had always been hard to explain when analyzing crash dumps involving gigantic dying processes with millions of sockets. Lets try a different strategy : In case of memory pressure, try to add the FIN flag to last packet in write queue, even if packet was already sent. TCP stack will be able to deliver this FIN after a timeout event. Note that this FIN being delivered by a retransmit, it also carries a Push flag given our current implementation. By checking sk_under_memory_pressure(), we anticipate that cooking many FIN packets might deplete tcp memory. In the case we could not allocate a packet, even with __GFP_WAIT allocation, then not sending a FIN seems quite reasonable if it allows to get rid of this socket, free memory, and not block the process from eventually doing other useful work. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ethernet: myri10ge: use arch_phys_wc_add()	Luis R. Rodriguez	2015-04-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This driver already uses ioremap_wc() on the same range so when write-combining is available that will be used instead. Cc: Hyong-Youb Kim <hykim@myri.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Suresh Siddha <sbsiddha@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Juergen Gross <jgross@suse.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: netdev@vger.kernel.org Cc: Juergen Gross <jgross@suse.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Airlie <airlied@redhat.com> Cc: Antonino Daplas <adaplas@gmail.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	can: CAN_GRCAN should depend on HAS_DMA	Geert Uytterhoeven	2015-04-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If NO_DMA=y: drivers/built-in.o: In function `grcan_free_dma_buffers': grcan.c:(.text+0x2d7716): undefined reference to `dma_free_coherent' drivers/built-in.o: In function `grcan_allocate_dma_buffers': grcan.c:(.text+0x2d779c): undefined reference to `dma_alloc_coherent' Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ethernet: arc: ARC_EMAC and EMAC_ROCKCHIP should depend on HAS_DMA	Geert Uytterhoeven	2015-04-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If NO_DMA=y: drivers/built-in.o: In function `arc_emac_tx_clean': emac_main.c:(.text+0x2decde): undefined reference to `dma_unmap_single' drivers/built-in.o: In function `arc_emac_rx': emac_main.c:(.text+0x2dee1c): undefined reference to `dma_unmap_single' emac_main.c:(.text+0x2dee72): undefined reference to `dma_map_single' emac_main.c:(.text+0x2dee7e): undefined reference to `dma_mapping_error' drivers/built-in.o: In function `arc_emac_probe': (.text+0x2df2ee): undefined reference to `dmam_alloc_coherent' drivers/built-in.o: In function `arc_emac_open': emac_main.c:(.text+0x2df6d8): undefined reference to `dma_map_single' emac_main.c:(.text+0x2df6e4): undefined reference to `dma_mapping_error' drivers/built-in.o: In function `arc_emac_tx': emac_main.c:(.text+0x2df9e4): undefined reference to `dma_map_single' emac_main.c:(.text+0x2df9f0): undefined reference to `dma_mapping_error' Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ethernet: amd: AMD_XGBE should depend on HAS_DMA	Geert Uytterhoeven	2015-04-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If NO_DMA=y: drivers/built-in.o: In function `xgbe_probe': xgbe-main.c:(.text+0x2def0a): undefined reference to `dma_set_mask' xgbe-main.c:(.text+0x2def20): undefined reference to `dma_supported' drivers/built-in.o: In function `xgbe_rx_poll': xgbe-drv.c:(.text+0x2e0320): undefined reference to `dma_sync_single_for_cpu' xgbe-drv.c:(.text+0x2e035e): undefined reference to `dma_sync_single_for_cpu' drivers/built-in.o: In function `xgbe_unmap_rdata': xgbe-desc.c:(.text+0x2e5fe4): undefined reference to `dma_unmap_page' xgbe-desc.c:(.text+0x2e5ffa): undefined reference to `dma_unmap_single' xgbe-desc.c:(.text+0x2e604a): undefined reference to `dma_unmap_page' xgbe-desc.c:(.text+0x2e6084): undefined reference to `dma_unmap_page' drivers/built-in.o: In function `xgbe_alloc_pages': xgbe-desc.c:(.text+0x2e6156): undefined reference to `dma_map_page' xgbe-desc.c:(.text+0x2e6164): undefined reference to `dma_mapping_error' drivers/built-in.o: In function `xgbe_free_ring': xgbe-desc.c:(.text+0x2e63d4): undefined reference to `dma_unmap_page' xgbe-desc.c:(.text+0x2e640e): undefined reference to `dma_unmap_page' xgbe-desc.c:(.text+0x2e644a): undefined reference to `dma_free_coherent' drivers/built-in.o: In function `xgbe_init_ring': xgbe-desc.c:(.text+0x2e64d4): undefined reference to `dma_alloc_coherent' drivers/built-in.o: In function `xgbe_map_tx_skb': xgbe-desc.c:(.text+0x2e6628): undefined reference to `dma_map_single' xgbe-desc.c:(.text+0x2e6638): undefined reference to `dma_mapping_error' xgbe-desc.c:(.text+0x2e66b2): undefined reference to `dma_map_single' xgbe-desc.c:(.text+0x2e66c2): undefined reference to `dma_mapping_error' xgbe-desc.c:(.text+0x2e6762): undefined reference to `dma_map_page' xgbe-desc.c:(.text+0x2e6772): undefined reference to `dma_mapping_error' Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: unix: garbage: fixed several comment and whitespace style issues	Jason Eastman	2015-04-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	fixed several comment and whitespace style issues Signed-off-by: Jason Eastman <eastman.jason.linux@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	Merge branch 'tipc-fixes'	David S. Miller	2015-04-23
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Jon Maloy says: ==================== tipc: three bug fixes A set of unrelated corrections; one for the tipc netns implementation, one regarding problems with random link resets, and one removing a an erroneous refcount decrement when reading link statistsics via netlink. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	tipc: fix node refcount issue	Erik Hugne	2015-04-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When link statistics is dumped over netlink, we iterate over the list of peer nodes and append each links statistics to the netlink msg. In the case where the dump is resumed after filling up a nlmsg, the node refcnt is decremented without having been incremented previously which may cause the node reference to be freed. When this happens, the following info/stacktrace will be generated, followed by a crash or undefined behavior. We fix this by removing the erroneous call to tipc_node_put inside the loop that iterates over nodes. [ 384.312303] INFO: trying to register non-static key. [ 384.313110] the code is fine but needs lockdep annotation. [ 384.313290] turning off the locking correctness validator. [ 384.313290] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.0.0+ #13 [ 384.313290] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 384.313290] ffff88003c6d0290 ffff88003cc03ca8 ffffffff8170adf1 0000000000000007 [ 384.313290] ffffffff82728730 ffff88003cc03d38 ffffffff810a6a6d 00000000001d7200 [ 384.313290] ffff88003c6d0ab0 ffff88003cc03ce8 0000000000000285 0000000000000001 [ 384.313290] Call Trace: [ 384.313290] <IRQ> [<ffffffff8170adf1>] dump_stack+0x4c/0x65 [ 384.313290] [<ffffffff810a6a6d>] __lock_acquire+0xf3d/0xf50 [ 384.313290] [<ffffffff810a7375>] lock_acquire+0xd5/0x290 [ 384.313290] [<ffffffffa0043e8c>] ? link_timeout+0x1c/0x170 [tipc] [ 384.313290] [<ffffffffa0043e70>] ? link_state_event+0x4e0/0x4e0 [tipc] [ 384.313290] [<ffffffff81712890>] _raw_spin_lock_bh+0x40/0x80 [ 384.313290] [<ffffffffa0043e8c>] ? link_timeout+0x1c/0x170 [tipc] [ 384.313290] [<ffffffffa0043e8c>] link_timeout+0x1c/0x170 [tipc] [ 384.313290] [<ffffffff810c4698>] call_timer_fn+0xb8/0x490 [ 384.313290] [<ffffffff810c45e0>] ? process_timeout+0x10/0x10 [ 384.313290] [<ffffffff810c5a2c>] run_timer_softirq+0x21c/0x420 [ 384.313290] [<ffffffffa0043e70>] ? link_state_event+0x4e0/0x4e0 [tipc] [ 384.313290] [<ffffffff8105a954>] __do_softirq+0xf4/0x630 [ 384.313290] [<ffffffff8105afdd>] irq_exit+0x5d/0x60 [ 384.313290] [<ffffffff8103ade1>] smp_apic_timer_interrupt+0x41/0x50 [ 384.313290] [<ffffffff817144a0>] apic_timer_interrupt+0x70/0x80 [ 384.313290] <EOI> [<ffffffff8100db10>] ? default_idle+0x20/0x210 [ 384.313290] [<ffffffff8100db0e>] ? default_idle+0x1e/0x210 [ 384.313290] [<ffffffff8100e61a>] arch_cpu_idle+0xa/0x10 [ 384.313290] [<ffffffff81099803>] cpu_startup_entry+0x2c3/0x530 [ 384.313290] [<ffffffff810d2893>] ? clockevents_register_device+0x113/0x200 [ 384.313290] [<ffffffff81038b0f>] start_secondary+0x13f/0x170 Fixes: 8a0f6ebe8494 ("tipc: involve reference counter for node structure") Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	tipc: fix random link reset problem	Erik Hugne	2015-04-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the function tipc_sk_rcv(), the stack variable 'err' is only initialized to TIPC_ERR_NO_PORT for the first iteration over the link input queue. If a chain of messages are received from a link, failure to lookup the socket for any but the first message will cause the message to bounce back out on a random link. We fix this by properly initializing err. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| \| *	tipc: fix topology server broken issue	Ying Xue	2015-04-23
\| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When a new topology server is launched in a new namespace, its listening socket is inserted into the "init ns" namespace's socket hash table rather than the one owned by the new namespace. Although the socket's namespace is forcedly changed to the new namespace later, the socket is still stored in the socket hash table of "init ns" namespace. When a client created in the new namespace connects its own topology server, the connection is failed as its server's socket could not be found from its own namespace's socket table. If __sock_create() instead of original sock_create_kern() is used to create the server's socket through specifying an expected namesapce, the socket will be inserted into the specified namespace's socket table, thereby avoiding to the topology server broken issue. Fixes: 76100a8a64bc ("tipc: fix netns refcnt leak") Reported-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ibmveth: Fix off-by-one error in ibmveth_change_mtu()	David Gibson	2015-04-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	AFAIK the PAPR document which defines the virtual device interface used by the ibmveth driver doesn't specify a specific maximum MTU. So, in the ibmveth driver, the maximum allowed MTU is determined by the maximum allocated buffer size of 64k (corresponding to one page in the common case) minus the per-buffer overhead IBMVETH_BUFF_OH (which has value 22 for 14 bytes of ethernet header, plus 8 bytes for an opaque handle). This suggests a maximum allowable MTU of 65514 bytes, but in fact the driver only permits a maximum MTU of 65513. This is because there is a < instead of an <= in ibmveth_change_mtu(), which only permits an MTU which is strictly smaller than the buffer size, rather than allowing the buffer to be completely filled. This patch fixes the buglet. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>