aboutsummaryrefslogtreecommitdiffstats
path: root/net
Commit message (Collapse)AuthorAge
...
| * | | net: filter: rework/optimize internal BPF interpreter's instruction setAlexei Starovoitov2014-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch replaces/reworks the kernel-internal BPF interpreter with an optimized BPF instruction set format that is modelled closer to mimic native instruction sets and is designed to be JITed with one to one mapping. Thus, the new interpreter is noticeably faster than the current implementation of sk_run_filter(); mainly for two reasons: 1. Fall-through jumps: BPF jump instructions are forced to go either 'true' or 'false' branch which causes branch-miss penalty. The new BPF jump instructions have only one branch and fall-through otherwise, which fits the CPU branch predictor logic better. `perf stat` shows drastic difference for branch-misses between the old and new code. 2. Jump-threaded implementation of interpreter vs switch statement: Instead of single table-jump at the top of 'switch' statement, gcc will now generate multiple table-jump instructions, which helps CPU branch predictor logic. Note that the verification of filters is still being done through sk_chk_filter() in classical BPF format, so filters from user- or kernel space are verified in the same way as we do now, and same restrictions/constraints hold as well. We reuse current BPF JIT compilers in a way that this upgrade would even be fine as is, but nevertheless allows for a successive upgrade of BPF JIT compilers to the new format. The internal instruction set migration is being done after the probing for JIT compilation, so in case JIT compilers are able to create a native opcode image, we're going to use that, and in all other cases we're doing a follow-up migration of the BPF program's instruction set, so that it can be transparently run in the new interpreter. In short, the *internal* format extends BPF in the following way (more details can be taken from the appended documentation): - Number of registers increase from 2 to 10 - Register width increases from 32-bit to 64-bit - Conditional jt/jf targets replaced with jt/fall-through - Adds signed > and >= insns - 16 4-byte stack slots for register spill-fill replaced with up to 512 bytes of multi-use stack space - Introduction of bpf_call insn and register passing convention for zero overhead calls from/to other kernel functions - Adds arithmetic right shift and endianness conversion insns - Adds atomic_add insn - Old tax/txa insns are replaced with 'mov dst,src' insn Performance of two BPF filters generated by libpcap resp. bpf_asm was measured on x86_64, i386 and arm32 (other libpcap programs have similar performance differences): fprog #1 is taken from Documentation/networking/filter.txt: tcpdump -i eth0 port 22 -dd fprog #2 is taken from 'man tcpdump': tcpdump -i eth0 'tcp port 22 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)' -dd Raw performance data from BPF micro-benchmark: SK_RUN_FILTER on the same SKB (cache-hit) or 10k SKBs (cache-miss); time in ns per call, smaller is better: --x86_64-- fprog #1 fprog #1 fprog #2 fprog #2 cache-hit cache-miss cache-hit cache-miss old BPF 90 101 192 202 new BPF 31 71 47 97 old BPF jit 12 34 17 44 new BPF jit TBD --i386-- fprog #1 fprog #1 fprog #2 fprog #2 cache-hit cache-miss cache-hit cache-miss old BPF 107 136 227 252 new BPF 40 119 69 172 --arm32-- fprog #1 fprog #1 fprog #2 fprog #2 cache-hit cache-miss cache-hit cache-miss old BPF 202 300 475 540 new BPF 180 270 330 470 old BPF jit 26 182 37 202 new BPF jit TBD Thus, without changing any userland BPF filters, applications on top of AF_PACKET (or other families) such as libpcap/tcpdump, cls_bpf classifier, netfilter's xt_bpf, team driver's load-balancing mode, and many more will have better interpreter filtering performance. While we are replacing the internal BPF interpreter, we also need to convert seccomp BPF in the same step to make use of the new internal structure since it makes use of lower-level API details without being further decoupled through higher-level calls like sk_unattached_filter_{create,destroy}(), for example. Just as for normal socket filtering, also seccomp BPF experiences a time-to-verdict speedup: 05-sim-long_jumps.c of libseccomp was used as micro-benchmark: seccomp_rule_add_exact(ctx,... seccomp_rule_add_exact(ctx,... rc = seccomp_load(ctx); for (i = 0; i < 10000000; i++) syscall(199, 100); 'short filter' has 2 rules 'large filter' has 200 rules 'short filter' performance is slightly better on x86_64/i386/arm32 'large filter' is much faster on x86_64 and i386 and shows no difference on arm32 --x86_64-- short filter old BPF: 2.7 sec 39.12% bench libc-2.15.so [.] syscall 8.10% bench [kernel.kallsyms] [k] sk_run_filter 6.31% bench [kernel.kallsyms] [k] system_call 5.59% bench [kernel.kallsyms] [k] trace_hardirqs_on_caller 4.37% bench [kernel.kallsyms] [k] trace_hardirqs_off_caller 3.70% bench [kernel.kallsyms] [k] __secure_computing 3.67% bench [kernel.kallsyms] [k] lock_is_held 3.03% bench [kernel.kallsyms] [k] seccomp_bpf_load new BPF: 2.58 sec 42.05% bench libc-2.15.so [.] syscall 6.91% bench [kernel.kallsyms] [k] system_call 6.25% bench [kernel.kallsyms] [k] trace_hardirqs_on_caller 6.07% bench [kernel.kallsyms] [k] __secure_computing 5.08% bench [kernel.kallsyms] [k] sk_run_filter_int_seccomp --arm32-- short filter old BPF: 4.0 sec 39.92% bench [kernel.kallsyms] [k] vector_swi 16.60% bench [kernel.kallsyms] [k] sk_run_filter 14.66% bench libc-2.17.so [.] syscall 5.42% bench [kernel.kallsyms] [k] seccomp_bpf_load 5.10% bench [kernel.kallsyms] [k] __secure_computing new BPF: 3.7 sec 35.93% bench [kernel.kallsyms] [k] vector_swi 21.89% bench libc-2.17.so [.] syscall 13.45% bench [kernel.kallsyms] [k] sk_run_filter_int_seccomp 6.25% bench [kernel.kallsyms] [k] __secure_computing 3.96% bench [kernel.kallsyms] [k] syscall_trace_exit --x86_64-- large filter old BPF: 8.6 seconds 73.38% bench [kernel.kallsyms] [k] sk_run_filter 10.70% bench libc-2.15.so [.] syscall 5.09% bench [kernel.kallsyms] [k] seccomp_bpf_load 1.97% bench [kernel.kallsyms] [k] system_call new BPF: 5.7 seconds 66.20% bench [kernel.kallsyms] [k] sk_run_filter_int_seccomp 16.75% bench libc-2.15.so [.] syscall 3.31% bench [kernel.kallsyms] [k] system_call 2.88% bench [kernel.kallsyms] [k] __secure_computing --i386-- large filter old BPF: 5.4 sec new BPF: 3.8 sec --arm32-- large filter old BPF: 13.5 sec 73.88% bench [kernel.kallsyms] [k] sk_run_filter 10.29% bench [kernel.kallsyms] [k] vector_swi 6.46% bench libc-2.17.so [.] syscall 2.94% bench [kernel.kallsyms] [k] seccomp_bpf_load 1.19% bench [kernel.kallsyms] [k] __secure_computing 0.87% bench [kernel.kallsyms] [k] sys_getuid new BPF: 13.5 sec 76.08% bench [kernel.kallsyms] [k] sk_run_filter_int_seccomp 10.98% bench [kernel.kallsyms] [k] vector_swi 5.87% bench libc-2.17.so [.] syscall 1.77% bench [kernel.kallsyms] [k] __secure_computing 0.93% bench [kernel.kallsyms] [k] sys_getuid BPF filters generated by seccomp are very branchy, so the new internal BPF performance is better than the old one. Performance gains will be even higher when BPF JIT is committed for the new structure, which is planned in future work (as successive JIT migrations). BPF has also been stress-tested with trinity's BPF fuzzer. Joint work with Daniel Borkmann. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Cc: Kees Cook <keescook@chromium.org> Cc: Paul Moore <pmoore@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@linux.intel.com> Cc: linux-kernel@vger.kernel.org Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: ptp: do not reimplement PTP/BPF classifierDaniel Borkmann2014-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are currently pch_gbe, cpts, and ixp4xx_eth drivers that open-code and reimplement a BPF classifier for the PTP protocol. Since all of them effectively do the very same thing and load the very same PTP/BPF filter, we can just consolidate that code by introducing ptp_classify_raw() in the time-stamping core framework which can be used in drivers. As drivers get initialized after bootstrapping the core networking subsystem, they can make use of ptp_insns wrapped through ptp_classify_raw(), which allows to simplify and remove PTP classifier setup code in drivers. Joint work with Alexei Starovoitov. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Richard Cochran <richard.cochran@omicron.at> Cc: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: ptp: use sk_unattached_filter_create() for BPFDaniel Borkmann2014-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch migrates an open-coded sk_run_filter() implementation with proper use of the BPF API, that is, sk_unattached_filter_create(). This migration is needed, as we will be internally transforming the filter to a different representation, and therefore needs to be decoupled. It is okay to do so as skb_timestamping_init() is called during initialization of the network stack in core initcall via sock_init(). This would effectively also allow for PTP filters to be jit compiled if bpf_jit_enable is set. For better readability, there are also some newlines introduced, also ptp_classify.h is only in kernel space. Joint work with Alexei Starovoitov. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Richard Cochran <richard.cochran@omicron.at> Cc: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: filter: move filter accounting to filter coreDaniel Borkmann2014-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch basically does two things, i) removes the extern keyword from the include/linux/filter.h file to be more consistent with the rest of Joe's changes, and ii) moves filter accounting into the filter core framework. Filter accounting mainly done through sk_filter_{un,}charge() take care of the case when sockets are being cloned through sk_clone_lock() so that removal of the filter on one socket won't result in eviction as it's still referenced by the other. These functions actually belong to net/core/filter.c and not include/net/sock.h as we want to keep all that in a central place. It's also not in fast-path so uninlining them is fine and even allows us to get rd of sk_filter_release_rcu()'s EXPORT_SYMBOL and a forward declaration. Joint work with Alexei Starovoitov. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: filter: keep original BPF program aroundDaniel Borkmann2014-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to open up the possibility to internally transform a BPF program into an alternative and possibly non-trivial reversible representation, we need to keep the original BPF program around, so that it can be passed back to user space w/o the need of a complex decoder. The reason for that use case resides in commit a8fc92778080 ("sk-filter: Add ability to get socket filter program (v2)"), that is, the ability to retrieve the currently attached BPF filter from a given socket used mainly by the checkpoint-restore project, for example. Therefore, we add two helpers sk_{store,release}_orig_filter for taking care of that. In the sk_unattached_filter_create() case, there's no such possibility/requirement to retrieve a loaded BPF program. Therefore, we can spare us the work in that case. This approach will simplify and slightly speed up both, sk_get_filter() and sock_diag_put_filterinfo() handlers as we won't need to successively decode filters anymore through sk_decode_filter(). As we still need sk_decode_filter() later on, we're keeping it around. Joint work with Alexei Starovoitov. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | net: filter: add jited flag to indicate jit compiled filtersDaniel Borkmann2014-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a jited flag into sk_filter struct in order to indicate whether a filter is currently jited or not. The size of sk_filter is not being expanded as the 32 bit 'len' member allows upper bits to be reused since a filter can currently only grow as large as BPF_MAXINSNS. Therefore, there's enough room also for other in future needed flags to reuse 'len' field if necessary. The jited flag also allows for having alternative interpreter functions running as currently, we can only detect jit compiled filters by testing fp->bpf_func to not equal the address of sk_run_filter(). Joint work with Alexei Starovoitov. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-03-29
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/marvell/mvneta.c The mvneta.c conflict is a case of overlapping changes, a conversion to devm_ioremap_resource() vs. a conversion to netdev_alloc_pcpu_stats. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | ipv6: fix checkpatch errors of "foo*" and "foo * bar"Wang Yufen2014-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ERROR: "(foo*)" should be "(foo *)" ERROR: "foo * bar" should be "foo *bar" Suggested-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Wang Yufen <wangyufen@huawei.com> Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | ipv6: fix checkpatch errors of brace and trailing statementsWang Yufen2014-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ERROR: open brace '{' following enum go on the same line ERROR: open brace '{' following struct go on the same line ERROR: trailing statements should be on next line Signed-off-by: Wang Yufen <wangyufen@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | ipv6: fix checkpatch errors comments and spaceWang Yufen2014-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | WARNING: please, no space before tabs WARNING: please, no spaces at the start of a line ERROR: spaces required around that ':' (ctx:VxW) ERROR: spaces required around that '>' (ctx:VxV) ERROR: spaces required around that '>=' (ctx:VxV) Signed-off-by: Wang Yufen <wangyufen@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | netpoll: Respect NETIF_F_LLTXEric W. Biederman2014-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Stop taking the transmit lock when a network device has specified NETIF_F_LLTX. If no locks needed to trasnmit a packet this is the ideal scenario for netpoll as all packets can be trasnmitted immediately. Even if some locks are needed in ndo_start_xmit skipping any unnecessary serialization is desirable for netpoll as it makes it more likely a debugging packet may be trasnmitted immediately instead of being deferred until later. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | netpoll: Remove strong unnecessary assumptions about skbsEric W. Biederman2014-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove the assumption that the skbs that make it to netpoll_send_skb_on_dev are allocated with find_skb, such that skb->users == 1 and nothing is attached that would prevent the skbs from being freed from hard irq context. Remove this assumption by replacing __kfree_skb on error paths with dev_kfree_skb_irq (in hard irq context) and kfree_skb (in process context). Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | netpoll: Rename netpoll_rx_enable/disable to netpoll_poll_disable/enableEric W. Biederman2014-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The netpoll_rx_enable and netpoll_rx_disable functions have always controlled polling the network drivers transmit and receive queues. Rename them to netpoll_poll_enable and netpoll_poll_disable to make their functionality clear. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | netpoll: Move rx enable/disable into __dev_close_manyEric W. Biederman2014-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Today netpoll_rx_enable and netpoll_rx_disable are called from dev_close and and __dev_close, and not from dev_close_many. Move the calls into __dev_close_many so that we have a single call site to maintain, and so that dev_close_many gains this protection as well. Which importantly makes batched network device deletes safe. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | netpoll: Only call ndo_start_xmit from a single placeEric W. Biederman2014-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Factor out the code that needs to surround ndo_start_xmit from netpoll_send_skb_on_dev into netpoll_start_xmit. It is an unfortunate fact that as the netpoll code has been maintained the primary call site ndo_start_xmit learned how to handle vlans and timestamps but the second call of ndo_start_xmit in queue_process did not. With the introduction of netpoll_start_xmit this associated logic now happens at both call sites of ndo_start_xmit and should make it easy for that to continue into the future. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | netpoll: Remove gfp parameter from __netpoll_setupEric W. Biederman2014-03-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The gfp parameter was added in: commit 47be03a28cc6c80e3aa2b3e8ed6d960ff0c5c0af Author: Amerigo Wang <amwang@redhat.com> Date: Fri Aug 10 01:24:37 2012 +0000 netpoll: use GFP_ATOMIC in slave_enable_netpoll() and __netpoll_setup() slave_enable_netpoll() and __netpoll_setup() may be called with read_lock() held, so should use GFP_ATOMIC to allocate memory. Eric suggested to pass gfp flags to __netpoll_setup(). Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> The reason for the gfp parameter was removed in: commit c4cdef9b7183159c23c7302aaf270d64c549f557 Author: dingtianhong <dingtianhong@huawei.com> Date: Tue Jul 23 15:25:27 2013 +0800 bonding: don't call slave_xxx_netpoll under spinlocks The slave_xxx_netpoll will call synchronize_rcu_bh(), so the function may schedule and sleep, it should't be called under spinlocks. bond_netpoll_setup() and bond_netpoll_cleanup() are always protected by rtnl lock, it is no need to take the read lock, as the slave list couldn't be changed outside rtnl lock. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: David S. Miller <davem@davemloft.net> Nothing else that calls __netpoll_setup or ndo_netpoll_setup requires a gfp paramter, so remove the gfp parameter from both of these functions making the code clearer. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: net: add a core netdev->tx_dropped counterEric Dumazet2014-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dropping packets in __dev_queue_xmit() when transmit queue is stopped (NIC TX ring buffer full or BQL limit reached) currently outputs a syslog message. It would be better to get a precise count of such events available in netdevice stats so that monitoring tools can have a clue. This extends the work done in caf586e5f23ce ("net: add a core netdev->rx_dropped counter") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | packet: respect devices with LLTX flag in direct xmitDaniel Borkmann2014-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Quite often it can be useful to test with dummy or similar devices as a blackhole sink for skbs. Such devices are only equipped with a single txq, but marked as NETIF_F_LLTX as they do not require locking their internal queues on xmit (or implement locking themselves). Therefore, rather use HARD_TX_{UN,}LOCK API, so that NETIF_F_LLTX will be respected. trafgen mmap/TX_RING example against dummy device with config foo: { fill(0xff, 64) } results in the following performance improvements for such scenarios on an ordinary Core i7/2.80GHz: Before: Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs): 160,975,944,159 instructions:k # 0.55 insns per cycle ( +- 0.09% ) 293,319,390,278 cycles:k # 0.000 GHz ( +- 0.35% ) 192,501,104 branch-misses:k ( +- 1.63% ) 831 context-switches:k ( +- 9.18% ) 7 cpu-migrations:k ( +- 7.40% ) 69,382 cache-misses:k # 0.010 % of all cache refs ( +- 2.18% ) 671,552,021 cache-references:k ( +- 1.29% ) 22.856401569 seconds time elapsed ( +- 0.33% ) After: Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs): 133,788,739,692 instructions:k # 0.92 insns per cycle ( +- 0.06% ) 145,853,213,256 cycles:k # 0.000 GHz ( +- 0.17% ) 59,867,100 branch-misses:k ( +- 4.72% ) 384 context-switches:k ( +- 3.76% ) 6 cpu-migrations:k ( +- 6.28% ) 70,304 cache-misses:k # 0.077 % of all cache refs ( +- 1.73% ) 90,879,408 cache-references:k ( +- 1.35% ) 11.719372413 seconds time elapsed ( +- 0.24% ) Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tipc: make discovery domain a bearer attributeErik Hugne2014-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The node discovery domain is assigned when a bearer is enabled. In the previous commit we reflect this attribute directly in the bearer structure since it's needed to reinitialize the node discovery mechanism after a hardware address change. There's no need to replicate this attribute anywhere else, so we remove it from the tipc_link_req structure. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tipc: fix neighbor detection problem after hw address changeErik Hugne2014-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the hardware address of a underlying netdevice is changed, it is not enough to simply reset the bearer/links over this device. We also need to reflect this change in the TIPC bearer and node discovery structures aswell. This patch adds the necessary reinitialization of the node disovery mechanism following a hardware address change so that the correct originating media address is advertised in the discovery messages. Signed-off-by: Erik Hugne <erik.hugne@ericsson.com> Reported-by: Dong Liu <dliu.cn@gmail.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | hsr: replace del_timer by del_timer_syncJulia Lawall2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use del_timer_sync to ensure that the timer is stopped on all CPUs before the driver exists. This change was suggested by Thomas Gleixner. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ declarer name module_exit; identifier ex; @@ module_exit(ex); @@ identifier r.ex; @@ ex(...) { <... - del_timer + del_timer_sync (...) ...> } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | atm: replace del_timer by del_timer_syncJulia Lawall2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use del_timer_sync to ensure that the timer is stopped on all CPUs before the driver exists. This change was suggested by Thomas Gleixner. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ declarer name module_exit; identifier ex; @@ module_exit(ex); @@ identifier r.ex; @@ ex(...) { <... - del_timer + del_timer_sync (...) ...> } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tcp: tcp_make_synack() minor changesEric Dumazet2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is no need to allocate 15 bytes in excess for a SYNACK packet, as it contains no data, only headers. SYNACK are always generated in softirq context, and contain a single segment, we can use TCP_INC_STATS_BH() Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | ipv6: do not overwrite inetpeer metrics prematurelyMichal Kubeček2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If an IPv6 host route with metrics exists, an attempt to add a new route for the same target with different metrics fails but rewrites the metrics anyway: 12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1000 12sp0:~ # ip -6 route show fe80::/64 dev eth0 proto kernel metric 256 fec0::1 dev eth0 metric 1024 rto_min lock 1s 12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1500 RTNETLINK answers: File exists 12sp0:~ # ip -6 route show fe80::/64 dev eth0 proto kernel metric 256 fec0::1 dev eth0 metric 1024 rto_min lock 1.5s This is caused by all IPv6 host routes using the metrics in their inetpeer (or the shared default). This also holds for the new route created in ip6_route_add() which shares the metrics with the already existing route and thus ip6_route_add() rewrites the metrics even if the new route ends up not being used at all. Another problem is that old metrics in inetpeer can reappear unexpectedly for a new route, e.g. 12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1000 12sp0:~ # ip route del fec0::1 12sp0:~ # ip route add fec0::1 dev eth0 12sp0:~ # ip route change fec0::1 dev eth0 hoplimit 10 12sp0:~ # ip -6 route show fe80::/64 dev eth0 proto kernel metric 256 fec0::1 dev eth0 metric 1024 hoplimit 10 rto_min lock 1s Resolve the first problem by moving the setting of metrics down into fib6_add_rt2node() to the point we are sure we are inserting the new route into the tree. Second problem is addressed by introducing new flag DST_METRICS_FORCE_OVERWRITE which is set for a new host route in ip6_route_add() and makes ipv6_cow_metrics() always overwrite the metrics in inetpeer (even if they are not "new"); it is reset after that. v5: use a flag in _metrics member rather than one in flags v4: fix a typo making a condition always true (thanks to Hannes Frederic Sowa) v3: rewritten based on David Miller's idea to move setting the metrics (and allocation in non-host case) down to the point we already know the route is to be inserted. Also rebased to net-next as it is quite late in the cycle. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tipc: use node list lock to protect tipc_num_links variableYing Xue2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Without properly implicit or explicit read memory barrier, it's unsafe to read an atomic variable with atomic_read() from another thread which is different with the thread of changing the atomic variable with atomic_inc() or atomic_dec(). So a stale tipc_num_links may be got with atomic_read() in tipc_node_get_links(). If the tipc_num_links variable type is converted from atomic to unsigned integer and node list lock is used to protect it, the issue would be avoided. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tipc: use node_list_lock to protect tipc_num_nodes variableYing Xue2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As tipc_node_list is protected by rcu read lock on read side, it's unnecessary to hold node_list_lock to protect tipc_node_list in tipc_node_get_links(). Instead, node_list_lock should just protects tipc_num_nodes in the function. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tipc: tipc: convert node list and node hlist to RCU listsYing Xue2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert tipc_node_list list and node_htable hash list to RCU lists. On read side, the two lists are protected with RCU read lock, and on update side, node_list_lock is applied to them. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tipc: rename node create lock to protect node list and hlistYing Xue2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a node is created, tipc_net_lock read lock is first held and then node_create_lock is grabbed in order to prevent the same node from being created and inserted into both node list and hlist twice. But when we query node from the two node lists, we only hold tipc_net_lock read lock without grabbing node_create_lock. Obviously this locking policy is unable to guarantee that the two node lists are always synchronized especially when the operation of changing and accessing them occurs in different contexts like currently doing. Therefore, rename node_create_lock to node_list_lock to protect the two node lists, that is, whenever node is inserted into them or node is queried from them, the node_list_lock should be always held. As a result, tipc_net_lock read lock becomes redundant and then can be removed from the node query functions. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tipc: make broadcast bearer store in bearer_list arrayYing Xue2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now unicast bearer is dynamically allocated and placed into its identity specified slot of bearer_list array. When we search bearer_list array with a bearer identity, the corresponding bearer instance can be found. But broadcast bearer is statically allocated and it is not located in the bearer_list array yet. So we decide to enlarge bearer_list array into MAX_BEARERS + 1 slots, and its last slot stores the broadcast bearer so that the broadcast bearer can be found from bearer_list array with MAX_BEARERS as index. The change will help us reduce the complex relationship between bearer and link in the future. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tipc: remove active flag from tipc_bearer structureYing Xue2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After the allocation of tipc_bearer structure instance is converted from statical way to dynamical way, we identify whether a certain tipc_bearer structure pointer is valid by checking whether the pointer is NULL or not. So the active flag in tipc_bearer structure becomes redundant. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tipc: convert tipc_bearers array to pointer listYing Xue2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of the effort to introduce RCU protection for the bearer list, we first need to change it to a list of pointers. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tipc: acquire necessary locks in named_cluster_distribute routineYing Xue2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'tipc_node_list' is guarded by tipc_net_lock and 'links' array defined in 'tipc_node' structure is protected by node lock as well. Without acquiring the two locks in named_cluster_distribute() a fatal oops may happen in case that a destroyed link might be got and then accessed. Therefore, above mentioned two locks must be held in named_cluster_distribute() to prevent the issue from happening accidentally. As 'links' array in node struct must be protected by node lock, we have to move the code of selecting an active link from tipc_link_xmit() to named_cluster_distribute() and then call __tipc_link_xmit() with the selected link to deliver name messages. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tipc: obsolete the remote management featureYing Xue2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to the lacking of any credential, it's allowed to accept commands requested from remote nodes to query the local node status, which is prone to involve potential security risks. Instead, if we login to a remote node with ssh command, this approach is not only more safe than the remote management feature, but also it can give us more permissions like changing the remote node configuration. So it's reasonable for us to obsolete the remote management feature now. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tipc: remove unnecessary checking for node objectYing Xue2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | tipc_node_create routine doesn't need to check whether a node object specified with a node address exists or not because its caller(ie, tipc_disc_recv_msg routine) has checked this before calling it. Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net/core: Use RCU_INIT_POINTER(x, NULL) in netpoll.cMonam Agarwal2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL) The rcu_assign_pointer() ensures that the initialization of a structure is carried out before storing a pointer to that structure. And in the case of the NULL pointer, there is no structure to initialize. So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL) Signed-off-by: Monam Agarwal <monamagarwal123@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net/bridge: Use RCU_INIT_POINTER(x, NULL) in br_vlan.cMonam Agarwal2014-03-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL) The rcu_assign_pointer() ensures that the initialization of a structure is carried out before storing a pointer to that structure. And in the case of the NULL pointer, there is no structure to initialize. So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL) Signed-off-by: Monam Agarwal <monamagarwal123@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | vlan: make a new function vlan_dev_vlan_proto() and exportdingtianhong2014-03-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The vlan support 2 proto: 802.1q and 802.1ad, so make a new function called vlan_dev_vlan_proto() which could return the vlan proto for input dev. Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | net: Rename skb->rxhash to skb->hashTom Herbert2014-03-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The packet hash can be considered a property of the packet, not just on RX path. This patch changes name of rxhash and l4_rxhash skbuff fields to be hash and l4_hash respectively. This includes changing uses of the field in the code which don't call the access functions. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | tcp: delete unused parameter in tcp_nagle_check()Peter Pan(潘卫平)2014-03-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | After commit d4589926d7a9 (tcp: refine TSO splits), tcp_nagle_check() does not use parameter mss_now anymore. Signed-off-by: Weiping Pan <panweiping3@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2014-03-25
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: Documentation/devicetree/bindings/net/micrel-ks8851.txt net/core/netpoll.c The net/core/netpoll.c conflict is a bug fix in 'net' happening to code which is completely removed in 'net-next'. In micrel-ks8851.txt we simply have overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
| * \ \ \ \ Merge branch 'for-davem' of ↵David S. Miller2014-03-25
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== Please pull this batch of wireless updates intended for 3.15! For the mac80211 bits, Johannes says: "This has a whole bunch of bugfixes for things that went into -next previously as well as some other bugfixes I didn't want to rush into 3.14 at this point. The rest of it is some cleanups and a few small features, the biggest of which is probably Janusz's regulatory DFS CAC time code." For the Bluetooth bits, Gustavo says: "One more pull request to 3.15. This is mostly and bug fix pull request, it contains several fixes and clean up all over the tree, plus some small new features." For the NFC bits, Samuel says: "This is the NFC pull request for 3.15. With this one we have: - Support for ISO 15693 a.k.a. NFC vicinity a.k.a. Type 5 tags. ISO 15693 are long range (1 - 2 meters) vicinity tags/cards. The kernel now supports those through the NFC netlink and digital APIs. - Support for TI's trf7970a chipset. This chipset relies on the NFC digital layer and the driver currently supports type 2, 4A and 5 tags. - Support for NXP's pn544 secure firmare download. The pn544 C3 chipsets relies on a different firmware download protocal than the C2 one. We now support both and use the right one depending on the version we detect at runtime. - Support for 4A tags from the NFC digital layer. - A bunch of cleanups and minor fixes from Axel Lin and Thierry Escande." For the iwlwifi bits, Emmanuel says: "We were sending a host command while the mutex wasn't held. This led to hard-to-catch races." And... "I have a fix for a "merge damage" which is not really a merge damage: it enables scheduled scan which has been disabled in wireless.git. Since you merged wireless.git into wireless-next.git, this can now be fixed in wireless-next.git. Besides this, Alex made a workaround for a hardware bug. This fix allows us to consume less power in S3. Arik and Eliad continue to work on D0i3 which is a run-time power saving feature. Eliad also contributes a few bits to the rate scaling logic to which Eyal adds his own contribution. Avri dives deep in the power code - newer firmware will allow to enable power save in newer scenarios. Johannes made a few clean-ups. I have the regular amount of BT Coex boring stuff. I disable uAPSD since we identified firmware bugs that cause packet loss. One thing that do stand out is the udev event that we now send when the FW asserts. I hope it will allow us to debug the FW more easily." Also included is one last iwlwifi pull for a build breakage fix... For the Atheros bits, Kalle says: "Michal now did some optimisations and was able to improve throughput by 100 Mbps on our MIPS based AP135 platform. Chun-Yeow added some workarounds to be able to better use ad-hoc mode. Ben improved log messages and added support for MSDU chaining. And, as usual, also some smaller fixes." Beyond that... Andrea Merello continues his rtl8180 refactoring, in preparation for a long-awaited rtl8187 driver. We get a new driver (rsi) for the RS9113 chip, from Fariya Fatima. And, of course, we get the usual round of updates for ath9k, brcmfmac, mwifiex, wil6210, etc. as well. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| | * \ \ \ \ Merge branch 'master' of ↵John W. Linville2014-03-21
| | |\ \ \ \ \ | | | | |_|/ / | | | |/| | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
| | | * | | | Merge branch 'for-upstream' of ↵John W. Linville2014-03-20
| | | |\ \ \ \ | | | | | |_|/ | | | | |/| | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
| | | | * | | Bluetooth: Enforce strict Secure Connections Only mode securityMarcel Holtmann2014-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In Secure Connections Only mode, it is required that Secure Connections is used for pairing and that the link key is encrypted with AES-CCM using a P-256 authenticated combination key. If this is not the case, then new connection shall be refused or existing connections shall be dropped. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
| | | | * | | Bluetooth: Fix Pair Device response parameters for pairing failureJohan Hedberg2014-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is possible that pairing fails after we've already received remote identity information. One example of such a situation is when re-encryption using the LTK fails. In this case the hci_conn object has already been updated with the identity address but user space does not yet know about it (since we didn't notify it of the new IRK yet). To ensure user space doesn't get a Pair Device command response with an unknown address always use the same address in the response as was used for the original command. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| | | | * | | Bluetooth: Fix SMP user passkey notification mgmt eventJohan Hedberg2014-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When performing SMP pairing with MITM protection one side needs to enter the passkey while the other side displays to the user what needs to be entered. Nowhere in the SMP specification does it say that the displaying side needs to any kind of confirmation of the passkey, even though a code comment in smp.c implies this. This patch removes the misleading comment and converts the code to use the passkey notification mgmt event instead of the passkey confirmation mgmt event. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| | | | * | | Bluetooth: Increase SMP re-encryption delay to 500msJohan Hedberg2014-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In some cases the current 250ms delay is not enough for the remote to receive the keys, as can be witnessed by the following log: > ACL Data RX: Handle 64 flags 0x02 dlen 21 [hci1] 231.414217 SMP: Signing Information (0x0a) len 16 Signature key: 555bb66b7ab3abc9d5c287c97fe6eb29 < ACL Data TX: Handle 64 flags 0x00 dlen 21 [hci1] 231.414414 SMP: Encryption Information (0x06) len 16 Long term key: 2a7cdc233c9a4b1f3ed31dd9843fea29 < ACL Data TX: Handle 64 flags 0x00 dlen 15 [hci1] 231.414466 SMP: Master Identification (0x07) len 10 EDIV: 0xeccc Rand: 0x322e0ef50bd9308a < ACL Data TX: Handle 64 flags 0x00 dlen 21 [hci1] 231.414505 SMP: Signing Information (0x0a) len 16 Signature key: bbda1b2076e2325aa66fbcdd5388f745 > HCI Event: Number of Completed Packets (0x13) plen 5 [hci1] 231.483130 Num handles: 1 Handle: 64 Count: 2 < HCI Command: LE Start Encryption (0x08|0x0019) plen 28 [hci1] 231.664211 Handle: 64 Random number: 0x5052ad2b75fed54b Encrypted diversifier: 0xb7c2 Long term key: a336ede66711b49a84bde9b41426692e > HCI Event: Command Status (0x0f) plen 4 [hci1] 231.666937 LE Start Encryption (0x08|0x0019) ncmd 1 Status: Success (0x00) > HCI Event: Number of Completed Packets (0x13) plen 5 [hci1] 231.712646 Num handles: 1 Handle: 64 Count: 1 > HCI Event: Disconnect Complete (0x05) plen 4 [hci1] 232.562587 Status: Success (0x00) Handle: 64 Reason: Remote User Terminated Connection (0x13) As can be seen, the last key (Signing Information) is sent at 231.414505 but the completed packets event for it comes only at 231.712646, i.e. roughly 298ms later. To have a better margin of error this patch increases the delay to 500ms. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| | | | * | | Bluetooth: Simplify logic when checking SMP_FLAG_TK_VALIDJohan Hedberg2014-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a trivial coding style simplification by instead of having an extra early return to instead revert the if condition and do the single needed queue_work() call there. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| | | | * | | Bluetooth: Fix MITM flag when initiating SMP pairingJohan Hedberg2014-03-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The pairing process initiated through mgmt sets the conn->auth_type value regardless of BR/EDR or LE pairing. This value will contain the MITM flag if the local IO capability allows it. When sending the SMP pairing request we should check the value and ensure that the MITM bit gets correctly set in the bonding flags. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
| | | | * | | Bluetooth: Fix smp_e byte order to be consistent with SMP specificationJohan Hedberg2014-03-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The SMP specification is written with the assumption that both key information, plaintextData and encryptedData follow the same little endian byte ordering as the rest of SMP. Since the kernel crypto routines expect big endian data the code has had to do various byte swapping tricks to make the behavior as expected, however the swapping has been scattered all around the place. This patch centralizes the byte order swapping into the smp_e function by making its public interface match what the other SMP functions expect as per specification. The benefit is vastly simplified calls to smp_e. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>