litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	Merge branch 'ipv6_route'	David S. Miller	2014-10-24
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Martin KaFai Lau says: ==================== ipv6: Reduce the number of fib6_lookup() calls from ip6_pol_route() This patch set is trying to reduce the number of fib6_lookup() calls from ip6_pol_route(). I have adapted davem's udpflooda and kbench_mod test (https://git.kernel.org/pub/scm/linux/kernel/git/davem/net_test_tools.git) to support IPv6 and here is the result: Before: [root]# for i in $(seq 1 3); do time ./udpflood -l 20000000 -c 250 2401:face:face:face::2; done real 0m34.190s user 0m3.047s sys 0m31.108s real 0m34.635s user 0m3.125s sys 0m31.475s real 0m34.517s user 0m3.034s sys 0m31.449s [root]# insmod ip6_route_kbench.ko oif=2 src=2401:face:face:face::1 dst=2401:face:face:face::2 [ 660.160976] ip6_route_kbench: ip6_route_output tdiff: 933 [ 660.207261] ip6_route_kbench: ip6_route_output tdiff: 988 [ 660.253492] ip6_route_kbench: ip6_route_output tdiff: 896 [ 660.298862] ip6_route_kbench: ip6_route_output tdiff: 898 After: [root]# for i in $(seq 1 3); do time ./udpflood -l 20000000 -c 250 2401:face:face:face::2; done real 0m32.695s user 0m2.925s sys 0m29.737s real 0m32.636s user 0m3.007s sys 0m29.596s real 0m32.797s user 0m2.866s sys 0m29.898s [root]# insmod ip6_route_kbench.ko oif=2 src=2401:face:face:face::1 dst=2401:face:face:face::2 [ 881.220793] ip6_route_kbench: ip6_route_output tdiff: 684 [ 881.253477] ip6_route_kbench: ip6_route_output tdiff: 640 [ 881.286867] ip6_route_kbench: ip6_route_output tdiff: 630 [ 881.320749] ip6_route_kbench: ip6_route_output tdiff: 653 /**************************** udpflood.c ***************************/ / It is an adaptation of the Eric Dumazet's and David Miller's * udpflood tool, by adding IPv6 support. / typedef uint32_t u32; static int debug =3D 0; / Allow -fstrict-aliasing / typedef union sa_u { struct sockaddr_storage a46; struct sockaddr_in a4; struct sockaddr_in6 a6; } sa_u; static int usage(void) { printf("usage: udpflood [ -l count ] [ -m message_size ] [ -c num_ip_addrs= ] IP_ADDRESS\n"); return -1; } static u32 get_last32h(const sa_u sa) { if (sa->a46.ss_family =3D=3D PF_INET) return ntohl(sa->a4.sin_addr.s_addr); else return ntohl(sa->a6.sin6_addr.s6_addr32[3]); } static void set_last32h(sa_u sa, u32 last32h) { if (sa->a46.ss_family =3D=3D PF_INET) sa->a4.sin_addr.s_addr =3D htonl(last32h); else sa->a6.sin6_addr.s6_addr32[3] =3D htonl(last32h); } static void print_saddr(const sa_u sa, const char msg) { char buf[64]; if (!debug) return; switch (sa->a46.ss_family) { case PF_INET: inet_ntop(PF_INET, &(sa->a4.sin_addr.s_addr), buf, sizeof(buf)); break; case PF_INET6: inet_ntop(PF_INET6, &(sa->a6.sin6_addr), buf, sizeof(buf)); break; } printf("%s: %s\n", msg, buf); } static int send_packets(const sa_u sa, size_t num_addrs, int count, int ms= g_sz) { char msg =3D malloc(msg_sz); sa_u saddr; u32 start_addr32h, end_addr32h, cur_addr32h; int fd, i, err; if (!msg) return -ENOMEM; memset(msg, 0, msg_sz); memcpy(&saddr, sa, sizeof(saddr)); cur_addr32h =3D start_addr32h =3D get_last32h(&saddr); end_addr32h =3D start_addr32h + num_addrs; fd =3D socket(saddr.a46.ss_family, SOCK_DGRAM, 0); if (fd < 0) { perror("socket"); err =3D fd; goto out_nofd; } / connect to avoid the kernel spending time in figuring * out the source address (i.e pin the src address) / err =3D connect(fd, (struct sockaddr ) &saddr, sizeof(saddr)); if (err < 0) { perror("connect"); goto out; } print_saddr(&saddr, "start_addr"); for (i =3D 0; i < count; i++) { print_saddr(&saddr, "sendto"); err =3D sendto(fd, msg, msg_sz, 0, (struct sockaddr )&saddr, sizeof(saddr)); if (err < 0) { perror("sendto"); goto out; } if (++cur_addr32h >=3D end_addr32h) cur_addr32h =3D start_addr32h; set_last32h(&saddr, cur_addr32h); } err =3D 0; out: close(fd); out_nofd: free(msg); return err; } int main(int argc, char argv, char envp) { int port, msg_sz, count, num_addrs, ret; sa_u start_addr; port =3D 6000; msg_sz =3D 32; count =3D 10000000; num_addrs =3D 1; while ((ret =3D getopt(argc, argv, "dl:s:p:c:")) >=3D 0) { switch (ret) { case 'l': sscanf(optarg, "%d", &count); break; case 's': sscanf(optarg, "%d", &msg_sz); break; case 'p': sscanf(optarg, "%d", &port); break; case 'c': sscanf(optarg, "%d", &num_addrs); break; case 'd': debug =3D 1; break; case '?': return usage(); } } if (num_addrs < 1) return usage(); if (!argv[optind]) return usage(); start_addr.a4.sin_port =3D htons(port); if (inet_pton(PF_INET, argv[optind], &start_addr.a4.sin_addr)) start_addr.a46.ss_family =3D PF_INET; else if (inet_pton(PF_INET6, argv[optind], &start_addr.a6.sin6_addr.s6_add= r)) start_addr.a46.ss_family =3D PF_INET6; else return usage(); return send_packets(&start_addr, num_addrs, count, msg_sz); } /*************** ip6_route_kbench_mod.c ***************/ / We can't just use "get_cycles()" as on some platforms, such * as sparc64, that gives system cycles rather than cpu clock * cycles. / static inline unsigned long long get_tick(void) { unsigned long long t; __asm__ __volatile__("rd %%tick, %0" : "=r" (t)); return t; } static inline unsigned long long get_tick(void) { unsigned long long t; rdtscll(t); return t; } static inline unsigned long long get_tick(void) { return get_cycles(); } static int flow_oif = DEFAULT_OIF; static int flow_iif = DEFAULT_IIF; static u32 flow_mark = DEFAULT_MARK; static struct in6_addr flow_dst_ip_addr; static struct in6_addr flow_src_ip_addr; static int flow_tos = DEFAULT_TOS; static char dst_string[64]; static char src_string[64]; module_param_string(dst, dst_string, sizeof(dst_string), 0); module_param_string(src, src_string, sizeof(src_string), 0); static int __init flow_setup(void) { if (dst_string[0] && !in6_pton(dst_string, -1, &flow_dst_ip_addr.s6_addr[0], -1, NULL)) { pr_info("cannot parse \"%s\"\n", dst_string); return -1; } if (src_string[0] && !in6_pton(src_string, -1, &flow_src_ip_addr.s6_addr[0], -1, NULL)) { pr_info("cannot parse \"%s\"\n", dst_string); return -1; } return 0; } module_param_named(oif, flow_oif, int, 0); module_param_named(iif, flow_iif, int, 0); module_param_named(mark, flow_mark, uint, 0); module_param_named(tos, flow_tos, int, 0); static int warmup_count = DEFAULT_WARMUP_COUNT; module_param_named(count, warmup_count, int, 0); static void flow_init(struct flowi6 fl6) { memset(fl6, 0, sizeof(fl6)); fl6->flowi6_proto = IPPROTO_ICMPV6; fl6->flowi6_oif = flow_oif; fl6->flowi6_iif = flow_iif; fl6->flowi6_mark = flow_mark; fl6->flowi6_tos = flow_tos; fl6->daddr = flow_dst_ip_addr; fl6->saddr = flow_src_ip_addr; } static struct sk_buff fake_skb_get(void) { struct ipv6hdr hdr; struct sk_buff skb; skb = alloc_skb(4096, GFP_KERNEL); if (!skb) { pr_info("Cannot alloc SKB for test\n"); return NULL; } skb->dev = __dev_get_by_index(&init_net, flow_iif); if (skb->dev == NULL) { pr_info("Input device (%d) does not exist\n", flow_iif); goto err; } skb_reset_mac_header(skb); skb_reset_network_header(skb); skb_reserve(skb, MAX_HEADER + sizeof(struct ipv6hdr)); hdr = ipv6_hdr(skb); hdr->priority = 0; hdr->version = 6; memset(hdr->flow_lbl, 0, sizeof(hdr->flow_lbl)); hdr->payload_len = htons(sizeof(struct icmp6hdr)); hdr->nexthdr = IPPROTO_ICMPV6; hdr->saddr = flow_src_ip_addr; hdr->daddr = flow_dst_ip_addr; skb->protocol = htons(ETH_P_IPV6); skb->mark = flow_mark; return skb; err: kfree_skb(skb); return NULL; } static void do_full_output_lookup_bench(void) { unsigned long long t1, t2, tdiff; struct rt6_info rt; struct flowi6 fl6; int i; rt = NULL; for (i = 0; i < warmup_count; i++) { flow_init(&fl6); rt = (struct rt6_info )ip6_route_output(&init_net, NULL, &fl6); if (IS_ERR(rt)) break; ip6_rt_put(rt); } if (IS_ERR(rt)) { pr_info("ip_route_output_key: err=%ld\n", PTR_ERR(rt)); return; } flow_init(&fl6); t1 = get_tick(); rt = (struct rt6_info )ip6_route_output(&init_net, NULL, &fl6); t2 = get_tick(); if (!IS_ERR(rt)) ip6_rt_put(rt); tdiff = t2 - t1; pr_info("ip6_route_output tdiff: %llu\n", tdiff); } static void do_full_input_lookup_bench(void) { unsigned long long t1, t2, tdiff; struct sk_buff skb; struct rt6_info rt; int err, i; skb = fake_skb_get(); if (skb == NULL) goto out_free; err = 0; local_bh_disable(); for (i = 0; i < warmup_count; i++) { ip6_route_input(skb); rt = (struct rt6_info )skb_dst(skb); err = (!rt \|\| rt == init_net.ipv6.ip6_null_entry); skb_dst_drop(skb); if (err) break; } local_bh_enable(); if (err) { pr_info("Input route lookup fails\n"); goto out_free; } local_bh_disable(); t1 = get_tick(); ip6_route_input(skb); t2 = get_tick(); local_bh_enable(); rt = (struct rt6_info *)skb_dst(skb); err = (!rt \|\| rt == init_net.ipv6.ip6_null_entry); skb_dst_drop(skb); if (err) { pr_info("Input route lookup fails\n"); goto out_free; } tdiff = t2 - t1; pr_info("ip6_route_input tdiff: %llu\n", tdiff); out_free: kfree_skb(skb); } static void do_full_lookup_bench(void) { if (!flow_iif) do_full_output_lookup_bench(); else do_full_input_lookup_bench(); } static void do_bench(void) { do_full_lookup_bench(); do_full_lookup_bench(); do_full_lookup_bench(); do_full_lookup_bench(); } static int __init kbench_init(void) { if (flow_setup()) return -EINVAL; pr_info("flow [IIF(%d),OIF(%d),MARK(0x%08x),D("IP6_FMT")," "S("IP6_FMT"),TOS(0x%02x)]\n", flow_iif, flow_oif, flow_mark, IP6_PRT(flow_dst_ip_addr), IP6_PRT(flow_src_ip_addr), flow_tos); if (!cpu_has_tsc) { pr_err("X86 TSC is required, but is unavailable.\n"); return -EINVAL; } pr_info("sizeof(struct rt6_info)==%zu\n", sizeof(struct rt6_info)); do_bench(); return -ENODEV; } static void __exit kbench_exit(void) { } module_init(kbench_init); module_exit(kbench_exit); MODULE_LICENSE("GPL"); ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ipv6: Avoid redoing fib6_lookup() with reachable = 0 by saving fn	Martin KaFai Lau	2014-10-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch save the fn before doing rt6_backtrack. Hence, without redo-ing the fib6_lookup(), saved_fn can be used to redo rt6_select() with RT6_LOOKUP_F_REACHABLE off. Some minor changes I think make sense to review as a single patch: * Remove the 'out:' goto label. * Remove the 'reachable' variable. Only use the 'strict' variable instead. After this patch, "failing ip6_ins_rt()" should be the only case that requires a redo of fib6_lookup(). Cc: David Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ipv6: Avoid redoing fib6_lookup() for RTF_CACHE hit case	Martin KaFai Lau	2014-10-24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When there is a RTF_CACHE hit, no need to redo fib6_lookup() with reachable=0. Cc: David Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	ipv6: Remove BACKTRACK macro	Martin KaFai Lau	2014-10-24
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is the prep work to reduce the number of calls to fib6_lookup(). The BACKTRACK macro could be hard-to-read and error-prone due to its side effects (mainly goto). This patch is to: 1. Replace BACKTRACK macro with a function (fib6_backtrack) with the following return values: * If it is backtrack-able, returns next fn for retry. * If it reaches the root, returns NULL. 2. The caller needs to decide if a backtrack is needed (by testing rt == net->ipv6.ip6_null_entry). 3. Rename the goto labels in ip6_pol_route() to make the next few patches easier to read. Cc: David Miller <davem@davemloft.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: Remove trailing whitespace in tcp.h icmp.c syncookies.c	Kenjiro Nakayama	2014-10-24
\| \| \| \| \| \| \|	Remove trailing whitespace in tcp.h icmp.c syncookies.c Signed-off-by: Kenjiro Nakayama <nakayamakenjiro@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	hyperv: Fix the total_data_buflen in send path	Haiyang Zhang	2014-10-22
\| \| \| \| \| \| \| \| \| \| \| \| \|	total_data_buflen is used by netvsc_send() to decide if a packet can be put into send buffer. It should also include the size of RNDIS message before the Ethernet frame. Otherwise, a messge with total size bigger than send_section_size may be copied into the send buffer, and cause data corruption. [Request to include this patch to the Stable branches] Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'amd-xgbe'	David S. Miller	2014-10-22
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Tom Lendacky says: ==================== amd-xgbe: AMD XGBE driver fixes 2014-10-22 The following series of patches includes fixes to the driver. - Properly handle feature changes via ethtool by using correctly sized variables - Perform proper napi packet counting and budget checking This patch series is based on net. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	amd-xgbe: Fix napi Rx budget accounting	Lendacky, Thomas	2014-10-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently the amd-xgbe driver increments the packets processed counter each time a descriptor is processed. Since a packet can be represented by more than one descriptor incrementing the counter in this way is not appropriate. Also, since multiple descriptors cause the budget check to be short circuited, sometimes the returned value from the poll function would be larger than the budget value resulting in a WARN_ONCE being triggered. Update the polling logic to properly account for the number of packets processed and exit when the budget value is reached. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	amd-xgbe: Properly handle feature changes via ethtool	Lendacky, Thomas	2014-10-22
\|/ \| \| \| \| \| \| \| \| \| \| \| \|	The ndo_set_features callback function was improperly using an unsigned int to save the current feature value for features such as NETIF_F_RXCSUM. Since that feature is in the upper 32 bits of a 64 bit variable the result was always 0 making it not possible to actually turn off the hardware RX checksum support. Change the unsigned int type to the netdev_features_t type in order to properly capture the current value and perform the proper operation. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: fec: ptp: fix NULL pointer dereference if ptp_clock is not set	Philipp Zabel	2014-10-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since commit 278d24047891 (net: fec: ptp: Enable PPS output based on ptp clock) fec_enet_interrupt calls fec_ptp_check_pps_event unconditionally, which calls into ptp_clock_event. If fep->ptp_clock is NULL, ptp_clock_event tries to dereference the NULL pointer. Since on i.MX53 fep->bufdesc_ex is not set, fec_ptp_init is never called, and fep->ptp_clock is NULL, which reliably causes a kernel panic. This patch adds a check for fep->ptp_clock == NULL in fec_enet_interrupt. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: fix saving TX flow hash in sock for outgoing connections	Sathya Perla	2014-10-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The commit "net: Save TX flow hash in sock and set in skbuf on xmit" introduced the inet_set_txhash() and ip6_set_txhash() routines to calculate and record flow hash(sk_txhash) in the socket structure. sk_txhash is used to set skb->hash which is used to spread flows across multiple TXQs. But, the above routines are invoked before the source port of the connection is created. Because of this all outgoing connections that just differ in the source port get hashed into the same TXQ. This patch fixes this problem for IPv4/6 by invoking the the above routines after the source port is available for the socket. Fixes: b73c3d0e4("net: Save TX flow hash in sock and set in skbuf on xmit") Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	xfrm6: fix a potential use after free in xfrm6_policy.c	Li RongQing	2014-10-22
\| \| \| \| \| \| \| \|	pskb_may_pull() maybe change skb->data and make nh and exthdr pointer oboslete, so recompute the nd and exthdr Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: fs_enet: set back promiscuity mode after restart	LEROY Christophe	2014-10-22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	After interface restart (eg: after link disconnection/reconnection), the bridge function doesn't work anymore. This is due to the promiscuous mode being cleared by the restart. The mac-fcc already includes code to set the promiscuous mode back during the restart. This patch adds the same handling to mac-fec and mac-scc. Tested with bridge function on MPC885 with FEC. Reported-by: Germain Montoies <germain.montoies@c-s.fr> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: tso: fix unaligned access to crafted TCP header in helper API	Karl Beldan	2014-10-22
\| \| \| \| \| \| \| \| \| \| \| \|	The crafted header start address is from a driver supplied buffer, which one can reasonably expect to be aligned on a 4-bytes boundary. However ATM the TSO helper API is only used by ethernet drivers and the tcp header will then be aligned to a 2-bytes only boundary from the header start address. Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com> Cc: Ezequiel Garcia <ezequiel.garcia@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	sfc: remove incorrect EFX_BUG_ON_PARANOID check	Jon Cooper	2014-10-22
\| \| \| \| \| \| \| \| \| \|	write_count and insert_count can wrap around, making > check invalid. Fixes: 70b33fb0ddec827cbbd14cdc664fc27b2ef4a6b6 ("sfc: add support for skb->xmit_more"). Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	net: sched: initialize bstats syncp	Sabrina Dubroca	2014-10-21
\| \| \| \| \| \| \| \| \|	Use netdev_alloc_pcpu_stats to allocate percpu stats and initialize syncp. Fixes: 22e0f8b9322c "net: sched: make bstats per cpu and estimator RCU safe" Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Cong Wang <cwang@twopensource.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	bpf: fix bug in eBPF verifier	Alexei Starovoitov	2014-10-21
\| \| \| \| \| \| \| \| \| \| \| \|	while comparing for verifier state equivalency the comparison was missing a check for uninitialized register. Make sure it does so and add a testcase. Fixes: f1bca824dabb ("bpf: add search pruning optimization to verifier") Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
*	netlink: Re-add locking to netlink_lookup() and seq walker	Thomas Graf	2014-10-21
\| \| \| \| \| \| \| \| \| \| \| \| \|	The synchronize_rcu() in netlink_release() introduces unacceptable latency. Reintroduce minimal lookup so we can drop the synchronize_rcu() until socket destruction has been RCUfied. Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com> Reported-and-tested-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: fix lockdep warning when intra-node messages are delivered	Ying Xue	2014-10-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When running tipcTC&tipcTS test suite, below lockdep unsafe locking scenario is reported: [ 1109.997854] [ 1109.997988] ================================= [ 1109.998290] [ INFO: inconsistent lock state ] [ 1109.998575] 3.17.0-rc1+ #113 Not tainted [ 1109.998762] --------------------------------- [ 1109.998762] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. [ 1109.998762] swapper/7/0 [HC0[0]:SC1[1]:HE1:SE0] takes: [ 1109.998762] (slock-AF_TIPC){+.?...}, at: [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] {SOFTIRQ-ON-W} state was registered at: [ 1109.998762] [<ffffffff810a4770>] __lock_acquire+0x6a0/0x1d80 [ 1109.998762] [<ffffffff810a6555>] lock_acquire+0x95/0x1e0 [ 1109.998762] [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80 [ 1109.998762] [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] [<ffffffffa0004fe8>] tipc_link_xmit+0xa8/0xc0 [tipc] [ 1109.998762] [<ffffffffa000ec6f>] tipc_sendmsg+0x15f/0x550 [tipc] [ 1109.998762] [<ffffffffa000f165>] tipc_connect+0x105/0x140 [tipc] [ 1109.998762] [<ffffffff817676ee>] SYSC_connect+0xae/0xc0 [ 1109.998762] [<ffffffff81767b7e>] SyS_connect+0xe/0x10 [ 1109.998762] [<ffffffff817a9788>] compat_SyS_socketcall+0xb8/0x200 [ 1109.998762] [<ffffffff81a306e5>] sysenter_dispatch+0x7/0x1f [ 1109.998762] irq event stamp: 241060 [ 1109.998762] hardirqs last enabled at (241060): [<ffffffff8105a4ad>] __local_bh_enable_ip+0x6d/0xd0 [ 1109.998762] hardirqs last disabled at (241059): [<ffffffff8105a46f>] __local_bh_enable_ip+0x2f/0xd0 [ 1109.998762] softirqs last enabled at (241020): [<ffffffff81059a52>] _local_bh_enable+0x22/0x50 [ 1109.998762] softirqs last disabled at (241021): [<ffffffff8105a626>] irq_exit+0x96/0xc0 [ 1109.998762] [ 1109.998762] other info that might help us debug this: [ 1109.998762] Possible unsafe locking scenario: [ 1109.998762] [ 1109.998762] CPU0 [ 1109.998762] ---- [ 1109.998762] lock(slock-AF_TIPC); [ 1109.998762] <Interrupt> [ 1109.998762] lock(slock-AF_TIPC); [ 1109.998762] [ 1109.998762] * DEADLOCK * [ 1109.998762] [ 1109.998762] 2 locks held by swapper/7/0: [ 1109.998762] #0: (rcu_read_lock){......}, at: [<ffffffff81782dc9>] __netif_receive_skb_core+0x69/0xb70 [ 1109.998762] #1: (rcu_read_lock){......}, at: [<ffffffffa0001c90>] tipc_l2_rcv_msg+0x40/0x260 [tipc] [ 1109.998762] [ 1109.998762] stack backtrace: [ 1109.998762] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.17.0-rc1+ #113 [ 1109.998762] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 1109.998762] ffffffff82745830 ffff880016c03828 ffffffff81a209eb 0000000000000007 [ 1109.998762] ffff880017b3cac0 ffff880016c03888 ffffffff81a1c5ef 0000000000000001 [ 1109.998762] ffff880000000001 ffff880000000000 ffffffff81012d4f 0000000000000000 [ 1109.998762] Call Trace: [ 1109.998762] <IRQ> [<ffffffff81a209eb>] dump_stack+0x4e/0x68 [ 1109.998762] [<ffffffff81a1c5ef>] print_usage_bug+0x1f1/0x202 [ 1109.998762] [<ffffffff81012d4f>] ? save_stack_trace+0x2f/0x50 [ 1109.998762] [<ffffffff810a406c>] mark_lock+0x28c/0x2f0 [ 1109.998762] [<ffffffff810a3440>] ? print_irq_inversion_bug.part.46+0x1f0/0x1f0 [ 1109.998762] [<ffffffff810a467d>] __lock_acquire+0x5ad/0x1d80 [ 1109.998762] [<ffffffff810a70dd>] ? trace_hardirqs_on+0xd/0x10 [ 1109.998762] [<ffffffff8108ace8>] ? sched_clock_cpu+0x98/0xc0 [ 1109.998762] [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30 [ 1109.998762] [<ffffffff810a10dc>] ? lock_release_holdtime.part.29+0x1c/0x1a0 [ 1109.998762] [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90 [ 1109.998762] [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc] [ 1109.998762] [<ffffffff810a6555>] lock_acquire+0x95/0x1e0 [ 1109.998762] [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] [<ffffffff810a6fb6>] ? trace_hardirqs_on_caller+0xa6/0x1c0 [ 1109.998762] [<ffffffff81a2d1ce>] _raw_spin_lock+0x3e/0x80 [ 1109.998762] [<ffffffffa0011969>] ? tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] [<ffffffffa000dec0>] ? tipc_sk_get+0x60/0x80 [tipc] [ 1109.998762] [<ffffffffa0011969>] tipc_sk_rcv+0x49/0x2b0 [tipc] [ 1109.998762] [<ffffffffa00076bd>] tipc_rcv+0x5ed/0x960 [tipc] [ 1109.998762] [<ffffffffa0001d1c>] tipc_l2_rcv_msg+0xcc/0x260 [tipc] [ 1109.998762] [<ffffffffa0001c90>] ? tipc_l2_rcv_msg+0x40/0x260 [tipc] [ 1109.998762] [<ffffffff81783345>] __netif_receive_skb_core+0x5e5/0xb70 [ 1109.998762] [<ffffffff81782dc9>] ? __netif_receive_skb_core+0x69/0xb70 [ 1109.998762] [<ffffffff81784eb9>] ? dev_gro_receive+0x259/0x4e0 [ 1109.998762] [<ffffffff817838f6>] __netif_receive_skb+0x26/0x70 [ 1109.998762] [<ffffffff81783acd>] netif_receive_skb_internal+0x2d/0x1f0 [ 1109.998762] [<ffffffff81785518>] napi_gro_receive+0xd8/0x240 [ 1109.998762] [<ffffffff815bf854>] e1000_clean_rx_irq+0x2c4/0x530 [ 1109.998762] [<ffffffff815c1a46>] e1000_clean+0x266/0x9c0 [ 1109.998762] [<ffffffff8108ad2b>] ? local_clock+0x1b/0x30 [ 1109.998762] [<ffffffff8108aa05>] ? sched_clock_local+0x25/0x90 [ 1109.998762] [<ffffffff817842b1>] net_rx_action+0x141/0x310 [ 1109.998762] [<ffffffff810bd710>] ? handle_fasteoi_irq+0xe0/0x150 [ 1109.998762] [<ffffffff81059fa6>] __do_softirq+0x116/0x4d0 [ 1109.998762] [<ffffffff8105a626>] irq_exit+0x96/0xc0 [ 1109.998762] [<ffffffff81a30d07>] do_IRQ+0x67/0x110 [ 1109.998762] [<ffffffff81a2ee2f>] common_interrupt+0x6f/0x6f [ 1109.998762] <EOI> [<ffffffff8100d2b7>] ? default_idle+0x37/0x250 [ 1109.998762] [<ffffffff8100d2b5>] ? default_idle+0x35/0x250 [ 1109.998762] [<ffffffff8100dd1f>] arch_cpu_idle+0xf/0x20 [ 1109.998762] [<ffffffff810999fd>] cpu_startup_entry+0x27d/0x4d0 [ 1109.998762] [<ffffffff81034c78>] start_secondary+0x188/0x1f0 When intra-node messages are delivered from one process to another process, tipc_link_xmit() doesn't disable BH before it directly calls tipc_sk_rcv() on process context to forward messages to destination socket. Meanwhile, if messages delivered by remote node arrive at the node and their destinations are also the same socket, tipc_sk_rcv() running on process context might be preempted by tipc_sk_rcv() running BH context. As a result, the latter cannot obtain the socket lock as the lock was obtained by the former, however, the former has no chance to be run as the latter is owning the CPU now, so headlock happens. To avoid it, BH should be always disabled in tipc_sk_rcv(). Signed-off-by: Ying Xue <ying.xue@windriver.com> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	tipc: fix a potential deadlock	Ying Xue	2014-10-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Locking dependency detected below possible unsafe locking scenario: CPU0 CPU1 T0: tipc_named_rcv() tipc_rcv() T1: [grab nametble write lock]* [grab node lock]* T2: tipc_update_nametbl() tipc_node_link_up() T3: tipc_nodesub_subscribe() tipc_nametbl_publish() T4: [grab node lock]* [grab nametble write lock]* The opposite order of holding nametbl write lock and node lock on above two different paths may result in a deadlock. If we move the the updating of the name table after link state named out of node lock, the reverse order of holding locks will be eliminated, and as a result, the deadlock risk. Signed-off-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'enic'	David S. Miller	2014-10-21
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Govindarajulu Varadarajan says: ==================== enic: Bug fixes This series fixes the following problem. Please apply this to net. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	enic: Do not call napi_disable when preemption is disabled.	Govindarajulu Varadarajan	2014-10-21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In enic_stop, we disable preemption using local_bh_disable(). We disable preemption to wait for busy_poll to finish. napi_disable should not be called here as it might sleep. Moving napi_disable() call out side of local_bh_disable. BUG: sleeping function called from invalid context at include/linux/netdevice.h:477 in_atomic(): 1, irqs_disabled(): 0, pid: 443, name: ifconfig INFO: lockdep is turned off. Preemption disabled at:[<ffffffffa029c5c4>] enic_rfs_flw_tbl_free+0x34/0xd0 [enic] CPU: 31 PID: 443 Comm: ifconfig Not tainted 3.17.0-netnext-05504-g59f35b8 #268 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 ffff8800dac10000 ffff88020b8dfcb8 ffffffff8148a57c 0000000000000000 ffff88020b8dfcd0 ffffffff8107e253 ffff8800dac12a40 ffff88020b8dfd10 ffffffffa029305b ffff88020b8dfd48 ffff8800dac10000 ffff88020b8dfd48 Call Trace: [<ffffffff8148a57c>] dump_stack+0x4e/0x7a [<ffffffff8107e253>] __might_sleep+0x123/0x1a0 [<ffffffffa029305b>] enic_stop+0xdb/0x4d0 [enic] [<ffffffff8138ed7d>] __dev_close_many+0x9d/0xf0 [<ffffffff8138ef81>] __dev_close+0x31/0x50 [<ffffffff813974a8>] __dev_change_flags+0x98/0x160 [<ffffffff81397594>] dev_change_flags+0x24/0x60 [<ffffffff814085fd>] devinet_ioctl+0x63d/0x710 [<ffffffff81139c16>] ? might_fault+0x56/0xc0 [<ffffffff81409ef5>] inet_ioctl+0x65/0x90 [<ffffffff813768e0>] sock_do_ioctl+0x20/0x50 [<ffffffff81376ebb>] sock_ioctl+0x20b/0x2e0 [<ffffffff81197250>] do_vfs_ioctl+0x2e0/0x500 [<ffffffff81492619>] ? sysret_check+0x22/0x5d [<ffffffff81285f23>] ? __this_cpu_preempt_check+0x13/0x20 [<ffffffff8109fe19>] ? trace_hardirqs_on_caller+0x119/0x270 [<ffffffff811974ac>] SyS_ioctl+0x3c/0x80 [<ffffffff814925ed>] system_call_fastpath+0x1a/0x1f Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	enic: fix possible deadlock in enic_stop/ enic_rfs_flw_tbl_free	Govindarajulu Varadarajan	2014-10-21
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following warning is shown when spinlock debug is enabled. This occurs when enic_flow_may_expire timer function is running and enic_stop is called on same CPU. Fix this by using spink_lock_bh(). ================================= [ INFO: inconsistent lock state ] 3.17.0-netnext-05504-g59f35b8 #268 Not tainted --------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. ifconfig/443 [HC0[0]:SC0[0]:HE1:SE1] takes: (&(&enic->rfs_h.lock)->rlock){+.?...}, at: enic_rfs_flw_tbl_free+0x34/0xd0 [enic] {IN-SOFTIRQ-W} state was registered at: [<ffffffff810a25af>] __lock_acquire+0x83f/0x21c0 [<ffffffff810a45f2>] lock_acquire+0xa2/0xd0 [<ffffffff814913fc>] _raw_spin_lock+0x3c/0x80 [<ffffffffa029c3d5>] enic_flow_may_expire+0x25/0x130[enic] [<ffffffff810bcd07>] call_timer_fn+0x77/0x100 [<ffffffff810bd8e3>] run_timer_softirq+0x1e3/0x270 [<ffffffff8105f9ae>] __do_softirq+0x14e/0x280 [<ffffffff8105fdae>] irq_exit+0x8e/0xb0 [<ffffffff8103da0f>] smp_apic_timer_interrupt+0x3f/0x50 [<ffffffff81493742>] apic_timer_interrupt+0x72/0x80 [<ffffffff81018143>] default_idle+0x13/0x20 [<ffffffff81018a6a>] arch_cpu_idle+0xa/0x10 [<ffffffff81097676>] cpu_startup_entry+0x2c6/0x330 [<ffffffff8103b7ad>] start_secondary+0x21d/0x290 irq event stamp: 2997 hardirqs last enabled at (2997): [<ffffffff81491865>] _raw_spin_unlock_irqrestore+0x65/0x90 hardirqs last disabled at (2996): [<ffffffff814915e6>] _raw_spin_lock_irqsave+0x26/0x90 softirqs last enabled at (2968): [<ffffffff813b57a3>] dev_deactivate_many+0x213/0x260 softirqs last disabled at (2966): [<ffffffff813b5783>] dev_deactivate_many+0x1f3/0x260 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&enic->rfs_h.lock)->rlock); <Interrupt> lock(&(&enic->rfs_h.lock)->rlock); * DEADLOCK * Reported-by: Jan Stancek <jstancek@redhat.com> Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge branch 'gso_encap_fixes'	David S. Miller	2014-10-20
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Florian Westphal says: ==================== net: minor gso encapsulation fixes The following series fixes a minor bug in the gso segmentation handlers when encapsulation offload is used. Theoretically this could cause kernel panic when the stack tries to software-segment such a GRE offload packet, but it looks like there is only one affected call site (tbf scheduler) and it handles NULL return value. I've included a followup patch to add IS_ERR_OR_NULL checks where needed. While looking into this, I also found that size computation of the individual segments is incorrect if skb->encapsulation is set. Please see individual patches for delta vs. v1. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: core: handle encapsulation offloads when computing segment lengths	Florian Westphal	2014-10-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	if ->encapsulation is set we have to use inner_tcp_hdrlen and add the size of the inner network headers too. This is 'mostly harmless'; tbf might send skb that is slightly over quota or drop skb even if it would have fit. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: make skb_gso_segment error handling more robust	Florian Westphal	2014-10-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	skb_gso_segment has three possible return values: 1. a pointer to the first segmented skb 2. an errno value (IS_ERR()) 3. NULL. This can happen when GSO is used for header verification. However, several callers currently test IS_ERR instead of IS_ERR_OR_NULL and would oops when NULL is returned. Note that these call sites should never actually see such a NULL return value; all callers mask out the GSO bits in the feature argument. However, there have been issues with some protocol handlers erronously not respecting the specified feature mask in some cases. It is preferable to get 'have to turn off hw offloading, else slow' reports rather than 'kernel crashes'. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	net: gso: use feature flag argument in all protocol gso handlers	Florian Westphal	2014-10-20
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	skb_gso_segment() has a 'features' argument representing offload features available to the output path. A few handlers, e.g. GRE, instead re-fetch the features of skb->dev and use those instead of the provided ones when handing encapsulation/tunnels. Depending on dev->hw_enc_features of the output device skb_gso_segment() can then return NULL even when the caller has disabled all GSO feature bits, as segmentation of inner header thinks device will take care of segmentation. This e.g. affects the tbf scheduler, which will silently drop GRE-encap GSO skbs that did not fit the remaining token quota as the segmentation does not work when device supports corresponding hw offload capabilities. Cc: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
*	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf	David S. Miller	2014-10-20
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pablo Neira Ayuso says: ==================== netfilter fixes for net The following patchset contains netfilter fixes for your net tree, they are: 1) Fix missing MODULE_LICENSE() in the new nf_reject_ipv{4,6} modules. 2) Restrict nat and masq expressions to the nat chain type. Otherwise, users may crash their kernel if they attach a nat/masq rule to a non nat chain. 3) Fix hook validation in nft_compat when non-base chains are used. Basically, initialize hook_mask to zero. 4) Make sure you use match/targets in nft_compat from the right chain type. The existing validation relies on the table name which can be avoided by 5) Better netlink attribute validation in nft_nat. This expression has to reject the configuration when no address and proto configurations are specified. 6) Interpret NFTA_NAT_REG__MAX if only if NFTA_NAT_REG__MIN is set. Yet another sanity check to reject incorrect configurations from userspace. 7) Conditional NAT attribute dumping depending on the existing configuration. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	netfilter: nft_nat: dump attributes if they are set	Pablo Neira Ayuso	2014-10-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Dump NFTA_NAT_REG_ADDR_MIN if this is non-zero. Same thing with NFTA_NAT_REG_PROTO_MIN. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
\| *	netfilter: nft_nat: NFTA_NAT_REG_ADDR_MAX depends on NFTA_NAT_REG_ADDR_MIN	Pablo Neira Ayuso	2014-10-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Interpret NFTA_NAT_REG_ADDR_MAX if NFTA_NAT_REG_ADDR_MIN is present, otherwise, skip it. Same thing with NFTA_NAT_REG_PROTO_MAX. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
\| *	netfilter: nft_nat: insufficient attribute validation	Pablo Neira Ayuso	2014-10-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have to validate that we at least get an NFTA_NAT_REG_ADDR_MIN or NFTA_NFT_REG_PROTO_MIN attribute. Reject the configuration if none of them are present. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
\| *	netfilter: nft_compat: validate chain type in match/target	Pablo Neira Ayuso	2014-10-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have to validate the real chain type to ensure that matches/targets are not used out from their scope (eg. MASQUERADE in nat chain type). The existing validation relies on the table name, but this is not sufficient since userspace can fool us by using the appropriate table name with a different chain type. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
\| *	netfilter: nft_compat: fix hook validation for non-base chains	Pablo Neira Ayuso	2014-10-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Set hook_mask to zero for non-base chains, otherwise people may hit bogus errors from the xt_check_target() and xt_check_match() when validating the uninitialized hook_mask. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
\| *	netfilter: nf_tables: restrict nat/masq expressions to nat chain type	Pablo Neira Ayuso	2014-10-13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This adds the missing validation code to avoid the use of nat/masq from non-nat chains. The validation assumes two possible configuration scenarios: 1) Use of nat from base chain that is not of nat type. Reject this configuration from the nft__init() path of the expression. 2) Use of nat from non-base chain. In this case, we have to wait until the non-base chain is referenced by at least one base chain via jump/goto. This is resolved from the nft__validate() path which is called from nf_tables_check_loops(). The user gets an -EOPNOTSUPP in both cases. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
\| *	netfilter: missing module license in the nf_reject_ipvX modules	Pablo Neira Ayuso	2014-10-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	[ 23.545204] nf_reject_ipv4: module license 'unspecified' taints kernel. Fixes: c8d7b98 ("netfilter: move nf_send_resetX() code to nf_reject_ipvX modules") Reported-by: Dave Young <dyoung@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* \|	ax88179_178a: fix bonding failure	Ian Morgan	2014-10-20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The following patch fixes a bug which causes the ax88179_178a driver to be incapable of being added to a bond. When I brought up the issue with the bonding maintainers, they indicated that the real problem was with the NIC driver which must return zero for success (of setting the MAC address). I see that several other NIC drivers follow that pattern by either simply always returing zero, or by passing through a negative (error) result while rewriting any positive return code to zero. With that same philisophy applied to the ax88179_178a driver, it allows it to work correctly with the bonding driver. I believe this is suitable for queuing in -stable, as it's a small, simple, and obvious fix that corrects a defect with no other known workaround. This patch is against vanilla 3.17(.0). Signed-off-by: Ian Morgan <imorgan@primordial.ca> drivers/net/usb/ax88179_178a.c \| 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge tag 'ntb-3.18' of git://github.com/jonmason/ntb	Linus Torvalds	2014-10-19
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull ntb (non-transparent bridge) updates from Jon Mason: "Add support for Haswell NTB split BARs, a debugfs entry for basic debugging info, and some code clean-ups" * tag 'ntb-3.18' of git://github.com/jonmason/ntb: ntb: Adding split BAR support for Haswell platforms ntb: use errata flag set via DID to implement workaround ntb: conslidate reading of PPD to move platform detection earlier ntb: move platform detection to separate function NTB: debugfs device entry
\| * \|	ntb: Adding split BAR support for Haswell platforms	Dave Jiang	2014-10-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On the Haswell platform, a split BAR option to allow creation of 2 32bit BARs (4 and 5) from the 64bit BAR 4. Adding support for this new option. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
\| * \|	ntb: use errata flag set via DID to implement workaround	Dave Jiang	2014-10-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of using a module parameter, we should detect the errata via PCI DID and then set an appropriate flag. This will be used for additional errata later on. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
\| * \|	ntb: conslidate reading of PPD to move platform detection earlier	Dave Jiang	2014-10-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To simplify some of the platform detection code. Move the platform detection to a function to be called earlier. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
\| * \|	ntb: move platform detection to separate function	Dave Jiang	2014-10-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Move the platform detection function to separate functions to allow easier maintenence. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
\| * \|	NTB: debugfs device entry	Jon Mason	2014-10-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Create a debugfs entry for the NTB device to log the basic device info, as well as display the error count on a number of registers. Signed-off-by: Jon Mason <jon.mason@intel.com>
* \| \|	Merge branch 'i2c/for-next' of ↵	Linus Torvalds	2014-10-19
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c updates from Wolfram Sang: "Highlights from the I2C subsystem for 3.18: - new drivers for Axxia AM55xx, and Hisilicon hix5hd2 SoC. - designware driver gained AMD support, exynos gained exynos7 support The rest is usual driver stuff. Hopefully no lowlights this time" * 'i2c/for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: i801: Add Device IDs for Intel Sunrise Point PCH i2c: hix5hd2: add i2c controller driver i2c-imx: Disable the clock on probe failure i2c: designware: Add support for AMD I2C controller i2c: designware: Rework probe() to get clock a bit later i2c: designware: Default to fast mode in case of ACPI i2c: axxia: Add I2C driver for AXM55xx i2c: exynos: add support for HSI2C module on Exynos7 i2c: mxs: detect No Slave Ack on SELECT in PIO mode i2c: cros_ec: Remove EC_I2C_FLAG_10BIT i2c: cros-ec-tunnel: Add of match table i2c: rcar: remove sign-compare flaw i2c: ismt: Use minimum descriptor size i2c: imx: Add arbitration lost check i2c: rk3x: Remove unlikely() annotations i2c: rcar: check for no IRQ in rcar_i2c_irq() i2c: rcar: make rcar_i2c_prepare_msg() void i2c: rcar: simplify check for last message i2c: designware: add support of platform data to set I2C mode i2c: designware: add support of I2C standard mode
\| * \| \|	i2c: i801: Add Device IDs for Intel Sunrise Point PCH	james.d.ralston@intel.com	2014-10-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds the I2C/SMBus Device IDs for the Intel Sunrise Point PCH. Signed-off-by: James Ralston <james.d.ralston@intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
\| * \| \|	i2c: hix5hd2: add i2c controller driver	Wei Yan	2014-10-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I2C drivers for hix5hd2 soc series, including following chipset Hi3716CV200, Hi3719CV100, Hi3718CV100, Hi3719MV100, Hi3718MV100. Signed-off-by: Wei Yan <sledge.yanwei@huawei.com> Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org> [wsa: folded dt docs into this patch] Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
\| * \| \|	i2c-imx: Disable the clock on probe failure	Fabio Estevam	2014-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the case of errors during probe, we should disable i2c_imx->clk. Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
\| * \| \|	i2c: designware: Add support for AMD I2C controller	Carl Peng	2014-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add support for AMD version of the DW I2C host controller. The device is enumerated from ACPI namespace with ACPI ID AMD0010. Because the core driver needs an input source clock, and this is not an Intel LPSS device where clocks are provided through drivers/acpi/acpi_lpss.c, we register the clock ourselves if the clock rate is given in ->driver_data Signed-off-by: Carl Peng <carlpeng008@gmail.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
\| * \| \|	i2c: designware: Rework probe() to get clock a bit later	Mika Westerberg	2014-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In order to be able to create missing clock for AMD (and in future possibly others) we move getting clock for the device a bit later. Also make ACPI/DT configuration in the same place depending on from where the device was enumerated from. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
\| * \| \|	i2c: designware: Default to fast mode in case of ACPI	Mika Westerberg	2014-10-06
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no way in ACPI to tell in which speed the host controller is supposed to run, so we default to fast mode (400KHz). Since this has been the default all the time there should be no functional changes with this change. This is the first step required to refactor the driver probe so that we can supply source clock from ACPI part of the driver to the core. Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
\| * \| \|	i2c: axxia: Add I2C driver for AXM55xx	Anders Berg	2014-10-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add I2C bus driver for the controller found in the LSI Axxia family SoCs. The driver implements 10-bit addressing and SMBus transfer modes via emulation (including SMBus block data read). Signed-off-by: Anders Berg <anders.berg@avagotech.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>