Commit message  Author  Age
...
| | * | | MIPS: Fix BUILD_ROLLBACK_PROLOGUE for microMIPS  Paul Burton  2016-09-29
    When the kernel is built for microMIPS, branch targets need to be known to be microMIPS code in order to result in bit 0 of the PC being set. The branch target in the BUILD_ROLLBACK_PROLOGUE macro was simply the end of the macro, which may be pointing at padding rather than at code. This results in recent enough GNU linkers complaining like so:

        mips-img-linux-gnu-ld: arch/mips/built-in.o: .text+0x3e3c: Unsupported branch between ISA modes.
        mips-img-linux-gnu-ld: final link failed: Bad value
        Makefile:936: recipe for target 'vmlinux' failed
        make: *** [vmlinux] Error 1

    Fix this by changing the branch target to be the start of the appropriate handler, skipping over any padding.

    Signed-off-by: Paul Burton <paul.burton@imgtec.com>
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/14019/
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | | MIPS: clear execution hazard after changing FTLB enable  Paul Burton  2016-09-29
    On current P-series cores from Imagination the FTLB can be enabled or disabled via a bit in the Config6 register, and an execution hazard is created by changing the value of that bit. The ftlb_disable function already cleared that hazard but that does no good for other callers. Clear the hazard in the set_ftlb_enable function that creates it, and only for the cores where it applies.

    This has the effect of reverting c982c6d6c48b ("MIPS: cpu-probe: Remove cp0 hazard barrier when enabling the FTLB") which was incorrect.

    Signed-off-by: Paul Burton <paul.burton@imgtec.com>
    Fixes: c982c6d6c48b ("MIPS: cpu-probe: Remove cp0 hazard barrier when enabling the FTLB")
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/14023/
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | | MIPS: Configure FTLB after probing TLB sizes from config4  Paul Burton  2016-09-29
    On some cores (proAptiv, P5600) we make use of the sizes of the TLBs to determine the desired FTLB:VTLB write ratio. However set_ftlb_enable & thus calculate_ftlb_probability is called before decode_config4. This results in us calculating a probability based on zero sizes, and we end up setting FTLBP=3 for a 3:1 FTLB:VTLB write ratio in all cases. This will make abysmal use of the available FTLB resources in the affected cores.

    Fix this by configuring the FTLB probability after having decoded config4. However we do need to have enabled the FTLB before that point such that fields in config4 actually reflect that an FTLB is present. So set_ftlb_enable is now called twice, with flags indicating that it should configure the write probability only the second time.

    Signed-off-by: Paul Burton <paul.burton@imgtec.com>
    Fixes: cf0a8aa0226d ("MIPS: cpu-probe: Set the FTLB probability bit on supported cores")
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/14022/
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
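The ordering problem described in this commit can be modelled in plain C. The following is only a sketch, not the kernel code: the struct, the flag names, the sizes 64 and 512, and the ratio thresholds are all assumptions chosen for illustration. It shows why computing the probability before the sizes are decoded always yields FTLBP=3, and how a two-call sequence avoids that.

```c
#include <assert.h>

/* Hypothetical flags for this sketch only. */
enum { FTLB_EN = 1 << 0, FTLB_SET_PROB = 1 << 1 };

/* Toy stand-in for the per-CPU probe state. */
struct cpu_model {
    unsigned int vtlb_size;   /* filled in by the config4 decode step */
    unsigned int ftlb_size;   /* likewise; stays 0 if the FTLB is off */
    int ftlb_enabled;
    int ftlbp;                /* -1 until a probability is programmed */
};

/* Assumed rule: unknown (zero) sizes fall through to 3, i.e. 3:1. */
static int calc_ftlb_probability(const struct cpu_model *c)
{
    unsigned int ratio;

    if (c->vtlb_size == 0)
        return 3;
    ratio = (c->vtlb_size + c->ftlb_size) / c->vtlb_size;
    if (ratio >= 12)
        return 1;
    if (ratio >= 6)
        return 2;
    return 3;
}

static void set_ftlb_enable(struct cpu_model *c, unsigned int flags)
{
    c->ftlb_enabled = (flags & FTLB_EN) != 0;
    if (flags & FTLB_SET_PROB)
        c->ftlbp = calc_ftlb_probability(c);
}

/* Sizes only reflect an FTLB if it was enabled beforehand. */
static void decode_config4(struct cpu_model *c)
{
    c->vtlb_size = 64;                         /* made-up size */
    c->ftlb_size = c->ftlb_enabled ? 512 : 0;  /* made-up size */
}

/* Old order: probability computed while the sizes are still zero. */
static int probe_old_order(void)
{
    struct cpu_model c = { 0, 0, 0, -1 };

    set_ftlb_enable(&c, FTLB_EN | FTLB_SET_PROB);
    decode_config4(&c);
    return c.ftlbp;
}

/* Fixed order: enable, decode sizes, then set the probability. */
static int probe_fixed_order(void)
{
    struct cpu_model c = { 0, 0, 0, -1 };

    set_ftlb_enable(&c, FTLB_EN);
    decode_config4(&c);
    set_ftlb_enable(&c, FTLB_EN | FTLB_SET_PROB);
    return c.ftlbp;
}
```

With these made-up sizes the old order returns 3 regardless of the hardware, while the fixed order returns a value derived from the decoded sizes.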
| | * | | MIPS: Stop setting I6400 FTLBP  Paul Burton  2016-09-29
    The FTLBP field in Config7 for the I6400 is intended as chicken bits for debugging rather than as a field that software actually makes use of. For best performance, FTLBP should be left at its default value of 0 with all TLB writes hitting the FTLB by default.

    Additionally, since set_ftlb_enable is called from decode_configs before decode_config4 which determines the size of the TLBs, this was previously always setting FTLBP=3 for a 3:1 FTLB:VTLB write ratio which makes abysmal use of the available FTLB resources.

    This effectively reverts b0c4e1b79d8a ("MIPS: Set up FTLB probability for I6400").

    Signed-off-by: Paul Burton <paul.burton@imgtec.com>
    Fixes: b0c4e1b79d8a ("MIPS: Set up FTLB probability for I6400")
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/14021/
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | | MIPS: DEC: Avoid la pseudo-instruction in delay slots  Ralf Baechle  2016-09-29
    When expanding the la or dla pseudo-instruction in a delay slot the GNU assembler will complain should the pseudo-instruction expand to multiple actual instructions, since only the first of them will be in the delay slot leading to the pseudo-instruction being only partially executed if the branch is taken. Use of PTR_LA in the dec int-handler.S leads to such warnings:

        arch/mips/dec/int-handler.S: Assembler messages:
        arch/mips/dec/int-handler.S:149: Warning: macro instruction expanded into multiple instructions in a branch delay slot
        arch/mips/dec/int-handler.S:198: Warning: macro instruction expanded into multiple instructions in a branch delay slot

    Avoid this by open coding the PTR_LA macros.

    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | | MIPS: Octeon: mark GPIO controller node not populated after IRQ init.  Steven J. Hill  2016-09-29
    We clear the OF_POPULATED flag for the GPIO controller node on Octeon processors. Otherwise, none of the devices hanging on the GPIO lines are probed. The 'gpio-leds' driver on OCTEON failed to probe in addition to other devices on Cavium 71xx and 78xx development boards.

    Fixes: 15cc2ed6dcf9 ("of/irq: Mark initialised interrupt controllers as populated")
    Signed-off-by: Steven J. Hill <steven.hill@cavium.com>
    Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
    Cc: David Daney <david.daney@cavium.com>
    Cc: Rob Herring <robh@kernel.org>
    Cc: linux-mips@linux-mips.org
    Cc: devicetree@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/14091/
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | | MIPS: uprobes: fix use of uninitialised variable  Marcin Nowakowski  2016-09-29
    arch_uprobe_pre_xol needs to emulate a branch if a branch instruction has been replaced with a breakpoint, but in fact an uninitialised local variable was passed to the emulator routine instead of the original instruction

    Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
    Fixes: 40e084a506eb ('MIPS: Add uprobes support.')
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/14300/
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | | MIPS: uprobes: remove incorrect set_orig_insn  Marcin Nowakowski  2016-09-29
    Generic kernel code implements a weak version of set_orig_insn that moves cached 'insn' from arch_uprobe to the original code location when the trap is removed. MIPS variant used arch_uprobe->orig_inst which was never initialised properly, so this code only inserted a nop instead of the original instruction. With that change orig_inst can also be safely removed.

    Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
    Fixes: 40e084a506eb ('MIPS: Add uprobes support.')
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/14299/
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | | MIPS: fix uretprobe implementation  Marcin Nowakowski  2016-09-29
    arch_uretprobe_hijack_return_addr should replace the return address for a call with a trampoline address.

    Signed-off-by: Marcin Nowakowski <marcin.nowakowski@imgtec.com>
    Fixes: 40e084a506eb ('MIPS: Add uprobes support.')
    Cc: linux-mips@linux-mips.org
    Patchwork: https://patchwork.linux-mips.org/patch/14298/
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | | MIPS: smp-cps: Avoid BUG() when offlining pre-r6 CPUs  Matt Redfearn  2016-09-29
    Commit 0d2808f338c7 ("MIPS: smp-cps: Add support for CPU hotplug of MIPSr6 processors") added a call to mips_cm_lock_other in order to lock the CPC in CPUs containing a version 3 or higher Coherence Manager, which use the general CM core other register, where previous CMs had a dedicated core other register for the CPC.

    A kernel BUG() is triggered, however, if mips_cm_lock_other is called with a VP other than 0 on a CPU with CM < 3, a condition introduced by 0d2808f338c7.

    Avoid the BUG() by always locking VP0 when locking the CPC, since the required register, cpc_stat_conf, is shared by all vps in a core.

    Fixes: 0d2808f338c7 ("MIPS: smp-cps: Add support for CPU hotplug...)
    Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
    Cc: Qais Yousef <qsyousef@gmail.com>
    Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
    Cc: James Hogan <james.hogan@imgtec.com>
    Cc: Paul Burton <paul.burton@imgtec.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: linux-mips@linux-mips.org
    Cc: linux-kernel@vger.kernel.org
    Patchwork: https://patchwork.linux-mips.org/patch/14297/
    Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc  Linus Torvalds  2016-10-02
| |\ \ \ \
    Pull sparc fixes from David Miller:

     1) Fix section mismatches in some builds, from Paul Gortmaker.
     2) Need to count huge zero page mappings when doing TSB sizing, from Mike Kravetz.
     3) Fix handling of cpu_possible_mask when nr_cpus module option is specified, from Atish Patra.
     4) Don't allocate irq stacks until nr_irqs has been processed, also from Atish Patra.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
      sparc64: Fix non-SMP build.
      sparc64: Fix irq stack bootmem allocation.
      sparc64: Fix cpu_possible_mask if nr_cpus is set
      sparc64 mm: Fix more TSB sizing issues
      sparc64: fix section mismatch in find_numa_latencies_for_group
| | * | | | sparc64: Fix non-SMP build.  David S. Miller  2016-09-28
    Need to provide a dummy smp_fill_in_cpu_possible_map.

    Fixes: 9b2f753ec237 ("sparc64: Fix cpu_possible_mask if nr_cpus is set")
    Reported-by: kbuild test robot <fengguang.wu@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | sparc64: Fix irq stack bootmem allocation.  Atish Patra  2016-09-28
    Currently, irq stack bootmem is allocated for all possible cpus before nr_cpus value changes the list of possible cpus. As a result, there is unnecessary wastage of bootmemory.

    Move the irq stack bootmem allocation so that it happens after possible cpu list is modified based on nr_cpus value.

    Signed-off-by: Atish Patra <atish.patra@oracle.com>
    Reviewed-by: Bob Picco <bob.picco@oracle.com>
    Reviewed-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | sparc64: Fix cpu_possible_mask if nr_cpus is set  Atish Patra  2016-09-28
    If kernel boot parameter nr_cpus is set, it should define the number of CPUs that can ever be available in the system i.e. cpu_possible_mask. setup_nr_cpu_ids() overrides the nr_cpu_ids based on the cpu_possible_mask during kernel initialization. If cpu_possible_mask is not set based on the nr_cpus value, earlier part of the kernel would be initialized using nr_cpus value leading to a kernel crash.

    Set cpu_possible_mask based on nr_cpus value. Thus setup_nr_cpu_ids() becomes redundant and does not corrupt nr_cpu_ids value.

    Signed-off-by: Atish Patra <atish.patra@oracle.com>
    Reviewed-by: Bob Picco <bob.picco@oracle.com>
    Reviewed-by: Vijay Kumar <vijay.ac.kumar@oracle.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
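As a rough illustration of "set cpu_possible_mask based on nr_cpus", the sketch below limits a 64-bit mask of detected CPUs to at most nr_cpus entries. The function name, the 64-bit mask representation, and the policy of keeping the lowest-numbered CPUs are assumptions made for this example; the message does not say how the patch selects CPUs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: keep at most nr_cpus of the detected CPUs,
 * lowest CPU numbers first, and return the resulting "possible" mask.
 */
static uint64_t limit_possible_mask(uint64_t detected, unsigned int nr_cpus)
{
    uint64_t possible = 0;
    unsigned int kept = 0;
    unsigned int cpu;

    for (cpu = 0; cpu < 64 && kept < nr_cpus; cpu++) {
        if (detected & (UINT64_C(1) << cpu)) {
            possible |= UINT64_C(1) << cpu;
            kept++;
        }
    }
    return possible;
}
```

Once the mask and the limit agree, any count derived later from the mask can no longer disagree with the value used earlier during boot, which is the inconsistency the message describes.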
| | * | | | sparc64 mm: Fix more TSB sizing issues  Mike Kravetz  2016-09-28
    Commit af1b1a9b36b8 ("sparc64 mm: Fix base TSB sizing when hugetlb pages are used") addressed the difference between hugetlb and THP pages when computing TSB sizes. The following additional issues were also discovered while working with the code.

    In order to save memory, THP makes use of a huge zero page. This huge zero page does not count against a task's RSS, but it does consume TSB entries. This is similar to hugetlb pages. Therefore, count huge zero page entries in hugetlb_pte_count.

    Accounting of THP pages is done in the routine set_pmd_at(). Unfortunately, this does not catch the case where a THP page is split. To handle this case, decrement the count in pmdp_invalidate(). pmdp_invalidate is only called when splitting a THP. However, 'sanity checks' are added in case it is ever called for other purposes.

    A more general issue exists with HPAGE_SIZE accounting. hugetlb_pte_count tracks the number of HPAGE_SIZE (8M) pages. This value is used to size the TSB for HPAGE_SIZE pages. However, each HPAGE_SIZE page consists of two REAL_HPAGE_SIZE (4M) pages. The TSB contains an entry for each REAL_HPAGE_SIZE page. Therefore, the number of REAL_HPAGE_SIZE pages should be used to size the huge page TSB. A new compile time constant REAL_HPAGE_PER_HPAGE is used to multiply hugetlb_pte_count before sizing the TSB.

    Changes from V1
    - Fixed build issue if hugetlb or THP not configured

    Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
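The HPAGE_SIZE accounting point reduces to a small piece of arithmetic, sketched below. The 8M and 4M sizes and the constant name REAL_HPAGE_PER_HPAGE come from the message; the way the constant is defined here and the helper huge_tsb_entries_needed() are assumptions for illustration and may differ from the kernel's definitions.

```c
#include <assert.h>

#define HPAGE_SIZE            (8UL << 20)   /* 8M, as stated in the message */
#define REAL_HPAGE_SIZE       (4UL << 20)   /* 4M, as stated in the message */
/* Assumed definition: number of real huge pages per HPAGE_SIZE page. */
#define REAL_HPAGE_PER_HPAGE  (HPAGE_SIZE / REAL_HPAGE_SIZE)

/*
 * Hypothetical helper: the TSB holds one entry per REAL_HPAGE_SIZE
 * page, so the HPAGE_SIZE page count must be scaled before sizing.
 */
static unsigned long huge_tsb_entries_needed(unsigned long hugetlb_pte_count)
{
    return hugetlb_pte_count * REAL_HPAGE_PER_HPAGE;
}
```

Sizing from the unscaled count would provision only half of the entries that the mapped huge pages actually consume.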
| | * | | | sparc64: fix section mismatch in find_numa_latencies_for_group  Paul Gortmaker  2016-09-28
| | | |/ /
| | |/| | |
    To fix:

        WARNING: vmlinux.o(.text.unlikely+0x580): Section mismatch in reference from the function find_numa_latencies_for_group() to the function .init.text:find_mlgroup()
        The function find_numa_latencies_for_group() references the function __init find_mlgroup().
        This is often because find_numa_latencies_for_group lacks a __init annotation or the annotation of find_mlgroup is wrong.

    It turns out find_numa_latencies_for_group is only called from:

        static int __init numa_parse_mdesc(void)

    and hence we can tag find_numa_latencies_for_group with __init.

    In doing so we see that find_best_numa_node_for_mlgroup is only called from within __init and hence can also be marked with __init.

    Cc: "David S. Miller" <davem@davemloft.net>
    Cc: Nitin Gupta <nitin.m.gupta@oracle.com>
    Cc: Chris Hyser <chris.hyser@oracle.com>
    Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
    Cc: sparclinux@vger.kernel.org
    Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  Linus Torvalds  2016-10-02
| |\ \ \ \
    Pull networking fixes from David Miller:

     1) Fix wrong TCP checksums on MTU probing when checksum offloading is disabled, from Douglas Caetano dos Santos.
     2) Fix qdisc backlog updates in qfq and sfb schedulers, from Cong Wang.
     3) Route lookup flow key protocol value is wrong in ip6gre_xmit_other(), fix from Lance Richardson.
     4) Scheduling while atomic in multicast routing code of ipv4 and ipv6, fix from Nikolay Aleksandrov.
     5) Fix packet alignment in fec driver, from Eric Nelson.
     6) Fix perf regression in sctp due to struct layout and cache misses, from Xin Long.

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
      sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock
      sctp: change to check peer prsctp_capable when using prsctp polices
      sctp: remove prsctp_param from sctp_chunk
      sctp: move sent_count to the memory hole in sctp_chunk
      tg3: Avoid NULL pointer dereference in tg3_io_error_detected()
      act_ife: Fix false encoding
      act_ife: Fix external mac header on encode
      VSOCK: Don't dec ack backlog twice for rejected connections
      Revert "net: ethernet: bcmgenet: use phydev from struct net_device"
      net: fec: align IP header in hardware
      net: fec: remove QUIRK_HAS_RACC from i.mx27
      net: fec: remove QUIRK_HAS_RACC from i.mx25
      ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_route
      ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()
      tcp: fix a compile error in DBGUNDO()
      tcp: fix wrong checksum calculation on MTU probing
      sch_sfb: keep backlog updated with qlen
      sch_qfq: keep backlog updated with qlen
      can: dev: fix deadlock reported after bus-off
| | * | | | sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock  Xin Long  2016-09-30
    When sctp dumps all the ep->assocs, it needs to lock_sock first, but now it locks sock in rcu_read_lock, and lock_sock may sleep, which would break rcu_read_lock.

    This patch is to get and hold one sock when traversing the list. After that and get out of rcu_read_lock, lock and dump it. Then it will traverse the list again to get the next one until all sctp socks are dumped.

    For sctp_diag_dump_one, it fixes this issue by holding asoc and moving cb() out of rcu_read_lock in sctp_transport_lookup_process.

    Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file")
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | Merge branch 'sctp-fixes'  David S. Miller  2016-09-30
| | |\ \ \ \
    Xin Long says:

    ====================
    sctp: a bunch of fixes for prsctp polices

    This patchset is to fix 2 issues for prsctp polices:

     1. patch 1 and 2 fix "netperf-Throughput_Mbps -37.2% regression" issue when overloading the CPU.
     2. patch 3 fix "prsctp polices should check both sides' prsctp_capable, instead of only local side".
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * | | | sctp: change to check peer prsctp_capable when using prsctp polices  Xin Long  2016-09-30
    Now before using prsctp polices, sctp uses asoc->prsctp_enable to check if prsctp is enabled. However asoc->prsctp_enable is set only means local host support prsctp, sctp should not abandon packet if peer host doesn't enable prsctp.

    So this patch is to use asoc->peer.prsctp_capable to check if prsctp is enabled on both side, instead of asoc->prsctp_enable, as asoc's peer.prsctp_capable is set only when local and peer both enable prsctp.

    Fixes: a6c2f792873a ("sctp: implement prsctp TTL policy")
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * | | | sctp: remove prsctp_param from sctp_chunk  Xin Long  2016-09-30
    Now sctp uses chunk->prsctp_param to save the prsctp param for all the prsctp polices, we didn't need to introduce prsctp_param to sctp_chunk. We can just use chunk->sinfo.sinfo_timetolive for RTX and BUF polices, and reuse msg->expires_at for TTL policy, as the prsctp polices and old expires policy are mutual exclusive.

    This patch is to remove prsctp_param from sctp_chunk, and reuse msg's expires_at for TTL and chunk's sinfo.sinfo_timetolive for RTX and BUF polices.

    Note that sctp can't use chunk's sinfo.sinfo_timetolive for TTL policy, as it needs a u64 variables to save the expires_at time.

    This one also fixes the "netperf-Throughput_Mbps -37.2% regression" issue.

    Fixes: a6c2f792873a ("sctp: implement prsctp TTL policy")
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * | | | sctp: move sent_count to the memory hole in sctp_chunk  Xin Long  2016-09-30
| | |/ / / /
    Now pahole sctp_chunk, it has 2 memory holes:

        struct sctp_chunk {
                struct list_head list;
                atomic_t refcnt;
                /* XXX 4 bytes hole, try to pack */
                ...
                long unsigned int prsctp_param;
                int sent_count;
                /* XXX 4 bytes hole, try to pack */

    This patch is to move up sent_count to fill the 1st one and eliminate the 2nd one.

    It's not just another struct compaction, it also fixes the "netperf-Throughput_Mbps -37.2% regression" issue when overloading the CPU.

    Fixes: a6c2f792873a ("sctp: implement prsctp TTL policy")
    Signed-off-by: Xin Long <lucien.xin@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
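The effect of moving a 4-byte member into an existing hole can be reproduced with a reduced struct. The sketch below is not the real sctp_chunk: the member set is cut down, `int` stands in for atomic_t, and the `msg` pointer is a placeholder for the elided members. On a typical LP64 target the second layout is 8 bytes smaller; on 32-bit targets both layouts may already be hole-free.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Reduced stand-in for the layout before the patch. */
struct chunk_before {
    struct list_head list;
    int refcnt;                 /* int used in place of atomic_t */
    /* hole here when pointers are 8-byte aligned */
    void *msg;                  /* placeholder for elided members */
    unsigned long prsctp_param;
    int sent_count;
    /* tail padding here on LP64 */
};

/* Same members, with sent_count moved up next to refcnt. */
struct chunk_after {
    struct list_head list;
    int refcnt;
    int sent_count;             /* fills the former hole */
    void *msg;
    unsigned long prsctp_param;
};

/* Bytes saved by the reordering on the current target (may be 0). */
static size_t chunk_bytes_saved(void)
{
    return sizeof(struct chunk_before) - sizeof(struct chunk_after);
}
```

Running pahole on a compiled object is how the holes in the message were reported; sizeof and offsetof give a quick check of the same thing without extra tooling.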
| | * | | | tg3: Avoid NULL pointer dereference in tg3_io_error_detected()  Milton Miller  2016-09-30
    While the driver is probing the adapter, an error may occur before the netdev structure is allocated and attached to pci_dev. In this case, not only netdev isn't available, but the tg3 private structure is also not available as it is just math from the NULL pointer, so dereferences must be skipped.

    The following trace is seen when the error is triggered:

        [1.402247] Unable to handle kernel paging request for data at address 0x00001a99
        [1.402410] Faulting instruction address: 0xc0000000007e33f8
        [1.402450] Oops: Kernel access of bad area, sig: 11 [#1]
        [1.402481] SMP NR_CPUS=2048 NUMA PowerNV
        [1.402513] Modules linked in:
        [1.402545] CPU: 0 PID: 651 Comm: eehd Not tainted 4.4.0-36-generic #55-Ubuntu
        [1.402591] task: c000001fe4e42a20 ti: c000001fe4e88000 task.ti: c000001fe4e88000
        [1.402742] NIP: c0000000007e33f8 LR: c0000000007e3164 CTR: c000000000595ea0
        [1.402787] REGS: c000001fe4e8b790 TRAP: 0300 Not tainted (4.4.0-36-generic)
        [1.402832] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28000422 XER: 20000000
        [1.403058] CFAR: c000000000008468 DAR: 0000000000001a99 DSISR: 42000000 SOFTE: 1
        GPR00: c0000000007e3164 c000001fe4e8ba10 c0000000015c5e00 0000000000000000
        GPR04: 0000000000000001 0000000000000000 0000000000000039 0000000000000299
        GPR08: 0000000000000000 0000000000000001 c000001fe4e88000 0000000000000006
        GPR12: 0000000000000000 c00000000fb40000 c0000000000e6558 c000003ca1bffd00
        GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
        GPR20: 0000000000000000 0000000000000000 0000000000000000 c000000000d52768
        GPR24: c000000000d52740 0000000000000100 c000003ca1b52000 0000000000000002
        GPR28: 0000000000000900 0000000000000000 c00000000152a0c0 c000003ca1b52000
        [1.404226] NIP [c0000000007e33f8] tg3_io_error_detected+0x308/0x340
        [1.404265] LR [c0000000007e3164] tg3_io_error_detected+0x74/0x340

    This patch avoids the NULL pointer dereference by moving the access after the netdev NULL pointer check on tg3_io_error_detected(). Also, we add a check for netdev being NULL on tg3_io_resume() [suggested by Michael Chan].

    Fixes: 0486a063b1ff ("tg3: prevent ifup/ifdown during PCI error recovery")
    Fixes: dfc8f370316b ("net/tg3: Release IRQs on permanent error")
    Tested-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
    Signed-off-by: Milton Miller <miltonm@us.ibm.com>
    Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
    Acked-by: Michael Chan <michael.chan@broadcom.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | Merge branch 'act_ife-fixes'  David S. Miller  2016-09-27
| | |\ \ \ \
    Yotam Gigi says:

    ====================
    Fix tc-ife bugs

    This patch-set contains two bugfixes in the tc-ife action, one fixing some random behaviour in encode side, and one fixing the decode side packet parsing logic.

    v2->v3
     - Fix the encode side instead of the decode side
    ====================

    Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * | | | act_ife: Fix false encoding  Yotam Gigi  2016-09-27
    On ife encode side, the action stores the different tlvs inside the ife header, where each tlv length field should refer to the length of the whole tlv (without additional padding) and not just the data length.

    On ife decode side, the action iterates over the tlvs in the ife header and parses them one by one, where in each iteration the current pointer is advanced according to the tlv size.

    Before, the encoding encoded only the data length inside the tlv, which led to false parsing of the ife header. In addition, due to the fact that the loop counter was unsigned, it could lead to infinite parsing loop.

    This fix changes the loop counter to be signed and fixes the encoding to take into account the tlv type and size.

    Fixes: 28a10c426e81 ("net sched: fix encoding to use real length")
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
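The encode and decode rules in this message can be sketched with a generic TLV layout. Everything concrete below is an assumption for illustration: the 4-byte header (16-bit type, 16-bit length, big-endian), the 4-byte padding, and the function names are not taken from the act_ife source. What the sketch does follow from the message is that the length field covers the whole TLV without padding, and that the decoder's remaining-bytes counter is signed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TLV_HDR        4u                      /* assumed: u16 type + u16 length */
#define TLV_ALIGN4(n)  (((n) + 3u) & ~3u)      /* assumed 4-byte padding */

/*
 * Encode one TLV. The length field stores header + data (no padding).
 * Returns the number of bytes written, padding included.
 */
static size_t tlv_encode(uint8_t *buf, uint16_t type,
                         const void *data, uint16_t dlen)
{
    uint16_t total;
    size_t padded;

    assert(dlen <= 0xffffu - TLV_HDR);
    total = (uint16_t)(TLV_HDR + dlen);
    padded = TLV_ALIGN4((size_t)total);

    memset(buf, 0, padded);
    buf[0] = (uint8_t)(type >> 8);
    buf[1] = (uint8_t)(type & 0xff);
    buf[2] = (uint8_t)(total >> 8);
    buf[3] = (uint8_t)(total & 0xff);
    memcpy(buf + TLV_HDR, data, dlen);
    return padded;
}

/*
 * Walk the TLVs and count them; returns -1 on malformed input.
 * 'remaining' is signed, so subtracting a step can never wrap it
 * around to a huge positive value and keep the loop running.
 */
static int tlv_count(const uint8_t *buf, int remaining)
{
    int n = 0;

    while (remaining > 0) {
        unsigned int total;
        int step;

        if (remaining < (int)TLV_HDR)
            return -1;
        total = ((unsigned int)buf[2] << 8) | buf[3];
        if (total < TLV_HDR)
            return -1;          /* length must cover the header itself */
        step = (int)TLV_ALIGN4(total);
        if (step > remaining)
            return -1;          /* TLV claims more bytes than are left */
        buf += step;
        remaining -= step;
        n++;
    }
    return n;
}
```

If the encoder stored only the data length, a decoder advancing by that field would land inside the current TLV instead of at the next header, which is the mis-parse the message describes.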
| | | * | | | act_ife: Fix external mac header on encode  Yotam Gigi  2016-09-27
| | |/ / / /
    On ife encode side, external mac header is copied from the original packet and may be overridden if the user requests. Before, the mac header copy was done from memory region that might not be accessible anymore, as skb_cow_head might free it and copy the packet. This led to random values in the external mac header once the values were not set by user.

    This fix takes the internal mac header from the packet, after the call to skb_cow_head.

    Fixes: ef6980b6becb ("net sched: introduce IFE action")
    Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
    Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | VSOCK: Don't dec ack backlog twice for rejected connections  Jorgen Hansen  2016-09-27
    If a pending socket is marked as rejected, we will decrease the sk_ack_backlog twice. So don't decrement it for rejected sockets in vsock_pending_work().

    Testing of the rejected socket path was done through code modifications.

    Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
    Signed-off-by: Jorgen Hansen <jhansen@vmware.com>
    Reviewed-by: Adit Ranadive <aditr@vmware.com>
    Reviewed-by: Aditya Sarwade <asarwade@vmware.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | Revert "net: ethernet: bcmgenet: use phydev from struct net_device"  Florian Fainelli  2016-09-27
    This reverts commit 62469c76007e ("net: ethernet: bcmgenet: use phydev from struct net_device") because it causes GENETv1/2/3 adapters to expose the following behavior after an ifconfig down/up sequence:

        PING fainelli-linux (10.112.156.244): 56 data bytes
        64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.352 ms
        64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.472 ms (DUP!)
        64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.496 ms (DUP!)
        64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.517 ms (DUP!)
        64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.536 ms (DUP!)
        64 bytes from 10.112.156.244: seq=1 ttl=61 time=1.557 ms (DUP!)
        64 bytes from 10.112.156.244: seq=1 ttl=61 time=752.448 ms (DUP!)

    This was previously fixed by commit 5dbebbb44a6a ("net: bcmgenet: Software reset EPHY after power on") but the commit we are reverting was essentially making this previous commit void, here is why.

    Without commit 62469c76007e we would have the following scenario after an ifconfig down then up sequence:

    - bcmgenet_open() calls bcmgenet_power_up() to make sure the PHY is initialized *before* we get to initialize the UniMAC, this is critical to ensure the PHY is in a correct state, priv->phydev is valid, this code executes fine
    - second time from bcmgenet_mii_probe(), through the normal phy_init_hw() call (which arguably could be optimized out)

    Everything is fine in that case.

    With commit 62469c76007e, we would have the following scenario to happen after an ifconfig down then up sequence:

    - bcmgenet_close() calls phy_disconnect() which makes dev->phydev become NULL
    - when bcmgenet_open() executes again and calls bcmgenet_mii_reset() from bcmgenet_power_up() to initialize the internal PHY, the NULL check becomes true, so we do not reset the PHY, yet we keep going on and initialize the UniMAC, causing MAC activity to occur
    - we call bcmgenet_mii_reset() from bcmgenet_mii_probe(), but this is too late, the PHY is botched, and causes the above bogus pings/packets transmission/reception to occur

    Reported-by: Jaedon Shin <jaedon.shin@gmail.com>
    Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | Merge branch 'fec-align'David S. Miller2016-09-27
| | |\ \ \ \
Eric Nelson says:

====================
net: fec: updates to align IP header

This patch series is the outcome of investigation into very high numbers of alignment faults on kernel 4.1.33 from the linux-fslc tree: https://github.com/freescale/linux-fslc/tree/4.1-1.0.x-imx

The first two patches remove support for the receive accelerator (RACC) from the i.MX25 and i.MX27 SoCs which don't support the function.

The third patch enables hardware alignment of the ethernet packet payload (and especially the IP header) to prevent alignment faults in the IP stack.

Testing on i.MX6UL on the 4.1.33 kernel showed that this patch removed on the order of 70k alignment faults during a 100MiB transfer using wget.

Testing on an i.MX6Q (SABRE Lite) board on net-next (4.8.0-rc7) showed a much more modest improvement, from tens of faults, and it's not clear why that's the case.
====================

Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * | | | net: fec: align IP header in hardwareEric Nelson2016-09-27
The FEC receive accelerator (RACC) supports shifting the data payload of received packets by 16 bits, which aligns the payload (IP header) on a 4-byte boundary, which is, if not required, at least strongly suggested by the Linux networking layer.

Without this patch, a huge number of alignment faults will be taken by the IP stack, as seen in /proc/cpu/alignment:

~/$ cat /proc/cpu/alignment
User: 0
System: 72645 (inet_gro_receive+0x104/0x27c)
Skipped: 0
Half: 0
Word: 0
DWord: 0
Multi: 72645
User faults: 3 (fixup+warn)

This patch was suggested by Andrew Lunn in this message to linux-netdev: http://marc.info/?l=linux-arm-kernel&m=147465452108384&w=2

and adapted from a patch by Russell King from 2014: http://git.arm.linux.org.uk/cgit/linux-arm.git/commit/?id=70d8a8a

Signed-off-by: Eric Nelson <eric@nelint.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
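The alignment argument in the message above can be checked with a little arithmetic. The following is only a sketch: the helper names are hypothetical and are not part of the FEC driver, and it assumes an untagged 14-byte Ethernet header and a receive buffer that itself starts on a 4-byte boundary.

```c
#include <stddef.h>

/* Untagged Ethernet header: 6 (dst) + 6 (src) + 2 (ethertype) bytes. */
#define DEMO_ETH_HEADER_LEN 14u

/*
 * Hypothetical helper: offset of the IP header from the start of the
 * receive buffer when the hardware inserts shift_bytes of padding
 * in front of the frame.
 */
static size_t demo_ip_header_offset(size_t shift_bytes)
{
    return shift_bytes + DEMO_ETH_HEADER_LEN;
}

/*
 * Returns 1 when the IP header lands on a 4-byte boundary, assuming
 * (for illustration) that the buffer start is 4-byte aligned.
 */
static int demo_ip_header_is_aligned(size_t shift_bytes)
{
    return demo_ip_header_offset(shift_bytes) % 4 == 0;
}
```

With no shift the IP header sits at offset 14, which is only 2-byte aligned; with a 16-bit (2-byte) shift it sits at offset 16, which is 4-byte aligned.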
| | | * | | | net: fec: remove QUIRK_HAS_RACC from i.mx27Eric Nelson2016-09-27
According to the i.MX27 reference manual, this SoC does not have support for the receive accelerator (RACC) register at offset 0x1C4.

http://cache.nxp.com/files/32bit/doc/ref_manual/MCIMX27RM.pdf

Signed-off-by: Eric Nelson <eric@nelint.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * | | | net: fec: remove QUIRK_HAS_RACC from i.mx25Eric Nelson2016-09-27
| | |/ / / /
According to the i.MX25 reference manual, this SoC does not have support for the receive accelerator (RACC) register at offset 0x1C4.

http://www.nxp.com/files/dsp/doc/ref_manual/IMX25RM.pdf

Signed-off-by: Eric Nelson <eric@nelint.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | ipmr, ip6mr: fix scheduling while atomic and a deadlock with ipmr_get_routeNikolay Aleksandrov2016-09-25
Since the commit below the ipmr/ip6mr rtnl_unicast() code uses the portid instead of the previous dst_pid which was copied from in_skb's portid. Since the skb is new the portid is 0 at that point so the packets are sent to the kernel and we get scheduling while atomic or a deadlock (depending on where it happens) by trying to acquire rtnl two times. Also since this is RTM_GETROUTE, it can be triggered by a normal user.

Here's the sleeping while atomic trace:

[ 7858.212557] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
[ 7858.212748] in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/0
[ 7858.212881] 2 locks held by swapper/0/0:
[ 7858.213013] #0: (((&mrt->ipmr_expire_timer))){+.-...}, at: [<ffffffff810fbbf5>] call_timer_fn+0x5/0x350
[ 7858.213422] #1: (mfc_unres_lock){+.....}, at: [<ffffffff8161e005>] ipmr_expire_process+0x25/0x130
[ 7858.213807] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.8.0-rc7+ #179
[ 7858.213934] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 7858.214108] 0000000000000000 ffff88005b403c50 ffffffff813a7804 0000000000000000
[ 7858.214412] ffffffff81a1338e ffff88005b403c78 ffffffff810a4a72 ffffffff81a1338e
[ 7858.214716] 000000000000026c 0000000000000000 ffff88005b403ca8 ffffffff810a4b9f
[ 7858.215251] Call Trace:
[ 7858.215412] <IRQ> [<ffffffff813a7804>] dump_stack+0x85/0xc1
[ 7858.215662] [<ffffffff810a4a72>] ___might_sleep+0x192/0x250
[ 7858.215868] [<ffffffff810a4b9f>] __might_sleep+0x6f/0x100
[ 7858.216072] [<ffffffff8165bea3>] mutex_lock_nested+0x33/0x4d0
[ 7858.216279] [<ffffffff815a7a5f>] ? netlink_lookup+0x25f/0x460
[ 7858.216487] [<ffffffff8157474b>] rtnetlink_rcv+0x1b/0x40
[ 7858.216687] [<ffffffff815a9a0c>] netlink_unicast+0x19c/0x260
[ 7858.216900] [<ffffffff81573c70>] rtnl_unicast+0x20/0x30
[ 7858.217128] [<ffffffff8161cd39>] ipmr_destroy_unres+0xa9/0xf0
[ 7858.217351] [<ffffffff8161e06f>] ipmr_expire_process+0x8f/0x130
[ 7858.217581] [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180
[ 7858.217785] [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180
[ 7858.217990] [<ffffffff810fbc95>] call_timer_fn+0xa5/0x350
[ 7858.218192] [<ffffffff810fbbf5>] ? call_timer_fn+0x5/0x350
[ 7858.218415] [<ffffffff8161dfe0>] ? ipmr_net_init+0x180/0x180
[ 7858.218656] [<ffffffff810fde10>] run_timer_softirq+0x260/0x640
[ 7858.218865] [<ffffffff8166379b>] ? __do_softirq+0xbb/0x54f
[ 7858.219068] [<ffffffff816637c8>] __do_softirq+0xe8/0x54f
[ 7858.219269] [<ffffffff8107a948>] irq_exit+0xb8/0xc0
[ 7858.219463] [<ffffffff81663452>] smp_apic_timer_interrupt+0x42/0x50
[ 7858.219678] [<ffffffff816625bc>] apic_timer_interrupt+0x8c/0xa0
[ 7858.219897] <EOI> [<ffffffff81055f16>] ? native_safe_halt+0x6/0x10
[ 7858.220165] [<ffffffff810d64dd>] ? trace_hardirqs_on+0xd/0x10
[ 7858.220373] [<ffffffff810298e3>] default_idle+0x23/0x190
[ 7858.220574] [<ffffffff8102a20f>] arch_cpu_idle+0xf/0x20
[ 7858.220790] [<ffffffff810c9f8c>] default_idle_call+0x4c/0x60
[ 7858.221016] [<ffffffff810ca33b>] cpu_startup_entry+0x39b/0x4d0
[ 7858.221257] [<ffffffff8164f995>] rest_init+0x135/0x140
[ 7858.221469] [<ffffffff81f83014>] start_kernel+0x50e/0x51b
[ 7858.221670] [<ffffffff81f82120>] ? early_idt_handler_array+0x120/0x120
[ 7858.221894] [<ffffffff81f8243f>] x86_64_start_reservations+0x2a/0x2c
[ 7858.222113] [<ffffffff81f8257c>] x86_64_start_kernel+0x13b/0x14a

Fixes: 2942e9005056 ("[RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | ip6_gre: fix flowi6_proto value in ip6gre_xmit_other()Lance Richardson2016-09-24
Similar to commit 3be07244b733 ("ip6_gre: fix flowi6_proto value in xmit path"), set flowi6_proto to IPPROTO_GRE for output route lookup.

Up until now, ip6gre_xmit_other() has set flowi6_proto to a bogus value. This affected output route lookup for packets sent on an ip6gretap device in cases where routing was dependent on the value of flowi6_proto.

Since the correct proto is already set in the tunnel flowi6 template via commit 252f3f5a1189 ("ip6_gre: Set flowi6_proto as IPPROTO_GRE in xmit path."), simply delete the line setting the incorrect flowi6_proto value.

Suggested-by: Jiri Benc <jbenc@redhat.com>
Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | tcp: fix a compile error in DBGUNDO()Eric Dumazet2016-09-23
If DBGUNDO() is enabled (FASTRETRANS_DEBUG > 1), a compile error will happen, since inet6_sk(sk)->daddr became sk->sk_v6_daddr

Fixes: efe4208f47f9 ("ipv6: make lookups simpler and faster")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | tcp: fix wrong checksum calculation on MTU probingDouglas Caetano dos Santos2016-09-23
With TCP MTU probing enabled and offload TX checksumming disabled, tcp_mtu_probe() calculated the wrong checksum when a fragment being copied into the probe's SKB had an odd length. This was caused by the direct use of skb_copy_and_csum_bits() to calculate the checksum, as it pads the fragment being copied, if needed. When this fragment was not the last, a subsequent call used the previous checksum without considering this padding.

The effect was a stale connection in one way, as even retransmissions wouldn't solve the problem, because the checksum was never recalculated for the full SKB length.

Signed-off-by: Douglas Caetano dos Santos <douglascs@taghos.com.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
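The odd-length problem described above can be reproduced in user space. The following is a minimal sketch, not the kernel's code: the function names are hypothetical, the sums use big-endian 16-bit words, and buffers are assumed small enough that the 32-bit accumulator cannot overflow. It shows that a fragment starting at an odd byte offset must have its partial sum byte-swapped before being added to the running sum.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones' complement sum. */
static uint16_t demo_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * Ones' complement sum of buf as big-endian 16-bit words.
 * An odd trailing byte is padded with a zero low byte.
 */
static uint16_t demo_partial_sum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;
    return demo_fold(sum);
}

/*
 * Add the partial sum of a block that starts at byte `offset` of the
 * overall buffer. At an odd offset the block's bytes occupy the
 * opposite halves of each 16-bit word, so its sum is byte-swapped.
 */
static uint16_t demo_block_add(uint16_t sum, uint16_t block, size_t offset)
{
    if (offset & 1)
        block = (uint16_t)((block << 8) | (block >> 8));
    return demo_fold((uint32_t)sum + block);
}
```

For the five bytes 1, 2, 3, 4, 5 split into fragments of 3 and 2 bytes, simply adding the two partial sums gives a value that differs from the sum over the whole buffer, while `demo_block_add()` with offset 3 reproduces it.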
| | * | | | Merge tag 'linux-can-fixes-for-4.8-20160922' of ↵David S. Miller2016-09-23
| | |\ \ \ \
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2016-09-22

this is a pull request of one patch for the upcoming linux-4.8 release.

The patch by Sergei Miroshnichenko fixes a potential deadlock in the generic CAN device code that can occur after a bus-off.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * | | | can: dev: fix deadlock reported after bus-offSergei Miroshnichenko2016-09-22
A timer was used to restart after the bus-off state, leading to a relatively large can_restart() executed in an interrupt context, which in turn sets up pinctrl. When this happens during system boot, there is a high probability of grabbing the pinctrl_list_mutex, which is already locked by the probe() of another device, making the kernel suspect a deadlock condition [1].

To resolve this issue, the restart_timer is replaced by a delayed work.

[1] https://github.com/victronenergy/venus/issues/24

Signed-off-by: Sergei Miroshnichenko <sergeimir@emcraft.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
| | * | | | | sch_sfb: keep backlog updated with qlenWANG Cong2016-09-23
Fixes: 2ccccf5fb43f ("net_sched: update hierarchical backlog too")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | | sch_qfq: keep backlog updated with qlenWANG Cong2016-09-23
Reported-by: Stas Nichiporovich <stasn77@gmail.com>
Fixes: 2ccccf5fb43f ("net_sched: update hierarchical backlog too")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | | | Merge tag 'scsi-fixes' of ↵Linus Torvalds2016-10-01
| |\ \ \ \ \ \
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fix from James Bottomley:
"One final fix before 4.8. There was a memory leak triggered by turning scsi mq off due to the fact that we assume on host release that the already running hosts weren't mq based because that's the state of the global flag (even though they were). Fix it by tracking this on a per-host basis"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: Avoid that toggling use_blk_mq triggers a memory leak
| | * \ \ \ \ \ Merge remote-tracking branch 'mkp-scsi/4.8/scsi-fixes' into fixesJames Bottomley2016-09-28
| | |\ \ \ \ \ \
| | | * | | | | | scsi: Avoid that toggling use_blk_mq triggers a memory leakBart Van Assche2016-09-26
This patch avoids that the following memory leak is triggered if use_blk_mq is disabled after a SCSI host has been allocated by the ib_srp driver and before the same SCSI host is freed:

unreferenced object 0xffff8803a168c568 (size 256):
  backtrace:
    [<ffffffff81620c95>] kmemleak_alloc+0x45/0xa0
    [<ffffffff811bb104>] __kmalloc_node+0x1e4/0x400
    [<ffffffff81309fe4>] blk_mq_alloc_tag_set+0xb4/0x230
    [<ffffffff814731b7>] scsi_mq_setup_tags+0xc7/0xd0
    [<ffffffff81469c26>] scsi_add_host_with_dma+0x216/0x2d0
    [<ffffffffa064bef5>] srp_create_target+0xe55/0x13d0 [ib_srp]
    [<ffffffff8143ce23>] dev_attr_store+0x13/0x20
    [<ffffffff8125f030>] sysfs_kf_write+0x40/0x50
    [<ffffffff8125e397>] kernfs_fop_write+0x137/0x1c0
    [<ffffffff811d8c13>] __vfs_write+0x23/0x140
    [<ffffffff811d92e0>] vfs_write+0xb0/0x190
    [<ffffffff811da5b4>] SyS_write+0x44/0xa0
    [<ffffffff8162c8a5>] entry_SYSCALL_64_fastpath+0x18/0xa8

Fixes: 9aa9cc4221f5 ("scsi: remove the disable_blk_mq host flag")
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
| * | | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2016-10-01
| |\ \ \ \ \ \ \ \
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input

Pull input fix from Dmitry Torokhov:
"One small change to make joydev (which is used by older games) bind to devices that export Z axis but not X or Y (such as TRC rudder)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: joydev - recognize devices with Z axis as joysticks
| | * | | | | | | | Input: joydev - recognize devices with Z axis as joysticksVille Ranki2016-09-26
The current implementation of joydev's input_device_id table recognizes only devices with ABS_X, ABS_WHEEL or ABS_THROTTLE axes as joysticks. There are joystick devices that do not have those axes, for example the TRC Rudder device. The device in question has ABS_Z, ABS_RX and ABS_RY axes, so it is not detected as a joystick.

This patch adds ABS_Z to the input_device_id list, allowing devices with an ABS_Z axis to be detected correctly.

Signed-off-by: Ville Ranki <ville.ranki@iki.fi>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
| * | | | | | | | | Merge branch 'akpm' (patches from Andrew)Linus Torvalds2016-09-30
| |\ \ \ \ \ \ \ \ \
Merge more fixes from Andrew Morton:
"Three fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  include/linux/property.h: fix typo/compile error
  ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()
  mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()
| | * | | | | | | | | include/linux/property.h: fix typo/compile errorJohn Youn2016-09-30
This fixes commit d76eebfa175e ("include/linux/property.h: fix build issues with gcc-4.4.4"). With that commit we get the following compile error when using the PROPERTY_ENTRY_INTEGER_ARRAY macro.

include/linux/property.h:201:39: error: `u32_data' undeclared (first use in this function)
  PROPERTY_ENTRY_INTEGER_ARRAY(_name_, u32, _val_)
  ^
include/linux/property.h:193:17: note: in definition of macro `PROPERTY_ENTRY_INTEGER_ARRAY'
  { .pointer = { _type_##_data = _val_ } }, \
  ^

This needs a '.' to reference the union member. It seems this was just overlooked here since it is done correctly in similar constructs in other parts of the original commit.

This fix is in preparation of upcoming commits that will use this macro.

Fixes: commit d76eebfa175e ("include/linux/property.h: fix build issues with gcc-4.4.4")
Link: http://lkml.kernel.org/r/2de3b929290d88a723ed829a3e3cbd02044714df.1475114627.git.johnyoun@synopsys.com
Signed-off-by: John Youn <johnyoun@synopsys.com>
Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
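The role of the missing '.' can be illustrated with a reduced example. This is a sketch under assumptions, not the contents of include/linux/property.h: `struct demo_entry` and `DEMO_ENTRY_INTEGER_ARRAY` are hypothetical, simplified stand-ins. With the leading '.', the pasted token `u32_data` is a designator naming a union member; without it, the compiler treats `u32_data` as an ordinary identifier in an assignment expression, which gives the "undeclared" error quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a property entry. */
struct demo_entry {
    const char *name;
    size_t length;
    union {
        const uint8_t *u8_data;
        const uint32_t *u32_data;
    } pointer;
};

/*
 * _type_##_data pastes e.g. "u32" and "_data" into "u32_data".
 * The '.' in front turns that token into a member designator.
 */
#define DEMO_ENTRY_INTEGER_ARRAY(_name_, _type_, _val_)  \
{                                                        \
    .name = _name_,                                      \
    .length = sizeof(_val_),                             \
    .pointer = { ._type_##_data = _val_ },               \
}

static const uint32_t demo_values[] = { 1, 2, 3 };

/* Expands to an initializer with .pointer = { .u32_data = demo_values }. */
static const struct demo_entry demo =
    DEMO_ENTRY_INTEGER_ARRAY("demo", u32, demo_values);
```

Dropping the '.' before `_type_##_data` in the sketch makes the expansion read `{ u32_data = demo_values }`, which no longer compiles because no variable named `u32_data` exists.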
| | * | | | | | | | | ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()Eric Ren2016-09-30
The testcase "mmaptruncate" of ocfs2-test deadlocks occasionally.

In this testcase, we create a 2*CLUSTER_SIZE file and mmap() on it; there are two processes repeatedly performing the following operations respectively: one is doing memset(mmaped_addr + 2*CLUSTER_SIZE - 1, 'a', 1), while the other is doing ftruncate(fd, 2*CLUSTER_SIZE) and then ftruncate(fd, CLUSTER_SIZE) again and again.

This is the backtrace when the deadlock happens:

__wait_on_bit_lock+0x50/0xa0
__lock_page+0xb7/0xc0
ocfs2_write_begin_nolock+0x163f/0x1790 [ocfs2]
ocfs2_page_mkwrite+0x1c7/0x2a0 [ocfs2]
do_page_mkwrite+0x66/0xc0
handle_mm_fault+0x685/0x1350
__do_page_fault+0x1d8/0x4d0
trace_do_page_fault+0x37/0xf0
do_async_page_fault+0x19/0x70
async_page_fault+0x28/0x30

In ocfs2_write_begin_nolock(), we first grab the pages and then allocate disk space for this write; ocfs2_try_to_free_truncate_log() will be called if -ENOSPC is returned; if we're lucky to get enough clusters, which is usually the case, we start over again. But in ocfs2_free_write_ctxt() the target page isn't unlocked, so we will deadlock when trying to grab the target page again.

Also, -ENOMEM might be returned in ocfs2_grab_pages_for_write(). Another deadlock will happen in __do_page_mkwrite() if ocfs2_page_mkwrite() returns non-VM_FAULT_LOCKED along with a locked target page.

These two errors fail on the same path, so fix them by unlocking the target page manually before ocfs2_free_write_ctxt().

Jan Kara helped me clear up the JBD2 part and suggested the hint for the root cause.

Changes since v1:
1. Also put ENOMEM error case into consideration.

Link: http://lkml.kernel.org/r/1474173902-32075-1-git-send-email-zren@suse.com
Signed-off-by: Eric Ren <zren@suse.com>
Reviewed-by: He Gang <ghe@suse.com>
Acked-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| | * | | | | | | | | mm: workingset: fix crash in shadow node shrinker caused by ↵Johannes Weiner2016-09-30
replace_page_cache_page()

Antonio reports the following crash when using fuse under memory pressure:

kernel BUG at /build/linux-a2WvEb/linux-4.4.0/mm/workingset.c:346!
invalid opcode: 0000 [#1] SMP
Modules linked in: all of them
CPU: 2 PID: 63 Comm: kswapd0 Not tainted 4.4.0-36-generic #55-Ubuntu
Hardware name: System manufacturer System Product Name/P8H67-M PRO, BIOS 3904 04/27/2013
task: ffff88040cae6040 ti: ffff880407488000 task.ti: ffff880407488000
RIP: shadow_lru_isolate+0x181/0x190
Call Trace:
  __list_lru_walk_one.isra.3+0x8f/0x130
  list_lru_walk_one+0x23/0x30
  scan_shadow_nodes+0x34/0x50
  shrink_slab.part.40+0x1ed/0x3d0
  shrink_zone+0x2ca/0x2e0
  kswapd+0x51e/0x990
  kthread+0xd8/0xf0
  ret_from_fork+0x3f/0x70

which corresponds to the following sanity check in the shadow node tracking:

BUG_ON(node->count & RADIX_TREE_COUNT_MASK);

The workingset code tracks radix tree nodes that exclusively contain shadow entries of evicted pages in them, and this (somewhat obscure) line checks whether there are real pages left that would interfere with reclaim of the radix tree node under memory pressure.

While discussing ways how fuse might sneak pages into the radix tree past the workingset code, Miklos pointed to replace_page_cache_page(), and indeed there is a problem there: it properly accounts for the old page being removed - __delete_from_page_cache() does that - but then does a raw radix_tree_insert(), not accounting for the replacement page. Eventually the page count bits in node->count underflow while leaving the node incorrectly linked to the shadow node LRU.

To address this, make sure replace_page_cache_page() uses the tracked page insertion code, page_cache_tree_insert(). This fixes the page accounting and makes sure page-containing nodes are properly unlinked from the shadow node LRU again.

Also, make the sanity checks a bit less obscure by using the helpers for checking the number of pages and shadows in a radix tree node.

Fixes: 449dd6984d0e ("mm: keep page cache radix tree nodes in check")
Link: http://lkml.kernel.org/r/20160919155822.29498-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Antonio SJ Musumeci <trapexit@spawn.link>
Debugged-by: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@vger.kernel.org> [3.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | | MAINTAINERS: Switch to kernel.org email address for Javi MerinoJavi Merino2016-09-30
| |/ / / / / / / / /
Change my email address to my kernel.org account instead of the ARM one.

Signed-off-by: Javi Merino <javi.merino@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>