aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* selftests: pmtu: Introduce support for multiple testsStefano Brivio2018-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | Introduce list of tests and their descriptions, and loop on it in main body. Tests will now just take care of calling setup with a list of "units" they need, and return 0 on success, 1 on failure, 2 if the test had to be skipped. Main script body will take care of displaying results and cleaning up after every test. Introduce guard variable so that we don't clean up twice in case of interrupts or unexpected failures. The pmtu_vti6_exception test can now run its third step even if the previous one failed, as we can return values from it. Also introduce support to display test descriptions, and display aligned OK/FAIL/SKIP test outcomes. Buffer error strings so that in case of failure we can display them right under the outcome for each test. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* selftests: pmtu: Factor out MTU parsing helperStefano Brivio2018-03-17
| | | | | | | ...so that it can be used for any iproute command output. Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* selftests: pmtu: Use namespace command prefix to fetch route mtuStefano Brivio2018-03-17
| | | | | | | | | | | In 7af137b72131 ("selftests: net: Introduce first PMTU test") I accidentally assumed route_get_* helpers would run from a single namespace. Make them a bit more generic, by passing the namespace command prefix as a parameter instead. Fixes: 7af137b72131 ("selftests: net: Introduce first PMTU test") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* selftests: pmtu: Reverse return codes of functionsStefano Brivio2018-03-17
| | | | | | | | | | | | David suggests it's more intuitive to return non-zero on failures, and zero on success. No need to introduce tail 'return 0' in functions, they will return the exit code of the last command anyway. Suggested-by: David Ahern <dsahern@gmail.com> Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'ibmvnic-Update-TX-pool-and-TX-routines'David S. Miller2018-03-17
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Thomas Falcon says: ==================== ibmvnic: Update TX pool and TX routines This patch restructures the TX pool data structure and provides a separate TX pool array for TSO transmissions. This is already used in some way due to our unique DMA situation, namely that we cannot use single DMA mappings for packet data. Previously, both buffer arrays used the same pool entry. This restructuring allows for some additional cleanup in the driver code, especially in some places in the device transmit routine. In addition, it allows us to more easily track the consumer and producer indexes of a particular pool. This has been further improved by better tracking of in-use buffers to prevent possible data corruption in case an invalid buffer entry is used. v5: Fix bisectability mistake in the first patch. Removed TSO-specific data in a later patch when it is no longer used. v4: Fix error in 7th patch that causes an oops by using the older fixed value for number of buffers instead of the respective field in the tx pool data structure v3: Forgot to update TX pool cleaning function to handle new data structures. Included 7th patch for that. v2: Fix typo in 3/6 commit subject line ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * ibmvnic: Remove unused TSO resources in TX pool structureThomas Falcon2018-03-17
| | | | | | | | | | | | | | | | | | Finally, remove the TSO-specific fields in the TX pool strcutures. These are no longer needed with the introduction of separate buffer pools for TSO transmissions. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ibmvnic: Update TX pool cleaning routineThomas Falcon2018-03-17
| | | | | | | | | | | | | | | | | | | | Update routine that cleans up any outstanding transmits that have not received completions when the device needs to close. Introduces a helper function that cleans one TX pool to make code more readable. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ibmvnic: Improve TX buffer accountingThomas Falcon2018-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Improve TX pool buffer accounting to prevent the producer index from overruning the consumer. First, set the next free index to an invalid value if it is in use. If next buffer to be consumed is in use, drop the packet. Finally, if the transmit fails for some other reason, roll back the consumer index and set the free map entry to its original value. This should also be done if the DMA map fails. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ibmvnic: Update TX and TX completion routinesThomas Falcon2018-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Update TX and TX completion routines to account for TX pool restructuring. TX routine first chooses the pool depending on whether a packet is GSO or not, then uses it accordingly. For the completion routine to know which pool it needs to use, set the most significant bit of the correlator index to one if the packet uses the TSO pool. On completion, unset the bit and use the correlator index to release the buffer pool entry. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ibmvnic: Update TX pool initialization routineThomas Falcon2018-03-17
| | | | | | | | | | | | | | | | | | Introduce function that initializes one TX pool. Use that to create each pool entry in both the standard TX pool and TSO pool arrays. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ibmvnic: Update release TX pool routineThomas Falcon2018-03-17
| | | | | | | | | | | | | | | | Introduce function that frees one TX pool. Use that to release each pool in both the standard TX pool and TSO pool arrays. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ibmvnic: Update and clean up reset TX pool routineThomas Falcon2018-03-17
| | | | | | | | | | | | | | | | | | Update TX pool reset routine to accommodate new TSO pool array. Introduce a function that resets one TX pool, and use that function to initialize each pool in both pool arrays. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * ibmvnic: Generalize TX pool structureThomas Falcon2018-03-17
|/ | | | | | | | | | | Remove some unused fields in the structure and include values describing the individual buffer size and number of buffers in a TX pool. This allows us to use these fields for TX pool buffer accounting as opposed to using hard coded values. Include a new pool array for TSO transmissions. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* sctp: use proc_remove_subtree()Al Viro2018-03-17
| | | | | | | | | use proc_remove_subtree() for subtree removal, both on setup failure halfway through and on teardown. No need to make simple things complex... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'hv_netvsc-minor-enhancements'David S. Miller2018-03-17
|\ | | | | | | | | | | | | | | | | | | | | | | Stephen Hemminger says: ==================== hv_netvsc: minor enhancements A couple of small things for net-next ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * hv_netvsc: add trace pointsStephen Hemminger2018-03-17
| | | | | | | | | | | | | | | | This adds tracepoints to the driver which has proved useful in debugging startup and shutdown race conditions. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * hv_netvsc: pass netvsc_device to rndis haltStephen Hemminger2018-03-17
|/ | | | | | | | The caller has a valid pointer, pass it to rndis_filter_halt_device and avoid any possible RCU races here. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: ethernet: ti: cpsw: enable vlan rx vlan offloadGrygorii Strashko2018-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In VLAN_AWARE mode CPSW can insert VLAN header encapsulation word on Host port 0 egress (RX) before the packet data if RX_VLAN_ENCAP bit is set in CPSW_CONTROL register. VLAN header encapsulation word has following format: HDR_PKT_Priority bits 29-31 - Header Packet VLAN prio (Highest prio: 7) HDR_PKT_CFI bits 28 - Header Packet VLAN CFI bit. HDR_PKT_Vid bits 27-16 - Header Packet VLAN ID PKT_Type bits 8-9 - Packet Type. Indicates whether the packet is VLAN-tagged, priority-tagged, or non-tagged. 00: VLAN-tagged packet 01: Reserved 10: Priority-tagged packet 11: Non-tagged packet This feature can be used to implement TX VLAN offload in case of VLAN-tagged packets and to insert VLAN tag in case Non-tagged packet was received on port with PVID set. As per documentation, CPSW never modifies packet data on Host egress (RX) and as result, without this feature enabled, Host port will not be able to receive properly packets which entered switch non-tagged through external Port with PVID set (when non-tagged packet forwarded from external Port with PVID set to another external Port - packet will be VLAN tagged properly). Implementation details: - on RX driver will check CPDMA status bit RX_VLAN_ENCAP BIT(19) in CPPI descriptor to identify when VLAN header encapsulation word is present. - PKT_Type = 0x01 or 0x02 then ignore VLAN header encapsulation word and pass packet as is; - if HDR_PKT_Vid = 0 then ignore VLAN header encapsulation word and pass packet as is; - In dual mac mode traffic is separated between ports using default port vlans, which are not be visible to Host and so should not be reported. Hence, check for default port vlans in dual mac mode and ignore VLAN header encapsulation word; - otherwise fill SKB with VLAN info using __vlan_hwaccel_put_tag(); - PKT_Type = 0x00 (VLAN-tagged) then strip out VLAN header from SKB. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Fix queue free path of ULD driversArjun Vynipadath2018-03-17
| | | | | | | | | | | | | | Setting sge_uld_rxq_info to NULL in free_queues_uld(). We are referencing sge_uld_rxq_info in cxgb_up(). This will fix a panic when interface is brought up after a ULDq creation failure. Fixes: 94cdb8bb993a (cxgb4: Add support for dynamic allocation of resources for ULD) Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Ganesh Goudhar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* rds: tcp: must use spin_lock_irq* and not spin_lock_bh with rds_tcp_conn_lockSowmini Varadhan2018-03-17
| | | | | | | | | | | | | | | | rds_tcp_connection allocation/free management has the potential to be called from __rds_conn_create after IRQs have been disabled, so spin_[un]lock_bh cannot be used with rds_tcp_conn_lock. Bottom-halves that need to synchronize for critical sections protected by rds_tcp_conn_lock should instead use rds_destroy_pending() correctly. Reported-by: syzbot+c68e51bb5e699d3f8d91@syzkaller.appspotmail.com Fixes: ebeeb1ad9b8a ("rds: tcp: use rds_destroy_pending() to synchronize netns/module teardown and rds connection/workq management") Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'tipc-obsolete-zone-concept'David S. Miller2018-03-17
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jon Maloy says: ==================== tipc: obsolete zone concept Functionality related to the 'zone' concept was never implemented in TIPC. In this series we eliminate the remaining traces of it in the code, and can hence take a first step in reducing the footprint and complexity of the binding table. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * tipc: some name changesJon Maloy2018-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We rename some lists and fields in struct publication both to make the naming more consistent and to better reflect their roles. We also update the descriptions of those lists. node_list -> local_publ cluster_list -> all_publ pport_list -> binding_sock ref -> port There are no functional changes in this commit. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tipc: merge two lists in struct publicationJon Maloy2018-03-17
| | | | | | | | | | | | | | | | | | | | | | | | The size of struct publication can be reduced further. Membership in lists 'nodesub_list' and 'local_list' is mutually exlusive, in that remote publications use the former and local publications the latter. We replace the two lists with one single, named 'binding_node' which reflects what it really is. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tipc: remove zone_list member in struct publicationJon Maloy2018-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | As a further consequence of the previous commits, we can also remove the member 'zone_list 'in struct name_info and struct publication. Instead, we now let the member cluster_list take over the role a container of all publications of a given <type,lower, upper>. We also remove the counters for the size of those lists, since they don't serve any purpose. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tipc: remove zone publication list in name tableJon Maloy2018-03-17
| | | | | | | | | | | | | | | | | | | | | | As a consequence of the previous commit we nan now eliminate zone scope related lists in the name table. We start with name_table::publ_list[3], which can now be replaced with two lists, one for node scope publications and one for cluster scope publications. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * tipc: obsolete TIPC_ZONE_SCOPEJon Maloy2018-03-17
|/ | | | | | | | | | | | | | | | | | | | | Publications for TIPC_CLUSTER_SCOPE and TIPC_ZONE_SCOPE are in all aspects handled the same way, both on the publishing node and on the receiving nodes. Despite previous ambitions to the contrary, this is never going to change, so we take the conseqeunce of this and obsolete TIPC_ZONE_SCOPE and related macros/functions. Whenever a user is doing a bind() or a sendmsg() attempt using ZONE_SCOPE we translate this internally to CLUSTER_SCOPE, while we remain compatible with users and remote nodes still using ZONE_SCOPE. Furthermore, the non-formalized scope value 0 has always been permitted for use during lookup, with the same meaning as ZONE_SCOPE/CLUSTER_SCOPE. We now permit it even as binding scope, but for compatibility reasons we choose to not change the value of TIPC_CLUSTER_SCOPE. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'pernet-convert-part8'David S. Miller2018-03-17
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kirill Tkhai says: ==================== Converting pernet_operations (part #8) this series continues to review and to convert pernet_operations to make them possible to be executed in parallel for several net namespaces at the same time. There are different operations over the tree, mostly are ipvs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Convert ip_vs_ftp_opsKirill Tkhai2018-03-17
| | | | | | | | | | | | | | | | | | | | | | These pernet_operations register and unregister ipvs app. register_ip_vs_app(), unregister_ip_vs_app() and register_ip_vs_app_inc() modify per-net structures, and there are no global structures touched. So, this looks safe to be marked as async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Convert ipvs_core_dev_opsKirill Tkhai2018-03-17
| | | | | | | | | | | | | | | | Exit method stops two per-net threads and cancels delayed work. Everything looks nicely per-net divided. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Convert ipvs_core_opsKirill Tkhai2018-03-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These pernet_operations register and unregister nf hooks, /proc entries, sysctl, percpu statistics. There are several global lists, and the only list modified without exclusive locks is ip_vs_conn_tab in ip_vs_conn_flush(). We iterate the list and force the timers expire at the moment. Since there were possible several timer expirations before this patch, and since they are safe, the patch does not invent new parallelism of their destruction. These pernet_operations look safe to be converted. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Convert ovs_net_opsKirill Tkhai2018-03-17
| | | | | | | | | | | | | | | | | | | | | | | | These pernet_operations initialize and destroy net_generic() data pointed by ovs_net_id. Exit method destroys vports from alive net to exiting net. Since they are only pernet_operations interested in this data, and exit method is executed under exclusive global lock (ovs_mutex), they are safe to be executed in parallel. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Convert mpls_net_opsKirill Tkhai2018-03-17
| | | | | | | | | | | | | | | | | | These pernet_operations register and unregister sysctl table. Exit methods frees platform_labels from net::mpls::platform_label. Everything is per-net, and they looks safe to be marked async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Convert l2tp_net_opsKirill Tkhai2018-03-17
|/ | | | | | | | | | Init method is rather simple. Exit method queues del_work for every tunnel from per-net list. This seems to be safe to be marked async. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
* net-tcp_bbr: set tp->snd_ssthresh to BDP upon STARTUP exitYousuk Seung2018-03-16
| | | | | | | | | | | | | | Set tp->snd_ssthresh to BDP upon STARTUP exit. This allows us to check if a BBR flow exited STARTUP and the BDP at the time of STARTUP exit with SCM_TIMESTAMPING_OPT_STATS. Since BBR does not use snd_ssthresh this fix has no impact on BBR's behavior. Signed-off-by: Yousuk Seung <ysseung@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: add snd_ssthresh stat in SCM_TIMESTAMPING_OPT_STATSYousuk Seung2018-03-16
| | | | | | | | | | | | | This patch adds TCP_NLA_SND_SSTHRESH stat into SCM_TIMESTAMPING_OPT_STATS that reports tcp_sock.snd_ssthresh. Signed-off-by: Yousuk Seung <ysseung@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* selftests/txtimestamp: Add more configurable parametersVinicius Costa Gomes2018-03-16
| | | | | | | | | Add a way to configure if poll() should wait forever for an event, the number of packets that should be sent for each and if there should be any delay between packets. Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* liquidio: Simplified napi pollIntiyaz Basha2018-03-16
| | | | | | | | | | | | 1) Moved interrupt enable related code from octeon_process_droq_poll_cmd() to separate function octeon_enable_irq(). 2) Removed wrapper function octeon_process_droq_poll_cmd(), and directlyi using octeon_droq_process_poll_pkts(). 3) Removed unused macros POLL_EVENT_XXX. Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com> Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'net-smc-IPv6-support'David S. Miller2018-03-16
|\ | | | | | | | | | | | | | | | | | | | | | | Ursula Braun says: ==================== net/smc: IPv6 support these smc patches for the net-next tree add IPv6 support. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/smc: enable ipv6 support for smcKarsten Graul2018-03-16
| | | | | | | | | | | | | | | | | | | | | | | | | | Add ipv6 support to the smc socket layer functions. Make use of the updated clc layer functions to retrieve and match ipv6 information. The indicator for ipv4 or ipv6 is the protocol constant that is provided in the socket() call with address family AF_SMC. Based-on-patch-by: Takanori Ueda <tkueda@jp.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/smc: add ipv6 support to CLC layerKarsten Graul2018-03-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The CLC layer is updated to support ipv6 proposal messages from peers and to match incoming proposal messages against the ipv6 addresses of the net device. struct smc_clc_ipv6_prefix is updated to provide the space for an ipv6 address (struct was not used before). SMC_CLC_MAX_LEN is updated to include the size of the proposal prefix. Existing code in net is not affected, the previous SMC_CLC_MAX_LEN value is large enough to hold ipv4 proposal messages. Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net/smc: restructure netinfo for CLC proposal msgsKarsten Graul2018-03-16
|/ | | | | | | | | | | | | | Introduce functions smc_clc_prfx_set to retrieve IP information for the CLC proposal msg and smc_clc_prfx_match to match the contents of a proposal message against the IP addresses of the net device. The new functions replace the functionality provided by smc_clc_netinfo_by_tcpsk, which is removed by this patch. The match functionality is extended to scan all ipv4 addresses of the net device for a match against the ipv4 subnet from the proposal msg. Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: notify fatal error to uld driversGanesh Goudar2018-03-16
| | | | | | | | notify uld drivers if the adapter encounters fatal error. Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'rtnl_lock_killable'David S. Miller2018-03-16
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Kirill Tkhai says: ==================== Introduce rtnl_lock_killable() rtnl_lock() is widely used mutex in kernel. Some of kernel code does memory allocations under it. In case of memory deficit this may invoke OOM killer, but the problem is a killed task can't exit if it's waiting for the mutex. This may be a reason of deadlock and panic. This patchset adds a new primitive, which responds on SIGKILL, and it allows to use it in the places, where we don't want to sleep forever. Also, the first place is made to use it. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Use rtnl_lock_killable() in register_netdev()Kirill Tkhai2018-03-16
| | | | | | | | | | | | | | | | This patch adds rtnl_lock_killable() to one of hot path using rtnl_lock(). Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * net: Add rtnl_lock_killable()Kirill Tkhai2018-03-16
|/ | | | | | | | | | | | | | | rtnl_lock() is widely used mutex in kernel. Some of kernel code does memory allocations under it. In case of memory deficit this may invoke OOM killer, but the problem is a killed task can't exit if it's waiting for the mutex. This may be a reason of deadlock and panic. This patch adds a new primitive, which responds on SIGKILL, and it allows to use it in the places, where we don't want to sleep forever. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* doc: Change the udp/sctp rmem/wmem default value.Tonghao Zhang2018-03-16
| | | | | | | The SK_MEM_QUANTUM was changed from PAGE_SIZE to 4096. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* udp: Move the udp sysctl to namespace.Tonghao Zhang2018-03-16
| | | | | | | | | | | This patch moves the udp_rmem_min, udp_wmem_min to namespace and init the udp_l3mdev_accept explicitly. The udp_rmem_min/udp_wmem_min affect udp rx/tx queue, with this patch namespaces can set them differently. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'net-ipv6-Address-checks-need-to-consider-the-L3-domain'David S. Miller2018-03-16
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | David Ahern says: ==================== net/ipv6: Address checks need to consider the L3 domain IPv6 prohibits a local address from being used as a gateway for a route. However, it is ok for the gateway to be a local address in a different L3 domain (e.g., VRF). This allows, for example, veth pairs to connect VRFs. ip6_route_info_create calls ipv6_chk_addr_and_flags for gateway addresses to determine if the address is a local one, but ipv6_chk_addr_and_flags does not currently consider L3 domains. As a result routes can not be added in one VRF with a nexthop that points to a local address in a second VRF. Resolve by comparing the l3mdev for the passed in device and requiring an l3mdev match with the device containing an address. The intent of checking for an address on the specified device versus any device in the domain is mantained by a new argument to skip the check between the passed in device and the device with the address. Patch 1 moves the gateway validation from ip6_route_info_create into a helper; the function is long enough and refactoring drops the indent level. Patch 2 adds a skip_dev_check argument to ipv6_chk_addr_and_flags to allow a device to always be passed yet skip the device check when looking at addresses and fixes up a few ipv6_chk_addr callers that pass a NULL device. Patch 3 adds l3mdev checks to ipv6_chk_addr_and_flags. Patches 4 and 5 do some refactoring to the fib_tests script and then patch 6 adds nexthop validation tests. v4 - separated l3mdev check into a separate patch (patch 3 of this set) as suggested by Kirill - consolidated dev and ipv6_chk_addr_and_flags call into 1 if (Kirill) - added a temp variable for gw type (Kirill) v3 - set skip_dev_check in ipv6_chk_addr based on dev == NULL (per comment from Ido) v2 - handle 2 variations of route spec with sane error path - add test cases ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
| * selftests: fib_tests: Add IPv6 nexthop spec testsDavid Ahern2018-03-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add series of tests for valid and invalid nexthop specs for IPv6. $ TEST=fib_nexthop_test ./fib_tests.sh ... IPv6 nexthop tests TEST: Directly connected nexthop, unicast address [ OK ] TEST: Directly connected nexthop, unicast address with device [ OK ] TEST: Gateway is linklocal address [ OK ] TEST: Gateway is linklocal address, no device [ OK ] TEST: Gateway can not be local unicast address [ OK ] TEST: Gateway can not be local unicast address, with device [ OK ] TEST: Gateway can not be a local linklocal address [ OK ] TEST: Gateway can be local address in a VRF [ OK ] TEST: Gateway can be local address in a VRF, with device [ OK ] TEST: Gateway can be local linklocal address in a VRF [ OK ] TEST: Redirect to VRF lookup [ OK ] TEST: VRF route, gateway can be local address in default VRF [ OK ] TEST: VRF route, gateway can not be a local address [ OK ] TEST: VRF route, gateway can not be a local addr with device [ OK ] Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * selftests: fib_tests: Allow user to run a specific testDavid Ahern2018-03-16
| | | | | | | | | | | | | | | | Allow a user to run just a specific fib test by setting the TEST environment variable. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>