aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband
Commit message (Collapse)AuthorAge
...
| * | | RDMA/hns: Move all prints out of irq handleliuyixian2018-09-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It will trigger unnecessary interrupts caused by time out if prints inside aeq handle under some configurations. Thus, move all prints out of aeq handle to work queue. Signed-off-by: liuyixian <liuyixian@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/uverbs: Fix error unwind in ib_uverbs_add_oneJason Gunthorpe2018-09-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The error path has several mistakes - cdev_del should not be called if cdev_device_add fails - We must call put_device on all the goto exit paths as that is what frees the uapi, SRCU and the struct itself. While we are here consolidate all the uvdev_dev init that cannot fail at the top. Fixes: c5c4d92e70f3 ("RDMA/uverbs: Use cdev_device_add() instead of cdev_add()") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com>
| * | | RDMA/core: Properly return the error code of rdma_set_src_addr_rcuYueHaibing2018-09-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rdma_set_src_addr_rcu should check copy_src_l2_addr fails, rather than always return 0. Also copy_src_l2_addr should return 'ret' as its return value when rdma_translate_ip fails. Fixes: c31d4b2ddf07 ("RDMA/core: Protect against changing dst->dev during destination resolve") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/i40iw: Fix incorrect iterator typeHåkon Bugge2018-09-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit f27b4746f378 ("i40iw: add connection management code") uses an incorrect rcu iterator, whilst holding the rtnl_lock. Since the critical region invokes i40iw_manage_qhash(), which is a sleeping function, the rcu locking and traversal cannot be used. Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/uverbs: Remove is_closed from ib_uverbs_fileJason Gunthorpe2018-09-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This does nothing but indicate if the uverbs_file is in the device's list, use list_del_init instead. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
| * | | IB/ipoib: Log sysfs 'dev_id' accesses from userspaceArseny Maslennikov2018-09-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some tools may currently be using only the deprecated attribute; let's print an elaborate and clear deprecation notice to kmsg. To do that, we have to replace the whole sysfs file, since we inherit the original one from netdev. Signed-off-by: Arseny Maslennikov <ar@cs.msu.ru> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | IB/ipoib: Use dev_port to expose network interface port numbersArseny Maslennikov2018-09-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some InfiniBand network devices have multiple ports on the same PCI function. This initializes the `dev_port' sysfs field of those network interfaces with their port number. Prior to this the kernel erroneously used the `dev_id' sysfs field of those network interfaces to convey the port number to userspace. The use of `dev_id' was considered correct until Linux 3.15, when another field, `dev_port', was defined for this particular purpose and `dev_id' was reserved for distinguishing stacked ifaces (e.g: VLANs) with the same hardware address as their parent device. Similar fixes to net/mlx4_en and many other drivers, which started exporting this information through `dev_id' before 3.15, were accepted into the kernel 4 years ago. See 76a066f2a2a0 (`net/mlx4_en: Expose port number through sysfs'). Signed-off-by: Arseny Maslennikov <ar@cs.msu.ru> Signed-off-by: Doug Ledford <dledford@redhat.com>
| * | | RDMA/core: Consider net ns of gid attribute for RoCEParav Pandit2018-09-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When resolving destination address or route, when net namespace is unavailable, refer to the net namespace of the netdevice of the SGID attribute. This is typically the case for requests arriving from the network for RoCE ports. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Introduce rdma_read_gid_attr_ndev_rcu() to check GID attributeParav Pandit2018-09-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce an API rdma_read_gid_attr_ndev_rcu() to return GID attribute netdevice which is in UP state for accessing netdevice's fields such as net namespace and ifindex. This is useful for users who intent to access netdevice fields under rcu lock. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Simplify roce_resolve_route_from_path()Parav Pandit2018-09-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently RoCE route resolve functionality is split between two functions. (a) roce_resolve_route_from_path() and its helper function rdma_resolve_ip_route(). Due to this multiple sockaddr src structures are created in both functions with rdma_dev_addr is an interface between the two for checks. Since there is only one user of rdma_resolve_ip_route() as RoCE, combine the functionality of both functions to roce_resolve_route_from_path() and further reduce the scope of rdma_dev_addr to core/addr.c This also allow to extend addr_resolve() in subsequent patch to consider netdev properties of GID in safer way under rcu lock. Additionally src and dst addresses were always provided, so skip the src addr NULL pointer check as they are present on the stack now. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Protect against changing dst->dev during destination resolveParav Pandit2018-09-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During resolving address process, during route lookup and while performing src address translation in case of loopback mode, hold the rcu lock so that if netdevice is moving to different net namespace, or being unregistered, it can be synchronized with net/core/dev.c, ie change_net_namespace() ->dev_close_many() ->rt6_uncached_list_flush_dev() who would change dst->dev to loopback device of the given net namespace. Therefore, hold the rcu lock and sync with synchronize_net() of change_net_namespace() to ensure that netdevice cannot get freed while dst->dev is being used. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Refer to network type instead of device typeParav Pandit2018-09-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Set and refer to rdma_dev_addr network type instead of dst->ndev to reduce dependency on accessing dst netdevice. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Use common code flow for IPv4/6 for addr resolveParav Pandit2018-09-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use common code flow for resolving neighbour and for finding source addresses. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Rename rdma_copy_addr to rdma_copy_src_l2_addrParav Pandit2018-09-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that rdma_copy_addr() only copies the source addresses and all callers are interested in copying only source addresses, simplify it to drop the destination address argument. Given that it only copies source layer2 addresses, rename it to rdma_copy_src_l2_addr for better code readability. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Introduce and use rdma_set_src_addr() between IPv4 and IPv6Parav Pandit2018-09-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rdma_translate_ip() is done while resolving address for the loopback addresses. The current flow is convoluted with resolve neighbor being optional. This patch simplifies the code in following ways. (a) Use common code between IPv4 and IPv6 for address translation, loopback checks and acquiring netdevice. (b) During neigh resolve in addr_resolve_neigh(), only copy destination address. (c) Always resolve the source address before the destination address, because it doesn't depend on resolving neigh being requested or not. This helps to reduce 3 calls of rdma_copy_addr and rdma_translate_ip to one and makes it easier to follow the code flow. Now that ib_nl_fetch_ha() doesn't depend on dst, drop dst argument from ib_nl_fetch_ha(). Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Let protocol specific function typecast sockaddr structureParav Pandit2018-09-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current code typecasts destination address using extra variable but uses source address as is. Even though the compiler optimizes such code well, just let each protocol specific function typecast for src and dest both and have symmetric code. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Avoid unnecessary sa_family overwriteParav Pandit2018-09-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | addr4_resolve() and addr6_resolve() are called by checking the value of sa_family. Both above functions overwrite the value after typecasting, this is not necessary. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core Introduce and use rdma_find_ndev_for_src_ip_rcuParav Pandit2018-09-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes two issues: 1. When address family is other than IPv4 or v6, rdma_translate_ip() returns success which is incorrect. 2. When address familty is AF_INET6, and if the source address is not found, it returns success, which is also incorrect. Therefore, introduce and use rdma_find_ndev_for_src_ip_rcu() helper function which returns correct success or error status and is also useful for future code refactor in addr_resolve(). Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/mlx5: Allow transition of DCI QP to resetMoni Shoua2018-09-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The transition is allowed from any state and the atrribute mask must be IB_QP_STATE. Fixes: c32a4f296e1d ("IB/mlx5: Add support for DC Initiator QP") Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/hfi1: set_intr_bits uses incorrect source for register modificationMichael J. Ruhl2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | HFI IRQ enable bits are not being set correctly. Send context error and DC IRQs are not being enabled correctly. In addition, send context error IRQs are not being delivered. Because of this, send context errors are not being handled correctly when they occur. When setting the IRQ bits, if an IRQ range is used, and the last bit is on a register boundary (bit 63), the calculated index for the final register modification is incorrect (index + 1 vs. index). The incorrect index calculation causes incorrect IRQ bits to be set. In this case the send context error IRQ is NOT enabled. Fix by using the 'last' value rather than the counted 'src' value to determine the final index to use. This satisfies all cases. Fixes: a2f7bbdc2dba ("IB/hfi1: Rework the IRQ API to be more flexible") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/hfi1: Missing return value in error path for user sdmaMichael J. Ruhl2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the set_txreq_header_agh() function returns an error, the exit path is chosen. In this path, the code fails to set the return value. This will cause the caller to not realize an error has occurred. Set the return value correctly in the error path. Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/hfi1: Right size user_sdma sequence numbers and related variablesMichael J. Ruhl2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hardware limits the maximum number of packets to u16 packets. Match that size for all relevant sequence numbers in the user_sdma engine. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/hfi1: Remove race conditions in user_sdma send pathMichael J. Ruhl2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Packet queue state is over used to determine SDMA descriptor availablitity and packet queue request state. cpu 0 ret = user_sdma_send_pkts(req, pcount); cpu 0 if (atomic_read(&pq->n_reqs)) cpu 1 IRQ user_sdma_txreq_cb calls pq_update() (state to _INACTIVE) cpu 0 xchg(&pq->state, SDMA_PKT_Q_ACTIVE); At this point pq->n_reqs == 0 and pq->state is incorrectly SDMA_PKT_Q_ACTIVE. The close path will hang waiting for the state to return to _INACTIVE. This can also change the state from _DEFERRED to _ACTIVE. However, this is a mostly benign race. Remove the racy code path. Use n_reqs to determine if a packet queue is active or not. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/hfi1: Eliminate races in the SDMA send error pathMichael J. Ruhl2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pq_update() can only be called in two places: from the completion function when the complete (npkts) sequence of packets has been submitted and processed, or from setup function if a subset of the packets were submitted (i.e. the error path). Currently both paths can call pq_update() if an error occurrs. This race will cause the n_req value to go negative, hanging file_close(), or cause a crash by freeing the txlist more than once. Several variables are used to determine SDMA send state. Most of these are unnecessary, and have code inspectible races between the setup function and the completion function, in both the send path and the error path. The request 'status' value can be set by the setup or by the completion function. This is code inspectibly racy. Since the status is not needed in the completion code or by the caller it has been removed. The request 'done' value races between usage by the setup and the completion function. The completion function does not need this. When the number of processed packets matches npkts, it is done. The 'has_error' value races between usage of the setup and the completion function. This can cause incorrect error handling and leave the n_req in an incorrect value (i.e. negative). Simplify the code by removing all of the unneeded state checks and variables. Clean up iovs node when it is freed. Eliminate race conditions in the error path: If all packets are submitted, the completion handler will set the completion status correctly (ok or aborted). If all packets are not submitted, the caller must wait until the submitted packets have completed, and then set the completion status. These two change eliminate the race condition in the error path. Reviewed-by: Mitko Haralanov <mitko.haralanov@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/hns: Fix an error code in hns_roce_v2_init_eq_table()Dan Carpenter2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The error code isn't set on this path. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/{hfi1, qib, rdmavt}: Schedule multi RC/UC packets instead of postingMichael J. Ruhl2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The post_send() path determines if it should post directly or, schedule the post for later. The current logic is: if the swqe ring is empty or (for hfi1) wqe->length <= piothreshold post the send else schedule This can allow large requests to call the send engine directly. Large requests can potentially produce a large number of packets prior to returning to the caller, blocking the caller from posting more requests, and allowing better parallel processing. Allow the driver(s) more say in this logic (pass call_send to the driver, rather than examining a return value). Update hfi1/qib logic to schedule the send engine if an RC or UC message is larger than the QP MTU size. Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | infiniband: remove redundant condition check before debugfs_removezhong jiang2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | debugfs_remove has taken the IS_ERR_OR_NULL into account. Just remove the unnecessary condition. Signed-off-by: zhong jiang <zhongjiang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Allow creating a matcher for a NIC TX flow tableMark Bloch2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently a matcher can only be created and attached to a NIC RX flow table. Extend it to allow it on NIC TX flow tables as well. In order to achieve that, we: 1) Expose a new attribute: MLX5_IB_ATTR_FLOW_MATCHER_FLOW_FLAGS. enum ib_flow_flags is used as valid flags. Only IB_FLOW_ATTR_FLAGS_EGRESS is supported. 2) Remove the requirement to have a DEVX or QP destination when creating a flow. A flow added to NIC TX flow table will forward the packet outside of the vport (Wire or E-Switch in the SR-iOV case). Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Add NIC TX namespace when getting a flow tableMark Bloch2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the ability to get a NIC TX flow table when using _get_flow_table(). This will allow to create a matcher and a flow rule on the NIC TX path. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Add flow actions support to raw create flowMark Bloch2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Support attaching flow actions to a flow rule via raw create flow. For now only NIC RX path is supported. This change requires to export flow resources management functions so we can maintain proper bookkeeping of flow actions. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Refactor raw flow creationMark Bloch2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move struct mlx5_flow_act to be passed from the method entry point, this will allow to add support for flow action for the raw create flow path. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Don't overwrite action if already setMark Bloch2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We support only a single action type per flow rule, in case the user passes the same type of flow actions fail the flow creation. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Refactor flow action parsing to be more genericMark Bloch2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make the parsing of flow actions more generic so it could be used by mlx5 raw create flow. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/uverbs: Move flow resources initializationMark Bloch2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use ib_set_flow() when initializing flow related resources. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | IB/uverbs: Add IDRs array attribute type to ioctl() interfaceGuy Levi2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Methods sometimes need to get a flexible set of IDRs and not a strict set as can be achieved today by the conventional IDR attribute. Add a new IDRS_ARRAY attribute to the generic uverbs ioctl layer. IDRS_ARRAY points to array of idrs of the same object type and same access rights, only write and read are supported. Signed-off-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>`` Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Enable attaching packet reformat action to steering flowsMark Bloch2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Any matching rules will be mutated based on the packet reformat context which is attached to that given flow rule. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Enable reformat on NIC RX if supportedMark Bloch2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A L3_TUNNEL_TO_L2 decap flow action requires to enable the encap bit on the flow table, enable it if supported. This will allow to attach those flow actions to NIC RX steering. We don't enable if running on a representor. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Enable attaching DECAP action to steering flowsMark Bloch2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Any matching packet will be stripped of it's VXLAN tunnel, only the inner L2 onward is left. The user will receive the decapsulated packet. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Enable decap and packet reformat on flow tablesMark Bloch2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If NIC RX flow tables support decap operation, enable it on creation, This allows to perform decapsulation of tunnelled packets by steering rules. If NIC TX flow tables support reformat operation, enable it on creation. We don't enable those capabilities on representors as the E-Switch should handle packet modification (can be configured via TC) and as current hardware can't handle both FDB and NIC flow tables with decap/packet reformat support. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Enable attaching modify header to steering flowsMark Bloch2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When creating a flow steering rule, allow the user to attach a modify header action. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/mlx5: Add NIC TX steering supportMark Bloch2018-09-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Just like ingress steering, allow a user to create steering rules that match egress vport traffic. We expose the same number of priorities as the bypass (NIC RX) steering. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Assign device ifindex before publishing the deviceParav Pandit2018-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Even though device->ifindex is assigned before adding the device in the list which is read by netlink flow, it is better to assign rdma device index before publishing the device in the system to users and clients. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Follow correct unregister order between sysfs and cgroupParav Pandit2018-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | During register_device() init sequence is, (a) register with rdma cgroup followed by (b) register with sysfs Therefore, unregister_device() sequence should follow the reverse order. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/umem: Restore lockdep check while downgrading lockLeon Romanovsky2018-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lockdep engine handles correctly downgrade of locks and it simply incorrect to disable lockdep checks prior to calling mmu_notifier. Remove lockdep_off and ensure locks correctness. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Define client_data_lock as rwlock instead of spinlockParav Pandit2018-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Even though device registration/unregistration and client registration/unregistration is not a performance path, define the client_data_lock as rwlock for code clarity. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Use simpler spin lock irq API from blocking contextParav Pandit2018-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | add_client_context(), ib_unregister_device() and ib_unregister_client() are designed to call from blocking context. There is no need to save and restore last interrupt state when called from such blocking context. Even though this is not a performance path, using the right spin lock API is desired for code clarity. To avoid checkpatch warning while removing flags, sizeof() is used. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Remove context entries from list while unregistering deviceParav Pandit2018-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While unregistering a device, remove the context elements from the list to not have any stale entries. With that any errors/bugs can be checked when device is freed. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: Use simplified list_for_eachParav Pandit2018-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While traversing client_data_list in following conditions, linked list is only read, no elements of the list are removed. Therefore, use list_for_each_entry(), instead of list_for_each_safe(). Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/core: No need to protect kfree with spin lock and semaphoreParav Pandit2018-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While unregistering a client, only context removal should be protected with lock. There is no need to protect a freeing of such context which is already removed from the list. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
| * | | RDMA/{cma, core}: Avoid callback on rdma_addr_cancel()Parav Pandit2018-09-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently rdma_addr_cancel() is an async operation, which notifies that cancel is done by executing the callback function given during rdma_resolve_ip(). If resolve_ip request is already completed than callback is not executed. Instead, now rdma_resolve_addr() and rdma_addr_cancel() simplified in following ways. 1. rdma_addr_cancel() now a synchronous method. If request was pending, after it is cancelled, no callback is notified. 2. rdma_resolve_addr() and respective addr_handler() callback doesn't need to hold reference to cm_id. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>