aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband
Commit message (Collapse)AuthorAge
* Merge branch 'for-next-merge' of ↵Linus Torvalds2012-01-18
|\ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending * 'for-next-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: ib_srpt: Initial SRP Target merge for v3.3-rc1
| * ib_srpt: Initial SRP Target merge for v3.3-rc1Bart Van Assche2011-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the kernel module ib_srpt SCSI RDMA Protocol (SRP) target implementation conforming to the SRP r16a specification for the mainline drivers/target infrastructure. This driver was originally developed by Vu Pham and has been optimized by Bart Van Assche and merged into upstream LIO based on his srpt-lio-4.1 branch here: https://github.com/bvanassche/srpt-lio/commits/srpt-lio-4.1/ This updated patch also contains the following two changes from lio-core-2.6.git/master. One is to fix a bug with 1 >= task->task_sg[] chained mappings in ib_srpt, and the other to convert the configfs control plane to reference IB Port GUID and struct srpt_port directly following mainline v4.x target_core_fabric_configfs.c convertion for ib_srpt to work with rtslib/rtsadmin v2 code. These seperate patches can be found here: ib_srpt: Fix bug with chainged SGLs in srpt_map_sg_to_ib_sge http://www.risingtidesystems.com/git/?p=lio-core-2.6.git;a=commitdiff;h=ea485147563b6555a97dbf811825fbb586519252 ib_srpt: Convert se_wwn endpoint reference to struct srpt_port->port_wwn http://www.risingtidesystems.com/git/?p=lio-core-2.6.git;a=commitdiff;h=4e544a210acb227df1bb4ca5086e65bdf4e648ea This also includes the following recent v1 -> v2 review changes: ib_srpt: Fix potential out-of-bounds array access ib_srpt: Avoid failed multipart RDMA transfers ib_srpt: Fix srpt_alloc_fabric_acl failure case return value ib_srpt: Update comments to reference $driver/$port layout ib_srpt: Fix sport->port_guid formatting code ib_srpt: Remove legacy use_port_guid_in_session_name module parameter ib_srpt: Convert srp_max_rdma_size into per port configfs attribute ib_srpt: Convert srp_max_rsp_size into per port configfs attribute ib_srpt: Convert srpt_sq_size into per port configfs attribute and v2 -> v3 review changes: ib_srpt: Fix possible race with srp_sq_size in srpt_create_ch_ib ib_srpt: Fix possible race with srp_max_rsp_size in srpt_release_channel_work ib_srpt: Fix up MAX_SRPT_RDMA_SIZE define ib_srpt: Make srpt_map_sg_to_ib_sge() failure case return -EAGAIN ib_srpt: Convert port_guid to use subnet_prefix + interface_id formatting ib_srpt: Make srpt_check_stop_free return kref_put status ib_srpt: Make compilation with BUG=n proceed` ib_srpt: Use new target_core_fabric.h include ib_srpt: Check hex2bin() return code to silence build warning Cc: Bart Van Assche <bvanassche@acm.org> Cc: Roland Dreier <roland@purestorage.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Vu Pham <vu@mellanox.com> Cc: David Dillow <dillowda@ornl.gov> Signed-off-by: Nicholas A. Bellinger <nab@risingtidesystems.com>
* | module_param: make bool parameters really bool (drivers & misc)Rusty Russell2012-01-12
| | | | | | | | | | | | | | | | | | | | | | | | module_param(bool) used to counter-intuitively take an int. In fddd5201 (mid-2009) we allowed bool or int/unsigned int using a messy trick. It's time to remove the int/unsigned int option. For this version it'll simply give a warning, but it'll break next kernel version. Acked-by: Mauro Carvalho Chehab <mchehab@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | module_param: avoid bool abuse, add bint for special cases.Rusty Russell2012-01-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For historical reasons, we allow module_param(bool) to take an int (or an unsigned int). That's going away. A few drivers really want an int: they set it to -1 and a parameter will set it to 0 or 1. This sucks: reading them from sysfs will give 'Y' for both -1 and 1, but if we change it to an int, then the users might be broken (if they did "param" instead of "param=1"). Use a new 'bint' parser for them. (ntfs has a different problem: it needs an int for debug_msgs because it's also exposed via sysctl.) Cc: Steve Glendinning <steve.glendinning@smsc.com> Cc: Jean Delvare <khali@linux-fr.org> Cc: Guenter Roeck <guenter.roeck@ericsson.com> Cc: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Cc: Christoph Raisch <raisch@de.ibm.com> Cc: Roland Dreier <roland@kernel.org> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: linux390@de.ibm.com Cc: Anton Altaparmakov <anton@tuxera.com> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.de> Cc: lm-sensors@lm-sensors.org Cc: linux-rdma@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linux-ntfs-dev@lists.sourceforge.net Cc: alsa-devel@alsa-project.org Acked-by: Takashi Iwai <tiwai@suse.de> (For the sound part) Acked-by: Guenter Roeck <guenter.roeck@ericsson.com> (For the hwmon driver) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | Merge tag 'infiniband-for-linus' of ↵Linus Torvalds2012-01-08
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband infiniband changes for 3.3 merge window * tag 'infiniband-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: rdma/core: Fix sparse warnings RDMA/cma: Fix endianness bugs RDMA/nes: Fix terminate during AE RDMA/nes: Make unnecessarily global nes_set_pau() static RDMA/nes: Change MDIO bus clock to 2.5MHz IB/cm: Fix layout of APR message IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoE IB/qib: Default some module parameters optimally IB/qib: Optimize locking for get_txreq() IB/qib: Fix a possible data corruption when receiving packets IB/qib: Eliminate 64-bit jiffies use IB/qib: Fix style issues IB/uverbs: Protect QP multicast list
| | \
| | \
| | \
| | \
| | \
| | \
| | \
| | \
| *-------. \ Merge branches 'cma', 'misc', 'mlx4', 'nes', 'qib' and 'uverbs' into for-nextRoland Dreier2012-01-04
| |\ \ \ \ \ \
| | | | | | * | IB/uverbs: Protect QP multicast listEli Cohen2012-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Userspace verbs multicast attach/detach operations on a QP are done while holding the rwsem of the QP for reading. That's not sufficient since a reader lock allows more than one reader to acquire the lock. However, multicast attach/detach does list manipulation that can corrupt the list if multiple threads run in parallel. Fix this by acquiring the rwsem as a writer to serialize attach/detach operations. Add idr_write_qp() and put_qp_write() to encapsulate this. This fixes oops seen when running applications that perform multicast joins/leaves. Reported by: Mike Dubman <miked@mellanox.com> Signed-off-by: Eli Cohen <eli@mellanox.com> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | | | * | | IB/qib: Default some module parameters optimallyMike Marciniszyn2012-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Minimize the need for users to have to set module parameters to get good performance. The following two parameters are changed: - rcvhdrcnt to twice the rcvegrcnt - pcie_caps=0x51 The rcvhdrcnt at twice the egrcount allows the preemptive NAK code during reception to function in 100% of the cases rather than a sender jiffies-based timeout. The pcie_caps default of 0x51 will set the proposed MaxPayload and MaxReceiveReqest to 256 and 4096 respectively. The capabilities on the root complex will be used to limit those values. Reviewed-by: Ram Vepa <ram.vepa@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | | | * | | IB/qib: Optimize locking for get_txreq()Mike Marciniszyn2012-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current code locks the QP s_lock, followed by the pending_lock, I guess to to protect against the allocate failing. This patch only locks the pending_lock, assuming that the empty case is an exeception, in which case the pending_lock is dropped, and the original code is executed. This will save a lock of s_lock in the normal case. The observation is that the sdma descriptors will deplete at twice the rate of txreq's, so this should be rare. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | | | * | | IB/qib: Fix a possible data corruption when receiving packetsRam Vepa2012-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prevent a receive data corruption by ensuring that the write to update the rcvhdrheadn register to generate an interrupt is at the very end of the receive processing. Signed-off-by: Ramkrishna Vepa <ram.vepa@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | | | * | | IB/qib: Eliminate 64-bit jiffies useMike Marciniszyn2012-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The qib driver makes use of the the 64-bit jiffies API. Code inspection reveals that that version of the API is not really required. This patch converts to use the "normal" jiffies. Reviewed-by: Ram Vepa <ram.vepa@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | | | * | | IB/qib: Fix style issuesMike Marciniszyn2012-01-03
| | | | | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | More style issues revealed with checkpatch.pl -f. Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | | * | | RDMA/nes: Fix terminate during AETatyana Nikolova2012-01-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix for reset which happens right after sending a terminate message. Terminate timer is not deleted when the connection is closed. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Faisal Latif <Faisal.Latif@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | | * | | RDMA/nes: Make unnecessarily global nes_set_pau() staticTatyana Nikolova2012-01-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Warned about by sparse. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Faisal Latif <Faisal.Latif@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | | * | | RDMA/nes: Change MDIO bus clock to 2.5MHzTatyana Nikolova2012-01-04
| | | | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the PHY clock divisor to make the MDIO clock 2.5MHz, instead of 3.5MHz (which is out of spec). Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Faisal Latif <Faisal.Latif@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | * / / IB/mlx4: Fix SL to 802.1Q priority-bits mapping for IBoEOr Gerlitz2012-01-04
| | | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For IBoE, SLs 0-7 are mapped to Ethernet 802.1Q user priority bits (pbits) which are part of the VLAN tag, SLs 8-15 are reserved. Under Ethernet, the ConnectX firmware treats (decode/encode) the four bit SL field in various constructs such as QPC / UD WQE / CQE as PPP0 and not as 0PPP. This correlates well to the fact that within the vlan tag the pbits are located in bits 15-13 and not 12-14. The current code wasn't consistent around that area - the encoding was correct for the IBoE QPC.path.schedule_queue field, but was wrong for IBoE CQEs and when MLX header was built. These inconsistencies resulted in wrong SL <--> wire 802.1Q pbits mapping, which is fixed by using SL <--> PPP0 all around the place. Signed-off-by: Oren Duer <oren@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | * / / IB/cm: Fix layout of APR messageEli Cohen2012-01-04
| | |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a missing 16-bit reserved field between ap_status and info fields. Signed-off-by: Eli Cohen <eli@mellanox.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | | rdma/core: Fix sparse warningsSean Hefty2012-01-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean up sparse warnings in the rdma core layer. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | | RDMA/cma: Fix endianness bugsSean Hefty2012-01-04
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix endianness bugs reported by sparse in the RDMA core stack. Note that these are real bugs, but don't affect any existing code to the best of my knowledge. The mlid issue would only affect kernel users of rdma_join_multicast which have the rdma_cm attach/detach its QP. There are no current in tree users that do this. (rdma_join_multicast may be used called by user space applications, which does not have this issue.) And the pkey setting is simply returned as informational. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* | | Merge branch 'for-linus2' of ↵Linus Torvalds2012-01-08
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits) reiserfs: Properly display mount options in /proc/mounts vfs: prevent remount read-only if pending removes vfs: count unlinked inodes vfs: protect remounting superblock read-only vfs: keep list of mounts for each superblock vfs: switch ->show_options() to struct dentry * vfs: switch ->show_path() to struct dentry * vfs: switch ->show_devname() to struct dentry * vfs: switch ->show_stats to struct dentry * switch security_path_chmod() to struct path * vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb vfs: trim includes a bit switch mnt_namespace ->root to struct mount vfs: take /proc/*/mounts and friends to fs/proc_namespace.c vfs: opencode mntget() mnt_set_mountpoint() vfs: spread struct mount - remaining argument of next_mnt() vfs: move fsnotify junk to struct mount vfs: move mnt_devname vfs: move mnt_list to struct mount vfs: switch pnode.h macros to struct mount * ...
| * | | infiniband: umode_t noise, including open-coded S_ISDIR()Al Viro2012-01-03
| | | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | switch ->is_visible() to returning umode_tAl Viro2012-01-03
| | | | | | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | | switch device_get_devnode() and ->devnode() to umode_t *Al Viro2012-01-03
| |/ / | | | | | | | | | | | | | | | | | | both callers of device_get_devnode() are only interested in lower 16bits and nobody tries to return anything wider than 16bit anyway. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2011-12-23
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: net/bluetooth/l2cap_core.c Just two overlapping changes, one added an initialization of a local variable, and another change added a new local variable. Signed-off-by: David S. Miller <davem@davemloft.net>
| | \
| | \
| *-. \ Merge branches 'cma', 'mlx4' and 'qib' into for-nextRoland Dreier2011-12-19
| |\ \ \
| | | * | IB/qib: Correct sense on freectxts increment and decrementMike Marciniszyn2011-12-19
| | | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 53ab1c64983 ("IB/qib: Correct nfreectxts for multiple HCAs") reversed the increments and decrements of dd->nfreectxts. Fix it. Reviewed-by: Ram Vepa <ram.vepa@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | * / IB/mlx4: Fix shutdown crash accessing a non-existent bitmapRoland Dreier2011-12-06
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit cfcde11c3d7a ("IB/mlx4: Use flow counters on IBoE ports") added code that sets elements of counters[] to -1 if no counter is allocated, but then goes ahead and passes every entry to mlx4_counter_free() on shutdown. This is a bad idea, especially if MLX4_DEV_CAP_FLAG_COUNTERS isn't set so there isn't even an underlying bitmap to free from. Tested-by: Sean Hefty <sean.hefty@intel.com> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
| * / RDMA/cma: Verify private data lengthSean Hefty2011-12-19
| |/ | | | | | | | | | | | | | | | | | | | | | | | | private_data_len is defined as a u8. If the user specifies a large private_data size (> 220 bytes), we will calculate a total length that exceeds 255, resulting in private_data_len wrapping back to 0. This can lead to overwriting random kernel memory. Avoid this by verifying that the resulting size fits into a u8. Reported-by: B. Thery <benjamin.thery@bull.net> Addresses: <http://bugs.openfabrics.org/bugzilla/show_bug.cgi?id=2335> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* | mlx4_ib: disable SRIOV mode for IB ports (not yet supported)Jack Morgenstein2011-12-13
| | | | | | | | | | Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | mlx4_core: Add "native" argument to mlx4_cmd and its callers (where needed)Jack Morgenstein2011-12-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For SRIOV, some Hypervisor commands can be executed directly (native = 1). Others should go through the command wrapper flow (for tracking resource usage, for example, or for changing some HCA configurations that slaves need to be notified of). This patch sets the groundwork for this capability -- adding the correct value of "native" in each case. Note that if SRIOV is not activated, this parameter has no effect. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | mlx4: Extanding port_mask functionalityJack Morgenstein2011-12-13
| | | | | | | | | | | | | | | | | | | | | | | | Port mask now has additional state. Port can be set as "none". In this case neither the mlx4_en or mlx4_ib drivers take ownership of the port. In multifunction mode there is an option to set the vfs as single ported devices. (in single function mode, both physical ports belong to same function) Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>
* | infiniband: ipoib: Sanitize neighbour handling in ipoib_main.cDavid Miller2011-12-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Reduce the number of dst_get_neighbour_noref() calls within a single call chain. Primarily by passing the neighbour pointer down to the helper functions. Handle dst_get_neighbour_noref() returning NULL in ipoib_start_xmit() by incrementing the dropped counter and freeing the packet. We don't want it to fall through into the ARP/RARP/multicast handling, since that should only happen when skb_dst() is NULL. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Roland Dreier <roland@purestorage.com>
* | infiniband: cxgb4: Consolidate 3 copies of the same operation into 1 helper ↵David Miller2011-12-05
| | | | | | | | | | | | | | | | | | | | | | | | function. Three pieces of code do the same thing, create a l2t entry and then import this information into the c4iw_ep object. Create a helper function and call it from these 3 locations instead. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Roland Dreier <roland@purestorage.com>
* | infiniband: nes: Use dst's neighbour entry.David Miller2011-12-05
| | | | | | | | | | | | | | Do this instead of performing a by-hand lookup. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Roland Dreier <roland@purestorage.com>
* | cxgb3: Rework t3_l2t_get to take a dst_entry instead of a neighbour.David Miller2011-12-05
| | | | | | | | | | | | | | | | This way we consolidate the RCU locking down into the place where it actually matters, and also we can make the code handle dst_get_neighbour_noref() returning NULL properly. Signed-off-by: David S. Miller <davem@davemloft.net>
* | infiniband: addr: Consolidate code to fetch neighbour hardware address from dst.David Miller2011-12-05
| | | | | | | | | | | | | | | | | | | | | | | | IPV4 should do exactly what the IPV6 code does here, which is use the neighbour obtained via the dst entry. And now that the two code paths do the same thing, use a common helper function to perform the operation. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Roland Dreier <roland@purestorage.com>
* | net: Rename dst_get_neighbour{, _raw} to dst_get_neighbour_noref{, _raw}.David Miller2011-12-05
| | | | | | | | | | | | | | | | To reflect the fact that a refrence is not obtained to the resulting neighbour entry. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Roland Dreier <roland@purestorage.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2011-12-02
|\|
| *---. Merge branches 'cxgb4', 'ipoib', 'misc' and 'qib' into for-nextRoland Dreier2011-11-29
| |\ \ \
| | | | * IB/qib: Fix over-scheduling of QSFP workMike Marciniszyn2011-11-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Don't over-schedule QSFP work on driver initialization. It could end up being run simultaneously on two different CPUs resulting in bad EEPROM reads. In combination with setting the physical IB link state prior to the IBC being brought out of reset, this can cause the link state machine to start training early with wrong settings. Signed-off-by: Mitko Haralanov <mitko@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | | * IB/qib: Don't use schedule_work()Mike Marciniszyn2011-11-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It was mistakenly introduced by dde05cbdf8b1 ("IB/qib: Hold links until tuning data is available"). Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | * | IB: Fix RCU lockdep splatsEric Dumazet2011-11-29
| | | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit f2c31e32b37 ("net: fix NULL dereferences in check_peer_redir()") forgot to take care of infiniband uses of dst neighbours. Many thanks to Marc Aurele who provided a nice bug report and feedback. Reported-by: Marc Aurele La France <tsi@ualberta.ca> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: <stable@kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | * / IB/ipoib: Prevent hung task or softlockup processing multicast responseMike Marciniszyn2011-11-29
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This following can occur with ipoib when processing a multicast reponse: BUG: soft lockup - CPU#0 stuck for 67s! [ib_mad1:982] Modules linked in: ... CPU 0: Modules linked in: ... Pid: 982, comm: ib_mad1 Not tainted 2.6.32-131.0.15.el6.x86_64 #1 ProLiant DL160 G5 RIP: 0010:[<ffffffff814ddb27>] [<ffffffff814ddb27>] _spin_unlock_irqrestore+0x17/0x20 RSP: 0018:ffff8802119ed860 EFLAGS: 00000246 0000000000000004 RBX: ffff8802119ed860 RCX: 000000000000a299 RDX: ffff88021086c700 RSI: 0000000000000246 RDI: 0000000000000246 RBP: ffffffff8100bc8e R08: ffff880210ac229c R09: 0000000000000000 R10: ffff88021278aab8 R11: 0000000000000000 R12: ffff8802119ed860 R13: ffffffff8100be6e R14: 0000000000000001 R15: 0000000000000003 FS: 0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00000000006d4840 CR3: 0000000209aa5000 CR4: 00000000000406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Call Trace: [<ffffffffa032c247>] ? ipoib_mcast_send+0x157/0x480 [ib_ipoib] [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 [<ffffffff8100bc8e>] ? apic_timer_interrupt+0xe/0x20 [<ffffffffa03283d4>] ? ipoib_path_lookup+0x124/0x2d0 [ib_ipoib] [<ffffffffa03286fc>] ? ipoib_start_xmit+0x17c/0x430 [ib_ipoib] [<ffffffff8141e758>] ? dev_hard_start_xmit+0x2c8/0x3f0 [<ffffffff81439d0a>] ? sch_direct_xmit+0x15a/0x1c0 [<ffffffff81423098>] ? dev_queue_xmit+0x388/0x4d0 [<ffffffffa032d6b7>] ? ipoib_mcast_join_finish+0x2c7/0x510 [ib_ipoib] [<ffffffffa032dab8>] ? ipoib_mcast_sendonly_join_complete+0x1b8/0x1f0 [ib_ipoib] [<ffffffffa02a0946>] ? mcast_work_handler+0x1a6/0x710 [ib_sa] [<ffffffffa015f01e>] ? ib_send_mad+0xfe/0x3c0 [ib_mad] [<ffffffffa00f6c93>] ? ib_get_cached_lmc+0xa3/0xb0 [ib_core] [<ffffffffa02a0f9b>] ? join_handler+0xeb/0x200 [ib_sa] [<ffffffffa029e4fc>] ? ib_sa_mcmember_rec_callback+0x5c/0xa0 [ib_sa] [<ffffffffa029e79c>] ? recv_handler+0x3c/0x70 [ib_sa] [<ffffffffa01603a4>] ? ib_mad_completion_handler+0x844/0x9d0 [ib_mad] [<ffffffffa015fb60>] ? ib_mad_completion_handler+0x0/0x9d0 [ib_mad] [<ffffffff81088830>] ? worker_thread+0x170/0x2a0 [<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40 [<ffffffff810886c0>] ? worker_thread+0x0/0x2a0 [<ffffffff8108ddf6>] ? kthread+0x96/0xa0 [<ffffffff8100c1ca>] ? child_rip+0xa/0x20 Coinciding with stack trace is the following message: ib0: ib_address_create failed The code below in ipoib_mcast_join_finish() will note the above failure in the address handle but otherwise continue: ah = ipoib_create_ah(dev, priv->pd, &av); if (!ah) { ipoib_warn(priv, "ib_address_create failed\n"); } else { The while loop at the bottom of ipoib_mcast_join_finish() will attempt to send queued multicast packets in mcast->pkt_queue and eventually end up in ipoib_mcast_send(): if (!mcast->ah) { if (skb_queue_len(&mcast->pkt_queue) < IPOIB_MAX_MCAST_QUEUE) skb_queue_tail(&mcast->pkt_queue, skb); else { ++dev->stats.tx_dropped; dev_kfree_skb_any(skb); } My read is that the code will requeue the packet and return to the ipoib_mcast_join_finish() while loop and the stage is set for the "hung" task diagnostic as the while loop never sees a non-NULL ah, and will do nothing to resolve. There are GFP_ATOMIC allocates in the provider routines, so this is possible and should be dealt with. The test that induced the failure is associated with a host SM on the same server during a shutdown. This patch causes ipoib_mcast_join_finish() to exit with an error which will flush the queued mcast packets. Nothing is done to unwind the QP attached state so that subsequent sends from above will retry the join. Reviewed-by: Ram Vepa <ram.vepa@qlogic.com> Reviewed-by: Gary Leshner <gary.leshner@qlogic.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | RDMA/cxgb4: Fix retry with MPAv1 logic for MPAv2Kumar Sanghvi2011-11-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fix logic so that we don't retry with MPAv1 once we have done that already. Otherwise, we end up retrying with MPAv1 even when its not needed on getting peer aborts - and this could lead to kernel panic. Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | RDMA/cxgb4: Fix iw_cxgb4 count_rcqes() logicJonathan Lallinger2011-11-28
| |/ | | | | | | | | | | | | | | | | | | Fix another place in the code where logic dealing with the t4_cqe was using the wrong QID. This fixes the counting logic so that it tests against the SQ QID instead of the RQ QID when counting RCQES. Signed-off by: Jonathan Lallinger <jonathan@ogc.us> Signed-off by: Steve Wise <swise@ogc.us> Signed-off-by: Roland Dreier <roland@purestorage.com>
* | neigh: Add infrastructure for allocating device neigh privates.David Miller2011-11-30
| | | | | | | | | | | | | | | | | | netdev->neigh_priv_len records the private area length. This will trigger for neigh_table objects which set tbl->entry_size to zero, and the first instances of this will be forthcoming. Signed-off-by: David S. Miller <davem@davemloft.net>
* | net: remove ipv6_addr_copy()Alexey Dobriyan2011-11-22
| | | | | | | | | | | | | | C assignment can handle struct in6_addr copying. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | infiniband: Update net drivers for netdev_features_t changes.David S. Miller2011-11-16
|/ | | | Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'modsplit-Oct31_2011' of ↵Linus Torvalds2011-11-06
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits) Revert "tracing: Include module.h in define_trace.h" irq: don't put module.h into irq.h for tracking irqgen modules. bluetooth: macroize two small inlines to avoid module.h ip_vs.h: fix implicit use of module_get/module_put from module.h nf_conntrack.h: fix up fallout from implicit moduleparam.h presence include: replace linux/module.h with "struct module" wherever possible include: convert various register fcns to macros to avoid include chaining crypto.h: remove unused crypto_tfm_alg_modname() inline uwb.h: fix implicit use of asm/page.h for PAGE_SIZE pm_runtime.h: explicitly requires notifier.h linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h miscdevice.h: fix up implicit use of lists and types stop_machine.h: fix implicit use of smp.h for smp_processor_id of: fix implicit use of errno.h in include/linux/of.h of_platform.h: delete needless include <linux/module.h> acpi: remove module.h include from platform/aclinux.h miscdevice.h: delete unnecessary inclusion of module.h device_cgroup.h: delete needless include <linux/module.h> net: sch_generic remove redundant use of <linux/module.h> net: inet_timewait_sock doesnt need <linux/module.h> ... Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in - drivers/media/dvb/frontends/dibx000_common.c - drivers/media/video/{mt9m111.c,ov6650.c} - drivers/mfd/ab3550-core.c - include/linux/dmaengine.h
| * infiniband: add moduleparam.h to drivers/infiniband as requiredPaul Gortmaker2011-10-31
| | | | | | | | | | | | | | | | These files were getting the moduleparam infrastructure from the implicit presence of module.h being everywhere, but that is going away soon. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>