aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* Merge branch 'for-linus' of ↵Linus Torvalds2010-03-03
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (48 commits) IB/srp: Clean up error path in srp_create_target_ib() IB/srp: Split send and recieve CQs to reduce number of interrupts RDMA/nes: Add support for KR device id 0x0110 IB/uverbs: Use anon_inodes instead of private infinibandeventfs IB/core: Fix and clean up ib_ud_header_init() RDMA/cxgb3: Mark RDMA device with CXIO_ERROR_FATAL when removing RDMA/cxgb3: Don't allocate the SW queue for user mode CQs RDMA/cxgb3: Increase the max CQ depth RDMA/cxgb3: Doorbell overflow avoidance and recovery IB/core: Pack struct ib_device a little tighter IB/ucm: Clean whitespace errors IB/ucm: Increase maximum devices supported IB/ucm: Use stack variable 'base' in ib_ucm_add_one IB/ucm: Use stack variable 'devnum' in ib_ucm_add_one IB/umad: Clean whitespace IB/umad: Increase maximum devices supported IB/umad: Use stack variable 'base' in ib_umad_init_port IB/umad: Use stack variable 'devnum' in ib_umad_init_port IB/umad: Remove port_table[] IB/umad: Convert *cdev to cdev in struct ib_umad_port ...
| * Merge branch 'misc' into for-nextRoland Dreier2010-03-02
| |\ | | | | | | | | | | | | Conflicts: drivers/infiniband/core/uverbs_main.c
| | * IB/uverbs: Use anon_inodes instead of private infinibandeventfsRoland Dreier2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The anon_inodes interface has been split to allow creating a bare (non-installed) file pointer and also extended to allow specifying O_RDONLY in the flags. This makes it a suitable replacement for the private "infinibandeventfs" pseudo-filesystem used by uverbs, and this replacement saves a small chunk of boilerplate code. Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * IB/core: Fix and clean up ib_ud_header_init()Eli Cohen2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ib_ud_header_init() first clears header and then fills up the various fields. Later on, it tests header->immediate_present, which it has already cleared, so the condition is always false. Fix this by adding an immediate_present parameter and setting header->immediate_present as is done with grh_present. Also remove unused calculation of header_len. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * RDMA: Use rlimit helpersJiri Slaby2010-02-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure compiler won't do weird things with limits by using the rlimit helpers added in 3e10e716 ("resource: add helpers for fetching rlimits"). E.g. fetching them twice may return 2 different values after writable limits are implemented. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | Merge branch 'srp' into for-nextRoland Dreier2010-03-02
| |\ \
| | * | IB/srp: Clean up error path in srp_create_target_ib()Roland Dreier2010-03-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of repeating the error unwinding steps in each place an error can be detected, use the common idiom of gotos into an error flow. Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | IB/srp: Split send and recieve CQs to reduce number of interruptsBart Van Assche2010-03-02
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can reduce the number of IB interrupts from two interrupts per srp_queuecommand() call to one by using separate CQs for send and receive completions and processing send completions by polling every time a TX IU is allocated. Receive completion events still trigger an interrupt. Signed-off-by: Bart Van Assche <bart.vanassche@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | Merge branch 'nes' into for-nextRoland Dreier2010-03-02
| |\ \
| | * | RDMA/nes: Add support for KR device id 0x0110Chien Tung2010-02-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for KR device id 0x0110. While at it, cleanup nes_init_phy() by splitting it into nes_init_1g_phy() and nes_init_2025_phy(). Remove support for NES_PHY_TYPE_IRIS, which was used on an XFP board that was only manufactured in small quantities and given out for evals in even smaller quantities. Signed-off-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | RDMA/nes: Change WQ overflow return codeOr Gerlitz2010-02-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the nes driver to return -ENOMEM on SQ/RQ overflow to match the return code of other RDMA HW drivers (e.g cxgb3, ehca, mlx4, mthca). Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Acked-by: Chien Tung <chien.tin.tung@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | RDMA/nes: Multiple disconnects cause crash during AE handlingFaisal Latif2010-02-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a double disconnect during AE processing, causing crashes. While fixing the crash, also simplify the AE handling code. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | RDMA/nes: Fix crash when listener destroyed during loopback setupFaisal Latif2010-02-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a listener is destroyed and there is an MPA response pending for loopback connection, the active side cm_node gets destroyed twice: once in cm_event_connect_error() and again in nes_accept()/nes_reject(). Increment the cm_node's refcount so it's not destroyed by cm_event_connect_error(). Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | RDMA/nes: Use atomic counters for CM listener create and destroyFaisal Latif2010-02-19
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | After running long iterative MPI tests, sometimes ethtool reports a "CM Destroy Listener" count more than the "CM Create Listener" count. This inconsistency is fixed by making counter variables atomic. Signed-off-by: Faisal Latif <faisal.latif@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | Merge branch 'mlx4' into for-nextRoland Dreier2010-03-02
| |\ \
| | * | IB/mlx4: Simplify retrieval of ib_deviceEli Cohen2010-02-12
| | |/ | | | | | | | | | | | | | | | | | | | | | struct ib_qp already holds a pointer to the ib device. No need to dive to the hw device object to retrieve it. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | Merge branch 'iser' into for-nextRoland Dreier2010-03-02
| |\ \
| | * | IB/iser: Remove redundant locking from iser scsi command response flowOr Gerlitz2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently the iSER receive completion flow takes the session lock twice. Optimize it to avoid the first one by letting iser_task_rdma_finalize() be called only from the cleanup_task callback invoked by iscsi_free_task, thus reducing the contention on the session lock between the scsi command submission to the scsi command completion flows. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | IB/iser: Use libiscsi passthrough modeOr Gerlitz2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | libiscsi passthrough mode invokes the transport xmit calls directly without first going through an internal queue, unlike the other mode, which uses a queue and a xmitworker thread. Now that the "cant_sleep" prerequisite of iscsi_host_alloc is met, move to use it. Handling xmit errors is now done by the passthrough flow of libiscsi. Since the queue/worker aren't used in this mode, the code that schedules the xmitworker is removed. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | IB/iser: Remove unnecessary connection checksOr Gerlitz2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove unnecessary checks for the IB connection state and for QP overflow, as conn state changes are reported by iSER to libiscsi and handled there. QP overflow is theoretically possible only when unsolicited data-outs are used; anyway it's being checked and handled by HW drivers. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | IB/iser: Use atomic allocationsOr Gerlitz2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Two minor flows in iSER's data path still use allocations; move them to be atomic as a preperation step towards moving to use libiscsi passthrough mode. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | IB/iser: Simplify send flow/descriptorsOr Gerlitz2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simplify and shrink the logic/code used for the send descriptors. Changes include removing struct iser_dto (an unnecessary abstraction), using struct iser_regd_buf only for handling SCSI commands, using dma_sync instead of dma_map/unmap, etc. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | IB/iser: Use different CQ for send completionsOr Gerlitz2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use a different CQ for send completions, where send completions are polled by the interrupt-driven receive completion handler. Therefore, interrupts aren't used for the send CQ. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | IB/iser: Remove atomic counter for posted receive buffersOr Gerlitz2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that both the posting and reaping of receive buffers is done in the completion path, the counter of outstanding buffers not be atomic. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | IB/iser: New receive buffer posting logicOr Gerlitz2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the recv buffer posting logic is based on the transactional nature of iSER which allows for posting a buffer before sending a PDU. Change this to post only when the number of outstanding recv buffers is below a water mark and in a batched manner, thus simplifying and optimizing the data path. Use a pre-allocated ring of recv buffers instead of allocating from kmem cache. A special treatment is given to the login response buffer whose size must be 8K unlike the size of buffers used for any other purpose which is 128 bytes. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | IB/iser: Revert commit bba7ebb "avoid recv buffer exhaustion"Or Gerlitz2010-02-24
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | We will make a major change in the recv buffer posting logic, after which the problem commit bba7ebb "avoid recv buffer exhaustion caused by unexpected PDUs" comes to solve doesn't exist any more, so revert it. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | Merge branch 'ipoib' into for-nextRoland Dreier2010-03-02
| |\ \
| | * | IPoIB: Remove TX moderation settings from ethtool supportOr Gerlitz2010-02-11
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As of commit f56bcd8 ("IPoIB: Use separate CQ for UD send completions"), there are no TX interrupts. Change the ethtool code not to report TX moderation settings, so users will not be misled to think they can control TX interrupt moderation. Pointed out by Alex Vainman <alexv@voltaire.com> Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | Merge branch 'ehca' into for-nextRoland Dreier2010-03-02
| |\ \
| | * | IB/ehca: Require in_wc in process_mad()Alexander Schmidt2010-02-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the caller does not pass a valid in_wc to process_mad(), return MAD failure status, as it is not possible to generate a valid MAD redirect response (and redirects are the only MAD responses ehca generates). Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | IB/ehca: Allow access for ib_query_qp()Alexander Schmidt2010-02-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The max_dest_rd_atomic and max_qp_rd_atomic values are properly returned by query_qp(), so there should not be an error returned when they are queried. Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | IB/ehca: Do not turn off irqs in tasklet contextAlexander Schmidt2010-02-12
| | |/ | | | | | | | | | | | | | | | | | | | | | The irq_spinlock is only taken in tasklet context, so it is safe not to disable hardware interrupts. Signed-off-by: Alexander Schmidt <alexs@linux.vnet.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | Merge branch 'cxgb3' into for-nextRoland Dreier2010-03-02
| |\ \
| | * | RDMA/cxgb3: Mark RDMA device with CXIO_ERROR_FATAL when removingSteve Wise2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If cxgb3 calls the iw_cxgb3 t3cclient remove function due to a device removal event, then the iwch device must be marked with CXIO_ERROR_FATAL since the device below us is going away. Otherwise, we can get stuck in a deadlock as RDMA ULPs try and deallocate objects (like MRs, QPs, etc). So always mark the device with CXIO_ERROR_FATAL when removing. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | RDMA/cxgb3: Don't allocate the SW queue for user mode CQsSteve Wise2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Only kernel mode CQs need the SW queue memory allocated. The SW queue for user mode CQs is allocated in userspace by libcxgb3. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | RDMA/cxgb3: Increase the max CQ depthSteve Wise2010-02-24
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | RDMA/cxgb3: Doorbell overflow avoidance and recoverySteve Wise2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | T3 hardware doorbell FIFO overflows can cause application stalls due to lost doorbell ring events. This has been seen when running large NP IMB alltoall MPI jobs. The T3 hardware supports an xon/xoff-type flow control mechanism to help avoid overflowing the HW doorbell FIFO. This patch uses these interrupts to disable RDMA QP doorbell rings when we near an overflow condition, and then turn them back on (and ring all the active QP doorbells) when when the doorbell FIFO empties out. In addition if an doorbell ring is dropped by the hardware, the code will now recover. Design: cxgb3: - enable these DB interrupts - in the interrupt handler, schedule work tasks to call the ULPs event handlers with the new events. - ring all the qset txqs when an overflow is detected. iw_cxgb3: - disable db ringing on all active qps when we get the DB_FULL event - enable db ringing on all active qps and ring all active dbs when we get the DB_EMPTY event - On DB_DROP event: - disable db rings in the event handler - delay-schedule a work task which rings and enables the dbs on all active qps. - in post_send and post_recv logic, don't ring the db if it's disabled. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| | * | RDMA/cxgb3: Remove BUG_ON() on CQ rearm failureSteve Wise2010-02-11
| | |/ | | | | | | | | | | | | | | | | | | | | | Failure to rearm a CQ means the cxgb3 device is wedged, but we shouldn't kill the whole system with a BUG_ON() if this happens. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | Merge branch 'cma' into for-nextRoland Dreier2010-03-02
| |\ \
| | * | RDMA/cm: Remove unused definition of RDMA_PS_SCTPSean Hefty2010-02-11
| | |/ | | | | | | | | | | | | | | | | | | | | | The defined SCTP number is incorrect (0x83, rather than 0x84), and since it is not used anywhere, simply remove the definition. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/core: Pack struct ib_device a little tighterAlexander Chiang2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | A small change to reduce the size of ib_device to 1112 bytes (from 1128). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/ucm: Clean whitespace errorsAlexander Chiang2010-02-24
| | | | | | | | | | | | | | | | | | | | | As shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/ucm: Increase maximum devices supportedAlexander Chiang2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some large systems may support more than IB_UCM_MAX_DEVICES (currently 32). This change allows us to support more devices in a backwards-compatible manner. the first IB_UCM_MAX_DEVICES keep the same major/minor device numbers they've always had. If there are more than IB_UCM_MAX_DEVICES, then we dynamically request a new major device number (new minors start at 0). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/ucm: Use stack variable 'base' in ib_ucm_add_oneAlexander Chiang2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UCM_MAX_DEVICES. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/ucm: Use stack variable 'devnum' in ib_ucm_add_oneAlexander Chiang2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | This change is not useful by itself, but sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UCM_MAX_DEVICES in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/umad: Clean whitespaceAlexander Chiang2010-02-24
| | | | | | | | | | | | | | | | | | | | | Clean errors as shown when 'let c_space_errors=1' is set in vim. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/umad: Increase maximum devices supportedAlexander Chiang2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some large systems may support more than IB_UMAD_MAX_PORTS (currently 64). This change allows us to support more ports in a backwards-compatible manner. The first IB_UMAD_MAX_PORTS keep the same major/minor device numbers they've always had. If there are more than IB_UMAD_MAX_PORTS, we then dynamically request a new major device number (new minors start at 0). Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/umad: Use stack variable 'base' in ib_umad_init_portAlexander Chiang2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | This change is not useful by itself, but sets us up for a future change that allows us to support more than IB_UMAD_MAX_PORTS in a system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/umad: Use stack variable 'devnum' in ib_umad_init_portAlexander Chiang2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | This change is not useful by itself, but sets us up for a future change that allows us to dynamically allocate device numbers in case we have more than IB_UMAD_MAX_PORTS in the system. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * | IB/umad: Remove port_table[]Alexander Chiang2010-02-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We no longer need this data structure, as it was used to associate an inode back to a struct ib_umad_port during ->open(). But now that we're embedding a struct cdev in struct ib_umad_port, we can use the container_of() macro to go from the inode back to the device instead. Signed-off-by: Alex Chiang <achiang@hp.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>