aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband
Commit message (Collapse)AuthorAge
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivialLinus Torvalds2007-02-19
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits) Documentation/kernel-docs.txt update. arch/cris: typo in KERN_INFO Storage class should be before const qualifier kernel/printk.c: comment fix update I/O sched Kconfig help texts - CFQ is now default, not AS. Remove duplicate listing of Cris arch from README kbuild: more doc. cleanups doc: make doc. for maxcpus= more visible drivers/net/eexpress.c: remove duplicate comment add a help text for BLK_DEV_GENERIC correct a dead URL in the IP_MULTICAST help text fix the BAYCOM_SER_HDX help text fix SCSI_SCAN_ASYNC help text trivial documentation patch for platform.txt Fix typos concerning hierarchy Fix comment typo "spin_lock_irqrestore". Fix misspellings of "agressive". drivers/scsi/a100u2w.c: trivial typo patch Correct trivial typo in log2.h. Remove useless FIND_FIRST_BIT() macro from cardbus.c. ...
| * Various typo fixes.Robert P. J. Day2007-02-17
| | | | | | | | | | | | | | | | Correct mis-spellings of "algorithm", "appear", "consistent" and (shame, shame) "kernel". Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
* | IB/core: Set static rate in ib_init_ah_from_path()Roland Dreier2007-02-16
| | | | | | | | | | | | | | | | | | | | | | | | The static rate from the path record should be put into the address vector -- a long time ago the rate in the address attributes needed to be a relative rate, which required more munging, but now that the conversion from absolute to relative is done in the low-level driver, it's easy for ib_init_ah_from_path() to put the absolute rate in. Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | IB/ipath: Make ipath_map_sg() staticRoland Dreier2007-02-16
| | | | | | | | Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | IB/core: Fix sparse warnings about shadowed declarationsRoland Dreier2007-02-16
| | | | | | | | | | | | | | Change a couple of variable names to avoid sparse warnings about symbols being shadowed. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | RDMA/cma: Add multicast communication supportSean Hefty2007-02-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend rdma_cm to support multicast communication. Multicast support is added to the existing RDMA_PS_UDP port space, as well as a new RDMA_PS_IPOIB port space. The latter port space allows joining the multicast groups used by IPoIB, which enables offloading IPoIB traffic to a separate QP. The port space determines the signature used in the MGID when joining the group. The newly added RDMA_PS_IPOIB also allows for unicast operations, similar to RDMA_PS_UDP. Supporting the RDMA_PS_IPOIB requires changing how UD QPs are initialized, since we can no longer assume that the qkey is constant. This requires saving the Q_Key to use when attaching to a device, so that it is available when creating the QP. The Q_Key information is exported to the user through the existing rdma_init_qp_attr() interface. Multicast support is also exported to userspace through the rdma_ucm. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | IB/sa: Track multicast join/leave requestsSean Hefty2007-02-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The IB SA tracks multicast join/leave requests on a per port basis and does not do any reference counting: if two users of the same port join the same group, and one leaves that group, then the SA will remove the port from the group even though there is one user who wants to stay a member left. Therefore, in order to support multiple users of the same multicast group from the same port, we need to perform reference counting locally. To do this, add an multicast submodule to ib_sa to perform reference counting of multicast join/leave operations. Modify ib_ipoib (the only in-kernel user of multicast) to use the new interface. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | IPoIB: CM error handling thinko fixMichael S. Tsirkin2007-02-16
| | | | | | | | | | | | | | | | | | ipoib_cm_alloc_rx_skb() might be called from IRQ context, so it must use dev_kfree_skb_any(), not kfree_skb(). Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | RDMA/cxgb3: Remove Open Grid Computing copyrights in iw_cxgb3 driverSteve Wise2007-02-16
| | | | | | | | | | | | | | Remove the Open Grid Computing copyright. It shouldn't be there. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | RDMA/cxgb3: Fail posts synchronously when in TERMINATE stateSteve Wise2007-02-16
| | | | | | | | | | | | | | | | For T3B devices, mark user QP in error once we transition to TERMINATE. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | RDMA/iwcm: iw_cm_id destruction race fixesSteve Wise2007-02-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | iwcm iw_cm_id destruction race condition fixes: - iwcm_deref_id() always wakes up if there's another reference. - clean up race condition in cm_work_handler(). - create static void free_cm_id() which deallocs the work entries and then kfrees the cm_id memory. This reduces code replication. - rem_ref() if this is the last reference -and- the IWCM owns freeing the cm_id, then free it. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | IB/ehca: Change query_port() to return LINK_UP instead UNKNOWNHoang-Nam Nguyen2007-02-16
| | | | | | | | | | | | | | | | | | Set the port phys state as returned from ehca_query_port() to LINK_UP. ehca actually represents a logical HCA, whose phys/link state always is LINK_UP. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | IB/ehca: Allow en/disabling scaling code via module parameterHoang-Nam Nguyen2007-02-16
| | | | | | | | | | | | | | | | Allow users to en/disable scaling code when loading ib_ehca module, rather than requiring the module to be rebuilt to change the setting. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | IB/ehca: Fix race condition/locking issues in scaling codeHoang-Nam Nguyen2007-02-16
| | | | | | | | | | | | | | | | Fix a race condition in find_next_cpu_online() and some other locking issues in ehca scaling code. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | IB/ehca: Rework irq handlerHoang-Nam Nguyen2007-02-16
| | | | | | | | | | | | | | Rework ehca interrupt handling to avoid/reduce missed irq events. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | IPoIB: Only allow root to change between datagram and connected modeRoland Dreier2007-02-16
| | | | | | | | | | | | | | Change the permissions of the "mode" sysfs attribute to be S_IWUSR instead of S_IWUGO. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | IB/mthca: Fix allocation of ICM chunks in coherent memoryRoland Dreier2007-02-16
| | | | | | | | | | | | | | | | | | | | The change to allow allocating ICM chunks from coherent memory did not increment the count of sg entries properly, so a chunk that required more than allocation would not be mapped properly by the HCA. Fix this by adding the missing increment of chunk->nsg. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | IB/mthca: Allow the QP state transition RESET->RESETDotan Barak2007-02-16
|/ | | | | | | | RESET->RESET is an allowed QP state transition, so mthca should handle it correctly, by just returning success without involving the firmware. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* [PATCH] Scheduled removal of SA_xxx interrupt flags fixupsThomas Gleixner2007-02-14
| | | | | | | | | | | | | | | | | | | | | | The obsolete SA_xxx interrupt flags have been used despite the scheduled removal. Fixup the remaining users. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Jeff Garzik <jeff@garzik.org> Cc: Wim Van Sebroeck <wim@iguana.be> Cc: Roland Dreier <rolandd@cisco.com> Cc: Alessandro Zummo <a.zummo@towertech.it> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Cc: Dave Airlie <airlied@linux.ie> Cc: James Simmons <jsimmons@infradead.org> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] remove many unneeded #includes of sched.hTim Schmielau2007-02-14
| | | | | | | | | | | | | | | | | | | | | | | | After Al Viro (finally) succeeded in removing the sched.h #include in module.h recently, it makes sense again to remove other superfluous sched.h includes. There are quite a lot of files which include it but don't actually need anything defined in there. Presumably these includes were once needed for macros that used to live in sched.h, but moved to other header files in the course of cleaning it up. To ease the pain, this time I did not fiddle with any header files and only removed #includes from .c-files, which tend to cause less trouble. Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha, arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig, allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all configs in arch/arm/configs on arm. I also checked that no new warnings were introduced by the patch (actually, some warnings are removed that were emitted by unnecessarily included header files). Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'for-linus' of ↵Linus Torvalds2007-02-14
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/mthca: Always fill MTTs from CPU IB/mthca: Merge MR and FMR space on 64-bit systems IB/mthca: Fix access to MTT and MPT tables on non-cache-coherent CPUs IB/mthca: Give reserved MTTs a separate cache line IB/mthca: Fix reserved MTTs calculation on mem-free HCAs RDMA/cxgb3: Add driver for Chelsio T3 RNIC IB: Remove redundant "_wq" from workqueue names RDMA/cma: Increment port number after close to avoid re-use IB/ehca: Fix memleak on module unloading IB/mthca: Work around gcc bug on sparc64 IPoIB: Connected mode experimental support IB/core: Use ARRAY_SIZE macro for mandatory_table IB/mthca: Use correct structure size in call to memset()
| * IB/mthca: Always fill MTTs from CPUMichael S. Tsirkin2007-02-12
| | | | | | | | | | | | | | | | | | | | | | Speed up memory registration by filling in MTTs directly when the CPU can write directly to the whole table (all mem-free cards, and to Tavor mode on 64-bit systems with the patch I posted earlier). This reduces the number of FW commands needed to register an MR by at least a factor of 2 and speeds up memory registration significantly. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mthca: Merge MR and FMR space on 64-bit systemsMichael S. Tsirkin2007-02-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | For Tavor, we currently reserve separate MPT and MTT space for FMRs to avoid abusing the vmalloc space on 32 bit kernels. No such problem exists on 64 bit kernels so let's not do it there. This way we have a shared pool for MR and FMR resources, used on demand. This will also make it possible to write MTTs for regular regions directly from driver. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mthca: Fix access to MTT and MPT tables on non-cache-coherent CPUsMichael S. Tsirkin2007-02-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We allocate the MTT table with alloc_pages() and then do pci_map_sg(), so we must call pci_dma_sync_sg() after the CPU writes to the MTT table. This works since the device will never write MTTs on mem-free HCAs, once we get rid of the use of the WRITE_MTT firmware command. This change is needed to make that work, and is an improvement for now, since it gives FMRs a chance at working. For MPTs, both the device and CPU might write there, so we must allocate DMA coherent memory for these. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mthca: Give reserved MTTs a separate cache lineMichael S. Tsirkin2007-02-12
| | | | | | | | | | | | | | | | | | MTTs are allocated in non-cache-coherent memory, so we must give reserved MTTs their own cache line, to prevent both device and CPU from writing into the same cache line at the same time. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mthca: Fix reserved MTTs calculation on mem-free HCAsMichael S. Tsirkin2007-02-12
| | | | | | | | | | | | | | | | | | The reserved_mtts field has different meaning in Tavor and Arbel, so we are wasting mtt entries on memfree. Fix the Arbel case to match Tavor semantics. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * RDMA/cxgb3: Add driver for Chelsio T3 RNICSteve Wise2007-02-12
| | | | | | | | | | | | | | Add an RDMA/iWARP driver for the Chelsio T3 1GbE and 10GbE adapters. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB: Remove redundant "_wq" from workqueue namesSean Hefty2007-02-10
| | | | | | | | Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * RDMA/cma: Increment port number after close to avoid re-useSean Hefty2007-02-10
| | | | | | | | | | | | | | | | | | Randomize the starting port number and avoid re-using port values immediately after they are closed. Instead keep track of the last port value used and increment it every time a new port number is assigned, to better replicate other port spaces. Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/ehca: Fix memleak on module unloadingAkinobu Mita2007-02-10
| | | | | | | | | | | | | | | | | | | | | | Percpu data is not freed on module unloading. Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Christoph Raisch <raisch@de.ibm.com> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mthca: Work around gcc bug on sparc64David Howells2007-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For some reason gcc-3.4.5 on sparc64 does: WARNING: "____ilog2_NaN" [drivers/infiniband/hw/mthca/ib_mthca.ko] undefined! Points to note: (1) The asm volatile flush/flushw are just markers for viewing what comes out in the assembly; removing them has no effect on the result. (2) Changing almost anything else in dwh__mthca_arbel_init_srq_context() or dwh__mthca_alloc_srq() causes the problem to go away. The compiler command line issued by the kernel build is: /opt/crosstool/gcc-3.4.5-glibc-2.3.6/sparc64-unknown-linux-gnu/bin/sparc64-unknown-linux-gnu-gcc -fno-strict-aliasing -fno-common -Os -m64 -mno-fpu -mcpu=ultrasparc -mcmodel=medlow -ffixed-g4 -ffixed-g5 -fcall-used-g7 -Wa,--undeclared-regs -pg -fno-omit-frame-pointer -fno-optimize-sibling-calls -fasynchronous-unwind-tables -g -c -o drivers/infiniband/hw/mthca/.tmp_mthca_srq.o drivers/infiniband/hw/mthca/mthca_srq.c This can be reduced to this whilst still retaining the problem: /opt/crosstool/gcc-3.4.5-glibc-2.3.6/sparc64-unknown-linux-gnu/bin/sparc64-unknown-linux-gnu-gcc -m64 -c -o drivers/infiniband/hw/mthca/mthca_srq.o drivers/infiniband/hw/mthca/mthca_srq.c -Os Removing -Os or changing it to -O or -O0 thru -O6 gets rid of the problem. This patch to the kernel code fixes the problem: Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IPoIB: Connected mode experimental supportMichael S. Tsirkin2007-02-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following patch adds experimental support for IPoIB connected mode, as defined by the draft from the IETF ipoib working group. The idea is to increase performance by increasing the MTU from the maximum of 2K (theoretically 4K) supported by IPoIB on top of UD. With this code, I'm able to get 800MByte/sec or more with netperf without options on a Mellanox 4x back-to-back DDR system. Some notes on code: 1. SRQ is used for scalability to large cluster sizes 2. Only RC connections are used (UC does not support SRQ now) 3. Retry count is set to 0 since spec draft warns against retries 4. Each connection is used for data transfers in only 1 direction, so each connection is either active(TX) or passive (RX). 2 sides that want to communicate create 2 connections. 5. Each active (TX) connection has a separate CQ for send completions - this keeps the code simple without CQ resize and other tricks 6. To detect stale passive side connections (where the remote side is down), we keep an LRU list of passive connections (updated once per second per connection) and destroy a connection after it has been unused for several seconds. The LRU rule makes it possible to avoid scanning connections that have recently been active. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/core: Use ARRAY_SIZE macro for mandatory_tableAhmed S. Darwish2007-02-10
| | | | | | | | | | | | | | | | Use ARRAY_SIZE() macro already defined in kernel.h instead of open coding equivalent code. Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
| * IB/mthca: Use correct structure size in call to memset()Roland Dreier2007-02-10
| | | | | | | | | | | | | | | | | | When clearing the ib_ah_attr parameter in to_ib_ah_attr(), use sizeof *ib_ah_attr instead of sizeof *path. Pointed out by Jack Morgenstein <jackm@mellanox.co.il>. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* | [PATCH] mark struct file_operations const 3Arjan van de Ven2007-02-12
| | | | | | | | | | | | | | | | | | | | | | Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | [PATCH] Transform kmem_cache_alloc()+memset(0) -> kmem_cache_zalloc().Robert P. J. Day2007-02-11
|/ | | | | | | | | | | | | | | | | | | | | | Replace appropriate pairs of "kmem_cache_alloc()" + "memset(0)" with the corresponding "kmem_cache_zalloc()" call. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Andi Kleen <ak@muc.de> Cc: Roland McGrath <roland@redhat.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Acked-by: Joel Becker <Joel.Becker@oracle.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Jan Kara <jack@ucw.cz> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: James Morris <jmorris@namei.org> Cc: Chris Wright <chrisw@sous-sol.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] iscsi endianness annotationsAl Viro2007-02-09
| | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Network: convert network devices to use struct device instead of class_deviceGreg Kroah-Hartman2007-02-07
| | | | | | | | | | | This lets the network core have the ability to handle suspend/resume issues, if it wants to. Thanks to Frederik Deweerdt <frederik.deweerdt@gmail.com> for the arm driver fixes. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* IB/ehca: Remove obsolete prototypesHoang-Nam Nguyen2007-02-04
| | | | | | | Remove prototypes for functions that don't exist. Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ehca: Remove use of do_mmap()Hoang-Nam Nguyen2007-02-04
| | | | | | | | | | | | This patch removes do_mmap() from ehca: - Call remap_pfn_range() for hardware register block - Use vm_insert_page() to register memory allocated for completion queues and queue pairs - The actual mmap() call/trigger is now controlled by user space, ie. libehca Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* RDMA/addr: Handle ethernet neighbour updates during route resolutionSteve Wise2007-02-04
| | | | | | | | | | | | | | | The iWARP connection manager uses the ib_addr services to do route resolution (neighbour discovery in the IP world). The ib_addr netevent callback routine, however, currently only acts on InfiniBand neighbour updates. It needs to act on ethernet neighbour updates as well. This patch just removes filtering on device type altogether and will trigger on any neighour updates where the nud_type is valid. This simplifies the code some. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Don't wait for response when QP is in error state.Ishai Rabinovitz2007-02-04
| | | | | | | | | | | | | | | When there is a call to send_tsk_mgmt SRP posts a send and waits for 5 seconds to get a response. When the QP is in the error state it is obvious that there will be no response so it is quite useless to wait. In fact, the timeout causes SRP to wait a long time to reconnect when a QP error occurs. (Each abort and each reset_device calls send_tsk_mgmt, which waits for the timeout). The following patch solves this problem by identifying the failure and returning an immediate error code. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB: Return qp pointer as part of ib_wcMichael S. Tsirkin2007-02-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | struct ib_wc currently only includes the local QP number: this matches the IB spec, but seems mostly useless. The following patch replaces this with the pointer to qp itself, and updates all low level drivers and all users. This has the following advantages: - Ability to get a per-qp context through wc->qp->qp_context - Existing drivers already have the qp pointer ready in poll cq, so this change actually saves a tiny bit (extra memory read) on data path (for ehca it would actually be expensive to find the QP pointer when polling a CQ, but ehca does not support SRQ so we can leave wc->qp as NULL for ehca) - Users that need the QP number can still get it through wc->qp->qp_num Use case: In IPoIB connected mode code, I have a common CQ shared by multiple QPs. To track connection usage, I need a way to get at some per-QP context upon the completion, and I would like to avoid allocating context object per work request just to stick a QP pointer into it. With this code, I can just use wc->qp->qp_context. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ehca: Fix mismatched spin_unlock in irq handlerHoang-Nam Nguyen2007-01-22
| | | | | | | | The lock is taken with _irqsave and hence must be released with _irqrestore on all paths. Signed-off-by Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ehca: Fix improper use of yield() with spinlock heldHoang-Nam Nguyen2007-01-22
| | | | | Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Check match_strdup() returnIshai Rabinovitz2007-01-22
| | | | | | | | Checks if the kmalloc in match_strdup() was successful, and bail out on looking at the token if it failed. Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Don't execute QUERY_QP firmware command for QP in RESET stateDotan Barak2007-01-09
| | | | | | | | If a QP being queried is in the RESET state, don't execute the QUERY_QP firmware command (because it will fail). Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ehca: Use proper GFP_ flags for get_zeroed_page()Hoang-Nam Nguyen2007-01-09
| | | | | | | | | | | | | | | | Here is a patch for ehca to use proper flag, ie. GFP_ATOMIC resp. GFP_KERNEL, when calling get_zeroed_page() to prevent "Bug: scheduling while atomic...". This error does not cause a kernel panic but makes ipoib un-usable afterwards. It is reproducible on 2.6.20-rc4 if one does ifconfig down during a flood ping test. I have not observed this error in earlier releases incl. 2.6.20-rc1. This error occurs when a qp event/irq is received and ehca event handler allocates a control block/page to obtain HCA error data block. Use of GFP_ATOMIC when in interrupt context prevents this issue. Signed-off-by Hoang-Nam Nguyen <hnguyen@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix PRM compliance problem in atomic-send completionsJack Morgenstein2007-01-07
| | | | | | | | | | | | According to the Tavor and Arbel programmer's reference manuals, the number of bytes transferred is not provided in the byte_cnt field of the CQ entry for atomic operation completions. For atomic operations, the number of bytes transferred is always 8 (when the status is "success"), and this constant value should always be used by the driver in the ib_wc entry returned, rather than using the CQE. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* RDMA/ucma: Don't report events with invalid user contextSean Hefty2007-01-07
| | | | | | | | | | | | | | | | | | | | | | | | | | There's a problem with how rdma cm events are reported to userspace that can lead to application crashes. When a new connection request arrives, a context for the connection is allocated in the kernel. The connection event is then reported to userspace. The userspace library retrieves the event and allocates its own context for the connection. The userspace context is associated with the kernel's context when accepting. This allows the kernel to give userspace context with other events. A problem occurs if a second event for the same connection occurs before the user has had a chance to call accept. The userspace context has not yet been set, which causes the librdmacm to crash. (This has been seen when the app takes too long to call accept, resulting in the remote side timing out and rejecting the connection) Fix this by ignoring events for new connections until userspace has set their context. This can only happen if an error occurs on a new connection before the user accepts it. This is okay, since the accept will just fail later. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>