aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband
Commit message (Collapse)AuthorAge
* IB/mthca: make a function staticAdrian Bunk2006-04-19
| | | | | | | This patch makes the needlessly global mthca_update_rate() static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: Fix whitespaceRoland Dreier2006-04-19
| | | | Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: Make more names staticRoland Dreier2006-04-19
| | | | | | Make symbols that are only used in a single source file static. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mad: Fix RMPP version check during agent registrationHal Rosenstock2006-04-19
| | | | | | | | | | Only check that RMPP version is not specified when MAD class does not support RMPP. Just because a class is allowed to use RMPP doesn't mean that rmpp_version needs to be set for the MAD agent to register. Checking this was a recent change which was too pedantic. Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Remove request from list when SCSI abort succeedsRoland Dreier2006-04-19
| | | | | | | | If a SCSI abort succeeds, then the aborted request should to be removed from the list of pending requests. This fixes list corruption after an abort occurs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix max_srq_sge returned by ib_query_device for Tavor devicesJack Morgenstein2006-04-12
| | | | | | | | | | | | | | | | | | | | | | The driver allocates SRQ WQEs size with a power of 2 size both for Tavor and for memfree. For Tavor, however, the hardware only requires the WQE size to be a multiple of 16, not a power of 2, and the max number of scatter-gather allowed is reported accordingly by the firmware (and this is the value currently returned by ib_query_device() and ibv_query_device()). If the max number of scatter/gather entries reported by the FW is used when creating an SRQ, the creation will fail for Tavor, since the required WQE size will be increased to the next power of 2, which turns out to be larger than the device permitted max WQE size (which is not a power of 2). This patch reduces the reported SRQ max wqe size so that it can be used successfully in creating an SRQ on Tavor HCAs. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/cache: Use correct pointer to calculate sizeMichael S. Tsirkin2006-04-10
| | | | | | | | | | When allocating gid_cache, use kmalloc(sizeof *gid_cache, ...) rather than kmalloc(sizeof *pkey_cache, ...). It doesn't really matter which one is used, since the size ends up the same either way, but it's much better to say what we mean. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Use spin_lock_irq() instead of spin_lock_irqsave()Roland Dreier2006-04-10
| | | | | | | | We know ipoib_flush_paths() is called from plain process context with interrupts enabled, since it does wait_for_completion(). So there's no need to use spin_lock_irqsave() -- spin_lock_irq() is fine. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Close race in ipoib_flush_paths()Eli Cohen2006-04-10
| | | | | | | | ib_sa_cancel_query() must be called with priv->lock held since a completion might arrive and set path->query to NULL. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Disable tuning PCI read burst sizeMichael S. Tsirkin2006-04-10
| | | | | | | | | | | | | | | | | | The PCI spec recommends against drivers playing with a device's PCI read burst size, and says that systems software should configure it. And we actually have users that report that changing it from the default set by BIOS hurts performance and/or stability for them. On the other hand, the Mellanox Programmer's Reference Manual recommends turning it up all the way to the maximum value. Some tests conducted here in the lab do not show performance improvement from this tuning, but this might be just me. As a work-around, make this tuning an option, off by default (safe value), with an eye towards removing it completely one day if no one complains. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Make send and receive queue sizes tunableShirley Ma2006-04-10
| | | | | | | | | | | Make IPoIB's send and receive queue sizes tunable via module parameters ("send_queue_size" and "recv_queue_size"). This allows the queue sizes to be enlarged to fix disastrously bad performance on some platforms and workloads, without bloating memory usage when large queues aren't needed. Signed-off-by: Shirley Ma <xma@us.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Wait for join to finish before freeing mcast structEli Cohen2006-04-10
| | | | | | | | | | | ipoib_mcast_restart_task() might free an mcast object while a join request is still outstanding, leading to an oops when the query completes. Fix this by waiting for query to complete, similar to what ipoib_stop_thread() is doing. The wait for mcast completion code is consolidated in wait_for_mcast_join(). Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB: simplify static rate encodingJack Morgenstein2006-04-10
| | | | | | | | | | | | | | | | | Push translation of static rate to HCA format into low-level drivers, where it belongs. For static rate encoding, use encoding of rate field from IB standard PathRecord, with addition of value 0, for backwards compatibility with current usage. The changes are: - Add enum ib_rate to midlayer includes. - Get rid of static rate translation in IPoIB; just use static rate directly from Path and MulticastGroup records. - Update mthca driver to translate absolute static rate into the format used by hardware. This also fixes mthca's static rate handling for HCAs that are capable of 4X DDR. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Consolidate private neighbour data handlingMichael S. Tsirkin2006-04-04
| | | | | | | | | | | | | | | | | | | | | | | | | Consolidate IPoIB's private neighbour data handling into ipoib_neigh_alloc() and ipoib_neigh_free(). This will make it easier to keep track of the neighbour structures that IPoIB is handling, and is a nice cleanup of the code: add/remove: 2/1 grow/shrink: 1/8 up/down: 100/-178 (-78) function old new delta ipoib_neigh_alloc - 61 +61 ipoib_neigh_free - 36 +36 ipoib_mcast_join_finish 1288 1291 +3 path_rec_completion 575 573 -2 ipoib_mcast_join_task 664 660 -4 ipoib_neigh_destructor 101 92 -9 ipoib_neigh_setup_dev 14 3 -11 ipoib_neigh_setup 17 - -17 path_free 238 215 -23 ipoib_mcast_free 329 306 -23 ipoib_mcast_send 718 684 -34 neigh_add_path 705 650 -55 Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Fix memory leak in options parsingRoland Dreier2006-04-03
| | | | | | | | Fix memory leak if parsing destination GID fails. Coverity bug 1042 Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Always build debugging code unless CONFIG_EMBEDDED=yRoland Dreier2006-04-02
| | | | | | | | | | | | Change the mthca debugging trace output code so that it can enabled and disabled at runtime with the debug_level module parameter in sysfs. Also, don't allow CONFIG_INFINIBAND_MTHCA_DEBUG to be disabled unless CONFIG_EMBEDDED is selected. We want users (and especially distros) to have this turned on unless they really need to save space, because by the time we want debugging output, it's usually too late to rebuild a kernel. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Always build debugging code unless CONFIG_EMBEDDED=yRoland Dreier2006-04-02
| | | | | | | | | | | Don't allow CONFIG_INFINIBAND_IPOIB_DEBUG to be disabled unless CONFIG_EMBEDDED is selected. We want users (and especially distros) to have this turned on unless they really need to save space, because by the time we want debugging output, it's usually too late to rebuild a kernel. The debugging output can be controlled at runtime via the debug_level module parameter in sysfs. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mad: fix oops in cancel_madsMichael S. Tsirkin2006-04-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have seen the following OOPs in cancel_mads, when restarting opensm multiple times: Call Trace: [<c010549b>] show_stack+0x9b/0xb0 [<c01055ec>] show_registers+0x11c/0x190 [<c01057cd>] die+0xed/0x160 [<c031b966>] do_page_fault+0x3f6/0x5d0 [<c010511f>] error_code+0x4f/0x60 [<f8ac4e38>] cancel_mads+0x128/0x150 [ib_mad] [<f8ac2811>] unregister_mad_agent+0x11/0x130 [ib_mad] [<f8ac2a12>] ib_unregister_mad_agent+0x12/0x20 [ib_mad] [<f8b10f23>] ib_umad_close+0xf3/0x130 [ib_umad] [<c0162937>] __fput+0x187/0x1c0 [<c01627a9>] fput+0x19/0x20 [<c0160f7a>] filp_close+0x3a/0x60 [<c0121ca8>] put_files_struct+0x68/0xa0 [<c0103cf7>] do_signal+0x47/0x100 [<c0103ded>] do_notify_resume+0x3d/0x40 [<c0103f9e>] work_notifysig+0x13/0x25 We traced this back to local_completions unlocking mad_agent_priv->lock while still keeping a pointer into local_list. A later call to list_del(&local->completion_list) would then corrupt the list. To fix this, remove the entry from local_list after looking it up but before releasing mad_agent_priv->lock, to prevent cancel_mads from finding and freeing it. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: kbuild infrastructureBryan O'Sullivan2006-03-31
| | | | | | | | Integrate the ipath core and OpenIB drivers into the kernel build infrastructure. Add entry to MAINTAINERS. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: infiniband verbs supportBryan O'Sullivan2006-03-31
| | | | | | | | The ipath_verbs.c file implements the driver-specific components of the kernel's Infiniband verbs layer. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: misc infiniband code, part 2Bryan O'Sullivan2006-03-31
| | | | | | | | Management datagram support, queue pairs, and reliable and unreliable connections. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: misc infiniband code, part 1Bryan O'Sullivan2006-03-31
| | | | | | | | Completion queues, local and remote memory keys, and memory region support. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: infiniband RC protocol supportBryan O'Sullivan2006-03-31
| | | | | | | | This is an implementation of the Infiniband RC ("reliable connection") protocol. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: infiniband UC and UD protocol supportBryan O'Sullivan2006-03-31
| | | | | | | | These files implement the Infiniband UC ("unreliable connection") and UD ("unreliable datagram") protocols. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: infiniband header filesBryan O'Sullivan2006-03-31
| | | | | | | These header files are used by the layered Infiniband driver. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: layering interfaces used by higher-level driver codeBryan O'Sullivan2006-03-31
| | | | | | | | The layering interfaces are used to implement the Infiniband protocols and the ethernet emulation driver. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: support for userspace apps using core driverBryan O'Sullivan2006-03-31
| | | | | | | | | | These files introduce a char device that userspace apps use to gain direct memory-mapped access to the InfiniPath hardware, and routines for pinning and unpinning user memory in cases where the hardware needs to DMA into the user address space. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: sysfs and ipathfs support for core driverBryan O'Sullivan2006-03-31
| | | | | | | | | | | | | The ipathfs filesystem contains files that are not appropriate for sysfs, because they contain binary data. The hierarchy is simple; the top-level directory contains driver-wide attribute files, while numbered subdirectories contain per-device attribute files. Our userspace code currently expects this filesystem to be mounted on /ipathfs, but a final location has not yet been chosen. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: misc driver support codeBryan O'Sullivan2006-03-31
| | | | | | | | | | | | | | | | | | | | | | | | | EEPROM support, interrupt handling, statistics gathering, and write combining management for x86_64. A note regarding i2c: The Atmel EEPROM hardware we use looks like an i2c device electrically, but is not i2c compliant at all from a functional perspective. We tried using the kernel's i2c support to talk to it, but failed. Normal i2c devices have a single 7-bit or 10-bit i2c address that they respond to. Valid 7-bit addresses range from 0x03 to 0x77. Addresses 0x00 to 0x02 and 0x78 to 0x7F are special reserved addresses (e.g. 0x00 is the "general call" address.) The Atmel device, on the other hand, responds to ALL addresses. It's designed to be the only device on a given i2c bus. A given i2c device address corresponds to the memory address within the i2c device itself. At least one reason why the linux core i2c stuff won't work for this is that it prohibits access to reserved addresses like 0x00, which are really valid addresses on the Atmel devices. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: chip initialisation code, and diag supportBryan O'Sullivan2006-03-31
| | | | | | | | | | | ipath_init_chip.c sets up an InfiniPath device for use. ipath_diag.c permits userspace diagnostic tools to read and write a chip's registers. It is different in purpose from the mmap interfaces to the /sys/bus/pci resource files. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: support for PCI Express devicesBryan O'Sullivan2006-03-31
| | | | | | | | This file contains routines and definitions specific to InfiniPath devices that have PCI Express interfaces. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: support for HyperTransport devicesBryan O'Sullivan2006-03-31
| | | | | | | | The ipath_ht400.c file contains routines and definitions specific to HyperTransport-based InfiniPath devices. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: core driver header filesBryan O'Sullivan2006-03-31
| | | | | | | | | | | | | | ipath_common.h and ips_common.h contain definitions shared between userspace and the kernel. ipath_kernel.h is the core driver header file. ipath_debug.h contains mask values used for controlling driver debugging. ipath_registers.h contains bitmask definitions used in chip registers. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/ipath: core device driverBryan O'Sullivan2006-03-31
| | | | | | | | | | | | | The ipath driver is a low-level driver for PathScale InfiniPath host channel adapters (HCAs) based on the HT-400 and PE-800 chips, including the InfiniPath HT-460, the small form factor InfiniPath HT-460, the InfiniPath HT-470 and the Linux Networx LS/X. The ipath_driver.c file contains much of the low-level device handling code. Signed-off-by: Bryan O'Sullivan <bos@pathscale.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mad: RMPP support for additional classesHal Rosenstock2006-03-30
| | | | | | | | | Add RMPP support for additional management classes that support it. Also, validate RMPP is consistent with management class specified. Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mad: include GID/class when matching receivesJack Morgenstein2006-03-30
| | | | | | | | | | | | | | | Received responses are currently matched against sent requests based on TID only. According to the spec, responses should match based on the combination of TID, management class, and requester LID/GID. Without the additional qualification, an agent that is responding to two requests, both of which have the same TID, can match RMPP ACKs with the incorrect transaction. This problem can occur on the SM node when responding to SA queries. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix section mismatch problemsRoland Dreier2006-03-29
| | | | | | | | | Quite a few cleanup functions in mthca were marked as __devexit. However, they could also be called from error paths during initialization, so they cannot be marked that way. Just delete all of the incorrect annotations. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Fix oops with raw socketsRoland Dreier2006-03-29
| | | | | | | | | | ipoib_hard_header() needs to handle the case that daddr is NULL. This can happen when packets are injected via a raw socket, and IPoIB shouldn't oops in this case. Reported by Anton Blanchard <anton@samba.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix check of size in SRQ creationJack Morgenstein2006-03-29
| | | | | | | | | | | | | | | The previous patch for Tavor broke MemFree logic. The driver should perform limit check only for Tavor. For MemFree, the check is incorrect, since ds (WQE stride) is always a power-of-2 (although the max_desc_size may not be). In Tavor, however, WQE stride and desc_size are the same, and are not necessarily power-of-2. The check was really for the WQE stride (and it Tavor, we use max_desc_size for the stride). Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Fix unmapping of fake scatterlistRoland Dreier2006-03-29
| | | | | | | | | The recently merged patch to create a fake scatterlist for non-SG SCSI commands had a bug: the driver ended up doing dma_unmap_sg() on a scatterlist scmnd->request_buffer rather than the fake scatter list it created. Fix this so that the driver unmaps the same thing it maps. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: P_Key change event handlingLeonid Arsh2006-03-24
| | | | | | | | | | | | | | | | | | | | | | | | This patch causes the network interface to respond to P_Key change events correctly. As a result, you'll see a child interface in the "RUNNING" state (netif_carrier_on()) only when the corresponding P_Key is configured by the SM. When SM removes a P_Key, the "RUNNING" state will be disabled for the corresponding network interface. To implement this, I added IB_EVENT_PKEY_CHANGE event handling. To prevent flushing the device before the device is open by the "delay open" mechanism, I added an additional device flag called IPOIB_FLAG_INITIALIZED. This also prevents the child network interface from trying to join to multicast groups until the PKEY is configured. We used to get error messages like: ib0.f2f2: couldn't attach QP to multicast group ff12:401b:f2f2:0:0:0:ffff:ffff in this case. To fix this, I just check IPOIB_FLAG_OPER_UP flag in ipoib_set_mcast_list(). Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix modify QP error pathRoland Dreier2006-03-24
| | | | | | | | | If the call to mthca_MODIFY_QP() failed, then mthca_modify_qp() would still do some things it shouldn't, such as store away attributes for special QPs. Fix this, and simplify the code, by simply jumping to the exit path if mthca_MODIFY_QP() fails. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Fix network interface "RUNNING" statusLeonid Arsh2006-03-24
| | | | | | | | | | With the current IPoIB driver, the status of network interfaces stays "RUNNING" even if the link goes down (for example because a cable is unplugged). Fix this by flushing the IPoIB interface when the link goes down. Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix indentationRoland Dreier2006-03-24
| | | | | | Fix some whitespace damage (indenting with spaces) that snuck in. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Fix uninitialized variable in mthca_alloc_qp()Jack Morgenstein2006-03-24
| | | | | | | | mthca_alloc_sqp() by mthca_set_qp_size() need to set qp->transport before calling mthca_set_qp_size(), since the value is used there. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Check SRQ limit in modify SRQ operationJack Morgenstein2006-03-24
| | | | | | | | | When setting the shared receive queue (SRQ) watermark in a modify SRQ operation, make sure that the supplied value is not larger than the full size of the SRQ. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Check that SRQ WQE size does not exceed device's max valueJack Morgenstein2006-03-24
| | | | | | | | | | Guarantee the calculated work queue entry size does not exceed the max allowable WQE size when creating an SRQ. This is a problem with Arbel in Tavor-compatibility mode because the current WQE size computation method rounds up to next power of 2. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/mthca: Check that sgid_index and path_mtu are valid in modify_qpDotan Barak2006-03-24
| | | | | | | | Add a check that the modify QP parameters sgid_index and path_mtu are valid, since they might come from userspace. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IB/srp: Use a fake scatterlist for non-SG SCSI commandsRoland Dreier2006-03-24
| | | | | | | | | | | | Since the SCSI midlayer is moving towards entirely getting rid of commands with use_sg == 0, we should treat this case as an exception. Therefore, change the IB SRP initiator to create a fake scatterlist for these commands with sg_init_one(). This simplifies the flow of DMA mapping and unmapping, since SRP can just use dma_map_sg() and dma_unmap_sg() unconditionally, rather than having to choose between the dma_{map,unmap}_sg() and dma_{map,unmap}_single() variants. Signed-off-by: Roland Dreier <rolandd@cisco.com>
* IPoIB: Pass correct pointer when flushing child interfacesLeonid Arsh2006-03-24
| | | | | | | ipoib_ib_dev_flush() should get passed cpriv->dev, not &cpriv->dev. Signed-off-by: Leonid Arsh <leonida@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>