aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/infiniband/core
Commit message (Collapse)AuthorAge
* RDMA/cm: Change return value from find_gid_port()shefty2012-11-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem reported by Dan Carpenter <dan.carpenter@oracle.com>: The patch 3c86aa70bf67: "RDMA/cm: Add RDMA CM support for IBoE devices" from Oct 13, 2010, leads to the following warning: net/sunrpc/xprtrdma/svc_rdma_transport.c:722 svc_rdma_create() error: passing non neg 1 to ERR_PTR This bug would result in a NULL dereference. svc_rdma_create() is supposed to return ERR_PTRs or valid pointers, but instead it returns ERR_PTRs, valid pointers and 1. The call tree is: svc_rdma_create() => rdma_bind_addr() => cma_acquire_dev() => find_gid_port() rdma_bind_addr() should return a valid errno. Fix this by having find_gid_port() also return a valid errno. If we can't find the specified GID on a given port, return -EADDRNOTAVAIL, rather than -EAGAIN, to better indicate the error. We also drop using the special return value of '1' and instead pass through the error returned by the underlying verbs call. On such errors, rather than aborting the search, we simply continue to check the next device/port. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* Merge git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linuxDavid S. Miller2012-10-09
|\ | | | | | | | | | | | | Pulled mainline in order to get the UAPI infrastructure already merged before I pull in David Howells's UAPI trees for networking. Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge tag 'rdma-for-linus' of ↵Linus Torvalds2012-10-07
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband changes from Roland Dreier: "Second batch of changes for the 3.7 merge window: - Late-breaking fix for IPoIB on mlx4 SR-IOV VFs. - Fix for IPoIB build breakage with CONFIG_INFINIBAND_IPOIB_CM=n (new netlink config changes are to blame). - Make sure retry count values are in range in RDMA CM. - A few nes hardware driver fixes and cleanups. - Have iSER initiator use >1 interrupt vectors if available." * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: RDMA/cma: Check that retry count values are in range IB/iser: Add more RX CQs to scale out processing of SCSI responses RDMA/nes: Bump the version number of nes driver RDMA/nes: Remove unused module parameter "send_first" RDMA/nes: Remove unnecessary if-else statement RDMA/nes: Add missing break to switch. mlx4_core: Adjust flow steering attach wrapper so that IB works on SR-IOV VFs IPoIB: Fix build with CONFIG_INFINIBAND_IPOIB_CM=n
| | * RDMA/cma: Check that retry count values are in rangeSean Hefty2012-10-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The retry_count and rnr_retry_count connection parameters are both 3-bit values. Check that the values are in range and reduce if they're not. This fixes a problem reported by Doug Ledford <dledford@redhat.com> that resulted in the userspace rping test (part of the librdmacm samples) failing to run over Intel IB HCAs. Signed-off-by: Sean Hefty <sean.hefty@intel.com> [ Use min_t() to avoid warnings about type mismatch. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
* | | infiniband: pass rdma_cm module to netlink_dump_startGao feng2012-10-07
|/ / | | | | | | | | | | | | | | | | set netlink_dump_control.module to avoid panic. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Cc: Roland Dreier <roland@kernel.org> Cc: Sean Hefty <sean.hefty@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | idr: rename MAX_LEVEL to MAX_IDR_LEVELFengguang Wu2012-10-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To avoid name conflicts: drivers/video/riva/fbdev.c:281:9: sparse: preprocessor token MAX_LEVEL redefined While at it, also make the other names more consistent and add parentheses. [akpm@linux-foundation.org: repair fallout] [sfr@canb.auug.org.au: IB/mlx4: fix for MAX_ID_MASK to MAX_IDR_MASK name change] Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Bernd Petrovitsch <bernd@petrovitsch.priv.at> Cc: walter harms <wharms@bfs.de> Cc: Glauber Costa <glommer@parallels.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Roland Dreier <roland@purestorage.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | Merge branch 'for-linus' of ↵Linus Torvalds2012-10-02
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs update from Al Viro: - big one - consolidation of descriptor-related logics; almost all of that is moved to fs/file.c (BTW, I'm seriously tempted to rename the result to fd.c. As it is, we have a situation when file_table.c is about handling of struct file and file.c is about handling of descriptor tables; the reasons are historical - file_table.c used to be about a static array of struct file we used to have way back). A lot of stray ends got cleaned up and converted to saner primitives, disgusting mess in android/binder.c is still disgusting, but at least doesn't poke so much in descriptor table guts anymore. A bunch of relatively minor races got fixed in process, plus an ext4 struct file leak. - related thing - fget_light() partially unuglified; see fdget() in there (and yes, it generates the code as good as we used to have). - also related - bits of Cyrill's procfs stuff that got entangled into that work; _not_ all of it, just the initial move to fs/proc/fd.c and switch of fdinfo to seq_file. - Alex's fs/coredump.c spiltoff - the same story, had been easier to take that commit than mess with conflicts. The rest is a separate pile, this was just a mechanical code movement. - a few misc patches all over the place. Not all for this cycle, there'll be more (and quite a few currently sit in akpm's tree)." Fix up trivial conflicts in the android binder driver, and some fairly simple conflicts due to two different changes to the sock_alloc_file() interface ("take descriptor handling from sock_alloc_file() to callers" vs "net: Providing protocol type via system.sockprotoname xattr of /proc/PID/fd entries" adding a dentry name to the socket) * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (72 commits) MAX_LFS_FILESIZE should be a loff_t compat: fs: Generic compat_sys_sendfile implementation fs: push rcu_barrier() from deactivate_locked_super() to filesystems btrfs: reada_extent doesn't need kref for refcount coredump: move core dump functionality into its own file coredump: prevent double-free on an error path in core dumper usb/gadget: fix misannotations fcntl: fix misannotations ceph: don't abuse d_delete() on failure exits hypfs: ->d_parent is never NULL or negative vfs: delete surplus inode NULL check switch simple cases of fget_light to fdget new helpers: fdget()/fdput() switch o2hb_region_dev_write() to fget_light() proc_map_files_readdir(): don't bother with grabbing files make get_file() return its argument vhost_set_vring(): turn pollstart/pollstop into bool switch prctl_set_mm_exe_file() to fget_light() switch xfs_find_handle() to fget_light() switch xfs_swapext() to fget_light() ...
| * switch simple cases of fget_light to fdgetAl Viro2012-09-26
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * switch infinibarf users of fget() to fget_light()Al Viro2012-09-26
| | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* | Merge tag 'rdma-for-linus' of ↵Linus Torvalds2012-10-02
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband updates from Roland Dreier: "First batch of InfiniBand/RDMA changes for the 3.7 merge window: - mlx4 IB support for SR-IOV - A couple of SRP initiator fixes - Batch of nes hardware driver fixes - Fix for long-standing use-after-free crash in IPoIB - Other miscellaneous fixes" This merge also removes a new use of __cancel_delayed_work(), and replaces it with the regular cancel_delayed_work() that is now irq-safe thanks to the workqueue updates. That said, I suspect the sequence in question should probably use "mod_delayed_work()". I just did the minimal "don't use deprecated functions" fixup, though. * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (45 commits) IB/qib: Fix local access validation for user MRs mlx4_core: Disable SENSE_PORT for multifunction devices mlx4_core: Clean up enabling of SENSE_PORT for older (ConnectX-1/-2) HCAs mlx4_core: Stash PCI ID driver_data in mlx4_priv structure IB/srp: Avoid having aborted requests hang IB/srp: Fix use-after-free in srp_reset_req() IB/qib: Add a qib driver version RDMA/nes: Fix compilation error when nes_debug is enabled RDMA/nes: Print hardware resource type RDMA/nes: Fix for crash when TX checksum offload is off RDMA/nes: Cosmetic changes RDMA/nes: Fix for incorrect MSS when TSO is on RDMA/nes: Fix incorrect resolving of the loopback MAC address mlx4_core: Fix crash on uninitialized priv->cmd.slave_sem mlx4_core: Trivial cleanups to driver log messages mlx4_core: Trivial readability fix: "0X30" -> "0x30" IB/mlx4: Create paravirt contexts for VFs when master IB driver initializes mlx4: Modify proxy/tunnel QP mechanism so that guests do no calculations mlx4: Paravirtualize Node Guids for slaves mlx4: Activate SR-IOV mode for IB ...
| | \
| | \
| *-. \ Merge branches 'cma', 'cxgb4', 'ipoib', 'mlx4', 'mlx4-sriov', 'nes', 'qib' ↵Roland Dreier2012-10-02
| |\ \ \ | | | |/ | | |/| | | | | and 'srp' into for-linus
| | | * IB/core: Add ib_find_exact_cached_pkey()Jack Morgenstein2012-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When P_Key tables potentially contain both full and partial membership copies for the same P_Key, we need a function to find the index for an exact (16-bit) P_Key. This is necessary when the master forwards QP1 MADs sent by guests. If the guest has sent the MAD with a limited membership P_Key, we need to to forward the MAD using the same limited membership P_Key. Since the master may have both the limited and the full member P_Keys in its table, we must make sure to retrieve the limited membership P_Key in this case. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | * IB/core: Handle table with full and partial membership for the same P_KeyJack Morgenstein2012-09-30
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Extend the cached and non-cached P_Key table lookups to handle limited and full membership of the same P_Key to co-exist in the P_Key table. This is necessary for SR-IOV, to allow for some guests would to have the full membership P_Key in their virtual P_Key table, while other guests on the same physical HCA would have the limited one. To support this, we need both the limited and full membership P_Keys to be present in the master's (hypervisor physical port) P_Key table. The algorithm for handling P_Key tables which contain both the limited and the full membership versions of the same P_Key works as follows: When scanning the P_Key table for a 15-bit P_Key: A. If there is a full member version of that P_Key anywhere in the table, return its index (even if a limited-member version of the P_Key exists earlier in the table). B. If the full member version is not in the table, but the limited-member version is in the table, return the index of the limited P_Key. Signed-off-by: Liran Liss <liranl@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | RDMA/cma: Use consistent component mask for IPoIB port space multicast joinsDotan Barak2012-09-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CMA multicast joins for the IPoIB port space need to use the same component mask used by the ipoib driver. Otherwise, it's possible for the CMA to create a group to which a join made by ipoib will fail, or vise-versa. Some of the component mask fields set by ipoib weren't set by the CMA, fix that. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Reviewed-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | IB/core: Remove unused variables in ucm/ucmaDotan Barak2012-09-30
| |/ | | | | | | | | | | | | | | | | Remove unused wait objects from ucm/ucma events flow. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds2012-10-02
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull networking changes from David Miller: 1) GRE now works over ipv6, from Dmitry Kozlov. 2) Make SCTP more network namespace aware, from Eric Biederman. 3) TEAM driver now works with non-ethernet devices, from Jiri Pirko. 4) Make openvswitch network namespace aware, from Pravin B Shelar. 5) IPV6 NAT implementation, from Patrick McHardy. 6) Server side support for TCP Fast Open, from Jerry Chu and others. 7) Packet BPF filter supports MOD and XOR, from Eric Dumazet and Daniel Borkmann. 8) Increate the loopback default MTU to 64K, from Eric Dumazet. 9) Use a per-task rather than per-socket page fragment allocator for outgoing networking traffic. This benefits processes that have very many mostly idle sockets, which is quite common. From Eric Dumazet. 10) Use up to 32K for page fragment allocations, with fallbacks to smaller sizes when higher order page allocations fail. Benefits are a) less segments for driver to process b) less calls to page allocator c) less waste of space. From Eric Dumazet. 11) Allow GRO to be used on GRE tunnels, from Eric Dumazet. 12) VXLAN device driver, one way to handle VLAN issues such as the limitation of 4096 VLAN IDs yet still have some level of isolation. From Stephen Hemminger. 13) As usual there is a large boatload of driver changes, with the scale perhaps tilted towards the wireless side this time around. Fix up various fairly trivial conflicts, mostly caused by the user namespace changes. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1012 commits) hyperv: Add buffer for extended info after the RNDIS response message. hyperv: Report actual status in receive completion packet hyperv: Remove extra allocated space for recv_pkt_list elements hyperv: Fix page buffer handling in rndis_filter_send_request() hyperv: Fix the missing return value in rndis_filter_set_packet_filter() hyperv: Fix the max_xfer_size in RNDIS initialization vxlan: put UDP socket in correct namespace vxlan: Depend on CONFIG_INET sfc: Fix the reported priorities of different filter types sfc: Remove EFX_FILTER_FLAG_RX_OVERRIDE_IP sfc: Fix loopback self-test with separate_tx_channels=1 sfc: Fix MCDI structure field lookup sfc: Add parentheses around use of bitfield macro arguments sfc: Fix null function pointer in efx_sriov_channel_type vxlan: virtual extensible lan igmp: export symbol ip_mc_leave_group netlink: add attributes to fdb interface tg3: unconditionally select HWMON support when tg3 is enabled. Revert "net: ti cpsw ethernet: allow reading phy interface mode from DT" gre: fix sparse warning ...
| * | netlink: hide struct module parameter in netlink_kernel_createPablo Neira Ayuso2012-09-08
| |/ | | | | | | | | | | | | | | | | | | | | This patch defines netlink_kernel_create as a wrapper function of __netlink_kernel_create to hide the struct module *me parameter (which seems to be THIS_MODULE in all existing netlink subsystems). Suggested by David S. Miller. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge branch 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds2012-10-02
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull workqueue changes from Tejun Heo: "This is workqueue updates for v3.7-rc1. A lot of activities this round including considerable API and behavior cleanups. * delayed_work combines a timer and a work item. The handling of the timer part has always been a bit clunky leading to confusing cancelation API with weird corner-case behaviors. delayed_work is updated to use new IRQ safe timer and cancelation now works as expected. * Another deficiency of delayed_work was lack of the counterpart of mod_timer() which led to cancel+queue combinations or open-coded timer+work usages. mod_delayed_work[_on]() are added. These two delayed_work changes make delayed_work provide interface and behave like timer which is executed with process context. * A work item could be executed concurrently on multiple CPUs, which is rather unintuitive and made flush_work() behavior confusing and half-broken under certain circumstances. This problem doesn't exist for non-reentrant workqueues. While non-reentrancy check isn't free, the overhead is incurred only when a work item bounces across different CPUs and even in simulated pathological scenario the overhead isn't too high. All workqueues are made non-reentrant. This removes the distinction between flush_[delayed_]work() and flush_[delayed_]_work_sync(). The former is now as strong as the latter and the specified work item is guaranteed to have finished execution of any previous queueing on return. * In addition to the various bug fixes, Lai redid and simplified CPU hotplug handling significantly. * Joonsoo introduced system_highpri_wq and used it during CPU hotplug. There are two merge commits - one to pull in IRQ safe timer from tip/timers/core and the other to pull in CPU hotplug fixes from wq/for-3.6-fixes as Lai's hotplug restructuring depended on them." Fixed a number of trivial conflicts, but the more interesting conflicts were silent ones where the deprecated interfaces had been used by new code in the merge window, and thus didn't cause any real data conflicts. Tejun pointed out a few of them, I fixed a couple more. * 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (46 commits) workqueue: remove spurious WARN_ON_ONCE(in_irq()) from try_to_grab_pending() workqueue: use cwq_set_max_active() helper for workqueue_set_max_active() workqueue: introduce cwq_set_max_active() helper for thaw_workqueues() workqueue: remove @delayed from cwq_dec_nr_in_flight() workqueue: fix possible stall on try_to_grab_pending() of a delayed work item workqueue: use hotcpu_notifier() for workqueue_cpu_down_callback() workqueue: use __cpuinit instead of __devinit for cpu callbacks workqueue: rename manager_mutex to assoc_mutex workqueue: WORKER_REBIND is no longer necessary for idle rebinding workqueue: WORKER_REBIND is no longer necessary for busy rebinding workqueue: reimplement idle worker rebinding workqueue: deprecate __cancel_delayed_work() workqueue: reimplement cancel_delayed_work() using try_to_grab_pending() workqueue: use mod_delayed_work() instead of __cancel + queue workqueue: use irqsafe timer for delayed_work workqueue: clean up delayed_work initializers and add missing one workqueue: make deferrable delayed_work initializer names consistent workqueue: cosmetic whitespace updates for macro definitions workqueue: deprecate system_nrt[_freezable]_wq workqueue: deprecate flush[_delayed]_work_sync() ...
| * workqueue: deprecate __cancel_delayed_work()Tejun Heo2012-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | Now that cancel_delayed_work() can be safely called from IRQ handlers, there's no reason to use __cancel_delayed_work(). Use cancel_delayed_work() instead of __cancel_delayed_work() and mark the latter deprecated. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Jens Axboe <axboe@kernel.dk> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Roland Dreier <roland@kernel.org> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
| * workqueue: use mod_delayed_work() instead of __cancel + queueTejun Heo2012-08-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that mod_delayed_work() is safe to call from IRQ handlers, __cancel_delayed_work() followed by queue_delayed_work() can be replaced with mod_delayed_work(). Most conversions are straight-forward except for the following. * net/core/link_watch.c: linkwatch_schedule_work() was doing a quite elaborate dancing around its delayed_work. Collapse it such that linkwatch_work is queued for immediate execution if LW_URGENT and existing timer is kept otherwise. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
| * workqueue: use mod_delayed_work() instead of cancel + queueTejun Heo2012-08-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert delayed_work users doing cancel_delayed_work() followed by queue_delayed_work() to mod_delayed_work(). Most conversions are straight-forward. Ones worth mentioning are, * drivers/edac/edac_mc.c: edac_mc_workq_setup() converted to always use mod_delayed_work() and cancel loop in edac_mc_reset_delay_period() is dropped. * drivers/platform/x86/thinkpad_acpi.c: No need to remember whether watchdog is active or not. @fan_watchdog_active and related code dropped. * drivers/power/charger-manager.c: Seemingly a lot of delayed_work_pending() abuse going on here. [delayed_]work_pending() are unsynchronized and racy when used like this. I converted one instance in fullbatt_handler(). Please conver the rest so that it invokes workqueue APIs for the intended target state rather than trying to game work item pending state transitions. e.g. if timer should be modified - call mod_delayed_work(), canceled - call cancel_delayed_work[_sync](). * drivers/thermal/thermal_sys.c: thermal_zone_device_set_polling() simplified. Note that round_jiffies() calls in this function are meaningless. round_jiffies() work on absolute jiffies not delta delay used by delayed_work. v2: Tomi pointed out that __cancel_delayed_work() users can't be safely converted to mod_delayed_work(). They could be calling it from irq context and if that happens while delayed_work_timer_fn() is running, it could deadlock. __cancel_delayed_work() users are dropped. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Anton Vorontsov <cbouatmailru@gmail.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Doug Thompson <dougthompson@xmission.com> Cc: David Airlie <airlied@linux.ie> Cc: Roland Dreier <roland@kernel.org> Cc: "John W. Linville" <linville@tuxdriver.com> Cc: Zhang Rui <rui.zhang@intel.com> Cc: Len Brown <len.brown@intel.com> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Johannes Berg <johannes@sipsolutions.net>
* | RDMA/ucma.c: Fix for events with wrong context on iWARPTatyana Nikolova2012-08-13
|/ | | | | | | | | | | | | It is possible for asynchronous RDMA_CM_EVENT_ESTABLISHED events to be generated with ctx->uid == 0, because ucma_set_event_context() copies ctx->uid to the event structure outside of ctx->file->mut. This leads to a crash in the userspace library, since it gets a bogus event. Fix this by taking the mutex a bit earlier in ucma_event_handler. Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com> Signed-off-by: Sean Hefty <Sean.Hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* RDMA/ucma: Convert open-coded equivalent to memdup_user()Roland Dreier2012-07-27
| | | | | | | Suggested by scripts/coccinelle/api/memdup_user.cocci. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* RDMA/cma: Use PTR_RET rather than if (IS_ERR(...)) + PTR_ERRFengguang Wu2012-07-27
| | | | | | Suggested by scripts/coccinelle/api/ptr_ret.cocci. Signed-off-by: Roland Dreier <roland@purestorage.com>
* Merge tag 'rdma-for-3.6' of ↵Linus Torvalds2012-07-24
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull InfiniBand/RDMA changes from Roland Dreier: - Updates to the qib low-level driver - First chunk of changes for SR-IOV support for mlx4 IB - RDMA CM support for IPv6-only binding - Other misc cleanups and fixes Fix up some add-add conflicts in include/linux/mlx4/device.h and drivers/net/ethernet/mellanox/mlx4/main.c * tag 'rdma-for-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (30 commits) IB/qib: checkpatch fixes IB/qib: Add congestion control agent implementation IB/qib: Reduce sdma_lock contention IB/qib: Fix an incorrect log message IB/qib: Fix QP RCU sparse warnings mlx4: Put physical GID and P_Key table sizes in mlx4_phys_caps struct and paravirtualize them mlx4_core: Allow guests to have IB ports mlx4_core: Implement mechanism for reserved Q_Keys net/mlx4_core: Free ICM table in case of error IB/cm: Destroy idr as part of the module init error flow mlx4_core: Remove double function declarations IB/mlx4: Fill the masked_atomic_cap attribute in query device IB/mthca: Fill in sq_sig_type in query QP IB/mthca: Warning about event for non-existent QPs should show event type IB/qib: Fix sparse RCU warnings in qib_keys.c net/mlx4_core: Initialize IB port capabilities for all slaves mlx4: Use port management change event instead of smp_snoop IB/qib: RCU locking for MR validation IB/qib: Avoid returning EBUSY from MR deregister IB/qib: Fix UC MR refs for immediate operations ...
| *-----. Merge branches 'cma', 'cxgb4', 'misc', 'mlx4-sriov', 'mlx-cleanups', ↵Roland Dreier2012-07-23
| |\ \ \ \ | | | | | | | | | | | | | | | | | | 'ocrdma' and 'qib' into for-linus
| | | | | * IB/cm: Destroy idr as part of the module init error flowDotan Barak2012-07-11
| | | |_|/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Clean the idr as part of the error flow since it is a resource too. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | | * IB/core: Move CM_xxx_ATTR_ID macros from cm_msgs.h to ib_cm.hJack Morgenstein2012-07-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These macros will be reused by the mlx4 SRIOV-IB CM paravirtualization code, and there is no reason to have them declared both in the IB core in the mlx4 IB driver. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | | * IB/sa: Add GuidInfoRecord query supportErez Shitrit2012-07-08
| | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This query is needed for SRIOV alias GUID support. The query is implemented per the IB Spec definition in section 15.2.5.18 (GuidInfoRecord). Signed-off-by: Erez Shitrit <erezsh@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | * IB: Use IS_ENABLED(CONFIG_IPV6)Roland Dreier2012-07-08
| | |/ | | | | | | | | | | | | | | | Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | RDMA/cma: Allow user to restrict listens to bound address familySean Hefty2012-07-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide an option for the user to specify that listens should only accept connections where the incoming address family matches that of the locally bound address. This is used to support the equivalent of IPV6_V6ONLY socket option, which allows an app to only accept connection requests directed to IPv6 addresses. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | RDMA/cma: Listen on specific address familySean Hefty2012-07-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The rdma_cm maps IPv4 and IPv6 addresses to the same service ID. This prevents apps from listening only for IPv4 or IPv6 addresses. It also results in an app binding to an IPv4 address receiving connection requests for an IPv6 address. Change this to match socket behavior: restrict listens on IPv4 addresses to only IPv4 addresses, and if a listen is on an IPv6 address, allow it to receive either IPv4 or IPv6 addresses, based on its address family binding. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | RDMA/cma: Bind to a specific address familySean Hefty2012-07-08
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The RDMA CM uses a single port space for all associated (tcp, udp, etc.) port bindings, regardless of the address family that the user binds to. The result is that if a user binds to AF_INET, but does not specify an IP address, the bind will occur for AF_INET6. This causes an attempt to bind to the same port using AF_INET6 to fail, and connection requests to AF_INET6 will match with the AF_INET listener. Align the behavior with sockets and restrict the bind to AF_INET only. If a user binds to AF_INET6, we bind the port to AF_INET6 and AF_INET depending on the value of bindv6only. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* | netlink: add netlink_kernel_cfg parameter to netlink_kernel_createPablo Neira Ayuso2012-06-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds the following structure: struct netlink_kernel_cfg { unsigned int groups; void (*input)(struct sk_buff *skb); struct mutex *cb_mutex; }; That can be passed to netlink_kernel_create to set optional configurations for netlink kernel sockets. I've populated this structure by looking for NULL and zero parameters at the existing code. The remaining parameters that always need to be set are still left in the original interface. That includes optional parameters for the netlink socket creation. This allows easy extensibility of this interface in the future. This patch also adapts all callers to use this new interface. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2012-06-28
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/caif/caif_hsi.c drivers/net/usb/qmi_wwan.c The qmi_wwan merge was trivial. The caif_hsi.c, on the other hand, was not. It's a conflict between 1c385f1fdf6f9c66d982802cd74349c040980b50 ("caif-hsi: Replace platform device with ops structure.") in the net-next tree and commit 39abbaef19cd0a30be93794aa4773c779c3eb1f3 ("caif-hsi: Postpone init of HIS until open()") in the net tree. I did my best with that one and will ask Sjur to check it out. Signed-off-by: David S. Miller <davem@davemloft.net>
| * RDMA/cma: QP type check on received REQs should be AND not ORSean Hefty2012-06-19
| | | | | | | | | | | | | | | | Change || check to the intended && when checking the QP type in a received connection request against the listening endpoint. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* | infiniband: netlink: Move away from NLMSG_NEW().David S. Miller2012-06-27
|/ | | | | | And use nlmsg_data() while we're here too. Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge tag 'rdma-for-3.5' of ↵Linus Torvalds2012-05-21
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull InfiniBand/RDMA changes from Roland Dreier: - Add ocrdma hardware driver for Emulex IB-over-Ethernet adapters - Add generic and mlx4 support for "raw" QPs: allow suitably privileged applications to send and receive arbitrary packets directly to/from the hardware - Add "doorbell drop" handling to the cxgb4 driver - A fairly large batch of qib hardware driver changes - A few fixes for lockdep-detected issues - A few other miscellaneous fixes and cleanups Fix up trivial conflict in drivers/net/ethernet/emulex/benet/be.h. * tag 'rdma-for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (53 commits) RDMA/cxgb4: Include vmalloc.h for vmalloc and vfree IB/mlx4: Fix mlx4_ib_add() error flow IB/core: Fix IB_SA_COMP_MASK macro IB/iser: Fix error flow in iser ep connection establishment IB/mlx4: Increase the number of vectors (EQs) available for ULPs RDMA/cxgb4: Add query_qp support RDMA/cxgb4: Remove kfifo usage RDMA/cxgb4: Use vmalloc() for debugfs QP dump RDMA/cxgb4: DB Drop Recovery for RDMA and LLD queues RDMA/cxgb4: Disable interrupts in c4iw_ev_dispatch() RDMA/cxgb4: Add DB Overflow Avoidance RDMA/cxgb4: Add debugfs RDMA memory stats cxgb4: DB Drop Recovery for RDMA and LLD queues cxgb4: Common platform specific changes for DB Drop Recovery cxgb4: Detect DB FULL events and notify RDMA ULD RDMA/cxgb4: Drop peer_abort when no endpoint found RDMA/cxgb4: Always wake up waiters in c4iw_peer_abort_intr() mlx4_core: Change bitmap allocator to work in round-robin fashion RDMA/nes: Don't call event handler if pointer is NULL RDMA/nes: Fix for the ORD value of the connecting peer ...
| *---. Merge branches 'core', 'cxgb4', 'ipath', 'iser', 'lockdep', 'mlx4', 'nes', ↵Roland Dreier2012-05-21
| |\ \ \ | | | | | | | | | | | | | | | 'ocrdma', 'qib' and 'raw-qp' into for-linus
| | | | * IB/core: Add raw packet QP typeOr Gerlitz2012-05-08
| | | |/ | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IB_QPT_RAW_PACKET allows applications to build a complete packet, including L2 headers, when sending; on the receive side, the HW will not strip any headers. This QP type is designed for userspace direct access to Ethernet; for example by applications that do TCP/IP themselves. Only processes with the NET_RAW capability are allowed to create raw packet QPs (the name "raw packet QP" is supposed to suggest an analogy to AF_PACKET / SOL_RAW sockets). Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | * RDMA/cma: Fix lockdep false positive recursive lockingSean Hefty2012-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following lockdep problem was reported by Or Gerlitz <ogerlitz@mellanox.com>: [ INFO: possible recursive locking detected ] 3.3.0-32035-g1b2649e-dirty #4 Not tainted --------------------------------------------- kworker/5:1/418 is trying to acquire lock: (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0138a41>] rdma_destroy_i d+0x33/0x1f0 [rdma_cm] but task is already holding lock: (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disable_ca llback+0x24/0x45 [rdma_cm] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&id_priv->handler_mutex); lock(&id_priv->handler_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/5:1/418: #0: (ib_cm){.+.+.+}, at: [<ffffffff81042ac1>] process_one_work+0x210/0x4a 6 #1: ((&(&work->work)->work)){+.+.+.}, at: [<ffffffff81042ac1>] process_on e_work+0x210/0x4a6 #2: (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0135130>] cma_disab le_callback+0x24/0x45 [rdma_cm] stack backtrace: Pid: 418, comm: kworker/5:1 Not tainted 3.3.0-32035-g1b2649e-dirty #4 Call Trace: [<ffffffff8102b0fb>] ? console_unlock+0x1f4/0x204 [<ffffffff81068771>] __lock_acquire+0x16b5/0x174e [<ffffffff8106461f>] ? save_trace+0x3f/0xb3 [<ffffffff810688fa>] lock_acquire+0xf0/0x116 [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm] [<ffffffff81364351>] mutex_lock_nested+0x64/0x2ce [<ffffffffa0138a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm] [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155 [<ffffffff81065abc>] ? trace_hardirqs_on+0xd/0xf [<ffffffffa0138a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm] [<ffffffffa0139c02>] cma_req_handler+0x418/0x644 [rdma_cm] [<ffffffffa012ee88>] cm_process_work+0x32/0x119 [ib_cm] [<ffffffffa0130299>] cm_req_handler+0x928/0x982 [ib_cm] [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm] [<ffffffffa0130326>] cm_work_handler+0x33/0xfe5 [ib_cm] [<ffffffff81065a78>] ? trace_hardirqs_on_caller+0x11e/0x155 [<ffffffffa01302f3>] ? cm_req_handler+0x982/0x982 [ib_cm] [<ffffffff81042b6e>] process_one_work+0x2bd/0x4a6 [<ffffffff81042ac1>] ? process_one_work+0x210/0x4a6 [<ffffffff813669f3>] ? _raw_spin_unlock_irq+0x2b/0x40 [<ffffffff8104316e>] worker_thread+0x1d6/0x350 [<ffffffff81042f98>] ? rescuer_thread+0x241/0x241 [<ffffffff81046a32>] kthread+0x84/0x8c [<ffffffff8136e854>] kernel_thread_helper+0x4/0x10 [<ffffffff81366d59>] ? retint_restore_args+0xe/0xe [<ffffffff810469ae>] ? __init_kthread_worker+0x56/0x56 [<ffffffff8136e850>] ? gs_change+0xb/0xb The actual locking is fine, since we're dealing with different locks, but from the same lock class. cma_disable_callback() acquires the listening id mutex, whereas rdma_destroy_id() acquires the mutex for the new connection id. To fix this, delay the call to rdma_destroy_id() until we've released the listening id mutex. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | * IB/uverbs: Lock SRQ / CQ / PD objects in a consistent orderRoland Dreier2012-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since XRC support was added, the uverbs code has locked SRQ, CQ and PD objects needed during QP and SRQ creation in different orders depending on the the code path. This leads to the (at least theoretical) possibility of deadlock, and triggers the lockdep splat below. Fix this by making sure we always lock the SRQ first, then CQs and finally the PD. ====================================================== [ INFO: possible circular locking dependency detected ] 3.4.0-rc5+ #34 Not tainted ------------------------------------------------------- ibv_srq_pingpon/2484 is trying to acquire lock: (SRQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs] but task is already holding lock: (CQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (CQ-uobj){+++++.}: [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe [<ffffffff81384f28>] down_read+0x34/0x43 [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs] [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs] [<ffffffffa00b16c3>] ib_uverbs_create_qp+0x180/0x684 [ib_uverbs] [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs] [<ffffffff810fe47f>] vfs_write+0xa7/0xee [<ffffffff810fe65f>] sys_write+0x45/0x69 [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b -> #1 (PD-uobj){++++++}: [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe [<ffffffff81384f28>] down_read+0x34/0x43 [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs] [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs] [<ffffffffa00af8ad>] __uverbs_create_xsrq+0x96/0x386 [ib_uverbs] [<ffffffffa00b31b9>] ib_uverbs_detach_mcast+0x1cd/0x1e6 [ib_uverbs] [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs] [<ffffffff810fe47f>] vfs_write+0xa7/0xee [<ffffffff810fe65f>] sys_write+0x45/0x69 [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b -> #0 (SRQ-uobj){+++++.}: [<ffffffff81070898>] __lock_acquire+0xa29/0xd06 [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe [<ffffffff81384f28>] down_read+0x34/0x43 [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs] [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs] [<ffffffffa00b1728>] ib_uverbs_create_qp+0x1e5/0x684 [ib_uverbs] [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs] [<ffffffff810fe47f>] vfs_write+0xa7/0xee [<ffffffff810fe65f>] sys_write+0x45/0x69 [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b other info that might help us debug this: Chain exists of: SRQ-uobj --> PD-uobj --> CQ-uobj Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(CQ-uobj); lock(PD-uobj); lock(CQ-uobj); lock(SRQ-uobj); *** DEADLOCK *** 3 locks held by ibv_srq_pingpon/2484: #0: (QP-uobj){+.+...}, at: [<ffffffffa00b162c>] ib_uverbs_create_qp+0xe9/0x684 [ib_uverbs] #1: (PD-uobj){++++++}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs] #2: (CQ-uobj){+++++.}, at: [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs] stack backtrace: Pid: 2484, comm: ibv_srq_pingpon Not tainted 3.4.0-rc5+ #34 Call Trace: [<ffffffff8137eff0>] print_circular_bug+0x1f8/0x209 [<ffffffff81070898>] __lock_acquire+0xa29/0xd06 [<ffffffffa00af37c>] ? __idr_get_uobj+0x20/0x5e [ib_uverbs] [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs] [<ffffffff81070fd0>] lock_acquire+0xbf/0xfe [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs] [<ffffffff81070eee>] ? lock_release+0x166/0x189 [<ffffffff81384f28>] down_read+0x34/0x43 [<ffffffffa00af51b>] ? idr_read_uobj+0x2f/0x4d [ib_uverbs] [<ffffffffa00af51b>] idr_read_uobj+0x2f/0x4d [ib_uverbs] [<ffffffffa00af542>] idr_read_obj+0x9/0x19 [ib_uverbs] [<ffffffffa00b1728>] ib_uverbs_create_qp+0x1e5/0x684 [ib_uverbs] [<ffffffff81070fec>] ? lock_acquire+0xdb/0xfe [<ffffffff81070c09>] ? lock_release_non_nested+0x94/0x213 [<ffffffff810d470f>] ? might_fault+0x40/0x90 [<ffffffff810d470f>] ? might_fault+0x40/0x90 [<ffffffffa00ae3dd>] ib_uverbs_write+0xb7/0xc2 [ib_uverbs] [<ffffffff810fe47f>] vfs_write+0xa7/0xee [<ffffffff810ff736>] ? fget_light+0x3b/0x99 [<ffffffff810fe65f>] sys_write+0x45/0x69 [<ffffffff8138cdf9>] system_call_fastpath+0x16/0x1b Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | | * IB/uverbs: Make lockdep output more readableRoland Dreier2012-05-08
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add names for our lockdep classes, so instead of having to decipher lockdep output with mysterious names: Chain exists of: key#14 --> key#11 --> key#13 lockdep will give us something nicer: Chain exists of: SRQ-uobj --> PD-uobj --> CQ-uobj Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | IB/core: Fix mismatch between locked and pinned pagesYishai Hadas2012-05-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit bc3e53f682d9 ("mm: distinguish between mlocked and pinned pages") introduced a separate counter for pinned pages and used it in the IB stack. However, in ib_umem_get() the pinned counter is incremented, but ib_umem_release() wrongly decrements the locked counter. Fix this. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Christoph Lameter <cl@linux.com> Cc: <stable@vger.kernel.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
| * | IB/core: Use qp->usecnt to track multicast attach/detachOr Gerlitz2012-05-08
| |/ | | | | | | | | | | | | | | | | | | | | Just as we don't allow PDs, CQs, etc. to be destroyed if there are QPs that are attached to them, don't let a QP be destroyed if there are multicast group(s) attached to it. Use the existing usecnt field of struct ib_qp which was added by commit 0e0ec7e ("RDMA/core: Export ib_open_qp() to share XRC TGT QPs") to track this. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2012-05-07
|\| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: drivers/net/ethernet/intel/e1000e/param.c drivers/net/wireless/iwlwifi/iwl-agn-rx.c drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c drivers/net/wireless/iwlwifi/iwl-trans.h Resolved the iwlwifi conflict with mainline using 3-way diff posted by John Linville and Stephen Rothwell. In 'net' we added a bug fix to make iwlwifi report a more accurate skb->truesize but this conflicted with RX path changes that happened meanwhile in net-next. In e1000e a conflict arose in the validation code for settings of adapter->itr. 'net-next' had more sophisticated logic so that logic was used. Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge tag 'ib-fixes' of ↵Linus Torvalds2012-04-26
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband fixes from Roland Dreier: "A few fixes for regressions introduced in 3.4-rc1: - fix memory leak in mlx4 - fix two problems with new MAD response generation code" * tag 'ib-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: IB/mlx4: Fix memory leaks in ib_link_query_port() IB/mad: Don't send response for failed MADs IB/mad: Set 'D' bit in response for unhandled MADs
| | * IB/mad: Don't send response for failed MADsJack Morgenstein2012-04-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 0b307043049f ("IB/mad: Return error response for unsupported MADs") does not failed MADs (eg those that return IB_MAD_RESULT_FAILURE) properly -- these MADs should be silently discarded. (We should not force the lower-layer drivers to return SUCCESS | CONSUMED in this case, since the MAD is NOT successful). Unsupported MADs are not failures -- they return SUCCESS, but with an "unsupported error" status value inside the response MAD. Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
| | * IB/mad: Set 'D' bit in response for unhandled MADsJack Morgenstein2012-04-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 0b307043049f ("IB/mad: Return error response for unsupported MADs") does not handle directed-route MADs properly -- it fails to set the 'D' bit in the response MAD status field. This is a problem for SmInfo MADs when the receiver does not have an SM running. Reviewed-by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Roland Dreier <roland@purestorage.com>
* | | net: Convert all sysctl registrations to register_net_sysctlEric W. Biederman2012-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This results in code with less boiler plate that is a bit easier to read. Additionally stops us from using compatibility code in the sysctl core, hastening the day when the compatibility code can be removed. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>