<feed xmlns='http://www.w3.org/2005/Atom'>
<title>litmus-rt.git/drivers/infiniband/hw, branch wip-binary-heap</title>
<subtitle>The LITMUS^RT kernel.</subtitle>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/'/>
<entry>
<title>Merge branches 'cxgb4' and 'qib' into for-next</title>
<updated>2011-06-17T18:57:55+00:00</updated>
<author>
<name>Roland Dreier</name>
<email>roland@purestorage.com</email>
</author>
<published>2011-06-17T18:57:55+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=c7d74b090913102e7917dd02bb574ef060e1e930'/>
<id>c7d74b090913102e7917dd02bb574ef060e1e930</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>IB/qib: Ensure that LOS and DFE are being turned off</title>
<updated>2011-06-17T18:56:59+00:00</updated>
<author>
<name>Mitko Haralanov</name>
<email>mitko@qlogic.com</email>
</author>
<published>2011-06-09T20:27:26+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=3126448451105fae59de0058c68692aa09aa4c37'/>
<id>3126448451105fae59de0058c68692aa09aa4c37</id>
<content type='text'>
Due to timing, it is possible for the LOS and DFE to remain on. This
is due to the link progressing to LinkUP prior to the driver getting
the first Status Changed interrupt.  By expanding the conditions under
which LOS is turned off and DFE timeout is being set, timing is no
longer an issue.

Signed-off-by: Mitko Haralanov &lt;mitko@qlogic.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@qlogic.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Due to timing, it is possible for the LOS and DFE to remain on. This
is due to the link progressing to LinkUP prior to the driver getting
the first Status Changed interrupt.  By expanding the conditions under
which LOS is turned off and DFE timeout is being set, timing is no
longer an issue.

Signed-off-by: Mitko Haralanov &lt;mitko@qlogic.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@qlogic.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/cxgb4: Couple of abort fixes</title>
<updated>2011-06-17T18:54:56+00:00</updated>
<author>
<name>Steve Wise</name>
<email>swise@opengridcomputing.com</email>
</author>
<published>2011-06-14T20:59:27+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=8da7e7a55231543b84ac84e93ad5ca9d340773d7'/>
<id>8da7e7a55231543b84ac84e93ad5ca9d340773d7</id>
<content type='text'>
- fix a race where the driver could end up sending a close_con_req
  after an abort_rpl.  In c4iw_ep_disconnect(), send abort or close
  request with the ep mutex held.

- fix a hang where driver fails to wake up when a connection is reset
  during a normal close.  Wake up any waiters in the interrupt path,
  and correctly cleanup after rdma_fini() failures.

Signed-off-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- fix a race where the driver could end up sending a close_con_req
  after an abort_rpl.  In c4iw_ep_disconnect(), send abort or close
  request with the ep mutex held.

- fix a hang where driver fails to wake up when a connection is reset
  during a normal close.  Wake up any waiters in the interrupt path,
  and correctly cleanup after rdma_fini() failures.

Signed-off-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/cxgb4: Don't truncate MR lengths</title>
<updated>2011-06-17T18:54:50+00:00</updated>
<author>
<name>Steve Wise</name>
<email>swise@opengridcomputing.com</email>
</author>
<published>2011-06-14T20:59:21+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=301c2c3f039a1f9478f6cbef60f2ccd4da9bd4a1'/>
<id>301c2c3f039a1f9478f6cbef60f2ccd4da9bd4a1</id>
<content type='text'>
Remove left-over code from T3 that limited MR sizes to 32b.

Signed-off-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Remove left-over code from T3 that limited MR sizes to 32b.

Signed-off-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/cxgb4: Don't exceed hw IQ depth limit for user CQs</title>
<updated>2011-06-17T18:52:45+00:00</updated>
<author>
<name>Steve Wise</name>
<email>swise@opengridcomputing.com</email>
</author>
<published>2011-06-01T17:49:14+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=2ff7d09a1b0f20f2d9c1bde0e003d4e384de2313'/>
<id>2ff7d09a1b0f20f2d9c1bde0e003d4e384de2313</id>
<content type='text'>
Memory allocated for user CQs gets rounded up to the next page
boundary.  And after rounding, we recalculate the resulting IQ depth
and we need to make sure we don't exceed the HW limits.

This bug can result a much smaller CQ allocated than was expected if
the HW size field is exceeded, resulting in CQ overflow failures.

Signed-off-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Memory allocated for user CQs gets rounded up to the next page
boundary.  And after rounding, we recalculate the resulting IQ depth
and we need to make sure we don't exceed the HW limits.

This bug can result a much smaller CQ allocated than was expected if
the HW size field is exceeded, resulting in CQ overflow failures.

Signed-off-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband</title>
<updated>2011-05-26T19:13:57+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2011-05-26T19:13:57+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=4c171acc20794af16a27da25e11ec4e9cad5d9fa'/>
<id>4c171acc20794af16a27da25e11ec4e9cad5d9fa</id>
<content type='text'>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/cma: Save PID of ID's owner
  RDMA/cma: Add support for netlink statistics export
  RDMA/cma: Pass QP type into rdma_create_id()
  RDMA: Update exported headers list
  RDMA/cma: Export enum cma_state in &lt;rdma/rdma_cm.h&gt;
  RDMA/nes: Add a check for strict_strtoul()
  RDMA/cxgb3: Don't post zero-byte read if endpoint is going away
  RDMA/cxgb4: Use completion objects for event blocking
  IB/srp: Fix integer -&gt; pointer cast warnings
  IB: Add devnode methods to cm_class and umad_class
  IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required
  IB/uverbs: Add devnode method to set path/mode
  RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node
  RDMA: Add netlink infrastructure
  RDMA: Add error handling to ib_core_init()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  RDMA/cma: Save PID of ID's owner
  RDMA/cma: Add support for netlink statistics export
  RDMA/cma: Pass QP type into rdma_create_id()
  RDMA: Update exported headers list
  RDMA/cma: Export enum cma_state in &lt;rdma/rdma_cm.h&gt;
  RDMA/nes: Add a check for strict_strtoul()
  RDMA/cxgb3: Don't post zero-byte read if endpoint is going away
  RDMA/cxgb4: Use completion objects for event blocking
  IB/srp: Fix integer -&gt; pointer cast warnings
  IB: Add devnode methods to cm_class and umad_class
  IB/mad: Return EPROTONOSUPPORT when an RDMA device lacks the QP required
  IB/uverbs: Add devnode method to set path/mode
  RDMA/ucma: Add .nodename/.mode to tell userspace where to create device node
  RDMA: Add netlink infrastructure
  RDMA: Add error handling to ib_core_init()
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge branches 'cma', 'cxgb3', 'cxgb4', 'misc', 'nes', 'netlink', 'srp' and 'uverbs' into for-next</title>
<updated>2011-05-25T20:47:20+00:00</updated>
<author>
<name>Roland Dreier</name>
<email>roland@purestorage.com</email>
</author>
<published>2011-05-25T20:47:20+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=8dc4abdf4c82d0e1c47f14b6615406d31975ea66'/>
<id>8dc4abdf4c82d0e1c47f14b6615406d31975ea66</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/nes: Add a check for strict_strtoul()</title>
<updated>2011-05-24T17:06:25+00:00</updated>
<author>
<name>Liu Yuan</name>
<email>tailai.ly@taobao.com</email>
</author>
<published>2011-04-01T06:23:49+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=52f81dbaf1378faf64c3ecea5129cebf826ef126'/>
<id>52f81dbaf1378faf64c3ecea5129cebf826ef126</id>
<content type='text'>
It should check if strict_strtoul() succeeds before using
'wqm_quanta_value'.

Signed-off-by: Liu Yuan &lt;tailai.ly@taobao.com&gt;

[ Convert to kstrtoul() directly while we're here.  - Roland ]

Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
It should check if strict_strtoul() succeeds before using
'wqm_quanta_value'.

Signed-off-by: Liu Yuan &lt;tailai.ly@taobao.com&gt;

[ Convert to kstrtoul() directly while we're here.  - Roland ]

Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/cxgb3: Don't post zero-byte read if endpoint is going away</title>
<updated>2011-05-24T17:01:04+00:00</updated>
<author>
<name>Steve Wise</name>
<email>swise@opengridcomputing.com</email>
</author>
<published>2011-05-13T18:37:18+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=807838686eb9e40d73b8a3f2384881358f51fff0'/>
<id>807838686eb9e40d73b8a3f2384881358f51fff0</id>
<content type='text'>
tx_ack() wasn't checking the endpoint state and consequently would
attempt to post the p2p 0B read on an endpoint/QP that is closing or
aborting.  This causes a NULL pointer dereference crash.

Signed-off-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
tx_ack() wasn't checking the endpoint state and consequently would
attempt to post the p2p 0B read on an endpoint/QP that is closing or
aborting.  This causes a NULL pointer dereference crash.

Signed-off-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>RDMA/cxgb4: Use completion objects for event blocking</title>
<updated>2011-05-24T16:47:38+00:00</updated>
<author>
<name>Steve Wise</name>
<email>swise@opengridcomputing.com</email>
</author>
<published>2011-05-20T16:25:05+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=c337374bf23b88620bcc66a7a09f141cc640f548'/>
<id>c337374bf23b88620bcc66a7a09f141cc640f548</id>
<content type='text'>
There exists a race condition when using wait_queue_head_t objects
that are declared on the stack.  This was being done in a few places
where we are sending work requests to the FW and awaiting replies, but
we don't have an endpoint structure with an embedded c4iw_wr_wait
struct.  So the code was allocating it locally on the stack.  Bad
design.  The race is:

  1) thread on cpuX declares the wait_queue_head_t on the stack, then
     posts a firmware WR with that wait object ptr as the cookie to be
     returned in the WR reply.  This thread will proceed to block in
     wait_event_timeout() but before it does:

  2) An interrupt runs on cpuY with the WR reply.  fw6_msg() handles
     this and calls c4iw_wake_up().  c4iw_wake_up() sets the condition
     variable in the c4iw_wr_wait object to TRUE and will call
     wake_up(), but before it calls wake_up():

  3) The thread on cpuX calls c4iw_wait_for_reply(), which calls
     wait_event_timeout().  The wait_event_timeout() macro checks the
     condition variable and returns immediately since it is TRUE.  So
     this thread never blocks/sleeps. The function then returns
     effectively deallocating the c4iw_wr_wait object that was on the
     stack.

  4) So at this point cpuY has a pointer to the c4iw_wr_wait object
     that is no longer valid.  Further its pointing to a stack frame
     that might now be in use by some other context/thread.  So cpuY
     continues execution and calls wake_up() on a ptr to a wait object
     that as been effectively deallocated.

This race, when it hits, can cause a crash in wake_up(), which I've
seen under heavy stress. It can also corrupt the referenced stack
which can cause any number of failures.

The fix:

Use struct completion, which supports on-stack declarations.
Completions use a spinlock around setting the condition to true and
the wake up so that steps 2 and 4 above are atomic and step 3 can
never happen in-between.

Signed-off-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There exists a race condition when using wait_queue_head_t objects
that are declared on the stack.  This was being done in a few places
where we are sending work requests to the FW and awaiting replies, but
we don't have an endpoint structure with an embedded c4iw_wr_wait
struct.  So the code was allocating it locally on the stack.  Bad
design.  The race is:

  1) thread on cpuX declares the wait_queue_head_t on the stack, then
     posts a firmware WR with that wait object ptr as the cookie to be
     returned in the WR reply.  This thread will proceed to block in
     wait_event_timeout() but before it does:

  2) An interrupt runs on cpuY with the WR reply.  fw6_msg() handles
     this and calls c4iw_wake_up().  c4iw_wake_up() sets the condition
     variable in the c4iw_wr_wait object to TRUE and will call
     wake_up(), but before it calls wake_up():

  3) The thread on cpuX calls c4iw_wait_for_reply(), which calls
     wait_event_timeout().  The wait_event_timeout() macro checks the
     condition variable and returns immediately since it is TRUE.  So
     this thread never blocks/sleeps. The function then returns
     effectively deallocating the c4iw_wr_wait object that was on the
     stack.

  4) So at this point cpuY has a pointer to the c4iw_wr_wait object
     that is no longer valid.  Further its pointing to a stack frame
     that might now be in use by some other context/thread.  So cpuY
     continues execution and calls wake_up() on a ptr to a wait object
     that as been effectively deallocated.

This race, when it hits, can cause a crash in wake_up(), which I've
seen under heavy stress. It can also corrupt the referenced stack
which can cause any number of failures.

The fix:

Use struct completion, which supports on-stack declarations.
Completions use a spinlock around setting the condition to true and
the wake up so that steps 2 and 4 above are atomic and step 3 can
never happen in-between.

Signed-off-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
