<feed xmlns='http://www.w3.org/2005/Atom'>
<title>litmus-rt.git/drivers/md, branch test</title>
<subtitle>The LITMUS^RT kernel.</subtitle>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/'/>
<entry>
<title>md: make sure MD_RECOVERY_DONE is clear before starting recovery/resync</title>
<updated>2015-06-12T10:16:33+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2015-06-12T10:05:04+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=ea358cd0d2c634ff1379a1392edcdf2289f31e13'/>
<id>ea358cd0d2c634ff1379a1392edcdf2289f31e13</id>
<content type='text'>
MD_RECOVERY_DONE is normally cleared by md_check_recovery after a
resync etc finished.  However it is possible for raid5_start_reshape
to race and start a reshape before MD_RECOVERY_DONE is cleared.  This
can lean to multiple reshapes running at the same time, which isn't
good.

To make sure it is cleared before starting a reshape, and also clear
it when reaping a thread, just to be safe.

Signed-off-by: NeilBrown  &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
MD_RECOVERY_DONE is normally cleared by md_check_recovery after a
resync etc finished.  However it is possible for raid5_start_reshape
to race and start a reshape before MD_RECOVERY_DONE is cleared.  This
can lean to multiple reshapes running at the same time, which isn't
good.

To make sure it is cleared before starting a reshape, and also clear
it when reaping a thread, just to be safe.

Signed-off-by: NeilBrown  &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: Close race when setting 'action' to 'idle'.</title>
<updated>2015-06-12T10:16:26+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2015-06-12T09:51:27+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=8e8e2518fceca407bb8fc2a6710d19d2e217892e'/>
<id>8e8e2518fceca407bb8fc2a6710d19d2e217892e</id>
<content type='text'>
Checking -&gt;sync_thread without holding the mddev_lock()
isn't really safe, even after flushing the workqueue which
ensures md_start_sync() has been run.

While this code is waiting for the lock, md_check_recovery could reap
the thread itself, and then start another thread (e.g. recovery might
finish, then reshape starts).  When this thread gets the lock
md_start_sync() hasn't run so it doesn't get reaped, but
MD_RECOVERY_RUNNING gets cleared.  This allows two threads to start
which leads to confusion.

So don't both if MD_RECOVERY_RUNNING isn't set, but if it is do
the flush and the test and the reap all under the mddev_lock to
avoid any race with md_check_recovery.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Fixes: 6791875e2e53 ("md: make reconfig_mutex optional for writes to md sysfs files.")
Cc: stable@vger.kernel.org (v4.0+)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Checking -&gt;sync_thread without holding the mddev_lock()
isn't really safe, even after flushing the workqueue which
ensures md_start_sync() has been run.

While this code is waiting for the lock, md_check_recovery could reap
the thread itself, and then start another thread (e.g. recovery might
finish, then reshape starts).  When this thread gets the lock
md_start_sync() hasn't run so it doesn't get reaped, but
MD_RECOVERY_RUNNING gets cleared.  This allows two threads to start
which leads to confusion.

So don't both if MD_RECOVERY_RUNNING isn't set, but if it is do
the flush and the test and the reap all under the mddev_lock to
avoid any race with md_check_recovery.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Fixes: 6791875e2e53 ("md: make reconfig_mutex optional for writes to md sysfs files.")
Cc: stable@vger.kernel.org (v4.0+)
</pre>
</div>
</content>
</entry>
<entry>
<title>md: don't return 0 from array_state_store</title>
<updated>2015-06-12T10:16:16+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2015-06-12T09:46:44+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=c008f1d356277a5b7561040596a073d87e56b0c8'/>
<id>c008f1d356277a5b7561040596a073d87e56b0c8</id>
<content type='text'>
Returning zero from a 'store' function is bad.
The return value should be either len length of the string
or an error.

So use 'len' if 'err' is zero.

Fixes: 6791875e2e53 ("md: make reconfig_mutex optional for writes to md sysfs files.")
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Cc: stable@vger.kernel (v4.0+)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Returning zero from a 'store' function is bad.
The return value should be either len length of the string
or an error.

So use 'len' if 'err' is zero.

Fixes: 6791875e2e53 ("md: make reconfig_mutex optional for writes to md sysfs files.")
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Cc: stable@vger.kernel (v4.0+)
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'dm-4.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm</title>
<updated>2015-05-29T21:39:24+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-05-29T21:39:24+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=0f1e5b5d19f6c06fe2078f946377db9861f3910d'/>
<id>0f1e5b5d19f6c06fe2078f946377db9861f3910d</id>
<content type='text'>
Pull device-mapper fixes from Mike Snitzer:
 "Quite a few fixes for DM's blk-mq support thanks to extra DM multipath
  testing from Junichi Nomura and Bart Van Assche.

  Also fix a casting bug in dm_merge_bvec() that could cause only a
  single page to be added to a bio (Joe identified this while testing
  dm-cache writeback)"

* tag 'dm-4.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: fix casting bug in dm_merge_bvec()
  dm: fix reload failure of 0 path multipath mapping on blk-mq devices
  dm: fix false warning in free_rq_clone() for unmapped requests
  dm: requeue from blk-mq dm_mq_queue_rq() using BLK_MQ_RQ_QUEUE_BUSY
  dm mpath: fix leak of dm_mpath_io structure in blk-mq .queue_rq error path
  dm: fix NULL pointer when clone_and_map_rq returns !DM_MAPIO_REMAPPED
  dm: run queue on re-queue
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull device-mapper fixes from Mike Snitzer:
 "Quite a few fixes for DM's blk-mq support thanks to extra DM multipath
  testing from Junichi Nomura and Bart Van Assche.

  Also fix a casting bug in dm_merge_bvec() that could cause only a
  single page to be added to a bio (Joe identified this while testing
  dm-cache writeback)"

* tag 'dm-4.1-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm: fix casting bug in dm_merge_bvec()
  dm: fix reload failure of 0 path multipath mapping on blk-mq devices
  dm: fix false warning in free_rq_clone() for unmapped requests
  dm: requeue from blk-mq dm_mq_queue_rq() using BLK_MQ_RQ_QUEUE_BUSY
  dm mpath: fix leak of dm_mpath_io structure in blk-mq .queue_rq error path
  dm: fix NULL pointer when clone_and_map_rq returns !DM_MAPIO_REMAPPED
  dm: run queue on re-queue
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: fix casting bug in dm_merge_bvec()</title>
<updated>2015-05-29T17:41:16+00:00</updated>
<author>
<name>Joe Thornber</name>
<email>ejt@redhat.com</email>
</author>
<published>2015-05-29T13:52:51+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=1c220c69ce0dcc0f234a9f263ad9c0864f971852'/>
<id>1c220c69ce0dcc0f234a9f263ad9c0864f971852</id>
<content type='text'>
dm_merge_bvec() was originally added in f6fccb ("dm: introduce
merge_bvec_fn").  In that commit a value in sectors is converted to
bytes using &lt;&lt; 9, and then assigned to an int.  This code made
assumptions about the value of BIO_MAX_SECTORS.

A later commit 148e51 ("dm: improve documentation and code clarity in
dm_merge_bvec") was meant to have no functional change but it removed
the use of BIO_MAX_SECTORS in favor of using queue_max_sectors().  At
this point the cast from sector_t to int resulted in a zero value.  The
fallout being dm_merge_bvec() would only allow a single page to be added
to a bio.

This interim fix is minimal for the benefit of stable@ because the more
comprehensive cleanup of passing a sector_t to all DM targets' merge
function will impact quite a few DM targets.

Signed-off-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Cc: stable@vger.kernel.org # 3.19+
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
dm_merge_bvec() was originally added in f6fccb ("dm: introduce
merge_bvec_fn").  In that commit a value in sectors is converted to
bytes using &lt;&lt; 9, and then assigned to an int.  This code made
assumptions about the value of BIO_MAX_SECTORS.

A later commit 148e51 ("dm: improve documentation and code clarity in
dm_merge_bvec") was meant to have no functional change but it removed
the use of BIO_MAX_SECTORS in favor of using queue_max_sectors().  At
this point the cast from sector_t to int resulted in a zero value.  The
fallout being dm_merge_bvec() would only allow a single page to be added
to a bio.

This interim fix is minimal for the benefit of stable@ because the more
comprehensive cleanup of passing a sector_t to all DM targets' merge
function will impact quite a few DM targets.

Signed-off-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Cc: stable@vger.kernel.org # 3.19+
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: fix reload failure of 0 path multipath mapping on blk-mq devices</title>
<updated>2015-05-29T17:41:16+00:00</updated>
<author>
<name>Junichi Nomura</name>
<email>j-nomura@ce.jp.nec.com</email>
</author>
<published>2015-05-29T08:51:03+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=15b94a690470038aa08247eedbebbe7e2218d5ee'/>
<id>15b94a690470038aa08247eedbebbe7e2218d5ee</id>
<content type='text'>
dm-multipath accepts 0 path mapping.

  # echo '0 2097152 multipath 0 0 0 0' | dmsetup create newdev

Such a mapping can be used to release underlying devices while still
holding requests in its queue until working paths come back.

However, once the multipath device is created over blk-mq devices,
it rejects reloading of 0 path mapping:

  # echo '0 2097152 multipath 0 0 1 1 queue-length 0 1 1 /dev/sda 1' \
      | dmsetup create mpath1
  # echo '0 2097152 multipath 0 0 0 0' | dmsetup load mpath1
  device-mapper: reload ioctl on mpath1 failed: Invalid argument
  Command failed

With following kernel message:
  device-mapper: ioctl: can't change device type after initial table load.

DM tries to inherit the current table type using dm_table_set_type()
but it doesn't work as expected because of unnecessary check about
whether the target type is hybrid or not.

Hybrid type is for targets that work as either request-based or bio-based
and not required for blk-mq or non blk-mq checking.

Fixes: 65803c205983 ("dm table: train hybrid target type detection to select blk-mq if appropriate")
Signed-off-by: Jun'ichi Nomura &lt;j-nomura@ce.jp.nec.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
dm-multipath accepts 0 path mapping.

  # echo '0 2097152 multipath 0 0 0 0' | dmsetup create newdev

Such a mapping can be used to release underlying devices while still
holding requests in its queue until working paths come back.

However, once the multipath device is created over blk-mq devices,
it rejects reloading of 0 path mapping:

  # echo '0 2097152 multipath 0 0 1 1 queue-length 0 1 1 /dev/sda 1' \
      | dmsetup create mpath1
  # echo '0 2097152 multipath 0 0 0 0' | dmsetup load mpath1
  device-mapper: reload ioctl on mpath1 failed: Invalid argument
  Command failed

With following kernel message:
  device-mapper: ioctl: can't change device type after initial table load.

DM tries to inherit the current table type using dm_table_set_type()
but it doesn't work as expected because of unnecessary check about
whether the target type is hybrid or not.

Hybrid type is for targets that work as either request-based or bio-based
and not required for blk-mq or non blk-mq checking.

Fixes: 65803c205983 ("dm table: train hybrid target type detection to select blk-mq if appropriate")
Signed-off-by: Jun'ichi Nomura &lt;j-nomura@ce.jp.nec.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Merge tag 'md/4.1-rc5-fixes' of git://neil.brown.name/md</title>
<updated>2015-05-29T17:35:21+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-05-29T17:35:21+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=c492e2d464c82fdb82002569033325b4e1a7e5eb'/>
<id>c492e2d464c82fdb82002569033325b4e1a7e5eb</id>
<content type='text'>
Pull m,ore md bugfixes gfrom Neil Brown:
 "Assorted fixes for new RAID5 stripe-batching functionality.

  Unfortunately this functionality was merged a little prematurely.  The
  necessary testing and code review is now complete (or as complete as
  it can be) and to code passes a variety of tests and looks quite
  sensible.

  Also a fix for some recent locking changes - a race was introduced
  which causes a reshape request to sometimes fail.  No data safety
  issues"

* tag 'md/4.1-rc5-fixes' of git://neil.brown.name/md:
  md: fix race when unfreezing sync_action
  md/raid5: break stripe-batches when the array has failed.
  md/raid5: call break_stripe_batch_list from handle_stripe_clean_event
  md/raid5: be more selective about distributing flags across batch.
  md/raid5: add handle_flags arg to break_stripe_batch_list.
  md/raid5: duplicate some more handle_stripe_clean_event code in break_stripe_batch_list
  md/raid5: remove condition test from check_break_stripe_batch_list.
  md/raid5: Ensure a batch member is not handled prematurely.
  md/raid5: close race between STRIPE_BIT_DELAY and batching.
  md/raid5: ensure whole batch is delayed for all required bitmap updates.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Pull m,ore md bugfixes gfrom Neil Brown:
 "Assorted fixes for new RAID5 stripe-batching functionality.

  Unfortunately this functionality was merged a little prematurely.  The
  necessary testing and code review is now complete (or as complete as
  it can be) and to code passes a variety of tests and looks quite
  sensible.

  Also a fix for some recent locking changes - a race was introduced
  which causes a reshape request to sometimes fail.  No data safety
  issues"

* tag 'md/4.1-rc5-fixes' of git://neil.brown.name/md:
  md: fix race when unfreezing sync_action
  md/raid5: break stripe-batches when the array has failed.
  md/raid5: call break_stripe_batch_list from handle_stripe_clean_event
  md/raid5: be more selective about distributing flags across batch.
  md/raid5: add handle_flags arg to break_stripe_batch_list.
  md/raid5: duplicate some more handle_stripe_clean_event code in break_stripe_batch_list
  md/raid5: remove condition test from check_break_stripe_batch_list.
  md/raid5: Ensure a batch member is not handled prematurely.
  md/raid5: close race between STRIPE_BIT_DELAY and batching.
  md/raid5: ensure whole batch is delayed for all required bitmap updates.
</pre>
</div>
</content>
</entry>
<entry>
<title>dm: fix false warning in free_rq_clone() for unmapped requests</title>
<updated>2015-05-29T15:07:36+00:00</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2015-05-28T19:12:52+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=e5d8de32cc02a259e1a237ab57cba00f2930fa6a'/>
<id>e5d8de32cc02a259e1a237ab57cba00f2930fa6a</id>
<content type='text'>
When stacking request-based dm device on non blk-mq device and
device-mapper target could not map the request (error target is used,
multipath target with all paths down, etc), the WARN_ON_ONCE() in
free_rq_clone() will trigger when it shouldn't.

The warning was added by commit aa6df8d ("dm: fix free_rq_clone() NULL
pointer when requeueing unmapped request").  But free_rq_clone() with
clone-&gt;q == NULL is valid usage for the case where
dm_kill_unmapped_request() initiates request cleanup.

Fix this false warning by just removing the WARN_ON -- it only generated
false positives and was never useful in catching the intended case
(completing clone request not being mapped e.g. clone-&gt;q being NULL).

Fixes: aa6df8d ("dm: fix free_rq_clone() NULL pointer when requeueing unmapped request")
Reported-by: Bart Van Assche &lt;bart.vanassche@sandisk.com&gt;
Reported-by: Jun'ichi Nomura &lt;j-nomura@ce.jp.nec.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When stacking request-based dm device on non blk-mq device and
device-mapper target could not map the request (error target is used,
multipath target with all paths down, etc), the WARN_ON_ONCE() in
free_rq_clone() will trigger when it shouldn't.

The warning was added by commit aa6df8d ("dm: fix free_rq_clone() NULL
pointer when requeueing unmapped request").  But free_rq_clone() with
clone-&gt;q == NULL is valid usage for the case where
dm_kill_unmapped_request() initiates request cleanup.

Fix this false warning by just removing the WARN_ON -- it only generated
false positives and was never useful in catching the intended case
(completing clone request not being mapped e.g. clone-&gt;q being NULL).

Fixes: aa6df8d ("dm: fix free_rq_clone() NULL pointer when requeueing unmapped request")
Reported-by: Bart Van Assche &lt;bart.vanassche@sandisk.com&gt;
Reported-by: Jun'ichi Nomura &lt;j-nomura@ce.jp.nec.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>md: fix race when unfreezing sync_action</title>
<updated>2015-05-28T08:04:45+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2015-05-28T07:53:29+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=56ccc1125bc141cf63927eda7febff4216dea2d3'/>
<id>56ccc1125bc141cf63927eda7febff4216dea2d3</id>
<content type='text'>
A recent change removed the need for locking around writing
to "sync_action" (and various other places), but introduced a
subtle race.
When e.g. setting 'reshape' on a 'frozen' array, the 'frozen'
flag is cleared before 'reshape' is set, so the md thread can
get in and start trying recovery - which isn't wanted.

So instead of clearing MD_RECOVERY_FROZEN for any command
except 'frozen', only clear it when each specific command
is parsed.  This allows the handling of 'reshape' to clear
the bit while a lock is held.

Also remove some places where we set MD_RECOVERY_NEEDED,
as it is always set on non-error exit of the function.


Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Fixes: 6791875e2e53 ("md: make reconfig_mutex optional for writes to md sysfs files.")
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A recent change removed the need for locking around writing
to "sync_action" (and various other places), but introduced a
subtle race.
When e.g. setting 'reshape' on a 'frozen' array, the 'frozen'
flag is cleared before 'reshape' is set, so the md thread can
get in and start trying recovery - which isn't wanted.

So instead of clearing MD_RECOVERY_FROZEN for any command
except 'frozen', only clear it when each specific command
is parsed.  This allows the handling of 'reshape' to clear
the bit while a lock is held.

Also remove some places where we set MD_RECOVERY_NEEDED,
as it is always set on non-error exit of the function.


Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Fixes: 6791875e2e53 ("md: make reconfig_mutex optional for writes to md sysfs files.")
</pre>
</div>
</content>
</entry>
<entry>
<title>md/raid5: break stripe-batches when the array has failed.</title>
<updated>2015-05-28T01:48:59+00:00</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2015-05-22T04:03:10+00:00</published>
<link rel='alternate' type='text/html' href='http://rtsrv.cs.unc.edu/cgit/cgit.cgi/litmus-rt.git/commit/?id=626f2092c85ac847bb80b3257eb6a565dec32278'/>
<id>626f2092c85ac847bb80b3257eb6a565dec32278</id>
<content type='text'>
Once the array has too much failure, we need to break
stripe-batches up so they can all be dealt with.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Once the array has too much failure, we need to break
stripe-batches up so they can all be dealt with.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</pre>
</div>
</content>
</entry>
</feed>
