aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/md
Commit message (Collapse)AuthorAge
* [PATCH] md: fix --re-add for raid1 and raid6NeilBrown2005-11-28
| | | | | | | | | | | | | | | | If you have an array with a write-intent-bitmap, and you remove a device, then re-add it, a full recovery isn't needed. We detect a re-add by looking at saved_raid_disk. For raid1, it doesn't matter which disk it was, only whether or not it was an active device. The old code being removed set a value of 'mirror' which was then ignored, so it can go. The changed code performs the correct check. For raid6, if there are two missing devices, make sure we chose the right slot on --re-add rather than always the first slot. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: set default_bitmap_offset properly in set_array_infoNeilBrown2005-11-28
| | | | | | | | | | If an array is created using set_array_info, default_bitmap_offset isn't set properly meaning that an internal bitmap cannot be hot-added until the array is stopped and re-assembled. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: fix problem with raid6 intent bitmapNeilBrown2005-11-28
| | | | | | | | | | When doing a recovery, we need to know whether the array will still be degraded after the recovery has finished, so we can know whether bits can be clearred yet or not. This patch performs the required check. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: fix locking problem in r5/r6NeilBrown2005-11-28
| | | | | | | | | bitmap_unplug actually writes data (bits) to storage, so we shouldn't be holding a spinlock... Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: improve read speed to raid10 arrays using 'far copies'NeilBrown2005-11-28
| | | | | | | | | | | | | | | | | | | | | | raid10 has two different layouts. One uses near-copies (so multiple copies of a block are at the same or similar offsets of different devices) and the other uses far-copies (so multiple copies of a block are stored a greatly different offsets on different devices). The point of far-copies is that it allows the first section (normally first half) to be layed out in normal raid0 style, and thus provide raid0 sequential read performance. Unfortunately, the read balancing in raid10 makes some poor decisions for far-copies arrays and you don't get the desired performance. So turn off that bad bit of read_balance for far-copies arrays. With this patch, read speed of an 'f2' array is comparable with a raid0 with the same number of devices, though write speed is ofcourse still very slow. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] device-mapper raid1: drop mark_region spinlock fixJonathan E Brassow2005-11-22
| | | | | | | | | | | | | | | The spinlock region_lock is held while calling mark_region which can sleep. Drop the spinlock before calling that function. A region's state and inclusion in the clean list are altered by rh_inc and rh_dec. The state variable is set to RH_CLEAN in rh_dec, but only if 'pending' is zero. It is set to RH_DIRTY in rh_inc, but not if it is already so. The changes to 'pending', the state, and the region's inclusion in the clean list need to be atomicly. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] device-mapper snapshot: bio_list fixjblunck@suse.de2005-11-22
| | | | | | | | bio_list_merge() should do nothing if the second list is empty - not oops. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] device-mapper dm-mpath: endio spinlock fixStefan Bader2005-11-22
| | | | | | | | do_end_io() can be called without interrupts blocked. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] device-mapper: mirror log bitset fixAlasdair G Kergon2005-11-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The linux bitset operators (test_bit, set_bit etc) work on arrays of "unsigned long". dm-log uses such bitsets but treats them as arrays of uint32_t, only allocating and zeroing a multiple of 4 bytes (as 'clean_bits' is a uint32_t). The patch below fixes this problem. The problem is specific to 64-bit big endian machines such as s390x or ppc-64 and can prevent pvmove terminating. In the simplest case, if "region_count" were (say) 30, then bitset_size (below) would be 4 and bitset_uint32_count would be 1. Thus the memory for this butset, after allocation and zeroing would be 0 0 0 0 X X X X On a bigendian 64bit machine, bit 0 for this bitset is in the 8th byte! (and every bit that dm-log would use would be in the X area). 0 0 0 0 X X X X ^ here which hasn't been cleared properly. As the dm-raid1 code only syncs and counts regions which have a 0 in the 'sync_bits' bitset, and only finishes when it has counted high enough, a large number of 1's among those 'X's will cause the sync to not complete. It is worth noting that the code uses the same bitsets for in-memory and on-disk logs. As these bitsets are host-endian and host-sized, this means that they cannot safely be moved between computers with Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] device-mapper: list_versions fixAlasdair G Kergon2005-11-22
| | | | | | | | | | | | | | | | | In some circumstances the LIST_VERSIONS output is truncated because the size calculation forgets about a 'uint32_t' in each structure - but the inclusion of the whole of ALIGN_MASK frequently compensates for the omission. This is a quick workaround to use an upper bound. (The code ought to be fixed to supply the actual size.) Running 'dmsetup targets' may demonstrate the problem: when I run it, the last line comes out as 'erro' instead of 'error'. Consequently, 'lvcreate --type error' doesn't work. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] device-mapper dm-ioctl: missing put in table load error caseKiyoshi Ueda2005-11-22
| | | | | | | | | An error path in table_load() forgets to release a table that won't now be referenced. Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: fix is_mddev_idle calculation now that disk/sector accounting ↵NeilBrown2005-11-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | happens when request completes md needs to monitor the rate of requests to its devices when doing resync/recovery so that it can back-off when there is non-resync IO. It does this by comparing resync IO, which it counts, with total IO which is taken from disk_stats. disk_stats were recently changed to account sectors when a request completes instead of when it is queued. This upsets md's calculations. We could do the sync_io accounting at the end of requests too, but that has problems. If an underlying device is an md array, the accounting will still be done when the request is submitted. This could be changed for some raid levels, but it cannot be changed for raid0 or linear without substantial code changes. So instead, we increase the error that is_mddev_idle allows, up to the maximum amount of resync IO that can be in flight at any time. The calculation is current fragile as each personality as different limits for in-flight resync. This should be fixed up. For now, this simple patch fixes the problem. Increasing the error margin decreases the sensitivity to non-resync IO. To partially compensate for this, the time to wait when non-resync IO is detected is increased so that less steady IO is required to keep the resync at bay. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: don't pass a NULL file* into ->prepare_write()Neil Brown2005-11-18
| | | | | | | | Some filesystems go oops. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: make md threads interruptible againNeilBrown2005-11-15
| | | | | | | | | | | | | Despite the fact that md threads don't need to be signalled, and won't respond to signals anyway, we need to have an 'interruptible' wait, else they stay in 'D' state and add to the load average. (akpm: the signal_pending() test is unneeded - we'll fix that up in the next round. For now, leave it there because that's how the code used to be). Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: mark START_ARRAY deprecated with a dateNeilBrown2005-11-15
| | | | | | | | | | This was marked deprecated "after 2.6" back in the 2.5 days. But now it seems there isn't going to be any "after 2.6", and we deprecate by date now. So set a date. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: document sysfs usage of md, and make a couple of small refinementsNeilBrown2005-11-09
| | | | | | | | | | | Document in Documentation/md.txt the files that now appear in sysfs, and make a couple of small refinements to exactly when 'level' and 'raid_disks' are empty, to make it match the documentation. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: improve 'scan_mode' and rename it to 'sync_action'NeilBrown2005-11-09
| | | | | | | | | | | | | | | | | | | | | | The current sync_action for an array can be one of idle - nothing happening resync - reduncancy being recalcualted recover - missing device being recoverred to spare check - user initiated check of redundancy repair - like resync but user-initiated and ignores bitmap optimisation. Each of these strings can also be written to the 'sync_action' file to cause that action to happen (if appropriate). While 'sync' is not technically correct, as a recovery is *not* a 'sync', I think it is the most servicable word here. Also 'action' is a strong word than 'mode'. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: complete conversion of md to use kthreadsNeilBrown2005-11-09
| | | | | | | | | | | | | | | | | | | | There are a few loose ends following the conversion of md to use kthreads: - Some fields in mdk_thread_t that aren't needed (kthreads does it's own completion and manages it's own name). - thread->run is now never NULL, so no need to check - Some tests for signal_pending that aren't needed (As we don't use signals to stop threads any more) - Some flush_signals are not needed - Some waits are interruptible and don't need to be. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: ignore auto-readonly flag for arrays where it isn't meaningfulNeilBrown2005-11-09
| | | | | | | | | | | | | The 'auto-readonly' flag (which suppresses resync and superblock updates until the first write) is not meaningful for personalities that don't support resync or superblock writes (raid0, linear, etc). So clear the setting early to avoid it confusing anything - e.g. appearing in /proc/mdstat Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: only try to print recovery/resync status for personalities that ↵NeilBrown2005-11-09
| | | | | | | | | | | | | support recovery The introduction of 'resync=PENDING' (for read-only devices) caused that message to appear for non-syncable arrays like raid0 and linear. Simplest thing is to not try to print any resync info unless the personality clearly supports it. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: split off some md attributes in sysfs to a separate groupNeilBrown2005-11-09
| | | | | | | | | | | | | | | | | Some, but not all, md array support data redundancy and hence support checking and restoring that redundancy (resync, rebuild). Some attributes apply specifically to functions involving this redundancy, and so should only appear for md arrays for which they are meaningful. i.e. they should not appear for raid0, linear, multpath, faulty. This patch separates these into a distinct group and creates the group only if the personality supports sync_request. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: fix some locking and module refcounting issues with md's use of ↵NeilBrown2005-11-09
| | | | | | | | | | | | | | | | | | | | | | | | | sysfs 1/ I really should be using the __ATTR macros for defining attributes, so that the .owner field get set properly, otherwise modules can be removed while sysfs files are open. This also involves some name changes of _show routines. 2/ Always lock the mddev (against reconfiguration) for all sysfs attribute access. This easily avoid certain races and is completely consistant with other interfaces (ioctl and /proc/mdstat both always lock against reconfiguration). 3/ raid5 attributes must check that the 'conf' structure actually exists (the array could have been stopped while an attribute file was open). 4/ A missing 'kfree' from when the raid5_conf_t was converted to have a kobject embedded, and then converted back again. Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: make sure a user-request sync of raid5 ignores intent bitmapNeilBrown2005-11-09
| | | | | | | | | A sync of raid5 usually ignore blocks which the bitmap says are in-sync. But a user-request check or repair should not ignore these. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: make manual repair work for raid1NeilBrown2005-11-09
| | | | | | | | | | Raid1 currently optimises resync using the intent bitmap etc. This optimisation is not wanted when we explicitly request a repair through sysfs, so add appropriate checks. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: make sure /block link in /sys/.../md/ goes to correct devicesNeilBrown2005-11-09
| | | | | | | | | | | | | | | | If a block_device is a partition, then it's kobject is bdev->bd_part->kobj otherwise (if it is a full device), the kobject is bdev->bd_disk->kobj As md wants back-links to the correct object (whether partition or not), we need to respect this difference... (Thus current code shows a link to the whole device, whether we are using a partition or not, which is wrong). Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: allow md arrays to be started read-only (module parameter).NeilBrown2005-11-09
| | | | | | | | | | | | | | | | | | | | | When an md array is started, the superblock will be written, and resync may commense. This is not good if you want to be completely read-only as, for example, when preparing to resume from a suspend-to-disk image. So introduce a module parameter "start_ro" which can be set to '1' at boot, at module load, or via /sys/module/md_mod/parameters/start_ro When this is set, new arrays get an 'auto-ro' mode, which disables all internal io (superblock updates, resync, recovery) and is automatically switched to 'rw' when the first write request arrives. The array can be set to true 'ro' mode using 'mdadm -r' before the first write request, or resync can be started without a write using 'mdadm -w'. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: Remove attempt to use dynamic names in sysfs for component ↵NeilBrown2005-11-09
| | | | | | | | | | | | | | | | | | | | devices on an MD array. With version-0.90 superblock, component devices on an md device to not have any stable name related to the array -(version-1 assigns a fixed index when a device is added to an array, and this remains despit any hot-swap). The intial code for making these devices appear in sysfs used dynamic names, which would change whenever a hot-spare was swapped for a failed or missing device. This turns out not to be practical in sysfs for a number of reasons. This patch changes then naming of component devices to be based on the result of 'bdevname'. This is stable and should be unique. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: support BIO_RW_BARRIER for md/raid1NeilBrown2005-11-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can only accept BARRIER requests if all slaves handle barriers, and that can, of course, change with time.... So we keep track of whether the whole array seems safe for barriers, and also whether each individual rdev handles barriers. We initially assumes barriers are OK. When writing the superblock we try a barrier, and if that fails, we flag things for no-barriers. This will usually clear the flags fairly quickly. If writing the superblock finds that BIO_RW_BARRIER is -ENOTSUPP, we need to resubmit, so introduce function "md_super_wait" which waits for requests to finish, and retries ENOTSUPP requests without the barrier flag. When writing the real raid1, write requests which were BIO_RW_BARRIER but which aresn't supported need to be retried. So raid1d is enhanced to do this, and when any bio write completes (i.e. no retry needed) we remove it from the r1bio, so that devices needing retry are easy to find. We should hardly ever get -ENOTSUPP errors when writing data to the raid. It should only happen if: 1/ the device used to support BARRIER, but now doesn't. Few devices change like this, though raid1 can! or 2/ the array has no persistent superblock, so there was no opportunity to pre-test for barriers when writing the superblock. Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: make md on-disk bitmaps not host-endianNeilBrown2005-11-09
| | | | | | | | | | | Current bitmaps use set_bit et.al and so are host-endian, which means not-portable. Oops. Define a new version number (4) for which bitmaps are little-endian. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: convert 'faulty' and 'in_sync' fields to bits in 'flags' fieldNeilBrown2005-11-09
| | | | | | | | | This has the advantage of removing the confusion caused by 'rdev_t' and 'mddev_t' both having 'in_sync' fields. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: improvements to raid5 handling of read errorsNeilBrown2005-11-09
| | | | | | | | | | | | | Two refinements to the 'attempt-overwrite-on-read-error' mechanism. 1/ If the array is read-only, don't attempt an over-write. 2/ If there are more than max_nr_stripes read errors on a device with no success, fail the drive. This will make sure a dead drive will be eventually kicked even when we aren't trying to rewrite (which would normally kick a dead drive more quickly. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: change raid5 sysfs attribute to not create a new directoryNeilBrown2005-11-09
| | | | | | | | | | | | | | | There isn't really a need for raid5 attributes to be an a subdirectory, so this patch moves them from /sys/block/mdX/md/raid5/attribute to /sys/block/mdX/md/attribute This suggests that all md personalities should co-operate about namespace usage, but that shouldn't be a problem. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: minor MD fixesNeilBrown2005-11-09
| | | | | | | | | | | | | 1/ Use reduce stack usage, because 'gcc' apparently doesn't overlay different variables that are in separate scopes... 2/ Use test_bit instead of ( .. & 1<< ..) which in this case is buggy. Thanks to Andrew Morton Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: fix ref-counting problems with kobjects in mdNeilBrown2005-11-09
| | | | | | | | | Thanks Greg. Cc: Greg KH <greg@kroah.com> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: provide proper rcu_dereference / rcu_assign_pointer annotations ↵Suzanne Wood2005-11-09
| | | | | | | | | | in md Acked-by: <paulmck@us.ibm.com> Signed-off-by: Suzanne Wood <suzannew@cs.pdx.edu> Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: teach raid5 the difference between 'check' and 'repair'.NeilBrown2005-11-09
| | | | | | | | | | | With this, raid5 can be asked to check parity without repairing it. It also keeps a count of the number of incorrect parity blocks found (mismatches) and reports them through sysfs. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: allow a manual resync with mdNeilBrown2005-11-09
| | | | | | | | | | | | | | | | | | You can trigger a 'check' with echo check > /sys/block/mdX/md/scan_mode or a check-and-repair errors with echo repair > /sys/block/mdX/md/scan_mode and read the current state from the same file. Note: personalities need to know the different between 'check' and 'repair', but don't yet. Until they do, 'check' will be the same as 'repair' and will just do a normal resync pass. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: add kobject/sysfs support to raid5NeilBrown2005-11-09
| | | | | | | | | | | | | | | /sys/block/mdX/md/raid5/ contains raid5-related attributes. Currently stripe_cache_size is number of entries in stripe cache, and is settable. stripe_cache_active is number of active entries, and in only readable. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: extend md sysfs support to component devices.NeilBrown2005-11-09
| | | | | | | | | | | | | | | | | Each device in an md array how has a corresponding /sys/block/mdX/md/devNN/ directory which can contain attributes. Currently there is only 'state' which summarises the state, nd 'super' which has a copy of the superblock, and 'block' which is a symlink to the block device. Also, /sys/block/mdX/md/rdNN represents slot 'NN' in the array, and is a symlink to the relevant 'devNN'. Obviously spare devices do not have a slot in the array, and so don't have such a symlink. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: initial sysfs support for mdNeilBrown2005-11-09
| | | | | | | | | | | Start using kobjects in mddevs, and provide a couple of simple attributes (level and disks). Attributes live in /sys/block/mdX/md/attr-name Signed-off-by: Neil Brown <neilb@suse.de> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: better handling of readerrors with raid5.NeilBrown2005-11-09
| | | | | | | | | | | | | | | | | This patch changes the behaviour of raid5 when it gets a read error. Instead of just failing the device, it tried to find out what should have been there, and writes it over the bad block. For some media-errors, this has a reasonable chance of fixing the error. If the write succeeds, and a subsequent read succeeds as well, raid5 decided the address is OK and conitnues. Instead of failing a drive on read-error, we attempt to re-write the block, and then re-read. If that all works, we allow the device to remain in the array. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] changing CONFIG_LOCALVERSION rebuilds too much, for no good reasonOlaf Hering2005-11-09
| | | | | | | | | | | | | | | | | | | | | This patch removes almost all inclusions of linux/version.h. The 3 #defines are unused in most of the touched files. A few drivers use the simple KERNEL_VERSION(a,b,c) macro, which is unfortunatly in linux/version.h. There are also lots of #ifdef for long obsolete kernels, this was not touched. In a few places, the linux/version.h include was move to where the LINUX_VERSION_CODE was used. quilt vi `find * -type f -name "*.[ch]"|xargs grep -El '(UTS_RELEASE|LINUX_VERSION_CODE|KERNEL_VERSION|linux/version.h)'|grep -Ev '(/(boot|coda|drm)/|~$)'` search pattern: /UTS_RELEASE\|LINUX_VERSION_CODE\|KERNEL_VERSION\|linux\/\(utsname\|version\).h Signed-off-by: Olaf Hering <olh@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] drivers/md: fix-up schedule_timeout() usageNishanth Aravamudan2005-11-07
| | | | | | | | | | Use schedule_timeout_interruptible() instead of set_current_state()/schedule_timeout() to reduce kernel size. Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com> Cc: Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [BLOCK] Unify the seperate read/write io stat fields into arraysJens Axboe2005-11-01
| | | | | | | | | | Instead of having ->read_sectors and ->write_sectors, combine the two into ->sectors[2] and similar for the other fields. This saves a branch several places in the io path, since we don't have to care for what the actual io direction is. On my x86-64 box, that's 200 bytes less text in just the core (not counting the various drivers). Signed-off-by: Jens Axboe <axboe@suse.de>
* [PATCH] Use sg_set_buf/sg_init_one where applicableDavid Hardeman2005-10-29
| | | | | | | | | | | | | This patch uses sg_set_buf/sg_init_one in some places where it was duplicated. Signed-off-by: David Hardeman <david@2gen.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Greg KH <greg@kroah.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* [PATCH] gfp_t: remaining bits of drivers/*Al Viro2005-10-28
| | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] md: make sure mdthreads will always respond to kthread_stopNeilBrown2005-10-26
| | | | | | | | | | | There are still a couple of cases where md threads (the resync/recovery thread) is not interruptible since the change to use kthreads. All places there it tests "signal_pending", it should also test kthread_should_stop, as with this patch. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] Three one-liners in md.cNeilBrown2005-10-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main problem fixes is that in certain situations stopping md arrays may take longer than you expect, or may require multiple attempts. This would only happen when resync/recovery is happening. This patch fixes three vaguely related bugs. 1/ The recent change to use kthreads got the setting of the process name wrong. This fixes it. 2/ The recent change to use kthreads lost the ability for md threads to be signalled with SIG_KILL. This restores that. 3/ There is a long standing bug in that if: - An array needs recovery (onto a hot-spare) and - The recovery is being blocked because some other array being recovered shares a physical device and - The recovery thread is killed with SIG_KILL Then the recovery will appear to have completed with no IO being done, which can cause data corruption. This patch makes sure that incomplete recovery will be treated as incomplete. Note that any kernel affected by bug 2 will not suffer the problem of bug 3, as the signal can never be delivered. Thus the current 2.6.14-rc kernels are not susceptible to data corruption. Note also that if arrays are shutdown (with "mdadm -S" or "raidstop") then the problem doesn't occur. It only happens if a SIGKILL is independently delivered as done by 'init' when shutting down. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] gfp flags annotations - part 1Al Viro2005-10-08
| | | | | | | | | | | | - added typedef unsigned int __nocast gfp_t; - replaced __nocast uses for gfp flags with gfp_t - it gives exactly the same warnings as far as sparse is concerned, doesn't change generated code (from gcc point of view we replaced unsigned int with typedef) and documents what's going on far better. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] device-mapper: Fix queue_if_no_path initialisationAlasdair G Kergon2005-09-28
| | | | | | | | | | | | | | | | | | When creating a multipath device, if the queue_if_no_path parameter is specified it gets ignored. While the queue_if_no_path variable is correctly set to 1, the saved_queue_if_no_path gets set to 0. When the device is subsequently made live (resumed), the saved value (0) always overwrites the live value (1) so the option *always* gets turned off. The fix adds a parameter to the queue_if_no_path() function to indicate whether the previous value should be preserved or not - if not, as when the device is being set up, the saved value is set to the new value (1). Signed-Off-By: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>