author Marc MERLIN <marc@merlins.org> 2016-03-12 02:04:19 -0500
committer Jonathan Corbet <corbet@lwn.net> 2016-06-23 09:57:03 -0400
commit c9b2ffc02287771e70eb27132c47df6d5c7d91fb
tree 97f4caf20b461288499f88456b4eb5c512b4ab8c
parent ebc88ef05c825024a5d95285459b8c842c095c0f
bcache: documentation updates and corrections

Bcache documentation updates:
- Added new HOWTO/COOKBOOK section
- fixed a few typos
- /sys/block/bcache0/cache_mode is /sys/block/bcache0/bcache/cache_mode

Signed-off-by: Marc MERLIN <marc@merlins.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>

-rw-r--r-- Documentation/bcache.txt 160
1 files changed, 152 insertions, 8 deletions
diff --git a/Documentation/bcache.txt b/Documentation/bcache.txt
index 32b6c3189d98..0aba405d3368 100644
--- a/Documentation/bcache.txt
+++ b/Documentation/bcache.txt
@@ -1,4 +1,4 @@
-Say you've got a big slow raid 6, and an X-25E or three. Wouldn't it be
+Say you've got a big slow raid 6, and an ssd or three. Wouldn't it be
 nice if you could use them as cache... Hence bcache.
 
 Wiki and git repositories are at:
@@ -8,7 +8,7 @@ Wiki and git repositories are at:
 
 It's designed around the performance characteristics of SSDs - it only allocates
 in erase block sized buckets, and it uses a hybrid btree/log to track cached
-extants (which can be anywhere from a single sector to the bucket size). It's
+extents (which can be anywhere from a single sector to the bucket size). It's
 designed to avoid random writes at all costs; it fills up an erase block
 sequentially, then issues a discard before reusing it.
 
@@ -55,7 +55,10 @@ immediately. Without udev, you can manually register devices like this:
 Registering the backing device makes the bcache device show up in /dev; you can
 now format it and use it as normal. But the first time using a new bcache
 device, it'll be running in passthrough mode until you attach it to a cache.
-See the section on attaching.
+If you are thinking about using bcache later, it is recommended to set up all your
+slow devices as bcache backing devices without a cache, and you can choose to add
+a caching device later.
+See 'ATTACHING' section below.
 
 The devices show up as:
 
@@ -72,12 +75,14 @@ To get started:
  mount /dev/bcache0 /mnt
 
 You can control bcache devices through sysfs at /sys/block/bcache<N>/bcache .
+You can also control them through /sys/fs/bcache/<cset-uuid>/ .
 
 Cache devices are managed as sets; multiple caches per set isn't supported yet
 but will allow for mirroring of metadata and dirty data in the future. Your new
 cache set shows up as /sys/fs/bcache/<UUID>
 
-ATTACHING:
+ATTACHING
+---------
 
 After your cache device and backing device are registered, the backing device
 must be attached to your cache set to enable caching. Attaching a backing
@@ -105,7 +110,8 @@ but all the cached data will be invalidated. If there was dirty data in the
 cache, don't expect the filesystem to be recoverable - you will have massive
 filesystem corruption, though ext4's fsck does work miracles.
 
-ERROR HANDLING:
+ERROR HANDLING
+--------------
 
 Bcache tries to transparently handle IO errors to/from the cache device without
 affecting normal operation; if it sees too many errors (the threshold is
@@ -127,7 +133,143 @@ the backing devices to passthrough mode.
  writeback mode). It currently doesn't do anything intelligent if it fails to
  read some of the dirty data, though.
 
-TROUBLESHOOTING PERFORMANCE:
+
+HOWTO/COOKBOOK
+--------------
+
+A) Your bcache doesn't start.
+ Starting a bcache with a missing caching device
+
+Registering the backing device doesn't help, it's already there, you just need
+to force it to run without the cache:
+host:~# echo /dev/sdb1 > /sys/fs/bcache/register
+[ 119.844831] bcache: register_bcache() error opening /dev/sdb1: device already registered
+
+Next, you try to register your caching device if it's present. However if it's
+absent, or registration fails for some reason, you can still start your bcache
+without its cache, like so:
+host:/sys/block/sdb/sdb1/bcache# echo 1 > running
+
+
+B) Bcache not finding its cache and not starting
+
+This does not work:
+host:/sys/block/md5/bcache# echo 0226553a-37cf-41d5-b3ce-8b1e944543a8 > attach
+[ 1933.455082] bcache: bch_cached_dev_attach() Couldn't find uuid for md5 in set
+[ 1933.478179] bcache: __cached_dev_store() Can't attach 0226553a-37cf-41d5-b3ce-8b1e944543a8
+[ 1933.478179] : cache set not found
+
+In this case, the caching device was simply not registered at boot or
+disappeared and came back, and needs to be (re-)registered:
+host:/sys/block/md5/bcache# echo /dev/sdh2 > /sys/fs/bcache/register
166
167C) Corrupt bcache caching device crashes the kernel on startup/boot
168
169You'll have to wipe the caching device, start the backing device without the
170cache, and you can re-attach the cleaned up caching device then. This does
171require booting with a kernel/rescue media where bcache is disabled
172since it will otherwise try to access your device and probably crash
173again before you have a chance to wipe it.
174(or if you plan ahead, compile a backup kernel with bcache disabled and keep it
175in your grub config for a rainy day)
176If bcache is not available in the kernel, a filesystem on the backing device is
177still available at an 8KiB offset. So either via a loopdev of the backing device
178created with --offset 8K or by temporarily increasing the start sector of the
179partition by 16 (512byte sectors).
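
The figure of 16 sectors is the 8KiB offset expressed in 512-byte sectors. A small sketch of that arithmetic follows; the function name is made up for illustration, and the commented losetup line is an assumed invocation that reuses /dev/sdb1 from the earlier example rather than a tested recipe:

```shell
# Hypothetical helper: convert a byte offset into a number of sectors.
# Usage: offset_to_sectors <offset-bytes> <sector-size-bytes>
offset_to_sectors() {
    echo $(( $1 / $2 ))
}

# The 8KiB data offset on a device with 512-byte sectors.
offset_to_sectors 8192 512    # prints 16

# Assumed usage on a rescue system (not run here): expose the filesystem
# through a read-only loop device that skips the first 8KiB.
#   losetup --find --show --read-only --offset 8192 /dev/sdb1
```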
+
+This is how you wipe the caching device:
+host:~# wipefs -a /dev/sdh2
+16 bytes were erased at offset 0x1018 (bcache)
+they were: c6 85 73 f6 4e 1a 45 ca 82 65 f5 7f 48 ba 6d 81
+
+After you boot back with bcache enabled, you recreate the cache and attach it:
+host:~# make-bcache -C /dev/sdh2
+UUID: 7be7e175-8f4c-4f99-94b2-9c904d227045
+Set UUID: 5bc072a8-ab17-446d-9744-e247949913c1
+version: 0
+nbuckets: 106874
+block_size: 1
+bucket_size: 1024
+nr_in_set: 1
+nr_this_dev: 0
+first_bucket: 1
+[ 650.511912] bcache: run_cache_set() invalidating existing data
+[ 650.549228] bcache: register_cache() registered cache device sdh2
+
+start backing device with missing cache:
+host:/sys/block/md5/bcache# echo 1 > running
+
+attach new cache:
+host:/sys/block/md5/bcache# echo 5bc072a8-ab17-446d-9744-e247949913c1 > attach
+[ 865.276616] bcache: bch_cached_dev_attach() Caching md5 as bcache0 on set 5bc072a8-ab17-446d-9744-e247949913c1
+
+
+D) Remove or replace a caching device
+
+host:/sys/block/sda/sda7/bcache# echo 1 > detach
+[ 695.872542] bcache: cached_dev_detach_finish() Caching disabled for sda7
+
+host:~# wipefs -a /dev/nvme0n1p4
+wipefs: error: /dev/nvme0n1p4: probing initialization failed: Device or resource busy
+Oops, it's disabled, but not unregistered, so it's still protected
+
+We need to go and unregister it:
+host:/sys/fs/bcache/b7ba27a1-2398-4649-8ae3-0959f57ba128# ls -l cache0
+lrwxrwxrwx 1 root root 0 Feb 25 18:33 cache0 -> ../../../devices/pci0000:00/0000:00:1d.0/0000:70:00.0/nvme/nvme0/nvme0n1/nvme0n1p4/bcache/
+host:/sys/fs/bcache/b7ba27a1-2398-4649-8ae3-0959f57ba128# echo 1 > stop
+kernel: [ 917.041908] bcache: cache_set_free() Cache set b7ba27a1-2398-4649-8ae3-0959f57ba128 unregistered
+
+Now we can wipe it:
+host:~# wipefs -a /dev/nvme0n1p4
+/dev/nvme0n1p4: 16 bytes were erased at offset 0x00001018 (bcache): c6 85 73 f6 4e 1a 45 ca 82 65 f5 7f 48 ba 6d 81
+
+
+E) dmcrypt and bcache
+
+First set up bcache unencrypted and then install dmcrypt on top of /dev/bcache<N>
+This will work faster than if you dmcrypt both the backing and caching
+devices and then install bcache on top.
+
+
+F) Stop/free a registered bcache to wipe and/or recreate it
+(or maybe you need to free up all bcache references so that you can have fdisk
+run and re-register a changed partition table, which won't work if there are any
+active backing or caching devices left on it)
+
+1) Is it present in /dev/bcache* ? (there are times where it won't be)
+If so, it's easy:
+host:/sys/block/bcache0/bcache# echo 1 > stop
+
+2) But if your backing device is gone, this won't work:
+host:/sys/block/bcache0# cd bcache
+bash: cd: bcache: No such file or directory
+
+In this case, you may have to unregister the dmcrypt block device that
+references this bcache to free it up:
+host:~# dmsetup remove oldds1
+bcache: bcache_device_free() bcache0 stopped
+bcache: cache_set_free() Cache set 5bc072a8-ab17-446d-9744-e247949913c1 unregistered
+
+This causes the backing bcache to be removed from /sys/fs/bcache and then it can
+be reused
+
+3) In other cases, you can also look in /sys/fs/bcache/:
+host:/sys/fs/bcache# ls -l */{cache?,bdev?}
+lrwxrwxrwx 1 root root 0 Mar 5 09:39 0226553a-37cf-41d5-b3ce-8b1e944543a8/bdev1 -> ../../../devices/virtual/block/dm-1/bcache/
+lrwxrwxrwx 1 root root 0 Mar 5 09:39 0226553a-37cf-41d5-b3ce-8b1e944543a8/cache0 -> ../../../devices/virtual/block/dm-4/bcache/
+lrwxrwxrwx 1 root root 0 Mar 5 09:39 5bc072a8-ab17-446d-9744-e247949913c1/cache0 -> ../../../devices/pci0000:00/0000:00:01.0/0000:01:00.0/ata10/host9/target9:0:0/9:0:0:0/block/sdl/sdl2/bcache/
+
+The device names will show which UUID is relevant, cd in that directory
+and stop the cache:
+host:/sys/fs/bcache/5bc072a8-ab17-446d-9744-e247949913c1# echo 1 > stop
+this will free up bcache references and let you reuse the partition for other
+purposes.
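
Reading the symlink targets by eye, as in step 3, can also be scripted. This is a rough sketch that assumes the cache<N>/bdev<N> symlink layout shown in the ls output above; the function name is hypothetical, and the sysfs root is passed in so it can be tried on a scratch directory:

```shell
# Hypothetical helper: print the UUID of every cache set that references a
# given block device through a cache<N> or bdev<N> symlink.
# Usage: find_cset_for_dev <sysfs-root> <device-name, e.g. sdl2>
find_cset_for_dev() {
    root=$1
    dev=$2
    for link in "$root"/*/cache[0-9]* "$root"/*/bdev[0-9]*; do
        # Unmatched globs stay literal and are skipped here, as is
        # anything that is not a symlink.
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        # Strip an optional trailing slash, then match .../<dev>/bcache
        case "${target%/}" in
            */"$dev"/bcache)
                # The parent directory name is the cache set UUID.
                basename "$(dirname "$link")"
                ;;
        esac
    done
}

# Assumed usage (not run here):
#   find_cset_for_dev /sys/fs/bcache sdl2
```

The printed UUID is the directory under /sys/fs/bcache whose stop file would then be written, as in the last command above.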
+
+
+
+TROUBLESHOOTING PERFORMANCE
+---------------------------
 
 Bcache has a bunch of config options and tunables. The defaults are intended to
 be reasonable for typical desktop and server workloads, but they're not what you
@@ -140,7 +282,7 @@ want for getting the best possible numbers when benchmarking.
  maturity, but simply because in writeback mode you'll lose data if something
  happens to your SSD)
 
- # echo writeback > /sys/block/bcache0/cache_mode
+ # echo writeback > /sys/block/bcache0/bcache/cache_mode
 
  - Bad performance, or traffic not going to the SSD that you'd expect
 
@@ -193,7 +335,9 @@ want for getting the best possible numbers when benchmarking.
  Solution: warm the cache by doing writes, or use the testing branch (there's
  a fix for the issue there).
 
-SYSFS - BACKING DEVICE:
+
+SYSFS - BACKING DEVICE
+----------------------
 
 Available at /sys/block/<bdev>/bcache, /sys/block/bcache*/bcache and
 (if attached) /sys/fs/bcache/<cset-uuid>/bdev*