diff options
author | Christoph Hellwig <hch@lst.de> | 2010-09-03 05:56:17 -0400 |
---|---|---|
committer | Jens Axboe <jaxboe@fusionio.com> | 2010-09-10 06:35:37 -0400 |
commit | 04ccc65cd1f57aee861708e08cd2272c5a0d088c (patch) | |
tree | 60b72d9acc0f94f9abfd3a3cee7de2e58549c9eb /Documentation/block/writeback_cache_control.txt | |
parent | 09d60c701b64b509f328cac72970eb894f485b9e (diff) |
block: update documentation for REQ_FLUSH / REQ_FUA
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Diffstat (limited to 'Documentation/block/writeback_cache_control.txt')
-rw-r--r-- | Documentation/block/writeback_cache_control.txt | 86 |
1 files changed, 86 insertions, 0 deletions
diff --git a/Documentation/block/writeback_cache_control.txt b/Documentation/block/writeback_cache_control.txt new file mode 100644 index 000000000000..83407d36630a --- /dev/null +++ b/Documentation/block/writeback_cache_control.txt | |||
@@ -0,0 +1,86 @@ | |||
1 | |||
2 | Explicit volatile write back cache control | ||
3 | ===================================== | ||
4 | |||
5 | Introduction | ||
6 | ------------ | ||
7 | |||
8 | Many storage devices, especially in the consumer market, come with volatile | ||
9 | write back caches. That means the devices signal I/O completion to the | ||
10 | operating system before data actually has hit the non-volatile storage. This | ||
11 | behavior obviously speeds up various workloads, but it means the operating | ||
12 | system needs to force data out to the non-volatile storage when it performs | ||
13 | a data integrity operation like fsync, sync or an unmount. | ||
14 | |||
15 | The Linux block layer provides two simple mechanisms that let filesystems | ||
16 | control the caching behavior of the storage device. These mechanisms are | ||
17 | a forced cache flush, and the Force Unit Access (FUA) flag for requests. | ||
18 | |||
19 | |||
20 | Explicit cache flushes | ||
21 | ---------------------- | ||
22 | |||
23 | The REQ_FLUSH flag can be OR ed into the r/w flags of a bio submitted from | ||
24 | the filesystem and will make sure the volatile cache of the storage device | ||
25 | has been flushed before the actual I/O operation is started. This explicitly | ||
26 | guarantees that previously completed write requests are on non-volatile | ||
27 | storage before the flagged bio starts. In addition the REQ_FLUSH flag can be | ||
28 | set on an otherwise empty bio structure, which causes only an explicit cache | ||
29 | flush without any dependent I/O. It is recommend to use | ||
30 | the blkdev_issue_flush() helper for a pure cache flush. | ||
31 | |||
32 | |||
33 | Forced Unit Access | ||
34 | ----------------- | ||
35 | |||
36 | The REQ_FUA flag can be OR ed into the r/w flags of a bio submitted from the | ||
37 | filesystem and will make sure that I/O completion for this request is only | ||
38 | signaled after the data has been committed to non-volatile storage. | ||
39 | |||
40 | |||
41 | Implementation details for filesystems | ||
42 | -------------------------------------- | ||
43 | |||
44 | Filesystems can simply set the REQ_FLUSH and REQ_FUA bits and do not have to | ||
45 | worry if the underlying devices need any explicit cache flushing and how | ||
46 | the Forced Unit Access is implemented. The REQ_FLUSH and REQ_FUA flags | ||
47 | may both be set on a single bio. | ||
48 | |||
49 | |||
50 | Implementation details for make_request_fn based block drivers | ||
51 | -------------------------------------------------------------- | ||
52 | |||
53 | These drivers will always see the REQ_FLUSH and REQ_FUA bits as they sit | ||
54 | directly below the submit_bio interface. For remapping drivers the REQ_FUA | ||
55 | bits need to be propagated to underlying devices, and a global flush needs | ||
56 | to be implemented for bios with the REQ_FLUSH bit set. For real device | ||
57 | drivers that do not have a volatile cache the REQ_FLUSH and REQ_FUA bits | ||
58 | on non-empty bios can simply be ignored, and REQ_FLUSH requests without | ||
59 | data can be completed successfully without doing any work. Drivers for | ||
60 | devices with volatile caches need to implement the support for these | ||
61 | flags themselves without any help from the block layer. | ||
62 | |||
63 | |||
64 | Implementation details for request_fn based block drivers | ||
65 | -------------------------------------------------------------- | ||
66 | |||
67 | For devices that do not support volatile write caches there is no driver | ||
68 | support required, the block layer completes empty REQ_FLUSH requests before | ||
69 | entering the driver and strips off the REQ_FLUSH and REQ_FUA bits from | ||
70 | requests that have a payload. For devices with volatile write caches the | ||
71 | driver needs to tell the block layer that it supports flushing caches by | ||
72 | doing: | ||
73 | |||
74 | blk_queue_flush(sdkp->disk->queue, REQ_FLUSH); | ||
75 | |||
76 | and handle empty REQ_FLUSH requests in its prep_fn/request_fn. Note that | ||
77 | REQ_FLUSH requests with a payload are automatically turned into a sequence | ||
78 | of an empty REQ_FLUSH request followed by the actual write by the block | ||
79 | layer. For devices that also support the FUA bit the block layer needs | ||
80 | to be told to pass through the REQ_FUA bit using: | ||
81 | |||
82 | blk_queue_flush(sdkp->disk->queue, REQ_FLUSH | REQ_FUA); | ||
83 | |||
84 | and the driver must handle write requests that have the REQ_FUA bit set | ||
85 | in prep_fn/request_fn. If the FUA bit is not natively supported the block | ||
86 | layer turns it into an empty REQ_FLUSH request after the actual write. | ||