aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2012-10-10 20:04:23 -0400
committerLinus Torvalds <torvalds@linux-foundation.org>2012-10-10 20:04:23 -0400
commitce40be7a820bb393ac4ac69865f018d2f4038cf0 (patch)
treeb1fe5a93346eb06f22b1c303d63ec5456d7212ab /Documentation
parentba0a5a36f60e4c1152af3a2ae2813251974405bf (diff)
parent02f3939e1a9357b7c370a4a69717cf9c02452737 (diff)
Merge branch 'for-3.7/core' of git://git.kernel.dk/linux-block
Pull block IO update from Jens Axboe: "Core block IO bits for 3.7. Not a huge round this time, it contains: - First series from Kent cleaning up and generalizing bio allocation and freeing. - WRITE_SAME support from Martin. - Mikulas patches to prevent O_DIRECT crashes when someone changes the block size of a device. - Make bio_split() work on data-less bio's (like trim/discards). - A few other minor fixups." Fixed up silent semantic mis-merge as per Mikulas Patocka and Andrew Morton. It is due to the VM no longer using a prio-tree (see commit 6b2dbba8b6ac: "mm: replace vma prio_tree with an interval tree"). So make set_blocksize() use mapping_mapped() instead of open-coding the internal VM knowledge that has changed. * 'for-3.7/core' of git://git.kernel.dk/linux-block: (26 commits) block: makes bio_split support bio without data scatterlist: refactor the sg_nents scatterlist: add sg_nents fs: fix include/percpu-rwsem.h export error percpu-rw-semaphore: fix documentation typos fs/block_dev.c:1644:5: sparse: symbol 'blkdev_mmap' was not declared blockdev: turn a rw semaphore into a percpu rw semaphore Fix a crash when block device is read and block size is changed at the same time block: fix request_queue->flags initialization block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue() block: ioctl to zero block ranges block: Make blkdev_issue_zeroout use WRITE SAME block: Implement support for WRITE SAME block: Consolidate command flag and queue limit checks for merges block: Clean up special command handling logic block/blk-tag.c: Remove useless kfree block: remove the duplicated setting for congestion_threshold block: reject invalid queue attribute values block: Add bio_clone_bioset(), bio_clone_kmalloc() block: Consolidate bio_alloc_bioset(), bio_kmalloc() ...
Diffstat (limited to 'Documentation')
-rw-r--r--Documentation/ABI/testing/sysfs-block14
-rw-r--r--Documentation/block/biodoc.txt5
-rw-r--r--Documentation/percpu-rw-semaphore.txt27
3 files changed, 41 insertions, 5 deletions
diff --git a/Documentation/ABI/testing/sysfs-block b/Documentation/ABI/testing/sysfs-block
index c1eb41cb9876..279da08f7541 100644
--- a/Documentation/ABI/testing/sysfs-block
+++ b/Documentation/ABI/testing/sysfs-block
@@ -206,3 +206,17 @@ Description:
206 when a discarded area is read the discard_zeroes_data 206 when a discarded area is read the discard_zeroes_data
207 parameter will be set to one. Otherwise it will be 0 and 207 parameter will be set to one. Otherwise it will be 0 and
208 the result of reading a discarded area is undefined. 208 the result of reading a discarded area is undefined.
209
210What: /sys/block/<disk>/queue/write_same_max_bytes
211Date: January 2012
212Contact: Martin K. Petersen <martin.petersen@oracle.com>
213Description:
214 Some devices support a write same operation in which a
215 single data block can be written to a range of several
216 contiguous blocks on storage. This can be used to wipe
217 areas on disk or to initialize drives in a RAID
218 configuration. write_same_max_bytes indicates how many
219 bytes can be written in a single write same command. If
220 write_same_max_bytes is 0, write same is not supported
221 by the device.
222
diff --git a/Documentation/block/biodoc.txt b/Documentation/block/biodoc.txt
index e418dc0a7086..8df5e8e6dceb 100644
--- a/Documentation/block/biodoc.txt
+++ b/Documentation/block/biodoc.txt
@@ -465,7 +465,6 @@ struct bio {
465 bio_end_io_t *bi_end_io; /* bi_end_io (bio) */ 465 bio_end_io_t *bi_end_io; /* bi_end_io (bio) */
466 atomic_t bi_cnt; /* pin count: free when it hits zero */ 466 atomic_t bi_cnt; /* pin count: free when it hits zero */
467 void *bi_private; 467 void *bi_private;
468 bio_destructor_t *bi_destructor; /* bi_destructor (bio) */
469}; 468};
470 469
471With this multipage bio design: 470With this multipage bio design:
@@ -647,10 +646,6 @@ for a non-clone bio. There are the 6 pools setup for different size biovecs,
647so bio_alloc(gfp_mask, nr_iovecs) will allocate a vec_list of the 646so bio_alloc(gfp_mask, nr_iovecs) will allocate a vec_list of the
648given size from these slabs. 647given size from these slabs.
649 648
650The bi_destructor() routine takes into account the possibility of the bio
651having originated from a different source (see later discussions on
652n/w to block transfers and kvec_cb)
653
654The bio_get() routine may be used to hold an extra reference on a bio prior 649The bio_get() routine may be used to hold an extra reference on a bio prior
655to i/o submission, if the bio fields are likely to be accessed after the 650to i/o submission, if the bio fields are likely to be accessed after the
656i/o is issued (since the bio may otherwise get freed in case i/o completion 651i/o is issued (since the bio may otherwise get freed in case i/o completion
diff --git a/Documentation/percpu-rw-semaphore.txt b/Documentation/percpu-rw-semaphore.txt
new file mode 100644
index 000000000000..7d3c82431909
--- /dev/null
+++ b/Documentation/percpu-rw-semaphore.txt
@@ -0,0 +1,27 @@
1Percpu rw semaphores
2--------------------
3
4Percpu rw semaphores is a new read-write semaphore design that is
5optimized for locking for reading.
6
7The problem with traditional read-write semaphores is that when multiple
8cores take the lock for reading, the cache line containing the semaphore
9is bouncing between L1 caches of the cores, causing performance
10degradation.
11
12Locking for reading is very fast, it uses RCU and it avoids any atomic
13instruction in the lock and unlock path. On the other hand, locking for
14writing is very expensive, it calls synchronize_rcu() that can take
15hundreds of milliseconds.
16
17The lock is declared with "struct percpu_rw_semaphore" type.
18The lock is initialized percpu_init_rwsem, it returns 0 on success and
19-ENOMEM on allocation failure.
20The lock must be freed with percpu_free_rwsem to avoid memory leak.
21
22The lock is locked for read with percpu_down_read, percpu_up_read and
23for write with percpu_down_write, percpu_up_write.
24
25The idea of using RCU for optimized rw-lock was introduced by
26Eric Dumazet <eric.dumazet@gmail.com>.
27The code was written by Mikulas Patocka <mpatocka@redhat.com>