author | Linus Torvalds <torvalds@linux-foundation.org> | 2017-07-04 00:13:25 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2017-07-04 00:13:25 -0400 |
commit | 650fc870a2ef35b83397eebd35b8c8df211bff78 (patch) | |
tree | 14a293fa894d0f166aa60f1f5ca672a2bdb312c0 /Documentation/filesystems | |
parent | f4dd029ee0b92b77769a1ac6dce03e829e74763e (diff) | |
parent | 1cb566ba5634d7593b8b2a0a5c83f1c9e14b2e09 (diff) |
Merge tag 'docs-4.13' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"There has been a fair amount of activity in the docs tree this time
around. Highlights include:
- Conversion of a bunch of security documentation into RST
- The conversion of the remaining DocBook templates by The Amazing
Mauro Machine. We can now drop the entire DocBook build chain.
- The usual collection of fixes and minor updates"
* tag 'docs-4.13' of git://git.lwn.net/linux: (90 commits)
scripts/kernel-doc: handle DECLARE_HASHTABLE
Documentation: atomic_ops.txt is core-api/atomic_ops.rst
Docs: clean up some DocBook loose ends
Make the main documentation title less Geocities
Docs: Use kernel-figure in vidioc-g-selection.rst
Docs: fix table problems in ras.rst
Docs: Fix breakage with Sphinx 1.5 and upper
Docs: Include the Latex "ifthen" package
doc/kokr/howto: Only send regression fixes after -rc1
docs-rst: fix broken links to dynamic-debug-howto in kernel-parameters
doc: Document suitability of IBM Verse for kernel development
Doc: fix a markup error in coding-style.rst
docs: driver-api: i2c: remove some outdated information
Documentation: DMA API: fix a typo in a function name
Docs: Insert missing space to separate link from text
doc/ko_KR/memory-barriers: Update control-dependencies example
Documentation, kbuild: fix typo "minimun" -> "minimum"
docs: Fix some formatting issues in request-key.rst
doc: ReSTify keys-trusted-encrypted.txt
doc: ReSTify keys-request-key.txt
...
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r-- | Documentation/filesystems/conf.py | 10 | ||||
-rw-r--r-- | Documentation/filesystems/index.rst | 317 | ||||
-rw-r--r-- | Documentation/filesystems/nfs/idmapper.txt | 2 |
3 files changed, 328 insertions, 1 deletions
diff --git a/Documentation/filesystems/conf.py b/Documentation/filesystems/conf.py new file mode 100644 index 000000000000..ea44172af5c4 --- /dev/null +++ b/Documentation/filesystems/conf.py | |||
@@ -0,0 +1,10 @@ | |||
1 | # -*- coding: utf-8; mode: python -*- | ||
2 | |||
3 | project = "Linux Filesystems API" | ||
4 | |||
5 | tags.add("subproject") | ||
6 | |||
7 | latex_documents = [ | ||
8 | ('index', 'filesystems.tex', project, | ||
9 | 'The kernel development community', 'manual'), | ||
10 | ] | ||
diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst new file mode 100644 index 000000000000..256e10eedba4 --- /dev/null +++ b/Documentation/filesystems/index.rst | |||
@@ -0,0 +1,317 @@ | |||
1 | ===================== | ||
2 | Linux Filesystems API | ||
3 | ===================== | ||
4 | |||
5 | The Linux VFS | ||
6 | ============= | ||
7 | |||
8 | The Filesystem types | ||
9 | -------------------- | ||
10 | |||
11 | .. kernel-doc:: include/linux/fs.h | ||
12 | :internal: | ||
13 | |||
14 | The Directory Cache | ||
15 | ------------------- | ||
16 | |||
17 | .. kernel-doc:: fs/dcache.c | ||
18 | :export: | ||
19 | |||
20 | .. kernel-doc:: include/linux/dcache.h | ||
21 | :internal: | ||
22 | |||
23 | Inode Handling | ||
24 | -------------- | ||
25 | |||
26 | .. kernel-doc:: fs/inode.c | ||
27 | :export: | ||
28 | |||
29 | .. kernel-doc:: fs/bad_inode.c | ||
30 | :export: | ||
31 | |||
32 | Registration and Superblocks | ||
33 | ---------------------------- | ||
34 | |||
35 | .. kernel-doc:: fs/super.c | ||
36 | :export: | ||
37 | |||
38 | File Locks | ||
39 | ---------- | ||
40 | |||
41 | .. kernel-doc:: fs/locks.c | ||
42 | :export: | ||
43 | |||
44 | .. kernel-doc:: fs/locks.c | ||
45 | :internal: | ||
46 | |||
47 | Other Functions | ||
48 | --------------- | ||
49 | |||
50 | .. kernel-doc:: fs/mpage.c | ||
51 | :export: | ||
52 | |||
53 | .. kernel-doc:: fs/namei.c | ||
54 | :export: | ||
55 | |||
56 | .. kernel-doc:: fs/buffer.c | ||
57 | :export: | ||
58 | |||
59 | .. kernel-doc:: block/bio.c | ||
60 | :export: | ||
61 | |||
62 | .. kernel-doc:: fs/seq_file.c | ||
63 | :export: | ||
64 | |||
65 | .. kernel-doc:: fs/filesystems.c | ||
66 | :export: | ||
67 | |||
68 | .. kernel-doc:: fs/fs-writeback.c | ||
69 | :export: | ||
70 | |||
71 | .. kernel-doc:: fs/block_dev.c | ||
72 | :export: | ||
73 | |||
74 | The proc filesystem | ||
75 | =================== | ||
76 | |||
77 | sysctl interface | ||
78 | ---------------- | ||
79 | |||
80 | .. kernel-doc:: kernel/sysctl.c | ||
81 | :export: | ||
82 | |||
83 | proc filesystem interface | ||
84 | ------------------------- | ||
85 | |||
86 | .. kernel-doc:: fs/proc/base.c | ||
87 | :internal: | ||
88 | |||
89 | Events based on file descriptors | ||
90 | ================================ | ||
91 | |||
92 | .. kernel-doc:: fs/eventfd.c | ||
93 | :export: | ||
94 | |||
95 | The Filesystem for Exporting Kernel Objects | ||
96 | =========================================== | ||
97 | |||
98 | .. kernel-doc:: fs/sysfs/file.c | ||
99 | :export: | ||
100 | |||
101 | .. kernel-doc:: fs/sysfs/symlink.c | ||
102 | :export: | ||
103 | |||
104 | The debugfs filesystem | ||
105 | ====================== | ||
106 | |||
107 | debugfs interface | ||
108 | ----------------- | ||
109 | |||
110 | .. kernel-doc:: fs/debugfs/inode.c | ||
111 | :export: | ||
112 | |||
113 | .. kernel-doc:: fs/debugfs/file.c | ||
114 | :export: | ||
115 | |||
116 | The Linux Journalling API | ||
117 | ========================= | ||
118 | |||
119 | Overview | ||
120 | -------- | ||
121 | |||
122 | Details | ||
123 | ~~~~~~~ | ||
124 | |||
125 | The journalling layer is easy to use. First of all you need to create a | ||
126 | journal_t data structure. There are two calls to do this, depending on | ||
127 | how you decide to allocate the physical media on which the journal | ||
128 | resides. The :c:func:`jbd2_journal_init_inode` call is for journals stored in | ||
129 | filesystem inodes, while the :c:func:`jbd2_journal_init_dev` call is for | ||
130 | journals stored on a raw device (in a contiguous range of blocks). Both | ||
131 | return a pointer to a journal_t (a typedef for a struct), so when you are | ||
132 | finally finished, call :c:func:`jbd2_journal_destroy` on it to free up | ||
133 | any used kernel memory. | ||
134 | |||
135 | Once you have your journal_t object you need to 'mount' or load the | ||
136 | journal file. The journalling layer expects the space for the journal | ||
137 | to have been allocated and initialized properly by the userspace tools. | ||
138 | When loading the journal you must call :c:func:`jbd2_journal_load` to process | ||
139 | the journal contents. If the client file system detects that the journal | ||
140 | contents do not need to be processed (or need not even be valid), it | ||
141 | may call :c:func:`jbd2_journal_wipe` to clear the journal contents before | ||
142 | calling :c:func:`jbd2_journal_load`. | ||
143 | |||
144 | Note that jbd2_journal_wipe(..,0) calls | ||
145 | :c:func:`jbd2_journal_skip_recovery` for you if it detects any outstanding | ||
146 | transactions in the journal and similarly :c:func:`jbd2_journal_load` will | ||
147 | call :c:func:`jbd2_journal_recover` if necessary. I would advise reading | ||
148 | :c:func:`ext4_load_journal` in fs/ext4/super.c for examples of this stage. | ||
149 | |||
150 | Now you can go ahead and start modifying the underlying filesystem. | ||
151 | Almost. | ||
152 | |||
153 | You still need to actually journal your filesystem changes; this is done | ||
154 | by wrapping them into transactions. You also need to wrap the | ||
155 | modification of each buffer with calls to the journal layer, so that it | ||
156 | knows which modifications you are actually making. To start a transaction, | ||
157 | use :c:func:`jbd2_journal_start`, which returns a transaction handle. | ||
158 | |||
159 | :c:func:`jbd2_journal_start` and its counterpart :c:func:`jbd2_journal_stop`, | ||
160 | which indicates the end of a transaction, are nestable calls, so you can | ||
161 | reenter a transaction if necessary, but remember you must call | ||
162 | :c:func:`jbd2_journal_stop` the same number of times as | ||
163 | :c:func:`jbd2_journal_start` before the transaction is completed (or more | ||
164 | accurately leaves the update phase). Ext4/VFS makes use of this feature to | ||
165 | simplify handling of inode dirtying, quota support, etc. | ||
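The nesting rule above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `toy_task`, `toy_handle`, `toy_start` and `toy_stop` are hypothetical names invented for illustration, not the kernel's types or functions, and the model ignores journals, credits and blocking entirely. It only shows that a nested start returns the handle already open for the task, and that the transaction leaves its update phase at the outermost stop.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration; NOT the kernel's handle_t or task_struct. */
struct toy_handle {
    int ref;        /* nesting depth of start calls */
    int in_update;  /* 1 while the transaction is still in its update phase */
};

struct toy_task {
    struct toy_handle *current_handle;  /* NULL when no transaction is open */
    struct toy_handle storage;          /* backing storage, avoids malloc */
};

/* Start a transaction, or re-enter the one this task already has open. */
struct toy_handle *toy_start(struct toy_task *t)
{
    if (t->current_handle) {
        t->current_handle->ref++;       /* nested call: same handle, deeper */
        return t->current_handle;
    }
    t->storage.ref = 1;
    t->storage.in_update = 1;
    t->current_handle = &t->storage;
    return t->current_handle;
}

/* Returns 1 if this call closed the outermost level, 0 otherwise. */
int toy_stop(struct toy_task *t)
{
    struct toy_handle *h = t->current_handle;

    assert(h != NULL);                  /* a stop without a start is a bug */
    if (--h->ref > 0)
        return 0;                       /* still inside an outer start */
    h->in_update = 0;                   /* outermost stop: update phase ends */
    t->current_handle = NULL;
    return 1;
}
```

In this model, calling `toy_start` twice and `toy_stop` once leaves `in_update` set; only the second `toy_stop` clears it, which mirrors the requirement that stops must balance starts.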
166 | |||
167 | Inside each transaction you need to wrap the modifications to the | ||
168 | individual buffers (blocks). Before you start to modify a buffer you | ||
169 | need to call :c:func:`jbd2_journal_get_create_access()` / | ||
170 | :c:func:`jbd2_journal_get_write_access()` / | ||
171 | :c:func:`jbd2_journal_get_undo_access()` as appropriate; this allows the | ||
172 | journalling layer to copy the unmodified | ||
173 | data if it needs to. After all, the buffer may be part of a previous, | ||
174 | uncommitted transaction. At this point you are at last ready to modify a | ||
175 | buffer, and once you have done so you need to call | ||
176 | :c:func:`jbd2_journal_dirty_metadata`. Or, if you've asked for access to a | ||
177 | buffer you now know is no longer required to be pushed back to the | ||
178 | device you can call :c:func:`jbd2_journal_forget` in much the same way as you | ||
179 | might have used :c:func:`bforget` in the past. | ||
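The access / modify / dirty ordering described above, and the reason the journalling layer may need to copy the unmodified data, can be sketched in user space. Everything here is an assumption made for illustration: `toy_buffer` and the `toy_*` functions are hypothetical and are not the kernel's buffer_head, journal_head or JBD2 calls. The model keeps a snapshot of the old contents only when the buffer still belongs to an older, uncommitted transaction.

```c
#include <assert.h>
#include <string.h>

#define TOY_BLOCK_SIZE 8

/* Toy buffer for illustration; NOT the kernel's struct buffer_head. */
struct toy_buffer {
    char data[TOY_BLOCK_SIZE];    /* live contents the caller modifies */
    char frozen[TOY_BLOCK_SIZE];  /* snapshot kept for the older transaction */
    int in_older_txn;             /* still owned by an uncommitted transaction */
    int has_frozen;               /* frozen[] holds a valid snapshot */
    int access_granted;           /* write access has been requested */
    int dirty;                    /* marked as modified metadata */
};

/* Step 1: declare intent to modify; snapshot the old data if still needed. */
void toy_get_write_access(struct toy_buffer *b)
{
    if (b->in_older_txn && !b->has_frozen) {
        memcpy(b->frozen, b->data, TOY_BLOCK_SIZE);
        b->has_frozen = 1;
    }
    b->access_granted = 1;
}

/* Step 2: modify the buffer; only legal after step 1. */
void toy_modify(struct toy_buffer *b, const char *src, size_t len)
{
    assert(b->access_granted);
    assert(len <= TOY_BLOCK_SIZE);
    memcpy(b->data, src, len);
}

/* Step 3: tell the journal that the buffer now holds modified metadata. */
void toy_dirty_metadata(struct toy_buffer *b)
{
    assert(b->access_granted);
    b->dirty = 1;
}
```

The asserts encode the ordering rule: modifying or dirtying a buffer before asking for access is treated as a programming error in this sketch.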
180 | |||
181 | A :c:func:`jbd2_journal_flush` may be called at any time to commit and | ||
182 | checkpoint all your transactions. | ||
183 | |||
184 | Then at umount time, in your :c:func:`put_super`, you can call | ||
185 | :c:func:`jbd2_journal_destroy` to clean up your in-core journal object. | ||
186 | |||
187 | Unfortunately there are a couple of ways the journal layer can cause a | ||
188 | deadlock. The first thing to note is that each task can only have a | ||
189 | single outstanding transaction at any one time; remember nothing commits | ||
190 | until the outermost :c:func:`jbd2_journal_stop`. This means you must complete | ||
191 | the transaction at the end of each file/inode/address etc. operation you | ||
192 | perform, so that the journalling system isn't re-entered on another | ||
193 | journal. Transactions can't be nested/batched across differing | ||
194 | journals, and a filesystem other than yours (say ext4) may be | ||
195 | modified in a later syscall. | ||
196 | |||
197 | The second case to bear in mind is that :c:func:`jbd2_journal_start` can block | ||
198 | if there isn't enough space in the journal for your transaction (based | ||
199 | on the passed nblocks param) - when it blocks it merely(!) needs to wait | ||
200 | for transactions to complete and be committed from other tasks, so | ||
201 | essentially we are waiting for :c:func:`jbd2_journal_stop`. So to avoid | ||
202 | deadlocks you must treat :c:func:`jbd2_journal_start` / | ||
203 | :c:func:`jbd2_journal_stop` as if they were semaphores and include them in | ||
204 | your semaphore ordering rules. | ||
205 | Note that :c:func:`jbd2_journal_extend` has similar blocking | ||
206 | behaviour to :c:func:`jbd2_journal_start` so you can deadlock here just as | ||
207 | easily as on :c:func:`jbd2_journal_start`. | ||
208 | |||
209 | Try to reserve the right number of blocks the first time ;-). This will | ||
210 | be the maximum number of blocks you are going to touch in this | ||
211 | transaction. I advise having a look at ext4_jbd.h, at least, to see the | ||
212 | basis on which ext4 makes these decisions. | ||
213 | |||
214 | Another wriggle to watch out for is your on-disk block allocation | ||
215 | strategy. Why? Because, if you do a delete, you need to ensure you | ||
216 | haven't reused any of the freed blocks until the transaction freeing | ||
217 | these blocks commits. If you reuse these blocks and a crash happens, | ||
218 | there is no way to restore the contents of the reallocated blocks as of the | ||
219 | end of the last fully committed transaction. One simple way of doing | ||
220 | this is to mark blocks as free in internal in-memory block allocation | ||
221 | structures only after the transaction freeing them commits. Ext4 uses | ||
222 | a journal commit callback for this purpose. | ||
223 | |||
224 | With journal commit callbacks you can ask the journalling layer to call | ||
225 | a callback function when the transaction is finally committed to disk, | ||
226 | so that you can do some of your own management. You ask the journalling | ||
227 | layer to call the callback by simply setting the | ||
228 | ``journal->j_commit_callback`` function pointer; that function is then | ||
229 | called after each transaction commit. You can also use | ||
230 | ``transaction->t_private_list`` for attaching entries to a transaction | ||
231 | that need processing when the transaction commits. | ||
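The deferred-free strategy and the commit callback can be modelled together in a few lines of user-space C. This is a sketch only: `toy_journal`, `toy_txn` and the `toy_*` functions are hypothetical names, the `pending` array merely stands in for the idea of a per-transaction private list, and nothing here reflects the real JBD2 structures or ext4's implementation. The point is that a freed block stays marked as in use in the in-memory map until the commit callback runs.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NBLOCKS 16      /* size of the toy allocation map */
#define TOY_MAX_PENDING 8   /* max blocks one toy transaction may free */

/* Toy transaction: remembers blocks freed while it was running. */
struct toy_txn {
    unsigned pending[TOY_MAX_PENDING];
    size_t npending;
};

/* Toy journal: a commit callback plus an in-memory allocation map. */
struct toy_journal {
    void (*commit_callback)(struct toy_journal *j, struct toy_txn *txn);
    unsigned char in_use[TOY_NBLOCKS];  /* 1 = block may not be reallocated */
};

/* Record a freed block; it stays marked in use until the commit. */
int toy_free_block(struct toy_txn *txn, unsigned blk)
{
    if (blk >= TOY_NBLOCKS || txn->npending == TOY_MAX_PENDING)
        return -1;
    txn->pending[txn->npending++] = blk;
    return 0;
}

/* Commit callback: only now do the freed blocks become reusable. */
void toy_release_freed(struct toy_journal *j, struct toy_txn *txn)
{
    for (size_t i = 0; i < txn->npending; i++) {
        assert(txn->pending[i] < TOY_NBLOCKS);
        j->in_use[txn->pending[i]] = 0;
    }
    txn->npending = 0;
}

/* Pretend the transaction reached disk, then run the callback if set. */
void toy_commit(struct toy_journal *j, struct toy_txn *txn)
{
    if (j->commit_callback)
        j->commit_callback(j, txn);
}
```

Because the allocator in this model only looks at `in_use`, a crash before `toy_commit` can never have handed a freed block to someone else.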
232 | |||
233 | JBD2 also provides a way to block all transaction updates via | ||
234 | :c:func:`jbd2_journal_lock_updates()` / | ||
235 | :c:func:`jbd2_journal_unlock_updates()`. Ext4 uses this when it wants a | ||
236 | window with a clean and stable fs for a moment. E.g. | ||
237 | |||
238 | :: | ||
239 | |||
240 | |||
241 | jbd2_journal_lock_updates(journal);   /* stop new stuff happening */ | ||
242 | jbd2_journal_flush(journal);          /* checkpoint everything */ | ||
243 | /* ... do stuff on the stable fs ... */ | ||
244 | jbd2_journal_unlock_updates(journal); /* carry on with filesystem use */ | ||
245 | |||
246 | The opportunities for abuse and DoS attacks with this should be obvious, | ||
247 | if you allow unprivileged userspace to trigger codepaths containing | ||
248 | these calls. | ||
249 | |||
250 | Summary | ||
251 | ~~~~~~~ | ||
252 | |||
253 | Using the journal is a matter of wrapping the different context changes | ||
254 | (each mount, each modification or transaction, and each changed | ||
255 | buffer) to tell the journalling layer about them. | ||
256 | |||
257 | Data Types | ||
258 | ---------- | ||
259 | |||
260 | The journalling layer uses typedefs to 'hide' the concrete definitions | ||
261 | of the structures used. As a client of the JBD2 layer you can just rely | ||
262 | on using the pointer as a magic cookie of some sort. Obviously the | ||
263 | hiding is not enforced as this is 'C'. | ||
264 | |||
265 | Structures | ||
266 | ~~~~~~~~~~ | ||
267 | |||
268 | .. kernel-doc:: include/linux/jbd2.h | ||
269 | :internal: | ||
270 | |||
271 | Functions | ||
272 | --------- | ||
273 | |||
274 | The functions here are split into two groups: those that affect a journal | ||
275 | as a whole, and those which are used to manage transactions. | ||
276 | |||
277 | Journal Level | ||
278 | ~~~~~~~~~~~~~ | ||
279 | |||
280 | .. kernel-doc:: fs/jbd2/journal.c | ||
281 | :export: | ||
282 | |||
283 | .. kernel-doc:: fs/jbd2/recovery.c | ||
284 | :internal: | ||
285 | |||
286 | Transaction Level | ||
287 | ~~~~~~~~~~~~~~~~~ | ||
288 | |||
289 | .. kernel-doc:: fs/jbd2/transaction.c | ||
290 | |||
291 | See also | ||
292 | -------- | ||
293 | |||
294 | `Journaling the Linux ext2fs Filesystem, LinuxExpo 98, Stephen | ||
295 | Tweedie <http://kernel.org/pub/linux/kernel/people/sct/ext3/journal-design.ps.gz>`__ | ||
296 | |||
297 | `Ext3 Journalling FileSystem, OLS 2000, Dr. Stephen | ||
298 | Tweedie <http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html>`__ | ||
299 | |||
300 | splice API | ||
301 | ========== | ||
302 | |||
303 | splice is a method for moving blocks of data around inside the kernel, | ||
304 | without continually transferring them between the kernel and user space. | ||
305 | |||
306 | .. kernel-doc:: fs/splice.c | ||
307 | |||
308 | pipes API | ||
309 | ========= | ||
310 | |||
311 | Pipe interfaces are all for in-kernel (builtin image) use. They are not | ||
312 | exported for use by modules. | ||
313 | |||
314 | .. kernel-doc:: include/linux/pipe_fs_i.h | ||
315 | :internal: | ||
316 | |||
317 | .. kernel-doc:: fs/pipe.c | ||
diff --git a/Documentation/filesystems/nfs/idmapper.txt b/Documentation/filesystems/nfs/idmapper.txt index fe03d10bb79a..b86831acd583 100644 --- a/Documentation/filesystems/nfs/idmapper.txt +++ b/Documentation/filesystems/nfs/idmapper.txt | |||
@@ -55,7 +55,7 @@ request-key will find the first matching line and corresponding program. In | |||
55 | this case, /some/other/program will handle all uid lookups and | 55 | this case, /some/other/program will handle all uid lookups and |
56 | /usr/sbin/nfs.idmap will handle gid, user, and group lookups. | 56 | /usr/sbin/nfs.idmap will handle gid, user, and group lookups. |
57 | 57 | ||
58 | See <file:Documentation/security/keys-request-key.txt> for more information | 58 | See <file:Documentation/security/keys/request-key.rst> for more information |
59 | about the request-key function. | 59 | about the request-key function. |
60 | 60 | ||
61 | 61 | ||