author | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 18:20:36 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 18:20:36 -0400 |
commit | 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 (patch) | |
tree | 0bba044c4ce775e45a88a51686b5d9f90697ea9d /Documentation/rpc-cache.txt |
Linux-2.6.12-rc2 (tag: v2.6.12-rc2)
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
Diffstat (limited to 'Documentation/rpc-cache.txt')
-rw-r--r-- | Documentation/rpc-cache.txt | 171 |
1 files changed, 171 insertions, 0 deletions
diff --git a/Documentation/rpc-cache.txt b/Documentation/rpc-cache.txt
new file mode 100644
index 000000000000..2b5d4434fa5a
--- /dev/null
+++ b/Documentation/rpc-cache.txt
@@ -0,0 +1,171 @@
1 | This document gives a brief introduction to the caching | ||
2 | mechanisms in the sunrpc layer that are used, in particular, | ||
3 | for NFS authentication. | ||
4 | |||
5 | CACHES | ||
6 | ====== | ||
7 | The caching replaces the old exports table and allows for | ||
8 | a wide variety of values to be cached. | ||
9 | |||
10 | There are a number of caches that are similar in structure though | ||
11 | quite possibly very different in content and use. There is a corpus | ||
12 | of common code for managing these caches. | ||
13 | |||
14 | Examples of caches that are likely to be needed are: | ||
15 | - mapping from IP address to client name | ||
16 | - mapping from client name and filesystem to export options | ||
17 | - mapping from UID to list of GIDs, to work around NFS's limitation | ||
18 | of 16 gids. | ||
19 | - mappings between local UID/GID and remote UID/GID for sites that | ||
20 | do not have uniform uid assignment | ||
21 | - mapping from network identity to public key for crypto authentication. | ||
22 | |||
23 | The common code handles such things as: | ||
24 | - general cache lookup with correct locking | ||
25 | - supporting 'NEGATIVE' as well as positive entries | ||
26 | - allowing an EXPIRED time on cache items, and removing | ||
27 | items after they expire and are no longer in use. | ||
28 | |||
29 | Future code extensions are expected to handle: | ||
30 | - making requests to user-space to fill in cache entries | ||
31 | - allowing user-space to directly set entries in the cache | ||
32 | - delaying RPC requests that depend on as-yet incomplete | ||
33 | cache entries, and replaying those requests when the cache entry | ||
34 | is complete. | ||
35 | - maintaining last-access times on cache entries | ||
36 | - cleaning out old entries when the caches become full | ||
37 | |||
38 | The code for performing a cache lookup is also common, but in the form | ||
39 | of a template, i.e. a #define. | ||
40 | Each cache defines a lookup function by using the DefineCacheLookup | ||
41 | macro, or the simpler DefineSimpleCacheLookup macro. | ||
42 | |||
43 | Creating a Cache | ||
44 | ---------------- | ||
45 | |||
46 | 1/ A cache needs a datum to cache. This is in the form of a | ||
47 | structure definition that must contain a | ||
48 | struct cache_head | ||
49 | as an element, usually the first. | ||
50 | It will also contain a key and some content. | ||
51 | Each cache element is reference counted and contains | ||
52 | expiry and update times for use in cache management. | ||
53 | 2/ A cache needs a "cache_detail" structure that | ||
54 | describes the cache. This stores the hash table, and some | ||
55 | parameters for cache management. | ||
56 | 3/ A cache needs a lookup function. This is created using | ||
57 | the DefineCacheLookup macro. This lookup function is used both | ||
58 | to find entries and to update entries. The normal mode for | ||
59 | updating an entry is to replace the old entry with a new | ||
60 | entry. However, it is possible to allow update-in-place | ||
61 | for those caches where it makes sense (no atomicity issues | ||
62 | or indirect reference counting issues). | ||
63 | 4/ A cache needs to be registered using cache_register(). This | ||
64 | includes it on a list of caches that will be regularly | ||
65 | cleaned to discard old data. For this to work, some | ||
66 | thread must periodically call cache_clean. | ||
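
As a rough user-space illustration of step 1, the sketch below embeds a head structure in a datum and recovers the datum from a pointer to its head. Everything in it is an assumption made for illustration: the struct cache_head shown is a simplified stand-in and not the kernel definition, and struct ip_map_sketch, datum_of() and cache_item_discardable() are hypothetical names. The discard rule simply restates "expired and no longer in use" from the list of common-code duties.

```c
#include <stddef.h>
#include <time.h>

/* Simplified stand-in for the kernel's struct cache_head.
 * This is NOT the real definition, only the fields the text mentions. */
struct cache_head {
    struct cache_head *next;  /* hash-chain link (assumed) */
    time_t expiry_time;       /* when the item stops being valid */
    time_t last_refresh;      /* when the item was last updated */
    int refcnt;               /* current users of the item (assumed meaning) */
};

/* Hypothetical datum: maps an IP address (key) to a client name (content). */
struct ip_map_sketch {
    struct cache_head h;      /* embedded head, usually the first member */
    char addr[46];            /* key */
    char client[64];          /* content */
};

/* Recover the enclosing datum from a pointer to its embedded head. */
#define datum_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An item may be dropped once it has expired and nobody is using it. */
int cache_item_discardable(const struct cache_head *h, time_t now)
{
    return h->expiry_time <= now && h->refcnt == 0;
}
```

Common code can then work purely in terms of struct cache_head pointers, and each cache converts back to its own datum type only where it needs the key or content.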
67 | |||
68 | Using a cache | ||
69 | ------------- | ||
70 | |||
71 | To find a value in a cache, call the lookup function, passing it the | ||
72 | datum which contains the key, and possibly content, and a flag saying | ||
73 | whether to update the cache with new data from the datum. Depending | ||
74 | on how the cache lookup function was defined, it may take an extra | ||
75 | argument to identify the particular cache in question. | ||
76 | |||
77 | Except in cases of kmalloc failure, the lookup function | ||
78 | will return a new datum which will store the key and | ||
79 | may contain valid content, or may not. | ||
80 | This datum is typically passed to cache_check which determines the | ||
81 | validity of the datum and may later initiate an upcall to fill | ||
82 | in the data. | ||
83 | |||
84 | cache_check can be passed a "struct cache_req *". This structure is | ||
85 | typically embedded in the actual request and can be used to create a | ||
86 | deferred copy of the request (struct cache_deferred_req). This is | ||
87 | done when the found cache item is not up to date, but there is reason to | ||
88 | believe that userspace might provide information soon. When the cache | ||
89 | item does become valid, the deferred copy of the request will be | ||
90 | revisited (->revisit). It is expected that this method will | ||
91 | reschedule the request for processing. | ||
92 | |||
93 | |||
94 | Populating a cache | ||
95 | ------------------ | ||
96 | |||
97 | Each cache has a name, and when the cache is registered, a directory | ||
98 | with that name is created in /proc/net/rpc. | ||
99 | |||
100 | This directory contains a file called 'channel' which is a channel | ||
101 | for communicating between kernel and user for populating the cache. | ||
102 | This directory may later contain other files for interacting | ||
103 | with the cache. | ||
104 | |||
105 | The 'channel' works a bit like a datagram socket. Each 'write' is | ||
106 | passed as a whole to the cache for parsing and interpretation. | ||
107 | Each cache can treat the write requests differently, but it is | ||
108 | expected that a message written will contain: | ||
109 | - a key | ||
110 | - an expiry time | ||
111 | - some content | ||
112 | with the intention that an item in the cache with the given key | ||
113 | should be created or updated to have the given content, and the | ||
114 | expiry time should be set on that item. | ||
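
A minimal user-space sketch of building such a message follows. It assumes a layout of key, expiry time and content separated by single spaces and ended by a newline, with the expiry written as a decimal number; the document leaves the exact representation to each cache, so treat this layout as an assumption. build_update() and append_quoted() are hypothetical helper names, and the octal quoting follows the convention described under "request/response format".

```c
#include <stddef.h>
#include <stdio.h>

/* Append src at buf[*pos], octal-quoting space, newline and backslash.
 * Keeps buf NUL-terminated. Returns 0, or -1 if buf is too small. */
static int append_quoted(char *buf, size_t cap, size_t *pos, const char *src)
{
    for (; *src; src++) {
        unsigned char c = (unsigned char)*src;

        if (c == ' ' || c == '\n' || c == '\\') {
            if (cap - *pos < 5)          /* 4 characters plus the NUL */
                return -1;
            snprintf(buf + *pos, 5, "\\%03o", (unsigned)c);
            *pos += 4;
        } else {
            if (cap - *pos < 2)          /* 1 character plus the NUL */
                return -1;
            buf[(*pos)++] = (char)c;
            buf[*pos] = '\0';
        }
    }
    return 0;
}

/* Build "key expiry content\n" (assumed layout) in buf.
 * Returns the message length, or -1 if it does not fit. */
long build_update(char *buf, size_t cap, const char *key,
                  long expiry, const char *content)
{
    size_t pos = 0;
    int n;

    if (cap == 0)
        return -1;
    buf[0] = '\0';
    if (append_quoted(buf, cap, &pos, key) < 0)
        return -1;
    n = snprintf(buf + pos, cap - pos, " %ld ", expiry);
    if (n < 0 || (size_t)n >= cap - pos)
        return -1;
    pos += (size_t)n;
    if (append_quoted(buf, cap, &pos, content) < 0)
        return -1;
    if (cap - pos < 2)
        return -1;
    buf[pos++] = '\n';
    buf[pos] = '\0';
    return (long)pos;
}
```

The whole message should then be handed to the channel in a single write() call, since each write is parsed as one unit.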
115 | |||
116 | Reading from a channel is a bit more interesting. When a cache | ||
117 | lookup fails, or when it succeeds but finds an entry that may soon | ||
118 | expire, a request is lodged for that cache item to be updated by | ||
119 | user-space. These requests appear in the channel file. | ||
120 | |||
121 | Successive reads will return successive requests. | ||
122 | If there are no more requests to return, read will return EOF, but a | ||
123 | select or poll for read will block waiting for another request to be | ||
124 | added. | ||
125 | |||
126 | Thus a user-space helper is likely to: | ||
127 | open the channel. | ||
128 | select for readable | ||
129 | read a request | ||
130 | write a response | ||
131 | loop. | ||
132 | |||
133 | If it dies and needs to be restarted, any requests that have not been | ||
134 | answered will still appear in the file and will be read by the new | ||
135 | instance of the helper. | ||
136 | |||
137 | Each cache should define a "cache_parse" method which takes a message | ||
138 | written from user-space and processes it. It should return an error | ||
139 | (which propagates back to the write syscall) or 0. | ||
140 | |||
141 | Each cache should also define a "cache_request" method which | ||
142 | takes a cache item and encodes a request into the buffer | ||
143 | provided. | ||
144 | |||
145 | |||
146 | Note: If a cache has no active readers on the channel, and has had no | ||
147 | active readers for more than 60 seconds, further requests will not be | ||
148 | added to the channel but instead all lookups that do not find a valid | ||
149 | entry will fail. This is partly for backward compatibility: The | ||
150 | previous nfs exports table was deemed to be authoritative and a | ||
151 | failed lookup meant a definite 'no'. | ||
152 | |||
153 | request/response format | ||
154 | ----------------------- | ||
155 | |||
156 | While each cache is free to use its own format for requests | ||
157 | and responses over the channel, the following is recommended as | ||
158 | appropriate, and support routines are available to help: | ||
159 | Each request or response record should be printable ASCII | ||
160 | with precisely one newline character which should be at the end. | ||
161 | Fields within the record should be separated by spaces, normally one. | ||
162 | If spaces, newlines, or nul characters are needed in a field they | ||
163 | must be quoted. Two mechanisms are available: | ||
164 | 1/ If a field begins '\x' then it must contain an even number of | ||
165 | hex digits, and pairs of these digits provide the bytes in the | ||
166 | field. | ||
167 | 2/ Otherwise a \ in the field must be followed by 3 octal digits | ||
168 | which give the code for a byte. Other characters are treated | ||
169 | as themselves. At the very least, space, newline, nul, and | ||
170 | '\' must be quoted in this way. | ||
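
The two quoting mechanisms can be illustrated with a small user-space decoder. This is a sketch written from the rules above and is not the kernel's support routine; decode_field() is a hypothetical name, the input is a single field with separators already removed, and rejecting octal values above 255 is an assumption.

```c
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

static int is_octal(char c)
{
    return c >= '0' && c <= '7';
}

/* Decode one quoted field of in_len characters into out.
 * Returns the number of decoded bytes, or -1 if the field is
 * malformed or out is too small. */
long decode_field(const char *in, size_t in_len,
                  unsigned char *out, size_t cap)
{
    size_t i = 0, n = 0;

    /* Mechanism 1: "\x" followed by an even number of hex digits. */
    if (in_len >= 2 && in[0] == '\\' && in[1] == 'x') {
        if ((in_len - 2) % 2 != 0)
            return -1;
        for (i = 2; i < in_len; i += 2) {
            int hi = hex_val(in[i]), lo = hex_val(in[i + 1]);

            if (hi < 0 || lo < 0 || n >= cap)
                return -1;
            out[n++] = (unsigned char)(hi * 16 + lo);
        }
        return (long)n;
    }

    /* Mechanism 2: "\" plus three octal digits; the rest is literal. */
    while (i < in_len) {
        if (n >= cap)
            return -1;
        if (in[i] == '\\') {
            int v;

            if (in_len - i < 4 || !is_octal(in[i + 1]) ||
                !is_octal(in[i + 2]) || !is_octal(in[i + 3]))
                return -1;
            v = (in[i + 1] - '0') * 64 + (in[i + 2] - '0') * 8 +
                (in[i + 3] - '0');
            if (v > 255)        /* assumption: must fit in one byte */
                return -1;
            out[n++] = (unsigned char)v;
            i += 4;
        } else {
            out[n++] = (unsigned char)in[i++];
        }
    }
    return (long)n;
}
```

For instance, the fields \x6162 and a\040b decode to the bytes "ab" and "a b" respectively, while \x616 is rejected because it has an odd number of hex digits.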
171 | |||