diff options
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r-- | Documentation/filesystems/00-INDEX | 2 | ||||
-rw-r--r-- | Documentation/filesystems/seq_file.txt | 283 |
2 files changed, 285 insertions, 0 deletions
diff --git a/Documentation/filesystems/00-INDEX b/Documentation/filesystems/00-INDEX index e68021c08fbd..e731196410b3 100644 --- a/Documentation/filesystems/00-INDEX +++ b/Documentation/filesystems/00-INDEX | |||
@@ -82,6 +82,8 @@ relay.txt | |||
82 | - info on relay, for efficient streaming from kernel to user space. | 82 | - info on relay, for efficient streaming from kernel to user space. |
83 | romfs.txt | 83 | romfs.txt |
84 | - description of the ROMFS filesystem. | 84 | - description of the ROMFS filesystem. |
85 | seq_file.txt | ||
86 | - how to use the seq_file API | ||
85 | sharedsubtree.txt | 87 | sharedsubtree.txt |
86 | - a description of shared subtrees for namespaces. | 88 | - a description of shared subtrees for namespaces. |
87 | smbfs.txt | 89 | smbfs.txt |
diff --git a/Documentation/filesystems/seq_file.txt b/Documentation/filesystems/seq_file.txt new file mode 100644 index 000000000000..92975ee7942c --- /dev/null +++ b/Documentation/filesystems/seq_file.txt | |||
@@ -0,0 +1,283 @@ | |||
1 | The seq_file interface | ||
2 | |||
3 | Copyright 2003 Jonathan Corbet <corbet@lwn.net> | ||
4 | This file is originally from the LWN.net Driver Porting series at | ||
5 | http://lwn.net/Articles/driver-porting/ | ||
6 | |||
7 | |||
8 | There are numerous ways for a device driver (or other kernel component) to | ||
9 | provide information to the user or system administrator. One useful | ||
10 | technique is the creation of virtual files, in debugfs, /proc or elsewhere. | ||
11 | Virtual files can provide human-readable output that is easy to get at | ||
12 | without any special utility programs; they can also make life easier for | ||
13 | script writers. It is not surprising that the use of virtual files has | ||
14 | grown over the years. | ||
15 | |||
16 | Creating those files correctly has always been a bit of a challenge, | ||
17 | however. It is not that hard to make a virtual file which returns a | ||
18 | string. But life gets trickier if the output is long - anything greater | ||
19 | than an application is likely to read in a single operation. Handling | ||
20 | multiple reads (and seeks) requires careful attention to the reader's | ||
21 | position within the virtual file - that position is, likely as not, in the | ||
22 | middle of a line of output. The kernel has traditionally had a number of | ||
23 | implementations that got this wrong. | ||
24 | |||
25 | The 2.6 kernel contains a set of functions (implemented by Alexander Viro) | ||
26 | which are designed to make it easy for virtual file creators to get it | ||
27 | right. | ||
28 | |||
29 | The seq_file interface is available via <linux/seq_file.h>. There are | ||
30 | three aspects to seq_file: | ||
31 | |||
32 | * An iterator interface which lets a virtual file implementation | ||
33 | step through the objects it is presenting. | ||
34 | |||
35 | * Some utility functions for formatting objects for output without | ||
36 | needing to worry about things like output buffers. | ||
37 | |||
38 | * A set of canned file_operations which implement most operations on | ||
39 | the virtual file. | ||
40 | |||
41 | We'll look at the seq_file interface via an extremely simple example: a | ||
42 | loadable module which creates a file called /proc/sequence. The file, when | ||
43 | read, simply produces a set of increasing integer values, one per line. The | ||
44 | sequence will continue until the user loses patience and finds something | ||
45 | better to do. The file is seekable, in that one can do something like the | ||
46 | following: | ||
47 | |||
48 | dd if=/proc/sequence of=out1 count=1 | ||
49 | dd if=/proc/sequence skip=1 out=out2 count=1 | ||
50 | |||
51 | Then concatenate the output files out1 and out2 and get the right | ||
52 | result. Yes, it is a thoroughly useless module, but the point is to show | ||
53 | how the mechanism works without getting lost in other details. (Those | ||
54 | wanting to see the full source for this module can find it at | ||
55 | http://lwn.net/Articles/22359/). | ||
56 | |||
57 | |||
58 | The iterator interface | ||
59 | |||
60 | Modules implementing a virtual file with seq_file must implement a simple | ||
61 | iterator object that allows stepping through the data of interest. | ||
62 | Iterators must be able to move to a specific position - like the file they | ||
63 | implement - but the interpretation of that position is up to the iterator | ||
64 | itself. A seq_file implementation that is formatting firewall rules, for | ||
65 | example, could interpret position N as the Nth rule in the chain. | ||
66 | Positioning can thus be done in whatever way makes the most sense for the | ||
67 | generator of the data, which need not be aware of how a position translates | ||
68 | to an offset in the virtual file. The one obvious exception is that a | ||
69 | position of zero should indicate the beginning of the file. | ||
70 | |||
71 | The /proc/sequence iterator just uses the count of the next number it | ||
72 | will output as its position. | ||
73 | |||
74 | Four functions must be implemented to make the iterator work. The first, | ||
75 | called start() takes a position as an argument and returns an iterator | ||
76 | which will start reading at that position. For our simple sequence example, | ||
77 | the start() function looks like: | ||
78 | |||
79 | static void *ct_seq_start(struct seq_file *s, loff_t *pos) | ||
80 | { | ||
81 | loff_t *spos = kmalloc(sizeof(loff_t), GFP_KERNEL); | ||
82 | if (! spos) | ||
83 | return NULL; | ||
84 | *spos = *pos; | ||
85 | return spos; | ||
86 | } | ||
87 | |||
88 | The entire data structure for this iterator is a single loff_t value | ||
89 | holding the current position. There is no upper bound for the sequence | ||
90 | iterator, but that will not be the case for most other seq_file | ||
91 | implementations; in most cases the start() function should check for a | ||
92 | "past end of file" condition and return NULL if need be. | ||
93 | |||
94 | For more complicated applications, the private field of the seq_file | ||
95 | structure can be used. There is also a special value whch can be returned | ||
96 | by the start() function called SEQ_START_TOKEN; it can be used if you wish | ||
97 | to instruct your show() function (described below) to print a header at the | ||
98 | top of the output. SEQ_START_TOKEN should only be used if the offset is | ||
99 | zero, however. | ||
100 | |||
101 | The next function to implement is called, amazingly, next(); its job is to | ||
102 | move the iterator forward to the next position in the sequence. The | ||
103 | example module can simply increment the position by one; more useful | ||
104 | modules will do what is needed to step through some data structure. The | ||
105 | next() function returns a new iterator, or NULL if the sequence is | ||
106 | complete. Here's the example version: | ||
107 | |||
108 | static void *ct_seq_next(struct seq_file *s, void *v, loff_t *pos) | ||
109 | { | ||
110 | loff_t *spos = (loff_t *) v; | ||
111 | *pos = ++(*spos); | ||
112 | return spos; | ||
113 | } | ||
114 | |||
115 | The stop() function is called when iteration is complete; its job, of | ||
116 | course, is to clean up. If dynamic memory is allocated for the iterator, | ||
117 | stop() is the place to free it. | ||
118 | |||
119 | static void ct_seq_stop(struct seq_file *s, void *v) | ||
120 | { | ||
121 | kfree(v); | ||
122 | } | ||
123 | |||
124 | Finally, the show() function should format the object currently pointed to | ||
125 | by the iterator for output. It should return zero, or an error code if | ||
126 | something goes wrong. The example module's show() function is: | ||
127 | |||
128 | static int ct_seq_show(struct seq_file *s, void *v) | ||
129 | { | ||
130 | loff_t *spos = (loff_t *) v; | ||
131 | seq_printf(s, "%Ld\n", *spos); | ||
132 | return 0; | ||
133 | } | ||
134 | |||
135 | We will look at seq_printf() in a moment. But first, the definition of the | ||
136 | seq_file iterator is finished by creating a seq_operations structure with | ||
137 | the four functions we have just defined: | ||
138 | |||
139 | static struct seq_operations ct_seq_ops = { | ||
140 | .start = ct_seq_start, | ||
141 | .next = ct_seq_next, | ||
142 | .stop = ct_seq_stop, | ||
143 | .show = ct_seq_show | ||
144 | }; | ||
145 | |||
146 | This structure will be needed to tie our iterator to the /proc file in | ||
147 | a little bit. | ||
148 | |||
149 | It's worth noting that the interator value returned by start() and | ||
150 | manipulated by the other functions is considered to be completely opaque by | ||
151 | the seq_file code. It can thus be anything that is useful in stepping | ||
152 | through the data to be output. Counters can be useful, but it could also be | ||
153 | a direct pointer into an array or linked list. Anything goes, as long as | ||
154 | the programmer is aware that things can happen between calls to the | ||
155 | iterator function. However, the seq_file code (by design) will not sleep | ||
156 | between the calls to start() and stop(), so holding a lock during that time | ||
157 | is a reasonable thing to do. The seq_file code will also avoid taking any | ||
158 | other locks while the iterator is active. | ||
159 | |||
160 | |||
161 | Formatted output | ||
162 | |||
163 | The seq_file code manages positioning within the output created by the | ||
164 | iterator and getting it into the user's buffer. But, for that to work, that | ||
165 | output must be passed to the seq_file code. Some utility functions have | ||
166 | been defined which make this task easy. | ||
167 | |||
168 | Most code will simply use seq_printf(), which works pretty much like | ||
169 | printk(), but which requires the seq_file pointer as an argument. It is | ||
170 | common to ignore the return value from seq_printf(), but a function | ||
171 | producing complicated output may want to check that value and quit if | ||
172 | something non-zero is returned; an error return means that the seq_file | ||
173 | buffer has been filled and further output will be discarded. | ||
174 | |||
175 | For straight character output, the following functions may be used: | ||
176 | |||
177 | int seq_putc(struct seq_file *m, char c); | ||
178 | int seq_puts(struct seq_file *m, const char *s); | ||
179 | int seq_escape(struct seq_file *m, const char *s, const char *esc); | ||
180 | |||
181 | The first two output a single character and a string, just like one would | ||
182 | expect. seq_escape() is like seq_puts(), except that any character in s | ||
183 | which is in the string esc will be represented in octal form in the output. | ||
184 | |||
185 | There is also a function for printing filenames: | ||
186 | |||
187 | int seq_path(struct seq_file *m, struct path *path, char *esc); | ||
188 | |||
189 | Here, path indicates the file of interest, and esc is a set of characters | ||
190 | which should be escaped in the output. | ||
191 | |||
192 | |||
193 | Making it all work | ||
194 | |||
195 | So far, we have a nice set of functions which can produce output within the | ||
196 | seq_file system, but we have not yet turned them into a file that a user | ||
197 | can see. Creating a file within the kernel requires, of course, the | ||
198 | creation of a set of file_operations which implement the operations on that | ||
199 | file. The seq_file interface provides a set of canned operations which do | ||
200 | most of the work. The virtual file author still must implement the open() | ||
201 | method, however, to hook everything up. The open function is often a single | ||
202 | line, as in the example module: | ||
203 | |||
204 | static int ct_open(struct inode *inode, struct file *file) | ||
205 | { | ||
206 | return seq_open(file, &ct_seq_ops); | ||
207 | }; | ||
208 | |||
209 | Here, the call to seq_open() takes the seq_operations structure we created | ||
210 | before, and gets set up to iterate through the virtual file. | ||
211 | |||
212 | On a successful open, seq_open() stores the struct seq_file pointer in | ||
213 | file->private_data. If you have an application where the same iterator can | ||
214 | be used for more than one file, you can store an arbitrary pointer in the | ||
215 | private field of the seq_file structure; that value can then be retrieved | ||
216 | by the iterator functions. | ||
217 | |||
218 | The other operations of interest - read(), llseek(), and release() - are | ||
219 | all implemented by the seq_file code itself. So a virtual file's | ||
220 | file_operations structure will look like: | ||
221 | |||
222 | static struct file_operations ct_file_ops = { | ||
223 | .owner = THIS_MODULE, | ||
224 | .open = ct_open, | ||
225 | .read = seq_read, | ||
226 | .llseek = seq_lseek, | ||
227 | .release = seq_release | ||
228 | }; | ||
229 | |||
230 | There is also a seq_release_private() which passes the contents of the | ||
231 | seq_file private field to kfree() before releasing the structure. | ||
232 | |||
233 | The final step is the creation of the /proc file itself. In the example | ||
234 | code, that is done in the initialization code in the usual way: | ||
235 | |||
236 | static int ct_init(void) | ||
237 | { | ||
238 | struct proc_dir_entry *entry; | ||
239 | |||
240 | entry = create_proc_entry("sequence", 0, NULL); | ||
241 | if (entry) | ||
242 | entry->proc_fops = &ct_file_ops; | ||
243 | return 0; | ||
244 | } | ||
245 | |||
246 | module_init(ct_init); | ||
247 | |||
248 | And that is pretty much it. | ||
249 | |||
250 | |||
251 | seq_list | ||
252 | |||
253 | If your file will be iterating through a linked list, you may find these | ||
254 | routines useful: | ||
255 | |||
256 | struct list_head *seq_list_start(struct list_head *head, | ||
257 | loff_t pos); | ||
258 | struct list_head *seq_list_start_head(struct list_head *head, | ||
259 | loff_t pos); | ||
260 | struct list_head *seq_list_next(void *v, struct list_head *head, | ||
261 | loff_t *ppos); | ||
262 | |||
263 | These helpers will interpret pos as a position within the list and iterate | ||
264 | accordingly. Your start() and next() functions need only invoke the | ||
265 | seq_list_* helpers with a pointer to the appropriate list_head structure. | ||
266 | |||
267 | |||
268 | The extra-simple version | ||
269 | |||
270 | For extremely simple virtual files, there is an even easier interface. A | ||
271 | module can define only the show() function, which should create all the | ||
272 | output that the virtual file will contain. The file's open() method then | ||
273 | calls: | ||
274 | |||
275 | int single_open(struct file *file, | ||
276 | int (*show)(struct seq_file *m, void *p), | ||
277 | void *data); | ||
278 | |||
279 | When output time comes, the show() function will be called once. The data | ||
280 | value given to single_open() can be found in the private field of the | ||
281 | seq_file structure. When using single_open(), the programmer should use | ||
282 | single_release() instead of seq_release() in the file_operations structure | ||
283 | to avoid a memory leak. | ||