author | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 18:20:36 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-04-16 18:20:36 -0400 |
commit | 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 (patch) | |
tree | 0bba044c4ce775e45a88a51686b5d9f90697ea9d /Documentation/nommu-mmap.txt |
Linux-2.6.12-rc2
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
Diffstat (limited to 'Documentation/nommu-mmap.txt')
-rw-r--r-- | Documentation/nommu-mmap.txt | 198 |
1 files changed, 198 insertions, 0 deletions
diff --git a/Documentation/nommu-mmap.txt b/Documentation/nommu-mmap.txt
new file mode 100644
index 00000000000..b88ebe4d808
--- /dev/null
+++ b/Documentation/nommu-mmap.txt
@@ -0,0 +1,198 @@ | |||
1 | ============================= | ||
2 | NO-MMU MEMORY MAPPING SUPPORT | ||
3 | ============================= | ||
4 | |||
5 | The kernel has limited support for memory mapping under no-MMU conditions, such | ||
6 | as are used in uClinux environments. From the userspace point of view, memory | ||
7 | mapping is used through the mmap() system call, the shmat() call and the | |
8 | execve() system call. From the kernel's point of view, execve() mapping is | |
9 | actually performed by the binfmt drivers, which call back into the mmap() | |
10 | routines to do the actual work. | |
11 | |||
12 | Memory mapping behaviour also involves the way fork(), vfork(), clone() and | ||
13 | ptrace() work. Under uClinux there is no fork(), and clone() must be supplied | ||
14 | the CLONE_VM flag. | ||
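
Since fork() is unavailable under uClinux, userspace code that needs a child process typically falls back on vfork() followed immediately by an exec. The following is a minimal sketch of that pattern, not code from the kernel tree; the helper name run_shell() and the use of /bin/sh are assumptions made for illustration.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes vfork() in glibc headers */
#endif
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run "sh -c cmd" in a child created with vfork() and return its exit
 * status, or -1 on failure. run_shell() is a hypothetical helper name.
 */
static int run_shell(const char *cmd)
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;              /* no child was created */

    if (pid == 0) {
        /*
         * Child: it runs in the parent's address space until it execs,
         * so it must do nothing here except exec or _exit.
         */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);             /* only reached if the exec failed */
    }

    /* Parent: resumes once the child has exec'd or exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit() rather than exit() so that it does not run the parent's atexit handlers or flush the parent's stdio buffers from the shared address space.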
15 | |||
16 | The behaviour is similar between the MMU and no-MMU cases, but not identical; | ||
17 | and it's also much more restricted in the latter case: | ||
18 | |||
19 | (*) Anonymous mapping, MAP_PRIVATE | ||
20 | |||
21 | In the MMU case: VM regions backed by arbitrary pages; copy-on-write | ||
22 | across fork. | ||
23 | |||
24 | In the no-MMU case: VM regions backed by arbitrary contiguous runs of | ||
25 | pages. | ||
26 | |||
27 | (*) Anonymous mapping, MAP_SHARED | ||
28 | |||
29 | These behave very much like private mappings, except that they're | ||
30 | shared across fork() or clone() without CLONE_VM in the MMU case. Since | ||
31 | the no-MMU case doesn't support these, behaviour is identical to | ||
32 | MAP_PRIVATE there. | ||
33 | |||
34 | (*) File, MAP_PRIVATE, PROT_READ / PROT_EXEC, !PROT_WRITE | ||
35 | |||
36 | In the MMU case: VM regions backed by pages read from file; changes to | ||
37 | the underlying file are reflected in the mapping; copied across fork. | ||
38 | |||
39 | In the no-MMU case: | ||
40 | |||
41 | - If one exists, the kernel will re-use an existing mapping to the | ||
42 | same segment of the same file if that has compatible permissions, | ||
43 | even if this was created by another process. | ||
44 | |||
45 | - If possible, the file mapping will be directly on the backing device | ||
46 | if the backing device has the BDI_CAP_MAP_DIRECT capability and | ||
47 | appropriate mapping protection capabilities. Ramfs, romfs, cramfs | ||
48 | and mtd might all permit this. | ||
49 | |||
50 | - If the backing device can't or won't permit direct sharing, | |
51 | but does have the BDI_CAP_MAP_COPY capability, then a copy of the | |
52 | appropriate bit of the file will be read into a contiguous bit of | |
53 | memory and any extraneous space beyond the EOF will be cleared. | |
54 | |||
55 | - Writes to the file do not affect the mapping; writes to the mapping | ||
56 | are visible in other processes (no MMU protection), but should not | ||
57 | happen. | ||
58 | |||
59 | (*) File, MAP_PRIVATE, PROT_READ / PROT_EXEC, PROT_WRITE | ||
60 | |||
61 | In the MMU case: like the non-PROT_WRITE case, except that the pages in | ||
62 | question get copied before the write actually happens. From that point | ||
63 | on writes to the file underneath that page no longer get reflected into | ||
64 | the mapping's backing pages. The page is then backed by swap instead. | ||
65 | |||
66 | In the no-MMU case: works much like the non-PROT_WRITE case, except | ||
67 | that a copy is always taken and never shared. | ||
68 | |||
69 | (*) Regular file / blockdev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE | ||
70 | |||
71 | In the MMU case: VM regions backed by pages read from file; changes to | ||
72 | pages written back to file; writes to file reflected into pages backing | ||
73 | mapping; shared across fork. | ||
74 | |||
75 | In the no-MMU case: not supported. | ||
76 | |||
77 | (*) Memory backed regular file, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE | ||
78 | |||
79 | In the MMU case: As for ordinary regular files. | ||
80 | |||
81 | In the no-MMU case: The filesystem providing the memory-backed file | ||
82 | (such as ramfs or tmpfs) may choose to honour an open, truncate, mmap | ||
83 | sequence by providing a contiguous sequence of pages to map. In that | ||
84 | case, a shared-writable memory mapping will be possible. It will work | ||
85 | as for the MMU case. If the filesystem does not provide any such | ||
86 | support, then the mapping request will be denied. | ||
87 | |||
88 | (*) Memory backed blockdev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE | ||
89 | |||
90 | In the MMU case: As for ordinary regular files. | ||
91 | |||
92 | In the no-MMU case: As for memory backed regular files, but the | ||
93 | blockdev must be able to provide a contiguous run of pages without | ||
94 | truncate being called. The ramdisk driver could do this if it allocated | ||
95 | all its memory as a contiguous array upfront. | ||
96 | |||
97 | (*) Memory backed chardev, MAP_SHARED, PROT_READ / PROT_EXEC / PROT_WRITE | ||
98 | |||
99 | In the MMU case: As for ordinary regular files. | ||
100 | |||
101 | In the no-MMU case: The character device driver may choose to honour | ||
102 | the mmap() by providing direct access to the underlying device if it | ||
103 | provides memory or quasi-memory that can be accessed directly. Examples | ||
104 | of such are frame buffers and flash devices. If the driver does not | ||
105 | provide any such support, then the mapping request will be denied. | ||
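
Because several of the cases above end with the mapping request being denied, portable userspace code may want a fallback for read-only file access. The sketch below tries a private read-only mapping first and, if mmap() fails, reads the file into a heap buffer instead. The struct file_view type and both helper names are hypothetical, and the fallback policy is an assumption of this example rather than something the kernel does on the caller's behalf.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical handle: either a mapping or a heap copy of a file. */
struct file_view {
    void  *data;
    size_t len;
    int    mapped;   /* 1 if data came from mmap(), 0 if from malloc() */
};

/* Returns 0 on success, -1 on failure (including empty files). */
static int file_view_open(const char *path, struct file_view *v)
{
    struct stat st;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }

    v->len = (size_t)st.st_size;
    v->data = mmap(NULL, v->len, PROT_READ, MAP_PRIVATE, fd, 0);
    v->mapped = (v->data != MAP_FAILED);

    if (!v->mapped) {
        /* Mapping was refused: fall back to a plain read into the heap. */
        size_t done = 0;

        v->data = malloc(v->len);
        while (v->data && done < v->len) {
            ssize_t n = read(fd, (char *)v->data + done, v->len - done);

            if (n <= 0) {       /* error or unexpected EOF: give up */
                free(v->data);
                v->data = NULL;
            } else {
                done += (size_t)n;
            }
        }
    }

    close(fd);
    return v->data ? 0 : -1;
}

static void file_view_close(struct file_view *v)
{
    if (v->mapped)
        munmap(v->data, v->len);
    else
        free(v->data);
    v->data = NULL;
}
```

The caller treats the returned buffer as read-only in both cases, which matches the note above that writes to a non-PROT_WRITE private mapping should not happen.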
106 | |||
107 | |||
108 | ============================ | ||
109 | FURTHER NOTES ON NO-MMU MMAP | ||
110 | ============================ | ||
111 | |||
112 | (*) A request for a private mapping of less than a page in size may not return | ||
113 | a page-aligned buffer. This is because the kernel calls kmalloc() to | ||
114 | allocate the buffer, not get_free_page(). | ||
115 | |||
116 | (*) A list of all the mappings on the system is visible through /proc/maps in | ||
117 | no-MMU mode. | ||
118 | |||
119 | (*) Supplying MAP_FIXED or requesting a particular mapping address will | |
120 | result in an error. | ||
121 | |||
122 | (*) Files mapped privately usually have to have a read method provided by the | ||
123 | driver or filesystem so that the contents can be read into the memory | ||
124 | allocated if mmap() chooses not to map the backing device directly. An | ||
125 | error will result if they don't. This is most likely to be encountered | ||
126 | with character device files, pipes, fifos and sockets. | ||
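
The first and third notes above suggest two habits for userspace code meant to run on both MMU and no-MMU kernels: pass no address hint and no MAP_FIXED, and check alignment rather than assume it. A minimal sketch follows; the helper names map_scratch() and is_page_aligned() are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS in glibc headers */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Request an anonymous private mapping with no address hint and without
 * MAP_FIXED. Returns NULL on failure.
 */
static void *map_scratch(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* Report whether a pointer sits on a page boundary instead of assuming it. */
static int is_page_aligned(const void *p)
{
    long page = sysconf(_SC_PAGESIZE);

    return page > 0 && ((uintptr_t)p % (uintptr_t)page) == 0;
}
```

Code that really needs a page-aligned buffer can call is_page_aligned() on the result and handle the unaligned case explicitly, for example by over-allocating and aligning within the buffer.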
127 | |||
128 | ============================================ | ||
129 | PROVIDING SHAREABLE CHARACTER DEVICE SUPPORT | ||
130 | ============================================ | ||
131 | |||
132 | To provide shareable character device support, a driver must provide a | ||
133 | file->f_op->get_unmapped_area() operation. The mmap() routines will call this | ||
134 | to get a proposed address for the mapping. The operation may return an error | |
135 | if it doesn't wish to honour the mapping, for instance because it's too long, | |
136 | at an unsupported offset or under an unsupported combination of flags. | |
137 | |||
138 | The driver should also provide backing device information with capabilities set | ||
139 | to indicate the permitted types of mapping on such devices. The default is | ||
140 | assumed to be readable and writable, not executable, and only shareable | ||
141 | directly (can't be copied). | ||
142 | |||
143 | The file->f_op->mmap() operation will be called to actually inaugurate the | ||
144 | mapping. It can be rejected at that point. Returning the ENOSYS error will | ||
145 | cause the mapping to be copied instead if BDI_CAP_MAP_COPY is specified. | ||
146 | |||
147 | The vm_ops->close() routine will be invoked when the last mapping on a chardev | ||
148 | is removed. An existing mapping will be shared, partially or not, if possible | ||
149 | without notifying the driver. | ||
150 | |||
151 | It is permitted also for the file->f_op->get_unmapped_area() operation to | ||
152 | return -ENOSYS. This will be taken to mean that the driver doesn't want to | |
153 | handle the request, even though it provides the operation. For instance, it | |
154 | might try directing the call to a secondary driver which turns out not to | ||
155 | implement it. Such is the case for the framebuffer driver which attempts to | ||
156 | direct the call to the device-specific driver. Under such circumstances, the | ||
157 | mapping request will be rejected if BDI_CAP_MAP_COPY is not specified, and a | ||
158 | copy mapped otherwise. | ||
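
The outcome rules described in this section can be modelled as a small decision function. This is a userspace illustration only, not kernel code: DEMO_CAP_MAP_COPY is a hypothetical stand-in for BDI_CAP_MAP_COPY, the result of get_unmapped_area() is represented as a signed long holding either an address or a negative errno, and treating every error other than -ENOSYS as a rejection is this sketch's reading of the text.

```c
#include <errno.h>

/* Hypothetical stand-in for the BDI_CAP_MAP_COPY capability bit. */
#define DEMO_CAP_MAP_COPY 0x1u

enum demo_outcome {
    DEMO_MAP_DIRECT,   /* map straight onto the device's memory */
    DEMO_MAP_COPY,     /* read the contents into allocated memory */
    DEMO_MAP_REJECT    /* fail the mmap() request */
};

/*
 * gua_result models what the driver's get_unmapped_area() returned:
 * a proposed address (>= 0 here) or a negative errno value.
 */
static enum demo_outcome demo_resolve(long gua_result, unsigned int caps)
{
    if (gua_result >= 0)
        return DEMO_MAP_DIRECT;

    /* -ENOSYS means "not handled here": copy if the device allows it. */
    if (gua_result == -ENOSYS && (caps & DEMO_CAP_MAP_COPY))
        return DEMO_MAP_COPY;

    return DEMO_MAP_REJECT;
}
```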
159 | |||
160 | IMPORTANT NOTE: | ||
161 | |||
162 | Some types of device may present a different appearance to anyone | ||
163 | looking at them in certain modes. Flash chips can be like this; for | ||
164 | instance if they're in programming or erase mode, you might see the | ||
165 | status reflected in the mapping, instead of the data. | ||
166 | |||
167 | In such a case, care must be taken lest userspace see a shared or a | ||
168 | private mapping showing such information when the driver is busy | ||
169 | controlling the device. Remember especially: private executable | ||
170 | mappings may still be mapped directly off the device under some | ||
171 | circumstances! | ||
172 | |||
173 | |||
174 | ============================================== | ||
175 | PROVIDING SHAREABLE MEMORY-BACKED FILE SUPPORT | ||
176 | ============================================== | ||
177 | |||
178 | Provision of shared mappings on memory backed files is similar to the provision | ||
179 | of support for shared mapped character devices. The main difference is that the | ||
180 | filesystem providing the service will probably allocate a contiguous collection | ||
181 | of pages and permit mappings to be made on that. | ||
182 | |||
183 | It is recommended that a truncate operation applied to such a file that | ||
184 | increases the file size, if that file is empty, be taken as a request to gather | ||
185 | enough pages to honour a mapping. This is required to support POSIX shared | ||
186 | memory. | ||
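
From userspace, the open, truncate, mmap sequence referred to here looks roughly like the sketch below. The helper name shared_file_map() is hypothetical, and whether the mapping succeeds on a no-MMU kernel depends on the filesystem behind the path, as described above; on an MMU kernel it works on any regular file.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open (creating if needed) the file at path, make sure it is at least
 * len bytes long, and map it shared and writable. Returns NULL on failure.
 */
static void *shared_file_map(const char *path, size_t len)
{
    struct stat st;
    void *p = MAP_FAILED;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return NULL;

    /* Grow the file (from empty, on first use) so it can back the mapping. */
    if (fstat(fd, &st) == 0 &&
        (st.st_size >= (off_t)len || ftruncate(fd, (off_t)len) == 0))
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}
```

Two callers that map the same path see each other's writes, which is the behaviour POSIX shared memory relies on.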
187 | |||
188 | Memory backed devices are indicated by the mapping's backing device info having | ||
189 | the memory_backed flag set. | ||
190 | |||
191 | |||
192 | ======================================== | ||
193 | PROVIDING SHAREABLE BLOCK DEVICE SUPPORT | ||
194 | ======================================== | ||
195 | |||
196 | Provision of shared mappings on block device files is exactly the same as for | ||
197 | character devices. If there isn't a real device underneath, then the driver | ||
198 | should allocate sufficient contiguous memory to honour any supported mapping. | ||