author Neil Brown <neilb@suse.de> 2011-06-01 04:57:15 -0400
committer Leann Ogasawara <leann.ogasawara@canonical.com> 2011-08-30 13:17:34 -0400
commit 0d9f38ceaefbe456768eb9e162160cae4746c756
tree 359f716f6bbadd546cc084b7d5435f2b4ea11d1b /Documentation/filesystems
parent 5b4152ab30cffad224f72a26b2f7bf1277d1f51c
UBUNTU: ubuntu: overlayfs -- overlay: overlay filesystem documentation
Document the overlay filesystem.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Andy Whitcroft <apw@canonical.com>
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r-- Documentation/filesystems/overlayfs.txt 167
1 files changed, 167 insertions, 0 deletions
diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
new file mode 100644
index 00000000000..4bc0b343398
--- /dev/null
+++ b/Documentation/filesystems/overlayfs.txt
@@ -0,0 +1,167 @@
Written by: Neil Brown <neilb@suse.de>

Overlay Filesystem
==================

This document describes a prototype for a new approach to providing
overlay-filesystem functionality in Linux (sometimes referred to as
union-filesystems). An overlay-filesystem tries to present a
filesystem which is the result of overlaying one filesystem on top
of the other.

The result will inevitably fail to look exactly like a normal
filesystem for various technical reasons. The expectation is that
many use cases will be able to ignore these differences.

This approach is 'hybrid' because the objects that appear in the
filesystem do not all appear to belong to that filesystem. In many
cases, accessing an object through the union will be indistinguishable
from accessing the corresponding object in the original filesystem.
This is most obvious from the 'st_dev' field returned by stat(2).

While directories will report an st_dev from the overlay-filesystem,
all non-directory objects will report an st_dev from the lower or
upper filesystem that is providing the object. Similarly st_ino will
only be unique when combined with st_dev, and both of these can change
over the lifetime of a non-directory object. Many applications and
tools ignore these values and will not be affected.
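
The identity fields discussed above can be inspected from user space.
The following is a minimal sketch, not part of overlayfs itself; the
helper names object_identity and same_object are made up for
illustration. It reads the (st_dev, st_ino) pair that stat(2) reports,
which is the pair a tool would have to compare to decide whether two
names refer to the same object.

```python
import os


def object_identity(path):
    """Return the (st_dev, st_ino) pair that stat(2) reports for path."""
    # lstat is used so that a symlink is identified itself, not its target.
    st = os.lstat(path)
    return (st.st_dev, st.st_ino)


def same_object(path_a, path_b):
    """True if both paths currently report the same (st_dev, st_ino)."""
    return object_identity(path_a) == object_identity(path_b)
```

On an overlay mount, one would expect object_identity() of a directory
to carry the overlay's st_dev, while a non-directory carries the st_dev
of whichever layer provides it. Because both values can change over the
lifetime of a non-directory object, a result from same_object() should
be treated as valid only at the moment of the call.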

Upper and Lower
---------------

An overlay filesystem combines two filesystems - an 'upper' filesystem
and a 'lower' filesystem. When a name exists in both filesystems, the
object in the 'upper' filesystem is visible while the object in the
'lower' filesystem is either hidden or, in the case of directories,
merged with the 'upper' object.

It would be more correct to refer to an upper and lower 'directory
tree' rather than 'filesystem' as it is quite possible for both
directory trees to be in the same filesystem and there is no
requirement that the root of a filesystem be given for either upper or
lower.

The lower filesystem can be any filesystem supported by Linux and does
not need to be writable. The lower filesystem can even be another
overlayfs. The upper filesystem will normally be writable and, if it
is, it must support the creation of trusted.* extended attributes, and
must provide valid d_type in readdir responses, at least for symbolic
links - so NFS is not suitable.

A read-only overlay of two read-only filesystems may use any
filesystem type.

Directories
-----------

Overlaying mainly involves directories. If a given name appears in both
upper and lower filesystems and refers to a non-directory in either,
then the lower object is hidden - the name refers only to the upper
object.

Where both upper and lower objects are directories, a merged directory
is formed.

At mount time, the two directories given as mount options are combined
into a merged directory:

  mount -t overlayfs overlayfs -olowerdir=/lower,upperdir=/upper /overlay

Then whenever a lookup is requested in such a merged directory, the
lookup is performed in each actual directory and the combined result
is cached in the dentry belonging to the overlay filesystem. If both
actual lookups find directories, both are stored and a merged
directory is created, otherwise only one is stored: the upper if it
exists, else the lower.
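
The lookup rule above can be modelled in user space. This is only a
sketch of the decision, assuming two ordinary directory trees; the
function name lookup and its return tuples are invented for
illustration, and the model ignores dentry caching, whiteouts and
opaque directories.

```python
import os


def lookup(upper_dir, lower_dir, name):
    """Model of an overlay lookup for `name` in a merged directory.

    Returns ("merged", upper_path, lower_path), ("upper", path),
    ("lower", path), or None if the name exists in neither layer.
    """
    upper = os.path.join(upper_dir, name)
    lower = os.path.join(lower_dir, name)

    def is_real_dir(path):
        # A symlink to a directory is a non-directory object: do not follow it.
        return os.path.isdir(path) and not os.path.islink(path)

    upper_exists = os.path.lexists(upper)
    lower_exists = os.path.lexists(lower)

    # Both layers provide a directory: keep both and merge them.
    if upper_exists and lower_exists and is_real_dir(upper) and is_real_dir(lower):
        return ("merged", upper, lower)
    # Otherwise only one object is used: the upper if it exists, else the lower.
    if upper_exists:
        return ("upper", upper)
    if lower_exists:
        return ("lower", lower)
    return None
```

With the mount example above, lookup("/upper", "/lower", "etc") would
return a "merged" result only when "etc" is a directory in both trees.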

Only the lists of names from directories are merged. Other content
such as metadata and extended attributes is reported for the upper
directory only. These attributes of the lower directory are hidden.

whiteouts and opaque directories
--------------------------------

In order to support rm and rmdir without changing the lower
filesystem, an overlay filesystem needs to record in the upper filesystem
that files have been removed. This is done using whiteouts and opaque
directories (non-directories are always opaque).

The overlay filesystem uses extended attributes with a
"trusted.overlay." prefix to record these details.

A whiteout is created as a symbolic link with target
"(overlay-whiteout)" and with xattr "trusted.overlay.whiteout" set to "y".
When a whiteout is found in the upper level of a merged directory, any
matching name in the lower level is ignored, and the whiteout itself
is also hidden.

A directory is made opaque by setting the xattr "trusted.overlay.opaque"
to "y". Where the upper filesystem contains an opaque directory, any
directory in the lower filesystem with the same name is ignored.
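
The markers described above can be recognised from user space when
examining an upper tree directly. The sketch below is an illustration
only: the helper names are invented, and it assumes a Linux system
where os.getxattr is available. Reading trusted.* attributes normally
requires CAP_SYS_ADMIN, so without that privilege the xattr checks
simply report False.

```python
import os

WHITEOUT_TARGET = "(overlay-whiteout)"
WHITEOUT_XATTR = "trusted.overlay.whiteout"
OPAQUE_XATTR = "trusted.overlay.opaque"


def xattr_is_y(path, name):
    """True if xattr `name` on `path` itself (symlinks not followed) is "y"."""
    getxattr = getattr(os, "getxattr", None)  # only present on Linux
    if getxattr is None:
        return False
    try:
        return getxattr(path, name, follow_symlinks=False) == b"y"
    except OSError:
        # Attribute missing, filesystem lacks xattrs, or caller lacks privilege.
        return False


def has_whiteout_target(path):
    """True if `path` is a symlink whose target is the whiteout marker."""
    return os.path.islink(path) and os.readlink(path) == WHITEOUT_TARGET


def is_whiteout(path):
    """True if `path` carries both the symlink target and the xattr."""
    return has_whiteout_target(path) and xattr_is_y(path, WHITEOUT_XATTR)


def is_opaque_dir(path):
    """True if `path` is a real directory marked opaque."""
    is_dir = os.path.isdir(path) and not os.path.islink(path)
    return is_dir and xattr_is_y(path, OPAQUE_XATTR)
```

These helpers are meant to be run against the upper directory tree
itself (for example /upper), not against the mounted overlay, where
whiteouts are hidden.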

readdir
-------

When a 'readdir' request is made on a merged directory, the upper and
lower directories are each read and the name lists merged in the
obvious way (upper is read first, then lower - entries that already
exist are not re-added). This merged name list is cached in the
'struct file' and so remains as long as the file is kept open. If the
directory is opened and read by two processes at the same time, they
will each have separate caches. A seekdir to the start of the
directory (offset 0) followed by a readdir will cause the cache to be
discarded and rebuilt.
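
The merge order described above can be written down as a small model.
This is a sketch over plain lists of names rather than real directory
streams; the function name merge_names and the whiteouts parameter are
assumptions made for illustration. Hiding whiteout names follows the
whiteout section, not anything stated in the readdir text itself.

```python
def merge_names(upper_names, lower_names, whiteouts=()):
    """Merge two directory listings: upper first, then lower.

    upper_names, lower_names: iterables of entry names from each layer.
    whiteouts: upper-layer names that are whiteouts; they hide the
    matching lower entry and are themselves hidden.
    """
    hidden = set(whiteouts)
    seen = set()
    merged = []
    for name in list(upper_names) + list(lower_names):
        # Skip whiteouts and names already contributed by the upper layer.
        if name in hidden or name in seen:
            continue
        seen.add(name)
        merged.append(name)
    return merged
```

In this model a seek offset is simply an index into the returned list,
which is why an offset remembered from one listing says little about
the same position in a later, rebuilt listing.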

This means that changes to the merged directory do not appear while a
directory is being read. This is unlikely to be noticed by many
programs.

Seek offsets are assigned sequentially when the directories are read.
Thus if a program performs the following steps:

  - read part of a directory
  - remember an offset, and close the directory
  - re-open the directory some time later
  - seek to the remembered offset

there may be little correlation between the old and new locations in
the list of filenames, particularly if anything has changed in the
directory.

Readdir on directories that are not merged is simply handled by the
underlying directory (upper or lower).

Non-directories
---------------

Objects that are not directories (files, symlinks, device-special
files etc.) are presented either from the upper or lower filesystem as
appropriate. When a file in the lower filesystem is accessed in a way
that requires write-access, such as opening for write access, changing
some metadata etc., the file is first copied from the lower filesystem
to the upper filesystem (copy_up). Note that creating a hard-link
also requires copy_up, though of course creation of a symlink does
not.

The copy_up process first makes sure that the containing directory
exists in the upper filesystem - creating it and any parents as
necessary. It then creates the object with the same metadata (owner,
mode, mtime, symlink-target etc.) and then if the object is a file, the
data is copied from the lower to the upper filesystem. Finally any
extended attributes are copied up.
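
A rough user-space analogy of the copy_up steps is sketched below. It
is not the kernel implementation: the function copy_up and its
arguments are invented for illustration, it handles only regular files
and symlinks, and it applies metadata after copying the data rather
than at creation time. It also assumes that newly created parent
directories take their mode and times from the lower directory, which
the text above does not specify.

```python
import os
import shutil


def copy_up(lower_root, upper_root, relpath):
    """Copy lower_root/relpath into upper_root, creating parents as needed."""
    # Step 1: make sure every containing directory exists in the upper tree.
    partial = ""
    for part in [p for p in os.path.dirname(relpath).split(os.sep) if p]:
        partial = os.path.join(partial, part)
        upper_dir = os.path.join(upper_root, partial)
        if not os.path.isdir(upper_dir):
            os.mkdir(upper_dir)
            # Assumption: a new parent inherits mode and times from the lower one.
            shutil.copystat(os.path.join(lower_root, partial), upper_dir)

    src = os.path.join(lower_root, relpath)
    dst = os.path.join(upper_root, relpath)

    # Step 2: create the object; for a regular file this also copies the data.
    if os.path.islink(src):
        os.symlink(os.readlink(src), dst)
    else:
        shutil.copyfile(src, dst)

    # Step 3: copy mode, timestamps and, on Linux where permitted, xattrs.
    shutil.copystat(src, dst, follow_symlinks=False)

    # Step 4: copy ownership; this usually needs privilege, so failure is tolerated.
    try:
        st = os.lstat(src)
        os.chown(dst, st.st_uid, st.st_gid, follow_symlinks=False)
    except (PermissionError, NotImplementedError):
        pass
    return dst
```

Calling copy_up("/lower", "/upper", "etc/hosts") on unmounted trees
would leave /upper/etc/hosts as a private copy that later writes can
modify without touching /lower.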

Once the copy_up is complete, the overlay filesystem simply
provides direct access to the newly created file in the upper
filesystem - future operations on the file are barely noticed by the
overlay filesystem (though an operation on the name of the file such as
rename or unlink will of course be noticed and handled).

Changes to underlying filesystems
---------------------------------

Offline changes, when the overlay is not mounted, are allowed to either
the upper or the lower trees.

Changes to the underlying filesystems while part of a mounted overlay
filesystem are not allowed. This is not yet enforced, but will be in
the future.