aboutsummaryrefslogtreecommitdiffstats
path: root/Documentation/filesystems/ext4.txt
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/filesystems/ext4.txt')
-rw-r--r--Documentation/filesystems/ext4.txt85
1 files changed, 67 insertions, 18 deletions
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt
index 174eaff7ded9..cec829bc7291 100644
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -58,13 +58,22 @@ Note: More extensive information for getting started with ext4 can be
58 58
59 # mount -t ext4 /dev/hda1 /wherever 59 # mount -t ext4 /dev/hda1 /wherever
60 60
61 - When comparing performance with other filesystems, remember that 61 - When comparing performance with other filesystems, it's always
62 ext3/4 by default offers higher data integrity guarantees than most. 62 important to try multiple workloads; very often a subtle change in a
63 So when comparing with a metadata-only journalling filesystem, such 63 workload parameter can completely change the ranking of which
64 as ext3, use `mount -o data=writeback'. And you might as well use 64 filesystems do well compared to others. When comparing versus ext3,
65 `mount -o nobh' too along with it. Making the journal larger than 65 note that ext4 enables write barriers by default, while ext3 does
66 the mke2fs default often helps performance with metadata-intensive 66 not enable write barriers by default. So it is useful to use
67 workloads. 67 explicitly specify whether barriers are enabled or not when via the
68 '-o barriers=[0|1]' mount option for both ext3 and ext4 filesystems
69 for a fair comparison. When tuning ext3 for best benchmark numbers,
70 it is often worthwhile to try changing the data journaling mode; '-o
71 data=writeback,nobh' can be faster for some workloads. (Note
72 however that running mounted with data=writeback can potentially
73 leave stale data exposed in recently written files in case of an
74 unclean shutdown, which could be a security exposure in some
75 situations.) Configuring the filesystem with a large journal can
76 also be helpful for metadata-intensive workloads.
68 77
692. Features 782. Features
70=========== 79===========
@@ -74,7 +83,7 @@ Note: More extensive information for getting started with ext4 can be
74* ability to use filesystems > 16TB (e2fsprogs support not available yet) 83* ability to use filesystems > 16TB (e2fsprogs support not available yet)
75* extent format reduces metadata overhead (RAM, IO for access, transactions) 84* extent format reduces metadata overhead (RAM, IO for access, transactions)
76* extent format more robust in face of on-disk corruption due to magics, 85* extent format more robust in face of on-disk corruption due to magics,
77* internal redunancy in tree 86* internal redundancy in tree
78* improved file allocation (multi-block alloc) 87* improved file allocation (multi-block alloc)
79* fix 32000 subdirectory limit 88* fix 32000 subdirectory limit
80* nsec timestamps for mtime, atime, ctime, create time 89* nsec timestamps for mtime, atime, ctime, create time
@@ -116,10 +125,11 @@ grouping of bitmaps and inode tables. Some test results available here:
116When mounting an ext4 filesystem, the following option are accepted: 125When mounting an ext4 filesystem, the following option are accepted:
117(*) == default 126(*) == default
118 127
119extents (*) ext4 will use extents to address file data. The 128ro Mount filesystem read only. Note that ext4 will
120 file system will no longer be mountable by ext3. 129 replay the journal (and thus write to the
121 130 partition) even when mounted "read only". The
122noextents ext4 will not use extents for newly created files 131 mount options "ro,noload" can be used to prevent
132 writes to the filesystem.
123 133
124journal_checksum Enable checksumming of the journal transactions. 134journal_checksum Enable checksumming of the journal transactions.
125 This will allow the recovery code in e2fsck and the 135 This will allow the recovery code in e2fsck and the
@@ -134,17 +144,17 @@ journal_async_commit Commit block can be written to disk without waiting
134journal=update Update the ext4 file system's journal to the current 144journal=update Update the ext4 file system's journal to the current
135 format. 145 format.
136 146
137journal=inum When a journal already exists, this option is ignored.
138 Otherwise, it specifies the number of the inode which
139 will represent the ext4 file system's journal file.
140
141journal_dev=devnum When the external journal device's major/minor numbers 147journal_dev=devnum When the external journal device's major/minor numbers
142 have changed, this option allows the user to specify 148 have changed, this option allows the user to specify
143 the new journal location. The journal device is 149 the new journal location. The journal device is
144 identified through its new major/minor numbers encoded 150 identified through its new major/minor numbers encoded
145 in devnum. 151 in devnum.
146 152
147noload Don't load the journal on mounting. 153noload Don't load the journal on mounting. Note that
154 if the filesystem was not unmounted cleanly,
155 skipping the journal replay will lead to the
156 filesystem containing inconsistencies that can
157 lead to any number of problems.
148 158
149data=journal All data are committed into the journal prior to being 159data=journal All data are committed into the journal prior to being
150 written into the main file system. 160 written into the main file system.
@@ -219,9 +229,12 @@ minixdf Make 'df' act like Minix.
219 229
220debug Extra debugging information is sent to syslog. 230debug Extra debugging information is sent to syslog.
221 231
222errors=remount-ro(*) Remount the filesystem read-only on an error. 232errors=remount-ro Remount the filesystem read-only on an error.
223errors=continue Keep going on a filesystem error. 233errors=continue Keep going on a filesystem error.
224errors=panic Panic and halt the machine if an error occurs. 234errors=panic Panic and halt the machine if an error occurs.
235 (These mount options override the errors behavior
236 specified in the superblock, which can be configured
237 using tune2fs)
225 238
226data_err=ignore(*) Just print an error message if an error occurs 239data_err=ignore(*) Just print an error message if an error occurs
227 in a file data buffer in ordered mode. 240 in a file data buffer in ordered mode.
@@ -261,6 +274,42 @@ delalloc (*) Deferring block allocation until write-out time.
261nodelalloc Disable delayed allocation. Blocks are allocation 274nodelalloc Disable delayed allocation. Blocks are allocation
262 when data is copied from user to page cache. 275 when data is copied from user to page cache.
263 276
277max_batch_time=usec Maximum amount of time ext4 should wait for
278 additional filesystem operations to be batch
279 together with a synchronous write operation.
280 Since a synchronous write operation is going to
281 force a commit and then a wait for the I/O
282 complete, it doesn't cost much, and can be a
283 huge throughput win, we wait for a small amount
284 of time to see if any other transactions can
285 piggyback on the synchronous write. The
286 algorithm used is designed to automatically tune
287 for the speed of the disk, by measuring the
288 amount of time (on average) that it takes to
289 finish committing a transaction. Call this time
290 the "commit time". If the time that the
291 transactoin has been running is less than the
292 commit time, ext4 will try sleeping for the
293 commit time to see if other operations will join
294 the transaction. The commit time is capped by
295 the max_batch_time, which defaults to 15000us
296 (15ms). This optimization can be turned off
297 entirely by setting max_batch_time to 0.
298
299min_batch_time=usec This parameter sets the commit time (as
300 described above) to be at least min_batch_time.
301 It defaults to zero microseconds. Increasing
302 this parameter may improve the throughput of
303 multi-threaded, synchronous workloads on very
304 fast disks, at the cost of increasing latency.
305
306journal_ioprio=prio The I/O priority (from 0 to 7, where 0 is the
307 highest priorty) which should be used for I/O
308 operations submitted by kjournald2 during a
309 commit operation. This defaults to 3, which is
310 a slightly higher priority than the default I/O
311 priority.
312
264Data Mode 313Data Mode
265========= 314=========
266There are 3 different data modes: 315There are 3 different data modes: