diff options
author | Jose R. Santos <jrs@us.ibm.com> | 2008-07-11 19:27:31 -0400 |
---|---|---|
committer | Theodore Ts'o <tytso@mit.edu> | 2008-07-11 19:27:31 -0400 |
commit | 93e3270c87549dc531a0b0e5d06362d998d810cb (patch) | |
tree | 2eed9efa6734780d3adbe85dd12c4a470d93072a /Documentation | |
parent | 5f21b0e642d7bf6fe4434c9ba12bc9cb96b17cf7 (diff) |
ext4: Documentation updates.
Some of the information in Documentation/filesystems/ext4.txt is out
of date and in need of an update.
Signed-off-by: Jose R. Santos <jrs@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/filesystems/ext4.txt | 106 |
1 files changed, 62 insertions, 44 deletions
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt index 0c5086db8352..7e940c64be47 100644 --- a/Documentation/filesystems/ext4.txt +++ b/Documentation/filesystems/ext4.txt | |||
@@ -13,72 +13,89 @@ Mailing list: linux-ext4@vger.kernel.org | |||
13 | 1. Quick usage instructions: | 13 | 1. Quick usage instructions: |
14 | =========================== | 14 | =========================== |
15 | 15 | ||
16 | - Grab updated e2fsprogs from | 16 | - Compile and install the latest version of e2fsprogs (as of this |
17 | ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs-interim/ | 17 | writing version 1.41) from: |
18 | This is a patchset on top of e2fsprogs-1.39, which can be found at | 18 | |
19 | http://sourceforge.net/project/showfiles.php?group_id=2406 | ||
20 | |||
21 | or | ||
22 | |||
19 | ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs/ | 23 | ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/e2fsprogs/ |
20 | 24 | ||
21 | - It's still mke2fs -j /dev/hda1 | 25 | or grab the latest git repository from: |
26 | |||
27 | git://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git | ||
28 | |||
29 | - Create a new filesystem using the ext4dev filesystem type: | ||
30 | |||
31 | # mke2fs -t ext4dev /dev/hda1 | ||
32 | |||
33 | Or configure an existing ext3 filesystem to support extents and set | ||
34 | the test_fs flag to indicate that it's ok for an in-development | ||
35 | filesystem to touch this filesystem: | ||
22 | 36 | ||
23 | - mount /dev/hda1 /wherever -t ext4dev | 37 | # tune2fs -O extents -E test_fs /dev/hda1 |
24 | 38 | ||
25 | - To enable extents, | 39 | If the filesystem was created with 128 byte inodes, it can be |
40 | converted to use 256 byte for greater efficiency via: | ||
26 | 41 | ||
27 | mount /dev/hda1 /wherever -t ext4dev -o extents | 42 | # tune2fs -I 256 /dev/hda1 |
28 | 43 | ||
29 | - The filesystem is compatible with the ext3 driver until you add a file | 44 | (Note: we currently do not have tools to convert an ext4dev |
30 | which has extents (ie: `mount -o extents', then create a file). | 45 | filesystem back to ext3; so please do not do try this on production |
46 | filesystems.) | ||
31 | 47 | ||
32 | NOTE: The "extents" mount flag is temporary. It will soon go away and | 48 | - Mounting: |
33 | extents will be enabled by the "-o extents" flag to mke2fs or tune2fs | 49 | |
50 | # mount -t ext4dev /dev/hda1 /wherever | ||
34 | 51 | ||
35 | - When comparing performance with other filesystems, remember that | 52 | - When comparing performance with other filesystems, remember that |
36 | ext3/4 by default offers higher data integrity guarantees than most. So | 53 | ext3/4 by default offers higher data integrity guarantees than most. |
37 | when comparing with a metadata-only journalling filesystem, use `mount -o | 54 | So when comparing with a metadata-only journalling filesystem, such |
38 | data=writeback'. And you might as well use `mount -o nobh' too along | 55 | as ext3, use `mount -o data=writeback'. And you might as well use |
39 | with it. Making the journal larger than the mke2fs default often helps | 56 | `mount -o nobh' too along with it. Making the journal larger than |
40 | performance with metadata-intensive workloads. | 57 | the mke2fs default often helps performance with metadata-intensive |
58 | workloads. | ||
41 | 59 | ||
42 | 2. Features | 60 | 2. Features |
43 | =========== | 61 | =========== |
44 | 62 | ||
45 | 2.1 Currently available | 63 | 2.1 Currently available |
46 | 64 | ||
47 | * ability to use filesystems > 16TB | 65 | * ability to use filesystems > 16TB (e2fsprogs support not available yet) |
48 | * extent format reduces metadata overhead (RAM, IO for access, transactions) | 66 | * extent format reduces metadata overhead (RAM, IO for access, transactions) |
49 | * extent format more robust in face of on-disk corruption due to magics, | 67 | * extent format more robust in face of on-disk corruption due to magics, |
50 | * internal redunancy in tree | 68 | * internal redunancy in tree |
51 | 69 | * improved file allocation (multi-block alloc, delayed alloc) | |
52 | 2.1 Previously available, soon to be enabled by default by "mkefs.ext4": | 70 | * fix 32000 subdirectory limit |
53 | 71 | * nsec timestamps for mtime, atime, ctime, create time | |
54 | * dir_index and resize inode will be on by default | 72 | * inode version field on disk (NFSv4, Lustre) |
55 | * large inodes will be used by default for fast EAs, nsec timestamps, etc | 73 | * reduced e2fsck time via uninit_bg feature |
74 | * journal checksumming for robustness, performance | ||
75 | * persistent file preallocation (e.g for streaming media, databases) | ||
76 | * ability to pack bitmaps and inode tables into larger virtual groups via the | ||
77 | flex_bg feature | ||
78 | * large file support | ||
79 | * Inode allocation using large virtual block groups via flex_bg | ||
56 | 80 | ||
57 | 2.2 Candidate features for future inclusion | 81 | 2.2 Candidate features for future inclusion |
58 | 82 | ||
59 | There are several under discussion, whether they all make it in is | 83 | * Online defrag (patches available but not well tested) |
60 | partly a function of how much time everyone has to work on them: | 84 | * reduced mke2fs time via lazy itable initialization in conjuction with |
85 | the uninit_bg feature (capability to do this is available in e2fsprogs | ||
86 | but a kernel thread to do lazy zeroing of unused inode table blocks | ||
87 | after filesystem is first mounted is required for safety) | ||
61 | 88 | ||
62 | * improved file allocation (multi-block alloc, delayed alloc; basically done) | 89 | There are several others under discussion, whether they all make it in is |
63 | * fix 32000 subdirectory limit (patch exists, needs some e2fsck work) | 90 | partly a function of how much time everyone has to work on them. Features like |
64 | * nsec timestamps for mtime, atime, ctime, create time (patch exists, | 91 | metadata checksumming have been discussed and planned for a bit but no patches |
65 | needs some e2fsck work) | 92 | exist yet so I'm not sure they're in the near-term roadmap. |
66 | * inode version field on disk (NFSv4, Lustre; prototype exists) | ||
67 | * reduced mke2fs/e2fsck time via uninitialized groups (prototype exists) | ||
68 | * journal checksumming for robustness, performance (prototype exists) | ||
69 | * persistent file preallocation (e.g for streaming media, databases) | ||
70 | 93 | ||
71 | Features like metadata checksumming have been discussed and planned for | 94 | The big performance win will come with mballoc, delalloc and flex_bg |
72 | a bit but no patches exist yet so I'm not sure they're in the near-term | 95 | grouping of bitmaps and inode tables. Some test results available here: |
73 | roadmap. | ||
74 | 96 | ||
75 | The big performance win will come with mballoc and delalloc. CFS has | 97 | - http://www.bullopensource.org/ext4/20080530/ffsb-write-2.6.26-rc2.html |
76 | been using mballoc for a few years already with Lustre, and IBM + Bull | 98 | - http://www.bullopensource.org/ext4/20080530/ffsb-readwrite-2.6.26-rc2.html |
77 | did a lot of benchmarking on it. The reason it isn't in the first set of | ||
78 | patches is partly a manageability issue, and partly because it doesn't | ||
79 | directly affect the on-disk format (outside of much better allocation) | ||
80 | so it isn't critical to get into the first round of changes. I believe | ||
81 | Alex is working on a new set of patches right now. | ||
82 | 99 | ||
83 | 3. Options | 100 | 3. Options |
84 | ========== | 101 | ========== |
@@ -224,7 +241,7 @@ stripe=n Number of filesystem blocks that mballoc will try | |||
224 | disks * RAID chunk size in file system blocks. | 241 | disks * RAID chunk size in file system blocks. |
225 | 242 | ||
226 | Data Mode | 243 | Data Mode |
227 | --------- | 244 | ========= |
228 | There are 3 different data modes: | 245 | There are 3 different data modes: |
229 | 246 | ||
230 | * writeback mode | 247 | * writeback mode |
@@ -256,7 +273,8 @@ kernel source: <file:fs/ext4/> | |||
256 | <file:fs/jbd2/> | 273 | <file:fs/jbd2/> |
257 | 274 | ||
258 | programs: http://e2fsprogs.sourceforge.net/ | 275 | programs: http://e2fsprogs.sourceforge.net/ |
259 | http://ext2resize.sourceforge.net | ||
260 | 276 | ||
261 | useful links: http://fedoraproject.org/wiki/ext3-devel | 277 | useful links: http://fedoraproject.org/wiki/ext3-devel |
262 | http://www.bullopensource.org/ext4/ | 278 | http://www.bullopensource.org/ext4/ |
279 | http://ext4.wiki.kernel.org/index.php/Main_Page | ||
280 | http://fedoraproject.org/wiki/Features/Ext4 | ||