author | Lukas Czerner <lczerner@redhat.com> | 2011-02-21 20:16:21 -0500 |
---|---|---|
committer | Theodore Ts'o <tytso@mit.edu> | 2011-02-21 20:16:21 -0500 |
commit | 6f9524e9e118929f1de02840dffe858f99685aea (patch) | |
tree | 6d0b62df535a97299317af1038e4afacc3ed4e3c /Documentation | |
parent | 3abb17e82f08628b59e20d8cbcb55e2204180f69 (diff) |
ext4: update ext4 documentation
Add documentation for mount options and ioctls to
Documentation/filesystems/ext4.txt, which has not been updated for some
time. Also add documentation for ext4 sysfs tunables to the
Documentation/ABI/testing/sysfs-fs-ext4 file, and fix a few
typographical errors in that file.
https://bugzilla.kernel.org/show_bug.cgi?id=9423
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/ABI/testing/sysfs-fs-ext4 | 13 | ||||
-rw-r--r-- | Documentation/filesystems/ext4.txt | 207 |
2 files changed, 216 insertions, 4 deletions
diff --git a/Documentation/ABI/testing/sysfs-fs-ext4 b/Documentation/ABI/testing/sysfs-fs-ext4
index 5fb709997d96..f22ac0872ae8 100644
--- a/Documentation/ABI/testing/sysfs-fs-ext4
+++ b/Documentation/ABI/testing/sysfs-fs-ext4
@@ -48,7 +48,7 @@ Description: | |||
48 | will have its blocks allocated out of its own unique | 48 | will have its blocks allocated out of its own unique |
49 | preallocation pool. | 49 | preallocation pool. |
50 | 50 | ||
51 | What: /sys/fs/ext4/<disk>/inode_readahead | 51 | What: /sys/fs/ext4/<disk>/inode_readahead_blks |
52 | Date: March 2008 | 52 | Date: March 2008 |
53 | Contact: "Theodore Ts'o" <tytso@mit.edu> | 53 | Contact: "Theodore Ts'o" <tytso@mit.edu> |
54 | Description: | 54 | Description: |
@@ -85,7 +85,14 @@ Date: June 2008 | |||
85 | Contact: "Theodore Ts'o" <tytso@mit.edu> | 85 | Contact: "Theodore Ts'o" <tytso@mit.edu> |
86 | Description: | 86 | Description: |
87 | Tuning parameter which (if non-zero) controls the goal | 87 | Tuning parameter which (if non-zero) controls the goal |
88 | inode used by the inode allocator in p0reference to | 88 | inode used by the inode allocator in preference to |
89 | all other allocation hueristics. This is intended for | 89 | all other allocation heuristics. This is intended for |
90 | debugging use only, and should be 0 on production | 90 | debugging use only, and should be 0 on production |
91 | systems. | 91 | systems. |
92 | |||
93 | What: /sys/fs/ext4/<disk>/max_writeback_mb_bump | ||
94 | Date: September 2009 | ||
95 | Contact: "Theodore Ts'o" <tytso@mit.edu> | ||
96 | Description: | ||
97 | The maximum number of megabytes the writeback code will | ||
98 | try to write out before moving on to another inode. | ||
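
The tunables documented in the ABI file above are ordinary files under /sys/fs/ext4/<disk>/, so they can be inspected with plain file reads. The following is a minimal sketch, not part of the commit: the function name read_ext4_tunables and its sysfs_root parameter are hypothetical, and it assumes each attribute holds a single short text value.

```python
from pathlib import Path


def read_ext4_tunables(devname, sysfs_root="/sys/fs/ext4"):
    """Return {attribute name: text value} for one per-device ext4 directory.

    devname is the device name used by the kernel (for example "dm-0").
    sysfs_root is a hypothetical parameter so the function can be pointed
    at a different directory for testing.
    """
    values = {}
    dev_dir = Path(sysfs_root) / devname
    for entry in sorted(dev_dir.iterdir()):
        if not entry.is_file():
            continue
        try:
            values[entry.name] = entry.read_text().strip()
        except OSError:
            # Some attributes may be unreadable without privileges.
            values[entry.name] = None
    return values
```

Calling read_ext4_tunables("dm-0") on a system with an ext4 filesystem on dm-0 would be expected to return entries such as inode_readahead_blks and max_writeback_mb_bump; the exact set depends on the kernel version.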
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt
index 6ab9442d7eeb..6b050464a90d 100644
--- a/Documentation/filesystems/ext4.txt
+++ b/Documentation/filesystems/ext4.txt
@@ -367,12 +367,47 @@ init_itable=n The lazy itable init code will wait n times the | |||
367 | minimizes the impact on the system performance | 367 | minimizes the impact on the system performance |
368 | while file system's inode table is being initialized. | 368 | while file system's inode table is being initialized. |
369 | 369 | ||
370 | discard Controls whether ext4 should issue discard/TRIM | 370 | discard Controls whether ext4 should issue discard/TRIM |
371 | nodiscard(*) commands to the underlying block device when | 371 | nodiscard(*) commands to the underlying block device when |
372 | blocks are freed. This is useful for SSD devices | 372 | blocks are freed. This is useful for SSD devices |
373 | and sparse/thinly-provisioned LUNs, but it is off | 373 | and sparse/thinly-provisioned LUNs, but it is off |
374 | by default until sufficient testing has been done. | 374 | by default until sufficient testing has been done. |
375 | 375 | ||
376 | nouid32 Disables 32-bit UIDs and GIDs. This is for | ||
377 | interoperability with older kernels which only | ||
378 | store and expect 16-bit values. | ||
379 | |||
380 | resize Allows the filesystem to be resized to the end of the | ||
381 | last existing block group; further resizing has to be | ||
382 | done with resize2fs, either online or offline. It can | ||
383 | be used only in conjunction with remount. | ||
384 | |||
385 | block_validity This option enables/disables the in-kernel | ||
386 | noblock_validity facility for tracking filesystem metadata blocks | ||
387 | within internal data structures. This allows the | ||
388 | multiblock allocator and other routines to quickly locate | ||
389 | extents which might overlap with filesystem metadata | ||
390 | blocks. This option is intended for debugging | ||
391 | purposes and, since it negatively affects | ||
392 | performance, it is off by default. | ||
393 | |||
394 | dioread_lock Controls whether or not ext4 should use the DIO read | ||
395 | dioread_nolock locking. If the dioread_nolock option is specified, | ||
396 | ext4 will allocate an uninitialized extent before a buffered | ||
397 | write and convert the extent to initialized after IO | ||
398 | completes. This approach allows ext4 code to avoid | ||
399 | using the inode mutex, which improves scalability on | ||
400 | high-speed storage. However, this does not work with the | ||
401 | nobh option, and the mount will fail. Nor does it work with | ||
402 | data journaling, in which case the dioread_nolock option | ||
403 | will be ignored with a kernel warning. Note that the dioread_nolock | ||
404 | code path is only used for extent-based files. | ||
405 | Because of the restrictions this option imposes, | ||
406 | it is off by default (i.e. dioread_lock). | ||
407 | |||
408 | i_version Enable 64-bit inode version support. This option is | ||
409 | off by default. | ||
410 | |||
376 | Data Mode | 411 | Data Mode |
377 | ========= | 412 | ========= |
378 | There are 3 different data modes: | 413 | There are 3 different data modes: |
@@ -400,6 +435,176 @@ needs to be read from and written to disk at the same time where it | |||
400 | outperforms all other modes. Currently ext4 does not have delayed | 435 | outperforms all other modes. Currently ext4 does not have delayed |
401 | allocation support if this data journalling mode is selected. | 436 | allocation support if this data journalling mode is selected. |
402 | 437 | ||
438 | /proc entries | ||
439 | ============= | ||
440 | |||
441 | Information about mounted ext4 file systems can be found in | ||
442 | /proc/fs/ext4. Each mounted filesystem will have a directory in | ||
443 | /proc/fs/ext4 based on its device name (e.g., /proc/fs/ext4/hdc or | ||
444 | /proc/fs/ext4/dm-0). The files in each per-device directory are shown | ||
445 | in the table below. | ||
446 | |||
447 | Files in /proc/fs/ext4/<devname> | ||
448 | .............................................................................. | ||
449 | File Content | ||
450 | mb_groups details of multiblock allocator buddy cache of free blocks | ||
451 | .............................................................................. | ||
452 | |||
453 | /sys entries | ||
454 | ============ | ||
455 | |||
456 | Information about mounted ext4 file systems can be found in | ||
457 | /sys/fs/ext4. Each mounted filesystem will have a directory in | ||
458 | /sys/fs/ext4 based on its device name (e.g., /sys/fs/ext4/hdc or | ||
459 | /sys/fs/ext4/dm-0). The files in each per-device directory are shown | ||
460 | in the table below. | ||
461 | |||
462 | Files in /sys/fs/ext4/<devname> | ||
463 | (see also Documentation/ABI/testing/sysfs-fs-ext4) | ||
464 | .............................................................................. | ||
465 | File Content | ||
466 | |||
467 | delayed_allocation_blocks This file is read-only and shows the number of | ||
468 | blocks that are dirty in the page cache, but | ||
469 | which do not have their location in the | ||
470 | filesystem allocated yet. | ||
471 | |||
472 | inode_goal Tuning parameter which (if non-zero) controls | ||
473 | the goal inode used by the inode allocator in | ||
474 | preference to all other allocation heuristics. | ||
475 | This is intended for debugging use only, and | ||
476 | should be 0 on production systems. | ||
477 | |||
478 | inode_readahead_blks Tuning parameter which controls the maximum | ||
479 | number of inode table blocks that ext4's inode | ||
480 | table readahead algorithm will pre-read into | ||
481 | the buffer cache | ||
482 | |||
483 | lifetime_write_kbytes This file is read-only and shows the number of | ||
484 | kilobytes of data that have been written to this | ||
485 | filesystem since it was created. | ||
486 | |||
487 | max_writeback_mb_bump The maximum number of megabytes the writeback | ||
488 | code will try to write out before moving on to | ||
489 | another inode. | ||
490 | |||
491 | mb_group_prealloc The multiblock allocator will round up allocation | ||
492 | requests to a multiple of this tuning parameter if | ||
493 | the stripe size is not set in the ext4 superblock | ||
494 | |||
495 | mb_max_to_scan The maximum number of extents the multiblock | ||
496 | allocator will search to find the best extent | ||
497 | |||
498 | mb_min_to_scan The minimum number of extents the multiblock | ||
499 | allocator will search to find the best extent | ||
500 | |||
501 | mb_order2_req Tuning parameter which controls the minimum size | ||
502 | for requests (as a power of 2) where the buddy | ||
503 | cache is used | ||
504 | |||
505 | mb_stats Controls whether the multiblock allocator should | ||
506 | collect statistics, which are shown during the | ||
507 | unmount. 1 means to collect statistics, 0 means | ||
508 | not to collect statistics | ||
509 | |||
510 | mb_stream_req Files which have fewer blocks than this tunable | ||
511 | parameter will have their blocks allocated out | ||
512 | of a block group specific preallocation pool, so | ||
513 | that small files are packed closely together. | ||
514 | Each large file will have its blocks allocated | ||
515 | out of its own unique preallocation pool. | ||
516 | |||
517 | session_write_kbytes This file is read-only and shows the number of | ||
518 | kilobytes of data that have been written to this | ||
519 | filesystem since it was mounted. | ||
520 | .............................................................................. | ||
521 | |||
522 | Ioctls | ||
523 | ====== | ||
524 | |||
525 | There is some Ext4 specific functionality which can be accessed by applications | ||
526 | through the system call interfaces. The list of all Ext4 specific ioctls is | ||
527 | shown in the table below. | ||
528 | |||
529 | Table of Ext4 specific ioctls | ||
530 | .............................................................................. | ||
531 | Ioctl Description | ||
532 | EXT4_IOC_GETFLAGS Get additional attributes associated with inode. | ||
533 | The ioctl argument is an integer bitfield, with | ||
534 | bit values described in ext4.h. This ioctl is an | ||
535 | alias for FS_IOC_GETFLAGS. | ||
536 | |||
537 | EXT4_IOC_SETFLAGS Set additional attributes associated with inode. | ||
538 | The ioctl argument is an integer bitfield, with | ||
539 | bit values described in ext4.h. This ioctl is an | ||
540 | alias for FS_IOC_SETFLAGS. | ||
541 | |||
542 | EXT4_IOC_GETVERSION | ||
543 | EXT4_IOC_GETVERSION_OLD | ||
544 | Get the inode i_generation number stored for | ||
545 | each inode. The i_generation number is normally | ||
546 | changed only when a new inode is created and it is | ||
547 | particularly useful for network filesystems. The | ||
548 | '_OLD' version of this ioctl is an alias for | ||
549 | FS_IOC_GETVERSION. | ||
550 | |||
551 | EXT4_IOC_SETVERSION | ||
552 | EXT4_IOC_SETVERSION_OLD | ||
553 | Set the inode i_generation number stored for | ||
554 | each inode. The '_OLD' version of this ioctl | ||
555 | is an alias for FS_IOC_SETVERSION. | ||
556 | |||
557 | EXT4_IOC_GROUP_EXTEND This ioctl has the same purpose as the resize | ||
558 | mount option. It allows the filesystem to be resized | ||
559 | to the end of the last existing block group; | ||
560 | further resizing has to be done with resize2fs, | ||
561 | either online or offline. The argument points | ||
562 | to the unsigned long number representing the | ||
563 | filesystem's new block count. | ||
564 | |||
565 | EXT4_IOC_MOVE_EXT Move the block extents from orig_fd (the one | ||
566 | this ioctl is pointing to) to the donor_fd (the | ||
567 | one specified in move_extent structure passed | ||
568 | as an argument to this ioctl). Then, exchange | ||
569 | inode metadata between orig_fd and donor_fd. | ||
570 | This is especially useful for online | ||
571 | defragmentation, because the allocator has the | ||
572 | opportunity to allocate moved blocks better, | ||
573 | ideally into one contiguous extent. | ||
574 | |||
575 | EXT4_IOC_GROUP_ADD Add a new group descriptor to an existing or | ||
576 | new group descriptor block. The new group | ||
577 | descriptor is described by ext4_new_group_input | ||
578 | structure, which is passed as an argument to | ||
579 | this ioctl. This is especially useful in | ||
580 | conjunction with EXT4_IOC_GROUP_EXTEND, | ||
581 | which allows online resize of the filesystem | ||
582 | to the end of the last existing block group. | ||
583 | These two ioctls combined are used by userspace | ||
584 | online resize tools (e.g. resize2fs). | ||
585 | |||
586 | EXT4_IOC_MIGRATE This ioctl operates on the filesystem itself. | ||
587 | It converts (migrates) an ext3 indirect block mapped | ||
588 | inode to an ext4 extent mapped inode by walking | ||
589 | through the indirect block mapping of the original | ||
590 | inode and converting contiguous block ranges | ||
591 | into ext4 extents of a temporary inode. Then, the | ||
592 | inodes are swapped. This ioctl might help when | ||
593 | migrating from ext3 to ext4; however, the | ||
594 | suggestion is to create a fresh ext4 filesystem | ||
595 | and copy data from the backup. Note that the | ||
596 | filesystem has to support extents for this ioctl | ||
597 | to work. | ||
598 | |||
599 | EXT4_IOC_ALLOC_DA_BLKS Force all of the delayed allocation blocks to be | ||
600 | allocated to preserve application-expected ext3 | ||
601 | behaviour. Note that this will also start | ||
602 | triggering a write of the data blocks, but this | ||
603 | behaviour may change in the future as it is | ||
604 | not necessary and has been done this way only | ||
605 | for the sake of simplicity. | ||
606 | .............................................................................. | ||
607 | |||
403 | References | 608 | References |
404 | ========== | 609 | ========== |
405 | 610 | ||