| author | Lukas Czerner <lczerner@redhat.com> | 2011-02-21 20:16:21 -0500 |
|---|---|---|
| committer | Theodore Ts'o <tytso@mit.edu> | 2011-02-21 20:16:21 -0500 |
| commit | 6f9524e9e118929f1de02840dffe858f99685aea (patch) | |
| tree | 6d0b62df535a97299317af1038e4afacc3ed4e3c /Documentation | |
| parent | 3abb17e82f08628b59e20d8cbcb55e2204180f69 (diff) | |
ext4: update ext4 documentation
Add documentation for mount options and ioctls to
Documentation/filesystems/ext4.txt, which has not been updated for some
time. Also add documentation for ext4 sysfs tunables to the
Documentation/ABI/testing/sysfs-fs-ext4 file, and fix a few
typographical errors in that file.
https://bugzilla.kernel.org/show_bug.cgi?id=9423
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Diffstat (limited to 'Documentation')
| -rw-r--r-- | Documentation/ABI/testing/sysfs-fs-ext4 | 13 | ||||
| -rw-r--r-- | Documentation/filesystems/ext4.txt | 207 |
2 files changed, 216 insertions, 4 deletions
diff --git a/Documentation/ABI/testing/sysfs-fs-ext4 b/Documentation/ABI/testing/sysfs-fs-ext4 index 5fb709997d9..f22ac0872ae 100644 --- a/Documentation/ABI/testing/sysfs-fs-ext4 +++ b/Documentation/ABI/testing/sysfs-fs-ext4 | |||
| @@ -48,7 +48,7 @@ Description: | |||
| 48 | will have its blocks allocated out of its own unique | 48 | will have its blocks allocated out of its own unique |
| 49 | preallocation pool. | 49 | preallocation pool. |
| 50 | 50 | ||
| 51 | What: /sys/fs/ext4/<disk>/inode_readahead | 51 | What: /sys/fs/ext4/<disk>/inode_readahead_blks |
| 52 | Date: March 2008 | 52 | Date: March 2008 |
| 53 | Contact: "Theodore Ts'o" <tytso@mit.edu> | 53 | Contact: "Theodore Ts'o" <tytso@mit.edu> |
| 54 | Description: | 54 | Description: |
| @@ -85,7 +85,14 @@ Date: June 2008 | |||
| 85 | Contact: "Theodore Ts'o" <tytso@mit.edu> | 85 | Contact: "Theodore Ts'o" <tytso@mit.edu> |
| 86 | Description: | 86 | Description: |
| 87 | Tuning parameter which (if non-zero) controls the goal | 87 | Tuning parameter which (if non-zero) controls the goal |
| 88 | inode used by the inode allocator in p0reference to | 88 | inode used by the inode allocator in preference to |
| 89 | all other allocation hueristics. This is intended for | 89 | all other allocation heuristics. This is intended for |
| 90 | debugging use only, and should be 0 on production | 90 | debugging use only, and should be 0 on production |
| 91 | systems. | 91 | systems. |
| 92 | |||
| 93 | What: /sys/fs/ext4/<disk>/max_writeback_mb_bump | ||
| 94 | Date: September 2009 | ||
| 95 | Contact: "Theodore Ts'o" <tytso@mit.edu> | ||
| 96 | Description: | ||
| 97 | The maximum number of megabytes the writeback code will | ||
| 98 | try to write out before moving on to another inode. | ||
diff --git a/Documentation/filesystems/ext4.txt b/Documentation/filesystems/ext4.txt index 6ab9442d7ee..6b050464a90 100644 --- a/Documentation/filesystems/ext4.txt +++ b/Documentation/filesystems/ext4.txt | |||
| @@ -367,12 +367,47 @@ init_itable=n The lazy itable init code will wait n times the | |||
| 367 | minimizes the impact on the systme performance | 367 | minimizes the impact on the systme performance |
| 368 | while file system's inode table is being initialized. | 368 | while file system's inode table is being initialized. |
| 369 | 369 | ||
| 370 | discard Controls whether ext4 should issue discard/TRIM | 370 | discard Controls whether ext4 should issue discard/TRIM |
| 371 | nodiscard(*) commands to the underlying block device when | 371 | nodiscard(*) commands to the underlying block device when |
| 372 | blocks are freed. This is useful for SSD devices | 372 | blocks are freed. This is useful for SSD devices |
| 373 | and sparse/thinly-provisioned LUNs, but it is off | 373 | and sparse/thinly-provisioned LUNs, but it is off |
| 374 | by default until sufficient testing has been done. | 374 | by default until sufficient testing has been done. |
| 375 | 375 | ||
| 376 | nouid32 Disables 32-bit UIDs and GIDs. This is for | ||
| 377 | interoperability with older kernels which only | ||
| 378 | store and expect 16-bit values. | ||
| 379 | |||
| 380 | resize Allows resizing the filesystem to the end of the last | ||
| 381 | existing block group; further resizing has to be done | ||
| 382 | with resize2fs, either online or offline. It can be | ||
| 383 | used only in conjunction with remount. | ||
| 384 | |||
| 385 | block_validity This option enables or disables the in-kernel | ||
| 386 | noblock_validity facility for tracking filesystem metadata blocks | ||
| 387 | within internal data structures. This allows the multi- | ||
| 388 | block allocator and other routines to quickly locate | ||
| 389 | extents which might overlap with filesystem metadata | ||
| 390 | blocks. This option is intended for debugging | ||
| 391 | purposes and, since it negatively affects | ||
| 392 | performance, it is off by default. | ||
| 393 | |||
| 394 | dioread_lock Controls whether or not ext4 should use the DIO read | ||
| 395 | dioread_nolock locking. If the dioread_nolock option is specified, | ||
| 396 | ext4 will allocate an uninitialized extent before the | ||
| 397 | buffer write and convert the extent to initialized after | ||
| 398 | IO completes. This approach allows ext4 code to avoid | ||
| 399 | using the inode mutex, which improves scalability on | ||
| 400 | high-speed storage. However, this does not work with | ||
| 401 | the nobh option, and the mount will fail. Nor does it | ||
| 402 | work with data journaling, in which case the | ||
| 403 | dioread_nolock option is ignored with a kernel warning. | ||
| 404 | Note that the dioread_nolock code path is only used | ||
| 405 | for extent-based files. Because of these restrictions, | ||
| 406 | it is off by default (i.e. dioread_lock). | ||
| 407 | |||
| 408 | i_version Enable 64-bit inode version support. This option is | ||
| 409 | off by default. | ||
| 410 | |||
| 376 | Data Mode | 411 | Data Mode |
| 377 | ========= | 412 | ========= |
| 378 | There are 3 different data modes: | 413 | There are 3 different data modes: |
| @@ -400,6 +435,176 @@ needs to be read from and written to disk at the same time where it | |||
| 400 | outperforms all others modes. Currently ext4 does not have delayed | 435 | outperforms all others modes. Currently ext4 does not have delayed |
| 401 | allocation support if this data journalling mode is selected. | 436 | allocation support if this data journalling mode is selected. |
| 402 | 437 | ||
| 438 | /proc entries | ||
| 439 | ============= | ||
| 440 | |||
| 441 | Information about mounted ext4 file systems can be found in | ||
| 442 | /proc/fs/ext4. Each mounted filesystem will have a directory in | ||
| 443 | /proc/fs/ext4 based on its device name (e.g., /proc/fs/ext4/hdc or | ||
| 444 | /proc/fs/ext4/dm-0). The files in each per-device directory are shown | ||
| 445 | in the table below. | ||
| 446 | |||
| 447 | Files in /proc/fs/ext4/<devname> | ||
| 448 | .............................................................................. | ||
| 449 | File Content | ||
| 450 | mb_groups details of multiblock allocator buddy cache of free blocks | ||
| 451 | .............................................................................. | ||
| 452 | |||
| 453 | /sys entries | ||
| 454 | ============ | ||
| 455 | |||
| 456 | Information about mounted ext4 file systems can be found in | ||
| 457 | /sys/fs/ext4. Each mounted filesystem will have a directory in | ||
| 458 | /sys/fs/ext4 based on its device name (e.g., /sys/fs/ext4/hdc or | ||
| 459 | /sys/fs/ext4/dm-0). The files in each per-device directory are shown | ||
| 460 | in the table below. | ||
| 461 | |||
| 462 | Files in /sys/fs/ext4/<devname> | ||
| 463 | (see also Documentation/ABI/testing/sysfs-fs-ext4) | ||
| 464 | .............................................................................. | ||
| 465 | File Content | ||
| 466 | |||
| 467 | delayed_allocation_blocks This file is read-only and shows the number of | ||
| 468 | blocks that are dirty in the page cache, but | ||
| 469 | which do not have their location in the | ||
| 470 | filesystem allocated yet. | ||
| 471 | |||
| 472 | inode_goal Tuning parameter which (if non-zero) controls | ||
| 473 | the goal inode used by the inode allocator in | ||
| 474 | preference to all other allocation heuristics. | ||
| 475 | This is intended for debugging use only, and | ||
| 476 | should be 0 on production systems. | ||
| 477 | |||
| 478 | inode_readahead_blks Tuning parameter which controls the maximum | ||
| 479 | number of inode table blocks that ext4's inode | ||
| 480 | table readahead algorithm will pre-read into | ||
| 481 | the buffer cache | ||
| 482 | |||
| 483 | lifetime_write_kbytes This file is read-only and shows the number of | ||
| 484 | kilobytes of data that have been written to this | ||
| 485 | filesystem since it was created. | ||
| 486 | |||
| 487 | max_writeback_mb_bump The maximum number of megabytes the writeback | ||
| 488 | code will try to write out before moving on to | ||
| 489 | another inode. | ||
| 490 | |||
| 491 | mb_group_prealloc The multiblock allocator will round up allocation | ||
| 492 | requests to a multiple of this tuning parameter if | ||
| 493 | the stripe size is not set in the ext4 superblock | ||
| 494 | |||
| 495 | mb_max_to_scan The maximum number of extents the multiblock | ||
| 496 | allocator will search to find the best extent | ||
| 497 | |||
| 498 | mb_min_to_scan The minimum number of extents the multiblock | ||
| 499 | allocator will search to find the best extent | ||
| 500 | |||
| 501 | mb_order2_req Tuning parameter which controls the minimum size | ||
| 502 | for requests (as a power of 2) where the buddy | ||
| 503 | cache is used | ||
| 504 | |||
| 505 | mb_stats Controls whether the multiblock allocator should | ||
| 506 | collect statistics, which are shown during the | ||
| 507 | unmount. 1 means to collect statistics, 0 means | ||
| 508 | not to collect statistics | ||
| 509 | |||
| 510 | mb_stream_req Files which have fewer blocks than this tunable | ||
| 511 | parameter will have their blocks allocated out | ||
| 512 | of a block group specific preallocation pool, so | ||
| 513 | that small files are packed closely together. | ||
| 514 | Each large file will have its blocks allocated | ||
| 515 | out of its own unique preallocation pool. | ||
| 516 | |||
| 517 | session_write_kbytes This file is read-only and shows the number of | ||
| 518 | kilobytes of data that have been written to this | ||
| 519 | filesystem since it was mounted. | ||
| 520 | .............................................................................. | ||
| 521 | |||
| 522 | Ioctls | ||
| 523 | ====== | ||
| 524 | |||
| 525 | There is some Ext4 specific functionality which can be accessed by applications | ||
| 526 | through the system call interfaces. The list of all Ext4 specific ioctls is | ||
| 527 | shown in the table below. | ||
| 528 | |||
| 529 | Table of Ext4 specific ioctls | ||
| 530 | .............................................................................. | ||
| 531 | Ioctl Description | ||
| 532 | EXT4_IOC_GETFLAGS Get additional attributes associated with inode. | ||
| 533 | The ioctl argument is an integer bitfield, with | ||
| 534 | bit values described in ext4.h. This ioctl is an | ||
| 535 | alias for FS_IOC_GETFLAGS. | ||
| 536 | |||
| 537 | EXT4_IOC_SETFLAGS Set additional attributes associated with inode. | ||
| 538 | The ioctl argument is an integer bitfield, with | ||
| 539 | bit values described in ext4.h. This ioctl is an | ||
| 540 | alias for FS_IOC_SETFLAGS. | ||
| 541 | |||
| 542 | EXT4_IOC_GETVERSION | ||
| 543 | EXT4_IOC_GETVERSION_OLD | ||
| 544 | Get the inode i_generation number stored for | ||
| 545 | each inode. The i_generation number is normally | ||
| 546 | changed only when a new inode is created and it is | ||
| 547 | particularly useful for network filesystems. The | ||
| 548 | '_OLD' version of this ioctl is an alias for | ||
| 549 | FS_IOC_GETVERSION. | ||
| 550 | |||
| 551 | EXT4_IOC_SETVERSION | ||
| 552 | EXT4_IOC_SETVERSION_OLD | ||
| 553 | Set the inode i_generation number stored for | ||
| 554 | each inode. The '_OLD' version of this ioctl | ||
| 555 | is an alias for FS_IOC_SETVERSION. | ||
| 556 | |||
| 557 | EXT4_IOC_GROUP_EXTEND This ioctl has the same purpose as the resize | ||
| 558 | mount option. It allows resizing the filesystem | ||
| 559 | to the end of the last existing block group; | ||
| 560 | further resizing has to be done with resize2fs, | ||
| 561 | either online or offline. The argument points | ||
| 562 | to an unsigned long number representing the | ||
| 563 | new block count of the filesystem. | ||
| 564 | |||
| 565 | EXT4_IOC_MOVE_EXT Move the block extents from orig_fd (the one | ||
| 566 | this ioctl is pointing to) to the donor_fd (the | ||
| 567 | one specified in move_extent structure passed | ||
| 568 | as an argument to this ioctl). Then, exchange | ||
| 569 | inode metadata between orig_fd and donor_fd. | ||
| 570 | This is especially useful for online | ||
| 571 | defragmentation, because the allocator has the | ||
| 572 | opportunity to allocate moved blocks better, | ||
| 573 | ideally into one contiguous extent. | ||
| 574 | |||
| 575 | EXT4_IOC_GROUP_ADD Add a new group descriptor to an existing or | ||
| 576 | new group descriptor block. The new group | ||
| 577 | descriptor is described by ext4_new_group_input | ||
| 578 | structure, which is passed as an argument to | ||
| 579 | this ioctl. This is especially useful in | ||
| 580 | conjunction with EXT4_IOC_GROUP_EXTEND, | ||
| 581 | which allows online resize of the filesystem | ||
| 582 | to the end of the last existing block group. | ||
| 583 | Those two ioctls combined are used in userspace | ||
| 584 | online resize tools (e.g. resize2fs). | ||
| 585 | |||
| 586 | EXT4_IOC_MIGRATE This ioctl operates on the filesystem itself. | ||
| 587 | It converts (migrates) an ext3 indirect block mapped | ||
| 588 | inode to an ext4 extent mapped inode by walking | ||
| 589 | through the indirect block mapping of the original | ||
| 590 | inode and converting contiguous block ranges | ||
| 591 | into ext4 extents of a temporary inode. Then the | ||
| 592 | inodes are swapped. This ioctl might help when | ||
| 593 | migrating from an ext3 to an ext4 filesystem; however, | ||
| 594 | the suggestion is to create a fresh ext4 filesystem | ||
| 595 | and copy data from the backup. Note that the | ||
| 596 | filesystem has to support extents for this ioctl | ||
| 597 | to work. | ||
| 598 | |||
| 599 | EXT4_IOC_ALLOC_DA_BLKS Force all of the delayed allocation blocks to be | ||
| 600 | allocated to preserve application-expected ext3 | ||
| 601 | behaviour. Note that this will also start | ||
| 602 | triggering a write of the data blocks, but this | ||
| 603 | behaviour may change in the future as it is | ||
| 604 | not necessary and has been done this way only | ||
| 605 | for the sake of simplicity. | ||
| 606 | .............................................................................. | ||
| 607 | |||
| 403 | References | 608 | References |
| 404 | ========== | 609 | ========== |
| 405 | 610 | ||
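
The /sys entries table added by this patch lists one file per tunable under /sys/fs/ext4/<devname>. A minimal sketch of how a script might read them follows. The function name `read_ext4_tunables` and the `sysfs_root` parameter are illustrative assumptions, not part of the patch; the parameter exists only so the code can be exercised against a scratch directory.

```python
import os


def read_ext4_tunables(devname, sysfs_root="/sys/fs/ext4"):
    """Return {file name: value} for one per-device ext4 sysfs directory.

    Values that parse as integers are returned as int; anything else is
    returned as the stripped string. Unreadable files are skipped.
    """
    dev_dir = os.path.join(sysfs_root, devname)
    tunables = {}
    for name in sorted(os.listdir(dev_dir)):
        path = os.path.join(dev_dir, name)
        if not os.path.isfile(path):
            continue
        try:
            with open(path) as f:
                text = f.read().strip()
        except OSError:
            continue  # e.g. permission-restricted entries
        try:
            tunables[name] = int(text)
        except ValueError:
            tunables[name] = text
    return tunables
```

On a system with an ext4 filesystem on dm-0, `read_ext4_tunables("dm-0").get("inode_readahead_blks")` would return the current readahead limit, assuming that file exists on the running kernel.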

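
The ioctl table in this patch notes that EXT4_IOC_GETFLAGS is an alias for FS_IOC_GETFLAGS. The sketch below shows one way user space might issue it. The request number is rebuilt from the asm-generic `_IOR('f', 1, long)` encoding, which is an assumption: it holds on x86 and ARM, but some architectures use different direction bits. The helper names `ioc_read` and `get_inode_flags` are illustrative and do not come from the patch.

```python
import fcntl
import os
import struct

# asm-generic ioctl encoding (assumed; architecture-dependent):
# bits 30-31 direction, 16-29 argument size, 8-15 type, 0-7 number.
_IOC_READ = 2


def ioc_read(ioc_type, nr, size):
    """Build a read-direction ioctl request number, like the C _IOR() macro."""
    return (_IOC_READ << 30) | (size << 16) | (ioc_type << 8) | nr


# FS_IOC_GETFLAGS is declared as _IOR('f', 1, long) in <linux/fs.h>.
FS_IOC_GETFLAGS = ioc_read(ord("f"), 1, struct.calcsize("l"))


def get_inode_flags(path):
    """Return the inode flag bitfield for path via FS_IOC_GETFLAGS."""
    fd = os.open(path, os.O_RDONLY)
    try:
        buf = bytearray(struct.calcsize("l"))
        fcntl.ioctl(fd, FS_IOC_GETFLAGS, buf)  # kernel fills buf in place
        # Only the leading int is decoded, on the assumption that the
        # kernel stores the flags as an int despite the long-sized request.
        return struct.unpack_from("i", buf)[0]
    finally:
        os.close(fd)
```

Calling `get_inode_flags()` on a file whose filesystem does not implement this ioctl raises `OSError`, so callers should be prepared to handle that case. The returned bits are the ones the table says are described in ext4.h.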