aboutsummaryrefslogtreecommitdiffstats
path: root/fs/ext4
Commit message (Collapse)AuthorAge
...
* ext4: Use page_mkwrite vma_operations to get mmap write notification.Aneesh Kumar K.V2008-07-11
| | | | | | | | | | | | | | We would like to get notified when we are doing a write on mmap section. This is needed with respect to preallocated area. We split the preallocated area into initialzed extent and uninitialzed extent in the call back. This let us handle ENOSPC better. Otherwise we get ENOSPC in the writepage and that would result in data loss. The changes are also needed to handle ENOSPC when writing to an mmap section of files with holes. Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fix online resize with mballocFrederic Bohe2008-07-11
| | | | | | | | | | Update group infos when updating a group's descriptor. Add group infos when adding a group's descriptor. Refresh cache pages used by mb_alloc when changes occur. This will probably need modifications when META_BG resizing will be allowed. Signed-off-by: Frederic Bohe <frederic.bohe@bull.net> Signed-off-by: Mingming Cao <cmm@us.ibm.com>
* ext4: use atomic functions to set bh_stateEric Sandeen2008-07-11
| | | | | | | | | | Use the BUFFER_FNS functions (set_buffer_foo) to set buffer head state atomically instead of nonatomic __set_bit(). Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Set journal pointer to NULL when journal is releasedJan Kara2008-07-11
| | | | | | | | | | | Set sbi->s_journal to NULL after we call journal_destroy(). This will be later needed because after journal_destroy() is called, ext4_clear_inode() can still be called for some inodes (e.g. root inode) and we'll need to detect there that journal doesn't exists anymore. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: mballoc avoid use root reserved blocks for non root allocationMingming Cao2008-07-11
| | | | | | | | | mballoc allocation missed check for blocks reserved for root users. Add ext4_has_free_blocks() check before allocation. Also modified ext4_has_free_blocks() to support multiple block allocation request. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: call blkdev_issue_flush on fsyncEric Sandeen2008-07-11
| | | | | | | | | | | | To ensure that bits are truly on-disk after an fsync, we should call blkdev_issue_flush if barriers are supported. Inspired by an old thread on barriers, by reiserfs & xfs which do the same, and by a patch SuSE ships with their kernel Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: cleanup block allocatorAneesh Kumar K.V2008-07-11
| | | | | | | | | Move the code related to block allocation to a single function and add helper funtions to differient allocation for data and meta data blocks Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Use inode preallocation with -o noextentsAneesh Kumar K.V2008-07-11
| | | | | | | | | | | | | | | | | | | When mballoc is enabled, block allocation for old block-based files are allocated using mballoc allocator instead of old block-based allocator. The old ext3 block reservation is turned off when mballoc is turned on. However, the in-core preallocation is not enabled for block-based/ non-extent based file block allocation. This result in performance regression, as now we don't have "reservation" ore in-core preallocation to prevent interleaved fragmentation in multiple writes workload. This patch fix this by enable per inode in-core preallocation for non extent files when mballoc is used. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fix ext4_init_block_bitmap() for metablock block groupAkinobu Mita2008-07-11
| | | | | | | | | | | | | | | When meta_bg feature is enabled and s_first_meta_bg != 0, ext4_init_block_bitmap() miscalculates the number of block used by the group descriptor table (0 or 1 for metablock block group) This patch fixes this by using ext4_bg_num_gdb() Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Stephen Tweedie <sct@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Andreas Dilger <adilger@sun.com>
* ext4: Fix sparse warningAneesh Kumar K.V2008-07-11
| | | | | Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: cleanup never-used magic numbers from htree codeLi Zefan2008-07-11
| | | | | | | | | | | dx_root_limit() will had some dead code which forced it to always return 20, and dx_node_limit to always return 22 for debugging purposes. Remove it. Acked-by: Andreas Dilger <adilger@sun.com> Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Fix ext4_ext_journal_restart() to reflect errors up to the callerShen Feng2008-07-11
| | | | | | | | | | | Fix ext4_ext_journal_restart() so it returns any errors reported by ext4_journal_extend() and ext4_journal_restart(). Signed-off-by: Shen Feng <shen@cn.fujitsu.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Make ext4_ext_find_extent fills ext_path completelyShen Feng2008-07-11
| | | | | | | | | When pos=0 or depth, the fields of ext4_ext_path is are not completely filled. This patch also removes some unnecessary code. Signed-off-by: Shen Feng <shen@cn.fujitsu.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: return error when calling ext4_ext_split failedShen Feng2008-07-11
| | | | | | | | | | ext4_ext_create_new_leaf must return error when its calling to ext4_ext_split failed. Signed-off-by: Shen Feng <shen@cn.fujitsu.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Update i_disksize properly when allocating from fallocate area.Aneesh Kumar K.V2008-07-11
| | | | | | | | | | | | | | When allocating unitialized space at the end of file which had been preallocated with the FALLOC_FL_KEEP_SIZE option, the file size is not updated at that time. But the later we are not updating the file size when writing to that preallocated space. These changes are for code correctness. This patch allows us to update the i_disksize at the write_end() callback of filesystem properly. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: remove quota allocation when ext4_mb_new_blocks failsShen Feng2008-07-11
| | | | | | | | | | Quota allocation is not removed when ext4_mb_new_blocks calls kmem_cache_alloc failed. Also make sure the allocation context is freed on the error path. Signed-off-by: Shen Feng <shen@cn.fujitsu.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: remove redundant code in ext4_fill_super()Li Zefan2008-07-11
| | | | | | | | The previous sb_min_blocksize() has already set the block size. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: New inode allocation for FLEX_BG meta-data groups.Jose R. Santos2008-07-11
| | | | | | | | | | | | | | | | | | | | | | | This patch mostly controls the way inode are allocated in order to make ialloc aware of flex_bg block group grouping. It achieves this by bypassing the Orlov allocator when block group meta-data are packed toghether through mke2fs. Since the impact on the block allocator is minimal, this patch should have little or no effect on other block allocation algorithms. By controlling the inode allocation, it can basically control where the initial search for new block begins and thus indirectly manipulate the block allocator. This allocator favors data and meta-data locality so the disk will gradually be filled from block group zero upward. This helps improve performance by reducing seek time. Since the group of inode tables within one flex_bg are treated as one giant inode table, uninitialized block groups would not need to partially initialize as many inode table as with Orlov which would help fsck time as the filesystem usage goes up. Signed-off-by: Jose R. Santos <jrs@us.ibm.com> Signed-off-by: Valerie Clement <valerie.clement@bull.net> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: replace __FUNCTION__ occurrencesStoyan Gaydarov2008-07-13
| | | | | | | | | | | __FUNCTION__ is gcc-specific, use __func__ instead Signed-off-by: Stoyan Gaydarov <stoyboyker@gmail.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* ext4: fix error processing in mb_free_blocksShen Feng2008-07-13
| | | | | | | | | | | | | | | | | | The error processing of the return value of mb_free_blocks is meanless because it only returns 0. This fix includes - make mb_free_blocks return void - remove the error processing part in callers - unlock group before calling ext4_error in mb_free_blocks Signed-off-by: Shen Feng <shen@cn.fujitsu.com> Cc: Mingming Cao <cmm@us.ibm.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* ext4: error proc entry creation when the fs/ext4 is not correctly createdShen Feng2008-07-13
| | | | | | | | | | When the directory fs/ext4 is not correctly created under proc, the entry under this directory should not be created. Signed-off-by: Shen Feng <shen@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* ext4: fix build failure if DX_DEBUG is enabledLi Zefan2008-07-11
| | | | | | | | | ext4_next_entry() is used by the debugging function dx_show_leaf(), so it must be defined before that function. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Remove unused variable from ext4_show_optionsTheodore Ts'o2008-07-11
| | | | Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Rename read_block_bitmap() to ext4_read_block_bitmap()Theodore Ts'o2008-07-11
| | | | | | | Since this a non-static function, make it be ext4 specific to avoid conflicts with potentially other filesystems. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: remove double definitions of xattr macrosShen Feng2008-07-11
| | | | | | | | | remove the definitions of macros XATTR_TRUSTED_PREFIX and XATTR_USER_PREFIX since they are defined in linux/xattr.h Signed-off-by: Shen Feng <shen@cn.fujitsu.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: miscellaneous error checks and coding cleanups for mballocShen Feng2008-07-11
| | | | | | | | | | | | | | | | ext4_mb_seq_history_open(): check if sbi->s_mb_history is NULL ext4_mb_history_init(): replace kmalloc and memset with kzalloc ext4_mb_init_backend(): remove memset since kzalloc is used ext4_mb_init(): the return value of ext4_mb_init_backend is int, but i is unsigned, replace it with a new int variable. Signed-off-by: Shen Feng <shen@cn.fujitsu.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: add error processing when calling ext4_mb_init_cache in mballocShen Feng2008-07-11
| | | | | | | | | | Add error processing for ext4_mb_load_buddy when it calls ext4_mb_init_cache. Signed-off-by: Shen Feng <shen@cn.fujitsu.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Fix ext4_mb_init_cache return errorMingming Cao2008-07-11
| | | | | | | | | ext4_mb_init_cache() incorrectly always return EIO on success. This causes the caller of ext4_mb_init_cache() fail when it checks the return value. Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: improve some code in rb tree part of dir.cShen Feng2008-07-11
| | | | | | | | | | | | | | | | * remove unnecessary code in free_rb_tree_fname * rename free_rb_tree_fname to ext4_htree_create_dir_info since it and ext4_htree_free_dir_info are a pair * replace kmalloc with kzalloc in ext4_htree_free_dir_info All these make the code more readable and simple. PS: this patch is also suitable for ext3. Signed-off-by: Shen Feng <shen@cn.fujitsu.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: switch to seq_filesAlexey Dobriyan2008-07-11
| | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* ext4: Use BUG_ON() instead of BUG()Julia Lawall2008-07-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | if (...) BUG(); should be replaced with BUG_ON(...) when the test has no side-effects to allow a definition of BUG_ON that drops the code completely. The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/) // <smpl> @ disable unlikely @ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } | - if (unlikely(E)) { BUG(); } + BUG_ON(E); ) @@ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } | - if (E) { BUG(); } + BUG_ON(E); ) // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: start searching for the right extent from the goal group.Aneesh Kumar K.V2008-07-11
| | | | | | | | | | With mballoc we search for the best extent using different criteria. We should always use the goal group when we are starting with a new criteria. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fix comments to say "ext4"Shen Feng2008-07-11
| | | | | | | | Change second/third to fourth. Signed-off-by: Shen Feng <shen@cn.fujitsu.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Fix mb_find_next_bit not to return larger than maxAneesh Kumar K.V2008-07-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some architectures implement ext4_find_next_bit and ext4_find_next_zero_bit in such a way that they return greater than max for some input values. Make sure mb_find_next_bit and mb_find_next_zero_bit return the right values. On 2.6.25 we have include/asm-x86/bitops_32.h static inline unsigned find_first_bit(const unsigned long *addr, unsigned size) { unsigned x = 0; while (x < size) { unsigned long val = *addr++; if (val) return __ffs(val) + x; x += (sizeof(*addr)<<3); } return x; } This can return value greater than size. Reported and fixed here for lustre https://bugzilla.lustre.org/show_bug.cgi?id=15932 https://bugzilla.lustre.org/attachment.cgi?id=17205 Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: validate directory entry data before useDuane Griffin2008-07-11
| | | | | | | | | | | | | | | | | | ext4_dx_find_entry uses ext4_next_entry without verifying that the entry is valid. If its rec_len == 0 this causes an infinite loop. Refactor the loop to check the validity of entries before checking whether they match and moving onto the next one. There are other uses of ext4_next_entry in this file which also look problematic. They should be reviewed and fixed if/when we have a test-case that triggers them. This patch fixes the first case (image hdb.25.softlockup.gz) reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882. Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* ext4: handle deleting corrupted indirect blocksDuane Griffin2008-07-11
| | | | | | | | | | | | | | | | While freeing indirect blocks we attach a journal head to the parent buffer head, free the blocks, then journal the parent. If the indirect block list is corrupted and points to the parent the journal head will be detached when the block is cleared, causing an OOPS. Check for that explicitly and handle it gracefully. This patch fixes the third case (image hdb.20000057.nullderef.gz) reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882. Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* ext4: handle corrupted orphan list at mountDuane Griffin2008-07-11
| | | | | | | | | | | | | | If the orphan node list includes valid, untruncatable nodes with nlink > 0 the ext4_orphan_cleanup loop which attempts to delete them will not do so, causing it to loop forever. Fix by checking for such nodes in the ext4_orphan_get function. This patch fixes the second case (image hdb.20000009.softlockup.gz) reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882. Signed-off-by: Duane Griffin <duaneg@dghda.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
* ext4: add missing unlock to an error path in ext4_quota_write()Jan Kara2008-07-04
| | | | | | | | | When write in ext4_quota_write() fails, we have to properly release i_mutex. One error path has been missing the unlock... Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Ext4: Fix online resize block group descriptor corruptionFrederic Bohe2008-06-20
| | | | | | | | | | | This is the patch for the group descriptor table corruption during online resize pointed out by Theodore Tso. The problem was caused by the fact that the ext4 group descriptor can be either 32 or 64 bytes long. Only the 64 bytes structure was taken into account. Signed-off-by: Frederic Bohe <frederic.bohe@bull.net> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: enable barriers by defaultEric Sandeen2008-05-26
| | | | | | | | | | | | | | | | | | I can't think of any valid reason for ext4 to not use barriers when they are available; I believe this is necessary for filesystem integrity in the face of a volatile write cache on storage. An administrator who trusts that the cache is sufficiently battery- backed (and power supplies are sufficiently redundant, etc...) can always turn it back off again. SuSE has carried such a patch for ext3 for quite some time now. Also document the mount option while we're at it. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Display the journal_async_commit mount option in /proc/mountsTheodore Ts'o2008-05-26
| | | | | | Cc: Andreas Dilger <adilger@clusterfs.com> Cc: Girish Shilamkar <girish@clusterfs.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* jbd2: If a journal checksum error is detected, propagate the error to ext4Theodore Ts'o2008-06-06
| | | | | | | | | | If a journal checksum error is detected, the ext4 filesystem will call ext4_error(), and the mount will either continue, become a read-only mount, or cause a kernel panic based on the superblock flags indicating the user's preference of what to do in case of filesystem corruption being detected. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fix online resize bugJosef Bacik2008-06-06
| | | | | | | | | | | | There is a bug when we are trying to verify that the reserve inode's double indirect blocks point back to the primary gdt blocks. The fix is obvious, we need to mod the gdb count by the addr's per block. This was verified using the same testcase as with the ext3 equivalent of this patch. Signed-off-by: Josef Bacik <jbacik@redhat.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Fix uninit block group initialization with FLEX_BGJose R. Santos2008-06-03
| | | | | | | | | | | | | | With FLEX_BG block bitmaps, inode bitmaps and inode tables _MAY_ be allocated outside the group. So, when initializing an uninitialized block bitmap, we need to check the location of this blocks before setting the corresponding bits in the block bitmap of the newly initialized group. Also return the right number of free blocks when counting the available free blocks in uninit group. Tested-by: Aneesh Kumar K.V <aneesh.kumar@inux.vnet.ibm.com> Signed-off-by: Jose R. Santos <jrs@us.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Fix use of uninitialized data with debug enabled.Aneesh Kumar K.V2008-06-05
| | | | | | | | | Fix use of uninitialized data with debug enabled. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Retry block allocation if new blocks are allocated from system zone.Aneesh Kumar K.V2008-05-15
| | | | | | | | | | | | | | If the block allocator gets blocks out of system zone ext4 calls ext4_error. But if the file system is mounted with errors=continue retry block allocation. We need to mark the system zone blocks as in use to make sure retry don't pick them again System zone is the block range mapping block bitmap, inode bitmap and inode table. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: mballoc fix mb_normalize_request algorithm for 1KB block size filesystemsValerie Clement2008-05-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In case of inode preallocation, the number of blocks to allocate depends on the file size and it is calculated in ext4_mb_normalize_request(). Each group in the filesystem is then checked to find one that can be used for allocation; this is done in ext4_mb_good_group(). When a file bigger than 4MB is created, the requested number of blocks to preallocate, calculated by ext4_mb_normalize_request is 4096. However for a filesystem with 1KB block size, the maximum size of the block buddies used by the multiblock allocator is 2048, so none of groups in the filesystem satisfies the search criteria in ext4_mb_good_group(). Scanning all the filesystem groups impacts performance. This was demonstrated by using a freshly created, 70GB, 1k block filesystem, with caches dropped write before the test via /proc/sys/vm/drop_caches, and with the filesystem mounted with nodelalloc and nodealloc,nomballoc. The time to write an 8 megabyte file using "dd if=/dev/zero of=/mnt/test/fo bs=8k count=1k conv=fsync" took 35.5091 seconds (236kB/s) with nodellaloc, and 0.233754 seconds (35.9 MB/s) with the nodelloc,nomballoc options. With a 1TB partition, it took several minutes to write 8MB! This patch modifies the algorithm in ext4_mb_normalize_group_request to calculate the number of blocks to allocate by taking into account the maximum size of free blocks chunks handled by the multiblock allocator. It has also been tested for filesystems with 2KB and 4KB block sizes to ensure that those cases don't regress. Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Valerie Clement <valerie.clement@bull.net> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fix typos in messages and comments (journalled -> journaled)Jan Kara2008-05-13
| | | | | | | | Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: fix synchronization of quota files in journal=data modeJan Kara2008-05-13
| | | | | | | | | | | | | In journal=data mode, it is not enough to do write_inode_now as done in vfs_quota_on() to write all data to their final location (which is needed for quota_read to work correctly). Calling journal_flush() does its job. Cc: <linux-ext4@vger.kernel.org> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
* ext4: Fix mount messages when quota disabledJan Kara2008-05-13
| | | | | | | | | | When quota is disabled, we should not print 'journaled quota not supported' when user tried to mount non-journaled quota. Also fix typo in the message. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Mingming Cao <cmm@us.ibm.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>