aboutsummaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAge
...
| * | | | | | | | f2fs: fix tracking parent inode numberJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, f2fs didn't track the parent inode number correctly which is stored in each f2fs_inode. In the case of the following scenario, a bug can be occured. Let's suppose there are one directory, "/b", and two files, "/a" and "/b/a". - pino of "/a" is ROOT_INO. - pino of "/b/a" is DIR_B_INO. Then, # sync : The inode pages of "/a" and "/b/a" contain the parent inode numbers as ROOT_INO and DIR_B_INO respectively. # mv /a /b/a : The parent inode number of "/a" should be changed to DIR_B_INO, but f2fs didn't do that. Ref. f2fs_set_link(). In order to fix this clearly, I added i_pino in f2fs_inode_info, and whenever it needs to be changed like in f2fs_add_link() and f2fs_set_link(), it is updated temporarily in f2fs_inode_info. And later, f2fs_write_inode() stores the latest information to the inode pages. For power-off-recovery, f2fs_sync_file() triggers simply f2fs_write_inode(). Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: cleanup the f2fs_bio_alloc routineJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do cleanup more for better code readability. - Change the parameter set of f2fs_bio_alloc() This function should allocate a bio only since it is not something like f2fs_bio_init(). Instead, the caller should initialize the allocated bio. - Introduce SECTOR_FROM_BLOCK This macro translates a block address to its sector address. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Reviewed-by: Namjae Jeon <namjae.jeon@samsung.com>
| * | | | | | | | f2fs: introduce accessor to retrieve number of dentry slotsNamjae Jeon2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simplify code by providing the accessor macro to retrieve the number of dentry slots for a given filename length. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
| * | | | | | | | f2fs: remove redundant call to f2fs_put_page in delete entryNamjae Jeon2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since, we anyway need to put the page after deleting entry. So, there is no need to make same call under different conditions. Move out the f2fs_put_page from the two conditions and call at once. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
| * | | | | | | | f2fs: make use of GFP_F2FS_ZERO for setting gfp_maskNamjae Jeon2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since, GFP_NOFS and __GFP_ZERO is being used to set gfp_mask. We can instead make use of already predefined macro GFP_F2FS_ZERO. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
| * | | | | | | | f2fs: rewrite f2fs_bio_alloc to make it simplerNamjae Jeon2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since, GFP_NOFS(__GFP_WAIT) is used for allocation requests of bio in f2fs. So, there is no chance of returning NULL from the BIO allocation. Making the bio allocation routine for f2fs simpler. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
| * | | | | | | | f2fs: remove unused variableWei Yongjun2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The variables node_page and page_offset are initialized but never used otherwise, so remove those unused variables. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
| * | | | | | | | f2fs: move error condition for mkdir at proper placeNamjae Jeon2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In function f2fs_mkdir, err is being initialized without even checking if there was any error in new inode creation. So, instead check the inode error and make use of error/return condition. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
| * | | | | | | | f2fs: remove unneeded initializationNamjae Jeon2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No need to initialize "struct f2fs_gc_kthread *gc_th = NULL", as gc_th = NULL, will be taken care by the return values of kmalloc(). And fix codes in other places. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
| * | | | | | | | f2fs: check read only condition before beginning write outNamjae Jeon2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the filesystem is mounted as read-only then return from that point itself instead of first doing a writeout/wait and then checking for read-only condition. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
| * | | | | | | | f2fs: remove unneeded memset from init_onceNamjae Jeon2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since, __GFP_ZERO is used while f2fs inode allocation, so we do not need memset for f2fs_inode_info, as this is already zeroed out. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
| * | | | | | | | f2fs: show error in case of invalid mount argumentsNamjae Jeon2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | print the invalid argument/value from parse_options in case of mount failure. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Amit Sahrawat <a.sahrawat@samsung.com>
| * | | | | | | | f2fs: fix the compiler warning for uninitialized use of variableNamjae Jeon2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CONFIG_CC_OPTIMIZE_FOR_SIZE is enabled in the kernel, -Os optimisation flag is passed to gcc for compilation, and somehow while trying to optimize the code, compiler is might not able to see the initialisation of variable ne struct variable inside the get_node_info() function and results into following warning: fs/f2fs/node.c: In function 'get_node_info': fs/f2fs/node.c:175:3: warning: 'ne.block_addr' may be used uninitialized in this function [-Wuninitialized] fs/f2fs/node.c:265:24: note: 'ne.block_addr' was declared here fs/f2fs/node.c:176:3: warning: 'ne.ino' may be used uninitialized in this function [-Wuninitialized] fs/f2fs/node.c:265:24: note: 'ne.ino' was declared here fs/f2fs/node.c:177:3: warning: 'ne.version' may be used uninitialized in this function [-Wuninitialized] fs/f2fs/node.c:265:24: note: 'ne.version' was declared here Hence, lets initialise the ne struct variable to zero, which will remove this warning and also doing this does not seems to making any impact on the code behavior. Signed-off-by: Namjae Jeon <namjae.jeon@samsung.com> Signed-off-by: Pankaj Kumar <pankaj.km@samsung.com>
| * | | | | | | | f2fs: resolve build failuresJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There exist two build failures reported by Randy Dunlap as follows. (on i386) a. (config-r8857) ERROR: "f2fs_xattr_advise_handler" [fs/f2fs/f2fs.ko] undefined! Key configs in (config-r8857) are as follows. CONFIG_F2FS_FS=m # CONFIG_F2FS_STAT_FS is not set CONFIG_F2FS_FS_XATTR=y # CONFIG_F2FS_FS_POSIX_ACL is not set The error was occurred due to the function location that we made a mistake. Recently we added a new functionality for users to indicate cold files explicitly through xattr operations (i.e., f2fs_xattr_advise_handler). This handler should have been added in xattr.c instead of acl.c in order to avoid an undefined operation like in this case where XATTR is set and ACL is not set. b. (config-r8855) fs/f2fs/file.c: In function 'f2fs_vm_page_mkwrite': fs/f2fs/file.c:97:2: error: implicit declaration of function 'block_page_mkwrite_return' Key config in (config-r8855) is CONFIG_BLOCK. Obviously, f2fs works on top of the block device so that we should consider carefully a sort of config dependencies. The reason why this error was occurred was that f2fs_vm_page_mkwrite() calls block_page_mkwrite_return() which is enalbed only if CONFIG_BLOCK is set. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com> Acked-by: Randy Dunlap <rdunlap@xenotime.net>
| * | | | | | | | f2fs: adjust kernel coding styleJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As pointed out by Randy Dunlap, this patch removes all usage of "/**" for comment blocks. Instead, just use "/*". Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: fix endian conversion bugs reported by sparseJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch should resolve the bugs reported by the sparse tool. Initial reports were written by "kbuild test robot" managed by fengguang.wu. In my local machines, I've tested also by running: > make C=2 CF="-D__CHECK_ENDIAN__" Accordingly, I've found lots of warnings and bugs related to the endian conversion. And I've fixed all at this moment. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: remove unneeded version.h header file from f2fs.hSachin Kamat2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Including <linux/version.h> is not necessary. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
| * | | | | | | | f2fs: update Kconfig and MakefileJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds Makefile and Kconfig for f2fs, and updates Makefile and Kconfig files in the fs directory. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: move proc files to debugfsGreg Kroah-Hartman2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This moves all of the f2fs debugging files into debugfs. The files are located in /sys/kernel/debug/f2fs/ Note, I think we are generating all of the same information in each of the files for every unique f2fs filesystem in the machine. This copies the functionality that was present in the proc files, but this should be fixed up in the future. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [jaegeuk.kim@samsung.com: merged 3 debugfs entries into a *status* entry] Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: add recovery routines for roll-forwardJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds roll-forward routines to recover fsynced data. - F2FS uses basically roll-back model with checkpointing. - In order to implement fsync(), there are two approaches as follows. 1. A roll-back model with checkpointing at every fsync() : This is a naive method, but suffers from very low performance. 2. A roll-forward model : F2FS adopts this model where all the fsynced data should be recovered, which were written after checkpointing was done. In order to figure out the data, F2FS keeps a "fsync" mark in direct node blocks. In addition, F2FS remains the location of next node block in each direct node block for reconstructing the chain of node blocks during the recovery. - In order to enhance the performance, F2FS keeps a "dentry" mark also in direct node blocks. If this is set during the recovery, F2FS replays adding a dentry. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: add garbage collection functionsJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds on-demand and background cleaning functions. - The basic background cleaning policy is trying to do cleaning jobs as much as possible whenever the system is idle. Once the background cleaning is done, the cleaner sleeps an amount of time not to interfere with VFS calls. The time is dynamically adjusted according to the status of whole segments, which is decreased when the following conditions are satisfied. . GC is not conducted currently, and . IO subsystem is idle by checking the number of requets in bdev's request list, and . There are enough dirty segments. Otherwise, the time is increased incrementally until to the maximum time. Note that, min and max times are 10 secs and 30 secs by default. - F2FS adopts a default victim selection policy where background cleaning uses a cost-benefit algorithm, while on-demand cleaning uses a greedy algorithm. - The method of moving data during the cleaning is slightly different between background and on-demand cleaning schemes. In the case of background cleaning, F2FS loads the data, and marks them as dirty. Then, F2FS expects that the data will be moved by flusher or VM. In the case of on-demand cleaning, F2FS should move the data right away. - In order to identify valid blocks in a victim segment, F2FS scans the bitmap of the segment managed as an SIT entry. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: add xattr and acl functionalitiesJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements xattr and acl functionalities. - F2FS uses a node page to contain use extended attributes. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: add core directory operationsJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | this adds core functions to find, add, delete, and link dentries. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: add inode operations for special inodesJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds inode operations for directory, symlink, and special inodes. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: add core inode operationsJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds core functions to get, read, write, and evict an inode. Signed-off-by: Changman Lee <cm224.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: add address space operations for dataJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds address space operations for data. - F2FS supports readpages(), writepages(), and direct_IO(). - Because of out-of-place writes, f2fs_direct_IO() does not write data in place. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: add file operationsJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds memory operations and file/file_inode operations. - F2FS supports fallocate(), mmap(), fsync(), and basic ioctl(). Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: add segment operationsJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds specific functions not only to manage dirty/free segments, SIT pages, a cache for SIT entries, and summary entries, but also to allocate free blocks and write three types of pages: data, node, and meta. - F2FS maintains three types of bitmaps in memory, which indicate free, prefree, and dirty segments respectively. - The key information of an SIT entry consists of a segment number, the number of valid blocks in the segment, a bitmap to identify there-in valid or invalid blocks. - An SIT page is composed of a certain range of SIT entries, which is maintained by the address space of meta_inode. - To cache SIT entries, a simple array is used. The index for the array is the segment number. - A summary entry for data contains the parent node information. A summary entry for node contains its node offset from the inode. - F2FS manages information about six active logs and those summary entries in memory. Whenever one of them is changed, its summary entries are flushed to its SIT page maintained by the address space of meta_inode. - This patch adds a default block allocation function which supports heap-based allocation policy. - This patch adds core functions to write data, node, and meta pages. Since LFS basically produces a series of sequential writes, F2FS merges sequential bios with a single one as much as possible to reduce the IO scheduling overhead. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: add node operationsJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds specific functions to manage NAT pages, a cache for NAT entries, free nids, direct/indirect node blocks for indexing data, and address space for node pages. - The key information of an NAT entry consists of a node id and a block address. - An NAT page is composed of block addresses covered by a certain range of NAT entries, which is maintained by the address space of meta_inode. - A radix tree structure is used to cache NAT entries. The index for the tree is a node id. - When there is no free nid, F2FS should scan NAT entries to find new one. In order to avoid scanning frequently, F2FS manages a list containing a number of free nids in memory. Only when free nids in the list are exhausted, scanning process, build_free_nids(), is triggered. - F2FS has direct and indirect node blocks for indexing data. This patch adds fuctions related to the node block management such as getting, allocating, and truncating node blocks to index data. - In order to cache node blocks in memory, F2FS has a node_inode with an address space for node pages. This patch also adds the address space operations for node_inode. Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: add checkpoint operationsJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds functions required by the checkpoint operations. Basically, f2fs adopts a roll-back model with checkpoint blocks written in the CP area. The checkpoint procedure includes as follows. - write_checkpoint() 1. block_operations() freezes VFS calls. 2. submit cached bios. 3. flush_nat_entries() writes NAT pages updated by dirty NAT entries. 4. flush_sit_entries() writes SIT pages updated by dirty SIT entries. 5. do_checkpoint() writes, - checkpoint block (#0) - orphan inode blocks - summary blocks made by active logs - checkpoint block (copy of #0) 6. unblock_opeations() In order to provide an address space for meta pages, f2fs_sb_info has a special inode, namely meta_inode. This patch also adds the address space operations for meta_inode. Signed-off-by: Chul Lee <chur.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: add super block operationsJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the implementation of superblock operations for f2fs, which includes - init_f2fs_fs/exit_f2fs_fs - f2fs_mount - super_operations of f2fs Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
| * | | | | | | | f2fs: add superblock and major in-memory structureJaegeuk Kim2012-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the following major in-memory structures in f2fs. - f2fs_sb_info: contains f2fs-specific information, two special inode pointers for node and meta address spaces, and orphan inode management. - f2fs_inode_info: contains vfs_inode and other fs-specific information. - f2fs_nm_info: contains node manager information such as NAT entry cache, free nid list, and NAT page management. - f2fs_node_info: represents a node as node id, inode number, block address, and its version. - f2fs_sm_info: contains segment manager information such as SIT entry cache, free segment map, current active logs, dirty segment management, and segment utilization. The specific structures are sit_info, free_segmap_info, dirty_seglist_info, curseg_info. In addition, add F2FS_SUPER_MAGIC in magic.h. Signed-off-by: Chul Lee <chur.lee@samsung.com> Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
* | | | | | | | | Merge tag 'for-linus-20121219' of git://git.infradead.org/linux-mtdLinus Torvalds2012-12-19
|\ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull MTD updates from David Woodhouse: - Various cleanups especially in NAND tests - Add support for NAND flash on BCMA bus - DT support for sh_flctl and denali NAND drivers - Kill obsolete/superceded drivers (fortunet, nomadik_nand) - Fix JFFS2 locking bug in ENOMEM failure path - New SPI flash chips, as usual - Support writing in 'reliable mode' for DiskOnChip G4 - Debugfs support in nandsim * tag 'for-linus-20121219' of git://git.infradead.org/linux-mtd: (96 commits) mtd: nand: typo in nand_id_has_period() comments mtd: nand/gpio: use io{read,write}*_rep accessors mtd: block2mtd: throttle writes by calling balance_dirty_pages_ratelimited. mtd: nand: gpmi: reset BCH earlier, too, to avoid NAND startup problems mtd: nand/docg4: fix and improve read of factory bbt mtd: nand/docg4: reserve bb marker area in ecclayout mtd: nand/docg4: add support for writing in reliable mode mtd: mxc_nand: reorder part_probes to let cmdline override other sources mtd: mxc_nand: fix unbalanced clk_disable() in error path mtd: nandsim: Introduce debugfs infrastructure mtd: physmap_of: error checking to prevent a NULL pointer dereference mtg: docg3: potential divide by zero in doc_write_oob() mtd: bcm47xxnflash: writing support mtd: tests/read: initialize buffer for whole next page mtd: at91: atmel_nand: return bit flips for the PMECC read_page() mtd: fix recovery after failed write-buffer operation in cfi_cmdset_0002.c mtd: nand: onfi need to be probed in 8 bits mode mtd: nand: add NAND_BUSWIDTH_AUTO to autodetect bus width mtd: nand: print flash size during detection mted: nand_wait_ready timeout fix ...
| * | | | | | | | | jffs2: hold erase_completion_lock on exitAlexey Khoroshilov2012-11-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Users of jffs2_do_reserve_space() expect they still held erase_completion_lock after call to it. But there is a path where jffs2_do_reserve_space() leaves erase_completion_lock unlocked. The patch fixes it. Found by Linux Driver Verification project (linuxtesting.org). Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru> Cc: stable@vger.kernel.org Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
* | | | | | | | | | ceph: fix dentry reference leak in ceph_encode_fh()Cyril Roelandt2012-12-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dput() was not called in the error path. Signed-off-by: Cyril Roelandt <tipecaml@gmail.com> Cc: Sage Weil <sage@inktank.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2012-12-18
|\ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull (again) user namespace infrastructure changes from Eric Biederman: "Those bugs, those darn embarrasing bugs just want don't want to get fixed. Linus I just updated my mirror of your kernel.org tree and it appears you successfully pulled everything except the last 4 commits that fix those embarrasing bugs. When you get a chance can you please repull my branch" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: userns: Fix typo in description of the limitation of userns_install userns: Add a more complete capability subset test to commit_creds userns: Require CAP_SYS_ADMIN for most uses of setns. Fix cap_capable to only allow owners in the parent user namespace to have caps.
| * | | | | | | | | | userns: Require CAP_SYS_ADMIN for most uses of setns.Eric W. Biederman2012-12-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Andy Lutomirski <luto@amacapital.net> found a nasty little bug in the permissions of setns. With unprivileged user namespaces it became possible to create new namespaces without privilege. However the setns calls were relaxed to only require CAP_SYS_ADMIN in the user nameapce of the targed namespace. Which made the following nasty sequence possible. pid = clone(CLONE_NEWUSER | CLONE_NEWNS); if (pid == 0) { /* child */ system("mount --bind /home/me/passwd /etc/passwd"); } else if (pid != 0) { /* parent */ char path[PATH_MAX]; snprintf(path, sizeof(path), "/proc/%u/ns/mnt"); fd = open(path, O_RDONLY); setns(fd, 0); system("su -"); } Prevent this possibility by requiring CAP_SYS_ADMIN in the current user namespace when joing all but the user namespace. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
* | | | | | | | | | | Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osdLinus Torvalds2012-12-18
|\ \ \ \ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pull exofs changes from Boaz Harrosh: "These are just 3 patches, the last two are bug fixes on the error paths in exofs. The important patch is the one to osd_uld which adds sysfs info to osd devices for use by user-mode clustering discovery software. I'm already sitting on this patch since before February this year, It is important for some of the big installation cluster systems, who's been compiling their own kernel just for that patch." Ugh. The osd_uld patch already went through the SCSI tree, so this was kind of pointless. But at least it has the two small error-path fixes.. * 'for-linus' of git://git.open-osd.org/linux-open-osd: exofs: don't leak io_state and pages on read error exofs: clean up the correct page collection on write error osduld: Add osdname & systemid sysfs at scsi_osd class
| * | | | | | | | | | | exofs: don't leak io_state and pages on read errorBoaz Harrosh2012-12-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Same bug as fixed by Idan for write_exec was in read_exec. Fix the io_state leak and pages state on read error. Also while at it: The if (!pcol->read_4_write) at the error path is redundant because all goto err; are after the if (pcol->read_4_write) bale out. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
| * | | | | | | | | | | exofs: clean up the correct page collection on write errorIdan Kedar2012-12-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | if ore_write() fails, we would unlock the pages of pcol, which is now empty, rather than pcol_copy which owns the pages when ore_write() is called. this means that no pages will actually be unlocked (pcol.nr_pages == 0) and the writing process (more accurately, the syncing process) will hang waiting for a writeback notification that never comes. moreover, if ore_write() fails, pcol_free() is called for pcol, whereas pcol_copy is the object owning the ore_io_state, thus leaking the ore_io_state. [Boaz] I have simplified Idan's original patch a bit, everything else still holds Signed-off-by: Idan Kedar <idank@tonian.com> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
* | | | | | | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2012-12-18
|\ \ \ \ \ \ \ \ \ \ \ \ | | |_|_|_|/ / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs update from Chris Mason: "A big set of fixes and features. In terms of line count, most of the code comes from Stefan, who added the ability to replace a single drive in place. This is different from how btrfs normally replaces drives, and is much much much faster. Josef is plowing through our synchronous write performance. This pull request does not include the DIO_OWN_WAITING patch that was discussed on the list, but it has a number of other improvements to cut down our latencies and CPU time during fsync/O_DIRECT writes. Miao Xie has a big series of fixes and is spreading out ordered operations over more CPUs. This improves performance and reduces contention. I've put in fixes for error handling around hash collisions. These are going back to individual stable kernels as I test against them. Otherwise we have a lot of fixes and cleanups, thanks everyone! raid5/6 is being rebased against the device replacement code. I'll have it posted this Friday along with a nice series of benchmarks." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (115 commits) Btrfs: fix a bug of per-file nocow Btrfs: fix hash overflow handling Btrfs: don't take inode delalloc mutex if we're a free space inode Btrfs: fix autodefrag and umount lockup Btrfs: fix permissions of empty files not affected by umask Btrfs: put raid properties into global table Btrfs: fix BUG() in scrub when first superblock reading gives EIO Btrfs: do not call file_update_time in aio_write Btrfs: only unlock and relock if we have to Btrfs: use tokens where we can in the tree log Btrfs: optimize leaf_space_used Btrfs: don't memset new tokens Btrfs: only clear dirty on the buffer if it is marked as dirty Btrfs: move checks in set_page_dirty under DEBUG Btrfs: log changed inodes based on the extent map tree Btrfs: add path->really_keep_locks Btrfs: do not mark ems as prealloc if we are writing to them Btrfs: keep track of the extents original block length Btrfs: inline csums if we're fsyncing Btrfs: don't bother copying if we're only logging the inode ...
| * | | | | | | | | | | Btrfs: fix a bug of per-file nocowLiu Bo2012-12-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Users report a bug, the reproducer is: $ mkfs.btrfs /dev/loop0 $ mount /dev/loop0 /mnt/btrfs/ $ mkdir /mnt/btrfs/dir $ chattr +C /mnt/btrfs/dir/ $ dd if=/dev/zero of=/mnt/btrfs/dir/foo bs=4K count=10; $ lsattr /mnt/btrfs/dir/foo ---------------C- /mnt/btrfs/dir/foo $ filefrag /mnt/btrfs/dir/foo /mnt/btrfs/dir/foo: 1 extent found ---> an extent $ dd if=/dev/zero of=/mnt/btrfs/dir/foo bs=4K count=1 seek=5 conv=notrunc,nocreat; sync $ filefrag /mnt/btrfs/dir/foo /mnt/btrfs/dir/foo: 3 extents found ---> with nocow, btrfs breaks the extent into three parts The new created file should not only inherit the NODATACOW flag, but also honor NODATASUM flag, because we must do COW on a file extent with checksum. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | | | | | | | | | Btrfs: fix hash overflow handlingChris Mason2012-12-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The handling for directory crc hash overflows was fairly obscure, split_leaf returns EOVERFLOW when we try to extend the item and that is supposed to bubble up to userland. For a while it did so, but along the way we added better handling of errors and forced the FS readonly if we hit IO errors during the directory insertion. Along the way, we started testing only for EEXIST and the EOVERFLOW case was dropped. The end result is that we may force the FS readonly if we catch a directory hash bucket overflow. This fixes a few problem spots. First I add tests for EOVERFLOW in the places where we can safely just return the error up the chain. btrfs_rename is harder though, because it tries to insert the new directory item only after it has already unlinked anything the rename was going to overwrite. Rather than adding very complex logic, I added a helper to test for the hash overflow case early while it is still safe to bail out. Snapshot and subvolume creation had a similar problem, so they are using the new helper now too. Signed-off-by: Chris Mason <chris.mason@fusionio.com> Reported-by: Pascal Junod <pascal@junod.info>
| * | | | | | | | | | | Btrfs: don't take inode delalloc mutex if we're a free space inodeJosef Bacik2012-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This confuses and angers lockdep even though it's ok. We don't really need the lock for free space inodes since only the transaction committer will be reserving space. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | | | | | | | | | Btrfs: fix autodefrag and umount lockupJosef Bacik2012-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This happens because writeback_inodes_sb_nr_if_idle does down_read. This doesn't work for us and it has not been fixed upstream yet, so do it ourselves and use that instead so we can stop having this stupid long standing lockup. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | | | | | | | | | Btrfs: fix permissions of empty files not affected by umaskFilipe Brandenburger2012-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a new file is created with btrfs_create(), the inode will initially be created with permissions 0666 and later on in btrfs_init_acl() it will be adapted to mask out the umask bits. The problem is that this change won't make it into the btrfs_inode unless there's another change to the inode (e.g. writing content changing the size or touching the file changing the mtime.) This fix adds a call to btrfs_update_inode() to btrfs_create() to make sure that the change will not get lost if the in-memory inode is flushed before other changes are made to the file. Signed-off-by: Filipe Brandenburger <filbranden@google.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | | | | | | | | | Btrfs: put raid properties into global tableLiu Bo2012-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Raid properties can be shared among raid calculation code, we can put them into a global table to keep it simple. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | | | | | | | | | Btrfs: fix BUG() in scrub when first superblock reading gives EIOStefan Behrens2012-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes a very special case that can be reproduced by just disconnecting a disk at runtime, and without unmounting the filesystem first, start scrub on the filesystem with the disconnected disk. All read and write EIOs are handled correctly, only the first superblock is an exception and gives a BUG() in a subfunction. The BUG() is correct, it would crash later otherwise. The subfunction must not be called for superblocks and this is what the fix changes. Reported-by: Joeri Vanthienen <mail@joerivanthienen.be> Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | | | | | | | | | Btrfs: do not call file_update_time in aio_writeJosef Bacik2012-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This starts a transaction and dirties the inode everytime we call it, which is super expensive if you have a write heavy workload. We will be updating the inode when the IO completes and we reserve the space for the inode update when we reserve space for the write, so there is no chance of loss of information or enospc issues. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
| * | | | | | | | | | | Btrfs: only unlock and relock if we have toJosef Bacik2012-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I noticed while doing fsync tests that we were always dropping the path and re-searching when we first cow the log root even though we've already gotten the write lock on the root. That's because we don't take into account that there might not be a parent node, so fix the check to make sure there is actually a parent node before we undo all of this work for nothing. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>