aboutsummaryrefslogtreecommitdiffstats
path: root/fs/btrfs
Commit message (Collapse)AuthorAge
...
| | * | Btrfs: fix for incorrect directory entries after fsync log replayFilipe Manana2016-05-12
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we move a directory to a new parent and later log that parent and don't explicitly log the old parent, when we replay the log we can end up with entries for the moved directory in both the old and new parent directories. Besides being ilegal to have directories with multiple hard links in linux, it also resulted in the leaving the inode item with a link count of 1. A similar issue also happens if we move a regular file - after the log tree is replayed the file has a link in both the old and new parent directories, when it should be only at the new directory. Sample reproducer: $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt $ mkdir /mnt/x $ mkdir /mnt/y $ touch /mnt/x/foo $ mkdir /mnt/y/z $ sync $ ln /mnt/x/foo /mnt/x/bar $ mv /mnt/y/z /mnt/x/z < power fail > $ mount /dev/sdc /mnt $ ls -1Ri /mnt /mnt: 257 x 258 y /mnt/x: 259 bar 259 foo 260 z /mnt/x/z: /mnt/y: 260 z /mnt/y/z: $ umount /dev/sdc $ btrfs check /dev/sdc Checking filesystem on /dev/sdc UUID: a67e2c4a-a4b4-4fdc-b015-9d9af1e344be checking extents checking free space cache checking fs roots root 5 inode 260 errors 2000, link count wrong unresolved ref dir 257 index 4 namelen 1 name z filetype 2 errors 0 unresolved ref dir 258 index 2 namelen 1 name z filetype 2 errors 0 (...) Attempting to remove the directory becomes impossible: $ mount /dev/sdc /mnt $ rmdir /mnt/y/z $ ls -lh /mnt/y ls: cannot access /mnt/y/z: No such file or directory total 0 d????????? ? ? ? ? ? z $ rmdir /mnt/x/z rmdir: failed to remove ‘/mnt/x/z’: Stale file handle $ ls -lh /mnt/x ls: cannot access /mnt/x/z: Stale file handle total 0 -rw-r--r-- 2 root root 0 Apr 6 18:06 bar -rw-r--r-- 2 root root 0 Apr 6 18:06 foo d????????? ? ? ? ? ? z So make sure that on rename we set the last_unlink_trans value for our inode, even if it's a directory, to the value of the current transaction's ID and that if the new parent directory is logged that we fallback to a transaction commit. A test case for fstests is being submitted as well. Signed-off-by: Filipe Manana <fdmanana@suse.com>
| * | Merge branch 'foreign/jeffm/uapi' into for-chris-4.7-20160516David Sterba2016-05-16
| |\ \ | | | | | | | | | | | | | | | | # Conflicts: # include/uapi/linux/btrfs.h
| | * | btrfs: uapi/linux/btrfs_tree.h migration, item types and definesJeff Mahoney2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The BTRFS_IOC_SEARCH_TREE ioctl returns file system items directly to userspace. In order to decode them, full type information is required. Create a new header, btrfs_tree to contain these since most users won't need them. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: uapi/linux/btrfs.h migration, move struct btrfs_ioctl_defrag_range_argsJeff Mahoney2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct btrfs_ioctl_defrag_range_args is used by the BTRFS_IOC_DEFRAG_RANGE ioctl. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: uapi/linux/btrfs.h migration, move balance flagsJeff Mahoney2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The BTRFS_BALANCE_* flags are used by struct btrfs_ioctl_balance_args.flags and btrfs_ioctl_balance_args.{data,meta,sys}.flags in the BTRFS_IOC_BALANCE ioctl. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: uapi/linux/btrfs.h migration, move feature flagsJeff Mahoney2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The compat/compat_ro/incompat feature flags are used by the feature set/get ioctls. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: uapi/linux/btrfs.h migration, qgroup limit flagsJeff Mahoney2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The BTRFS_QGROUP_LIMIT_* flags are required to tell the kernel which fields are valid when using the BTRFS_IOC_QGROUP_LIMIT ioctl. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: uapi/linux/btrfs.h migration, move BTRFS_LABEL_SIZEJeff Mahoney2016-04-28
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | BTRFS_LABEL_SIZE is required to define the BTRFS_IOC_GET_FSLABEL and BTRFS_IOC_SET_FSLABEL ioctls. Signed-off-by: Jeff Mahoney <jeffm@suse.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | Merge branch 'foreign/anand/dev-del-by-id-ext' into for-chris-4.7-20160516David Sterba2016-05-16
| |\ \
| | * | btrfs: cleanup assigning next active device with a checkAnand Jain2016-05-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Creates helper fucntion as needed by the device delete and replace operations. Also now it checks if the next device being assigned is an active device. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: s_bdev is not null after missing replaceAnand Jain2016-05-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Yauhen reported in the ML that s_bdev is null at mount, and s_bdev gets updated to some device when missing device is replaced, as because bdev is null for missing device, things gets matched up. Fix this by checking if s_bdev is set. I didn't want to completely remove updating s_bdev because the future multi device support at vfs layer may need it. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reported-by: Yauhen Kharuzhy <yauhen.kharuzhy@zavadatar.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: refactor btrfs_dev_replace_start for reuseAnand Jain2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A refactor patch, and avoids user input verification in the btrfs_dev_replace_start(), and so this function can be reused. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: use fs_info directlyAnand Jain2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Local variable fs_info, contains root->fs_info, use it. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: rename flags for vol args v2David Sterba2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rename BTRFS_DEVICE_BY_ID so it's more descriptive that we specify the device by id, it'll be part of the public API. The mask of supported flags is also renamed, only for internal use. The error code for unknown flags is EOPNOTSUPP, fixed. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: rename btrfs_find_device_by_user_inputDavid Sterba2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For clarity how we are going to find the device, let's call it a device specifier, devspec for short. Also rename the arguments that are a leftover from previous function purpose. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: use existing device constraints table btrfs_raid_arrayDavid Sterba2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We should avoid duplicating the device constraints, let's use the btrfs_raid_array in btrfs_check_raid_min_devices. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: introduce raid-type to error-code table, for minimum device constraintDavid Sterba2016-04-28
| | | | | | | | | | | | | | | | | | | | Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: pass number of devices to btrfs_check_raid_min_devicesDavid Sterba2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this patch, btrfs_check_raid_min_devices would do an off-by-one check of the constraints and not the miminmum check, as its name suggests. This is not a problem if the only caller is device remove, but would be confusing for others. Add an argument with the exact number and let the caller(s) decide if this needs any adjustments, like when device replace is running. Reviewed-by: Anand Jain <anand.jain@oracle.com> Tested-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: rename __check_raid_min_devicesDavid Sterba2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Underscores are for special functions, use the full prefix for better stacktrace recognition. Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: optimize check for stale deviceAnand Jain2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Optimize check for stale device to only be checked when there is device added or changed. If there is no update to the device, there is no need to call btrfs_free_stale_device(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: introduce device delete by devidAnand Jain2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces new ioctl BTRFS_IOC_RM_DEV_V2, which uses enhanced struct btrfs_ioctl_vol_args_v2 to carry devid as an user argument. The patch won't delete the old ioctl interface and so kernel remains backward compatible with user land progs. Test case/script: echo "0 $(blockdev --getsz /dev/sdf) linear /dev/sdf 0" | dmsetup create bad_disk mkfs.btrfs -f -d raid1 -m raid1 /dev/sdd /dev/sde /dev/mapper/bad_disk mount /dev/sdd /btrfs dmsetup suspend bad_disk echo "0 $(blockdev --getsz /dev/sdf) error /dev/sdf 0" | dmsetup load bad_disk dmsetup resume bad_disk echo "bad disk failed. now deleting/replacing" btrfs dev del 3 /btrfs echo $? btrfs fi show /btrfs umount /btrfs btrfs-show-super /dev/sdd | egrep num_device dmsetup remove bad_disk wipefs -a /dev/sdf Signed-off-by: Anand Jain <anand.jain@oracle.com> Reported-by: Martin <m_btrfs@ml1.co.uk> [ adjust messages, s/disk/device/ ] Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: make use of btrfs_scratch_superblocks() in btrfs_rm_device()Anand Jain2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the previous patches now the btrfs_scratch_superblocks() is ready to be used in btrfs_rm_device() so use it. Signed-off-by: Anand Jain <anand.jain@oracle.com> [ use GFP_KERNEL ] Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: enhance btrfs_find_device_by_user_input() to check device pathAnand Jain2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The operation of device replace and device delete follows same steps upto some depth with in btrfs kernel, however they don't share codes. This enhancement will help replace and delete to share codes. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: make use of btrfs_find_device_by_user_input()Anand Jain2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | btrfs_rm_device() has a section of the code which can be replaced btrfs_find_device_by_user_input() Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: create helper btrfs_find_device_by_user_input()Anand Jain2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch renames btrfs_dev_replace_find_srcdev() to btrfs_find_device_by_user_input() and moves it to volumes.c, so that delete device can use it. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: clean up and optimize __check_raid_min_device()Anand Jain2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __check_raid_min_device() which was pealed from btrfs_rm_device() maintianed its original code to show the block move. This patch cleans up __check_raid_min_device(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: create helper function __check_raid_min_devices()Anand Jain2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | move a section of btrfs_rm_device() code to check for min number of the devices into the function __check_raid_min_devices() Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: create a helper function to read the disk superAnand Jain2016-04-28
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A part of code from btrfs_scan_one_device() is moved to a new function btrfs_read_disk_super(), so that former function looks cleaner. (In this process it also moves the code which ensures null terminating label). So this creates easy opportunity to merge various duplicate codes on read disk super. Earlier attempt to merge duplicate codes highlighted that there were some issues for which there are duplicate codes (to read disk super), however it was not clear what was the issue. So until we figure that out, its better to keep them in a separate functions. Signed-off-by: Anand Jain <anand.jain@oracle.com> [ use GFP_KERNEL, PAGE_CACHE_ removal related fixups ] Signed-off-by: David Sterba <dsterba@suse.com>
| * | Merge branch 'cleanups-4.7' into for-chris-4.7-20160516David Sterba2016-05-16
| |\ \
| | * | btrfs: GFP_NOFS does not GFP_HIGHMEMDavid Sterba2016-05-10
| | | | | | | | | | | | | | | | | | | | | | | | Masking HIGHMEM out of NOFS does not make sense. Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: switch to common message helpers in open_ctree, adjust messagesDavid Sterba2016-05-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we lack the identification of the filesystem in most if not all mount messages, done via printk/pr_* functions. We can use the btrfs_* helpers in open_ctree, as the fs_info <-> sb link is established at the beginning of the function. The messages have been updated at the same time to be more consistent: * dropped sb->s_id, as it's not available via btrfs_* * added %d for return code where appropriate * wording changed * %Lx replaced by %llx Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: ioctl: reorder exclusive op check in RM_DEVDavid Sterba2016-05-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Move the op exclusivity check before the other code (same as in ADD_DEV). Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: kill unused writepage_io_hook callbackDavid Sterba2016-05-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It seems to be long time unused, since 2008 and 6885f308b5570 ("Btrfs: Misc 2.6.25 updates"). Propagating the removal touches some code but has no functional effect. Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: pass the right error code to the btrfs_std_errorAnand Jain2016-05-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Also drop the newline from the message. Signed-off-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: fix typos in commentsLuis de Bethencourt2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Correct a typo in the chunk_mutex name to make it grepable. Since it is better to fix several typos at once, fixing the 2 more in the same file. Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | Btrfs: Refactor btrfs_lock_cluster() to kill compiler warningGeert Uytterhoeven2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fs/btrfs/extent-tree.c: In function ‘btrfs_lock_cluster’: fs/btrfs/extent-tree.c:6399: warning: ‘used_bg’ may be used uninitialized in this function - Replace "again: ... goto again;" by standard C "while (1) { ... }", - Move block not processed during the first iteration of the loop to the end of the loop, which allows to kill the "locked" variable, Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-and-Tested-by: Miao Xie <miaox@cn.fujitsu.com> [ the compilation warning has been fixed by other patch, now we want to clean up the function ] Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: remove save_error_info()Anand Jain2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Actually save_error_info() sets the FS state to error and nothing else. Further the word save doesn't induce caffeine when compared to the word set in what actually it does. So to make it better understandable move save_error_info() code to its only consumer itself. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: Simplify conditions about compress while mapping btrfs flags to inode ↵Satoru Takeuchi2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | flags Signed-off-by: Satoru Takeuchi <takeuchi_satoru@jp.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: move error handling code together in ctree.hAnand Jain2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Looks like we added the incompatible defines in between the error handling defines in the file ctree.h. Now group them back. Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: remove unused function btrfs_assert()Anand Jain2016-04-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Apparently looks like ASSERT does the same intended job, as intended btrfs_assert(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| | * | btrfs: rename btrfs_std_error to btrfs_handle_fs_errorAnand Jain2016-04-28
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | btrfs_std_error() handles errors, puts FS into readonly mode (as of now). So its good idea to rename it to btrfs_handle_fs_error(). Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> [ edit changelog ] Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: fix memory leak during RAID 5/6 device replacementScott Talbert2016-05-16
| | | | | | | | | | | | | | | | | | | | | | | | A 'struct bio' is allocated in scrub_missing_raid56_pages(), but it was never freed anywhere. Signed-off-by: Scott Talbert <scott.talbert@hgst.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: build fixup for qgroup_account_snapshotDavid Sterba2016-05-12
| | | | | | | | | | | | | | | | | | | | | | | | The macro btrfs_std_error got renamed to btrfs_handle_fs_error in an independent branch for the same merge target (4.7). To make the code compilable for bisectability reasons, add a temporary stub. Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: qgroup: Fix qgroup accounting when creating snapshotQu Wenruo2016-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Current btrfs qgroup design implies a requirement that after calling btrfs_qgroup_account_extents() there must be a commit root switch. Normally this is OK, as btrfs_qgroup_accounting_extents() is only called inside btrfs_commit_transaction() just be commit_cowonly_roots(). However there is a exception at create_pending_snapshot(), which will call btrfs_qgroup_account_extents() but no any commit root switch. In case of creating a snapshot whose parent root is itself (create a snapshot of fs tree), it will corrupt qgroup by the following trace: (skipped unrelated data) ====== btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 1 qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 0, excl = 0 qgroup_update_counters: qgid = 5, cur_old_count = 0, cur_new_count = 1, rfer = 16384, excl = 16384 btrfs_qgroup_account_extent: bytenr = 29786112, num_bytes = 16384, nr_old_roots = 0, nr_new_roots = 0 ====== The problem here is in first qgroup_account_extent(), the nr_new_roots of the extent is 1, which means its reference got increased, and qgroup increased its rfer and excl. But at second qgroup_account_extent(), its reference got decreased, but between these two qgroup_account_extent(), there is no switch roots. This leads to the same nr_old_roots, and this extent just got ignored by qgroup, which means this extent is wrongly accounted. Fix it by call commit_cowonly_roots() after qgroup_account_extent() in create_pending_snapshot(), with needed preparation. Mark: I added a check at the top of qgroup_account_snapshot() to skip this code if qgroups are turned off. xfstest btrfs/122 exposes this problem. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Mark Fasheh <mfasheh@suse.de> Signed-off-by: David Sterba <dsterba@suse.com>
| * | Btrfs: fix fspath error deallocationVincent Stehlé2016-05-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure to deallocate fspath with vfree() in case of error in init_ipath(). fspath is allocated with vmalloc() in init_data_container() since commit 425d17a290c0 ("Btrfs: use larger limit for translation of logical to inode"). Signed-off-by: Vincent Stehlé <vincent.stehle@intel.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: make find_workspace warn if there are no workspacesDavid Sterba2016-05-10
| | | | | | | | | | | | | | | | | | | | | Be verbose if there are no workspaces at all, ie. the module init time preallocation failed. Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: make find_workspace always succeedDavid Sterba2016-05-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With just one preallocated workspace we can guarantee forward progress even if there's no memory available for new workspaces. The cost is more waiting but we also get rid of several error paths. On average, there will be several idle workspaces, so the waiting penalty won't be so bad. In the worst case, all cpus will compete for one workspace until there's some memory. Attempts to allocate a new one are done each time the waiters are woken up. Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: preallocate compression workspacesDavid Sterba2016-05-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Preallocate one workspace for each compression type so we can guarantee forward progress in the worst case. A failure cannot be a hard error as we might not use compression at all on the filesystem. If we can't allocate the workspaces later when need them, it might actually deadlock, but in such situation the system has effectively not enough memory to operate properly. Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: rename and document compression workspace membersDavid Sterba2016-05-10
| | | | | | | | | | | | | | | | | | The names are confusing, pick more fitting names and add comments. Signed-off-by: David Sterba <dsterba@suse.com>
| * | btrfs: fix int32 overflow in shrink_delalloc().Adam Borowski2016-05-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | UBSAN: Undefined behaviour in fs/btrfs/extent-tree.c:4623:21 signed integer overflow: 10808 * 262144 cannot be represented in type 'int [8]' If 8192<=items<16384, we request a writeback of an insane number of pages which is benign (everything will be written). But if items>=16384, the space reservation won't be enough. Signed-off-by: Adam Borowski <kilobyte@angband.pl> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>