aboutsummaryrefslogtreecommitdiffstats
path: root/fs/xfs/libxfs/xfs_inode_fork.c
Commit message (Collapse)AuthorAge
* xfs: check for obviously bad level values in the bmbt rootDarrick J. Wong2017-04-08
| | | | | | | | | | | | | commit b3bf607d58520ea8c0666aeb4be60dbb724cd3a2 upstream. We can't handle a bmbt that's taller than BTREE_MAXLEVELS, and there's no such thing as a zero-level bmbt (for that we have extents format), so if we see this, send back an error code. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* xfs: fix toctou race when locking an inode to access the data mapDarrick J. Wong2017-04-08
| | | | | | | | | | | | | | | | | | | | | commit 4b5bd5bf3fb182dc504b1b64e0331300f156e756 upstream. We use di_format and if_flags to decide whether we're grabbing the ilock in btree mode (btree extents not loaded) or shared mode (anything else), but the state of those fields can be changed by other threads that are also trying to load the btree extents -- IFEXTENTS gets set before the _bmap_read_extents call and cleared if it fails. We don't actually need to have IFEXTENTS set until after the bmbt records are successfully loaded and validated, which will fix the race between multiple threads trying to read the same directory. The next patch strengthens directory bmbt validation by refusing to open the directory if reading the bmbt to start directory readahead fails. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* xfs: new inode extent list lookup helpersChristoph Hellwig2017-01-12
| | | | | | | | | | | | | | | | | | | | | | | commit 93533c7855c3c78c8a900cac65c8d669bb14935d upstream. xfs_iext_lookup_extent looks up a single extent at the passed in offset, and returns the extent covering the area, or the one behind it in case of a hole, as well as the index of the returned extent in arguments, as well as a simple bool as return value that is set to false if no extent could be found because the offset is behind EOF. It is a simpler replacement for xfs_bmap_search_extent that leaves looking up the rarely needed previous extent to the caller and has a nicer calling convention. xfs_iext_get_extent is a helper for iterating over the extent list, it takes an extent index as input, and returns the extent at that index in it's expanded form in an argument if it exists. The actual return value is a bool whether the index is valid or not. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* xfs: provide helper for counting extents from if_bytesEric Sandeen2017-01-12
| | | | | | | | | | | | | | | | | | | | commit 5d829300bee000980a09ac2ccb761cb25867b67c upstream. The open-coded pattern: ifp->if_bytes / (uint)sizeof(xfs_bmbt_rec_t) is all over the xfs code; provide a new helper xfs_iext_count(ifp) to count the number of inline extents in an inode fork. [dchinner: pick up several missed conversions] Signed-off-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* xfs: introduce the CoW forkDarrick J. Wong2016-10-04
| | | | | | | | | Introduce a new in-core fork for storing copy-on-write delalloc reservations and allocated extents that are in the process of being written out. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* xfs: don't allow reflinked dir/dev/fifo/socket/pipe filesDarrick J. Wong2016-10-04
| | | | | | | | Only non-rt files can be reflinked, so check that when we load an inode. Also, don't leak the attr fork if there's a failure. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
* Merge branch 'xfs-4.7-inode-reclaim' into for-nextDave Chinner2016-05-19
|\
| * xfs: optimise xfs_iext_destroyAlex Lyakas2016-05-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When unmounting XFS, we call: xfs_inode_free => xfs_idestroy_fork => xfs_iext_destroy This goes over the whole indirection array and calls xfs_iext_irec_remove for each one of the erps (from the last one to the first one). As a result, we keep shrinking (reallocating actually) the indirection array until we shrink out all of its elements. When we have files with huge numbers of extents, umount takes 30-80 sec, depending on the amount of files that XFS loaded and the amount of indirection entries of each file. The unmount stack looks like: [<ffffffffc0b6d200>] xfs_iext_realloc_indirect+0x40/0x60 [xfs] [<ffffffffc0b6cd8e>] xfs_iext_irec_remove+0xee/0xf0 [xfs] [<ffffffffc0b6cdcd>] xfs_iext_destroy+0x3d/0xb0 [xfs] [<ffffffffc0b6cef6>] xfs_idestroy_fork+0xb6/0xf0 [xfs] [<ffffffffc0b87002>] xfs_inode_free+0xb2/0xc0 [xfs] [<ffffffffc0b87260>] xfs_reclaim_inode+0x250/0x340 [xfs] [<ffffffffc0b87583>] xfs_reclaim_inodes_ag+0x233/0x370 [xfs] [<ffffffffc0b8823d>] xfs_reclaim_inodes+0x1d/0x20 [xfs] [<ffffffffc0b96feb>] xfs_unmountfs+0x7b/0x1a0 [xfs] [<ffffffffc0b98e4d>] xfs_fs_put_super+0x2d/0x70 [xfs] [<ffffffff811e9e36>] generic_shutdown_super+0x76/0x100 [<ffffffff811ea207>] kill_block_super+0x27/0x70 [<ffffffff811ea519>] deactivate_locked_super+0x49/0x60 [<ffffffff811eaaee>] deactivate_super+0x4e/0x70 [<ffffffff81207593>] cleanup_mnt+0x43/0x90 [<ffffffff81207632>] __cleanup_mnt+0x12/0x20 [<ffffffff8108f8e7>] task_work_run+0xa7/0xe0 [<ffffffff81014ff7>] do_notify_resume+0x97/0xb0 [<ffffffff81717c6f>] int_signal+0x12/0x17 Further, this reallocation prevents us from freeing the extent list from a RCU callback as allocation can block. Hence if the extent list is in indirect format, optimise the freeing of the extent list to only use kmem_free calls by freeing entire extent buffer pages at a time, rather than extent by extent. [dchinner: simplified freeing loop based on Christoph's suggestion] Signed-off-by: Alex Lyakas <alex@zadarastorage.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | Merge branch 'xfs-4.7-misc-fixes' into for-nextDave Chinner2016-05-19
|\ \
| * | xfs: improve kmem_reallocChristoph Hellwig2016-04-05
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | Use krealloc to implement our realloc function. This helps to avoid new allocations if we are still in the slab bucket. At least for the bmap btree root that's actually the common case. This also allows removing the now unused oldsize argument. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: optimize inline symlinksChristoph Hellwig2016-04-05
| | | | | | | | | | | | | | | | | | | | | | | | By overallocating the in-core inode fork data buffer and zero terminating the link target in xfs_init_local_fork we can avoid the memory allocation in ->follow_link. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: factor out a helper to initialize a local format inode forkChristoph Hellwig2016-04-05
|/ | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* Merge branch 'xfs-gut-icdinode-4.6' into for-nextDave Chinner2016-03-06
|\
| * xfs: mode di_mode to vfs inodeDave Chinner2016-02-09
| | | | | | | | | | | | | | | | | | | | | | | | Move the di_mode value from the xfs_icdinode to the VFS inode, reducing the xfs_icdinode byte another 2 bytes and collapsing another 2 byte hole in the structure. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Dave Chinner <david@fromorbit.com>
* | xfs: move struct xfs_attr_shortform to xfs_da_format.hDarrick J. Wong2016-02-07
|/ | | | | | | | | | | | Move the shortform attr structure definition to the same place as the other attribute structure definitions for consistency and also so that xfs/122 verifies the structure size. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: merge xfs_inum.h into xfs_format.hChristoph Hellwig2014-11-27
| | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: move most of xfs_sb.h to xfs_format.hChristoph Hellwig2014-11-27
| | | | | | | | | More on-disk format consolidation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: merge xfs_ag.h into xfs_format.hChristoph Hellwig2014-11-27
| | | | | | | | | | More on-disk format consolidation. A few declarations that weren't on-disk format related move into better suitable spots. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: merge xfs_dinode.h into xfs_format.hChristoph Hellwig2014-11-27
| | | | | | | | | | | More consolidatation for the on-disk format defintions. Note that the XFS_IS_REALTIME_INODE moves to xfs_linux.h instead as it is not related to the on disk format, but depends on a CONFIG_ option. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: require 64-bit sector_tChristoph Hellwig2014-07-29
| | | | | | | | | | | | | Trying to support tiny disks only and saving a bit memory might have made sense on an SGI O2 15 years ago, but is pretty pointless today. Remove the rarely tested codepath that uses various smaller in-memory types to reduce our test matrix and make the codebase a little bit smaller and less complicated. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Ben Myers <bpm@sgi.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* xfs: global error sign conversionDave Chinner2014-06-25
| | | | | | | | | | | | | | | | | | | | | | | | | Convert all the errors the core XFs code to negative error signs like the rest of the kernel and remove all the sign conversion we do in the interface layers. Errors for conversion (and comparison) found via searches like: $ git grep " E" fs/xfs $ git grep "return E" fs/xfs $ git grep " E[A-Z].*;$" fs/xfs Negation points found via searches like: $ git grep "= -[a-z,A-Z]" fs/xfs $ git grep "return -[a-z,A-D,F-Z]" fs/xfs $ git grep " -[a-z].*;" fs/xfs [ with some bits I missed from Brian Foster ] Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>
* libxfs: move source filesDave Chinner2014-06-25
Move all the source files that are shared with userspace into libxfs/. This is done as one big chunk simpy to get it done quickly Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com>