aboutsummaryrefslogtreecommitdiffstats
path: root/fs/gfs2/bmap.c
Commit message (Collapse)AuthorAge
* [GFS2] Fix write alloc required shortcut calculationSteven Whitehouse2008-01-25
| | | | | | The comparison was being made against the wrong quantity. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] gfs2_alloc_required performanceBob Peterson2008-01-25
| | | | | | | | | | | This is a small I/O performance enhancement to gfs2. (Actually, it is a rework of an earlier version I got wrong). The idea here is to check if the write extends past the last block in the file. If so, the function can save itself a lot of time and trouble because it knows an allocate will be required. Benchmarks like iozone should see better performance. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Reduce inode size by moving i_alloc out of lineSteven Whitehouse2008-01-25
| | | | | | | | | | | | | | | | | It is possible to reduce the size of GFS2 inodes by taking the i_alloc structure out of the gfs2_inode. This patch allocates the i_alloc structure whenever its needed, and frees it afterward. This decreases the amount of low memory we use at the expense of requiring a memory allocation for each page or partial page that we write. A quick test with postmark shows that the overhead is not measurable and I also note that OCFS2 use the same approach. In the future I'd like to solve the problem by shrinking down the size of the members of the i_alloc structure, but for now, this reduces the immediate problem of using too much low-memory on x86 and doesn't add too much overhead. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Only fetch the dinode once in block_mapBob Peterson2008-01-25
| | | | | | | | Function gfs2_block_map was often looking up the disk inode twice. This optimizes it so that only does it once. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Remove function gfs2_get_blockBob Peterson2008-01-25
| | | | | | | | | | | | This patch is just a cleanup. Function gfs2_get_block() just calls function gfs2_block_map reversing the last two parameters. By reversing the parameters, gfs2_block_map() may be called directly and function gfs2_get_block may be eliminated altogether. Since this function is done for every block operation, this streamlines the code and makes it a little bit more efficient. Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Add gfs2_is_writeback()Steven Whitehouse2008-01-25
| | | | | | | | | | This adds a function "gfs2_is_writeback()" along the lines of the existing "gfs2_is_jdata()" in order to clean up the code and make the various tests for the inode mode more obvious. It also fixes the PageChecked() logic where we were resetting the flag too early in the case of an error path. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Fix ordering of dirty/journal for ordered buffer unstuffingBob Peterson2007-10-10
| | | | | Signed-off-by: Bob Peterson <rpeterso@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Don't mark jdata dirty in gfs2_unstuffer_page()Steven Whitehouse2007-10-10
| | | | | | | Journaled data is marked dirty by gfs2_unpin and should not be marked dirty here. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Reduce truncate IO trafficWendy Cheng2007-10-10
| | | | | | | | | | | Current GFS2 setattr call unconditionally invokes do_shrink even the requested size and actual file size are equal. This has generated large amount of extra IOs found during NFS benchmark runs. This patch moves the relevant logic out of shrink code path. Since setattr is a system call, the time stamps update is still required. Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Fix gfs2_block_truncate_page err returnS. Wendy Cheng2007-07-09
| | | | | | | | | Code segment inside gfs2_block_truncate_page() doesn't set the return code correctly. This causes NFSD erroneously returns EIO back to client with setattr procedure call (truncate error). Signed-off-by: S. Wendy Cheng <wcheng@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Add nanosecond timestamp featureSteven Whitehouse2007-07-09
| | | | | | | | | | | This adds a nanosecond timestamp feature to the GFS2 filesystem. Due to the way that the on-disk format works, older filesystems will just appear to have this field set to zero. When mounted by an older version of GFS2, the filesystem will simply ignore the extra fields so that it will again appear to have whole second resolution, so that its trivially backward compatible. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Fix sign problem in quota/statfs and cleanup _host structuresSteven Whitehouse2007-07-09
| | | | | | | | | | | | | | | | | | | | | | | | This patch fixes some sign issues which were accidentally introduced into the quota & statfs code during the endianess annotation process. Also included is a general clean up which moves all of the _host structures out of gfs2_ondisk.h (where they should not have been to start with) and into the places where they are actually used (often only one place). Also those _host structures which are not required any more are removed entirely (which is the eventual plan for all of them). The conversion routines from ondisk.c are also moved into the places where they are actually used, which for almost every one, was just one single place, so all those are now static functions. This also cleans up the end of gfs2_ondisk.h which no longer needs the #ifdef __KERNEL__. The net result is a reduction of about 100 lines of code, many functions now marked static plus the bug fixes as mentioned above. For good measure I ran the code through sparse after making these changes to check that there are no warnings generated. This fixes Red Hat bz #239686 Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Clean up inode number handlingSteven Whitehouse2007-07-09
| | | | | | | | | | | | | | | | | | | | | | | This patch cleans up the inode number handling code. The main difference is that instead of looking up the inodes using a struct gfs2_inum_host we now use just the no_addr member of this structure. The tests relating to no_formal_ino can then be done by the calling code. This has advantages in that we want to do different things in different code paths if the no_formal_ino doesn't match. In the NFS patch we want to return -ESTALE, but in the ->lookup() path, its a bug in the fs if the no_formal_ino doesn't match and thus we can withdraw in this case. In order to later fix bz #201012, we need to be able to look up an inode without knowing no_formal_ino, as the only information that is known to us is the on-disk location of the inode in question. This patch will also help us to fix bz #236099 at a later date by cleaning up a lot of the code in that area. There are no user visible changes as a result of this patch and there are no changes to the on-disk format either. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] use zero_user_pageNate Diller2007-07-09
| | | | | | | | Use zero_user_page() instead of open-coding it. Signed-off-by: Nate Diller <nate.diller@gmail.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
* [PATCH] remove many unneeded #includes of sched.hTim Schmielau2007-02-14
| | | | | | | | | | | | | | | | | | | | | | | | After Al Viro (finally) succeeded in removing the sched.h #include in module.h recently, it makes sense again to remove other superfluous sched.h includes. There are quite a lot of files which include it but don't actually need anything defined in there. Presumably these includes were once needed for macros that used to live in sched.h, but moved to other header files in the course of cleaning it up. To ease the pain, this time I did not fiddle with any header files and only removed #includes from .c-files, which tend to cause less trouble. Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha, arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig, allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all configs in arch/arm/configs on arm. I also checked that no new warnings were introduced by the patch (actually, some warnings are removed that were emitted by unnecessarily included header files). Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [GFS2] use CURRENT_TIME_SEC instead of get_seconds in gfs2Eric Sandeen2007-02-05
| | | | | | | | | | | | | | | | | | I was looking something else up and came across this... I don't honestly have a good reason to change it other than to make it like every other Linux filesystem in this regard. ;-) It doesn't functionally change anything, but makes some lines shorter. :) I'm also curious; why does gfs2 have 64-bits of on-disk timestamps, but not in timespec_t format, and only stores second resolutions? Seems like you're halfway to sub-second resolutions already. I suppose if that gets implemented then all of the below should instead be CURRENT_TIME not CURRENT_TIME_SEC. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Tidy up bmap & fix boundary bugSteven Whitehouse2006-11-30
| | | | | | | | | | This moves the locking for bmap into the bmap function itself rather than using a wrapper function. It also fixes a bug where the boundary flag was set on the wrong bh. Also the flags on the mapped bh are reset earlier in the function to ensure that they are 100% correct on the error path. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Remove gfs2_inode_attr_inSteven Whitehouse2006-11-30
| | | | | | | | | | | | This function wasn't really doing the right thing. There was no need to update the inode size at this point and the updating of the i_blocks field has now been moved to the places where di_blocks is updated. A result of this patch and some those preceeding it is that unlocking a glock is now a much more efficient process, since there is no longer any requirement to copy data from the gfs2 inode into the vfs inode at this point. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Shrink gfs2_inode (6) - di_atime/di_mtime/di_ctimeSteven Whitehouse2006-11-30
| | | | | | | Remove the di_[amc]time fields and use inode->i_[amc]time fields instead. This saves 24 bytes from the gfs2_inode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Shrink gfs2_inode (4) - di_uid/di_gidSteven Whitehouse2006-11-30
| | | | | | | Remove duplicate di_uid/di_gid fields in favour of using inode->i_uid/inode->i_gid instead. This saves 8 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Shrink gfs2_inode (3) - di_modeSteven Whitehouse2006-11-30
| | | | | | | This removes the duplicate di_mode field in favour of using the inode->i_mode field. This saves 4 bytes. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Change argument of gfs2_dinode_outSteven Whitehouse2006-11-30
| | | | | | | | | | Everywhere this was called, a struct gfs2_inode was available, but despite that, it was always called with a struct gfs2_dinode as an argument. By making this change it paves the way to start eliminating fields duplicated between the kernel's struct inode and the struct gfs2_dinode. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] gfs2 misc endianness annotationsAl Viro2006-11-30
| | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Fix bmap to map extents properlySteven Whitehouse2006-10-20
| | | | | | | | | This fix means that bmap will map extents of the length requested by the VFS rather than guessing at it, or just mapping one block at a time. The other callers of gfs2_block_map are audited to ensure they send the correct max extent lengths (i.e. set bh->b_size correctly). Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Remove uneeded endian conversionSteven Whitehouse2006-10-02
| | | | | | | | | | | | | In many places GFS2 was calling the endian conversion routines for an inode even when only a single field, or a few fields might have changed. As a result we were copying lots of data needlessly. This patch replaces those calls with conversion of just the required fields in each case. This should be faster and easier to understand. There are still other places which suffer from this problem, but this is a start in the right direction. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2/DLM] Fix trailing whitespaceSteven Whitehouse2006-09-25
| | | | | | | As per Andrew Morton's request, removed trailing whitespace. Cc: Andrew Morton <akpm@osdl.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Tidy up meta_io codeSteven Whitehouse2006-09-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a bug in the directory reading code, where we might have dereferenced a NULL pointer in case of OOM. Updated the directory code to use the new & improved version of gfs2_meta_ra() which now returns the first block that was being read. Previously it was releasing it requiring following code to grab the block again at each point it was called. Also turned off readahead on directory lookups since we are reading a hash table, and therefore reading the entries in order is very unlikely. Readahead is still used for all other calls to the directory reading function (e.g. when growing the hash table). Removed the DIO_START constant. Everywhere this was used, it was used to unconditionally start i/o aside from a couple of places, so I've removed it and made the couple of exceptions to this rule into separate functions. Also hunted through the other DIO flags and removed them as arguments from functions which were always called with the same combination of arguments. Updated gfs2_meta_indirect_buffer to be a bit more efficient and hopefully also be a bit easier to read. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Export lm_interface to kernel headersFabio Massimo Di Nitto2006-09-19
| | | | | | | | | | | | | | | lm_interface.h has a few out of the tree clients such as GFS1 and userland tools. Right now, these clients keeps a copy of the file in their build tree that can go out of sync. Move lm_interface.h to include/linux, export it to userland and clean up fs/gfs2 to use the new location. Signed-off-by: Fabio M. Di Nitto <fabbione@ubuntu.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] inode-diet-eliminate-i_blksize-and-use-a-per-superblock-default-vs-gfs2akpm@osdl.org2006-09-19
| | | | | | | | | i_blksize got removed in -mm. Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Map multiple blocks at once where possibleSteven Whitehouse2006-09-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a tidy up of the GFS2 bmap code. The main change is that the bh is passed to gfs2_block_map allowing the flags to be set directly rather than having to repeat that code several times in ops_address.c. At the same time, the extent mapping code from gfs2_extent_map has been moved into gfs2_block_map. This allows all calls to gfs2_block_map to map extents in the case that no allocation is taking place. As a result reads and non-allocating writes should be faster. A quick test with postmark appears to support this. There is a limit on the number of blocks mapped in a single bmap call in that it will only ever map blocks which are pointed to from a single pointer block. So in other words, it will never try to do additional i/o in order to satisfy read-ahead. The maximum number of blocks is thus somewhat less than 512 (the GFS2 4k block size minus the header divided by sizeof(u64)). I've further limited the mapping of "normal" blocks to 32 blocks (to avoid extra work) since readpages() will currently read a maximum of 32 blocks ahead (128k). Some further work will probably be needed to set a suitable value for DIO as well, but for now thats left at the maximum 512 (see ops_address.c:gfs2_get_block_direct). There is probably a lot more that can be done to improve bmap for GFS2, but this is a good first step. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] More style changesJan Engelhardt2006-09-07
| | | | | | | | Remove redundant brackets Signed-off-by: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Remove a cast, tidy gfs2_inode_attr_inSteven Whitehouse2006-09-04
| | | | | | | | The remains of the changes for Jan Engelhardt's third email. Remove a cast and tidy up gfs2_inode_attr_in. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Change all types to uX styleSteven Whitehouse2006-09-04
| | | | | | | This makes all fixed size types have consistent names. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Align all labels against LH sideSteven Whitehouse2006-09-04
| | | | | | This makes everything consistent. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Tidy up bmap/inode codeSteven Whitehouse2006-09-04
| | | | | | | | | As per Jan Engelhardt's third set of comments, this make various code style changes and moves the structures from format.h into super.c, which was the only place that format.h was actually used. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Update copyright, tidy up incore.hSteven Whitehouse2006-09-01
| | | | | | | | | | | | | | | | | | | | | | | As per comments from Jan Engelhardt <jengelh@linux01.gwdg.de> this updates the copyright message to say "version" in full rather than "v.2". Also incore.h has been updated to remove forward structure declarations which are not required. The gfs2_quota_lvb structure has now had endianess annotations added to it. Also quota.c has been updated so that we now store the lvb data locally in endian independant format to avoid needing a structure in host endianess too. As a result the endianess conversions are done as required at various points and thus the conversion routines in lvb.[ch] are no longer required. I've moved the one remaining constant in lvb.h thats used into lm.h and removed the unused lvb.[ch]. I have not changed the HIF_ constants. That is left to a later patch which I hope will unify the gh_flags and gh_iflags fields of the struct gfs2_holder. Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Remove page.[ch]Steven Whitehouse2006-07-26
| | | | | | | | The remaining routines in page.c were all only used in one other file, so they are now moved into the files where they are referenced and made static. Thus page.[ch] are no longer required. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Tidy gfs2_unstuffer_pageSteven Whitehouse2006-07-26
| | | | | | | | | | | | | | Tidy up gfs2_unstuffer_page by: a) Moving it into bmap.c b) Making it static c) Calling it directly from gfs2_unstuff_dinode d) Updating all callers of gfs2_unstuff_dinode due to one less required argument. It doesn't change the behaviour at all. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Fix unlinked file handlingSteven Whitehouse2006-06-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes the way we have been dealing with unlinked, but still open files. It removes all limits (other than memory for inodes, as per every other filesystem) on numbers of these which we can support on GFS2. It also means that (like other fs) its the responsibility of the last process to close the file to deallocate the storage, rather than the person who did the unlinking. Note that with GFS2, those two events might take place on different nodes. Also there are a number of other changes: o We use the Linux inode subsystem as it was intended to be used, wrt allocating GFS2 inodes o The Linux inode cache is now the point which we use for local enforcement of only holding one copy of the inode in core at once (previous to this we used the glock layer). o We no longer use the unlinked "special" file. We just ignore it completely. This makes unlinking more efficient. o We now use the 4th block allocation state. The previously unused state is used to track unlinked but still open inodes. o gfs2_inoded is no longer needed o Several fields are now no longer needed (and removed) from the in core struct gfs2_inode o Several fields are no longer needed (and removed) from the in core superblock There are a number of future possible optimisations and clean ups which have been made possible by this patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Update copyright date to 2006Steven Whitehouse2006-05-18
| | | | Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Remove semaphore.h from C filesSteven Whitehouse2006-05-18
| | | | | | | We no longer use semaphores, everything has been converted to mutex or rwsem, so we don't need to include this header any more. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Reverse block order in build_heightSteven Whitehouse2006-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The original code ordered the blocks allocated in the build_height routine backwards causing excessive disk seeks during a read of the metadata. This patch reverses the order to try and reduce disk seeks. Example: A five level metadata tree, I = Inode, P = Pointers, D = Data You need to read the blocks in the order: I P5 P4 P3 P2 P1 D in order to read a single data block. The new code now orders the blocks in this way. The old code used to order them as: I P1 P2 P3 P4 P5 D requiring two extra seeks on average. Note that for files which are grown by gradual extension rather than by truncate or by llseek/write at a large offset, this doesn't apply. In the case of writing to a file linearly, this routine will only be called upon to extend the height of the tree by one block at a time, so the ordering is determined by when its called rather than by the internals of the routine itself. Optimising that part of the ordering is a much harder problem. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Readpages supportSteven Whitehouse2006-05-05
| | | | | | | | | | | | | | | | | | This adds readpages support (and also corrects a small bug in the readpage error path at the same time). Hopefully this will improve performance by allowing GFS to submit larger lumps of I/O at a time. In order to simplify the setting of BH_Boundary, it currently gets set when we hit the end of a indirect pointer block. There is always a boundary at this point with the current allocation code. It doesn't get all the boundaries right though, so there is still room for improvement in this. See comments in fs/gfs2/ops_address.c for further information about readpages with GFS2. Signed-off-by: Steven Whitehouse
* [GFS2] Remove some unused codeSteven Whitehouse2006-04-28
| | | | | | | Remove some of the unused code flagged up by Adrian Bunk. Cc: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse
* [GFS2] [-mm patch] fs/gfs2/: possible cleanupsAdrian Bunk2006-04-28
| | | | | | | | | | | | This patch contains the following possible cleanups: - make needlessly global code static - #if 0 unused functions - remove the following global function that was both unused and unimplemented: - super.c: gfs2_do_upgrade() Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Tidy up dir code as per Christoph Hellwig's commentsSteven Whitehouse2006-04-24
| | | | | | | | | 1. Comment whitespace fix 2. Removed unused header files from dir.c 3. Split the gfs2_dir_get_buffer() function into two functions Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Further updates to dir and logging codeSteven Whitehouse2006-03-28
| | | | | | | | | | | | | | | | | This reduces the size of the directory code by about 3k and gets readdir() to use the functions which were introduced in the previous directory code update. Two memory allocations are merged into one. Eliminates zeroing of some buffers which were never used before they were initialised by other data. There is still scope for further improvement in the directory code. On the logging side, a hand created mutex has been replaced by a standard Linux mutex in the log allocation code. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Macros removal in gfs2.hSteven Whitehouse2006-02-27
| | | | | | | | | | | | | | As suggested by Pekka Enberg <penberg@cs.helsinki.fi>. The DIV_RU macro is renamed DIV_ROUND_UP and and moved to kernel.h The other macros are gone from gfs2.h as (although not requested by Pekka Enberg) are a number of included header file which are now included individually. The inode number comparison function is now an inline function. The DT2IF and IF2DT may be addressed in a future patch. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] 80 Column audit of GFS2Steven Whitehouse2006-02-27
| | | | | | | Requested by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* [GFS2] Make journaled data files identical to normal files on diskSteven Whitehouse2006-02-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a very large patch, with a few still to be resolved issues so you might want to check out the previous head of the tree since this is known to be unstable. Fixes for the various bugs will be forthcoming shortly. This patch removes the special data format which has been used up till now for journaled data files. Directories still retain the old format so that they will remain on disk compatible with earlier releases. As a result you can now do the following with journaled data files: 1) mmap them 2) export them over NFS 3) convert to/from normal files whenever you want to (the zero length restriction is gone) In addition the level at which GFS' locking is done has changed for all files (since they all now use the page cache) such that the locking is done at the page cache level rather than the level of the fs operations. This should mean that things like loopback mounts and other things which touch the page cache directly should now work. Current known issues: 1. There is a lock mode inversion problem related to the resource group hold function which needs to be resolved. 2. Any significant amount of I/O causes an oops with an offset of hex 320 (NULL pointer dereference) which appears to be related to a journaled data buffer appearing on a list where it shouldn't be. 3. Direct I/O writes are disabled for the time being (will reappear later) 4. There is probably a deadlock between the page lock and GFS' locks under certain combinations of mmap and fs operation I/O. 5. Issue relating to ref counting on internally used inodes causes a hang on umount (discovered before this patch, and not fixed by it) 6. One part of the directory metadata is different from GFS1 and will need to be resolved before next release. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>