aboutsummaryrefslogtreecommitdiffstats
path: root/fs/fuse/inode.c
Commit message (Collapse)AuthorAge
* SL*B: drop kmem cache argument from constructorAlexey Dobriyan2008-07-26
| | | | | | | | | | | | | | | | | | | | | | | | Kmem cache passed to constructor is only needed for constructors that are themselves multiplexeres. Nobody uses this "feature", nor does anybody uses passed kmem cache in non-trivial way, so pass only pointer to object. Non-trivial places are: arch/powerpc/mm/init_64.c arch/powerpc/mm/hugetlbpage.c This is flag day, yes. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Christoph Lameter <cl@linux-foundation.org> Cc: Jon Tollefson <kniht@linux.vnet.ibm.com> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Matt Mackall <mpm@selenic.com> [akpm@linux-foundation.org: fix arch/powerpc/mm/hugetlbpage.c] [akpm@linux-foundation.org: fix mm/slab.c] [akpm@linux-foundation.org: fix ubifs] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: nfs export special lookupsMiklos Szeredi2008-07-25
| | | | | | | | | | | | | | | | | | | | | | | Implement the get_parent export operation by sending a LOOKUP request with ".." as the name. Implement looking up an inode by node ID after it has been evicted from the cache. This is done by seding a LOOKUP request with "." as the name (for all file types, not just directories). The filesystem can set the FUSE_EXPORT_SUPPORT flag in the INIT reply, to indicate that it supports these special lookups. Thanks to John Muir for the original implementation of this feature. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Matthew Wilcox <matthew@wil.cx> Cc: David Teigland <teigland@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: add export operationsMiklos Szeredi2008-07-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implement export_operations, to allow fuse filesystems to be exported to NFS. This feature has been in the out-of-tree fuse module, and is widely used and tested. It has not been originally merged into mainline, because doing the NFS export in userspace was thought to be a cleaner and more efficient way of doing it, than through the kernel. While that is true, it would also have involved a lot of duplicated effort at reimplementing NFS exporting (all the different versions of the protocol). This effort was unfortunately not undertaken by anyone, so we are left with doing it the easy but less efficient way. If this feature goes in, the out-of-tree fuse module can go away, which would have several advantages: - not having to maintain two versions - less confusion for users - no bugs due to kernel API changes Comment from hch: - Use the same fh_type values as XFS, since we use the same fh encoding. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: fix thinko in max I/O size calucationMiklos Szeredi2008-06-17
| | | | | | | | | | | | | | Use max not min to enforce a lower limit on the max I/O size. This bug was introduced by "fuse: fix max i/o size calculation" (commit e5d9a0df07484d6d191756878c974e4307fb24ce). Thanks to Brian Wang for noticing. Reported-by: Brian Wang <ywang221@hotmail.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Acked-by: Szabolcs Szakacsits <szaka@ntfs-3g.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: fix bdi naming conflictMiklos Szeredi2008-05-24
| | | | | | | | | | | | | | | | | | | Fuse allocates a separate bdi for each filesystem, and registers them in sysfs with "MAJOR:MINOR" of sb->s_dev (st_dev). This works fine for anon devices normally used by fuse, but can conflict with an already registered BDI for "fuseblk" filesystems, where sb->s_dev represents a real block device. In particularl this happens if a non-partitioned device is being mounted. Fix by registering with a different name for "fuseblk" filesystems. Thanks to Ioan Ionita for the bug report. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Reported-by: Ioan Ionita <opslynx@gmail.com> Tested-by: Ioan Ionita <opslynx@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: add flag to turn on big writesMiklos Szeredi2008-05-13
| | | | | | | | | | | | | | | | | Prior to 2.6.26 fuse only supported single page write requests. In theory all fuse filesystem should be able support bigger than 4k writes, as there's nothing in the API to prevent it. Unfortunately there's a known case in NTFS-3G where big writes cause filesystem corruption. There could also be other filesystems, where the lack of testing with big write requests would result in bugs. To prevent such problems on a kernel upgrade, disable big writes by default, but let filesystems set a flag to turn it on. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Szabolcs Szakacsits <szaka@ntfs-3g.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: fix node ID typeMiklos Szeredi2008-04-30
| | | | | | | | | Node ID is 64bit but it is passed as unsigned long to some functions. This breakage wasn't noticed, because libfuse uses unsigned long too. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: fix max i/o size calculationMiklos Szeredi2008-04-30
| | | | | | | | | | | | Fix a bug that Werner Baumann reported: fuse can send a bigger write request than the maximum specified. This only affected direct_io operation. In addition set a sane minimum for the max_read and max_write tunables, so I/O always makes some progress. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: support writable mmapMiklos Szeredi2008-04-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Quoting Linus (3 years ago, FUSE inclusion discussions): "User-space filesystems are hard to get right. I'd claim that they are almost impossible, unless you limit them somehow (shared writable mappings are the nastiest part - if you don't have those, you can reasonably limit your problems by limiting the number of dirty pages you accept through normal "write()" calls)." Instead of attempting the impossible, I've just waited for the dirty page accounting infrastructure to materialize (thanks to Peter Zijlstra and others). This nicely solved the biggest problem: limiting the number of pages used for write caching. Some small details remained, however, which this largish patch attempts to address. It provides a page writeback implementation for fuse, which is completely safe against VM related deadlocks. Performance may not be very good for certain usage patterns, but generally it should be acceptable. It has been tested extensively with fsx-linux and bash-shared-mapping. Fuse page writeback design -------------------------- fuse_writepage() allocates a new temporary page with GFP_NOFS|__GFP_HIGHMEM. It copies the contents of the original page, and queues a WRITE request to the userspace filesystem using this temp page. The writeback is finished instantly from the MM's point of view: the page is removed from the radix trees, and the PageDirty and PageWriteback flags are cleared. For the duration of the actual write, the NR_WRITEBACK_TEMP counter is incremented. The per-bdi writeback count is not decremented until the actual write completes. On dirtying the page, fuse waits for a previous write to finish before proceeding. This makes sure, there can only be one temporary page used at a time for one cached page. This approach is wasteful in both memory and CPU bandwidth, so why is this complication needed? The basic problem is that there can be no guarantee about the time in which the userspace filesystem will complete a write. It may be buggy or even malicious, and fail to complete WRITE requests. We don't want unrelated parts of the system to grind to a halt in such cases. Also a filesystem may need additional resources (particularly memory) to complete a WRITE request. There's a great danger of a deadlock if that allocation may wait for the writepage to finish. Currently there are several cases where the kernel can block on page writeback: - allocation order is larger than PAGE_ALLOC_COSTLY_ORDER - page migration - throttle_vm_writeout (through NR_WRITEBACK) - sync(2) Of course in some cases (fsync, msync) we explicitly want to allow blocking. So for these cases new code has to be added to fuse, since the VM is not tracking writeback pages for us any more. As an extra safetly measure, the maximum dirty ratio allocated to a single fuse filesystem is set to 1% by default. This way one (or several) buggy or malicious fuse filesystems cannot slow down the rest of the system by hogging dirty memory. With appropriate privileges, this limit can be raised through '/sys/class/bdi/<bdi>/max_ratio'. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: bdi: expose the BDI object in sysfs for FUSEMiklos Szeredi2008-04-30
| | | | | | | | | | | | | Register FUSE's backing_dev_info under sysfs with the name "fuse-MAJOR:MINOR" Make the fuse control filesystem use s_dev instead of a fuse specific ID. This makes it easier to match directories under /sys/fs/fuse/connections/ with directories under /sys/class/bdi, and with actual mounts. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] restore sane ->umount_begin() APIAl Viro2008-04-25
| | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
* mount options: fix fuseMiklos Szeredi2008-02-08
| | | | | | | | Add blksize= option to /proc/mounts for fuseblk filesystems. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* iget: stop FUSE from using iget() and read_inode()David Howells2008-02-07
| | | | | | | | | | Stop the FUSE filesystem from using read_inode(), which it doesn't use anyway. Signed-off-by: David Howells <dhowells@redhat.com> Cc: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: limit queued background requestsMiklos Szeredi2008-02-06
| | | | | | | | | | | | | | | | Libfuse basically creates a new thread for each new request. This is fine for synchronous requests, which are naturally limited. However background requests (especially writepage) can cause a thread creation storm. To avoid this, limit the number of background requests available to userspace. This is done by introducing another queue for background requests, and a counter for the number of "active" requests, which are currently available for userspace. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Kobject: convert fs/* from kobject_unregister() to kobject_put()Greg Kroah-Hartman2008-01-24
| | | | | | | | | | | There is no need for kobject_unregister() anymore, thanks to Kay's kobject cleanup changes, so replace all instances of it with kobject_put(). Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* kobject: convert main fs kobject to use kobject_createGreg Kroah-Hartman2008-01-24
| | | | | | | | | | This also renames fs_subsys to fs_kobj to catch all current users with a build error instead of a build warning which can easily be missed. Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* kobject: convert fuse to use kobject_createGreg Kroah-Hartman2008-01-24
| | | | | | | | | | We don't need a kset here, a simple kobject will do just fine, so dynamically create the kobject and use it. Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* kobject: remove struct kobj_type from struct ksetGreg Kroah-Hartman2008-01-24
| | | | | | | | | | | | | | | | | We don't need a "default" ktype for a kset. We should set this explicitly every time for each kset. This change is needed so that we can make ksets dynamic, and cleans up one of the odd, undocumented assumption that the kset/kobject/ktype model has. This patch is based on a lot of help from Kay Sievers. Nasty bug in the block code was found by Dave Young <hidave.darkstar@gmail.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* fuse: fix uninitialized field in fuse_inodeJohn Muir2007-11-29
| | | | | | | | | | | | | I found problems accessing (executing) previously existing files, until I did chmod on them (or setattr). If the fi->attr_version is not initialized, then it could be larger than fc->attr_version until a setattr is executed, and as a result the inode attributes would never be set. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: fix FUSE_FILE_OPS sendingMiklos Szeredi2007-11-29
| | | | | | | | | | | | FUSE_FILE_OPS is meant to signal that the kernel will send the open file to to the userspace filesystem for operations on open files, so that sillyrenaming unlinked files becomes unnecessary. However this needs VFS changes, which won't make it into 2.6.24. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: add blksize field to fuse_attrMiklos Szeredi2007-10-18
| | | | | | | | | | | | | | | | | | | | | | | | There are cases when the filesystem will be passed the buffer from a single read or write call, namely: 1) in 'direct-io' mode (not O_DIRECT), read/write requests don't go through the page cache, but go directly to the userspace fs 2) currently buffered writes are done with single page requests, but if Nick's ->perform_write() patch goes it, it will be possible to do larger write requests. But only if the original write() was also bigger than a page. In these cases the filesystem might want to give a hint to the app about the optimal I/O size. Allow the userspace filesystem to supply a blksize value to be returned by stat() and friends. If the field is zero, it defaults to the old PAGE_CACHE_SIZE value. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: add list of writable files to fuse_inodeMiklos Szeredi2007-10-18
| | | | | | | | | | | | | | | | Each WRITE request must carry a valid file descriptor. When a page is written back from a memory mapping, the file through which the page was dirtied is not available, so a new mechananism is needed to find a suitable file in ->writepage(s). A list of fuse_files is added to fuse_inode. The file is removed from the list in fuse_release(). This patch is in preparation for writable mmap support. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: add atomic open+truncate supportMiklos Szeredi2007-10-18
| | | | | | | | | This patch allows fuse filesystems to implement open(..., O_TRUNC) as a single request, instead of separate truncate and open requests. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: add file handle to getattr operationMiklos Szeredi2007-10-18
| | | | | | | | | | | | | Add necessary protocol changes for supplying a file handle with the getattr operation. Step the API version to 7.9. This patch doesn't actually supply the file handle, because that needs some kind of VFS support, which we haven't yet been able to agree upon. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: fix race between getattr and writeMiklos Szeredi2007-10-18
| | | | | | | | | | | | | | | | | | | | | | | | | Getattr and lookup operations can be running in parallel to attribute changing operations, such as write and setattr. This means, that if for example getattr was slower than a write, the cached size attribute could be set to a stale value. To prevent this race, introduce a per-filesystem attribute version counter. This counter is incremented whenever cached attributes are modified, and the incremented value stored in the inode. Before storing new attributes in the cache, getattr and lookup check, using the version number, whether the attributes have been modified during the request's lifetime. If so, the returned attributes are not cached, because they might be stale. Thanks to Jakub Bogusz for the bug report and test program. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Jakub Bogusz <jakub.bogusz@gemius.pl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: fix allowing operationsMiklos Szeredi2007-10-18
| | | | | | | | | | | | | | | The following operation didn't check if sending the request was allowed: setattr listxattr statfs Some other operations don't explicitly do the check, but VFS calls ->permission() which checks this. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: fix permission checking on sticky directoriesMiklos Szeredi2007-10-17
| | | | | | | | | | | | | | | | | | | | The VFS checks sticky bits on the parent directory even if the filesystem defines it's own ->permission(). In some situations (sshfs, mountlo, etc) the user does have permission to delete a file even if the attribute based checking would not allow it. So work around this by storing the permission bits separately and returning them in stat(), but cutting the permission bits off from inode->i_mode. This is slightly hackish, but it's probably not worth it to add new infrastructure in VFS and a slight performance penalty for all filesystems, just for the sake of fuse. [Jan Engelhardt] cosmetic fixes Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Jan Engelhardt <jengelh@linux01.gwdg.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: set i_nlink to sane value after mountMiklos Szeredi2007-10-17
| | | | | | | | | | | | Aufs seems to depend on a positive i_nlink value. So fill in a dummy but sane value for the root inode at mount time. The inode attributes are refreshed with the correct values at the first opportunity. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: fix page invalidationMiklos Szeredi2007-10-17
| | | | | | | | | | | | | | | | | Other than truncate, there are two cases, when fuse tries to get rid of cached pages: a) in open, if KEEP_CACHE flag is not set b) in getattr, if file size changed spontaneously Until now invalidate_mapping_pages() were used, which didn't get rid of mapped pages. This is wrong, and becomes more wrong as dirty pages are introduced. So instead properly invalidate all pages with invalidate_inode_pages2(). Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: truncate on spontaneous size changeMiklos Szeredi2007-10-17
| | | | | | | | | | | | | | | Memory mappings were only truncated on an explicit truncate, but not when the file size was changed externally. Fix this by moving the truncation code from fuse_setattr to fuse_change_attributes. Yes, there are races between write and and external truncation, but we can't really do anything about them. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: fix reserved request wake upMiklos Szeredi2007-10-17
| | | | | | | | | | | | Use wake_up_all instead of wake_up in put_reserved_req(), otherwise it is possible that the right task is not woken up. Also create a separate reserved_req_waitq in addition to the blocked_waitq, since they fulfill totally separate functions. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Slab API: remove useless ctor parameter and reorder parametersChristoph Lameter2007-10-17
| | | | | | | | | | | | | | | | | | | | | Slab constructors currently have a flags parameter that is never used. And the order of the arguments is opposite to other slab functions. The object pointer is placed before the kmem_cache pointer. Convert ctor(void *object, struct kmem_cache *s, unsigned long flags) to ctor(struct kmem_cache *s, void *object) throughout the kernel [akpm@linux-foundation.org: coupla fixes] Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: bdi init hooksPeter Zijlstra2007-10-17
| | | | | | | | | provide BDI constructor/destructor hooks [akpm@linux-foundation.org: compile fix] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* mm: Remove slab destructors from kmem_cache_create().Paul Mundt2007-07-19
| | | | | | | | | | | | | | Slab destructors were no longer supported after Christoph's c59def9f222d44bb7e2f0a559f2906191a0862d7 change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* fuse: ->fs_flags fixletAlexey Dobriyan2007-06-16
| | | | | | | | | | fs/fuse/inode.c:658:3: error: Initializer entry defined twice fs/fuse/inode.c:661:3: also defined here Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* fuse: delete inode on dropMiklos Szeredi2007-05-23
| | | | | | | | | | | When inode is dropped (no more references) delete it from cache. There's not much point in keeping it cached, when a new lookup will refresh the attributes anyway. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Detach sched.h from mm.hAlexey Dobriyan2007-05-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | First thing mm.h does is including sched.h solely for can_do_mlock() inline function which has "current" dereference inside. By dealing with can_do_mlock() mm.h can be detached from sched.h which is good. See below, why. This patch a) removes unconditional inclusion of sched.h from mm.h b) makes can_do_mlock() normal function in mm/mlock.c c) exports can_do_mlock() to not break compilation d) adds sched.h inclusions back to files that were getting it indirectly. e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were getting them indirectly Net result is: a) mm.h users would get less code to open, read, preprocess, parse, ... if they don't need sched.h b) sched.h stops being dependency for significant number of files: on x86_64 allmodconfig touching sched.h results in recompile of 4083 files, after patch it's only 3744 (-8.3%). Cross-compile tested on all arm defconfigs, all mips defconfigs, all powerpc defconfigs, alpha alpha-up arm i386 i386-up i386-defconfig i386-allnoconfig ia64 ia64-up m68k mips parisc parisc-up powerpc powerpc-up s390 s390-up sparc sparc-up sparc64 sparc64-up um-x86_64 x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Remove SLAB_CTOR_CONSTRUCTORChristoph Lameter2007-05-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | SLAB_CTOR_CONSTRUCTOR is always specified. No point in checking it. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: David Howells <dhowells@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Steven French <sfrench@us.ibm.com> Cc: Michael Halcrow <mhalcrow@us.ibm.com> Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Miklos Szeredi <miklos@szeredi.hu> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: "J. Bruce Fields" <bfields@fieldses.org> Cc: Anton Altaparmakov <aia21@cantab.net> Cc: Mark Fasheh <mark.fasheh@oracle.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@ucw.cz> Cc: David Chinner <dgc@sgi.com> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* add filesystem subtype supportMiklos Szeredi2007-05-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a slight problem with filesystem type representation in fuse based filesystems. From the kernel's view, there are just two filesystem types: fuse and fuseblk. From the user's view there are lots of different filesystem types. The user is not even much concerned if the filesystem is fuse based or not. So there's a conflict of interest in how this should be represented in fstab, mtab and /proc/mounts. The current scheme is to encode the real filesystem type in the mount source. So an sshfs mount looks like this: sshfs#user@server:/ /mnt/server fuse rw,nosuid,nodev,... This url-ish syntax works OK for sshfs and similar filesystems. However for block device based filesystems (ntfs-3g, zfs) it doesn't work, since the kernel expects the mount source to be a real device name. A possibly better scheme would be to encode the real type in the type field as "type.subtype". So fuse mounts would look like this: /dev/hda1 /mnt/windows fuseblk.ntfs-3g rw,... user@server:/ /mnt/server fuse.sshfs rw,nosuid,nodev,... This patch adds the necessary code to the kernel so that this can be correctly displayed in /proc/mounts. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* slab allocators: Remove SLAB_DEBUG_INITIAL flagChristoph Lameter2007-05-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I have never seen a use of SLAB_DEBUG_INITIAL. It is only supported by SLAB. I think its purpose was to have a callback after an object has been freed to verify that the state is the constructor state again? The callback is performed before each freeing of an object. I would think that it is much easier to check the object state manually before the free. That also places the check near the code object manipulation of the object. Also the SLAB_DEBUG_INITIAL callback is only performed if the kernel was compiled with SLAB debugging on. If there would be code in a constructor handling SLAB_DEBUG_INITIAL then it would have to be conditional on SLAB_DEBUG otherwise it would just be dead code. But there is no such code in the kernel. I think SLUB_DEBUG_INITIAL is too problematic to make real use of, difficult to understand and there are easier ways to accomplish the same effect (i.e. add debug code before kfree). There is a related flag SLAB_CTOR_VERIFY that is frequently checked to be clear in fs inode caches. Remove the pointless checks (they would even be pointless without removeal of SLAB_DEBUG_INITIAL) from the fs constructors. This is the last slab flag that SLUB did not support. Remove the check for unimplemented flags from SLUB. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* remove "struct subsystem" as it is no longer neededGreg Kroah-Hartman2007-05-02
| | | | | | | | | | | We need to work on cleaning up the relationship between kobjects, ksets and ktypes. The removal of 'struct subsystem' is the first step of this, especially as it is not really needed at all. Thanks to Kay for fixing the bugs in this patch. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* [PATCH] fuse: validate rootmode mount optionTimo Savola2007-04-08
| | | | | | | | | | If rootmode isn't valid, we hit the BUG() in fuse_init_inode. Now EINVAL is returned. Signed-off-by: Timo Savola <tsavola@movial.fi> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] Mark struct super_operations constJosef 'Jeff' Sipek2007-02-12
| | | | | | | | | | | This patch is inspired by Arjan's "Patch series to mark struct file_operations and struct inode_operations const". Compile tested with gcc & sparse. Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] remove invalidate_inode_pages()Andrew Morton2007-02-11
| | | | | | | | | | | Convert all calls to invalidate_inode_pages() into open-coded calls to invalidate_mapping_pages(). Leave the invalidate_inode_pages() wrapper in place for now, marked as deprecated. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* [PATCH] fuse: fix compile without CONFIG_BLOCKMiklos Szeredi2006-12-07
| | | | | | | | | | | | | | | | | | Randy Dunlap wote: > Should FUSE depend on BLOCK? Without that and with BLOCK=n, I get: > > inode.c:(.text+0x3acc5): undefined reference to `sb_set_blocksize' > inode.c:(.text+0x3a393): undefined reference to `get_sb_bdev' > fs/built-in.o:(.data+0xd718): undefined reference to `kill_block_super Most fuse filesystems work fine without block device support, so I think a better solution is to disable the 'fuseblk' filesystem type if BLOCK=n. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fuse: add DESTROY operationMiklos Szeredi2006-12-07
| | | | | | | | | | | | | Add a DESTROY operation for block device based filesystems. With the help of this operation, such a filesystem can flush dirty data to the device synchronously before the umount returns. This is needed in situations where the filesystem is assumed to be clean immediately after unmount (e.g. ejecting removable media). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fuse: add blksize optionMiklos Szeredi2006-12-07
| | | | | | | | | | Add 'blksize' option for block device based filesystems. During initialization this is used to set the block size on the device and the super block. The default block size is 512bytes. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] fuse: add support for block device based filesystemsMiklos Szeredi2006-12-07
| | | | | | | | | | | | | | | | | | I never intended this, but people started using fuse to implement block device based "real" filesystems (ntfs-3g, zfs). The following four patches add better support for these kinds of filesystems. Unlike "normal" fuse filesystems, using this feature should require superuser privileges (enforced by the fusermount utility). Thanks to Szabolcs Szakacsits for the input and testing. This patch adds a 'fuseblk' filesystem type, which is only different from the 'fuse' filesystem type in how the 'dev_name' mount argument is interpreted. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] slab: remove kmem_cache_tChristoph Lameter2006-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* [PATCH] slab: remove SLAB_KERNELChristoph Lameter2006-12-07
| | | | | | | | SLAB_KERNEL is an alias of GFP_KERNEL. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>