aboutsummaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAge
* [PATCH] relayfsTom Zanussi2005-09-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here's the latest version of relayfs, against linux-2.6.11-mm2. I'm hoping you'll consider putting this version back into your tree - the previous rounds of comment seem to have shaken out all the API issues and the number of comments on the code itself have also steadily dwindled. This patch is essentially the same as the relayfs redux part 5 patch, with some minor changes based on reviewer comments. Thanks again to Pekka Enberg for those. The patch size without documentation is now a little smaller at just over 40k. Here's a detailed list of the changes: - removed the attribute_flags in relay open and changed it to a boolean specifying either overwrite or no-overwrite mode, and removed everything referencing the attribute flags. - added a check for NULL names in relayfs_create_entry() - got rid of the unnecessary multiple labels in relay_create_buf() - some minor simplification of relay_alloc_buf() which got rid of a couple params - updated the Documentation In addition, this version (through code contained in the relay-apps tarball linked to below, not as part of the relayfs patch) tries to make it as easy as possible to create the cooperating kernel/user pieces of a typical and common type of logging application, one where kernel logging is kicked off when a user space data collection app starts and stops when the collection app exits, with the data being automatically logged to disk in between. To create this type of application, you basically just include a header file (relay-app.h, included in the relay-apps tarball) in your kernel module, define a couple of callbacks and call an initialization function, and on the user side call a single function that sets up and continuously monitors the buffers, and writes data to files as it becomes available. Channels are created when the collection app is started and destroyed when it exits, not when the kernel module is inserted, so different channel buffer sizes can be specified for each separate run via command-line options. See the README in the relay-apps tarball for details. Also included in the relay-apps tarball are a couple examples demonstrating how you can use this to create quick and dirty kernel logging/debugging applications. They are: - tprintk, short for 'tee printk', which temporarily puts a kprobe on printk() and writes a duplicate stream of printk output to a relayfs channel. This could be used anywhere there's printk() debugging code in the kernel which you'd like to exercise, but would rather not have your system logs cluttered with debugging junk. You'd probably want to kill klogd while you do this, otherwise there wouldn't be much point (since putting a kprobe on printk() doesn't change the output of printk()). I've used this method to temporarily divert the packet logging output of the iptables LOG target from the system logs to relayfs files instead, for instance. - klog, which just provides a printk-like formatted logging function on top of relayfs. Again, you can use this to keep stuff out of your system logs if used in place of printk. The example applications can be found here: http://prdownloads.sourceforge.net/dprobes/relay-apps.tar.gz?download From: Christoph Hellwig <hch@lst.de> avoid lookup_hash usage in relayfs Signed-off-by: Tom Zanussi <zanussi@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Linus Torvalds2005-09-05
|\
| * [CRYPTO]: crypto_free_tfm() callers no longer need to check for NULLJesper Juhl2005-09-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the patch to add a NULL short-circuit to crypto_free_tfm() went in, there's no longer any need for callers of that function to check for NULL. This patch removes the redundant NULL checks and also a few similar checks for NULL before calls to kfree() that I ran into while doing the crypto_free_tfm bits. I've succesfuly compile tested this patch, and a kernel with the patch applied boots and runs just fine. When I posted the patch to LKML (and other lists/people on Cc) it drew the following comments : J. Bruce Fields commented "I've no problem with the auth_gss or nfsv4 bits.--b." Sridhar Samudrala said "sctp change looks fine." Herbert Xu signed off on the patch. So, I guess this is ready to be dropped into -mm and eventually mainline. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
| * [CRYPTO]: Use CRYPTO_TFM_REQ_MAY_SLEEP where appropriateHerbert Xu2005-09-01
| | | | | | | | | | | | | | | | | | This patch goes through the current users of the crypto layer and sets CRYPTO_TFM_REQ_MAY_SLEEP at crypto_alloc_tfm() where all crypto operations are performed in process context. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | [PATCH] uml: fixes performance regression in activate_mm and thus exec()Paolo 'Blaisorblade' Giarrusso2005-09-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Normally, activate_mm() is called from exec(), and thus it used to be a no-op because we use a completely new "MM context" on the host (for instance, a new process), and so we didn't need to flush any "TLB entries" (which for us are the set of memory mappings for the host process from the virtual "RAM" file). Kernel threads, instead, are usually handled in a different way. So, when for AIO we call use_mm(), things used to break and so Benjamin implemented activate_mm(). However, that is only needed for AIO, and could slow down exec() inside UML, so be smart: detect being called for AIO (via PF_BORROWED_MM) and do the full flush only in that situation. Comment also the caller so that people won't go breaking UML without noticing. I also rely on the caller's locks for testing current->flags. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> CC: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] Generic VFS fallback for security xattrsStephen Smalley2005-09-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch modifies the VFS setxattr, getxattr, and listxattr code to fall back to the security module for security xattrs if the filesystem does not support xattrs natively. This allows security modules to export the incore inode security label information to userspace even if the filesystem does not provide xattr storage, and eliminates the need to individually patch various pseudo filesystem types to provide such access. The patch removes the existing xattr code from devpts and tmpfs as it is then no longer needed. The patch restructures the code flow slightly to reduce duplication between the normal path and the fallback path, but this should only have one user-visible side effect - a program may get -EACCES rather than -EOPNOTSUPP if policy denied access but the filesystem didn't support the operation anyway. Note that the post_setxattr hook call is not needed in the fallback case, as the inode_setsecurity hook call handles the incore inode security state update directly. In contrast, we do call fsnotify in both cases. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] add /proc/pid/smapsMauricio Lin2005-09-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a "smaps" entry to /proc/pid: show howmuch memory is resident in each mapping. People that want to perform a memory consumption analysing can use it mainly if someone needs to figure out which libraries can be reduced for embedded systems. So the new features are the physical size of shared and clean [or dirty]; private and clean [or dirty]. Take a look the example below: # cat /proc/4576/smaps 08048000-080dc000 r-xp /bin/bash Size: 592 KB Rss: 500 KB Shared_Clean: 500 KB Shared_Dirty: 0 KB Private_Clean: 0 KB Private_Dirty: 0 KB 080dc000-080e2000 rw-p /bin/bash Size: 24 KB Rss: 24 KB Shared_Clean: 0 KB Shared_Dirty: 0 KB Private_Clean: 0 KB Private_Dirty: 24 KB 080e2000-08116000 rw-p Size: 208 KB Rss: 208 KB Shared_Clean: 0 KB Shared_Dirty: 0 KB Private_Clean: 0 KB Private_Dirty: 208 KB b7e2b000-b7e34000 r-xp /lib/tls/libnss_files-2.3.2.so Size: 36 KB Rss: 12 KB Shared_Clean: 12 KB Shared_Dirty: 0 KB Private_Clean: 0 KB Private_Dirty: 0 KB ... (Includes a cleanup from "Richard Purdie" <rpurdie@rpsys.net>) From: Torsten Foertsch <torsten.foertsch@gmx.net> show_smap calls first show_map and then prints its additional information to the seq_file. show_map checks if all it has to print fits into the buffer and if yes marks the current vma as written. While that is correct for show_map it is not for show_smap. Here the vma should be marked as written only after the additional information is also written. The attached patch cures the problem. It moves the functionality of the show_map function to a new function show_map_internal that is called with an additional struct mem_size_stats* argument. Then show_map calls show_map_internal with NULL as struct mem_size_stats* whereas show_smap calls it with a real pointer. Now the final if (m->count < m->size) /* vma is copied successfully */ m->version = (vma != get_gate_vma(task))? vma->vm_start: 0; is done only if the whole entry fits into the buffer. Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] /proc/<pid>/numa_maps to show on which nodes pages resideChristoph Lameter2005-09-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch was recently discussed on linux-mm: http://marc.theaimsgroup.com/?t=112085728500002&r=1&w=2 I inherited a large code base from Ray for page migration. There was a small patch in there that I find to be very useful since it allows the display of the locality of the pages in use by a process. I reworked that patch and came up with a /proc/<pid>/numa_maps that gives more information about the vma's of a process. numa_maps is indexes by the start address found in /proc/<pid>/maps. F.e. with this patch you can see the page use of the "getty" process: margin:/proc/12008 # cat maps 00000000-00004000 r--p 00000000 00:00 0 2000000000000000-200000000002c000 r-xp 00000000 08:04 516 /lib/ld-2.3.3.so 2000000000038000-2000000000040000 rw-p 00028000 08:04 516 /lib/ld-2.3.3.so 2000000000040000-2000000000044000 rw-p 2000000000040000 00:00 0 2000000000058000-2000000000260000 r-xp 00000000 08:04 54707842 /lib/tls/libc.so.6.1 2000000000260000-2000000000268000 ---p 00208000 08:04 54707842 /lib/tls/libc.so.6.1 2000000000268000-2000000000274000 rw-p 00200000 08:04 54707842 /lib/tls/libc.so.6.1 2000000000274000-2000000000280000 rw-p 2000000000274000 00:00 0 2000000000280000-20000000002b4000 r--p 00000000 08:04 9126923 /usr/lib/locale/en_US.utf8/LC_CTYPE 2000000000300000-2000000000308000 r--s 00000000 08:04 60071467 /usr/lib/gconv/gconv-modules.cache 2000000000318000-2000000000328000 rw-p 2000000000318000 00:00 0 4000000000000000-4000000000008000 r-xp 00000000 08:04 29576399 /sbin/mingetty 6000000000004000-6000000000008000 rw-p 00004000 08:04 29576399 /sbin/mingetty 6000000000008000-600000000002c000 rw-p 6000000000008000 00:00 0 [heap] 60000fff7fffc000-60000fff80000000 rw-p 60000fff7fffc000 00:00 0 60000ffffff44000-60000ffffff98000 rw-p 60000ffffff44000 00:00 0 [stack] a000000000000000-a000000000020000 ---p 00000000 00:00 0 [vdso] cat numa_maps 2000000000000000 default MaxRef=43 Pages=11 Mapped=11 N0=4 N1=3 N2=2 N3=2 2000000000038000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2 2000000000040000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1 2000000000058000 default MaxRef=43 Pages=61 Mapped=61 N0=14 N1=15 N2=16 N3=16 2000000000268000 default MaxRef=1 Pages=2 Mapped=2 Anon=2 N0=2 2000000000274000 default MaxRef=1 Pages=3 Mapped=3 Anon=3 N0=3 2000000000280000 default MaxRef=8 Pages=3 Mapped=3 N0=3 2000000000300000 default MaxRef=8 Pages=2 Mapped=2 N0=2 2000000000318000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N2=1 4000000000000000 default MaxRef=6 Pages=2 Mapped=2 N1=2 6000000000004000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1 6000000000008000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1 60000fff7fffc000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1 60000ffffff44000 default MaxRef=1 Pages=1 Mapped=1 Anon=1 N0=1 getty uses ld.so. The first vma is the code segment which is used by 43 other processes and the pages are evenly distributed over the 4 nodes. The second vma is the process specific data portion for ld.so. This is only one page. The display format is: <startaddress> Links to information in /proc/<pid>/map <memory policy> This can be "default" "interleave={}", "prefer=<node>" or "bind={<zones>}" MaxRef= <maximum reference to a page in this vma> Pages= <Nr of pages in use> Mapped= <Nr of pages with mapcount > Anon= <nr of anonymous pages> Nx= <Nr of pages on Node x> The content of the proc-file is self-evident. If this would be tied into the sparsemem system then the contents of this file would not be too useful. Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | [PATCH] uclinux: use MAP_PRIVATE when mmaping code regions in flat binary loaderGreg Ungerer2005-09-02
|/ | | | | | | | | Use MAP_PRIVATE when calling mmap to get memory for the code region. The flat loader was using MAP_SHARED, but underlying changes to the MMUless mmap means this is now wrong. Signed-off-by: Greg Ungerer <gerg@uclinux.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* Merge refs/heads/for-linus from ↵Linus Torvalds2005-08-30
|\ | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/shaggy/jfs-2.6.git
| * Merge with /home/shaggy/git/linus-clean/Dave Kleikamp2005-08-19
| |\
| * | JFS: Initialize dentry->d_op for negative dentries tooDave Kleikamp2005-08-17
| | | | | | | | | | | | Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
* | | [TCP]: Move the tcp sock states to net/tcp_states.hArnaldo Carvalho de Melo2005-08-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lots of places just needs the states, not even linux/tcp.h, where this enum was, needs it. This speeds up development of the refactorings as less sources are rebuilt when things get moved from net/tcp.h. Signed-off-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | Merge HEAD from master.kernel.org:/home/rmk/linux-2.6-arm.git Linus Torvalds2005-08-29
|\ \ \
| * | | [ARM] fs/adfs/adfs.h: "extern inline" doesn't make senseAdrian Bunk2005-08-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | "extern inline" doesn't make sense. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
* | | | [PATCH] Fix oops in sysfs_hash_and_remove_file()James Bottomley2005-08-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem arises if an entity in sysfs is created and removed without ever having been made completely visible. In SCSI this is triggered by removing a device while it's initialising. The problem appears to be that because it was never made visible in sysfs, the sysfs dentry has a null d_inode which oopses when a reference is made to it. The solution is simply to check d_inode and assume the object was never made visible (and thus doesn't need deleting) if it's NULL. (akpm: possibly a stopgap for 2.6.13 scsi problems. May not be the long-term fix) Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] Fix oops in fs/locks.c on close of file with pending locksSteve French2005-08-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The recent change to locks_remove_flock code in fs/locks.c changes how byte range locks are removed from closing files, which shows up a bug in cifs. The assumption in the cifs code was that the close call sent to the server would remove any pending locks on the server on this file, but that is no longer safe as the fs/locks.c code on the client wants unlock of 0 to PATH_MAX to remove all locks (at least from this client, it is not possible AFAIK to remove all locks from other clients made to the server copy of the file). Note that cifs locks are different from posix locks - and it is not possible to map posix locks perfectly on the wire yet, due to restrictions of the cifs network protocol, even to Samba without adding a new request type to the network protocol (which we plan to do for Samba 3.0.21 within a few months), but the local client will have the correct, posix view, of the lock in most cases. The correct fix for cifs for this would involve a bigger change than I would like to do this late in the 2.6.13-rc cycle - and would involve cifs keeping track of all unmerged (uncoalesced) byte range locks for each remote inode and scanning that list to remove locks that intersect or fall wholly within the range - locks that intersect may have to be reaquired with the smaller, remaining range. Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] hppfs: fix symlink error pathPaolo 'Blaisorblade' Giarrusso2005-08-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While touching this code I noticed the error handling is bogus, so I fixed it up. I've removed the IS_ERR(proc_dentry) check, which will never trigger and is clearly a typo: we must check proc_file instead. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] Fixup symlink function pointers for hppfs [for 2.6.13]Paolo 'Blaisorblade' Giarrusso2005-08-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update hppfs for the symlink functions prototype change. Yes, I know the code I leave there is still _bogus_, see next patch for this. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] Document idr_get_new_above() semantics, update inotifyJohn McCutchan2005-08-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is an off by one problem with idr_get_new_above. The comment and function name suggest that it will return an id > starting_id, but it actually returned an id >= starting_id, and kernel callers other than inotify treated it as such. The patch below fixes the comment, and fixes inotifys usage. The function name still doesn't match the behaviour, but it never did. Signed-off-by: John McCutchan <ttb@tentacle.dhs.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | Don't allow normal users to set idle IO priorityLinus Torvalds2005-08-20
| | | | | | | | | | | | | | | | | | | | | | | | It has all the normal priority inversion problems. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] freevxfs: fix breakage introduced by symlink fixesAlexey Dobriyan2005-08-20
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | befs: fix up missed follow_link declaration changeLinus Torvalds2005-08-20
| | | | | | | | | | | | | | | | | | | | We'd updated the prototype and the return value, but not the function declaration itself.
* | | | [PATCH] NFSv4: unbalanced BKL in nfs_atomic_lookup()Steve Dickson2005-08-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Added missing unlock_kernel() to NFSv4 atomic lookup. Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] Fix up symlink function pointersAl Viro2005-08-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes up the symlink functions for the calling convention change: * afs, autofs4, befs, devfs, freevxfs, jffs2, jfs, ncpfs, procfs, smbfs, sysvfs, ufs, xfs - prototype change for ->follow_link() * befs, smbfs, xfs - same for ->put_link() Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | Fix nasty ncpfs symlink handling bug.Linus Torvalds2005-08-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This bug could cause oopses and page state corruption, because ncpfs used the generic page-cache symlink handlign functions. But those functions only work if the page cache is guaranteed to be "stable", ie a page that was installed when the symlink walk was started has to still be installed in the page cache at the end of the walk. We could have fixed ncpfs to not use the generic helper routines, but it is in many ways much cleaner to instead improve on the symlink walking helper routines so that they don't require that absolute stability. We do this by allowing "follow_link()" to return a error-pointer as a cookie, which is fed back to the cleanup "put_link()" routine. This also simplifies NFS symlink handling. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] jffs2: fix symlink error handlingAl Viro2005-08-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current calling conventions for ->follow_link() are already fairly complex. What we have is 1) you can return -error; then you must release nameidata yourself and ->put_link() will _not_ be called. 2) you can do nd_set_link(nd, ERR_PTR(-error)) and return 0 3) you can do nd_set_link(nd, path) and return 0 4) you can return 0 (after having moved nameidata yourself) jffs2 follow_link() is broken - it has an exit where it returns -EIO and leaks nameidata. Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] reiserfs+acl+quota deadlock fixJan Kara2005-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When i_acl_default is set to some error we do not hold the lock (hence we are not allowed to drop it and reacquire later). Signed-off-by: Jan Kara <jack@suse.cz> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Chris Mason <mason@suse.com> Cc: <reiserfs-dev@namesys.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] NFS: Introduce the use of inode->i_lock to protect fields in nfsiChuck Lever2005-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Down the road we want to eliminate the use of the global kernel lock entirely from the NFS client. To do this, we need to protect the fields in the nfs_inode structure adequately. Start by serializing updates to the "cache_validity" field. Note this change addresses an SMP hang found by njw@osdl.org, where processes deadlock because nfs_end_data_update and nfs_revalidate_mapping update the "cache_validity" field without proper serialization. Test plan: Millions of fsx ops on SMP clients. Run Nick Wilson's breaknfs program on large SMP clients. Signed-off-by: Chuck Lever <cel@netapp.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] NFS: use atomic bitops to manipulate flags in nfsi->flagsChuck Lever2005-08-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce atomic bitops to manipulate the bits in the nfs_inode structure's "flags" field. Using bitops means we can use a generic wait_on_bit call instead of an ad hoc locking scheme in fs/nfs/inode.c, so we can remove the "nfs_i_wait" field from nfs_inode at the same time. The other new flags field will continue to use bitmask and logic AND and OR. This permits several flags to be set at the same time efficiently. The following patch adds a spin lock to protect these flags, and this spin lock will later cover other fields in the nfs_inode structure, amortizing the cost of using this type of serialization. Test plan: Millions of fsx ops on SMP clients. Signed-off-by: Chuck Lever <cel@netapp.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | [PATCH] NFS: split nfsi->flags into two fieldsChuck Lever2005-08-18
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Certain bits in nfsi->flags can be manipulated with atomic bitops, and some are better manipulated via logical bitmask operations. This patch splits the flags field into two. The next patch introduces atomic bitops for one of the fields. Test plan: Millions of fsx ops on SMP clients. Signed-off-by: Chuck Lever <cel@netapp.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | Merge master.kernel.org:/pub/scm/linux/kernel/git/aia21/ntfs-2.6Linus Torvalds2005-08-17
|\ \ \
| * | | NTFS: Complete the previous fix for the unset device when mapping buffersAnton Altaparmakov2005-08-16
| |/ / | | | | | | | | | | | | | | | | | | for mft record writing. I had missed the writepage based mft record write code path. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
* | / [PATCH] nfsd to unlock kernel before exitingSteven Rostedt2005-08-17
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nfsd holds the big kernel lock upon exit, when it really shouldn't. Not to mention that this breaks Ingo's RT patch. This is a trivial fix to release the lock. Ingo, this patch also works with your kernel, and stops the problem with nfsd. Note, there's a "goto out;" where "out:" is right above svc_exit_thread. The point of the goto also holds the kernel_lock, so I don't see any problem here in releasing it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | Merge head 'for-linus' of ↵Linus Torvalds2005-08-16
|\ \ | |/ |/| | | master.kernel.org:/pub/scm/linux/kernel/git/shaggy/jfs-2.6
| * Merge with /home/shaggy/git/linus-clean/Dave Kleikamp2005-08-10
| |\ | | | | | | | | | Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
| * | JFS: Fix race in txLockDave Kleikamp2005-08-10
| | | | | | | | | | | | | | | | | | | | | | | | TxAnchor.anon_list is protected by jfsTxnLock (TXN_LOCK), but there was a place in txLock() that was removing an entry from the list without holding the spinlock. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
| * | Merge with /home/shaggy/git/linus-clean/Dave Kleikamp2005-08-04
| |\ \ | | | | | | | | | | | | Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
| * | | JFS: Check for invalid inodes in jfs_delete_inodeDave Kleikamp2005-08-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Some error paths may iput an invalid inode with i_nlink=0. jfs should not try to actually delete such an inode. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
| * | | Merge with /home/shaggy/git/linus-clean/Dave Kleikamp2005-07-28
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | /home/shaggy/git/linus-clean/ /home/shaggy/git/linus-clean/ Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
| * \ \ \ Merge with /home/shaggy/git/linus-clean/Dave Kleikamp2005-07-27
| |\ \ \ \ | | | | | | | | | | | | | | | | | | Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
| * | | | | JFS: Improve sync barrier processingDave Kleikamp2005-07-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Under heavy load, hot metadata pages are often locked by non-committed transactions, making them difficult to flush to disk. This prevents the sync point from advancing past a transaction that had modified the page. There is a point during the sync barrier processing where all outstanding transactions have been committed to disk, but no new transaction have been allowed to proceed. This is the best time to write the metadata. Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
* | | | | | Merge master.kernel.org:/pub/scm/linux/kernel/git/aia21/ntfs-2.6Linus Torvalds2005-08-16
|\ \ \ \ \ \
| * | | | | | NTFS: Fix bug in mft record writing where we forgot to set the device inAnton Altaparmakov2005-08-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the buffers when mapping them after the VM had discarded them. Thanks to Martin MOKREJŠ for the bug report. Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
* | | | | | | [PATCH] NFS: Ensure we always update inode->i_mode when doing O_EXCL createsTrond Myklebust2005-08-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the client performs an exclusive create and opens the file for writing, a Netapp filer will first create the file using the mode 01777. It does this since an NFSv3/v4 exclusive create cannot immediately set the mode bits. The 01777 mode then gets put into the inode->i_mode. After the file creation is successful, we then do a setattr to change the mode to the correct value (as per the NFS spec). The problem is that nfs_refresh_inode() no longer updates inode->i_mode, so the latter retains the 01777 mode. A bit later, the VFS notices this, and calls remove_suid(). This of course now resets the file mode to inode->i_mode & 0777. Hey presto, the file mode on the server is now magically changed to 0777. Duh... Fixes http://bugzilla.linux-nfs.org/show_bug.cgi?id=32 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | | [PATCH] NFS: Ensure ACL xdr code doesn't overflow.Trond Myklebust2005-08-16
|/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] inotify: add MOVE_SELF eventJohn McCutchan2005-08-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a MOVE_SELF event to inotify. It is sent whenever the inode you are watching is moved. We need this event so that we can catch something like this: - app1: watch /etc/mtab - app2: cp /etc/mtab /tmp/mtab-work mv /etc/mtab /etc/mtab~ mv /tmp/mtab-work /etc/mtab app1 still thinks it's watching /etc/mtab but it's actually watching /etc/mtab~. Signed-off-by: John McCutchan <ttb@tentacle.dhs.org> Signed-off-by: Robert Love <rml@novell.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] inotify: fix idr_get_new_above usageRobert Love2005-08-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are saving the wrong thing in ->last_wd. We want the wd, not the return value. Signed-off-by: Robert Love <rml@novell.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] CIFS: Fix path name conversion for long filenamesSteve French2005-08-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix path name conversion for long filenames when mapchars mount option was specified at mount time. Signed-off-by: Steve French (sfrench@us.ibm.com) Signed-off-by: Linus Torvalds <torvalds@osdl.org>
* | | | | | [PATCH] CIFS: Fix missing entries in search resultsSteve French2005-08-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix missing entries in search results when very long file names and more than 50 (or so) of such long search entries in the directory. FindNext could send corrupt last byte of resume name when resume key was a few hundred bytes long file name or longer. Fixes Samba Bug # 2932 Signed-off-by: Steve French (sfrench@us.ibm.com) Signed-off-by: Linus Torvalds <torvalds@osdl.org>