aboutsummaryrefslogtreecommitdiffstats
path: root/fs
Commit message (Collapse)AuthorAge
* sysfs: simply sysfs_get_dentryTejun Heo2007-10-12
| | | | | | | | | | | | | Now that we know the sysfs tree structure cannot change under us and sysfs shadow support is dropped, sysfs_get_dentry() can be simplified greatly. It can just look up from the root and there's no need to retry on failure. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: Introduce sysfs_rename_mutexEric W. Biederman2007-10-12
| | | | | | | | | | | | | | | | | | | | | Looking carefully at the rename code we have a subtle dependency that the structure of sysfs not change while we are performing a rename. If the parent directory of the object we are renaming changes while the rename is being performed nasty things could happen when we go to release our locks. So introduce a sysfs_rename_mutex to prevent this highly unlikely theoretical issue. In addition hold sysfs_rename_mutex over all calls to sysfs_get_dentry. Allowing sysfs_get_dentry to be simplified in the future. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: Rewrite sysfs_drop_dentry.Eric W. Biederman2007-10-12
| | | | | | | | | | | | | Currently we find the dentry to drop by looking at sd->s_dentry. We can just as easily accomplish the same task by looking up the sysfs inode and finding all of the dentries from there, with the added bonus that we don't need to play with the sysfs_assoc_lock. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: Simplify readdir.Eric W. Biederman2007-10-12
| | | | | | | | | | | | | | | | | | | | | | | | | | At some point someone wrote sysfs_readdir to insert a cursor into the list of sysfs_dirents to ensure that sysfs_readdir would restart properly. That works but it is complex code and tends to be expensive. The same effect can be achieved by keeping the sysfs_dirents in inode order and using the inode number as the f_pos. Then when we restart we just have to find the first dirent whose inode number is equal or greater then the last sysfs_dirent we attempted to return. Removing the sysfs directory cursor also allows the remove of all of the mysterious checks for sysfs_type(sd) != 0. Which were nonbovious checks to see if a cursor was in a directory list. tj: offset marker for EOF is changed from UINT_MAX to INT_MAX to avoid overflow in case offset is 32bit. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: In sysfs_lookup don't open code sysfs_find_direntEric W. Biederman2007-10-12
| | | | | | | | | | | This is a small cleanup patch that makes the code just a little bit cleaner. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: Make sysfs_mount staticEric W. Biederman2007-10-12
| | | | | | | | | | | | | | | This patch modifies the users of sysfs_mount to use sysfs_root instead (which is what they are looking for). It then makes sysfs_mount static to keep people from using it by accident. The net result is slightly faster and cleaner code. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: Use kill_anon_superEric W. Biederman2007-10-12
| | | | | | | | | | | | | | | Since sysfs no longer stores fs directory information in the dcache on a permanent basis kill_litter_super it is inappropriate and actively wrong. It will decrement the count on all dentries left in the dcache before trying to free them. At the moment this is not biting us only because we never unmount sysfs. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: Remove sysfs_instantiateEric W. Biederman2007-10-12
| | | | | | | | | | | Now that sysfs_get_inode is dropping the inode lock we no longer have a need from sysfs_instantiate. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: Move all of inode initialization into sysfs_init_inodeEric W. Biederman2007-10-12
| | | | | | | | Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: fix i_mutex locking in sysfs_get_dentry()Tejun Heo2007-10-12
| | | | | | | | | | | | | lookup_one_len_kern() should be called with the parent's i_mutex locked. Fix it. Spotted by Eric W. Biederman. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: Fix typos in fs/sysfs/file.cRolf Eike Beer2007-10-12
| | | | | | | Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: make sysfs_addrm_finish() return voidTejun Heo2007-10-12
| | | | | | | | | | | | | | | With the previous sysfs_add_one() update, there is only one user of the return value of sysfs_addrm_finish() and the user can switch to testing @sd easily. Make sysfs_addrm_finish() return void for cleaner semantics as suggested by Satyam Sharma. This patch doesn't introduce any noticeable behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Satyam Sharma <satyam.sharma@gmail.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: make sysfs_add_one() automatically check for duplicate entryTejun Heo2007-10-12
| | | | | | | | | | | | Make sysfs_add_one() check for duplicate entry and return -EEXIST if such entry exists. This simplifies node addition code a bit. This patch doesn't introduce any noticeable behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: make sysfs_add/remove_one() call link/unlink_sibling() implictlyTejun Heo2007-10-12
| | | | | | | | | | | | | | | | | When adding or removing a sysfs_dirent, the user used to be required to call link/unlink separately. It was for two reasons - code looked like that before sysfs_addrm_cxt conversion and to avoid looping through parent_sd->children list twice during removal. Performance optimization during removal just isn't worth it. Make sysfs_add/remove_one() call sysfs_link/unlink_sibing() implicitly. This makes code simpler albeit slightly less efficient. This change doesn't introduce any noticeable behavior change. Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: simplify sysfs_rename_dir()Tejun Heo2007-10-12
| | | | | | | | | | | | | | | | | With the shadow directories gone, sysfs_rename_dir() can be simplified. * parent doesn't need to be grabbed separately. Just access old_dentry->d_parent. * parent sd can never change. Remove code to move under the new parent. * Massage comments a bit. Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: cosmetic changes in sysfs_lookup()Tejun Heo2007-10-12
| | | | | | | | | | | | | * remove space between * and symbol name in variable declaration. * kill unnecessary new line. * kill 'found' and test 'sd' instead. Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: Remove first pass at shadow directory supportEric W. Biederman2007-10-12
| | | | | | | | | | | | | | | | | | | While shadow directories appear to be a good idea, the current scheme of controlling their creation and destruction outside of sysfs appears to be a locking and maintenance nightmare in the face of sysfs directories dynamically coming and going. Which can now occur for directories containing network devices when CONFIG_SYSFS_DEPRECATED is not set. This patch removes everything from the initial shadow directory support that allowed the shadow directory creation to be controlled at a higher level. So except for a few bits of sysfs_rename_dir everything from commit b592fcfe7f06c15ec11774b5be7ce0de3aa86e73 is now gone. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Tejun Heo <htejun@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs: cleanup semaphore.hDave Young2007-10-12
| | | | | | | | | Cleanup semaphore.h Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* sysfs/file.c - use mutex instead of semaphoreDave Young2007-10-12
| | | | | | | | Use mutex instead of semaphore in sysfs/file.c : sys_buffer. Signed-off-by: Dave Young <hidave.darkstar@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* debugfs: helper for decimal challengedRobin Getz2007-10-12
| | | | | | | | Allows debugfs helper functions to have a hex output, rather than just decimal Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* Drivers: clean up direct setting of the name of a ksetGreg Kroah-Hartman2007-10-12
| | | | | | | | | | | | A kset should not have its name set directly, so dynamically set the name at runtime. This is needed to remove the static array in the kobject structure which will be changed in a future patch. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
* kobjects: fix up improper use of the kobject name fieldGreg Kroah-Hartman2007-10-12
| | | | | | | A number of different drivers incorrect access the kobject name field directly. This is not correct as the name might not be in the array. Use the proper accessor function instead.
* Fix up more bio falloutAl Viro2007-10-12
| | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* Merge branch 'master' of ↵Linus Torvalds2007-10-12
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (408 commits) [POWERPC] Add memchr() to the bootwrapper [POWERPC] Implement logging of unhandled signals [POWERPC] Add legacy serial support for OPB with flattened device tree [POWERPC] Use 1TB segments [POWERPC] XilinxFB: Allow fixed framebuffer base address [POWERPC] XilinxFB: Add support for custom screen resolution [POWERPC] XilinxFB: Use pdata to pass around framebuffer parameters [POWERPC] PCI: Add 64-bit physical address support to setup_indirect_pci [POWERPC] 4xx: Kilauea defconfig file [POWERPC] 4xx: Kilauea DTS [POWERPC] 4xx: Add AMCC Kilauea eval board support to platforms/40x [POWERPC] 4xx: Add AMCC 405EX support to cputable.c [POWERPC] Adjust TASK_SIZE on ppc32 systems to 3GB that are capable [POWERPC] Use PAGE_OFFSET to tell if an address is user/kernel in SW TLB handlers [POWERPC] 85xx: Enable FP emulation in MPC8560 ADS defconfig [POWERPC] 85xx: Killed <asm/mpc85xx.h> [POWERPC] 85xx: Add cpm nodes for 8541/8555 CDS [POWERPC] 85xx: Convert mpc8560ads to the new CPM binding. [POWERPC] mpc8272ads: Remove muram from the CPM reg property. [POWERPC] Make clockevents work on PPC601 processors ... Fixed up conflict in Documentation/powerpc/booting-without-of.txt manually.
| * Merge branch 'linux-2.6' into for-2.6.24Paul Mackerras2007-10-03
| |\
| * \ Merge branch 'linux-2.6'Paul Mackerras2007-09-19
| |\ \
| * | | [POWERPC] spufs: Cleanup ELF coredump extra notes logicMichael Ellerman2007-09-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To start with, arch_notes_size() etc. is a little too ambiguous a name for my liking, so change the function names to be more explicit. Calling through macros is ugly, especially with hidden parameters, so don't do that, call the routines directly. Use ARCH_HAVE_EXTRA_ELF_NOTES as the only flag, and based on it decide whether we want the extern declarations or the empty versions. Since we have empty routines, actually use them in the coredump code to save a few #ifdefs. We want to change the handling of foffset so that the write routine updates foffset as it goes, instead of using file->f_pos (so that writing to a pipe works). So pass foffset to the write routine, and for now just set it to file->f_pos at the end of writing. It should also be possible for the write routine to fail, so change it to return int and treat a non-zero return as failure. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Jeremy Kerr <jk@ozlabs.org> Signed-off-by: Paul Mackerras <paulus@samba.org>
* | | | JFS: fix bio-related build breakageJeff Garzik2007-10-12
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Jeff Garzik <jgarzik@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'master' of ↵Linus Torvalds2007-10-11
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (867 commits) [SKY2]: status polling loop (post merge) [NET]: Fix NAPI completion handling in some drivers. [TCP]: Limit processing lost_retrans loop to work-to-do cases [TCP]: Fix lost_retrans loop vs fastpath problems [TCP]: No need to re-count fackets_out/sacked_out at RTO [TCP]: Extract tcp_match_queue_to_sack from sacktag code [TCP]: Kill almost unused variable pcount from sacktag [TCP]: Fix mark_head_lost to ignore R-bit when trying to mark L [TCP]: Add bytes_acked (ABC) clearing to FRTO too [IPv6]: Update setsockopt(IPV6_MULTICAST_IF) to support RFC 3493, try2 [NETFILTER]: x_tables: add missing ip6t_modulename aliases [NETFILTER]: nf_conntrack_tcp: fix connection reopening [QETH]: fix qeth_main.c [NETLINK]: fib_frontend build fixes [IPv6]: Export userland ND options through netlink (RDNSS support) [9P]: build fix with !CONFIG_SYSCTL [NET]: Fix dev_put() and dev_hold() comments [NET]: make netlink user -> kernel interface synchronious [NET]: unify netlink kernel socket recognition [NET]: cleanup 3rd argument in netlink_sendskb ... Fix up conflicts manually in Documentation/feature-removal-schedule.txt and my new least favourite crap, the "mod_devicetable" support in the files include/linux/mod_devicetable.h and scripts/mod/file2alias.c. (The latter files seem to be explicitly _designed_ to get conflicts when different subsystems work with them - that have an absolutely horrid lack of subsystem separation!) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | [NET]: make netlink user -> kernel interface synchroniousDenis V. Lunev2007-10-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch make processing netlink user -> kernel messages synchronious. This change was inspired by the talk with Alexey Kuznetsov about current netlink messages processing. He says that he was badly wrong when introduced asynchronious user -> kernel communication. The call netlink_unicast is the only path to send message to the kernel netlink socket. But, unfortunately, it is also used to send data to the user. Before this change the user message has been attached to the socket queue and sk->sk_data_ready was called. The process has been blocked until all pending messages were processed. The bad thing is that this processing may occur in the arbitrary process context. This patch changes nlk->data_ready callback to get 1 skb and force packet processing right in the netlink_unicast. Kernel -> user path in netlink_unicast remains untouched. EINTR processing for in netlink_run_queue was changed. It forces rtnl_lock drop, but the process remains in the cycle until the message will be fully processed. So, there is no need to use this kludges now. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | [FS] seq_file: Introduce the seq_open_private()Pavel Emelyanov2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function allocates the zeroed chunk of memory and call seq_open(). The __seq_open_private() helper returns the allocated memory to make it possible for the caller to initialize it. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | [NETNS]: Move some code into __init section when CONFIG_NET_NS=nPavel Emelyanov2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With the net namespaces many code leaved the __init section, thus making the kernel occupy more memory than it did before. Since we have a config option that prohibits the namespace creation, the functions that initialize/finalize some netns stuff are simply not needed and can be freed after the boot. Currently, this is almost not noticeable, since few calls are no longer in __init, but when the namespaces will be merged it will be possible to free more code. I propose to use the __net_init, __net_exit and __net_initdata "attributes" for functions/variables that are not used if the CONFIG_NET_NS is not set to save more space in memory. The exiting functions cannot just reside in the __exit section, as noticed by David, since the init section will have references on it and the compilation will fail due to modpost checks. These references can exist, since the init namespace never dies and the exit callbacks are never called. So I introduce the __exit_refok attribute just like it is already done with the __init_refok. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | [NET]: Fix race when opening a proc file while a network namespace is exiting.Eric W. Biederman2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The problem: proc_net files remember which network namespace the are against but do not remember hold a reference count (as that would pin the network namespace). So we currently have a small window where the reference count on a network namespace may be incremented when opening a /proc file when it has already gone to zero. To fix this introduce maybe_get_net and get_proc_net. maybe_get_net increments the network namespace reference count only if it is greater then zero, ensuring we don't increment a reference count after it has gone to zero. get_proc_net handles all of the magic to go from a proc inode to the network namespace instance and call maybe_get_net on it. PROC_NET the old accessor is removed so that we don't get confused and use the wrong helper function. Then I fix up the callers to use get_proc_net and handle the case case where get_proc_net returns NULL. In that case I return -ENXIO because effectively the network namespace has already gone away so the files we are trying to access don't exist anymore. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | [NETNS]: Fix export symbols.Daniel Lezcano2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the appropriate EXPORT_SYMBOLS for proc_net_create, proc_net_fops_create and proc_net_remove to fix errors when compiling allmodconfig Signed-off-by: Mark Nelson <markn@au1.ibm.com> Acked-by: Benjamin Thery <benjamin.thery@bull.net> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | [NET]: Fix missed addition of fs/proc/proc_net.cDavid S. Miller2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | My bad. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | [NET]: Make the device list and device lookups per namespace.Eric W. Biederman2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes most of the generic device layer network namespace safe. This patch makes dev_base_head a network namespace variable, and then it picks up a few associated variables. The functions: dev_getbyhwaddr dev_getfirsthwbytype dev_get_by_flags dev_get_by_name __dev_get_by_name dev_get_by_index __dev_get_by_index dev_ioctl dev_ethtool dev_load wireless_process_ioctl were modified to take a network namespace argument, and deal with it. vlan_ioctl_set and brioctl_set were modified so their hooks will receive a network namespace argument. So basically anthing in the core of the network stack that was affected to by the change of dev_base was modified to handle multiple network namespaces. The rest of the network stack was simply modified to explicitly use &init_net the initial network namespace. This can be fixed when those components of the network stack are modified to handle multiple network namespaces. For now the ifindex generator is left global. Fundametally ifindex numbers are per namespace, or else we will have corner case problems with migration when we get that far. At the same time there are assumptions in the network stack that the ifindex of a network device won't change. Making the ifindex number global seems a good compromise until the network stack can cope with ifindex changes when you change namespaces, and the like. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | [NET]: Support multiple network namespaces with netlinkEric W. Biederman2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Each netlink socket will live in exactly one network namespace, this includes the controlling kernel sockets. This patch updates all of the existing netlink protocols to only support the initial network namespace. Request by clients in other namespaces will get -ECONREFUSED. As they would if the kernel did not have the support for that netlink protocol compiled in. As each netlink protocol is updated to be multiple network namespace safe it can register multiple kernel sockets to acquire a presence in the rest of the network namespaces. The implementation in af_netlink is a simple filter implementation at hash table insertion and hash table look up time. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | [NET]: Make /proc/net per network namespaceEric W. Biederman2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes /proc/net per network namespace. It modifies the global variables proc_net and proc_net_stat to be per network namespace. The proc_net file helpers are modified to take a network namespace argument, and all of their callers are fixed to pass &init_net for that argument. This ensures that all of the /proc/net files are only visible and usable in the initial network namespace until the code behind them has been updated to be handle multiple network namespaces. Making /proc/net per namespace is necessary as at least some files in /proc/net depend upon the set of network devices which is per network namespace, and even more files in /proc/net have contents that are relevant to a single network namespace. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | | [NET]: Don't implement dev_ifname32 inlineEric W. Biederman2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current implementation of dev_ifname makes maintenance difficult because updates to the implementation of the ioctl have to made in two places. So this patch updates dev_ifname32 to do a classic 32/64 structure conversion and call sys_ioctl like the rest of the compat calls do. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | compat_ioctl: move floppy handlers to block/compat_ioctl.cArnd Bergmann2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The floppy ioctls are used by multiple drivers, so they should be handled in a shared location. Also, add minor cleanups. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | compat_ioctl: move cdrom handlers to block/compat_ioctl.cArnd Bergmann2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These are shared by all cd-rom drivers and should have common handlers. Do slight cosmetic cleanups in the process. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | compat_ioctl: move BLKPG handling to block/compat_ioctl.cArnd Bergmann2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | BLKPG is common to all block devices, so it should be handled by common code. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | compat_ioctl: move hdio calls to block/compat_ioctl.cArnd Bergmann2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These are common to multiple block drivers, so they should be handled by the block layer. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | compat_ioctl: handle blk_trace ioctlsArnd Bergmann2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | blk_trace_setup is broken on x86_64 compat systems, this makes the code work correctly on all 64 bit architectures in compat mode. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | compat_ioctl: add compat_blkdev_driver_ioctl()Arnd Bergmann2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Handle those blockdev ioctl calls that are compatible directly from the compat_blkdev_ioctl() function, instead of having to go through the compat_ioctl hash lookup. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | compat_ioctl: move common block ioctls to compat_blkdev_ioctlArnd Bergmann2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make compat_blkdev_ioctl and blkdev_ioctl reflect the respective native versions. This is somewhat more efficient and makes it easier to keep the two in sync. Also get rid of the bogus handling for broken_blkgetsize and the duplicate entry for BLKRASET. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | Drop 'size' argument from bio_endio and bi_end_ioNeilBrown2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As bi_end_io is only called once when the reqeust is complete, the 'size' argument is now redundant. Remove it. Now there is no need for bio_endio to subtract the size completed from bi_size. So don't do that either. While we are at it, change bi_end_io to return void. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | Don't decrement bi_size in bio_endioNeilBrown2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The only caller of bio_endio that does not pass the full bi_size is end_that_request_first. Also, no ->bi_end_io method is really interested in bi_size being decremented. So move the decrement and related code into ll_rw_blk and merge it with order_bio_endio to form req_bio_endio which does endio functionality specific to request completion. As some ->bi_end_io methods do check bi_size of 0, we set it thus for now, but that will go in the next patch. Signed-off-by: Neil Brown <neilb@suse.de> ### Diffstat output ./block/ll_rw_blk.c | 42 +++++++++++++++++++++++++++--------------- ./fs/bio.c | 23 +++++++++++------------ 2 files changed, 38 insertions(+), 27 deletions(-) diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | Only call bi_end_io once for any bioNeilBrown2007-10-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently bi_end_io can be called multiple times as sub-requests complete. However no ->bi_end_io function wants to know about that. So only call when the bio is complete. Signed-off-by: Neil Brown <neilb@suse.de> ### Diffstat output ./fs/bio.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff .prev/fs/bio.c ./fs/bio.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
* | | | | Fix warnings with !CONFIG_BLOCKJens Axboe2007-10-10
|/ / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hide everything in blkdev.h with CONFIG_BLOCK isn't set, and fixup the (few) files that fail to build because they were relying on blkdev.h pulling in extra includes for them. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>