aboutsummaryrefslogtreecommitdiffstats
path: root/include/linux
Commit message (Collapse)AuthorAge
* KVM: PPC: Allow userspace to unset the IRQ lineAlexander Graf2010-05-17
| | | | | | | | | | | | | | | Userspace can tell us that it wants to trigger an interrupt. But so far it can't tell us that it wants to stop triggering one. So let's interpret the parameter to the ioctl that we have anyways to tell us if we want to raise or lower the interrupt line. Signed-off-by: Alexander Graf <agraf@suse.de> v2 -> v3: - Add CAP for unset irq Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86: Add support for saving&restoring debug registersJan Kiszka2010-04-25
| | | | | | | | So far user space was not able to save and restore debug registers for migration or after reset. Plug this hole. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: x86: Save&restore interrupt shadow maskJan Kiszka2010-04-25
| | | | | | | | | | | | | The interrupt shadow created by STI or MOV-SS-like operations is part of the VCPU state and must be preserved across migration. Transfer it in the spare padding field of kvm_vcpu_events.interrupt. As a side effect we now have to make vmx_set_interrupt_shadow robust against both shadow types being set. Give MOV SS a higher priority and skip STI in that case to avoid that VMX throws a fault on next entry. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* KVM: PPC: Add capability for paired singlesAlexander Graf2010-04-25
| | | | | | | | We need to tell userspace that we can emulate paired single instructions. So let's add a capability export. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>
* Merge branch 'for-linus' of ↵Linus Torvalds2010-04-21
|\ | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lrg/voltage-2.6: mc13783-regulator: fix a memory leak in mc13783_regulator_remove regulator: Let drivers know when they use the stub API
| * regulator: Let drivers know when they use the stub APIJean Delvare2010-04-19
| | | | | | | | | | | | | | | | | | | | | | Have the stub variant of regulator_get() return NULL, so that drivers can (but still don't have to) handle this case specifically. Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Cc: Jerome Oufella <jerome.oufella@savoirfairelinux.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk>
* | KVM: Increase NR_IOBUS_DEVS limit to 200Sridhar Samudrala2010-04-20
| | | | | | | | | | | | | | | | | | | | This patch increases the current hardcoded limit of NR_IOBUS_DEVS from 6 to 200. We are hitting this limit when creating a guest with more than 1 virtio-net device using vhost-net backend. Each virtio-net device requires 2 such devices to service notifications from rx/tx queues. Signed-off-by: Sridhar Samudrala <sri@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
* | KVM: fix the handling of dirty bitmaps to avoid overflowsTakuya Yoshikawa2010-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | Int is not long enough to store the size of a dirty bitmap. This patch fixes this problem with the introduction of a wrapper function to calculate the sizes of dirty bitmaps. Note: in mark_page_dirty(), we have to consider the fact that __set_bit() takes the offset as int, not long. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
* | Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds2010-04-19
|\ \ | |/ |/| | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: Make RCU lockdep check the lockdep_recursion variable rcu: Update docs for rcu_access_pointer and rcu_dereference_protected rcu: Better explain the condition parameter of rcu_dereference_check() rcu: Add rcu_access_pointer and rcu_dereference_protected
| * rcu: Make RCU lockdep check the lockdep_recursion variablePaul E. McKenney2010-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The lockdep facility temporarily disables lockdep checking by incrementing the current->lockdep_recursion variable. Such disabling happens in NMIs and in other situations where lockdep might expect to recurse on itself. This patch therefore checks current->lockdep_recursion, disabling RCU lockdep splats when this variable is non-zero. In addition, this patch removes the "likely()", as suggested by Lai Jiangshan. Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Reported-by: David Miller <davem@davemloft.net> Tested-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: eric.dumazet@gmail.com LKML-Reference: <20100415195039.GA22623@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * rcu: Better explain the condition parameter of rcu_dereference_check()David Howells2010-04-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Better explain the condition parameter of rcu_dereference_check() that describes the conditions under which the dereference is permitted to take place (and incorporate Yong Zhang's suggestion). This condition is only checked under lockdep proving. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: eric.dumazet@gmail.com LKML-Reference: <1270852752-25278-2-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * rcu: Add rcu_access_pointer and rcu_dereference_protectedPaul E. McKenney2010-04-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds variants of rcu_dereference() that handle situations where the RCU-protected data structure cannot change, perhaps due to our holding the update-side lock, or where the RCU-protected pointer is only to be fetched, not dereferenced. These are needed due to some performance concerns with using rcu_dereference() where it is not required, aside from the need for lockdep/sparse checking. The new rcu_access_pointer() primitive is for the case where the pointer is be fetch and not dereferenced. This primitive may be used without protection, RCU or otherwise, due to the fact that it uses ACCESS_ONCE(). The new rcu_dereference_protected() primitive is for the case where updates are prevented, for example, due to holding the update-side lock. This primitive does neither ACCESS_ONCE() nor smp_read_barrier_depends(), so can only be used when updates are somehow prevented. Suggested-by: David Howells <dhowells@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com Cc: eric.dumazet@gmail.com LKML-Reference: <1270852752-25278-1-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | Merge branch 'for-linus' of ↵Linus Torvalds2010-04-15
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: firewire: cdev: change license of exported header files to MIT license firewire: cdev: comment fixlet firewire: cdev: iso packet documentation firewire: cdev: fix information leak firewire: cdev: require quadlet-aligned headers for transmit packets firewire: cdev: disallow receive packets without header
| * | firewire: cdev: change license of exported header files to MIT licenseStefan Richter2010-04-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Among else, this allows projects like libdc1394 to carry copies of the ABI related header files without them or distributors having to worry about effects on the project's overall license terms. Switch to MIT license as suggested by Kristian. Also update the year in the copyright statement according to source history. Cc: Jay Fenlason <fenlason@redhat.com> Acked-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
| * | firewire: cdev: comment fixletStefan Richter2010-04-10
| | | | | | | | | | | | Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
| * | firewire: cdev: iso packet documentationClemens Ladisch2010-04-10
| | | | | | | | | | | | | | | | | | | | | Add the missing documentation for iso packets. Signed-off-by: Clemens Ladisch <clemens@ladisch.de> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
* | | Merge branch 'for-linus' of ↵Linus Torvalds2010-04-15
|\ \ \ | |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: wacom - switch mode upon system resume Revert "Input: wacom - merge out and in prox events" Input: matrix_keypad - allow platform to disable key autorepeat Input: ALPS - add signature for HP Pavilion dm3 laptops Input: i8042 - spelling fix Input: sparse-keymap - implement safer freeing of the keymap Input: update the status of the Multitouch X driver project Input: clarify the no-finger event in multitouch protocol Input: bcm5974 - retract efi-broken suspend_resume Input: sparse-keymap - free the right keymap on error
| * | Input: matrix_keypad - allow platform to disable key autorepeatH Hartley Sweeten2010-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In an embedded system the matrix_keypad driver might be used to interface with an external control panel and not an actual keyboard. On the control panel some of the keys could be used to turn on/off various functions. If key autorepeat is enabled this causes the function to quickly toggle between the on and off states and makes operation difficult. Add an option in the platform-specific data to disable the key autorepeat. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
* | | NFSv4: fix delegated lockingTrond Myklebust2010-04-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Arnaud Giersch reports that NFSv4 locking is broken when we hold a delegation since commit 8e469ebd6dc32cbaf620e134d79f740bf0ebab79 (NFSv4: Don't allow posix locking against servers that don't support it). According to Arnaud, the lock succeeds the first time he opens the file (since we cannot do a delegated open) but then fails after we start using delegated opens. The following patch fixes it by ensuring that locking behaviour is governed by a per-filesystem capability flag that is initially set, but gets cleared if the server ever returns an OPEN without the NFS4_OPEN_RESULT_LOCKTYPE_POSIX flag being set. Reported-by: Arnaud Giersch <arnaud.giersch@iut-bm.univ-fcomte.fr> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: stable@kernel.org
* | | Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2010-04-09
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.kernel.dk/linux-2.6-block: (34 commits) cfq-iosched: Fix the incorrect timeslice accounting with forced_dispatch loop: Update mtime when writing using aops block: expose the statistics in blkio.time and blkio.sectors for the root cgroup backing-dev: Handle class_create() failure Block: Fix block/elevator.c elevator_get() off-by-one error drbd: lc_element_by_index() never returns NULL cciss: unlock on error path cfq-iosched: Do not merge queues of BE and IDLE classes cfq-iosched: Add additional blktrace log messages in CFQ for easier debugging i2o: Remove the dangerous kobj_to_i2o_device macro block: remove 16 bytes of padding from struct request on 64bits cfq-iosched: fix a kbuild regression block: make CONFIG_BLK_CGROUP visible Remove GENHD_FL_DRIVERFS block: Export max number of segments and max segment size in sysfs block: Finalize conversion of block limits functions block: Fix overrun in lcm() and move it to lib vfs: improve writeback_inodes_wb() paride: fix off-by-one test drbd: fix al-to-on-disk-bitmap for 4k logical_block_size ...
| * | | i2o: Remove the dangerous kobj_to_i2o_device macroFerenc Wagner2010-03-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This macro worked only when applied to variables named 'kobj'. While this could have been fixed by simply renaming the macro argument, a more type-safe replacement by an inline function would be preferred. However, nobody uses this macro, so it's simpler to just remove it. Signed-off-by: Ferenc Wagner <wferi@niif.hu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | block: remove 16 bytes of padding from struct request on 64bitsRichard Kennedy2010-03-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove alignment padding to shrink struct request from 336 to 320 bytes so needing one fewer cacheline and therefore removing 48 bytes from struct request_queue. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | Merge branch 'master' into for-linusJens Axboe2010-03-19
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: block/Kconfig Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | Remove GENHD_FL_DRIVERFSNeilBrown2010-03-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This flag is not used, so best discarded. Signed-off-by: NeilBrown <neilb@suse.de> -- Hi Jens, I came across this recently - these are the only two occurances of "GENHD_FL_DRIVERFS" in the kernel, so it cannot be needed. NeilBrown Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: Finalize conversion of block limits functionsMartin K. Petersen2010-03-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Remove compatibility wrappers and update remaining drivers. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | block: Fix overrun in lcm() and move it to libMartin K. Petersen2010-03-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | lcm() was defined to take integer-sized arguments. The supplied arguments are multiplied, however, causing us to overflow given sufficiently large input. That in turn led to incorrect optimal I/O size reporting in some cases (RAID over RAID). Switch lcm() over to unsigned long similar to gcd() and move the function from blk-settings.c to lib. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | vfs: improve writeback_inodes_wb()Edward Shishkin2010-03-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not pin/unpin superblock for every inode in writeback_inodes_wb(), pin it for the whole group of inodes which belong to the same superblock and call writeback_sb_inodes() handler for them. Signed-off-by: Edward Shishkin <edward.shishkin@gmail.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
| * | | | drbd: Renamed overwrite_peer to primary_forcePhilipp Reisner2010-03-11
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
| * | | | drbd: --dry-run option for drbdsetup net ( drbdadm -- --dry-run connect <res> )Philipp Reisner2010-03-11
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* | | | | radix_tree_tag_get() is not as safe as the docs make out [ver #2]David Howells2010-04-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | radix_tree_tag_get() is not safe to use concurrently with radix_tree_tag_set() or radix_tree_tag_clear(). The problem is that the double tag_get() in radix_tree_tag_get(): if (!tag_get(node, tag, offset)) saw_unset_tag = 1; if (height == 1) { int ret = tag_get(node, tag, offset); may see the value change due to the action of set/clear. RCU is no protection against this as no pointers are being changed, no nodes are being replaced according to a COW protocol - set/clear alter the node directly. The documentation in linux/radix-tree.h, however, says that radix_tree_tag_get() is an exception to the rule that "any function modifying the tree or tags (...) must exclude other modifications, and exclude any functions reading the tree". The problem is that the next statement in radix_tree_tag_get() checks that the tag doesn't vary over time: BUG_ON(ret && saw_unset_tag); This has been seen happening in FS-Cache: https://www.redhat.com/archives/linux-cachefs/2010-April/msg00013.html To this end, remove the BUG_ON() from radix_tree_tag_get() and note in various comments that the value of the tag may change whilst the RCU read lock is held, and thus that the return value of radix_tree_tag_get() may not be relied upon unless radix_tree_tag_set/clear() and radix_tree_delete() are excluded from running concurrently with it. Reported-by: Romain DEGEZ <romain.degez@smartjog.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | slab: Generify kernel pointer validationPekka Enberg2010-04-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As suggested by Linus, introduce a kern_ptr_validate() helper that does some sanity checks to make sure a pointer is a valid kernel pointer. This is a preparational step for fixing SLUB kmem_ptr_validate(). Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: David Rientjes <rientjes@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Matt Mackall <mpm@selenic.com> Cc: Nick Piggin <npiggin@suse.de> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | libata: Fix accesses at LBA28 boundary (old bug, but nasty) (v2)Mark Lord2010-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Most drives from Seagate, Hitachi, and possibly other brands, do not allow LBA28 access to sector number 0x0fffffff (2^28 - 1). So instead use LBA48 for such accesses. This bug could bite a lot of systems, especially when the user has taken care to align partitions to 4KB boundaries. On misaligned systems, it is less likely to be encountered, since a 4KB read would end at 0x10000000 rather than at 0x0fffffff. Signed-off-by: Mark Lord <mlord@pobox.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6Linus Torvalds2010-04-08
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide-2.6: ide: Fix IDE taskfile with cfq scheduler ide: Must hold queue lock when requeueing ide: Requeue request after DMA timeout
| * | | | | ide: Requeue request after DMA timeoutHerbert Xu2010-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I noticed that my KVM virtual machines were experiencing IDE issues resulting in processes stuck on waiting for buffers to complete. The root cause is of course race conditions in the ancient qemu backend that I'm using. However, the fact that the guest isn't recovering is a bug. I've tracked it down to the change made last year to dequeue requests at the start rather than at the end in the IDE layer. commit 8f6205cd572fece673da0255d74843680f67f879 Author: Tejun Heo <tj@kernel.org> Date: Fri May 8 11:53:59 2009 +0900 ide: dequeue in-flight request The problem is that the function ide_dma_timeout_retry does not requeue the current request, causing one request to be lost for each DMA timeout. This patch fixes this by requeueing the request. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
* | | | | | virtio: disable multiport console support.Michael S. Tsirkin2010-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Move MULTIPORT feature and related config changes out of exported headers, and disable the feature at runtime. At this point, it seems less risky to keep code around until we can enable it than rip it out completely. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
* | | | | | Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2010-04-07
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-tip: x86: Fix double enable_IR_x2apic() call on SMP kernel on !SMP boards x86: Increase CONFIG_NODES_SHIFT max to 10 ibft, x86: Change reserve_ibft_region() to find_ibft_region() x86, hpet: Fix bug in RTC emulation x86, hpet: Erratum workaround for read after write of HPET comparator bootmem, x86: Fix 32bit numa system without RAM on node 0 nobootmem, x86: Fix 32bit numa system without RAM on node 0 x86: Handle overlapping mptables x86: Make e820_remove_range to handle all covered case x86-32, resume: do a global tlb flush in S4 resume
| * | | | | | ibft, x86: Change reserve_ibft_region() to find_ibft_region()Yinghai Lu2010-04-01
| |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows arch code could decide the way to reserve the ibft. And we should reserve ibft as early as possible, instead of BOOTMEM stage, in case the table is in RAM range and is not reserved by BIOS (this will often be the case.) Move to just after find_smp_config(). Also when CONFIG_NO_BOOTMEM=y, We will not have reserve_bootmem() anymore. -v2: fix typo about ibft pointed by Konrad Rzeszutek Wilk <konrad@darnok.org> Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <4BB510FB.80601@kernel.org> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Peter Jones <pjones@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad@kernel.org> CC: Jan Beulich <jbeulich@novell.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
* | | | | | memcg: fix race in file_mapped accountingKAMEZAWA Hiroyuki2010-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Presently, memcg's FILE_MAPPED accounting has following race with move_account (happens at rmdir()). increment page->mapcount (rmap.c) mem_cgroup_update_file_mapped() move_account() lock_page_cgroup() check page_mapped() if page_mapped(page)>1 { FILE_MAPPED -1 from old memcg FILE_MAPPED +1 to old memcg } ..... overwrite pc->mem_cgroup unlock_page_cgroup() lock_page_cgroup() FILE_MAPPED + 1 to pc->mem_cgroup unlock_page_cgroup() Then, old memcg (-1 file mapped) new memcg (+2 file mapped) This happens because move_account see page_mapped() which is not guarded by lock_page_cgroup(). This patch adds FILE_MAPPED flag to page_cgroup and move account information based on it. Now, all checks are synchronous with lock_page_cgroup(). Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Balbir Singh <balbir@in.ibm.com> Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Andrea Righi <arighi@develer.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | pagemap: fix pfn calculation for hugepageNaoya Horiguchi2010-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we look into pagemap using page-types with option -p, the value of pfn for hugepages looks wrong (see below.) This is because pte was evaluated only once for one vma although it should be updated for each hugepage. This patch fixes it. $ page-types -p 3277 -Nl -b huge voffset offset len flags 7f21e8a00 11e400 1 ___U___________H_G________________ 7f21e8a01 11e401 1ff ________________TG________________ ^^^ 7f21e8c00 11e400 1 ___U___________H_G________________ 7f21e8c01 11e401 1ff ________________TG________________ ^^^ One hugepage contains 1 head page and 511 tail pages in x86_64 and each two lines represent each hugepage. Voffset and offset mean virtual address and physical address in the page unit, respectively. The different hugepages should not have the same offset value. With this patch applied: $ page-types -p 3386 -Nl -b huge voffset offset len flags 7fec7a600 112c00 1 ___UD__________H_G________________ 7fec7a601 112c01 1ff ________________TG________________ ^^^ 7fec7a800 113200 1 ___UD__________H_G________________ 7fec7a801 113201 1ff ________________TG________________ ^^^ OK More info: - This patch modifies walk_page_range()'s hugepage walker. But the change only affects pagemap_read(), which is the only caller of hugepage callback. - Without this patch, hugetlb_entry() callback is called per vma, that doesn't match the natural expectation from its name. - With this patch, hugetlb_entry() is called per hugepte entry and the callback can become much simpler. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | kernel.h: fix wrong usage of __ratelimit()Yong Zhang2010-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When __ratelimit() returns 1 this means that we can go ahead. Signed-off-by: Yong Zhang <yong.zhang@windriver.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | vfs: rename block_fsync() to blkdev_fsync()Andrew Morton2010-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Requested by hch, for consistency now it is exported. Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Anton Blanchard <anton@samba.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | raw: fsync method is now requiredAnton Blanchard2010-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 148f948ba877f4d3cdef036b1ff6d9f68986706a (vfs: Introduce new helpers for syncing after writing to O_SYNC file or IS_SYNC inode) broke the raw driver. We now call through generic_file_aio_write -> generic_write_sync -> vfs_fsync_range. vfs_fsync_range has: if (!fop || !fop->fsync) { ret = -EINVAL; goto out; } But drivers/char/raw.c doesn't set an fsync method. We have two options: fix it or remove the raw driver completely. I'm happy to do either, the fact this has been broken for so long suggests it is rarely used. The patch below adds an fsync method to the raw driver. My knowledge of the block layer is pretty sketchy so this could do with a once over. If we instead decide to remove the raw driver, this patch might still be useful as a backport to 2.6.33 and 2.6.32. Signed-off-by: Anton Blanchard <anton@samba.org> Reviewed-by: Jan Kara <jack@suse.cz> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Jens Axboe <jens.axboe@oracle.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Tested-by: Jeff Moyer <jmoyer@redhat.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | include/linux/kfifo.h: fix INIT_KFIFO()David Härdeman2010-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DECLARE_KFIFO creates a union with a struct kfifo and a buffer array with size [size + sizeof(struct kfifo)]. INIT_KFIFO then sets the buffer pointer in struct kfifo to point to the beginning of the buffer array which means that the first call to kfifo_in will overwrite members of the struct kfifo. Signed-off-by: David Härdeman <david@hardeman.nu> Acked-by: Stefani Seibold <stefani@seibold.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | bitops: remove temporary for_each_bit()Andrew Morton2010-04-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Migration has been completed so remove this now. There's one straggler in linux-next's drivers/mtd/sm_ftl.c. A patch has been sent. Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | Merge branch 'upstream-linus' of ↵Linus Torvalds2010-04-06
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev: libata: unlock HPA if device shrunk libata: disable NCQ on Crucial C300 SSD libata: don't whine on spurious IRQ
| * | | | | | libata: unlock HPA if device shrunkTejun Heo2010-04-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some BIOSes don't configure HPA during boot but do so while resuming. This causes harddrives to shrink during resume making libata detach and reattach them. This can be worked around by unlocking HPA if old size equals native size. Add ATA_DFLAG_UNLOCK_HPA so that HPA unlocking can be controlled per-device and update ata_dev_revalidate() such that it sets ATA_DFLAG_UNLOCK_HPA and fails with -EIO when the above condition is detected. This patch fixes the following bug. https://bugzilla.kernel.org/show_bug.cgi?id=15396 Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Oleksandr Yermolenko <yaa.bta@gmail.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
* | | | | | | Fix up possibly racy module refcountingNick Piggin2010-04-05
|/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Module refcounting is implemented with a per-cpu counter for speed. However there is a race when tallying the counter where a reference may be taken by one CPU and released by another. Reference count summation may then see the decrement without having seen the previous increment, leading to lower than expected count. A module which never has its actual reference drop below 1 may return a reference count of 0 due to this race. Module removal generally runs under stop_machine, which prevents this race causing bugs due to removal of in-use modules. However there are other real bugs in module.c code and driver code (module_refcount is exported) where the callers do not run under stop_machine. Fix this by maintaining running per-cpu counters for the number of module refcount increments and the number of refcount decrements. The increments are tallied after the decrements, so any decrement seen will always have its corresponding increment counted. The final refcount is the difference of the total increments and decrements, preventing a low-refcount from being returned. Signed-off-by: Nick Piggin <npiggin@suse.de> Acked-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | | | Merge branch 'slabh' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/miscLinus Torvalds2010-04-05
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'slabh' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc: eeepc-wmi: include slab.h staging/otus: include slab.h from usbdrv.h percpu: don't implicitly include slab.h from percpu.h kmemcheck: Fix build errors due to missing slab.h include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h iwlwifi: don't include iwl-dev.h from iwl-devtrace.h x86: don't include slab.h from arch/x86/include/asm/pgtable_32.h Fix up trivial conflicts in include/linux/percpu.h due to is_kernel_percpu_address() having been introduced since the slab.h cleanup with the percpu_up.c splitup.
| * \ \ \ \ \ Merge branch 'master' into export-slabhTejun Heo2010-04-04
| |\ \ \ \ \ \
| * | | | | | | percpu: don't implicitly include slab.h from percpu.hTejun Heo2010-03-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | percpu.h has always been including slab.h to get k[mz]alloc/free() for UP inline implementation. percpu.h being used by very low level headers including module.h and sched.h, this meant that a lot files unintentionally got slab.h inclusion. Lee Schermerhorn was trying to make topology.h use percpu.h and got bitten by this implicit inclusion. The right thing to do is break this ultimately unnecessary dependency. The previous patch added explicit inclusion of either gfp.h or slab.h to the source files using them. This patch updates percpu.h such that slab.h is no longer included from percpu.h. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>