aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* Merge branch 'core-fixes-for-linus' of ↵Linus Torvalds2008-08-11
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: lockdep: fix debug_lock_alloc lockdep: increase MAX_LOCKDEP_KEYS generic-ipi: fix stack and rcu interaction bug in smp_call_function_mask() lockdep: fix overflow in the hlock shrinkage code lockdep: rename map_[acquire|release]() => lock_map_[acquire|release]() lockdep: handle chains involving classes defined in modules mm: fix mm_take_all_locks() locking order lockdep: annotate mm_take_all_locks() lockdep: spin_lock_nest_lock() lockdep: lock protection locks lockdep: map_acquire lockdep: shrink held_lock structure lockdep: re-annotate scheduler runqueues lockdep: lock_set_subclass - reset a held lock's subclass lockdep: change scheduler annotation debug_locks: set oops_in_progress if we will log messages. lockdep: fix combinatorial explosion in lock subgraph traversal
| * Merge branch 'core/locking' into core/urgentIngo Molnar2008-08-11
| |\
| | * lockdep: fix debug_lock_allocPeter Zijlstra2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we enable DEBUG_LOCK_ALLOC but do not enable PROVE_LOCKING and or LOCK_STAT, lock_alloc() and lock_release() turn into nops, even though we should be doing hlock checking (check=1). This causes a false warning and a lockdep self-disable. Rectify this. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * lockdep: increase MAX_LOCKDEP_KEYSIngo Molnar2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | certain configs produce: [ 70.076229] BUG: MAX_LOCKDEP_KEYS too low! [ 70.080230] turning off the locking correctness validator. tune them up. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * lockdep: fix overflow in the hlock shrinkage codePeter Zijlstra2008-08-11
| | | | | | | | | | | | | | | | | | | | | There is a overflow by 1 case in the new shrunken hlock code. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * lockdep: rename map_[acquire|release]() => lock_map_[acquire|release]()Ingo Molnar2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | the names were too generic: drivers/uio/uio.c:87: error: expected identifier or '(' before 'do' drivers/uio/uio.c:87: error: expected identifier or '(' before 'while' drivers/uio/uio.c:113: error: 'map_release' undeclared here (not in a function) Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * lockdep: handle chains involving classes defined in modulesRabin Vincent2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Solve this by marking the classes as unused and not printing information about the unused classes. Reported-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Rabin Vincent <rabin@rab.in> Acked-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * mm: fix mm_take_all_locks() locking orderPeter Zijlstra2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lockdep spotted: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.27-rc1 #270 ------------------------------------------------------- qemu-kvm/2033 is trying to acquire lock: (&inode->i_data.i_mmap_lock){----}, at: [<ffffffff802996cc>] mm_take_all_locks+0xc2/0xea but task is already holding lock: (&anon_vma->lock){----}, at: [<ffffffff8029967a>] mm_take_all_locks+0x70/0xea which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&anon_vma->lock){----}: [<ffffffff8025cd37>] __lock_acquire+0x11be/0x14d2 [<ffffffff8025d0a9>] lock_acquire+0x5e/0x7a [<ffffffff804c655b>] _spin_lock+0x3b/0x47 [<ffffffff8029a2ef>] vma_adjust+0x200/0x444 [<ffffffff8029a662>] split_vma+0x12f/0x146 [<ffffffff8029bc60>] mprotect_fixup+0x13c/0x536 [<ffffffff8029c203>] sys_mprotect+0x1a9/0x21e [<ffffffff8020c0db>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff -> #0 (&inode->i_data.i_mmap_lock){----}: [<ffffffff8025ca54>] __lock_acquire+0xedb/0x14d2 [<ffffffff8025d397>] lock_release_non_nested+0x1c2/0x219 [<ffffffff8025d515>] lock_release+0x127/0x14a [<ffffffff804c6403>] _spin_unlock+0x1e/0x50 [<ffffffff802995d9>] mm_drop_all_locks+0x7f/0xb0 [<ffffffff802a965d>] do_mmu_notifier_register+0xe2/0x112 [<ffffffff802a96a8>] mmu_notifier_register+0xe/0x10 [<ffffffffa0043b6b>] kvm_dev_ioctl+0x11e/0x287 [kvm] [<ffffffff802bd0ca>] vfs_ioctl+0x2a/0x78 [<ffffffff802bd36f>] do_vfs_ioctl+0x257/0x274 [<ffffffff802bd3e1>] sys_ioctl+0x55/0x78 [<ffffffff8020c0db>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff other info that might help us debug this: 5 locks held by qemu-kvm/2033: #0: (&mm->mmap_sem){----}, at: [<ffffffff802a95d0>] do_mmu_notifier_register+0x55/0x112 #1: (mm_all_locks_mutex){--..}, at: [<ffffffff8029963e>] mm_take_all_locks+0x34/0xea #2: (&anon_vma->lock){----}, at: [<ffffffff8029967a>] mm_take_all_locks+0x70/0xea #3: (&anon_vma->lock){----}, at: [<ffffffff8029967a>] mm_take_all_locks+0x70/0xea #4: (&anon_vma->lock){----}, at: [<ffffffff8029967a>] mm_take_all_locks+0x70/0xea stack backtrace: Pid: 2033, comm: qemu-kvm Not tainted 2.6.27-rc1 #270 Call Trace: [<ffffffff8025b7c7>] print_circular_bug_tail+0xb8/0xc3 [<ffffffff8025ca54>] __lock_acquire+0xedb/0x14d2 [<ffffffff80259bb1>] ? add_lock_to_list+0x7e/0xad [<ffffffff8029967a>] ? mm_take_all_locks+0x70/0xea [<ffffffff8029967a>] ? mm_take_all_locks+0x70/0xea [<ffffffff8025d397>] lock_release_non_nested+0x1c2/0x219 [<ffffffff802996cc>] ? mm_take_all_locks+0xc2/0xea [<ffffffff802996cc>] ? mm_take_all_locks+0xc2/0xea [<ffffffff8025b202>] ? trace_hardirqs_on_caller+0x4d/0x115 [<ffffffff802995d9>] ? mm_drop_all_locks+0x7f/0xb0 [<ffffffff8025d515>] lock_release+0x127/0x14a [<ffffffff804c6403>] _spin_unlock+0x1e/0x50 [<ffffffff802995d9>] mm_drop_all_locks+0x7f/0xb0 [<ffffffff802a965d>] do_mmu_notifier_register+0xe2/0x112 [<ffffffff802a96a8>] mmu_notifier_register+0xe/0x10 [<ffffffffa0043b6b>] kvm_dev_ioctl+0x11e/0x287 [kvm] [<ffffffff8033f9f2>] ? file_has_perm+0x83/0x8e [<ffffffff802bd0ca>] vfs_ioctl+0x2a/0x78 [<ffffffff802bd36f>] do_vfs_ioctl+0x257/0x274 [<ffffffff802bd3e1>] sys_ioctl+0x55/0x78 [<ffffffff8020c0db>] system_call_fastpath+0x16/0x1b Which the locking hierarchy in mm/rmap.c confirms as valid. Fix this by first taking all the mapping->i_mmap_lock instances and then take all anon_vma->lock instances. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * lockdep: annotate mm_take_all_locks()Peter Zijlstra2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | The nesting is correct due to holding mmap_sem, use the new annotation to annotate this. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * lockdep: spin_lock_nest_lock()Peter Zijlstra2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Expose the new lock protection lock. This can be used to annotate places where we take multiple locks of the same class and avoid deadlocks by always taking another (top-level) lock first. NOTE: we're still bound to the MAX_LOCK_DEPTH (48) limit. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * lockdep: lock protection locksPeter Zijlstra2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On Fri, 2008-08-01 at 16:26 -0700, Linus Torvalds wrote: > On Fri, 1 Aug 2008, David Miller wrote: > > > > Taking more than a few locks of the same class at once is bad > > news and it's better to find an alternative method. > > It's not always wrong. > > If you can guarantee that anybody that takes more than one lock of a > particular class will always take a single top-level lock _first_, then > that's all good. You can obviously screw up and take the same lock _twice_ > (which will deadlock), but at least you cannot get into ABBA situations. > > So maybe the right thing to do is to just teach lockdep about "lock > protection locks". That would have solved the multi-queue issues for > networking too - all the actual network drivers would still have taken > just their single queue lock, but the one case that needs to take all of > them would have taken a separate top-level lock first. > > Never mind that the multi-queue locks were always taken in the same order: > it's never wrong to just have some top-level serialization, and anybody > who needs to take <n> locks might as well do <n+1>, because they sure as > hell aren't going to be on _any_ fastpaths. > > So the simplest solution really sounds like just teaching lockdep about > that one special case. It's not "nesting" exactly, although it's obviously > related to it. Do as Linus suggested. The lock protection lock is called nest_lock. Note that we still have the MAX_LOCK_DEPTH (48) limit to consider, so anything that spills that it still up shit creek. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * lockdep: map_acquirePeter Zijlstra2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | Most the free-standing lock_acquire() usages look remarkably similar, sweep them into a new helper. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * lockdep: shrink held_lock structureDave Jones2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct held_lock { u64 prev_chain_key; /* 0 8 */ struct lock_class * class; /* 8 8 */ long unsigned int acquire_ip; /* 16 8 */ struct lockdep_map * instance; /* 24 8 */ int irq_context; /* 32 4 */ int trylock; /* 36 4 */ int read; /* 40 4 */ int check; /* 44 4 */ int hardirqs_off; /* 48 4 */ /* size: 56, cachelines: 1 */ /* padding: 4 */ /* last cacheline: 56 bytes */ }; struct held_lock { u64 prev_chain_key; /* 0 8 */ long unsigned int acquire_ip; /* 8 8 */ struct lockdep_map * instance; /* 16 8 */ unsigned int class_idx:11; /* 24:21 4 */ unsigned int irq_context:2; /* 24:19 4 */ unsigned int trylock:1; /* 24:18 4 */ unsigned int read:2; /* 24:16 4 */ unsigned int check:2; /* 24:14 4 */ unsigned int hardirqs_off:1; /* 24:13 4 */ /* size: 32, cachelines: 1 */ /* padding: 4 */ /* bit_padding: 13 bits */ /* last cacheline: 32 bytes */ }; [mingo@elte.hu: shrunk hlock->class too] [peterz@infradead.org: fixup bit sizes] Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
| | * lockdep: re-annotate scheduler runqueuesPeter Zijlstra2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of using a per-rq lock class, use the regular nesting operations. However, take extra care with double_lock_balance() as it can release the already held rq->lock (and therefore change its nesting class). So what can happen is: spin_lock(rq->lock); // this rq subclass 0 double_lock_balance(rq, other_rq); // release rq // acquire other_rq->lock subclass 0 // acquire rq->lock subclass 1 spin_unlock(other_rq->lock); leaving you with rq->lock in subclass 1 So a subsequent double_lock_balance() call can try to nest a subclass 1 lock while already holding a subclass 1 lock. Fix this by introducing double_unlock_balance() which releases the other rq's lock, but also re-sets the subclass for this rq's lock to 0. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * lockdep: lock_set_subclass - reset a held lock's subclassPeter Zijlstra2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | this can be used to reset a held lock's subclass, for arbitrary-depth iterated data structures such as trees or lists which have per-node locks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * lockdep: change scheduler annotationPeter Zijlstra2008-08-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While thinking about David's graph walk lockdep patch it _finally_ dawned on me that there is no reason we have a lock class per cpu ... Sorry for being dense :-/ The below changes the annotation from a lock class per cpu, to a single nested lock, as the scheduler never holds more that 2 rq locks at a time anyway. If there was code requiring holding all rq locks this would not work and the original annotation would be the only option, but that not being the case, this is a much lighter one. Compiles and boots on a 2-way x86_64. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Miller <davem@davemloft.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * debug_locks: set oops_in_progress if we will log messages.David Miller2008-08-01
| | | | | | | | | | | | | | | | | | | | | | | | Otherwise lock debugging messages on runqueue locks can deadlock the system due to the wakeups performed by printk(). Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * lockdep: fix combinatorial explosion in lock subgraph traversalDavid Miller2008-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we traverse the graph, either forwards or backwards, we are interested in whether a certain property exists somewhere in a node reachable in the graph. Therefore it is never necessary to traverse through a node more than once to get a correct answer to the given query. Take advantage of this property using a global ID counter so that we need not clear all the markers in all the lock_class entries before doing a traversal. A new ID is choosen when we start to traverse, and we continue through a lock_class only if it's ID hasn't been marked with the new value yet. This short-circuiting is essential especially for high CPU count systems. The scheduler has a runqueue per cpu, and needs to take two runqueue locks at a time, which leads to long chains of backwards and forwards subgraphs from these runqueue lock nodes. Without the short-circuit implemented here, a graph traversal on a runqueue lock can take up to (1 << (N - 1)) checks on a system with N cpus. For anything more than 16 cpus or so, lockdep will eventually bring the machine to a complete standstill. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | generic-ipi: fix stack and rcu interaction bug in smp_call_function_mask()Nick Piggin2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Venki Pallipadi <venkatesh.pallipadi@intel.com> wrote: > Found a OOPS on a big SMP box during an overnight reboot test with > upstream git. > > Suresh and I looked at the oops and looks like the root cause is in > generic_smp_call_function_interrupt() and smp_call_function_mask() with > wait parameter. > > The actual oops looked like > > [ 11.277260] BUG: unable to handle kernel paging request at ffff8802ffffffff > [ 11.277815] IP: [<ffff8802ffffffff>] 0xffff8802ffffffff > [ 11.278155] PGD 202063 PUD 0 > [ 11.278576] Oops: 0010 [1] SMP > [ 11.279006] CPU 5 > [ 11.279336] Modules linked in: > [ 11.279752] Pid: 0, comm: swapper Not tainted 2.6.27-rc2-00020-g685d87f #290 > [ 11.280039] RIP: 0010:[<ffff8802ffffffff>] [<ffff8802ffffffff>] 0xffff8802ffffffff > [ 11.280692] RSP: 0018:ffff88027f1f7f70 EFLAGS: 00010086 > [ 11.280976] RAX: 00000000ffffffff RBX: 0000000000000000 RCX: 0000000000000000 > [ 11.281264] RDX: 0000000000004f4e RSI: 0000000000000001 RDI: 0000000000000000 > [ 11.281624] RBP: ffff88027f1f7f98 R08: 0000000000000001 R09: ffffffff802509af > [ 11.281925] R10: ffff8800280c2780 R11: 0000000000000000 R12: ffff88027f097d48 > [ 11.282214] R13: ffff88027f097d70 R14: 0000000000000005 R15: ffff88027e571000 > [ 11.282502] FS: 0000000000000000(0000) GS:ffff88027f1c3340(0000) knlGS:0000000000000000 > [ 11.283096] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b > [ 11.283382] CR2: ffff8802ffffffff CR3: 0000000000201000 CR4: 00000000000006e0 > [ 11.283760] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [ 11.284048] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [ 11.284337] Process swapper (pid: 0, threadinfo ffff88027f1f2000, task ffff88027f1f0640) > [ 11.284936] Stack: ffffffff80250963 0000000000000212 0000000000ee8c78 0000000000ee8a66 > [ 11.285802] ffff88027e571550 ffff88027f1f7fa8 ffffffff8021adb5 ffff88027f1f3e40 > [ 11.286599] ffffffff8020bdd6 ffff88027f1f3e40 <EOI> ffff88027f1f3ef8 0000000000000000 > [ 11.287120] Call Trace: > [ 11.287768] <IRQ> [<ffffffff80250963>] ? generic_smp_call_function_interrupt+0x61/0x12c > [ 11.288354] [<ffffffff8021adb5>] smp_call_function_interrupt+0x17/0x27 > [ 11.288744] [<ffffffff8020bdd6>] call_function_interrupt+0x66/0x70 > [ 11.289030] <EOI> [<ffffffff8024ab3b>] ? clockevents_notify+0x19/0x73 > [ 11.289380] [<ffffffff803b9b75>] ? acpi_idle_enter_simple+0x18b/0x1fa > [ 11.289760] [<ffffffff803b9b6b>] ? acpi_idle_enter_simple+0x181/0x1fa > [ 11.290051] [<ffffffff8053aeca>] ? cpuidle_idle_call+0x70/0xa2 > [ 11.290338] [<ffffffff80209f61>] ? cpu_idle+0x5f/0x7d > [ 11.290723] [<ffffffff8060224a>] ? start_secondary+0x14d/0x152 > [ 11.291010] > [ 11.291287] > [ 11.291654] Code: Bad RIP value. > [ 11.292041] RIP [<ffff8802ffffffff>] 0xffff8802ffffffff > [ 11.292380] RSP <ffff88027f1f7f70> > [ 11.292741] CR2: ffff8802ffffffff > [ 11.310951] ---[ end trace 137c54d525305f1c ]--- > > The problem is with the following sequence of events: > > - CPU A calls smp_call_function_mask() for CPU B with wait parameter > - CPU A sets up the call_function_data on the stack and does an rcu add to > call_function_queue > - CPU A waits until the WAIT flag is cleared > - CPU B gets the call function interrupt and starts going through the > call_function_queue > - CPU C also gets some other call function interrupt and starts going through > the call_function_queue > - CPU C, which is also going through the call_function_queue, starts referencing > CPU A's stack, as that element is still in call_function_queue > - CPU B finishes the function call that CPU A set up and as there are no other > references to it, rcu deletes the call_function_data (which was from CPU A > stack) > - CPU B sees the wait flag and just clears the flag (no call_rcu to free) > - CPU A which was waiting on the flag continues executing and the stack > contents change > > - CPU C is still in rcu_read section accessing the CPU A's stack sees > inconsistent call_funation_data and can try to execute > function with some random pointer, causing stack corruption for A > (by clearing the bits in mask field) and oops. Nice debugging work. I'd suggest something like the attached (boot tested) patch as the simple fix for now. I expect the benefits from the less synchronized, multiple-in-flight-data global queue will still outweigh the costs of dynamic allocations. But if worst comes to worst then we just go back to a globally synchronous one-at-a-time implementation, but that would be pretty sad! Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2008-08-11
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: fix 2.6.27rc1 cannot boot more than 8CPUs x86: make "apic" an early_param() on 32-bit, NULL check EFI, x86: fix function prototype x86, pci-calgary: fix function declaration x86: work around gcc 3.4.x bug x86: make "apic" an early_param() on 32-bit x86, debug: tone down arch/x86/kernel/mpparse.c debugging printk x86_64: restore the proper NR_IRQS define so larger systems work. x86: Restore proper vector locking during cpu hotplug x86: Fix broken VMI in 2.6.27-rc.. x86: fdiv bug detection fix
| * | | x86: fix 2.6.27rc1 cannot boot more than 8CPUsYinghai Lu2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Jeff Chua reported that booting a !bigsmp kernel on a 16-way box hangs silently. this is a long-standing issue, smp start AP cpu could check the apic id >=8 etc before trying to start it. achieve this by moving the def_to_bigsmp check later and skip the apicid id > 8 [ mingo@elte.hu: clean up the message that is printed. ] Reported-by: "Jeff Chua" <jeff.chua.linux@gmail.com> Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> arch/x86/kernel/setup.c | 6 ------ arch/x86/kernel/smpboot.c | 10 ++++++++++ 2 files changed, 10 insertions(+), 6 deletions(-)
| * | | x86: make "apic" an early_param() on 32-bit, NULL checkRene Herman2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Cyrill Gorcunov observed: > you turned it into early_param so now it's NULL injecting vulnerabled. > Could you please add checking for NULL str param? fix that. Also, change the name of 'str' into 'arg', to make it more apparent that this is an optional argument that can be NULL, not a string parameter that is empty when unset. Reported-by: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | EFI, x86: fix function prototypeRandy Dunlap2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix function prototype in header file to match source code: linux-next-20080807/arch/x86/kernel/efi_64.c:100:14: error: symbol 'efi_ioremap' redeclared with different type (originally declared at include2/asm/efi.h:89) - different address spaces Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86, pci-calgary: fix function declarationRandy Dunlap2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix function declaration: linux-next-20080807/arch/x86/kernel/pci-calgary_64.c:1353:36: warning: non-ANSI function declaration of function 'get_tce_space_from_tar' Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Acked-by: Muli Ben-Yehuda <muli@il.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86: work around gcc 3.4.x bugJeremy Fitzhardinge2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Simon Horman reported that gcc-3.4.x crashes when compiling pgd_prepopulate_pmd() when PREALLOCATED_PMDS == 0 and CONFIG_DEBUG_INFO is enabled. Adding an extra check for PREALLOCATED_PMDS == 0 [which is compiled out by gcc] seems to avoid the problem. Reported-by: Simon Horman <horms@verge.net.au> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Simon Horman <horms@verge.net.au> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86: make "apic" an early_param() on 32-bitRene Herman2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 32-bit, "apic" is a __setup() param meaning it is parsed rather late in the game. Make it an early_param() for apic_printk() use by arch/x86/kernel/mpparse.c. On 64-bit, it already is an early_param(). Signed-off-by: Rene Herman <rene.herman@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86, debug: tone down arch/x86/kernel/mpparse.c debugging printkRene Herman2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 11a62a056093a7f25f1595fbd8bd5f93559572b6 turns some formerly nopped debugging printks in arch/x86/kernel/mppparse.c into regular ones. The one at the top of smp_scan_config() in particular also prints on !CONFIG_SMP/CONFIG_X86_LOCAL_APIC kernels and UP machines without anything resembling MP tables which makes their lowly UP owners wonder... Turn the former Dprintk()s into apic_printk()s instead meaning that their printing is dependent on passing the apic=verbose (or =debug) command line param. On 32-bit, "apic" is a __setup() param which isn't early enough for this code and therefore needs a followup changing it into an early_param(). On 64-bit, it already is. Signed-off-by: Rene Herman <rene.herman@gmail.com> Cc: Andrew Morton <akpm@osdl.org> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86_64: restore the proper NR_IRQS define so larger systems work.Eric W. Biederman2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As pointed out and tracked by Yinghai Lu <yhlu.kernel@gmail.com>: Dhaval Giani got: kernel BUG at arch/x86/kernel/io_apic_64.c:357! invalid opcode: 0000 [1] SMP CPU 24 ... his system (x3950) has 8 ioapic, irq > 256 This was caused by: commit 9b7dc567d03d74a1fbae84e88949b6a60d922d82 Author: Thomas Gleixner <tglx@linutronix.de> Date: Fri May 2 20:10:09 2008 +0200 x86: unify interrupt vector defines The interrupt vector defines are copied 4 times around with minimal differences. Move them all into asm-x86/irq_vectors.h It appears that Thomas did not notice that x86_64 does something completely different when he merge irq_vectors.h We can solve this for 2.6.27 by simply reintroducing the old heuristic for setting NR_IRQS on x86_64 to a usable value, which trivially removes the regression. Long term it would be nice to harmonize the handling of ioapic interrupts of x86_32 and x86_64 so we don't have this kind of confusion. Dhaval Giani <dhaval@linux.vnet.ibm.com> tested an earlier version of this patch by YH which confirms simply increasing NR_IRQS fixes the problem. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Yinghai Lu <yhlu.kernel@gmail.com> Cc: Dhaval Giani <dhaval@linux.vnet.ibm.com> Cc: Mike Travis <travis@sgi.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | x86: Restore proper vector locking during cpu hotplugEric W. Biederman2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Having cpu_online_map change during assign_irq_vector can result in some really nasty and weird things happening. The one that bit me last time was accessing non existent per cpu memory for non existent cpus. This locking was removed in a sloppy x86_64 and x86_32 merge patch. Guys can we please try and avoid subtly breaking x86 when we are merging files together? Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | x86: Fix broken VMI in 2.6.27-rc..Alok Kataria2008-08-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The lowmem mapping table created by VMI need not depend on max_low_pfn at all. Instead we now create an extra large mapping which covers all possible lowmem instead of the physical ram that is actually available. This allows the vmi initialization to be done before max_low_pfn could be computed. We also move the vmi_init code very early in the boot process so that nobody accidentally breaks the fixmap dependancy. Signed-off-by: Alok N Kataria <akataria@vmware.com> Acked-by: Zachary Amsden <zach@vmware.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
| * | | x86: fdiv bug detection fixKrzysztof Helt2008-07-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The fdiv detection code writes s32 integer into the boot_cpu_data.fdiv_bug. However, the boot_cpu_data.fdiv_bug is only char (s8) field so the detection overwrites already set fields for other bugs, e.g. the f00f bug field. Use local s32 variable to receive result. This is a partial fix to Bugzilla #9928 - fixes wrong information about the f00f bug (tested) and probably for coma bug (I have no cpu to test this). Signed-off-by: Krzysztof Helt <krzysztof.h1@wp.pl> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
* | | | make struct scsi_dh_devlist's staticAdrian Bunk2008-08-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes several needlessly global struct scsi_dh_devlist's static. Signed-off-by: Adrian Bunk <bunk@kernel.org> Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* | | | Merge branch 'for-linus' of git://git.o-hand.com/linux-mfdLinus Torvalds2008-08-11
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.o-hand.com/linux-mfd: mfd: tc6393 cleanup and update mfd: have TMIO drivers and subdevices depend on ARM mfd: TMIO MMC driver mfd: driver for the TMIO NAND controller mfd: t7l66 MMC platform data mfd: tc6387 MMC platform data mfd: Fix 7l66 and 6387 according to the new mfd-core API mfd: Fix tc6393 according to the new tmio.h mfd: driver for the TC6387XB TMIO controller. mfd: driver for the T7L66XB TMIO SoC mfd: TMIO MMC structures and accessors.
| * | | | mfd: tc6393 cleanup and updateIan Molton2008-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patchset cleans up the TC6393XB support. * Add provision for the MMC subdevice * Disable / enable clocks on suspend / resume * Remove fragments of badly merged code (eg. linux/fb include etc.) * Use a device specific clock name to break dependancy on ARM/PXA2XX * Drop unnecessary resource names * Switch to tmio_io* accessors Signed-off-by: Ian Molton <spyro@f2s.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
| * | | | mfd: have TMIO drivers and subdevices depend on ARMSamuel Ortiz2008-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The TMIO chips are only found (and thus tested) on ARM machines. Moreover, we don't want the TMIO cells to be built if one of the TMIO driver is not selected (which indirectly make the TMIO cells drivers depend on ARM as well). Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
| * | | | mfd: TMIO MMC driverIan Molton2008-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for the MMC subdevice 'cell' commonly found in TMIO based MFDs. Signed-off-by: Ian Molton <spyro@f2s.com> Acked-by: Pierre Ossman <drzeus@drzeus.cx> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
| * | | | mfd: driver for the TMIO NAND controllerIan Molton2008-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for the NAND controller commonly found in TMIO based MFDs. Signed-off-by: Ian Molton <spyro@f2s.com> Acked-By: David Woodhouse <David.Woodhouse@intel.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
| * | | | mfd: t7l66 MMC platform dataIan Molton2008-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The tmio MMC driver needs the cell to be passed as a platform data. Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
| * | | | mfd: tc6387 MMC platform dataIan Molton2008-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to pass the cell as the platform data. Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
| * | | | mfd: Fix 7l66 and 6387 according to the new mfd-core APISamuel Ortiz2008-08-10
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
| * | | | mfd: Fix tc6393 according to the new tmio.hSamuel Ortiz2008-08-10
| | | | | | | | | | | | | | | | | | | | Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
| * | | | mfd: driver for the TC6387XB TMIO controller.Ian Molton2008-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for the TC6387XB. Unlike other TMIO devices this one has only one subdevice and no interrupt mux, however using the MFD framework allows it to share the TMIO MMC driver. Signed-off-by: Ian Molton <spyro@f2s.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
| * | | | mfd: driver for the T7L66XB TMIO SoCIan Molton2008-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patchset provides support for the core functinality of the T7L66XB SoC from Toshiba. Supported in this patchset is the IRQ MUX, MMC controller and NAND flash controller. Signed-off-by: Ian Molton <spyro@f2s.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
| * | | | mfd: TMIO MMC structures and accessors.Ian Molton2008-08-10
| | |/ / | |/| | | | | | | | | | | | | | Signed-off-by: Ian Molton <spyro@f2s.com> Signed-off-by: Samuel Ortiz <sameo@openedhand.com>
* | | | Merge branch 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6Linus Torvalds2008-08-11
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'i2c-for-linus' of git://jdelvare.pck.nerim.net/jdelvare-2.6: hwmon: (lm75) Drop legacy i2c driver i2c: correct some size_t printk formats i2c: Check for address business before creating clients i2c: Let users select algorithm drivers manually again i2c: Fix NULL pointer dereference in i2c_new_probed_device i2c: Fix oops on bus multiplexer driver loading
| * | | | hwmon: (lm75) Drop legacy i2c driverJean Delvare2008-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drop the legacy lm75 driver, and add a detect callback to the new-style driver to achieve the same functionality. Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: David Brownell <david-b@pacbell.net>
| * | | | i2c: correct some size_t printk formatsDavid Brownell2008-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix various printk format strings where %zd was passed a size_t; those should be %zu instead. (Courtesy of a version of GCC which warns when these details are wrong.) Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Jean Delvare <khali@linux-fr.org>
| * | | | i2c: Check for address business before creating clientsJean Delvare2008-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We check for address business in i2c_probe_address(), i2c_detect_address() and i2c_new_probed_device(), but this isn't sufficient. Drivers can call i2c_attach_client() and i2c_new_device() on any address, so we must check the address there as well. This fixes bug #11239: http://bugzilla.kernel.org/show_bug.cgi?id=11239 Signed-off-by: Jean Delvare <khali@linux-fr.org>
| * | | | i2c: Let users select algorithm drivers manually againJean Delvare2008-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In kernel 2.6.26, the ability to select I2C algorithm drivers manually was removed, as all in-kernel drivers do that automatically. However there were some complaints that it was a problem for out-of-tree I2C bus drivers. In order to address these complaints, let's allow manual selection of these drivers again, but still hide them by default for better general user experience. This closes bug #11140: http://bugzilla.kernel.org/show_bug.cgi?id=11140 Signed-off-by: Jean Delvare <khali@linux-fr.org>
| * | | | i2c: Fix NULL pointer dereference in i2c_new_probed_deviceHans Verkuil2008-08-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a NULL pointer dereference that happened when calling i2c_new_probed_device on one of the addresses for which we use byte reads instead of quick write for detection purpose (that is: 0x30-0x37 and 0x50-0x5f). Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Jean Delvare <khali@linux-fr.org>