litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	Merge remote-tracking branch 'efi/urgent' into x86/urgent	H. Peter Anvin	2013-04-19
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Matt Fleming (1): x86, efivars: firmware bug workarounds should be in platform code Matthew Garrett (3): Move utf16 functions to kernel core and rename efi: Pass boot services variable info to runtime code efi: Distinguish between "remaining space" and actually used space Richard Weinberger (2): x86,efi: Check max_size only if it is non-zero. x86,efi: Implement efi_no_storage_paranoia parameter Sergey Vlasov (2): x86/Kconfig: Make EFI select UCS2_STRING efi: Export efi_query_variable_store() for efivars.ko Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
\| *	x86,efi: Implement efi_no_storage_paranoia parameter	Richard Weinberger	2013-04-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Using this parameter one can disable the storage_size/2 check if he is really sure that the UEFI does sane gc and fulfills the spec. This parameter is useful if a devices uses more than 50% of the storage by default. The Intel DQSW67 desktop board is such a sucker for exmaple. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
\| *	efi: Export efi_query_variable_store() for efivars.ko	Sergey Vlasov	2013-04-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes build with CONFIG_EFI_VARS=m which was broken after the commit "x86, efivars: firmware bug workarounds should be in platform code". Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
\| *	x86/Kconfig: Make EFI select UCS2_STRING	Sergey Vlasov	2013-04-16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The commit "efi: Distinguish between "remaining space" and actually used space" added usage of ucs2_*() functions to arch/x86/platform/efi/efi.c, but the only thing which selected UCS2_STRING was EFI_VARS, which is technically optional and can be built as a module. Signed-off-by: Sergey Vlasov <vsu@altlinux.ru> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
\| *	efi: Distinguish between "remaining space" and actually used space	Matthew Garrett	2013-04-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	EFI implementations distinguish between space that is actively used by a variable and space that merely hasn't been garbage collected yet. Space that hasn't yet been garbage collected isn't available for use and so isn't counted in the remaining_space field returned by QueryVariableInfo(). Combined with commit 68d9298 this can cause problems. Some implementations don't garbage collect until the remaining space is smaller than the maximum variable size, and as a result check_var_size() will always fail once more than 50% of the variable store has been used even if most of that space is marked as available for garbage collection. The user is unable to create new variables, and deleting variables doesn't increase the remaining space. The problem that 68d9298 was attempting to avoid was one where certain platforms fail if the actively used space is greater than 50% of the available storage space. We should be able to calculate that by simply summing the size of each available variable and subtracting that from the total storage space. With luck this will fix the problem described in https://bugzilla.kernel.org/show_bug.cgi?id=55471 without permitting damage to occur to the machines 68d9298 was attempting to fix. Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
\| *	efi: Pass boot services variable info to runtime code	Matthew Garrett	2013-04-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	EFI variables can be flagged as being accessible only within boot services. This makes it awkward for us to figure out how much space they use at runtime. In theory we could figure this out by simply comparing the results from QueryVariableInfo() to the space used by all of our variables, but that fails if the platform doesn't garbage collect on every boot. Thankfully, calling QueryVariableInfo() while still inside boot services gives a more reliable answer. This patch passes that information from the EFI boot stub up to the efi platform code. Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
\| *	Move utf16 functions to kernel core and rename	Matthew Garrett	2013-04-15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We want to be able to use the utf16 functions that are currently present in the EFI variables code in platform-specific code as well. Move them to the kernel core, and in the process rename them to accurately describe what they do - they don't handle UTF16, only UCS2. Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
\| *	x86,efi: Check max_size only if it is non-zero.	Richard Weinberger	2013-04-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some EFI implementations return always a MaximumVariableSize of 0, check against max_size only if it is non-zero. My Intel DQ67SW desktop board has such an implementation. Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
\| *	x86, efivars: firmware bug workarounds should be in platform code	Matt Fleming	2013-04-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Let's not burden ia64 with checks in the common efivars code that we're not writing too much data to the variable store. That kind of thing is an x86 firmware bug, plain and simple. efi_query_variable_store() provides platforms with a wrapper in which they can perform checks and workarounds for EFI variable storage bugs. Cc: H. Peter Anvin <hpa@zytor.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>
* \|	x86, microcode: Verify the family before dispatching microcode patching	H. Peter Anvin	2013-04-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For each CPU vendor that implements CPU microcode patching, there will be a minimum family for which this is implemented. Verify this minimum level of support. This can be done in the dispatch function or early in the application functions. Doing the latter turned out to be somewhat awkward because of the ineviable split between the BSP and the AP paths, and rather than pushing deep into the application functions, do this in the dispatch function. Reported-by: "Bryan O'Donoghue" <bryan.odonoghue.lkml@nexus-software.ie> Suggested-by: Borislav Petkov <bp@alien8.de> Cc: Fenghua Yu <fenghua.yu@intel.com> Link: http://lkml.kernel.org/r/1366392183-4149-1-git-send-email-bryan.odonoghue.lkml@nexus-software.ie
* \|	x86, hyperv: Handle Xen emulation of Hyper-V more gracefully	K. Y. Srinivasan	2013-04-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Install the Hyper-V specific interrupt handler only when needed. This would permit us to get rid of the Xen check. Note that when the vmbus drivers invokes the call to register its handler, we are sure to be running on Hyper-V. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Link: http://lkml.kernel.org/r/1366299886-6399-1-git-send-email-kys@microsoft.com Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* \|	x86/mm: Flush lazy MMU when DEBUG_PAGEALLOC is set	Boris Ostrovsky	2013-04-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When CONFIG_DEBUG_PAGEALLOC is set page table updates made by kernel_map_pages() are not made visible (via TLB flush) immediately if lazy MMU is on. In environments that support lazy MMU (e.g. Xen) this may lead to fatal page faults, for example, when zap_pte_range() needs to allocate pages in __tlb_remove_page() -> tlb_next_batch(). Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: konrad.wilk@oracle.com Link: http://lkml.kernel.org/r/1365703192-2089-1-git-send-email-boris.ostrovsky@oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* \|	x86/mm/cpa/selftest: Fix false positive in CPA self test	Andrea Arcangeli	2013-04-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the pmd is not present, _PAGE_PSE will not be set anymore. Fix the false positive. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Stefan Bader <stefan.bader@canonical.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1365687369-30802-1-git-send-email-aarcange@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* \|	x86/mm/cpa: Convert noop to functional fix	Andrea Arcangeli	2013-04-11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit: a8aed3e0752b ("x86/mm/pageattr: Prevent PSE and GLOABL leftovers to confuse pmd/pte_present and pmd_huge") introduced a valid fix but one location that didn't trigger the bug that lead to finding those (small) problems, wasn't updated using the right variable. The wrong variable was also initialized for no good reason, that may have been the source of the confusion. Remove the noop initialization accordingly. Commit a8aed3e0752b also erroneously removed one canon_pgprot pass meant to clear pmd bitflags not supported in hardware by older CPUs, that automatically gets corrected by this patch too by applying it to the right variable in the new location. Reported-by: Stefan Bader <stefan.bader@canonical.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Borislav Petkov <bp@alien8.de> Cc: Andy Whitcroft <apw@canonical.com> Cc: Mel Gorman <mgorman@suse.de> Link: http://lkml.kernel.org/r/1365600505-19314-1-git-send-email-aarcange@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* \|	x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal	Boris Ostrovsky	2013-04-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Invoking arch_flush_lazy_mmu_mode() results in calls to preempt_enable()/disable() which may have performance impact. Since lazy MMU is not used on bare metal we can patch away arch_flush_lazy_mmu_mode() so that it is never called in such environment. [ hpa: the previous patch "Fix vmalloc_fault oops during lazy MMU updates" may cause a minor performance regression on bare metal. This patch resolves that performance regression. It is somewhat unclear to me if this is a good -stable candidate. ] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: http://lkml.kernel.org/r/1364045796-10720-2-git-send-email-konrad.wilk@oracle.com Tested-by: Josh Boyer <jwboyer@redhat.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Borislav Petkov <bp@suse.de> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> SEE NOTE ABOVE
* \|	x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates	Samu Kallio	2013-04-10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In paravirtualized x86_64 kernels, vmalloc_fault may cause an oops when lazy MMU updates are enabled, because set_pgd effects are being deferred. One instance of this problem is during process mm cleanup with memory cgroups enabled. The chain of events is as follows: - zap_pte_range enables lazy MMU updates - zap_pte_range eventually calls mem_cgroup_charge_statistics, which accesses the vmalloc'd mem_cgroup per-cpu stat area - vmalloc_fault is triggered which tries to sync the corresponding PGD entry with set_pgd, but the update is deferred - vmalloc_fault oopses due to a mismatch in the PUD entries The OOPs usually looks as so: ------------[ cut here ]------------ kernel BUG at arch/x86/mm/fault.c:396! invalid opcode: 0000 [#1] SMP .. snip .. CPU 1 Pid: 10866, comm: httpd Not tainted 3.6.10-4.fc18.x86_64 #1 RIP: e030:[<ffffffff816271bf>] [<ffffffff816271bf>] vmalloc_fault+0x11f/0x208 .. snip .. Call Trace: [<ffffffff81627759>] do_page_fault+0x399/0x4b0 [<ffffffff81004f4c>] ? xen_mc_extend_args+0xec/0x110 [<ffffffff81624065>] page_fault+0x25/0x30 [<ffffffff81184d03>] ? mem_cgroup_charge_statistics.isra.13+0x13/0x50 [<ffffffff81186f78>] __mem_cgroup_uncharge_common+0xd8/0x350 [<ffffffff8118aac7>] mem_cgroup_uncharge_page+0x57/0x60 [<ffffffff8115fbc0>] page_remove_rmap+0xe0/0x150 [<ffffffff8115311a>] ? vm_normal_page+0x1a/0x80 [<ffffffff81153e61>] unmap_single_vma+0x531/0x870 [<ffffffff81154962>] unmap_vmas+0x52/0xa0 [<ffffffff81007442>] ? pte_mfn_to_pfn+0x72/0x100 [<ffffffff8115c8f8>] exit_mmap+0x98/0x170 [<ffffffff810050d9>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e [<ffffffff81059ce3>] mmput+0x83/0xf0 [<ffffffff810624c4>] exit_mm+0x104/0x130 [<ffffffff8106264a>] do_exit+0x15a/0x8c0 [<ffffffff810630ff>] do_group_exit+0x3f/0xa0 [<ffffffff81063177>] sys_exit_group+0x17/0x20 [<ffffffff8162bae9>] system_call_fastpath+0x16/0x1b Calling arch_flush_lazy_mmu_mode immediately after set_pgd makes the changes visible to the consistency checks. Cc: <stable@vger.kernel.org> RedHat-Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=914737 Tested-by: Josh Boyer <jwboyer@redhat.com> Reported-and-Tested-by: Krishna Raman <kraman@redhat.com> Signed-off-by: Samu Kallio <samu.kallio@aberdeencloud.com> Link: http://lkml.kernel.org/r/1364045796-10720-1-git-send-email-konrad.wilk@oracle.com Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* \|	Linux 3.9-rc6	Linus Torvalds	2013-04-07
\| \|
* \|	Merge git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	2013-04-07
\|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull KVM fix from Gleb Natapov: "Bugfix for the regression introduced by commit c300aa64ddf5" * git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: Allow cross page reads and writes from cached translations.
\| * \|	KVM: Allow cross page reads and writes from cached translations.	Andrew Honig	2013-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for kvm_gfn_to_hva_cache_init functions for reads and writes that will cross a page. If the range falls within the same memslot, then this will be a fast operation. If the range is split between two memslots, then the slower kvm_read_guest and kvm_write_guest are used. Tested: Test against kvm_clock unit tests. Signed-off-by: Andrew Honig <ahonig@google.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
* \| \|	Merge branch 'x86-urgent-for-linus' of ↵	Linus Torvalds	2013-04-07
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "Two quite small fixes: one a build problem, and the other fixes seccomp filters on x32." * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Fix rebuild with EFI_STUB enabled x86: remove the x32 syscall bitmask from syscall_get_nr()
\| * \| \|	x86: Fix rebuild with EFI_STUB enabled	Jan Beulich	2013-04-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	eboot.o and efi_stub_$(BITS).o didn't get added to "targets", and hence their .cmd files don't get included by the build machinery, leading to the files always getting rebuilt. Rather than adding the two files individually, take the opportunity and add $(VMLINUX_OBJS) to "targets" instead, thus allowing the assignment at the top of the file to be shrunk quite a bit. At the same time, remove a pointless flags override line - the variable assigned to was misspelled anyway, and the options added are meaningless for assembly sources. [ hpa: the patch is not minimal, but I am taking it for -urgent anyway since the excess impact of the patch seems to be small enough. ] Signed-off-by: Jan Beulich <jbeulich@suse.com> Link: http://lkml.kernel.org/r/515C5D2502000078000CA6AD@nat28.tlf.novell.com Cc: Matthew Garrett <mjg@redhat.com> Cc: Matt Fleming <matt.fleming@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
\| * \| \|	x86: remove the x32 syscall bitmask from syscall_get_nr()	Paul Moore	2013-04-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit fca460f95e928bae373daa8295877b6905bc62b8 simplified the x32 implementation by creating a syscall bitmask, equal to 0x40000000, that could be applied to x32 syscalls such that the masked syscall number would be the same as a x86_64 syscall. While that patch was a nice way to simplify the code, it went a bit too far by adding the mask to syscall_get_nr(); returning the masked syscall numbers can cause confusion with callers that expect syscall numbers matching the x32 ABI, e.g. unmasked syscall numbers. This patch fixes this by simply removing the mask from syscall_get_nr() while preserving the other changes from the original commit. While there are several syscall_get_nr() callers in the kernel, most simply check that the syscall number is greater than zero, in this case this patch will have no effect. Of those remaining callers, they appear to be few, seccomp and ftrace, and from my testing of seccomp without this patch the original commit definitely breaks things; the seccomp filter does not correctly filter the syscalls due to the difference in syscall numbers in the BPF filter and the value from syscall_get_nr(). Applying this patch restores the seccomp BPF filter functionality on x32. I've tested this patch with the seccomp BPF filters as well as ftrace and everything looks reasonable to me; needless to say general usage seemed fine as well. Signed-off-by: Paul Moore <pmoore@redhat.com> Link: http://lkml.kernel.org/r/20130215172143.12549.10292.stgit@localhost Cc: <stable@vger.kernel.org> Cc: Will Drewry <wad@chromium.org> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
* \| \| \|	alpha: irq: remove deprecated use of IRQF_DISABLED	Will Deacon	2013-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Interrupt handlers are always invoked with interrupts disabled, so remove all uses of the deprecated IRQF_DISABLED flag. Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \| \|	alpha: irq: run all handlers with interrupts disabled	Will Deacon	2013-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Linux has expected that interrupt handlers are executed with local interrupts disabled for a while now, so ensure that this is the case on Alpha even for non-device interrupts such as IPIs. Without this patch, secondary boot results in the following backtrace: warning: at kernel/softirq.c:139 __local_bh_enable+0xb8/0xd0() trace: __local_bh_enable+0xb8/0xd0 irq_enter+0x74/0xa0 scheduler_ipi+0x50/0x100 handle_ipi+0x84/0x260 do_entint+0x1ac/0x2e0 irq_exit+0x60/0xa0 handle_irq+0x98/0x100 do_entint+0x2c8/0x2e0 ret_from_sys_call+0x0/0x10 load_balance+0x3e4/0x870 cpu_idle+0x24/0x80 rcu_eqs_enter_common.isra.38+0x0/0x120 cpu_idle+0x40/0x80 rest_init+0xc0/0xe0 _stext+0x1c/0x20 A similar dump occurs if you try to reboot using magic-sysrq. Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \| \|	alpha: makefile: don't enforce small data model for kernel builds	Will Deacon	2013-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Due to all of the goodness being packed into today's kernels, the resulting image isn't as slim as it once was. In light of this, don't pass -msmall-data to gcc, which otherwise results in link failures due to impossible relocations when compiling anything but the most trivial configurations. Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Thorsten Kranzkowski <dl8bcu@dl8bcu.de> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \| \|	alpha: Add irongate_io to PCI bus resources	Jay Estabrook	2013-04-07
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes a NULL pointer dereference at boot on UP1500. Cc: stable@vger.kernel.org Reviewed-and-Tested-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Jay Estabrook <jay.estabrook@gmail.com> Signed-off-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Michael Cree <mcree@orcon.net.nz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \| \|	Merge tag 'dm-3.9-fixes-2' of ↵	Linus Torvalds	2013-04-05
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm Pull device-mapper fixes from Alasdair Kergon: "A pair of patches to fix the writethrough mode of the device-mapper cache target when the device being cached is not itself wrapped with device-mapper." * tag 'dm-3.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-dm: dm cache: reduce bio front_pad size in writeback mode dm cache: fix writes to cache device in writethrough mode
\| * \| \| \|	dm cache: reduce bio front_pad size in writeback mode	Mike Snitzer	2013-04-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A recent patch to fix the dm cache target's writethrough mode extended the bio's front_pad to include a 1056-byte struct dm_bio_details. Writeback mode doesn't need this, so this patch reduces the per_bio_data_size to 16 bytes in this case instead of 1096. The dm_bio_details structure was added in "dm cache: fix writes to cache device in writethrough mode" which fixed commit e2e74d617e ("dm cache: fix race in writethrough implementation"). In writeback mode we avoid allocating the writethrough-specific members of the per_bio_data structure (the dm_bio_details structure included). Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
\| * \| \| \|	dm cache: fix writes to cache device in writethrough mode	Darrick J. Wong	2013-04-05
\| \|/ / / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The dm-cache writethrough strategy introduced by commit e2e74d617eadc15 ("dm cache: fix race in writethrough implementation") issues a bio to the origin device, remaps and then issues the bio to the cache device. This more conservative in-series approach was selected to favor correctness over performance (of the previous parallel writethrough). However, this in-series implementation that reuses the same bio to write both the origin and cache device didn't take into account that the block layer's req_bio_endio() modifies a completing bio's bi_sector and bi_size. So the new writethrough strategy needs to preserve these bio fields, and restore them before submission to the cache device, otherwise nothing gets written to the cache (because bi_size is 0). This patch adds a struct dm_bio_details field to struct per_bio_data, and uses dm_bio_record() and dm_bio_restore() to ensure the bio is restored before reissuing to the cache device. Adding such a large structure to the per_bio_data is not ideal but we can improve this later, for now correctness is the important thing. This problem initially went unnoticed because the dm-cache test-suite uses a linear DM device for the dm-cache device's origin device. Writethrough worked as expected because DM submits a clone of the original bio, so the original bio which was reused for the cache was never touched. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com>
* \| \| \|	Merge tag 'pci-v3.9-fixes-1' of ↵	Linus Torvalds	2013-04-05
\|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "PCI updates for v3.9: ASPM Revert "PCI/ACPI: Request _OSC control before scanning PCI root bus" kexec PCI: Don't try to disable Bus Master on disconnected PCI devices Platform ROM images PCI: Add PCI ROM helper for platform-provided ROM images nouveau: Attempt to use platform-provided ROM image radeon: Attempt to use platform-provided ROM image Hotplug PCI/ACPI: Always resume devices on ACPI wakeup notifications PCI/PM: Disable runtime PM of PCIe ports EISA EISA/PCI: Fix bus res reference EISA/PCI: Init EISA early, before PNP" * tag 'pci-v3.9-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI/PM: Disable runtime PM of PCIe ports PCI/ACPI: Always resume devices on ACPI wakeup notifications PCI: Don't try to disable Bus Master on disconnected PCI devices Revert "PCI/ACPI: Request _OSC control before scanning PCI root bus" radeon: Attempt to use platform-provided ROM image nouveau: Attempt to use platform-provided ROM image EISA/PCI: Init EISA early, before PNP EISA/PCI: Fix bus res reference PCI: Add PCI ROM helper for platform-provided ROM images
\| * \| \| \|	PCI/PM: Disable runtime PM of PCIe ports	Rafael J. Wysocki	2013-04-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The runtime PM of PCIe ports turns out to be quite fragile, as in some cases things work while in some other cases they don't and we don't seem to have a good way to determine whether or not they are going to work in advance. For this reason, avoid enabling runtime PM for PCIe ports by keeping their runtime PM reference counters always above 0 for the time being. When a PCIe port is suspended, it can no longer report events like hotplug, so hotplug below the port may not work, as in the bug report below. [bhelgaas: changelog, stable] Reference: https://bugzilla.kernel.org/show_bug.cgi?id=53811 Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: stable@vger.kernel.org # v3.6+
\| * \| \| \|	PCI/ACPI: Always resume devices on ACPI wakeup notifications	Rafael J. Wysocki	2013-04-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It turns out that the _Lxx control methods provided by some BIOSes clear the PME Status bit of PCI devices they handle, which means that pci_acpi_wake_dev() cannot really use that bit to check whether or not the device has signalled wakeup. One symptom of the problem is, for example, that when an affected PCI USB controller is runtime-suspended, then plugging in a new USB device into one of the controller's ports will not wake up the controller, which should happen. For this reason, make pci_acpi_wake_dev() always attempt to resume the device it is called for regardless of the device's PME Status bit value (that bit still has to be cleared if set at this point, though). Reported-by: Sarah Sharp <sarah.a.sharp@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Matthew Garrett <mjg59@srcf.ucam.org> CC: stable@vger.kernel.org # v3.7+
\| * \| \| \|	PCI: Don't try to disable Bus Master on disconnected PCI devices	Konstantin Khlebnikov	2013-04-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a fix for commit 7897e60227 ("PCI: Disable Bus Master unconditionally in pci_device_shutdown()"). Vivek reported that with this commit, kexec failed because none of his SATA disks came up. A ->shutdown() callback might put the device in D3cold, which means config space is no longer available. [bhelgaas: changelog] Link: https://lkml.org/lkml/2013/3/12/529 Reported-and-Tested-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
\| * \| \| \|	Revert "PCI/ACPI: Request _OSC control before scanning PCI root bus"	Bjorn Helgaas	2013-04-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 8c33f51df406e1a1f7fa4e9b244845b7ebd61fa6. Conflicts: drivers/acpi/pci_root.c This commit broke some pre-1.1 PCIe devices by leaving them with ASPM enabled. Previously, we had disabled ASPM on these devices because many of them don't implement it correctly (per 149e1637). Requesting _OSC control early means that aspm_disabled may be set before we scan the PCI bus and configure link ASPM state. But the ASPM configuration currently skips the check for pre-PCIe 1.1 devices when aspm_disabled is set, like this: acpi_pci_root_add acpi_pci_osc_support if (flags != base_flags) pcie_no_aspm aspm_disabled = 1 pci_acpi_scan_root ... pcie_aspm_init_link_state pcie_aspm_sanity_check if (!aspm_disabled) /* check for pre-PCIe 1.1 device */ Therefore, setting aspm_disabled early means that we leave ASPM enabled on these pre-PCIe 1.1 devices, which is a regression for some devices. The best fix would be to clean up the ASPM init so we can evaluate _OSC before scanning the bug (that way boot-time and hot-add discovery will work the same), but that requires significant rework. For now, we'll just revert the _OSC change as the lowest-risk fix. Reference: https://bugzilla.kernel.org/show_bug.cgi?id=55211 Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Yinghai Lu <yinghai@kernel.org> CC: stable@vger.kernel.org # v3.8+
\| * \| \| \|	Merge branch 'pci/yinghai-eisa' into for-linus	Bjorn Helgaas	2013-04-02
\| \|\ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* pci/yinghai-eisa: EISA/PCI: Init EISA early, before PNP EISA/PCI: Fix bus res reference
\| \| * \| \| \|	EISA/PCI: Init EISA early, before PNP	Yinghai Lu	2013-04-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Matthew reported kernels fail the pci_eisa probe and are later successful with the virtual_eisa_root_init force probe without slot0. The reason for that is: PNP probing is before pci_eisa_init gets called as pci_eisa_init is called via pci_driver. pnp 00:0f has 0xc80 - 0xc84 reserved. [ 9.700409] pnp 00:0f: [io 0x0c80-0x0c84] so eisa_probe will fail from pci_eisa_init ==>eisa_root_register ==>eisa_probe path. as force_probe is not set in pci_eisa_root, it will bail early when slot0 is not probed and initialized. Try to use subsys_initcall_sync instead, and will keep following sequence: pci_subsys_init pci_eisa_init_early pnpacpi_init/isapnp_init After this patch EISA can be initialized properly, and PNP overlapping resource will not be reserved. [ 10.104434] system 00:0f: [io 0x0c80-0x0c84] could not be reserved Reported-by: Matthew Whitehead <mwhitehe@redhat.com> Tested-by: Matthew Whitehead <mwhitehe@redhat.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org
\| \| * \| \| \|	EISA/PCI: Fix bus res reference	Yinghai Lu	2013-04-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Matthew found that 3.8.3 is having problems with an old (ancient) PCI-to-EISA bridge, the Intel 82375. It worked with the 3.2 kernel. He identified the 82375, but doesn't assign the struct resource *res pointer inside the struct eisa_root_device, and panics. pci_eisa_init() was using bus->resource[] directly instead of pci_bus_resource_n(). The bus->resource[] array is a PCI-internal implementation detail, and after commit 45ca9e97 (PCI: add helpers for building PCI bus resource lists) and commit 0efd5aab (PCI: add struct pci_host_bridge_window with CPU/bus address offset), bus->resource[] is not used for PCI root buses any more. The 82375 is a subtractive-decode PCI device, so handle it the same way we handle PCI-PCI bridges in subtractive-decode mode in pci_read_bridge_bases(). [bhelgaas: changelog] Reported-by: Matthew Whitehead <mwhitehe@redhat.com> Tested-by: Matthew Whitehead <mwhitehe@redhat.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v3.3+
\| * \| \| \| \|	Merge branch 'pci/mjg-rom' into for-linus	Bjorn Helgaas	2013-04-01
\| \|\ \ \ \ \ \| \| \|/ / / / \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* pci/mjg-rom: radeon: Attempt to use platform-provided ROM image nouveau: Attempt to use platform-provided ROM image PCI: Add PCI ROM helper for platform-provided ROM images
\| \| * \| \| \|	radeon: Attempt to use platform-provided ROM image	Matthew Garrett	2013-04-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some platforms only provide their PCI ROM via a platform-specific interface. Fall back to attempting that if all other sources fail. Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
\| \| * \| \| \|	nouveau: Attempt to use platform-provided ROM image	Matthew Garrett	2013-04-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some platforms only provide their PCI ROM via a platform-specific interface. Fall back to attempting that if all other sources fail. Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
\| \| * \| \| \|	PCI: Add PCI ROM helper for platform-provided ROM images	Matthew Garrett	2013-03-26
\| \|/ / / / \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It turns out that some UEFI systems provide apparently an apparently valid PCI ROM BAR that turns out to contain garbage, so the attempt in 547b52463 to prefer the ROM from the BAR actually breaks a different set of machines. As Linus pointed out, the graphics drivers are probably in the best position to make this judgement, so this basically reverts 547b52463 and f9a37be0f and adds a new helper function. Followup patches will add support to nouveau and radeon for probing this ROM source if they can't find a ROM from some other source. [bhelgaas: added reporter and bugzilla pointers, s/f4eb5ff05/547b52463] Reference: https://bugzilla.redhat.com/show_bug.cgi?id=927451 Reference: http://lkml.kernel.org/r/kg69ef$vdb$1@ger.gmane.org Reported-by: Mantas Mikulėnas <grawity@gmail.com> Reported-by: Chris Murphy <bugzilla@colorremedies.com> Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
* \| \| \| \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	2013-04-05
\|\ \ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Pull networking fixes from David Miller: 1) Fix erroneous sock_orphan() leading to crashes and double kfree_skb() in NFC protocol. From Thierry Escande and Samuel Ortiz. 2) Fix use after free in remain-on-channel mac80211 code, from Johannes Berg. 3) nf_reset() needs to reset the NF tracing cookie, otherwise we can leak it from one namespace into another. Fix from Gao Feng and Patrick McHardy. 4) Fix overflow in channel scanning array of mwifiex driver, from Stone Piao. 5) Fix loss of link after suspend/shutdown in r8169, from Hayes Wang. 6) Synchronization of unicast address lists to the undelying device doesn't work because whether to sync is maintained as a boolean rather than a true count. Fix from Vlad Yasevich. 7) Fix corruption of TSO packets in atl1e by limiting the segmented packet length. From Hannes Frederic Sowa. 8) Revert bogus AF_UNIX credential passing change and fix the coalescing issue properly, from Eric W Biederman. 9) Changes of ipv4 address lifetime settings needs to generate a notification, from Jiri Pirko. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (22 commits) netfilter: don't reset nf_trace in nf_reset() net: ipv4: notify when address lifetime changes ixgbe: fix registration order of driver and DCA nofitication af_unix: If we don't care about credentials coallesce all messages Revert "af_unix: dont send SCM_CREDENTIAL when dest socket is NULL" bonding: remove sysfs before removing devices atl1e: limit gso segment size to prevent generation of wrong ip length fields net: count hw_addr syncs so that unsync works properly. r8169: fix auto speed down issue netfilter: ip6t_NPT: Fix translation for non-multiple of 32 prefix lengths mwifiex: limit channel number not to overflow memory NFC: microread: Fix build failure due to a new MEI bus API iwlwifi: dvm: fix the passive-no-RX workaround netfilter: nf_conntrack: fix error return code NFC: llcp: Keep the connected socket parent pointer alive mac80211: fix idle handling sequence netfilter: nfnetlink_acct: return -EINVAL if object name is empty netfilter: nfnetlink_queue: fix error return code in nfnetlink_queue_init() netfilter: reset nf_trace in nf_reset mac80211: fix remain-on-channel cancel crash ...
\| * \| \| \| \|	netfilter: don't reset nf_trace in nf_reset()	Patrick McHardy	2013-04-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 130549fe ("netfilter: reset nf_trace in nf_reset") added code to reset nf_trace in nf_reset(). This is wrong and unnecessary. nf_reset() is used in the following cases: - when passing packets up the the socket layer, at which point we want to release all netfilter references that might keep modules pinned while the packet is queued. nf_trace doesn't matter anymore at this point. - when encapsulating or decapsulating IPsec packets. We want to continue tracing these packets after IPsec processing. - when passing packets through virtual network devices. Only devices on that encapsulate in IPv4/v6 matter since otherwise nf_trace is not used anymore. Its not entirely clear whether those packets should be traced after that, however we've always done that. - when passing packets through virtual network devices that make the packet cross network namespace boundaries. This is the only cases where we clearly want to reset nf_trace and is also what the original patch intended to fix. Add a new function nf_reset_trace() and use it in dev_forward_skb() to fix this properly. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \| \| \|	net: ipv4: notify when address lifetime changes	Jiri Pirko	2013-04-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	if userspace changes lifetime of address, send netlink notification and call notifier. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \| \| \|	ixgbe: fix registration order of driver and DCA nofitication	Jakub Kicinski	2013-04-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ixgbe_notify_dca cannot be called before driver registration because it expects driver's klist_devices to be allocated and initialized. While on it make sure debugfs files are removed when registration fails. Cc: stable <stable@vger.kernel.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \| \| \|	af_unix: If we don't care about credentials coallesce all messages	Eric W. Biederman	2013-04-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It was reported that the following LSB test case failed https://lsbbugs.linuxfoundation.org/attachment.cgi?id=2144 because we were not coallescing unix stream messages when the application was expecting us to. The problem was that the first send was before the socket was accepted and thus sock->sk_socket was NULL in maybe_add_creds, and the second send after the socket was accepted had a non-NULL value for sk->socket and thus we could tell the credentials were not needed so we did not bother. The unnecessary credentials on the first message cause unix_stream_recvmsg to start verifying that all messages had the same credentials before coallescing and then the coallescing failed because the second message had no credentials. Ignoring credentials when we don't care in unix_stream_recvmsg fixes a long standing pessimization which would fail to coallesce messages when reading from a unix stream socket if the senders were different even if we did not care about their credentials. I have tested this and verified that the in the LSB test case mentioned above that the messages do coallesce now, while the were failing to coallesce without this change. Reported-by: Karel Srot <ksrot@redhat.com> Reported-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \| \| \|	Revert "af_unix: dont send SCM_CREDENTIAL when dest socket is NULL"	Eric W. Biederman	2013-04-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 14134f6584212d585b310ce95428014b653dfaf6. The problem that the above patch was meant to address is that af_unix messages are not being coallesced because we are sending unnecesarry credentials. Not sending credentials in maybe_add_creds totally breaks unconnected unix domain sockets that wish to send credentails to other sockets. In practice this break some versions of udev because they receive a message and the sending uid is bogus so they drop the message. Reported-by: Sven Joachim <svenjoac@gmx.de> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \| \| \|	bonding: remove sysfs before removing devices	Veaceslav Falico	2013-04-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We have a race condition if we try to rmmod bonding and simultaneously add a bond master through sysfs. In bonding_exit() we first remove the devices (through rtnl_link_unregister() ) and only after that we remove the sysfs. If we manage to add a device through sysfs after that the devices were removed - we'll end up with that device/sysfs structure and with the module unloaded. Fix this by first removing the sysfs and only after that calling rtnl_link_unregister(). Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \| \| \|	atl1e: limit gso segment size to prevent generation of wrong ip length fields	Hannes Frederic Sowa	2013-04-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The limit of 0x3c00 is taken from the windows driver. Suggested-by: Huang, Xiong <xiong@qca.qualcomm.com> Cc: Huang, Xiong <xiong@qca.qualcomm.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \| \| \|	net: count hw_addr syncs so that unsync works properly.	Vlad Yasevich	2013-04-05
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A few drivers use dev_uc_sync/unsync to synchronize the address lists from master down to slave/lower devices. In some cases (bond/team) a single address list is synched down to multiple devices. At the time of unsync, we have a leak in these lower devices, because "synced" is treated as a boolean and the address will not be unsynced for anything after the first device/call. Treat "synced" as a count (same as refcount) and allow all unsync calls to work. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>