litmus-rt.git - The LITMUS^RT kernel.

	Commit message (Collapse)	Author	Age
*	intel-iommu: Set a more specific taint flag for invalid BIOS DMAR tables	Ben Hutchings	2010-05-19
\| \| \| \| \| \| \| \|	We now know how to deal with these tables so that they are harmless. Set TAINT_FIRMWARE_WORKAROUND instead of the default TAINT_WARN. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	intel-iommu: Combine the BIOS DMAR table warning messages	Ben Hutchings	2010-05-19
\| \| \| \| \| \| \| \|	We have nearly the same code for warnings repeated four times. Move it into a separate function. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	panic: Add taint flag TAINT_FIRMWARE_WORKAROUND ('I')	Ben Hutchings	2010-05-19
\| \| \| \| \| \| \| \|	This taint flag will initially be used when warning about invalid ACPI DMAR tables. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	panic: Allow warnings to set different taint flags	Ben Hutchings	2010-05-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	WARN() is used in some places to report firmware or hardware bugs that are then worked-around. These bugs do not affect the stability of the kernel and should not set the flag for TAINT_WARN. To allow for this, add WARN_TAINT() and WARN_TAINT_ONCE() macros that take a taint number as argument. Architectures that implement warnings using trap instructions instead of calls to warn_slowpath_*() now implement __WARN_TAINT(taint) instead of __WARN(). Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Acked-by: Helge Deller <deller@gmx.de> Tested-by: Paul Mundt <lethal@linux-sh.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	intel-iommu: intel_iommu_map_range failed at very end of address space	Tom Lyon	2010-05-17
\| \| \| \| \| \| \| \| \|	intel_iommu_map_range() doesn't allow allocation at the very end of the address space; that code has been simplified and corrected. Signed-off-by: Tom Lyon <pugs@cisco.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	intel-iommu: errors with smaller iommu widths	Tom Lyon	2010-05-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When using iommu_domain_alloc with the Intel iommu, the domain address width is always initialized to 48 bits (agaw 2). This domain->agaw value is then used by pfn_to_dma_pte to (always) build a 4 level page table. However, not all systems support iommu width of 48 or 4 level page tables. In particular, the Core i5-660 and i5-670 support an address width of 36 bits (not 39!), an agaw of only 1, and only 3 level page tables. This version of the patch simply lops off extra levels of the page tables if the agaw value of the iommu is less than what is currently allocated for the domain (in intel_iommu_attach_device). If there were already allocated addresses above what the new iommu can handle, EFAULT is returned. Signed-off-by: Tom Lyon <pugs@cisco.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	intel-iommu: Fix boot inside 64bit virtualbox with io-apic disabled	Arnaud Patard	2010-04-14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit 074835f0143b83845af5044af2739c52c9f53808 ("intel-iommu: Fix kernel hand if interrupt remapping disabled in BIOS") is adding a check for interrupt remapping disabled and is dereferencing the dmar_tbl pointer without checking its value. Unfortunately, this value is null when booting inside a 64bit virtual box guest with io-apic disabled, leading to a crash. With a check on it, the guest is now booting. It's triggering a WARN() in clockevent_delta2ns but it's better than not booting at all and allows the user to see there's something wrong on their io-apic setup. Signed-off-by: Arnaud Patard <apatard@mandriva.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	intel-iommu: use physfn to search drhd for VF	Yinghai	2010-04-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When virtfn is used, we should use physfn to find correct drhd -v2: add pci_physfn() Suggested by Roland Dreier <rdreier@cisco.com> do can remove ifdef in dmar.c -v3: Chris pointed out we need that for dma_find_matched_atsr_unit too also change dmar_pci_device_match() static Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Roland Dreier <rdreier@cisco.com> Acked-by: Chris Wright <chrisw@sous-sol.org> Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	intel-iommu: Print out iommu seq_id	Yinghai Lu	2010-04-09
\| \| \| \| \| \| \|	more info on system with more than one IOMMU Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	intel-iommu: Don't complain that ACPI_DMAR_SCOPE_TYPE_IOAPIC is not supported	Yinghai Lu	2010-04-09
\| \| \| \| \|	Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	intel-iommu: Avoid global flushes with caching mode.	Nadav Amit	2010-04-09
\| \| \| \| \| \| \| \| \| \| \|	While it may be efficient on real hardware, emulation of global invalidations is very expensive as all shadow entries must be examined. This patch changes the behaviour when caching mode is enabled (which is the case when IOMMU emulation takes place). In this case, page specific invalidation is used instead. Signed-off-by: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	intel-iommu: Use correct domain ID when caching mode is enabled	Nadav Amit	2010-04-09
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In caching-mode mappings of pages (changes from non-present to present) require invalidation. Currently, this IOTLB flush is performed with domain ID of zero. This is not according to the VT-d spec and causes big problems for emulating software. This patch uses the correct domain ID in IOTLB flushes. Device IOTLB invalidation is performed only on present to non-present changes. This decision is now based on explicit parameter instead of zero domain-ID. Signed-off-by: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	intel-iommu mistakenly uses offset_pfn when caching mode is enabled	Nadav Amit	2010-04-09
\| \| \| \| \| \| \| \|	intel_map_sg used offset_pfn which was set to zero when invalidating the IOTLB. intel_map_sg now uses size variable for this matter. Signed-off-by: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	intel-iommu: use for_each_set_bit()	Akinobu Mita	2010-04-09
\| \| \| \| \| \| \| \|	Replace open-coded loop with for_each_set_bit(). Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
*	Merge branch 'master' of ↵	David Woodhouse	2010-04-09
\|\ \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6
\| *	Linux 2.6.34-rc3v2.6.34-rc3	Linus Torvalds	2010-03-30
\| \|
\| *	KEYS: Add MAINTAINERS record	David Howells	2010-03-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a MAINTAINERS record for the key management facility. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
\| *	Merge branch 'for-linus' of ↵	Linus Torvalds	2010-03-30
\| \|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: CRED: Fix memory leak in error handling
\| \| *	CRED: Fix memory leak in error handling	Mathieu Desnoyers	2010-03-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix a memory leak on an OOM condition in prepare_usermodehelper_creds(). Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <jmorris@namei.org>
\| * \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/joern/logfs	Linus Torvalds	2010-03-30
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* git://git.kernel.org/pub/scm/linux/kernel/git/joern/logfs: [LogFS] Erase new journal segments [LogFS] Move reserved segments with journal [LogFS] Clear PagePrivate when moving journal Simplify and fix pad_wbuf Prevent data corruption in logfs_rewrite_block() Use deactivate_locked_super Fix logfs_get_sb_final error path Write out both superblocks on mismatch Prevent schedule while atomic in __logfs_readdir Plug memory leak in writeseg_end_io Limit max_pages for insane devices Open segment file before using it
\| \| * \|	[LogFS] Erase new journal segments	Joern Engel	2010-03-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the device contains on old logfs image and the journal is moved to segment that have never been used by the current logfs and not all journal segments are erased before the next mount, the old content can confuse mount code. To prevent this, always erase the new journal segments. Signed-off-by: Joern Engel <joern@logfs.org>
\| \| * \|	[LogFS] Move reserved segments with journal	Joern Engel	2010-03-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes a GC livelock. Signed-off-by: Joern Engel <joern@logfs.org>
\| \| * \|	[LogFS] Clear PagePrivate when moving journal	Joern Engel	2010-03-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	do_logfs_journal_wl_pass() must call freeseg(), thereby clear PagePrivate on all pages of the current journal segment. Signed-off-by: Joern Engel <joern@logfs.org>
\| \| * \|	Simplify and fix pad_wbuf	Joern Engel	2010-03-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A comment in the old code read: /* The math in this function can surely use some love */ And indeed it did. In the case that area->a_used_bytes is exactly 4096 bytes below segment size it fell apart. pad_wbuf is now split into two helpers that are significantly less complicated. Signed-off-by: Joern Engel <joern@logfs.org>
\| \| * \|	Prevent data corruption in logfs_rewrite_block()	Joern Engel	2010-03-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The comment was correct, so make the code match the comment. As the new comment indicates, we might be able to do a little less work. But for the current -rc series let's keep it simple and just fix the bug. Signed-off-by: Joern Engel <joern@logfs.org>
\| \| * \|	Use deactivate_locked_super	Joern Engel	2010-03-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Found by Al Viro. Signed-off-by: Joern Engel <joern@logfs.org>
\| \| * \|	Fix logfs_get_sb_final error path	Joern Engel	2010-03-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	rootdir was already allocated, so we must iput it again. Found by Al Viro. Signed-off-by: Joern Engel <joern@logfs.org>
\| \| * \|	Write out both superblocks on mismatch	Joern Engel	2010-03-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the first superblock is wrong and the second gets written, there will still be a mismatch on next mount. Write both to make sure they match. Signed-off-by: Joern Engel <joern@logfs.org>
\| \| * \|	Prevent schedule while atomic in __logfs_readdir	Joern Engel	2010-03-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Apparently filldir can sleep, which forbids kmap_atomic. Signed-off-by: Joern Engel <joern@logfs.org>
\| \| * \|	Plug memory leak in writeseg_end_io	Joern Engel	2010-03-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Joern Engel <joern@logfs.org>
\| \| * \|	Limit max_pages for insane devices	Joern Engel	2010-03-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Intel SSDs have a limit of 0xffff as queue_max_hw_sectors(q). Such a limit may make sense from a hardware pov, but it causes bio_alloc() to return NULL. Signed-off-by: Joern Engel <joern@logfs.org>
\| \| * \|	Open segment file before using it	Joern Engel	2010-03-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	logfs_recover_sb() needs it open. Signed-off-by: Joern Engel <joern@logfs.org>
\| * \| \|	Merge branch 'x86-fixes-for-linus' of ↵	Linus Torvalds	2010-03-30
\| \|\ \ \ \| \| \|_\|/ \| \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: Do not free zero sized per cpu areas x86: Make sure free_init_pages() frees pages on page boundary x86: Make smp_locks end with page alignment
\| \| * \|	x86: Do not free zero sized per cpu areas	Ian Campbell	2010-03-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This avoids an infinite loop in free_early_partial(). Add a warning to free_early_partial() to catch future problems. -v5: put back start > end back into WARN_ONCE() -v6: use one line for warning, suggested by Linus -v7: more tests -v8: remove the function name as suggested by Johannes WARN_ONCE() will print out that function name. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Joel Becker <joel.becker@oracle.com> Tested-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1269830604-26214-4-git-send-email-yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
\| \| * \|	x86: Make sure free_init_pages() frees pages on page boundary	Yinghai Lu	2010-03-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When CONFIG_NO_BOOTMEM=y, it could use memory more effiently, or in a more compact fashion. Example: Allocated new RAMDISK: 00ec2000 - 0248ce57 Move RAMDISK from 000000002ea04000 - 000000002ffcee56 to 00ec2000 - 0248ce56 The new RAMDISK's end is not page aligned. Last page could be shared with other users. When free_init_pages are called for initrd or .init, the page could be freed and we could corrupt other data. code segment in free_init_pages(): \| for (; addr < end; addr += PAGE_SIZE) { \| ClearPageReserved(virt_to_page(addr)); \| init_page_count(virt_to_page(addr)); \| memset((void *)(addr & ~(PAGE_SIZE-1)), \| POISON_FREE_INITMEM, PAGE_SIZE); \| free_page(addr); \| totalram_pages++; \| } last half page could be used as one whole free page. So page align the boundaries. -v2: make the original initramdisk to be aligned, according to Johannes, otherwise we have the chance to lose one page. we still need to keep initrd_end not aligned, otherwise it could confuse decompressor. -v3: change to WARN_ON instead, suggested by Johannes. -v4: use PAGE_ALIGN, suggested by Johannes. We may fix that macro name later to PAGE_ALIGN_UP, and PAGE_ALIGN_DOWN Add comments about assuming ramdisk start is aligned in relocate_initrd(), change to re get ramdisk_image instead of save it to make diff smaller. Add warning for wrong range, suggested by Johannes. -v6: remove one WARN() We need to align beginning in free_init_pages() do not copy more than ramdisk_size, noticed by Johannes Reported-by: Stanislaw Gruszka <sgruszka@redhat.com> Tested-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1269830604-26214-3-git-send-email-yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
\| \| * \|	x86: Make smp_locks end with page alignment	Yinghai Lu	2010-03-29
\| \| \|/ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix: ------------[ cut here ]------------ WARNING: at arch/x86/mm/init.c:342 free_init_pages+0x4c/0xfa() free_init_pages: range [0x40daf000, 0x40db5c24] is not aligned Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.34-rc2-tip-03946-g4f16b23-dirty #50 Call Trace: [<40232e9f>] warn_slowpath_common+0x65/0x7c [<4021c9f0>] ? free_init_pages+0x4c/0xfa [<40881434>] ? _etext+0x0/0x24 [<40232eea>] warn_slowpath_fmt+0x24/0x27 [<4021c9f0>] free_init_pages+0x4c/0xfa [<40881434>] ? _etext+0x0/0x24 [<40d3f4bd>] alternative_instructions+0xf6/0x100 [<40d3fe4f>] check_bugs+0xbd/0xbf [<40d398a7>] start_kernel+0x2d5/0x2e4 [<40d390ce>] i386_start_kernel+0xce/0xd5 ---[ end trace 4eaa2a86a8e2da22 ]--- Comments in vmlinux.lds.S already said: \| /* \| * smp_locks might be freed after init \| * start/end must be page aligned \| */ Signed-off-by: Yinghai Lu <yinghai@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: David Miller <davem@davemloft.net> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1269830604-26214-2-git-send-email-yinghai@kernel.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
\| * \|	Merge branch 'upstream-linus' of ↵	Linus Torvalds	2010-03-29
\| \|\ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2: Fix a race in o2dlm lockres mastery Ocfs2: Handle deletion of reflinked oprhan inodes correctly. Ocfs2: Journaling i_flags and i_orphaned_slot when adding inode to orphan dir. ocfs2: Clear undo bits when local alloc is freed ocfs2: Init meta_ac properly in ocfs2_create_empty_xattr_block. ocfs2: Fix the update of name_offset when removing xattrs ocfs2: Always try for maximum bits with new local alloc windows ocfs2: set i_mode on disk during acl operations ocfs2: Update i_blocks in reflink operations. ocfs2: Change bg_chain check for ocfs2_validate_gd_parent. [PATCH] Skip check for mandatory locks when unlocking
\| \| * \|	ocfs2: Fix a race in o2dlm lockres mastery	Srinivas Eeda	2010-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In o2dlm, the master of a lock resource keeps a map of all interested nodes. This prevents the master from purging the resource before an interested node can create a lock. A race between the mastery thread and the mastery handler allowed an interested node to discover who the master is without informing the master directly. This is easily fixed by holding the dlm spinlock a little longer in the mastery handler. Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
\| \| * \|	Ocfs2: Handle deletion of reflinked oprhan inodes correctly.	Tristan Ye	2010-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The rule is that all inodes in the orphan dir have ORPHANED_FL, otherwise we treated it as an ERROR. This rule works well except for some rare cases of reflink operation: http://oss.oracle.com/bugzilla/show_bug.cgi?id=1215 The problem is caused by how reflink and our orphan_scan thread interact. * The orphan scan pulls the orphans into a queue first, then runs the queue at a later time. We only hold the orphan_dir's lock during scanning. * Reflink create a oprhaned target in orphan_dir as its first step. It removes the target and clears the flag as the final step. These two steps take the orphan_dir's lock, but it is not held for the duration. Based on the above semantics, a reflink inode can be moved out of the orphan dir and have its ORPHANED_FL cleared before the queue of orphans is run. This leads to a ERROR in ocfs2_query_wipde_inode(). This patch teaches ocfs2_query_wipe_inode() to detect previously orphaned reflink targets. If a reflink fails or a crash occurs during the relfink operation, the inode will retain ORPHANED_FL and will be properly wiped. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
\| \| * \|	Ocfs2: Journaling i_flags and i_orphaned_slot when adding inode to orphan dir.	Tristan Ye	2010-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, some callers were missing to journal the dirty inode after adding it to orphan dir. Now we're going to journal such modifications within the ocfs2_orphan_add() itself, It's safe to do so, though some existing caller may duplicate this, and it makes the logic look more straightforward anyway. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
\| \| * \|	ocfs2: Clear undo bits when local alloc is freed	Mark Fasheh	2010-03-23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When the local alloc file changes windows, unused bits are freed back to the global bitmap. By defnition, those bits can not be in use by any file. Also, the local alloc will never have been able to allocate those bits if they were part of a previous truncate. Therefore it makes sense that we should clear unused local alloc bits in the undo buffer so that they can be used immediatly. [ Modified to call it ocfs2_release_clusters() -- Joel ] Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
\| \| * \|	ocfs2: Init meta_ac properly in ocfs2_create_empty_xattr_block.	Tao Ma	2010-03-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	You can't store a pointer that you haven't filled in yet and expect it to work. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
\| \| * \|	ocfs2: Fix the update of name_offset when removing xattrs	Tao Ma	2010-03-19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When replacing a xattr's value, in some case we wipe its name/value first and then re-add it. The wipe is done by ocfs2_xa_block_wipe_namevalue() when the xattr is in the inode or block. We currently adjust name_offset for all the entries which have (offset < name_offset). This does not adjust the entrie we're replacing. Since we are replacing the entry, we don't adjust the total entry count. When we calculate a new namevalue location, we trust the entries now-wrong offset in ocfs2_xa_get_free_start(). The solution is to also adjust the name_offset for the replaced entry, allowing ocfs2_xa_get_free_start() to calculate the new namevalue location correctly. The following script can trigger a kernel panic easily. echo 'y'\|mkfs.ocfs2 --fs-features=local,xattr -b 4K $DEVICE mount -t ocfs2 $DEVICE $MNT_DIR FILE=$MNT_DIR/$RANDOM for((i=0;i<76;i++)) do string_76="a$string_76" done string_78="aa$string_76" string_82="aaaa$string_78" touch $FILE setfattr -n 'user.test1234567890' -v $string_76 $FILE setfattr -n 'user.test1234567890' -v $string_78 $FILE setfattr -n 'user.test1234567890' -v $string_82 $FILE Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
\| \| * \|	ocfs2: Always try for maximum bits with new local alloc windows	Mark Fasheh	2010-03-18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	What we were doing before was to ask for the current window size as the maximum allocation. This had the effect of limiting the amount of allocation we could get for the local alloc during times when the window size was shrunk due to fragmentation. In some cases, that could actually increase fragmentation by artificially limiting the number of bits we can accept. So while we still want to ask for a minimum number of bits equal to window size, there is no reason why we should limit the number of bits the local alloc should accept. Hence always allow the maximum number of local alloc bits. Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
\| \| * \|	ocfs2: set i_mode on disk during acl operations	Mark Fasheh	2010-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ocfs2_set_acl() and ocfs2_init_acl() were setting i_mode on the in-memory inode, but never setting it on the disk copy. Thus, acls were some times not getting propagated between nodes. This patch fixes the issue by adding a helper function ocfs2_acl_set_mode() which does this the right way. ocfs2_set_acl() and ocfs2_init_acl() are then updated to call ocfs2_acl_set_mode(). Signed-off-by: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
\| \| * \|	ocfs2: Update i_blocks in reflink operations.	Tao Ma	2010-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In reflink, we need to upate i_blocks for the target inode. Reported-by: Jie Liu <jeff.liu@oracle.com> Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
\| \| * \|	ocfs2: Change bg_chain check for ocfs2_validate_gd_parent.	Tao Ma	2010-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In ocfs2_validate_gd_parent, we check bg_chain against the cl_next_free_rec of the dinode. Actually in resize, we have the chance of bg_chain == cl_next_free_rec. So add some additional condition check for it. I also rename paramter "clean_error" to "resize", since the old one is not clearly enough to indicate that we should only meet with this case in resize. btw, the correpsonding bug is http://oss.oracle.com/bugzilla/show_bug.cgi?id=1230. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
\| \| * \|	[PATCH] Skip check for mandatory locks when unlocking	Sachin Prabhu	2010-03-17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ocfs2_lock() will skip locks on file which has mode set to 02666. This is a problem in cases where the mode of the file is changed after a process has obtained a lock on the file. ocfs2_lock() should skip the check for mandatory locks when unlocking a file. Signed-off-by: Sachin Prabhu <sprabhu@redhat.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>
\| * \| \|	Merge branch 'for-linus' of ↵	Linus Torvalds	2010-03-29
\| \|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (28 commits) ceph: update discussion list address in MAINTAINERS ceph: some documentations fixes ceph: fix use after free on mds __unregister_request ceph: avoid loaded term 'OSD' in documention ceph: fix possible double-free of mds request reference ceph: fix session check on mds reply ceph: handle kmalloc() failure ceph: propagate mds session allocation failures to caller ceph: make write_begin wait propagate ERESTARTSYS ceph: fix snap rebuild condition ceph: avoid reopening osd connections when address hasn't changed ceph: rename r_sent_stamp r_stamp ceph: fix connection fault con_work reentrancy problem ceph: prevent dup stale messages to console for restarting mds ceph: fix pg pool decoding from incremental osdmap update ceph: fix mds sync() race with completing requests ceph: only release unused caps with mds requests ceph: clean up handle_cap_grant, handle_caps wrt session mutex ceph: fix session locking in handle_caps, ceph_check_caps ceph: drop unnecessary WARN_ON in caps migration ...
\| \| * \| \|	ceph: update discussion list address in MAINTAINERS	Sage Weil	2010-03-29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@newdream.net>