litmus-rt-imx6.git - LITMUS^RT and MC^2 V1 support for the i.MX6 processor family.

	Commit message	Author	Age
*	powerpc: Only print clockevent settings once	Anton Blanchard	2010-02-08
\| \| \| \| \| \| \| \|	The clockevent multiplier and shift is useful information, but we only need to print it once. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc: Clear MSR_RI during RTAS calls	Anton Blanchard	2010-02-08
\| \| \| \| \| \| \| \| \| \|	RTAS should never cause an exception but if it does (for example accessing outside our RMO) then we might go a long way through the kernel before oopsing. If we unset MSR_RI we should at least stop things on exception exit. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc: Remove trailing space in messages	Frans Pop	2010-02-08
\| \| \| \| \| \| \| \|	Signed-off-by: Frans Pop <elendil@planet.nl> Cc: linuxppc-dev@ozlabs.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc: Make powerpc_firmware_features __read_mostly	Anton Blanchard	2010-02-08
\| \| \| \| \| \| \| \|	We use firmware_has_feature quite a lot these days, so it's worth putting powerpc_firmware_features into __read_mostly. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc: Reformat SD_NODE_INIT to match x86	Anton Blanchard	2010-02-08
\| \| \| \| \| \| \| \|	Clean up SD_NODE_INITS so we can easily compare it to x86. Similar to the work in 47734f89be0614b5acbd6a532390f9c72f019648 (sched: Clean up topology.h) Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc: Convert mmu context allocator from idr to ida	Anton Blanchard	2010-02-08
\| \| \| \| \| \| \| \|	We can use the much more lightweight ida allocator since we don't need the pointer storage idr provides. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc: Add last sysfs file and dump of ftrace buffer to oops printout	Anton Blanchard	2010-02-08
\|	Add printout of last accessed sysfs file, added to x86 in ae87221d3ce49d9de1e43756da834fd0bf05a2ad (sysfs: crash debugging)

	Also add the notify_die hook that allows us to print out the ftrace buffer on oops. This is useful in conjunction with ftrace function_graph:

	Oops: Kernel access of bad area, sig: 11 [#1]
	SMP NR_CPUS=128 NUMA pSeries
	last sysfs file: /sys/class/net/tunl0/type
	Dumping ftrace buffer:
	...
	 0)             \|  .sysrq_handle_crash() {
	 0)   0.476 us  \|    .hash_page();
	 0)   0.488 us  \|    .xmon_fault_handler();
	 0)             \|    .bad_page_fault() {
	 0)             \|      .search_exception_tables() {
	 0)   0.590 us  \|        .search_module_extables();
	 0)   2.546 us  \|      }
	 0)             \|      .printk() {
	 0)             \|        .vprintk() {
	 0)   0.488 us  \|          ._raw_spin_lock();
	 0)   0.572 us  \|          .emit_log_char();

	Showing the function graph of a sysrq-c crash.

	Signed-off-by: Anton Blanchard <anton@samba.org>
	Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc: Reduce differences between pseries and ppc64 defconfigs	Anton Blanchard	2010-02-08
\| \| \| \| \| \| \| \| \| \| \| \|	The pseries and ppc64 defconfigs have drifted apart over the years. Reduce some of the differences while still keeping the idea that the ppc64 defconfig is cross platform but enables fewer features than pseries, eg NR_CPUS is lower. Also enable a number of common adapters as modules. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc/pseries: Quieten cede latency printk	Anton Blanchard	2010-02-08
\| \| \| \| \| \| \| \| \|	The cede latency stuff is relatively new and we don't need to complain about it not working on older firmware. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	arch/powerpc: Fix continuation line formats	Joe Perches	2010-02-08
\| \| \| \| \| \| \| \|	String constants that are continued on subsequent lines with \ are not good. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc/pseries: Hypervisor call tracepoints hcall_stats touchup	Will Schmidt	2010-02-08
\| \| \| \| \| \| \| \| \| \| \|	The tb_total and purr_total values reported via the hcall_stats code should be cumulative, rather than being replaced by the latest delta tb or purr value. Tested-by: Will Schmidt <will_schmidt@vnet.ibm.com> Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com> Acked-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
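	The difference between a cumulative total and a "latest delta" value can be sketched as follows. This is a small Python model, not the kernel code: the `HcallStats` class and `record_exit` helper are hypothetical names that only echo the tb_total and purr_total wording of the commit message.

```python
from dataclasses import dataclass


@dataclass
class HcallStats:
    """Hypothetical per-hcall counters, modelled on the commit message."""
    num_calls: int = 0
    tb_total: int = 0    # cumulative timebase ticks
    purr_total: int = 0  # cumulative PURR ticks


def record_exit(stats: HcallStats, tb_delta: int, purr_delta: int) -> None:
    """Accumulate the deltas measured for one hypervisor call.

    The pre-fix behaviour described in the message would correspond to a
    plain assignment (stats.tb_total = tb_delta), which keeps only the
    value of the most recent call.
    """
    stats.num_calls += 1
    stats.tb_total += tb_delta
    stats.purr_total += purr_delta


stats = HcallStats()
for tb, purr in [(10, 4), (7, 3), (5, 2)]:
    record_exit(stats, tb, purr)
print(stats)
```

	With accumulation, dividing a total by `num_calls` gives a meaningful per-call average, which a "last delta" value cannot provide.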
*	powerpc/pseries: Pass more accurate number of supported cores to firmware	Benjamin Herrenschmidt	2010-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Updated variant of a patch by Joel Schopp. The field containing the number of supported cores which we pass to firmware via the ibm,client-architecture call was set by a previous patch statically as high as is possible (NR_CPUS). However, that value isn't quite right for a system that supports multiple threads per core, thus permitting the firmware to assign more cores to a Linux partition than it can really cope with. This patch improves it by using the device-tree to determine the number of threads supported by the processors in order to adjust the value passed to firmware. Signed-off-by: Joel Schopp <jschopp@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
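	The adjustment amounts to simple arithmetic, sketched below in Python. The `supported_cores` function is a hypothetical illustration: the patch reads the thread count from the device tree, whereas here it is passed in as a parameter, and rounding down is an assumption.

```python
def supported_cores(nr_cpus: int, threads_per_core: int) -> int:
    """Number of whole cores that nr_cpus logical CPUs can cover.

    Assumption: integer division, so a partially covered core is not
    counted as supported.
    """
    if threads_per_core < 1:
        raise ValueError("threads_per_core must be at least 1")
    return nr_cpus // threads_per_core


# With NR_CPUS=128, a 4-thread-per-core system should advertise fewer
# cores than the static NR_CPUS value would suggest.
print(supported_cores(128, 1))
print(supported_cores(128, 4))
```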
*	powerpc: Add static fields to ibm,client-architecture call	jschopp@austin.ibm.com	2010-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds 2 fields to the ibm_architecture_vec array. The first of these fields indicates the number of cores which Linux can boot. It does not account for SMT, so it may result in cpus assigned to Linux which cannot be booted. A second patch follows that dynamically updates this for SMT. The second field just indicates that our OS is Linux, and not another OS. The system may or may not use this hint to performance tune settings for Linux. Signed-off-by: Joel Schopp <jschopp@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc: Fix typo s/leve/level/ in TLB code	Thadeu Lima de Souza Cascardo	2010-02-03
\| \| \| \| \|	Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@holoscopio.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	lmb: Add lmb_free()	Michael Ellerman	2010-02-03
\| \| \| \| \| \| \| \| \| \| \|	We can free memory allocated with lmb_alloc() by removing it from the list of reserved LMBs. Rework lmb_remove() to allow that possibility and add lmb_free() which exploits it. BenH: Removed some useless parenthesis Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
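	Freeing by removal from the reserved list can be sketched with a toy region list. This is a Python model under stated assumptions, not the LMB implementation: `remove_range` is a hypothetical helper, and regions are plain `(base, size)` tuples.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # (base, size)


def remove_range(regions: List[Region], base: int, size: int) -> List[Region]:
    """Return the region list with [base, base + size) carved out.

    Assumption: the range being freed lies entirely inside one existing
    region, as it would when freeing something previously reserved.
    """
    end = base + size
    result: List[Region] = []
    for rbase, rsize in regions:
        rend = rbase + rsize
        if base >= rbase and end <= rend:
            # Keep whatever remains on either side of the freed range.
            if base > rbase:
                result.append((rbase, base - rbase))
            if end < rend:
                result.append((end, rend - end))
        else:
            result.append((rbase, rsize))
    return result


reserved = [(0x1000, 0x3000), (0x8000, 0x1000)]
reserved = remove_range(reserved, 0x2000, 0x1000)
print([(hex(b), hex(s)) for b, s in reserved])
```

	Freeing from the middle of a region splits it in two, which is why a removal routine that only handles whole regions would need reworking before it could back a free operation.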
*	powerpc: Increase NR_IRQS Kconfig maximum to 32768	Anton Blanchard	2010-02-03
\|	With dynamic irq descriptors the overhead of a large NR_IRQS is much lower than it used to be. With more MSI-X capable adapters and drivers exploiting multiple vectors we may as well allow the user to increase it beyond the current maximum of 512.

	32768 seems large enough that we'd never have to bump it again (although I bet my prediction is horribly wrong). It boot tests OK and the vmlinux footprint increase is only around 500kB due to:

	struct irq_map_entry irq_map[NR_IRQS];

	We format /proc/interrupts correctly with the previous changes:

	           CPU0   CPU1   CPU2   CPU3   CPU4   CPU5
	  286:        0      0      0      0      0      0
	  516:        0      0      0      0      0      0
	16689:     1833      0      0      0      0      0
	17157:        0      0      0      0      0      0
	17158:      319      0      0      0      0      0
	25092:        0      0      0      0      0      0

	Signed-off-by: Anton Blanchard <anton@samba.org>
	Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	macintosh/hwmon/ams: Fix device removal sequence	Jean Delvare	2010-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Some code that is in ams_exit() (the module exit code) should instead be called when the device (not module) is removed. It probably doesn't make much of a difference in the PMU case, but in the I2C case it does matter. I make no guarantee that my fix isn't racy, I'm not familiar enough with the ams driver code to tell for sure. Signed-off-by: Jean Delvare <khali@linux-fr.org> Tested-by: Christian Kujau <lists@nerdbynature.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Stelian Pop <stelian@popies.net> Cc: Michael Hanselmann <linux-kernel@hansmi.ch> Cc: stable@kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	macintosh/therm_adt746x: Fix sysfs attributes lifetime	Jean Delvare	2010-02-03
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Looking at drivers/macintosh/therm_adt746x.c, the sysfs files are created in thermostat_init() and removed in thermostat_exit(), which are the driver's init and exit functions. These files are backed-up by a per-device structure, so it looks like the wrong thing to do: the sysfs files have a lifetime longer than the data structure that is backing it up. I think that sysfs files creation should be moved to the end of probe_thermostat() and sysfs files removal should be moved to the beginning of remove_thermostat(). Signed-off-by: Jean Delvare <khali@linux-fr.org> Tested-by: Christian Kujau <lists@nerdbynature.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Colin Leroy <colin@colino.net> Cc: stable@kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	hvc_console: Remove __devinit annotation from hvc_alloc	Amit Shah	2010-02-03
\| \| \| \| \| \| \| \| \| \| \| \|	Virtio consoles can be hotplugged, so hvc_alloc gets called from multiple sites: from the initial probe() routine as well as later on from workqueue handlers which aren't __devinit code. So, drop the __devinit annotation for hvc_alloc. Signed-off-by: Amit Shah <amit.shah@redhat.com> Cc: linuxppc-dev@ozlabs.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	hvc_console: Make the ops pointer const.	Rusty Russell	2010-02-03
\|	This is nicer for modern R/O protection. And no one needs it non-const, so constify the callers as well. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Amit Shah <amit.shah@redhat.com> To: Christian Borntraeger <borntraeger@de.ibm.com> Cc: linuxppc-dev@ozlabs.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc/85xx: Fix SMP when "cpu-release-addr" is in lowmem	Peter Tyser	2010-02-03
\|	Recent U-Boot commit 5ccd29c3679b3669b0bde5c501c1aa0f325a7acb caused the "cpu-release-addr" device tree property to contain the physical RAM location that secondary cores were spinning at. Previously, the "cpu-release-addr" property contained a value referencing the boot page translation address range of 0xfffffxxx, which then indirectly accessed RAM.

	The "cpu-release-addr" is currently ioremapped and the secondary cores kicked. However, due to the recent change in "cpu-release-addr", it sometimes points to a memory location in low memory that cannot be ioremapped. For example on a P2020-based board with 512MB of RAM the following error occurs on bootup:

	<...>
	mpic: requesting IPIs ...
	__ioremap(): phys addr 0x1ffff000 is RAM lr c05df9a0
	Unable to handle kernel paging request for data at address 0x00000014
	Faulting instruction address: 0xc05df9b0
	Oops: Kernel access of bad area, sig: 11 [#1]
	SMP NR_CPUS=2 P2020 RDB
	Modules linked in:
	<... eventual kernel panic>

	Adding logic to conditionally ioremap or access memory directly resolves the issue.

	Signed-off-by: Peter Tyser <ptyser@xes-inc.com>
	Signed-off-by: Nate Case <ncase@xes-inc.com>
	Reported-by: Dipen Dudhat <B09055@freescale.com>
	Tested-by: Dipen Dudhat <B09055@freescale.com>
	Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
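	The "conditionally ioremap or access memory directly" decision can be modelled in a few lines. This is a hypothetical Python sketch, not the patch itself: `map_release_addr` and `LOWMEM_END` are invented for illustration, and the exact boundary test the kernel uses is an assumption.

```python
def map_release_addr(phys_addr: int, lowmem_end: int) -> str:
    """Pick how to reach the address the secondary cores spin on.

    Hypothetical model: addresses below lowmem_end are treated as already
    covered by the kernel's linear mapping; anything else needs ioremap.
    """
    if phys_addr < lowmem_end:
        return "direct"   # reuse the existing linear mapping
    return "ioremap"      # create a dedicated mapping


LOWMEM_END = 512 * 1024 * 1024  # 512MB of RAM, all assumed to be lowmem

# The address from the error log falls inside RAM ...
print(map_release_addr(0x1FFFF000, LOWMEM_END))
# ... while an address in the old boot page translation range does not.
print(map_release_addr(0xFFFFF000, LOWMEM_END))
```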
*	powerpc: Mark some variables in the page fault path __read_mostly	Anton Blanchard	2010-02-03
\| \| \| \| \| \| \| \| \|	Using perf to trace L1 dcache misses and dumping data addresses I found a few variables taking a lot of misses. Since they are almost never written, they should go into the __read_mostly section. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc: Replace per_cpu(, smp_processor_id()) with __get_cpu_var()	Anton Blanchard	2010-02-03
\| \| \| \| \| \| \| \|	The cputime code has a few places that do per_cpu(, smp_processor_id()). Replace them with __get_cpu_var(). Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc: Simplify param.h by including <asm-generic/param.h>	Robert P. J. Day	2010-02-03
\| \| \| \| \|	Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	powerpc/viodasd: Remove VIOD_KERN_<level> macros for printks	Joe Perches	2010-02-03
\|	Use #define pr_fmt(fmt) "viod: " fmt
	Remove #define VIOD_KERN_WARNING and VIOD_KERN_INFO
	Convert printk(VIOD_KERN_<level> to pr_<level>
	Coalesce long format strings

	Signed-off-by: Joe Perches <joe@perches.com>
	Acked-by: Stephen Rothwell <sfr@canb.auug.org.au>

	 drivers/block/viodasd.c \| 86 +++++++++++++++++++---------------------------
	 1 files changed, 36 insertions(+), 50 deletions(-)

	Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
*	Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-block	Linus Torvalds	2010-02-02
\|\	* 'for-linus' of git://git.kernel.dk/linux-2.6-block:
	  cfq-iosched: Do not idle on async queues
	  blk-cgroup: Fix potential deadlock in blk-cgroup
	  block: fix bugs in bio-integrity mempool usage
	  block: fix bio_add_page for non trivial merge_bvec_fn case
	  drbd: null dereference bug
	  drbd: fix max_segment_size initialization
\| *	cfq-iosched: Do not idle on async queues	Vivek Goyal	2010-02-02
\| \|	A few weeks back, Shaohua Li posted a similar patch. I am reposting it with more test results.

	This patch does two things.

	- Do not idle on async queues.
	- It also changes the write queue depth CFQ drives (cfq_may_dispatch()). Currently, we seem to be driving a queue depth of 1 always for WRITES. This is true even if there is only one write queue in the system, and all the logic of infinite queue depth in case of a single busy queue, as well as slowly increasing queue depth based on the last delayed sync request, does not seem to be kicking in at all.

	This patch will allow deeper WRITE queue depths (subject to the other WRITE queue depth constraints like cfq_quantum and last delayed sync request).

	Shaohua Li had reported getting more out of his SSD. For me, I have got one Lun exported from an HP EVA and when pure buffered writes are on, I can get more out of the system. Following are test results of pure buffered writes (with end_fsync=1) with vanilla and patched kernel. These results are the average of 3 sets of runs with increasing number of threads.

	AVERAGE[bufwfs][vanilla]
	-------
	job     Set  NR  ReadBW(KB/s)  MaxClat(us)  WriteBW(KB/s)  MaxClat(us)
	---     ---  --  ------------  -----------  -------------  -----------
	bufwfs  3    1   0             0            95349          474141
	bufwfs  3    2   0             0            100282         806926
	bufwfs  3    4   0             0            109989         2.7301e+06
	bufwfs  3    8   0             0            116642         3762231
	bufwfs  3    16  0             0            118230         6902970

	AVERAGE[bufwfs] [patched kernel]
	-------
	bufwfs  3    1   0             0            270722         404352
	bufwfs  3    2   0             0            206770         1.06552e+06
	bufwfs  3    4   0             0            195277         1.62283e+06
	bufwfs  3    8   0             0            260960         2.62979e+06
	bufwfs  3    16  0             0            299260         1.70731e+06

	I also ran buffered writes along with some sequential reads and some buffered reads going on in the system on a SATA disk, because the potential risk could be that we should not be driving queue depth higher in the presence of sync IO, in order to keep the max clat low.

	With some random and sequential reads going on in the system on one SATA disk I did not see any significant increase in max clat. So it looks like other WRITE queue depth control logic is doing its job. Here are the results.

	AVERAGE[brr, bsr, bufw together] [vanilla]
	-------
	job     Set  NR  ReadBW(KB/s)  MaxClat(us)  WriteBW(KB/s)  MaxClat(us)
	---     ---  --  ------------  -----------  -------------  -----------
	brr     3    1   850           546345       0              0
	bsr     3    1   14650         729543       0              0
	bufw    3    1   0             0            23908          8274517
	brr     3    2   981.333       579395       0              0
	bsr     3    2   14149.7       1175689      0              0
	bufw    3    2   0             0            21921          1.28108e+07
	brr     3    4   898.333       1.75527e+06  0              0
	bsr     3    4   12230.7       1.40072e+06  0              0
	bufw    3    4   0             0            19722.3        2.4901e+07
	brr     3    8   900           3160594      0              0
	bsr     3    8   9282.33       1.91314e+06  0              0
	bufw    3    8   0             0            18789.3        23890622

	AVERAGE[brr, bsr, bufw mixed] [patched kernel]
	-------
	job     Set  NR  ReadBW(KB/s)  MaxClat(us)  WriteBW(KB/s)  MaxClat(us)
	---     ---  --  ------------  -----------  -------------  -----------
	brr     3    1   837           417973       0              0
	bsr     3    1   14357.7       591275       0              0
	bufw    3    1   0             0            24869.7        8910662
	brr     3    2   1038.33       543434       0              0
	bsr     3    2   13351.3       1205858      0              0
	bufw    3    2   0             0            18626.3        13280370
	brr     3    4   913           1.86861e+06  0              0
	bsr     3    4   12652.3       1430974      0              0
	bufw    3    4   0             0            15343.3        2.81305e+07
	brr     3    8   890           2.92695e+06  0              0
	bsr     3    8   9635.33       1.90244e+06  0              0
	bufw    3    8   0             0            17200.3        24424392

	So it looks like it might make sense to include this patch.

	Thanks
	Vivek

	Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
	Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
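	To put the pure buffered-write numbers in perspective, the write bandwidth ratio between the patched and vanilla kernels can be computed directly from the reported figures. The snippet below is only a convenience calculation in Python; the values are copied from the bufwfs results in the commit message, and the variable names are illustrative.

```python
# WriteBW (KB/s) for the pure buffered-write (bufwfs) runs, keyed by the
# number of threads, copied from the results reported in the message.
vanilla = {1: 95349, 2: 100282, 4: 109989, 8: 116642, 16: 118230}
patched = {1: 270722, 2: 206770, 4: 195277, 8: 260960, 16: 299260}

# Ratio of patched to vanilla write bandwidth for each thread count.
speedup = {nr: patched[nr] / vanilla[nr] for nr in vanilla}

for nr, ratio in speedup.items():
    print(f"{nr:2d} threads: {ratio:.2f}x")
```

	Every thread count shows a ratio above 1, which matches the author's remark about getting more out of the system with pure buffered writes.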
\| *	blk-cgroup: Fix potential deadlock in blk-cgroup	Gui Jianfeng	2010-02-01
\| \|	I triggered the following lockdep warning.

	=======================================================
	[ INFO: possible circular locking dependency detected ]
	2.6.33-rc2 #1
	-------------------------------------------------------
	test_io_control/7357 is trying to acquire lock:
	 (blkio_list_lock){+.+...}, at: [<c053a990>] blkiocg_weight_write+0x82/0x9e

	but task is already holding lock:
	 (&(&blkcg->lock)->rlock){......}, at: [<c053a949>] blkiocg_weight_write+0x3b/0x9e

	which lock already depends on the new lock.

	the existing dependency chain (in reverse order) is:

	-> #2 (&(&blkcg->lock)->rlock){......}:
	       [<c04583b7>] validate_chain+0x8bc/0xb9c
	       [<c0458dba>] __lock_acquire+0x723/0x789
	       [<c0458eb0>] lock_acquire+0x90/0xa7
	       [<c0692b0a>] _raw_spin_lock_irqsave+0x27/0x5a
	       [<c053a4e1>] blkiocg_add_blkio_group+0x1a/0x6d
	       [<c053cac7>] cfq_get_queue+0x225/0x3de
	       [<c053eec2>] cfq_set_request+0x217/0x42d
	       [<c052c8a6>] elv_set_request+0x17/0x26
	       [<c0532a0f>] get_request+0x203/0x2c5
	       [<c0532ae9>] get_request_wait+0x18/0x10e
	       [<c0533470>] __make_request+0x2ba/0x375
	       [<c0531985>] generic_make_request+0x28d/0x30f
	       [<c0532da7>] submit_bio+0x8a/0x8f
	       [<c04d827a>] submit_bh+0xf0/0x10f
	       [<c04d91d2>] ll_rw_block+0xc0/0xf9
	       [<f86e9705>] ext3_find_entry+0x319/0x544 [ext3]
	       [<f86eae58>] ext3_lookup+0x2c/0xb9 [ext3]
	       [<c04c3e1b>] do_lookup+0xd3/0x172
	       [<c04c56c8>] link_path_walk+0x5fb/0x95c
	       [<c04c5a65>] path_walk+0x3c/0x81
	       [<c04c5b63>] do_path_lookup+0x21/0x8a
	       [<c04c66cc>] do_filp_open+0xf0/0x978
	       [<c04c0c7e>] open_exec+0x1b/0xb7
	       [<c04c1436>] do_execve+0xbb/0x266
	       [<c04081a9>] sys_execve+0x24/0x4a
	       [<c04028a2>] ptregs_execve+0x12/0x18

	-> #1 (&(&q->__queue_lock)->rlock){..-.-.}:
	       [<c04583b7>] validate_chain+0x8bc/0xb9c
	       [<c0458dba>] __lock_acquire+0x723/0x789
	       [<c0458eb0>] lock_acquire+0x90/0xa7
	       [<c0692b0a>] _raw_spin_lock_irqsave+0x27/0x5a
	       [<c053dd2a>] cfq_unlink_blkio_group+0x17/0x41
	       [<c053a6eb>] blkiocg_destroy+0x72/0xc7
	       [<c0467df0>] cgroup_diput+0x4a/0xb2
	       [<c04ca473>] dentry_iput+0x93/0xb7
	       [<c04ca4b3>] d_kill+0x1c/0x36
	       [<c04cb5c5>] dput+0xf5/0xfe
	       [<c04c6084>] do_rmdir+0x95/0xbe
	       [<c04c60ec>] sys_rmdir+0x10/0x12
	       [<c04027cc>] sysenter_do_call+0x12/0x32

	-> #0 (blkio_list_lock){+.+...}:
	       [<c0458117>] validate_chain+0x61c/0xb9c
	       [<c0458dba>] __lock_acquire+0x723/0x789
	       [<c0458eb0>] lock_acquire+0x90/0xa7
	       [<c06929fd>] _raw_spin_lock+0x1e/0x4e
	       [<c053a990>] blkiocg_weight_write+0x82/0x9e
	       [<c0467f1e>] cgroup_file_write+0xc6/0x1c0
	       [<c04bd2f3>] vfs_write+0x8c/0x116
	       [<c04bd7c6>] sys_write+0x3b/0x60
	       [<c04027cc>] sysenter_do_call+0x12/0x32

	other info that might help us debug this:

	1 lock held by test_io_control/7357:
	 #0:  (&(&blkcg->lock)->rlock){......}, at: [<c053a949>] blkiocg_weight_write+0x3b/0x9e

	stack backtrace:
	Pid: 7357, comm: test_io_control Not tainted 2.6.33-rc2 #1
	Call Trace:
	 [<c045754f>] print_circular_bug+0x91/0x9d
	 [<c0458117>] validate_chain+0x61c/0xb9c
	 [<c0458dba>] __lock_acquire+0x723/0x789
	 [<c0458eb0>] lock_acquire+0x90/0xa7
	 [<c053a990>] ? blkiocg_weight_write+0x82/0x9e
	 [<c06929fd>] _raw_spin_lock+0x1e/0x4e
	 [<c053a990>] ? blkiocg_weight_write+0x82/0x9e
	 [<c053a990>] blkiocg_weight_write+0x82/0x9e
	 [<c0467f1e>] cgroup_file_write+0xc6/0x1c0
	 [<c0454df5>] ? trace_hardirqs_off+0xb/0xd
	 [<c044d93a>] ? cpu_clock+0x2e/0x44
	 [<c050e6ec>] ? security_file_permission+0xf/0x11
	 [<c04bcdda>] ? rw_verify_area+0x8a/0xad
	 [<c0467e58>] ? cgroup_file_write+0x0/0x1c0
	 [<c04bd2f3>] vfs_write+0x8c/0x116
	 [<c04bd7c6>] sys_write+0x3b/0x60
	 [<c04027cc>] sysenter_do_call+0x12/0x32

	To prevent deadlock, we should take locks in the following sequence:

	blkio_list_lock -> queue_lock -> blkcg_lock.

	The following patch should fix this bug.

	Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
	Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
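	The idea of a fixed acquisition order can be illustrated outside the kernel. The following Python sketch is an assumption-laden model rather than the actual fix: the three `threading.Lock` objects merely stand in for the kernel spinlocks, and `OrderedLocks` and `weight_write` are hypothetical helpers.

```python
import threading

# Stand-ins for the three kernel locks; the rank follows the order stated
# in the commit message: blkio_list_lock -> queue_lock -> blkcg_lock.
blkio_list_lock = threading.Lock()
queue_lock = threading.Lock()
blkcg_lock = threading.Lock()
LOCK_RANK = {id(blkio_list_lock): 0, id(queue_lock): 1, id(blkcg_lock): 2}


class OrderedLocks:
    """Context manager that always acquires its locks in rank order."""

    def __init__(self, *locks):
        self.locks = sorted(locks, key=lambda lock: LOCK_RANK[id(lock)])

    def __enter__(self):
        for lock in self.locks:
            lock.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        for lock in reversed(self.locks):
            lock.release()
        return False


def weight_write(results):
    # The caller names the locks in the "wrong" order; OrderedLocks
    # reorders them, so no thread can hold blkcg_lock while waiting for
    # blkio_list_lock.
    with OrderedLocks(blkcg_lock, blkio_list_lock) as held:
        results.append([LOCK_RANK[id(lock)] for lock in held.locks])


results = []
threads = [threading.Thread(target=weight_write, args=(results,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

	Because every path takes the locks in the same global order, the circular dependency that lockdep reported cannot form.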
\| *	block: fix bugs in bio-integrity mempool usage	Chuck Ebbert	2010-01-30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix two bugs in the bio integrity code: use_bip_pool() always returns 0 because it checks against the wrong limit, causing the mempool to be used only when regular allocation fails. When the mempool is used as a fallback we don't free the data properly. Signed-Off-By: Chuck Ebbert <cebbert@redhat.com> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| *	block: fix bio_add_page for non trivial merge_bvec_fn case	Dmitry Monakhov	2010-01-28
\| \|	We have to properly decrease bi_size in order for merge_bvec_fn to return the right result. Otherwise this results in false merge rejects for two absolutely valid bio_vecs. This may cause a significant performance penalty, for example when fs_block_size == 1k and the block device is raid0 with a small chunk_size = 8k. Then it is impossible to merge the 7th fs-block into a bio which already has 6 fs-blocks. Cc: <stable@kernel.org> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
\| *	Merge branch 'for-jens' of git://git.drbd.org/linux-2.6-drbd into for-linus	Jens Axboe	2010-01-25
\| \|\
\| \| *	drbd: null dereference bug	Dan Carpenter	2010-01-25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	epoch is always NULL here. Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
\| \| *	drbd: fix max_segment_size initialization	Lars Ellenberg	2010-01-22
\| \| \|	blk_queue_make_request() internally calls blk_set_default_limits(), so calling blk_queue_max_segment_size() before it is useless. Ergo: move the call to blk_queue_max_segment_size() down a few lines. Impact: If, after a fresh modprobe, you first connect a Diskless drbd, then attach, this could result in a DRBD Protocol Error at first. The next connection attempt would then succeed. Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
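	The ordering problem can be reproduced with a toy model. This Python sketch is illustrative only: the `Queue` class is hypothetical, its method names merely echo the block-layer calls named in the commit message, and the default value is invented.

```python
DEFAULT_MAX_SEGMENT_SIZE = 65536  # illustrative default, not taken from the source


class Queue:
    """Toy request queue; method names echo the block-layer calls above."""

    def __init__(self):
        self.max_segment_size = DEFAULT_MAX_SEGMENT_SIZE
        self.make_request_fn = None

    def set_default_limits(self):
        self.max_segment_size = DEFAULT_MAX_SEGMENT_SIZE

    def make_request(self, fn):
        self.make_request_fn = fn
        self.set_default_limits()  # silently discards earlier limit changes

    def set_max_segment_size(self, size):
        self.max_segment_size = size


def handler(bio):
    return bio


before = Queue()
before.set_max_segment_size(32768)  # set first ...
before.make_request(handler)        # ... then reset to the default

after = Queue()
after.make_request(handler)
after.set_max_segment_size(32768)   # set last, so the value sticks

print(before.max_segment_size, after.max_segment_size)
```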
* \| \|	mm: purge fragmented percpu vmap blocks	Nick Piggin	2010-02-02
\| \| \|	Improve handling of fragmented per-CPU vmaps. Previously, we didn't free up per-CPU maps until all their addresses had been used and freed. So fragmented blocks could fill up vmalloc space even if they actually had no active vmap regions within them. Add some logic to allow all CPUs to have these blocks purged in the case of failure to allocate a new vm area, and also put some logic to trim such blocks of a current CPU if we hit them in the allocation path (so as to avoid a large build up of them). Christoph reported some vmap allocation failures when using the per CPU vmap APIs in XFS, which cannot be reproduced after this patch and the previous bug fix. Cc: linux-mm@kvack.org Cc: stable@kernel.org Tested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Nick Piggin <npiggin@suse.de> -- Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \|	mm: percpu-vmap fix RCU list walking	Nick Piggin	2010-02-02
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	RCU list walking of the per-cpu vmap cache was broken. It did not use RCU primitives, and also the union of free_list and rcu_head is obviously wrong (because free_list is indeed the list we are RCU walking). While we are there, remove a couple of unused fields from an earlier iteration. These APIs aren't actually used anywhere, because of problems with the XFS conversion. Christoph has now verified that the problems are solved with these patches. Also it is an exported interface, so I think it will be good to be merged now (and Christoph wants to get the XFS changes into their local tree). Cc: stable@kernel.org Cc: linux-mm@kvack.org Tested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Nick Piggin <npiggin@suse.de> -- Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* \| \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6	Linus Torvalds	2010-02-02
\|\ \ \	* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
	  random: Remove unused inode variable
	  crypto: padlock-sha - Add import/export support
	  random: drop weird m_time/a_time manipulation
\| * \| \|	random: Remove unused inode variable	Herbert Xu	2010-02-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The previous changeset left behind an unused inode variable. This patch removes it. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \| \|	crypto: padlock-sha - Add import/export support	Herbert Xu	2010-02-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As the padlock driver for SHA uses a software fallback to perform partial hashing, it must implement custom import/export functions. Otherwise hmac which depends on import/export for prehashing will not work with padlock-sha. Reported-by: Wolfgang Walter <wolfgang.walter@stwm.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
\| * \| \|	random: drop weird m_time/a_time manipulation	Matt Mackall	2010-02-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	No other driver does anything remotely like this that I know of except for the tty drivers, and I can't see any reason for random/urandom to do it. In fact, it's a (trivial, harmless) timing information leak. And obviously, it generates power- and flash-cycle wasting I/O, especially if combined with something like hwrngd. Also, it breaks ubifs's expectations. Signed-off-by: Matt Mackall <mpm@selenic.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
* \| \| \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes	Linus Torvalds	2010-02-02
\|\ \ \ \	* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
	  GFS2: Use GFP_NOFS for alloc structure
	  GFS2: Fix previous patch
	  GFS2: Don't withdraw on partial rindex entries
	  GFS2: Fix refcnt leak on gfs2_follow_link() error path
\| * \| \| \|	GFS2: Use GFP_NOFS for alloc structure	Steven Whitehouse	2010-02-01
\| \| \| \| \|	This is called under a glock, so it's a good plan to use GFP_NOFS. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \| \| \|	GFS2: Fix previous patch	Steven Whitehouse	2010-02-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The do_div() call needs to remain. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \| \| \|	GFS2: Don't withdraw on partial rindex entries	Benjamin Marzinski	2010-02-01
\| \| \| \| \|	Since gfs2 writes the rindex file a block at a time, and releases the exclusive lock after each block, it is possible that another process will grab the lock in the middle of the write. Since rindex entries are not an even divisor of blocks, that other process may see partial entries. On grows, this is fine. The process can simply ignore the partial entries. Previously, the code withdrew when it saw partial entries. Now it simply ignores them. Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
\| * \| \| \|	GFS2: Fix refcnt leak on gfs2_follow_link() error path	OGAWA Hirofumi	2010-01-12
\| \| \| \| \|	If the ->follow_link handler returns an error, it should decrement the nd->path refcnt. This patch fixes it. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
* \| \| \| \|	Merge branch 'sh/for-2.6.33' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6	Linus Torvalds	2010-02-02
\|\ \ \ \ \	* 'sh/for-2.6.33' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6:
	  sh: Fix access to released memory in clk_debugfs_register_one()
	  sh: Fix access to released memory in dwarf_unwinder_cleanup()
	  usb: r8a66597-hdc disable interrupts fix
	  spi: spi_sh_msiof: Fixed data sampling on the correct edge
\| * \| \| \| \|	sh: Fix access to released memory in clk_debugfs_register_one()	Marek Skuczynski	2010-02-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Marek Skuczynski <mareksk7@gmail.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
\| * \| \| \| \|	sh: Fix access to released memory in dwarf_unwinder_cleanup()	Marek Skuczynski	2010-02-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Marek Skuczynski <mareksk7@gmail.com> Acked-by: Matt Fleming <matt@console-pimps.org> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
\| * \| \| \| \|	usb: r8a66597-hdc disable interrupts fix	Magnus Damm	2010-02-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch improves disable_controller() in the r8a66597-hdc driver to disable all interrupts and clear status flags. It also makes sure that disable_controller() is called during probe(). This fixes the relatively rare case of unexpected pending interrupts after kexec reboot. Signed-off-by: Magnus Damm <damm@opensource.se> Acked-by: Yoshihiro Shimoda <shimoda.yoshihiro@renesas.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
\| * \| \| \| \|	spi: spi_sh_msiof: Fixed data sampling on the correct edge	Markus Pietrek	2010-02-01
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The spi_sh_msiof.c driver presently misconfigures REDG and TEDG. TEDG==0 outputs data at the rising edge of the clock and REDG==0 samples data at the falling edge of the clock. Therefore for SPI, TEDG must be equal to REDG, otherwise the last byte received is not sampled in SPI mode 3. This brings the driver in line with the SH7723 HW Reference Manual settings documented in Figures 20.20 and 20.21 ("SPI Clock and data timing"). Signed-off-by: Markus Pietrek <Markus.Pietrek@emtrion.de> Acked-by: Magnus Damm <damm@opensource.se> Signed-off-by: Paul Mundt <lethal@linux-sh.org>
* \| \| \| \| \|	Merge branch 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus	Linus Torvalds	2010-02-02
\|\ \ \ \ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus: MIPS: 64-bit: Detect virtual memory size MIPS: AR7: Fix USB slave mem range typo MIPS: Alchemy: Fix dbdma ring destruction memory debugcheck.