aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* powerpc/85xx: P2020 DTS: re-organize dts filesPrabhakar Kushwaha2011-05-19
| | | | | | | | Creates P2020si.dtsi, containing information for P2020 SoC. Modifies dts files for P2020 based systems to use dtsi file. Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* powerpc/85xx: P1020 DTS : re-organize dts filesPrabhakar Kushwaha2011-05-19
| | | | | | | | | Creates P1020si.dtsi, containing information for the P1020 SoC. Modifies dts files for P1020 based systems to use dtsi file Signed-off-by: Prabhakar Kushwaha <prabhakar@freescale.com> Acked-by: Grant Likely <grant.likelY@secretlab.ca> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* powerpc: Adding bindings for flexcan controllerBhaskar Upadhaya2011-05-19
| | | | | | Signed-off-by: Bhaskar Upadhaya <bhaskar.upadhaya@freescale.com> Acked-By: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* powerpc/fsl: enable verbose bug outputScott Wood2011-05-19
| | | | | | | | | | This debug option has no overhead other than a slight increase in kernel size, and makes bug reports more useful. While some end users may prefer to save the space, as a default on a kernel config aimed primarily at development on reference boards, it should be enabled. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* powerpc/e5500: add networking to defconfigScott Wood2011-05-19
| | | | | | | | | | | | | | Even though support for the p5020's on-chip ethernet is not yet upstream, it is not appropriate to disable all networking support (including loopback, unix domain sockets, external ethernet devices, etc) in the defconfig. The networking settings are taken from mpc85xx_smp_defconfig, minus the drivers for ethernet devices not found on any current e5500 chip. The other changes are the result of running "make savedefconfig". Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* powerpc/mpic: add the mpic global timer supportScott Wood2011-05-19
| | | | | | | | | | Add support for MPIC timers as requestable interrupt sources. Based on http://patchwork.ozlabs.org/patch/20941/ by Dave Liu. Signed-off-by: Dave Liu <daveliu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* powerpc/mpic: parse 4-cell intspec types other than zeroScott Wood2011-05-19
| | | | | Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* powerpc/p1022ds: fix broken mpic timer nodeScott Wood2011-05-19
| | | | | | | | | | | | There is no hardware interrupt 0xf7. But now we can express the timer interrupt using 4-cell interrupts. This requires converting all of the other interrupt specifiers in the tree as well. Also add the second timer group, and fix the reg property to only describe the timer registers. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* powerpc: Add fsl mpic timer bindingScott Wood2011-05-19
| | | | | | | | | | | | | | | | | | Update the existing example in the general mpic binding to have a separate TCRx region. Currently the example doesn't describe TCRx at all. The one upstream device tree with an mpic timer node (p1022ds) uses one large reg region to describe both, even though there are other unrelated registers in between. That device tree also contains a bogus interrupt specifier, and there's no upstream software that uses this yet, so changing this shouldn't be a problem. Add a full binding for the MPIC timer node, not just an example of 4-cell interrupts in the MPIC binding. Add fsl,available-ranges, similar to msi-available-ranges. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* powerpc/86xx: don't pretend that we support 8-bit pixels on the MPC8610 HPCDTimur Tabi2011-05-19
| | | | | | | | | | | | | | If the video mode is set to 16-, 24-, or 32-bit pixels, then the pixel data contains actual levels of red, blue, and green. However, if the video mode is set to 8-bit pixels, then the 8-bit value represents an index into color table. This is called "palette mode" on the Freescale DIU video controller. The DIU driver does not currently support palette mode, but the MPC8610 HPCD board file returned a non-zero (although incorrect) pixel format value for 8-bit mode. Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* powerpc/mpc8610_hpcd: Do not use "/" in interrupt namesGeert Uytterhoeven2011-05-19
| | | | | | | | | It may trigger a warning in fs/proc/generic.c:__xlate_proc_name() when trying to add an entry for the interrupt handler to sysfs. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* powerpc/e5500: set non-base IVORsScott Wood2011-05-19
| | | | | | | | | Without this, we attempt to use doorbells for IPIs, and end up branching to some bad address. Plus, even for the exceptions we don't implement, it's good to handle it and get a message out. Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* powerpc/fsl-booke64: Add support for Debug Level exception handlerKumar Gala2011-05-19
| | | | Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
* Merge remote branch 'benh/merge' into benh-nextKumar Gala2011-05-19
|\
| * powerpc/4xx: Fix regression in SMP on 476kerstin jonsson2011-05-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | commit c56e58537d504706954a06570b4034c04e5b7500 breaks SMP support in PPC_47x chip. secondary_ti must be set to current thread info before callin kick_cpu or else start_secondary_47x will jump into void when trying to return to c-code. In the current setup secondary_ti is initialized before the CPU idle task is started and only the boot core will start. I am not sure this is the correct solution, but it makes SMP possible in my chip. Note! The HOTPLUG support probably need some fixing to, There is no trampoline code available in head_44x.S - start_secondary_resume? Signed-off-by: Kerstin Jonsson <kerstin.jonsson@ericsson.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/kexec: Fix build failure on 32-bit SMPBen Hutchings2011-05-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit b987812b3fcaf70fdf0037589e5d2f5f2453e6ce left crash_kexec_wait_realmode() undefined for UP. Commit 7c7a81b53e581d727d069cc45df5510516faac31 defined it for UP but left it undefined for 32-bit SMP. Seems like people are getting confused by nested #ifdef's, so move the definitions of crash_kexec_wait_realmode() after the #ifdef CONFIG_SMP section. Compile-tested with 32-bit UP, 32-bit SMP and 64-bit SMP configurations. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Tested-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * powerpc/smp: Make start_secondary_resume available to all CPU variantsBenjamin Herrenschmidt2011-05-18
| | | | | | | | | | | | This should fix SMP & Hotplug builds on FSL BookE and 476 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
| * Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6Linus Torvalds2011-05-18
| |\ | | | | | | | | | | | | | | | * 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6: drivercore: revert addition of of_match to struct device of: fix race when matching drivers
| | * drivercore: revert addition of of_match to struct deviceGrant Likely2011-05-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit b826291c, "drivercore/dt: add a match table pointer to struct device" added an of_match pointer to struct device to cache the of_match_table entry discovered at driver match time. This was unsafe because matching is not an atomic operation with probing a driver. If two or more drivers are attempted to be matched to a driver at the same time, then the cached matching entry pointer could get overwritten. This patch reverts the of_match cache pointer and reworks all users to call of_match_device() directly instead. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
| | * of: fix race when matching driversMilton Miller2011-05-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If two drivers are probing devices at the same time, both will write their match table result to the dev->of_match cache at the same time. Only write the result if the device matches. In a thread titled "SBus devices sometimes detected, sometimes not", Meelis reported his SBus hme was not detected about 50% of the time. From the debug suggested by Grant it was obvious another driver matched some devices between the call to match the hme and the hme discovery failling. Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Milton Miller <miltonm@bga.com> [grant.likely: modified to only call of_match_device() once] Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
| * | Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linusLinus Torvalds2011-05-18
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: MIPS: Kludge IP27 build for 2.6.39. MIPS: AR7: Fix GPIO register size for Titan variant. MIPS: Fix duplicate invocation of notify_die. MIPS: RB532: Fix iomap resource size miscalculation.
| | * | MIPS: Kludge IP27 build for 2.6.39.Ralf Baechle2011-05-18
| | | | | | | | | | | | | | | | Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: AR7: Fix GPIO register size for Titan variant.Florian Fainelli2011-05-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The 'size' variable contains the correct register size for both AR7 and Titan, but we never used it to ioremap the correct register size. This problem only shows up on Titan. [ralf@linux-mips.org: Fixed the fix. The original patch as in patchwork recognizes the problem correctly then fails to fix it ...] Reported-by: Alexander Clouter <alex@digriz.org.uk> Signed-off-by: Florian Fainelli <florian@openwrt.org> Patchwork: https://patchwork.linux-mips.org/patch/2380/ Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| | * | MIPS: Fix duplicate invocation of notify_die.Ralf Baechle2011-05-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Initial patch by Yury Polyanskiy <ypolyans@princeton.edu>. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Patchwork: https://patchwork.linux-mips.org/patch/2373/
| | * | MIPS: RB532: Fix iomap resource size miscalculation.Ralf Baechle2011-05-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the MIPS portion of Joe Perches <joe@perches.com>'s https://patchwork.linux-mips.org/patch/2172/ which seems to have been lost in time and space. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
| * | | Merge branch 'for-linus' of git://git.kernel.dk/linux-2.6-blockLinus Torvalds2011-05-18
| |\ \ \ | | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: don't delay blk_run_queue_async scsi: remove performance regression due to async queue run blk-throttle: Use task_subsys_state() to determine a task's blkio_cgroup block: rescan partitions on invalidated devices on -ENOMEDIA too cdrom: always check_disk_change() on open block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe drivers
| | * | block: don't delay blk_run_queue_asyncShaohua Li2011-05-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Let's check a scenario: 1. blk_delay_queue(q, SCSI_QUEUE_DELAY); 2. blk_run_queue_async(); the second one will became a noop, because q->delay_work already has WORK_STRUCT_PENDING_BIT set, so the delayed work will still run after SCSI_QUEUE_DELAY. But blk_run_queue_async actually hopes the delayed work runs immediately. Fix this by doing a cancel on potentially pending delayed work before queuing an immediate run of the workqueue. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * | scsi: remove performance regression due to async queue runJens Axboe2011-05-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit c21e6beb removed our queue request_fn re-enter protection, and defaulted to always running the queues from kblockd to be safe. This was a known potential slow down, but should be safe. Unfortunately this is causing big performance regressions for some, so we need to improve this logic. Looking into the details of the re-enter, the real issue is on requeue of requests. Requeue of requests upon seeing a BUSY condition from the device ends up re-running the queue, causing traces like this: scsi_request_fn() scsi_dispatch_cmd() scsi_queue_insert() __scsi_queue_insert() scsi_run_queue() scsi_request_fn() ... potentially causing the issue we want to avoid. So special case the requeue re-run of the queue, but improve it to offload the entire run of local queue and starved queue from a single workqueue callback. This is a lot better than potentially kicking off a workqueue run for each device seen. This also fixes the issue of the local device going into recursion, since the above mentioned commit never moved that queue run out of line. Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * | blk-throttle: Use task_subsys_state() to determine a task's blkio_cgroupVivek Goyal2011-05-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currentlly we first map the task to cgroup and then cgroup to blkio_cgroup. There is a more direct way to get to blkio_cgroup from task using task_subsys_state(). Use that. The real reason for the fix is that it also avoids a race in generic cgroup code. During remount/umount rebind_subsystems() is called and it can do following with and rcu protection. cgrp->subsys[i] = NULL; That means if somebody got hold of cgroup under rcu and then it tried to do cgroup->subsys[] to get to blkio_cgroup, it would get NULL which is wrong. I was running into this race condition with ltp running on a upstream derived kernel and that lead to crash. So ideally we should also fix cgroup generic code to wait for rcu grace period before setting pointer to NULL. Li Zefan is not very keen on introducing synchronize_wait() as he thinks it will slow down moun/remount/umount operations. So for the time being atleast fix the kernel crash by taking a more direct route to blkio_cgroup. One tester had reported a crash while running LTP on a derived kernel and with this fix crash is no more seen while the test has been running for over 6 days. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * | block: rescan partitions on invalidated devices on -ENOMEDIA tooTejun Heo2011-04-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __blkdev_get() doesn't rescan partitions if disk->fops->open() fails, which leads to ghost partition devices lingering after medimum removal is known to both the kernel and userland. The behavior also creates a subtle inconsistency where O_NONBLOCK open, which doesn't fail even if there's no medium, clears the ghots partitions, which is exploited to work around the problem from userland. Fix it by updating __blkdev_get() to issue partition rescan after -ENOMEDIA too. This was reported in the following bz. https://bugzilla.kernel.org/show_bug.cgi?id=13029 Stable: 2.6.38 Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: David Zeuthen <zeuthen@gmail.com> Reported-by: Martin Pitt <martin.pitt@ubuntu.com> Reported-by: Kay Sievers <kay.sievers@vrfy.org> Tested-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * | cdrom: always check_disk_change() on openTejun Heo2011-04-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | cdrom_open() called check_disk_change() after the rest of open path succeeded which leads to the following bizarre behavior. * After media change, if the device opened without O_NONBLOCK, open_for_data() naturally fails with -ENOMEDIA and check_disk_change() is never called. The media is known to be gone and the open failure makes it obvious to the userland but device invalidation never happens. * But if the device is opened with O_NONBLOCK, all the checks are bypassed and cdrom_open() doesn't notice that the media is not there and check_disk_change() is called and invalidation happens. There's nothing to be gained by avoiding calling check_disk_change() on open failure. Common cases end up calling check_disk_change() anyway. All we get is inconsistent behavior. Fix it by moving check_disk_change() invocation to the top of cdrom_open() so that it always gets called regardless of how the rest of open proceeds. Stable: 2.6.38 Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Amit Shah <amit.shah@redhat.com> Tested-by: Amit Shah <amit.shah@redhat.com> Cc: stable@kernel.org Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| | * | block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe driversTejun Heo2011-04-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In-kernel disk event polling doesn't matter for legacy/fringe drivers and may lead to infinite event loop if ->check_events() implementation generates events on level condition instead of edge. Now that block layer supports suppressing exporting unlisted events, simply leaving disk->events cleared allows these drivers to keep the internal revalidation behavior intact while avoiding weird interactions with userland event handler. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
| * | | Merge branch 'v4l_for_linus' of ↵Linus Torvalds2011-05-18
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6 * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: [media] V4L: soc-camera: regression fix: calculate .sizeimage in soc_camera.c [media] v4l2-subdev: fix broken subdev control enumeration [media] Fix cx88 remote control input [media] v4l: Release module if subdev registration fails
| | * | | [media] V4L: soc-camera: regression fix: calculate .sizeimage in soc_camera.cSergio Aguirre2011-05-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A recent patch has given individual soc-camera host drivers a possibility to calculate .sizeimage and .bytesperline pixel format fields internally, however, some drivers relied on the core calculating these values for them, following a default algorithm. This patch restores the default calculation for such drivers. Based on initial patch by Guennadi Liakhovetski, found here: http://www.spinics.net/lists/linux-media/msg31282.html Except that this covers try_fmt aswell. Signed-off-by: Sergio Aguirre <saaguirre@ti.com> Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| | * | | [media] v4l2-subdev: fix broken subdev control enumerationHans Verkuil2011-05-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The v4l2_subdev_* functions are meant for older V4L2 drivers that do not use the control framework yet. These functions should not be used by subdev_do_ioctl. Most of those backwards compatibility functions are just stubs, but commit 87a0c94ce616b231f3c0bd09d7dbd39d43b0557a actually changed the behavior of v4l2_subdev_queryctrl, so calling that one from subdev_do_ioctl broke the control enumeration in subdev nodes. The fix is simply not to use those compatibility functions in v4l2-subdev.c. Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| | * | | [media] Fix cx88 remote control inputLawrence Rust2011-05-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the IR interrupt handler of cx88-input.c there's a 32-bit multiply overflow which causes IR pulse durations to be incorrectly calculated. This is a regression caused by commit 2997137be8eba. Cc: stable@kernel.org Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| | * | | [media] v4l: Release module if subdev registration failsLaurent Pinchart2011-05-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If v4l2_device_register_subdev() fails, the reference to the subdev module taken by the function isn't released. Fix this. Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: stable@kernel.org Acked-by: Hans Verkuil <hverkuil@xs4all.nl> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * | | | Merge branch 'x86-fixes-for-linus' of ↵Linus Torvalds2011-05-18
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, AMD: Fix ARAT feature setting again Revert "x86, AMD: Fix APIC timer erratum 400 affecting K8 Rev.A-E processors" x86, apic: Fix spurious error interrupts triggering on all non-boot APs x86, mce, AMD: Fix leaving freed data in a list x86: Fix UV BAU for non-consecutive nasids x86, UV: Fix NMI handler for UV platforms
| | * | | | x86, AMD: Fix ARAT feature setting againBorislav Petkov2011-05-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Trying to enable the local APIC timer on early K8 revisions uncovers a number of other issues with it, in conjunction with the C1E enter path on AMD. Fixing those causes much more churn and troubles than the benefit of using that timer brings so don't enable it on K8 at all, falling back to the original functionality the kernel had wrt to that. Reported-and-bisected-by: Nick Bowler <nbowler@elliptictech.com> Cc: Boris Ostrovsky <Boris.Ostrovsky@amd.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Cc: Hans Rosenfeld <hans.rosenfeld@amd.com> Cc: Nick Bowler <nbowler@elliptictech.com> Cc: Joerg-Volker-Peetz <jvpeetz@web.de> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Link: http://lkml.kernel.org/r/1305636919-31165-3-git-send-email-bp@amd64.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | Revert "x86, AMD: Fix APIC timer erratum 400 affecting K8 Rev.A-E processors"Borislav Petkov2011-05-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit e20a2d205c05cef6b5783df339a7d54adeb50962, as it crashes certain boxes with specific AMD CPU models. Moving the lower endpoint of the Erratum 400 check to accomodate earlier K8 revisions (A-E) opens a can of worms which is simply not worth to fix properly by tweaking the errata checking framework: * missing IntPenging MSR on revisions < CG cause #GP: http://marc.info/?l=linux-kernel&m=130541471818831 * makes earlier revisions use the LAPIC timer instead of the C1E idle routine which switches to HPET, thus not waking up in deeper C-states: http://lkml.org/lkml/2011/4/24/20 Therefore, leave the original boundary starting with K8-revF. Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, apic: Fix spurious error interrupts triggering on all non-boot APsYouquan Song2011-05-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a bug reported by a customer, who found that many unreasonable error interrupts reported on all non-boot CPUs (APs) during the system boot stage. According to Chapter 10 of Intel Software Developer Manual Volume 3A, Local APIC may signal an illegal vector error when an LVT entry is set as an illegal vector value (0~15) under FIXED delivery mode (bits 8-11 is 0), regardless of whether the mask bit is set or an interrupt actually happen. These errors are seen as error interrupts. The initial value of thermal LVT entries on all APs always reads 0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI sequence to them and LVT registers are reset to 0s except for the mask bits which are set to 1s when APs receive INIT IPI. When the BIOS takes over the thermal throttling interrupt, the LVT thermal deliver mode should be SMI and it is required from the kernel to keep AP's LVT thermal monitoring register programmed as such as well. This issue happens when BIOS does not take over thermal throttling interrupt, AP's LVT thermal monitor register will be restored to 0x10000 which means vector 0 and fixed deliver mode, so all APs will signal illegal vector error interrupts. This patch check if interrupt delivery mode is not fixed mode before restoring AP's LVT thermal monitor register. Signed-off-by: Youquan Song <youquan.song@intel.com> Acked-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Yong Wang <yong.y.wang@intel.com> Cc: hpa@linux.intel.com Cc: joe@perches.com Cc: jbaron@redhat.com Cc: trenn@suse.de Cc: kent.liu@intel.com Cc: chaohong.guo@intel.com Cc: <stable@kernel.org> # As far back as possible Link: http://lkml.kernel.org/r/1303402963-17738-1-git-send-email-youquan.song@intel.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, mce, AMD: Fix leaving freed data in a listJulia Lawall2011-05-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | b may be added to a list, but is not removed before being freed in the case of an error. This is done in the corresponding deallocation function, so the code here has been changed to follow that. The sematic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression E,E1,E2; identifier l; @@ *list_add(&E->l,E1); ... when != E1 when != list_del(&E->l) when != list_del_init(&E->l) when != E = E2 *kfree(E);// </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Cc: Borislav Petkov <borislav.petkov@amd.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/1305294731-12127-1-git-send-email-julia@diku.dk Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86: Fix UV BAU for non-consecutive nasidsCliff Wickman2011-05-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a fix for the SGI Altix-UV Broadcast Assist Unit code, which is used for TLB flushing. Certain hardware configurations (that customers are ordering) cause nasids (numa address space id's) to be non-consecutive. Specifically, once you have more than 4 blades in a IRU (Individual Rack Unit - or 1/2 rack) but less than the maximum of 16, the nasid numbering becomes non-consecutive. This currently results in a 'catastrophic error' (CATERR) detected by the firmware during OS boot. The BAU is generating an 'INTD' request that is targeting a non-existent nasid value. Such configurations may also occur when a blade is configured off because of hardware errors. (There is one UV hub per blade.) This patch is required to support such configurations. The problem with the tlb_uv.c code is that is using the consecutive hub numbers as indices to the BAU distribution bit map. These are simply the ordinal position of the hub or blade within its partition. It should be using physical node numbers (pnodes), which correspond to the physical nasid values. Use of the hub number only works as long as the nasids in the partition are consecutive and increase with a stride of 1. This patch changes the index to be the pnode number, thus allowing nasids to be non-consecutive. It also provides a table in local memory for each cpu to translate target cpu number to target pnode and nasid. And it improves naming to properly reflect 'node' and 'uvhub' versus 'nasid'. Signed-off-by: Cliff Wickman <cpw@sgi.com> Cc: <stable@kernel.org> Link: http://lkml.kernel.org/r/E1QJmxX-0002Mz-Fk@eag09.americas.sgi.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| | * | | | x86, UV: Fix NMI handler for UV platformsJack Steiner2011-05-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes problems seen on UV systems handling NMIs from the node controller. I isolated the "dazed..." messages that I saw earlier to a bug in the BMC on our platform. It was sending NMIs w/o properly setting a register that indicated the source of NMI. So rather than _assuming_ any unhandled NMI came from the UV system maintenance console (SMC), add a check to verify that the SMC actually sent the NMI. Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: gorcunov@gmail.com Cc: dzickus@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds2011-05-18
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf evlist: Fix per thread mmap setup perf tools: Honour the cpu list parameter when also monitoring a thread list kprobes, x86: Disable irqs during optimized callback
| | * \ \ \ \ Merge branch 'perf/urgent' of ↵Ingo Molnar2011-05-15
| | |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/urgent
| | | * | | | | perf evlist: Fix per thread mmap setupArnaldo Carvalho de Melo2011-05-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The PERF_EVENT_IOC_SET_OUTPUT ioctl was returning -EINVAL when using --pid when monitoring multithreaded apps, as we can only share a ring buffer for events on the same thread if not doing per cpu. Fix it by using per thread ring buffers. Tested with: [root@felicio ~]# tuna -t 26131 -CP | nl 1 thread ctxt_switches 2 pid SCHED_ rtpri affinity voluntary nonvoluntary cmd 3 26131 OTHER 0 0,1 10814276 2397830 chromium-browse 4 642 OTHER 0 0,1 14688 0 chromium-browse 5 26148 OTHER 0 0,1 713602 115479 chromium-browse 6 26149 OTHER 0 0,1 801958 2262 chromium-browse 7 26150 OTHER 0 0,1 1271128 248 chromium-browse 8 26151 OTHER 0 0,1 3 0 chromium-browse 9 27049 OTHER 0 0,1 36796 9 chromium-browse 10 618 OTHER 0 0,1 14711 0 chromium-browse 11 661 OTHER 0 0,1 14593 0 chromium-browse 12 29048 OTHER 0 0,1 28125 0 chromium-browse 13 26143 OTHER 0 0,1 2202789 781 chromium-browse [root@felicio ~]# So 11 threads under pid 26131, then: [root@felicio ~]# perf record -F 50000 --pid 26131 [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl 1 7fa4a2538000-7fa4a25b9000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 2 7fa4a25b9000-7fa4a263a000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 3 7fa4a263a000-7fa4a26bb000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 4 7fa4a26bb000-7fa4a273c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 5 7fa4a273c000-7fa4a27bd000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 6 7fa4a27bd000-7fa4a283e000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 7 7fa4a283e000-7fa4a28bf000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 8 7fa4a28bf000-7fa4a2940000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 9 7fa4a2940000-7fa4a29c1000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 10 7fa4a29c1000-7fa4a2a42000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 11 7fa4a2a42000-7fa4a2ac3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] [root@felicio ~]# 11 mmaps, one per thread since we didn't specify any CPU list, so we need one mmap per thread and: [root@felicio ~]# perf record -F 50000 --pid 26131 ^M ^C[ perf record: Woken up 79 times to write data ] [ perf record: Captured and wrote 20.614 MB perf.data (~900639 samples) ] [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl 1 371310 26131 2 96516 26148 3 95694 26149 4 95203 26150 5 7291 26143 6 87 27049 7 76 661 8 60 29048 9 47 618 10 43 642 [root@felicio ~]# Ok, one of the threads, 26151 was quiescent, so no samples there, but all the others are there. Then, if I specify one CPU: [root@felicio ~]# perf record -F 50000 --pid 26131 --cpu 1 ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.680 MB perf.data (~29730 samples) ] [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl 1 8444 26131 2 2584 26149 3 2518 26148 4 2324 26150 5 123 26143 6 9 661 7 9 29048 [root@felicio ~]# This machine has two cores, so fewer threads appeared on the radar, and: [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl 1 7f484b922000-7f484b9a3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] [root@felicio ~]# Just one mmap, as now we can use just one per-cpu buffer instead of the per-thread needed in the previous case. For global profiling: [root@felicio ~]# perf record -F 50000 -a ^C[ perf record: Woken up 26 times to write data ] [ perf record: Captured and wrote 7.128 MB perf.data (~311412 samples) ] [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl 1 7fb49b435000-7fb49b4b6000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] 2 7fb49b4b6000-7fb49b537000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] [root@felicio ~]# It uses per-cpu buffers. For just one thread: [root@felicio ~]# perf record -F 50000 --tid 26148 ^C[ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 0.330 MB perf.data (~14426 samples) ] [root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl 1 9969 26148 [root@felicio ~]# [root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl 1 7f286a51b000-7f286a59c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event] [root@felicio ~]# Tested-by: David Ahern <dsahern@gmail.com> Tested-by: Lin Ming <ming.m.lin@intel.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | | * | | | | perf tools: Honour the cpu list parameter when also monitoring a thread listArnaldo Carvalho de Melo2011-05-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The perf_evlist__create_maps was discarding the --cpu parameter when a --pid or --tid was specified, fix that. Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
| | * | | | | | kprobes, x86: Disable irqs during optimized callbackJiri Olsa2011-05-11
| | | |/ / / / | | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Disable irqs during optimized callback, so we dont miss any in-irq kprobes. The following commands: # cd /debug/tracing/ # echo "p mutex_unlock" >> kprobe_events # echo "p _raw_spin_lock" >> kprobe_events # echo "p smp_apic_timer_interrupt" >> ./kprobe_events # echo 1 > events/enable Cause the optimized kprobes to be missed. None is missed with the fix applied. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Link: http://lkml.kernel.org/r/20110511110613.GB2390@jolsa.brq.redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6Linus Torvalds2011-05-18
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: cifs: fix cifsConvertToUCS() for the mapchars case cifs: add fallback in is_path_accessible for old servers