aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* sched: stop the unbound recursion in preempt_schedule_context()Oleg Nesterov2014-10-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | preempt_schedule_context() does preempt_enable_notrace() at the end and this can call the same function again; exception_exit() is heavy and it is quite possible that need-resched is true again. 1. Change this code to dec preempt_count() and check need_resched() by hand. 2. As Linus suggested, we can use the PREEMPT_ACTIVE bit and avoid the enable/disable dance around __schedule(). But in this case we need to move into sched/core.c. 3. Cosmetic, but x86 forgets to declare this function. This doesn't really matter because it is only called by asm helpers, still it make sense to add the declaration into asm/preempt.h to match preempt_schedule(). Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Graf <agraf@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Lameter <cl@linux.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Chuck Ebbert <cebbert.lkml@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20141005202322.GB27962@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* sched/fair: Fix division by zero sysctl_numa_balancing_scan_sizeKirill Tkhai2014-10-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | File /proc/sys/kernel/numa_balancing_scan_size_mb allows writing of zero. This bash command reproduces problem: $ while :; do echo 0 > /proc/sys/kernel/numa_balancing_scan_size_mb; \ echo 256 > /proc/sys/kernel/numa_balancing_scan_size_mb; done divide error: 0000 [#1] SMP Modules linked in: CPU: 0 PID: 24112 Comm: bash Not tainted 3.17.0+ #8 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff88013c852600 ti: ffff880037a68000 task.ti: ffff880037a68000 RIP: 0010:[<ffffffff81074191>] [<ffffffff81074191>] task_scan_min+0x21/0x50 RSP: 0000:ffff880037a6bce0 EFLAGS: 00010246 RAX: 0000000000000a00 RBX: 00000000000003e8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88013c852600 RBP: ffff880037a6bcf0 R08: 0000000000000001 R09: 0000000000015c90 R10: ffff880239bf6c00 R11: 0000000000000016 R12: 0000000000003fff R13: ffff88013c852600 R14: ffffea0008d1b000 R15: 0000000000000003 FS: 00007f12bb048700(0000) GS:ffff88007da00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000001505678 CR3: 0000000234770000 CR4: 00000000000006f0 Stack: ffff88013c852600 0000000000003fff ffff880037a6bd18 ffffffff810741d1 ffff88013c852600 0000000000003fff 000000000002bfff ffff880037a6bda8 ffffffff81077ef7 ffffea0008a56d40 0000000000000001 0000000000000001 Call Trace: [<ffffffff810741d1>] task_scan_max+0x11/0x40 [<ffffffff81077ef7>] task_numa_fault+0x1f7/0xae0 [<ffffffff8115a896>] ? migrate_misplaced_page+0x276/0x300 [<ffffffff81134a4d>] handle_mm_fault+0x62d/0xba0 [<ffffffff8103e2f1>] __do_page_fault+0x191/0x510 [<ffffffff81030122>] ? native_smp_send_reschedule+0x42/0x60 [<ffffffff8106dc00>] ? check_preempt_curr+0x80/0xa0 [<ffffffff8107092c>] ? wake_up_new_task+0x11c/0x1a0 [<ffffffff8104887d>] ? do_fork+0x14d/0x340 [<ffffffff811799bb>] ? get_unused_fd_flags+0x2b/0x30 [<ffffffff811799df>] ? __fd_install+0x1f/0x60 [<ffffffff8103e67c>] do_page_fault+0xc/0x10 [<ffffffff8150d322>] page_fault+0x22/0x30 RIP [<ffffffff81074191>] task_scan_min+0x21/0x50 RSP <ffff880037a6bce0> ---[ end trace 9a826d16936c04de ]--- Also fix race in task_scan_min (it depends on compiler behaviour). Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dario Faggioli <raistlin@linux.it> Cc: David Rientjes <rientjes@google.com> Cc: Jens Axboe <axboe@fb.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Rik van Riel <riel@redhat.com> Link: http://lkml.kernel.org/r/1413455977.24793.78.camel@tkhai Signed-off-by: Ingo Molnar <mingo@kernel.org>
* sched/fair: Care divide error in update_task_scan_period()Yasuaki Ishimatsu2014-10-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While offling node by hot removing memory, the following divide error occurs: divide error: 0000 [#1] SMP [...] Call Trace: [...] handle_mm_fault [...] ? try_to_wake_up [...] ? wake_up_state [...] __do_page_fault [...] ? do_futex [...] ? put_prev_entity [...] ? __switch_to [...] do_page_fault [...] page_fault [...] RIP [<ffffffff810a7081>] task_numa_fault RSP <ffff88084eb2bcb0> The issue occurs as follows: 1. When page fault occurs and page is allocated from node 1, task_struct->numa_faults_buffer_memory[] of node 1 is incremented and p->numa_faults_locality[] is also incremented as follows: o numa_faults_buffer_memory[] o numa_faults_locality[] NR_NUMA_HINT_FAULT_TYPES | 0 | 1 | ---------------------------------- ---------------------- node 0 | 0 | 0 | remote | 0 | node 1 | 0 | 1 | locale | 1 | ---------------------------------- ---------------------- 2. node 1 is offlined by hot removing memory. 3. When page fault occurs, fault_types[] is calculated by using p->numa_faults_buffer_memory[] of all online nodes in task_numa_placement(). But node 1 was offline by step 2. So the fault_types[] is calculated by using only p->numa_faults_buffer_memory[] of node 0. So both of fault_types[] are set to 0. 4. The values(0) of fault_types[] pass to update_task_scan_period(). 5. numa_faults_locality[1] is set to 1. So the following division is calculated. static void update_task_scan_period(struct task_struct *p, unsigned long shared, unsigned long private){ ... ratio = DIV_ROUND_UP(private * NUMA_PERIOD_SLOTS, (private + shared)); } 6. But both of private and shared are set to 0. So divide error occurs here. The divide error is rare case because the trigger is node offline. This patch always increments denominator for avoiding divide error. Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/54475703.8000505@jp.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* sched/numa: Fix unsafe get_task_struct() in task_numa_assign()Kirill Tkhai2014-10-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Unlocked access to dst_rq->curr in task_numa_compare() is racy. If curr task is exiting this may be a reason of use-after-free: task_numa_compare() do_exit() ... current->flags |= PF_EXITING; ... release_task() ... ~~delayed_put_task_struct()~~ ... schedule() rcu_read_lock() ... cur = ACCESS_ONCE(dst_rq->curr) ... ... rq->curr = next; ... context_switch() ... finish_task_switch() ... put_task_struct() ... __put_task_struct() ... free_task_struct() task_numa_assign() ... get_task_struct() ... As noted by Oleg: <<The lockless get_task_struct(tsk) is only safe if tsk == current and didn't pass exit_notify(), or if this tsk was found on a rcu protected list (say, for_each_process() or find_task_by_vpid()). IOW, it is only safe if release_task() was not called before we take rcu_read_lock(), in this case we can rely on the fact that delayed_put_pid() can not drop the (potentially) last reference until rcu_read_unlock(). And as Kirill pointed out task_numa_compare()->task_numa_assign() path does get_task_struct(dst_rq->curr) and this is not safe. The task_struct itself can't go away, but rcu_read_lock() can't save us from the final put_task_struct() in finish_task_switch(); this reference goes away without rcu gp>> The patch provides simple check of PF_EXITING flag. If it's not set, this guarantees that call_rcu() of delayed_put_task_struct() callback hasn't happened yet, so we can safely do get_task_struct() in task_numa_assign(). Locked dst_rq->lock protects from concurrency with the last schedule(). Reusing or unmapping of cur's memory may happen without it. Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1413962231.19914.130.camel@tkhai Signed-off-by: Ingo Molnar <mingo@kernel.org>
* sched/deadline: Fix races between rt_mutex_setprio() and dl_task_timer()Juri Lelli2014-10-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | dl_task_timer() is racy against several paths. Daniel noticed that the replenishment timer may experience a race condition against an enqueue_dl_entity() called from rt_mutex_setprio(). With his own words: rt_mutex_setprio() resets p->dl.dl_throttled. So the pattern is: start_dl_timer() throttled = 1, rt_mutex_setprio() throlled = 0, sched_switch() -> enqueue_task(), dl_task_timer-> enqueue_task() throttled is 0 => BUG_ON(on_dl_rq(dl_se)) fires as the scheduling entity is already enqueued on the -deadline runqueue. As we do for the other races, we just bail out in the replenishment timer code. Reported-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Tested-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Juri Lelli <juri.lelli@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: vincent@legout.info Cc: Dario Faggioli <raistlin@linux.it> Cc: Michael Trimarchi <michael@amarulasolutions.com> Cc: Fabio Checconi <fchecconi@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1414142198-18552-5-git-send-email-juri.lelli@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* sched/deadline: Don't replenish from a !SCHED_DEADLINE entityJuri Lelli2014-10-28
| | | | | | | | | | | | | | | | | | | | | | | In the deboost path, right after the dl_boosted flag has been reset, we can currently end up replenishing using -deadline parameters of a !SCHED_DEADLINE entity. This of course causes a bug, as those parameters are empty. In the case depicted above it is safe to simply bail out, as the deboosted task is going to be back to its original scheduling class anyway. Reported-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Tested-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Juri Lelli <juri.lelli@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: vincent@legout.info Cc: Dario Faggioli <raistlin@linux.it> Cc: Michael Trimarchi <michael@amarulasolutions.com> Cc: Fabio Checconi <fchecconi@gmail.com> Link: http://lkml.kernel.org/r/1414142198-18552-4-git-send-email-juri.lelli@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
* sched: Fix race between task_group and sched_task_groupKirill Tkhai2014-10-28
| | | | | | | | | | | | | | | | | | | | | | | The race may happen when somebody is changing task_group of a forking task. Child's cgroup is the same as parent's after dup_task_struct() (there just memory copying). Also, cfs_rq and rt_rq are the same as parent's. But if parent changes its task_group before it's called cgroup_post_fork(), we do not reflect this situation on child. Child's cfs_rq and rt_rq remain the same, while child's task_group changes in cgroup_post_fork(). To fix this we introduce fork() method, which calls sched_move_task() directly. This function changes sched_task_group on appropriate (also its logic has no problem with freshly created tasks, so we shouldn't introduce something special; we are able just to use it). Possibly, this decides the Burke Libbey's problem: https://lkml.org/lkml/2014/10/24/456 Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1414405105.19914.169.camel@tkhai Signed-off-by: Ingo Molnar <mingo@kernel.org>
* Linux 3.18-rc2Linus Torvalds2014-10-26
|
* Merge tag 'armsoc-for-rc2' of ↵Linus Torvalds2014-10-26
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Another week, another small batch of fixes. Most of these make zynq, socfpga and sunxi platforms work a bit better: - due to new requirements for regulators, DWMMC on socfpga broke past v3.17 - SMP spinup fix for socfpga - a few DT fixes for zynq - another option (FIXED_REGULATOR) for sunxi is needed that used to be selected by other options but no longer is. - a couple of small DT fixes for at91 - ...and a couple for i.MX" * tag 'armsoc-for-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: dts: imx28-evk: Let i2c0 run at 100kHz ARM: i.MX6: Fix "emi" clock name typo ARM: multi_v7_defconfig: enable CONFIG_MMC_DW_ROCKCHIP ARM: sunxi_defconfig: enable CONFIG_REGULATOR_FIXED_VOLTAGE ARM: dts: socfpga: Add a 3.3V fixed regulator node ARM: dts: socfpga: Fix SD card detect ARM: dts: socfpga: rename gpio nodes ARM: at91/dt: sam9263: fix PLLB frequencies power: reset: at91-reset: fix power down register MAINTAINERS: add atmel ssc driver maintainer entry arm: socfpga: fix fetching cpu1start_addr for SMP ARM: zynq: DT: trivial: Fix mc node ARM: zynq: DT: Add cadence watchdog node ARM: zynq: DT: Add missing reference for memory-controller ARM: zynq: DT: Add missing reference for ADC ARM: zynq: DT: Add missing address for L2 pl310 ARM: zynq: DT: Remove 222 MHz OPP ARM: zynq: DT: Fix GEM register area size
| * Merge tag 'imx-fixes-3.18' of ↵Olof Johansson2014-10-25
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into fixes Merge "ARM: imx: fixes for 3.18" from Shawn Guo: The i.MX fixes for 3.18: - Revert one patch which increases I2C bus frequency on imx28-evk - Fix a typo on imx6q EIM clock name * tag 'imx-fixes-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: ARM: dts: imx28-evk: Let i2c0 run at 100kHz ARM: i.MX6: Fix "emi" clock name typo Signed-off-by: Olof Johansson <olof@lixom.net>
| | * ARM: dts: imx28-evk: Let i2c0 run at 100kHzFabio Estevam2014-10-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 78b81f4666fb ("ARM: dts: imx28-evk: Run I2C0 at 400kHz") caused issues when doing the following sequence in loop: - Boot the kernel - Perform audio playback - Reboot the system via 'reboot' command In many times the audio card cannot be probed, which causes playback to fail. After restoring to the original i2c0 frequency of 100kHz there is no such problem anymore. This reverts commit 78b81f4666fbb22a20b1e63e5baf197ad2e90e88. Cc: <stable@vger.kernel.org> # 3.16+ Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
| | * ARM: i.MX6: Fix "emi" clock name typoSteve Longerbeam2014-10-25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix a typo error, the "emi" names refer to the eim clocks. The change fixes typo in EIM and EIM_SLOW pre-output dividers and selectors clock names. Notably EIM_SLOW clock itself is named correctly. Signed-off-by: Steve Longerbeam <steve_longerbeam@mentor.com> [vladimir_zapolskiy@mentor.com: ported to v3.17] Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com> Cc: Sascha Hauer <kernel@pengutronix.de> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
| * | Merge tag 'socfpga_fixes_for_3.18' of ↵Olof Johansson2014-10-24
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.rocketboards.org/linux-socfpga-next into fixes Merge "SOCFPGA fixes for 3.18" from Dinh Nguyen: These patches fixes an SMP and SDMMC driver hang during boot up on the SOCFPGA platform. Patch "arm: socfpga: fix fetching cpu1start_addr for SMP" fixes the SMP trampoline code in order for CPU1 to correctly fetch it's cpu1start_addr. Patch "ARM: dts: socfpga: rename gpio nodes" renames that GPIO node in order to allow a standard way of specifying status="okay" in the board DTS file. Patch "ARM: dts: socfpga: Fix SD card detect" fixes a SDMMC driver hang during boot. The reason for the hang was the deferred probe of the SDMMC driver was waiting for the GPIO resource that would never come. Patch "ARM: dts: socfpga: Add a 3.3V fixed regulator node" adds a fixed regulator node for the SDMMC driver to use. * tag 'socfpga_fixes_for_3.18' of git://git.rocketboards.org/linux-socfpga-next: ARM: dts: socfpga: Add a 3.3V fixed regulator node ARM: dts: socfpga: Fix SD card detect ARM: dts: socfpga: rename gpio nodes arm: socfpga: fix fetching cpu1start_addr for SMP Signed-off-by: Olof Johansson <olof@lixom.net>
| | * | ARM: dts: socfpga: Add a 3.3V fixed regulator nodeDinh Nguyen2014-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Without the 3.3V regulator node, the SDMMC driver will give these warnings: dw_mmc ff704000.dwmmc0: No vmmc regulator found dw_mmc ff704000.dwmmc0: No vqmmc regulator found This patch adds the regulator node, and points the SD/MMC to the regulator. Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com> Reviewed-by: Doug Anderson <dianders@chromium.org> --- v3: Rename nodes to have schematic-name_regulator and remove "boot-on" and "always-on" v2: Move the regulator nodes to their respective board dts file and correctly rename them to match the schematic
| | * | ARM: dts: socfpga: Fix SD card detectDinh Nguyen2014-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Without this patch, the booting the SOCFPGA platform would hang at the SDMMC driver loading. The issue, debugged by Doug Anderson, turned out to be that the GPIO bank used by the SD card-detect was not set to status="okay". Also update the cd-gpios to point to portb of the &gpio1 GPIO IP. Suggested-by: Doug Anderson <dianders@chromium.org> Reviewed-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com> --- v4: Use &gpio1 to set status="okay" and update cd-gpio=&portb v3: Correctly degugged the issue to be a gpio node not having status="okay"
| | * | ARM: dts: socfpga: rename gpio nodesDinh Nguyen2014-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since the Synopsys GPIO IP can support multiple ports of varying widths, it would make more sense to have the GPIO node DTS entry as this: gpio0: gpio@ff708000{ porta{ }; }; Also, this is documented in the snps-dwapb-gpio.txt. Suggested-by: Doug Anderson <dianders@chromium.org> Reviewed-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com>
| | * | arm: socfpga: fix fetching cpu1start_addr for SMPDinh Nguyen2014-10-21
| | |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When CPU1 is brought out of reset, it's MMU is not turned on yet, so it will only be able to use physical addresses. For systems with that have the MMU page configured for 0xC0000000, 0x80000000, or 0x40000000 "BIC 0x40000000" will work just fine, as it was just converting the virtual address of &cpu1start_addr into a physical address, ie. 0xC0000000 became 0x80000000. So for systems where the SDRAM controller was able to do a wrap-around access, this was working fine, as it was just dropping the MSB, but for systems where out of bounds memory access is not allowed, this would not allow CPU1 to correctly fetch &cpu1start_addr. This patch fixes the secondary_trampoline code to correctly fetch the physical address of cpu1start_addr directly. The patch will subtract the correct PAGE_OFFSET from &cpu1start_addr. And since on this platform, the physical memory will always start at 0x0, subtracting PAGE_OFFSET from &cpu1start_addr will allow CPU1 to correctly fetch the value of cpu1start_addr. While at it, change the name of cpu1start_addr to socfpga_cpu1start_addr to avoid any future naming collisions for multiplatform image. Signed-off-by: Dinh Nguyen <dinguyen@opensource.altera.com> --- v4: Updated commit log to correctly lay out the usage of PAGE_OFFSET and add comments to the same effect. v3: Used PAGE_OFFSET to get the physical address v2: Correctly get the physical address instead of just a BIC hack.
| * | Merge tag 'at91-fixes' of ↵Olof Johansson2014-10-24
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nferre/linux-at91 into fixes Merge "at91: fixes for v3.18 #1" from Nicholas Ferre: First AT91 fixes for 3.18: - one more MAINTAINERS entry for the SSC driver - a fix for the newly introduced power/reset driver - a fix on at91sam9263 USB due to PLLB misconfiguration * tag 'at91-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nferre/linux-at91: ARM: at91/dt: sam9263: fix PLLB frequencies power: reset: at91-reset: fix power down register MAINTAINERS: add atmel ssc driver maintainer entry Signed-off-by: Olof Johansson <olof@lixom.net>
| | * | ARM: at91/dt: sam9263: fix PLLB frequenciesBoris Brezillon2014-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PLLB input and output ranges were wrongly copied from at91sam9261 as the datasheet didn't mention explicitly PLLB. Correct their values. This fixes USB. Reported-by: Andreas Henriksson <andreas.henriksson@endian.se> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Tested-by: Andreas Henriksson <andreas.henriksson@endian.se> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
| | * | power: reset: at91-reset: fix power down registerAlexandre Belloni2014-10-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the case of at91sam9g45_restart(), the driver is writing AT91_DDRSDRC_LPCB_POWER_DOWN to AT91_DDRSDRC_RTR, this should actually be AT91_DDRSDRC_LPR. Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com> Acked-by: Sebastian Reichel <sre@kernel.org> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
| | * | MAINTAINERS: add atmel ssc driver maintainer entryBo Shen2014-10-22
| | |/ | | | | | | | | | | | | Signed-off-by: Bo Shen <voice.shen@atmel.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
| * | Merge tag 'zynq-dt-fixes-for-3.18' of https://github.com/Xilinx/linux-xlnx ↵Olof Johansson2014-10-24
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | into fixes Merge "Xilinx Zynq dt fixes for v3.18" from Michal Simek: arm: Xilinx Zynq DT fixes for v3.18 - Fix gem register size - Fix OPP - Add missing references - Trivial cleanup * tag 'zynq-dt-fixes-for-3.18' of https://github.com/Xilinx/linux-xlnx: ARM: zynq: DT: trivial: Fix mc node ARM: zynq: DT: Add cadence watchdog node ARM: zynq: DT: Add missing reference for memory-controller ARM: zynq: DT: Add missing reference for ADC ARM: zynq: DT: Add missing address for L2 pl310 ARM: zynq: DT: Remove 222 MHz OPP ARM: zynq: DT: Fix GEM register area size Signed-off-by: Olof Johansson <olof@lixom.net>
| | * | ARM: zynq: DT: trivial: Fix mc nodeMichal Simek2014-10-20
| | | | | | | | | | | | | | | | | | | | | | | | sed -i 's/}\ ;/};/g' Signed-off-by: Michal Simek <michal.simek@xilinx.com>
| | * | ARM: zynq: DT: Add cadence watchdog nodeMichal Simek2014-10-20
| | | | | | | | | | | | | | | | | | | | | | | | Add the cadence watchdog node to the Zynq devicetree. Signed-off-by: Michal Simek <michal.simek@xilinx.com>
| | * | ARM: zynq: DT: Add missing reference for memory-controllerMichal Simek2014-10-20
| | | | | | | | | | | | | | | | | | | | | | | | Add missing reference for memory-controller. Signed-off-by: Michal Simek <michal.simek@xilinx.com>
| | * | ARM: zynq: DT: Add missing reference for ADCMichal Simek2014-10-20
| | | | | | | | | | | | | | | | | | | | | | | | Add missing reference for ADC node. Signed-off-by: Michal Simek <michal.simek@xilinx.com>
| | * | ARM: zynq: DT: Add missing address for L2 pl310Michal Simek2014-10-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | By in sync with others node and add also baseaddr to the node name. Signed-off-by: Michal Simek <michal.simek@xilinx.com>
| | * | ARM: zynq: DT: Remove 222 MHz OPPSoren Brinkmann2014-10-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Due to dependencies between timer and CPU frequency, only changes by powers of two are allowed. The clocksource driver prevents other changes, but with cpufreq and its governors it can result in being spammed with error messages constantly. Hence, remove the 222 MHz OPP. Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com>
| | * | ARM: zynq: DT: Fix GEM register area sizeSoren Brinkmann2014-10-20
| | |/ | | | | | | | | | | | | | | | | | | The size of the GEM's register area is only 0x1000 bytes. Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com>
| * | ARM: multi_v7_defconfig: enable CONFIG_MMC_DW_ROCKCHIPOlof Johansson2014-10-24
| | | | | | | | | | | | | | | | | | | | | Allows booting from SD/MMC on RK3288 and other platforms. Added here so I can enable the board in the boot farm. Signed-off-by: Olof Johansson <olof@lixom.net>
| * | ARM: sunxi_defconfig: enable CONFIG_REGULATOR_FIXED_VOLTAGEOlof Johansson2014-10-24
| |/ | | | | | | | | | | | | I missed in 9a2ad529ed26 that REGULATOR_FIXED_VOLTAGE had also gotten deselected, so it needs to be added back as an explicit option. Signed-off-by: Olof Johansson <olof@lixom.net>
* | Merge branch 'for-linus' of ↵Linus Torvalds2014-10-26
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: "overlayfs merge + leak fix for d_splice_alias() failure exits" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: overlayfs: embed middle into overlay_readdir_data overlayfs: embed root into overlay_readdir_data overlayfs: make ovl_cache_entry->name an array instead of pointer overlayfs: don't hold ->i_mutex over opening the real directory fix inode leaks on d_splice_alias() failure exits fs: limit filesystem stacking depth overlay: overlay filesystem documentation overlayfs: implement show_options overlayfs: add statfs support overlay filesystem shmem: support RENAME_WHITEOUT ext4: support RENAME_WHITEOUT vfs: add RENAME_WHITEOUT vfs: add whiteout support vfs: export check_sticky() vfs: introduce clone_private_mount() vfs: export __inode_permission() to modules vfs: export do_splice_direct() to modules vfs: add i_op->dentry_open()
| * | overlayfs: embed middle into overlay_readdir_dataAl Viro2014-10-24
| | | | | | | | | | | | | | | | | | same story... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | overlayfs: embed root into overlay_readdir_dataAl Viro2014-10-24
| | | | | | | | | | | | | | | | | | | | | no sense having it a pointer - all instances have it pointing to local variable in the same stack frame Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | overlayfs: make ovl_cache_entry->name an array instead of pointerAl Viro2014-10-24
| | | | | | | | | | | | Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | overlayfs: don't hold ->i_mutex over opening the real directoryAl Viro2014-10-24
| | | | | | | | | | | | | | | | | | just use it to serialize the assignment Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
| * | Merge branch 'overlayfs.v25' of ↵Al Viro2014-10-23
| |\ \ | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs into for-linus
| | * | fs: limit filesystem stacking depthMiklos Szeredi2014-10-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a simple read-only counter to super_block that indicates how deep this is in the stack of filesystems. Previously ecryptfs was the only stackable filesystem and it explicitly disallowed multiple layers of itself. Overlayfs, however, can be stacked recursively and also may be stacked on top of ecryptfs or vice versa. To limit the kernel stack usage we must limit the depth of the filesystem stack. Initially the limit is set to 2. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| | * | overlay: overlay filesystem documentationNeil Brown2014-10-23
| | | | | | | | | | | | | | | | | | | | | | | | Document the overlay filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| | * | overlayfs: implement show_optionsErez Zadok2014-10-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is useful because of the stacking nature of overlayfs. Users like to find out (via /proc/mounts) which lower/upper directory were used at mount time. AV: even failing ovl_parse_opt() could've done some kstrdup() AV: failure of ovl_alloc_entry() should end up with ENOMEM, not EINVAL Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| | * | overlayfs: add statfs supportAndy Whitcroft2014-10-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for statfs to the overlayfs filesystem. As the upper layer is the target of all write operations assume that the space in that filesystem is the space in the overlayfs. There will be some inaccuracy as overwriting a file will copy it up and consume space we were not expecting, but it is better than nothing. Use the upper layer dentry and mount from the overlayfs root inode, passing the statfs call to that filesystem. Signed-off-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| | * | overlay filesystemMiklos Szeredi2014-10-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Overlayfs allows one, usually read-write, directory tree to be overlaid onto another, read-only directory tree. All modifications go to the upper, writable layer. This type of mechanism is most often used for live CDs but there's a wide variety of other uses. The implementation differs from other "union filesystem" implementations in that after a file is opened all operations go directly to the underlying, lower or upper, filesystems. This simplifies the implementation and allows native performance in these cases. The dentry tree is duplicated from the underlying filesystems, this enables fast cached lookups without adding special support into the VFS. This uses slightly more memory than union mounts, but dentries are relatively small. Currently inodes are duplicated as well, but it is a possible optimization to share inodes for non-directories. Opening non directories results in the open forwarded to the underlying filesystem. This makes the behavior very similar to union mounts (with the same limitations vs. fchmod/fchown on O_RDONLY file descriptors). Usage: mount -t overlayfs overlayfs -olowerdir=/lower,upperdir=/upper/upper,workdir=/upper/work /overlay The following cotributions have been folded into this patch: Neil Brown <neilb@suse.de>: - minimal remount support - use correct seek function for directories - initialise is_real before use - rename ovl_fill_cache to ovl_dir_read Felix Fietkau <nbd@openwrt.org>: - fix a deadlock in ovl_dir_read_merged - fix a deadlock in ovl_remove_whiteouts Erez Zadok <ezk@fsl.cs.sunysb.edu> - fix cleanup after WARN_ON Sedat Dilek <sedat.dilek@googlemail.com> - fix up permission to confirm to new API Robin Dong <hao.bigrat@gmail.com> - fix possible leak in ovl_new_inode - create new inode in ovl_link Andy Whitcroft <apw@canonical.com> - switch to __inode_permission() - copy up i_uid/i_gid from the underlying inode AV: - ovl_copy_up_locked() - dput(ERR_PTR(...)) on two failure exits - ovl_clear_empty() - one failure exit forgetting to do unlock_rename(), lack of check for udir being the parent of upper, dropping and regaining the lock on udir (which would require _another_ check for parent being right). - bogus d_drop() in copyup and rename [fix from your mail] - copyup/remove and copyup/rename races [fix from your mail] - ovl_dir_fsync() leaving ERR_PTR() in ->realfile - ovl_entry_free() is pointless - it's just a kfree_rcu() - fold ovl_do_lookup() into ovl_lookup() - manually assigning ->d_op is wrong. Just use ->s_d_op. [patches picked from Miklos]: * copyup/remove and copyup/rename races * bogus d_drop() in copyup and rename Also thanks to the following people for testing and reporting bugs: Jordi Pujol <jordipujolp@gmail.com> Andy Whitcroft <apw@canonical.com> Michal Suchanek <hramrach@centrum.cz> Felix Fietkau <nbd@openwrt.org> Erez Zadok <ezk@fsl.cs.sunysb.edu> Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| | * | shmem: support RENAME_WHITEOUTMiklos Szeredi2014-10-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allocate a dentry, initialize it with a whiteout and hash it in the place of the old dentry. Later the old dentry will be moved away and the whiteout will remain. i_mutex protects agains concurrent readdir. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: Hugh Dickins <hughd@google.com>
| | * | ext4: support RENAME_WHITEOUTMiklos Szeredi2014-10-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add whiteout support to ext4_rename(). A whiteout inode (chrdev/0,0) is created before the rename takes place. The whiteout inode is added to the old entry instead of deleting it. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| | * | vfs: add RENAME_WHITEOUTMiklos Szeredi2014-10-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a new RENAME_WHITEOUT flag. This flag makes rename() create a whiteout of source. The whiteout creation is atomic relative to the rename. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| | * | vfs: add whiteout supportMiklos Szeredi2014-10-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Whiteout isn't actually a new file type, but is represented as a char device (Linus's idea) with 0/0 device number. This has several advantages compared to introducing a new whiteout file type: - no userspace API changes (e.g. trivial to make backups of upper layer filesystem, without losing whiteouts) - no fs image format changes (you can boot an old kernel/fsck without whiteout support and things won't break) - implementation is trivial Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| | * | vfs: export check_sticky()Miklos Szeredi2014-10-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It's already duplicated in btrfs and about to be used in overlayfs too. Move the sticky bit check to an inline helper and call the out-of-line helper only in the unlikly case of the sticky bit being set. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| | * | vfs: introduce clone_private_mount()Miklos Szeredi2014-10-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Overlayfs needs a private clone of the mount, so create a function for this and export to modules. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| | * | vfs: export __inode_permission() to modulesMiklos Szeredi2014-10-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We need to be able to check inode permissions (but not filesystem implied permissions) for stackable filesystems. Expose this interface for overlayfs. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
| | * | vfs: export do_splice_direct() to modulesMiklos Szeredi2014-10-23
| | | | | | | | | | | | | | | | | | | | | | | | Export do_splice_direct() to modules. Needed by overlay filesystem. Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>