Commit message  Author  Age
* Merge tag 'marvell-mvneta-fix-and-clk-support-3.8' of git://github.com/MISL-EBU-System-SW/mainline-public into mvebu/everything  Jason Cooper  2012-11-21
|\
| |     Marvell Ethernet driver fix + clk support
| * net: mvneta: fix section mismatch warning caused by mvneta_deinit()  Thomas Petazzoni  2012-11-20
| |     mvneta_deinit() can be called from the ->probe() hook in the error path, so it shouldn't be marked as __devexit. It fixes the following section mismatch warning:
| |
| |       WARNING: vmlinux.o(.devinit.text+0x239c): Section mismatch in reference from the function mvneta_probe() to the function .devexit.text:mvneta_deinit()
| |       The function __devinit mvneta_probe() references a function __devexit mvneta_deinit().
| |
| |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| * net: mvneta: add clk support  Thomas Petazzoni  2012-11-20
| |     Now that the Armada 370/XP platform has gained proper integration with the clock framework, we add clk support in the Marvell Armada 370/XP Ethernet driver.
| |
| |     Since the existing Device Tree binding that exposes a 'clock-frequency' property has never been exposed in any stable kernel release, we take the freedom of removing this property to replace it with the standard 'clocks' clock pointer property.
| |
| |     The Device Tree binding documentation is updated accordingly.
| |
| |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
* | Merge tag 'marvell-net-mdio-checkpatch-fixes-3.8' of git://github.com/MISL-EBU-System-SW/mainline-public into mvebu/everything  Jason Cooper  2012-11-21
|\|
| |     Marvell network/MDIO driver checkpatch fixes
| * net: mvneta: adjust multiline comments to net/ style  Thomas Petazzoni  2012-11-20
| |     As reported by checkpatch, the multiline comments for net/ and drivers/net/ have a slightly different format than the one used in the rest of the kernel, so we adjust our multiline comments accordingly.
| |
| |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| * net: mvmdio: adjust multiline comment to net/ style  Thomas Petazzoni  2012-11-20
| |     As reported by checkpatch, the multiline comments for net/ and drivers/net/ have a slightly different format than the one used in the rest of the kernel, so we adjust our multiline comment accordingly.
| |
| |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| * net: mvmdio: use <linux/delay.h> instead of <asm/delay.h>  Thomas Petazzoni  2012-11-20
| |     As suggested by checkpatch, using <linux/delay.h> instead of <asm/delay.h> is appropriate.
| |
| |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| * Merge tag 'marvell-boards-net-for-3.8' of github.com:MISL-EBU-System-SW/mainline-public into test-the-merge  Thomas Petazzoni  2012-11-20
| |\
| | |     Marvell boards changes related to Ethernet, for 3.8
| | |
| | |     Conflicts:
| | |       arch/arm/boot/dts/armada-370-xp.dtsi
| | |       arch/arm/boot/dts/armada-xp-db.dts
| * \ Merge tag 'marvell-neta-for-3.8' of github.com:MISL-EBU-System-SW/mainline-public into test-the-merge  Thomas Petazzoni  2012-11-20
| |\ \
| | | |     Marvell mvneta network driver, for 3.8
| * \ \ Merge tag 'marvell-sata-3.8' of github.com:MISL-EBU-System-SW/mainline-public into test-the-merge  Thomas Petazzoni  2012-11-20
| |\ \ \
| | | | |     Marvell Armada 370/XP support for 3.8
| * \ \ \ Merge tag 'marvell-mvebu-clk-3.8' of github.com:MISL-EBU-System-SW/mainline-public into test-the-merge  Thomas Petazzoni  2012-11-20
| |\ \ \ \
| | | | | |     Marvell MVEBU clk support, for 3.8
* | \ \ \ \ Merge tag 'marvell-boards-net-for-3.8' of git://github.com/MISL-EBU-System-SW/mainline-public into mvebu/everything  Jason Cooper  2012-11-21
|\ \ \ \ \ \
| | |_|_|_|/
| |/| | | |
| | | | | |     Marvell boards changes related to Ethernet, for 3.8
| | | | | |
| | | | | |     Conflicts:
| | | | | |       arch/arm/boot/dts/armada-370-xp.dtsi
| | | | | |       arch/arm/boot/dts/armada-xp-db.dts
| * | | | | arm: mvebu: enable Ethernet controllers on Mirabox platform  Thomas Petazzoni  2012-11-16
| | | | | |     The Globalscale Mirabox platform has two Ethernet interfaces, connected to the SoC with a RGMII interface.
| | | | | |
| | | | | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| * | | | | arm: mvebu: enable Ethernet controllers on OpenBlocks AX3-4 platform  Thomas Petazzoni  2012-11-16
| | | | | |     The PlatHome OpenBlocks AX3-4 platform has 4 Ethernet ports, connected to a single quad-port PHY through SGMII.
| | | | | |
| | | | | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| * | | | | arm: mvebu: enable Ethernet controllers on Armada 370/XP eval boards  Thomas Petazzoni  2012-11-16
| | | | | |     This patch enables the two network interfaces of the Armada 370 official Marvell evaluation platform, and the four network interfaces of the Armada XP official Marvell evaluation platform.
| | | | | |
| | | | | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| * | | | | arm: mvebu: add Ethernet controllers using mvneta driver for Armada 370/XP  Thomas Petazzoni  2012-11-16
| | | | | |     The Armada 370 SoC has two network units, while the Armada XP has four network units. The first two network units are common to both the Armada XP and Armada 370, so they are added to armada-370-xp.dtsi, while the other two network units are specific to the Armada XP and therefore added to armada-xp.dtsi.
| | | | | |
| | | | | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| * | | | | arm: mvebu: support for the Globalscale Mirabox board  Gregory CLEMENT  2012-11-14
| | | | | |     This platform, available from Globalscale, has an Armada 370. For now, only the serial port is supported. Support for network, USB and other peripherals will be added as drivers for them become available for Armada 370 in mainline.
| | | | | |
| | | | | |     Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
| | | | | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| | | | | |     ---
| | | | | |     This is 3.8 material.
| | | | | |
| | | | | |     Changes from original version posted by Gregory:
| | | | | |      * Renamed .dts file to armada-370-mirabox.dts
| | | | | |      * Change compatible string to 'globalscale,mirabox'
| | | | | |      * Remove compatible string from armada-370-xp.c
| | | | | |      * Removed references to MBX0001
| * | | | | arm: mvebu: support for the PlatHome OpenBlocks AX3-4 board  Thomas Petazzoni  2012-11-14
| | | | | |     This platform, available in Japan from PlatHome, has a dual-core Armada XP, the MV78260. For now, only the two serial ports and the three front LEDs are supported. Support for SMP, network, SATA, USB and other peripherals will be added as drivers for them become available for Armada XP in mainline.
| | | | | |
| | | | | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| | | | | |     Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
| | | | | |     Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
| | | | | |     ---
| | | | | |     This is 3.8 material.
| | | | | |
| | | | | |     Changes since v2:
| | | | | |      * Renamed the .dts file to armada-xp-openblocks-ax3-4.dts
| | | | | |      * Removed the compatible string from armada-370-xp.c (which now only lists the common SoC compatible string)
| | | | | |
| | | | | |     Changes since v1:
| | | | | |      * Renamed the board to OpenBlocks AX3-4, since there is a variant called AX3-2 which has less RAM, and no mini PCIe port. Requested by Andrew Lunn.
| | | | | |      * Fix the amount of memory to 3 GB. In fact, the board has 1 GB soldered, and 2 GB in a SODIMM slot (which is therefore removable). But as the board is delivered as is, we'll assume it has 3 GB of memory by default.
* | | | | | Merge tag 'marvell-neta-for-3.8' of git://github.com/MISL-EBU-System-SW/mainline-public into mvebu/everything  Jason Cooper  2012-11-21
|\ \ \ \ \ \
| | |_|_|_|/
| |/| | | |
| | | | | |     Marvell mvneta network driver, for 3.8
| * | | | | net: mvneta: update MAINTAINERS file for the mvneta maintainers  Thomas Petazzoni  2012-11-16
| | | | | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| | | | | |     Acked-by: David S. Miller <davem@davemloft.net>
| * | | | | net: mvneta: driver for Marvell Armada 370/XP network unit  Thomas Petazzoni  2012-11-16
| | | | | |     This patch contains a new network driver for the network unit of the ARM Marvell Armada 370 and the Armada XP. Both SoCs use the PJ4B processor, a Marvell-developed ARM core that implements the ARMv7 instruction set.
| | | | | |
| | | | | |     Compared to previous ARM Marvell SoCs (Kirkwood, Orion, Discovery), the network unit in Armada 370 and Armada XP is highly different. This is the reason why this new 'mvneta' driver is needed, while the older ARM Marvell SoCs use the 'mv643xx_eth' driver.
| | | | | |
| | | | | |     Here is an overview of the most important hardware changes that require a new, specific, driver for the network unit of Armada 370/XP:
| | | | | |
| | | | | |      - The new network unit has a completely different design and layout for the RX and TX descriptors. They are now organized as a simple array (each RX and TX queue has the base address and size of this array) rather than a linked list as in the old SoCs.
| | | | | |      - The new network unit has a different RXQ and TXQ management: this management is done using special read/write counter registers, while in the old SoCs, it was done using the Ownership bit in RX and TX descriptors.
| | | | | |      - The new network unit has different interrupt registers.
| | | | | |      - In the new network unit, interrupts are not cleared by writing to the cause register, but by updating per-queue counters.
| | | | | |      - The new network unit has different GMAC registers (link, speed, duplex configuration) and different WRR registers.
| | | | | |      - The new network unit has lots of new units like PnC (Parser and Classifier), PMT, BM (Memory Buffer Management), xPON, and more.
| | | | | |
| | | | | |     The driver proposed in the current patch only handles the basic features. Additional hardware features will progressively be supported as needed.
| | | | | |
| | | | | |     This code has originally been written by Rami Rosen <rosenr@marvell.com>, and then reviewed and cleaned up by Thomas Petazzoni <thomas.petazzoni@free-electrons.com>.
| | | | | |
| | | | | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| | | | | |     Acked-by: David S. Miller <davem@davemloft.net>
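The descriptor layout described in the mvneta message (a flat array per queue, managed through counters instead of a per-descriptor ownership bit) can be sketched in user-space C. This is only an illustrative model under stated assumptions: `RING_SIZE`, `struct desc`, `struct ring`, `ring_post()` and `ring_complete()` are hypothetical names, and none of this is the driver's actual code or register layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical queue depth */

/* Hypothetical descriptor: no "next" pointer and no ownership bit. */
struct desc {
    uint32_t buf_addr;
    uint16_t len;
};

/* One queue: a flat array plus counters, standing in for the
 * base-address/size registers and the read/write counter registers. */
struct ring {
    struct desc descs[RING_SIZE];
    unsigned int next_to_fill;  /* software producer index */
    unsigned int next_to_clean; /* software consumer index */
    unsigned int pending;       /* descriptors currently handed over */
};

/* Hand one buffer to the queue; fails when the array is full. */
bool ring_post(struct ring *r, uint32_t buf_addr, uint16_t len)
{
    if (r->pending == RING_SIZE)
        return false;
    r->descs[r->next_to_fill] = (struct desc){ .buf_addr = buf_addr, .len = len };
    r->next_to_fill = (r->next_to_fill + 1) % RING_SIZE;
    r->pending++;
    return true;
}

/* Acknowledge "done" processed descriptors by adjusting counters only;
 * no descriptor is touched. Returns how many are still pending. */
unsigned int ring_complete(struct ring *r, unsigned int done)
{
    assert(done <= r->pending);
    r->next_to_clean = (r->next_to_clean + done) % RING_SIZE;
    r->pending -= done;
    return r->pending;
}
```

In this model, completion is a counter update rather than a walk over descriptors to flip ownership bits, which is the contrast the message draws with the older SoCs.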
| * | | | | net: mvmdio: new Marvell MDIO driver  Thomas Petazzoni  2012-11-16
| |/ / / /
| | | | |     This patch adds a separate driver for the MDIO interface of the Marvell Ethernet controllers. There are two reasons to have a separate driver rather than including it inside the MAC driver itself:
| | | | |
| | | | |      *) The MDIO interface is shared by all Ethernet ports, so a driver must guarantee non-concurrent accesses to this MDIO interface. The most logical way is to have a separate driver that handles this single MDIO interface, used by all Ethernet ports.
| | | | |      *) The MDIO interface is the same between the existing mv643xx_eth driver and the new mvneta driver. Even though it is for now only used by the mvneta driver, it will in the future be used by the mv643xx_eth driver as well.
| | | | |
| | | | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| | | | |     Acked-by: David S. Miller <davem@davemloft.net>
* | | | | Merge tag 'marvell-sata-3.8' of git://github.com/MISL-EBU-System-SW/mainline-public into mvebu/everything  Jason Cooper  2012-11-21
|\ \ \ \ \
| | |_|_|/
| |/| | |
| | | | |     Marvell Armada 370/XP support for 3.8
| * | | | arm: mvebu: SATA support: board-level DT data for Armada 370/XP boards  Gregory CLEMENT  2012-11-20
| | | | |     Add the SATA device tree bindings for
| | | | |      - Armada XP evaluation board (DB-78460-BP)
| | | | |      - Armada 370 evaluation board (DB-88F6710-BP-DDR3)
| | | | |
| | | | |     Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
| | | | |     Signed-off-by: Lior Amsalem <alior@marvell.com>
| | | | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| * | | | arm: mvebu: SATA support: mvebu_defconfig update  Gregory CLEMENT  2012-11-20
| | | | |     Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
| | | | |     Signed-off-by: Lior Amsalem <alior@marvell.com>
| | | | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| * | | | arm: mvebu: SATA support: SoC-level DT data for Armada 370/XP  Gregory CLEMENT  2012-11-20
| | | | |     Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
| | | | |     Signed-off-by: Lior Amsalem <alior@marvell.com>
| | | | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| * | | | arm: mvebu: increase atomic coherent pool size for armada 370/XP  Gregory CLEMENT  2012-11-20
| | |_|/
| |/| |
| | | |     For Armada 370/XP we have the same problem as in commit cb01b63, so we apply the same solution:
| | | |
| | | |     "The default 256 KiB coherent pool may be too small for some of the Kirkwood devices, so increase it to make sure that devices will be able to allocate their buffers with GFP_ATOMIC flag"
| | | |
| | | |     Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
| | | |     Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
* | | | Merge tag 'marvell-mvebu-clk-3.8' of git://github.com/MISL-EBU-System-SW/mainline-public into mvebu/everything  Jason Cooper  2012-11-21
|\| | |
| |_|/
|/| |
| | |     Marvell MVEBU clk support, for 3.8
| * | ARM: Kirkwood: switch to DT clock providers  Andrew Lunn  2012-11-20
| | |     With true DT clock providers available, switch Kirkwood clock setup in DT-enabled boards. While AUXDATA can be removed completely from bus probing, some devices still don't know about DT. Therefore, some clkdev aliases are created until these devices also move to DT.
| | |
| | |     Signed-off-by: Andrew Lunn <andrew@lunn.ch>
| * | ARM: dove: switch to DT clock providers  Sebastian Hesselbarth  2012-11-20
| | |     With true DT clock providers available, switch Dove clock setup in DT-enabled boards. While AUXDATA can be removed completely from bus probing, some devices still don't know about DT at all. Therefore, some clock aliases are created until the devices also move to DT.
| | |
| | |     Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
| * | clocksource: convert time-armada-370-xp to clk framework  Gregory CLEMENT  2012-11-20
| | |     Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
| | |     Tested-by Gregory CLEMENT <gregory.clement@free-electrons.com>
| * | clk: armada-370-xp: add support for clock framework  Gregory CLEMENT  2012-11-20
| | |     Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
| | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| | |     Tested-by Gregory CLEMENT <gregory.clement@free-electrons.com>
| * | clk: mvebu: armada 370/XP add clock gating control provider for DT  Gregory CLEMENT  2012-11-20
| | |     Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
| | |     Signed-off-by: Andrew Lunn <andrew@lunn.ch>
| | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| * | clk: mvebu: add clock gating control provider for DT  Sebastian Hesselbarth  2012-11-20
| | |     This driver provides DT clocks for the clock gates found on Marvell Dove and Kirkwood SoCs. The clock gates are referenced by the phandle index of the corresponding bit in the clock gating control register to ease lookup in the datasheet.
| | |
| | |     Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
| * | clk: mvebu: add armada-370-xp CPU specific clocks  Gregory CLEMENT  2012-11-20
| | |     Add Armada 370/XP specific CPU clocks
| | |
| | |     Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
| | |     Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
| | |     Tested-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
| * | clk: mvebu: add mvebu core clocks.  Sebastian Hesselbarth  2012-11-20
|/ /
| |     This driver provides DT clocks for the core clocks found on Marvell Kirkwood, Dove & 370/XP SoCs. The core clock frequencies and ratios are determined by decoding the Sample-At-Reset registers.
| |
| |     Although technically correct, using a divider of 0 will lead to div_by_zero panic. Let's use a ratio of 0/1 instead to fail later with a zero clock.
| |
| |     Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
| |     Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
| |     Signed-off-by: Andrew Lunn <andrew@lunn.ch>
| |     Tested-by Gregory CLEMENT <gregory.clement@free-electrons.com>
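The "ratio of 0/1 instead of a divider of 0" choice in the core clocks message can be illustrated with a small user-space sketch. Everything here is an assumption made for illustration: `struct clk_ratio`, `decode_ratio()`, `apply_ratio()` and the table contents are hypothetical and do not reproduce the driver's real Sample-At-Reset decoding.

```c
#include <assert.h>
#include <stdint.h>

/* A clock ratio expressed as multiplier/divider. */
struct clk_ratio {
    uint32_t mult;
    uint32_t div;
};

/* Hypothetical decode table indexed by a strap value. Entry 2 models a
 * reserved/invalid setting: it is encoded as 0/1, never as x/0. */
struct clk_ratio decode_ratio(unsigned int strap)
{
    static const struct clk_ratio table[] = {
        { 1, 1 }, { 1, 2 }, { 0, 1 } /* invalid */, { 2, 3 },
    };

    assert(strap < sizeof(table) / sizeof(table[0]));
    return table[strap];
}

/* Derive a child rate. Because invalid entries carry div == 1, the
 * division is always defined and an invalid setting yields 0 Hz. */
uint64_t apply_ratio(uint64_t parent_hz, struct clk_ratio r)
{
    assert(r.div != 0);
    return parent_hz * r.mult / r.div;
}
```

With this encoding an invalid setting produces a zero-rate clock that a consumer can detect later, instead of a division by zero while the rate is being computed.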
* | Linux 3.7-rc6  Linus Torvalds  2012-11-16
| |
* | Merge git://git.kernel.org/pub/scm/virt/kvm/kvm  Linus Torvalds  2012-11-16
|\ \
| | |     Pull KVM fix from Marcelo Tosatti:
| | |      "A correction for oops on module init with older Intel hosts."
| | |
| | |     * git://git.kernel.org/pub/scm/virt/kvm/kvm:
| | |       KVM: x86: Fix invalid secondary exec controls in vmx_cpuid_update()
| * | KVM: x86: Fix invalid secondary exec controls in vmx_cpuid_update()  Takashi Iwai  2012-11-16
| | |     The commit [ad756a16: KVM: VMX: Implement PCID/INVPCID for guests with EPT] introduced the unconditional access to SECONDARY_VM_EXEC_CONTROL, and this triggers kernel warnings like below on old CPUs:
| | |
| | |       vmwrite error: reg 401e value a0568000 (err 12)
| | |       Pid: 13649, comm: qemu-kvm Not tainted 3.7.0-rc4-test2+ #154
| | |       Call Trace:
| | |        [<ffffffffa0558d86>] vmwrite_error+0x27/0x29 [kvm_intel]
| | |        [<ffffffffa054e8cb>] vmcs_writel+0x1b/0x20 [kvm_intel]
| | |        [<ffffffffa054f114>] vmx_cpuid_update+0x74/0x170 [kvm_intel]
| | |        [<ffffffffa03629b6>] kvm_vcpu_ioctl_set_cpuid2+0x76/0x90 [kvm]
| | |        [<ffffffffa0341c67>] kvm_arch_vcpu_ioctl+0xc37/0xed0 [kvm]
| | |        [<ffffffff81143f7c>] ? __vunmap+0x9c/0x110
| | |        [<ffffffffa0551489>] ? vmx_vcpu_load+0x39/0x1a0 [kvm_intel]
| | |        [<ffffffffa0340ee2>] ? kvm_arch_vcpu_load+0x52/0x1a0 [kvm]
| | |        [<ffffffffa032dcd4>] ? vcpu_load+0x74/0xd0 [kvm]
| | |        [<ffffffffa032deb0>] kvm_vcpu_ioctl+0x110/0x5e0 [kvm]
| | |        [<ffffffffa032e93d>] ? kvm_dev_ioctl+0x4d/0x4a0 [kvm]
| | |        [<ffffffff8117dc6f>] do_vfs_ioctl+0x8f/0x530
| | |        [<ffffffff81139d76>] ? remove_vma+0x56/0x60
| | |        [<ffffffff8113b708>] ? do_munmap+0x328/0x400
| | |        [<ffffffff81187c8c>] ? fget_light+0x4c/0x100
| | |        [<ffffffff8117e1a1>] sys_ioctl+0x91/0xb0
| | |        [<ffffffff815a942d>] system_call_fastpath+0x1a/0x1f
| | |
| | |     This patch adds a check for the availability of secondary exec control to avoid these warnings.
| | |
| | |     Cc: <stable@vger.kernel.org> [v3.6+]
| | |     Signed-off-by: Takashi Iwai <tiwai@suse.de>
| | |     Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
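The general pattern behind the KVM fix (check that an optional control field exists before touching it) might look roughly like the following user-space model. It is a sketch under assumptions: `struct model_cpu`, `model_write_secondary()` and `model_update_feature()` are hypothetical names, and the code does not reproduce the real VMCS accessors or the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a CPU's control state; not the real VMCS layout. */
struct model_cpu {
    bool has_secondary_ctrls;  /* does the optional field exist at all? */
    uint32_t secondary_ctrls;  /* only meaningful when the flag is set */
    unsigned int write_errors; /* counts writes to a missing field */
};

/* Stand-in for the low-level write: it fails when the field is absent,
 * which is the situation the warning in the message reports. */
void model_write_secondary(struct model_cpu *cpu, uint32_t value)
{
    if (!cpu->has_secondary_ctrls) {
        cpu->write_errors++;
        return;
    }
    cpu->secondary_ctrls = value;
}

/* Guarded update: skip the optional field on CPUs that lack it. */
void model_update_feature(struct model_cpu *cpu, uint32_t feature_bit, bool enable)
{
    uint32_t v;

    if (!cpu->has_secondary_ctrls)
        return; /* old CPU: nothing to update, nothing to warn about */

    v = cpu->secondary_ctrls;
    v = enable ? (v | feature_bit) : (v & ~feature_bit);
    model_write_secondary(cpu, v);
}
```

An unconditional version of `model_update_feature()` would bump `write_errors` on the old-CPU model, which mirrors the `vmwrite error` shown in the trace.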
* | | Merge branch 'akpm' (Fixes from Andrew)  Linus Torvalds  2012-11-16
|\ \ \
| | | |     Merge misc fixes from Andrew Morton.
| | | |
| | | |     * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (12 patches)
| | | |       revert "mm: fix-up zone present pages"
| | | |       tmpfs: change final i_blocks BUG to WARNING
| | | |       tmpfs: fix shmem_getpage_gfp() VM_BUG_ON
| | | |       mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address
| | | |       mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"
| | | |       rapidio: fix kernel-doc warnings
| | | |       swapfile: fix name leak in swapoff
| | | |       memcg: fix hotplugged memory zone oops
| | | |       mips, arc: fix build failure
| | | |       memcg: oom: fix totalpages calculation for memory.swappiness==0
| | | |       mm: fix build warning for uninitialized value
| | | |       mm: add anon_vma_lock to validate_mm()
| * | | revert "mm: fix-up zone present pages"  Andrew Morton  2012-11-16
| | | |     Revert commit 7f1290f2f2a4 ("mm: fix-up zone present pages")
| | | |
| | | |     That patch tried to fix an issue when calculating zone->present_pages, but it caused a regression on 32bit systems with HIGHMEM. With that change, reset_zone_present_pages() resets all zone->present_pages to zero, and fixup_zone_present_pages() is called to recalculate zone->present_pages when the boot allocator frees core memory pages into buddy allocator. Because highmem pages are not freed by bootmem allocator, all highmem zones' present_pages becomes zero.
| | | |
| | | |     Various options for improving the situation are being discussed but for now, let's return to the 3.6 code.
| | | |
| | | |     Cc: Jianguo Wu <wujianguo@huawei.com>
| | | |     Cc: Jiang Liu <jiang.liu@huawei.com>
| | | |     Cc: Petr Tesarik <ptesarik@suse.cz>
| | | |     Cc: "Luck, Tony" <tony.luck@intel.com>
| | | |     Cc: Mel Gorman <mel@csn.ul.ie>
| | | |     Cc: Yinghai Lu <yinghai@kernel.org>
| | | |     Cc: Minchan Kim <minchan.kim@gmail.com>
| | | |     Cc: Johannes Weiner <hannes@cmpxchg.org>
| | | |     Acked-by: David Rientjes <rientjes@google.com>
| | | |     Tested-by: Chris Clayton <chris2553@googlemail.com>
| | | |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | |     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | tmpfs: change final i_blocks BUG to WARNING  Hugh Dickins  2012-11-16
| | | |     Under a particular load on one machine, I have hit shmem_evict_inode()'s BUG_ON(inode->i_blocks), enough times to narrow it down to a particular race between swapout and eviction.
| | | |
| | | |     It comes from the "if (freed > 0)" asymmetry in shmem_recalc_inode(), and the lack of coherent locking between mapping's nrpages and shmem's swapped count. There's a window in shmem_writepage(), between lowering nrpages in shmem_delete_from_page_cache() and then raising swapped count, when the freed count appears to be +1 when it should be 0, and then the asymmetry stops it from being corrected with -1 before hitting the BUG.
| | | |
| | | |     One answer is coherent locking: using tree_lock throughout, without info->lock; reasonable, but the raw_spin_lock in percpu_counter_add() on used_blocks makes that messier than expected. Another answer may be a further effort to eliminate the weird shmem_recalc_inode() altogether, but previous attempts at that failed.
| | | |
| | | |     So far undecided, but for now change the BUG_ON to WARN_ON: in usual circumstances it remains a useful consistency check.
| | | |
| | | |     Signed-off-by: Hugh Dickins <hughd@google.com>
| | | |     Cc: <stable@vger.kernel.org>
| | | |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | |     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | tmpfs: fix shmem_getpage_gfp() VM_BUG_ON  Hugh Dickins  2012-11-16
| | | |     Fuzzing with trinity hit the "impossible" VM_BUG_ON(error) (which Fedora has converted to WARNING) in shmem_getpage_gfp():
| | | |
| | | |       WARNING: at mm/shmem.c:1151 shmem_getpage_gfp+0xa5c/0xa70()
| | | |       Pid: 29795, comm: trinity-child4 Not tainted 3.7.0-rc2+ #49
| | | |       Call Trace:
| | | |         warn_slowpath_common+0x7f/0xc0
| | | |         warn_slowpath_null+0x1a/0x20
| | | |         shmem_getpage_gfp+0xa5c/0xa70
| | | |         shmem_fault+0x4f/0xa0
| | | |         __do_fault+0x71/0x5c0
| | | |         handle_pte_fault+0x97/0xae0
| | | |         handle_mm_fault+0x289/0x350
| | | |         __do_page_fault+0x18e/0x530
| | | |         do_page_fault+0x2b/0x50
| | | |         page_fault+0x28/0x30
| | | |         tracesys+0xe1/0xe6
| | | |
| | | |     Thanks to Johannes for pointing to truncation: free_swap_and_cache() only does a trylock on the page, so the page lock we've held since before confirming swap is not enough to protect against truncation.
| | | |
| | | |     What cleanup is needed in this case? Just delete_from_swap_cache(), which takes care of the memcg uncharge.
| | | |
| | | |     Signed-off-by: Hugh Dickins <hughd@google.com>
| | | |     Reported-by: Dave Jones <davej@redhat.com>
| | | |     Cc: Johannes Weiner <hannes@cmpxchg.org>
| | | |     Cc: <stable@vger.kernel.org>
| | | |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | |     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | mm: highmem: don't treat PKMAP_ADDR(LAST_PKMAP) as a highmem address  Will Deacon  2012-11-16
| | | |     kmap_to_page returns the corresponding struct page for a virtual address of an arbitrary mapping. This works by checking whether the address falls in the pkmap region and using the pkmap page tables instead of the linear mapping if appropriate.
| | | |
| | | |     Unfortunately, the bounds checking means that PKMAP_ADDR(LAST_PKMAP) is incorrectly treated as a highmem address and we can end up walking off the end of pkmap_page_table and subsequently passing junk to pte_page.
| | | |
| | | |     This patch fixes the bound check to stay within the pkmap tables.
| | | |
| | | |     Signed-off-by: Will Deacon <will.deacon@arm.com>
| | | |     Cc: Mel Gorman <mgorman@suse.de>
| | | |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | |     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
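The off-by-one described in the highmem message comes down to an inclusive versus exclusive upper bound. The following user-space sketch assumes illustrative values for `PAGE_SHIFT`, `PKMAP_BASE` and `LAST_PKMAP`, and the helper names `in_pkmap_inclusive()`, `in_pkmap_exclusive()` and `pkmap_index()` are hypothetical; the real constants are architecture-specific and this is not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real ones depend on the architecture. */
#define PAGE_SHIFT 12
#define PKMAP_BASE 0xff800000UL
#define LAST_PKMAP 512
#define PKMAP_ADDR(nr) (PKMAP_BASE + ((unsigned long)(nr) << PAGE_SHIFT))

/* Inclusive upper bound: accepts PKMAP_ADDR(LAST_PKMAP), which is the
 * first address past the region, as the message describes. */
bool in_pkmap_inclusive(unsigned long addr)
{
    return addr >= PKMAP_ADDR(0) && addr <= PKMAP_ADDR(LAST_PKMAP);
}

/* Half-open range [PKMAP_ADDR(0), PKMAP_ADDR(LAST_PKMAP)): every accepted
 * address maps to a table index in 0..LAST_PKMAP-1. */
bool in_pkmap_exclusive(unsigned long addr)
{
    return addr >= PKMAP_ADDR(0) && addr < PKMAP_ADDR(LAST_PKMAP);
}

/* Index into a table with LAST_PKMAP entries for an address in the region. */
unsigned long pkmap_index(unsigned long addr)
{
    return (addr - PKMAP_BASE) >> PAGE_SHIFT;
}
```

For `PKMAP_ADDR(LAST_PKMAP)` the computed index equals `LAST_PKMAP`, one past the last valid entry, so only the half-open check keeps the lookup inside the table.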
| * | | mm: revert "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures"  Mel Gorman  2012-11-16
| | | |     Jiri Slaby reported the following:
| | | |
| | | |       (It's an effective revert of "mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures".) Given kswapd had hours of runtime in ps/top output yesterday in the morning and after the revert it's now 2 minutes in sum for the last 24h, I would say, it's gone.
| | | |
| | | |     The intention of the patch in question was to compensate for the loss of lumpy reclaim. Part of the reason lumpy reclaim worked is because it aggressively reclaimed pages and this patch was meant to be a sane compromise.
| | | |
| | | |     When compaction fails, it gets deferred, and both compaction and reclaim/compaction are deferred to avoid excessive reclaim. However, since commit c654345924f7 ("mm: remove __GFP_NO_KSWAPD"), kswapd is woken up each time and continues reclaiming which was not taken into account when the patch was developed.
| | | |
| | | |     Attempts to address the problem ended up just changing the shape of the problem instead of fixing it. The release window gets closer and while a THP allocation failing is not a major problem, kswapd chewing up a lot of CPU is.
| | | |
| | | |     This patch reverts commit 83fde0f22872 ("mm: vmscan: scale number of pages reclaimed by reclaim/compaction based on failures") and will be revisited in the future.
| | | |
| | | |     Signed-off-by: Mel Gorman <mgorman@suse.de>
| | | |     Cc: Zdenek Kabelac <zkabelac@redhat.com>
| | | |     Tested-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
| | | |     Cc: Jiri Slaby <jirislaby@gmail.com>
| | | |     Cc: Rik van Riel <riel@redhat.com>
| | | |     Cc: Jiri Slaby <jslaby@suse.cz>
| | | |     Cc: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>
| | | |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | |     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | rapidio: fix kernel-doc warnings  Randy Dunlap  2012-11-16
| | | |     Fix rapidio kernel-doc warnings:
| | | |
| | | |       Warning(drivers/rapidio/rio.c:415): No description found for parameter 'local'
| | | |       Warning(drivers/rapidio/rio.c:415): Excess function parameter 'lstart' description in 'rio_map_inb_region'
| | | |       Warning(include/linux/rio.h:290): No description found for parameter 'switches'
| | | |       Warning(include/linux/rio.h:290): No description found for parameter 'destid_table'
| | | |
| | | |     Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
| | | |     Cc: Matt Porter <mporter@kernel.crashing.org>
| | | |     Acked-by: Alexandre Bounine <alexandre.bounine@idt.com>
| | | |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | |     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | swapfile: fix name leak in swapoff  Xiaotian Feng  2012-11-16
| | | |     There's a name leak introduced by commit 91a27b2a7567 ("vfs: define struct filename and have getname() return it"). Add the missing putname.
| | | |
| | | |     [akpm@linux-foundation.org: cleanup]
| | | |     Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
| | | |     Reviewed-by: Jeff Layton <jlayton@redhat.com>
| | | |     Cc: Al Viro <viro@zeniv.linux.org.uk>
| | | |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | |     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | memcg: fix hotplugged memory zone oops  Hugh Dickins  2012-11-16
| | | |     When MEMCG is configured on (even when it's disabled by boot option), when adding or removing a page to/from its lru list, the zone pointer used for stats updates is nowadays taken from the struct lruvec. (On many configurations, calculating zone from page is slower.)
| | | |
| | | |     But we have no code to update all the lruvecs (per zone, per memcg) when a memory node is hotadded. Here's an extract from the oops which results when running numactl to bind a program to a newly onlined node:
| | | |
| | | |       BUG: unable to handle kernel NULL pointer dereference at 0000000000000f60
| | | |       IP: __mod_zone_page_state+0x9/0x60
| | | |       Pid: 1219, comm: numactl Not tainted 3.6.0-rc5+ #180 Bochs Bochs
| | | |       Process numactl (pid: 1219, threadinfo ffff880039abc000, task ffff8800383c4ce0)
| | | |       Call Trace:
| | | |         __pagevec_lru_add_fn+0xdf/0x140
| | | |         pagevec_lru_move_fn+0xb1/0x100
| | | |         __pagevec_lru_add+0x1c/0x30
| | | |         lru_add_drain_cpu+0xa3/0x130
| | | |         lru_add_drain+0x2f/0x40
| | | |         ...
| | | |
| | | |     The natural solution might be to use a memcg callback whenever memory is hotadded; but that solution has not been scoped out, and it happens that we do have an easy location at which to update lruvec->zone. The lruvec pointer is discovered either by mem_cgroup_zone_lruvec() or by mem_cgroup_page_lruvec(), and both of those do know the right zone.
| | | |
| | | |     So check and set lruvec->zone in those; and remove the inadequate attempt to set lruvec->zone from lruvec_init(), which is called before NODE_DATA(node) has been allocated in such cases.
| | | |
| | | |     Ah, there was one exception. For no particularly good reason, mem_cgroup_force_empty_list() has its own code for deciding lruvec. Change it to use the standard mem_cgroup_zone_lruvec() and mem_cgroup_get_lru_size() too. In fact it was already safe against such an oops (the lru lists in danger could only be empty), but we're better proofed against future changes this way.
| | | |
| | | |     I've marked this for stable (3.6) since we introduced the problem in 3.5 (now closed to stable); but I have no idea if this is the only fix needed to get memory hotadd working with memcg in 3.6, and received no answer when I enquired twice before.
| | | |
| | | |     Reported-by: Tang Chen <tangchen@cn.fujitsu.com>
| | | |     Signed-off-by: Hugh Dickins <hughd@google.com>
| | | |     Acked-by: Johannes Weiner <hannes@cmpxchg.org>
| | | |     Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
| | | |     Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
| | | |     Cc: Wen Congyang <wency@cn.fujitsu.com>
| | | |     Cc: <stable@vger.kernel.org>
| | | |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | |     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | mips, arc: fix build failure  David Rientjes  2012-11-16
| | | |     Using a cross-compiler to fix another issue, the following build error occurred for mips defconfig:
| | | |
| | | |       arch/mips/fw/arc/misc.c: In function 'ArcHalt':
| | | |       arch/mips/fw/arc/misc.c:25:2: error: implicit declaration of function 'local_irq_disable'
| | | |
| | | |     Fix it up by including irqflags.h.
| | | |
| | | |     Signed-off-by: David Rientjes <rientjes@google.com>
| | | |     Cc: Ralf Baechle <ralf@linux-mips.org>
| | | |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | |     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | memcg: oom: fix totalpages calculation for memory.swappiness==0  Michal Hocko  2012-11-16
| | | |     oom_badness() takes a totalpages argument which says how many pages are available and it uses it as a base for the score calculation. The value is calculated by mem_cgroup_get_limit which considers both limit and total_swap_pages (resp. memsw portion of it).
| | | |
| | | |     This is usually correct but since fe35004fbf9e ("mm: avoid swapping out with swappiness==0") we do not swap when swappiness is 0 which means that we cannot really use up all the totalpages pages. This in turn confuses oom score calculation if the memcg limit is much smaller than the available swap because the used memory (capped by the limit) is negligible compared to totalpages so the resulting score is too small if adj!=0 (typically task with CAP_SYS_ADMIN or non zero oom_score_adj). A wrong process might be selected as a result.
| | | |
| | | |     The problem can be worked around by checking mem_cgroup_swappiness==0 and not considering swap at all in such a case.
| | | |
| | | |     Signed-off-by: Michal Hocko <mhocko@suse.cz>
| | | |     Acked-by: David Rientjes <rientjes@google.com>
| | | |     Acked-by: Johannes Weiner <hannes@cmpxchg.org>
| | | |     Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
| | | |     Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
| | | |     Cc: <stable@vger.kernel.org>
| | | |     Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
| | | |     Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
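The workaround in the memcg OOM message (do not count swap toward the usable total when swappiness is 0) can be sketched as a pure function. This is a simplified model under assumptions: `oom_total_pages()` and its parameters are hypothetical, all quantities are in pages, and the capping by the memory+swap limit is an assumption about how the limits combine rather than a copy of mem_cgroup_get_limit().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the "how many pages could this group use" value
 * that feeds the OOM score. All arguments are in pages. */
uint64_t oom_total_pages(uint64_t mem_limit, uint64_t memsw_limit,
                         uint64_t total_swap, unsigned int swappiness)
{
    uint64_t limit = mem_limit;

    /* With swappiness == 0 nothing is swapped out, so swap space is
     * not usable capacity and must not inflate the total. */
    if (swappiness == 0)
        return limit;

    /* Otherwise swap counts too, capped by the memory+swap limit
     * (assumed here to bound the combined usage). */
    limit += total_swap;
    return limit < memsw_limit ? limit : memsw_limit;
}
```

For a group limited to 1000 pages on a machine with 50000 pages of swap, the model returns 1000 when swappiness is 0 and 51000 otherwise (given a large enough memory+swap limit), so the group's actual usage is no longer negligible next to the total.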