aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAge
* rps: immediate send IPI in process_backlog()Eric Dumazet2010-04-22
| | | | | | | | | | | | | | | | If some skb are queued to our backlog, we are delaying IPI sending at the end of net_rx_action(), increasing latencies. This defeats the queueing, since we want to quickly dispatch packets to the pool of worker cpus, then eventually deeply process our packets. It's better to send IPI before processing our packets in upper layers, from process_backlog(). Change the _and_disable_irq suffix to _and_enable_irq(), since we enable local irq in net_rps_action(), sorry for the confusion. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Make unnecessarily global functions staticRoland Dreier2010-04-22
| | | | | | | | Also put t4_write_indirect() inside "#if 0" to avoid a "defined but not used" compile warning. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* cxgb4: Use ntohs() on __be16 value instead of htons()Roland Dreier2010-04-22
| | | | | | | | | Use the correct direction of byte-swapping function to fix a mistake shown by sparse endianness checking -- c.fl0id is __be16. Signed-off-by: Roland Dreier <rolandd@cisco.com> Acked-by: Dimitris Michailidis <dm@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ethernet: print protocol in host byte orderJohannes Berg2010-04-22
| | | | | | | | | | | Eric's recent patch added __force, but this place would seem to require actually doing a byte order conversion so the printk is consistent across architectures. Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Introduce skb_orphan_try()Eric Dumazet2010-04-22
| | | | | | | | At this point, skb->destructor is not the original one (stored in DEV_GSO_CB(skb)->destructor) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ehea: fix possible DLPAR/mem deadlockThomas Klein2010-04-22
| | | | | | | Force serialization of userspace-triggered DLPAR/mem operations Signed-off-by: Thomas Klein <tklein@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ehea: error handling improvementThomas Klein2010-04-22
| | | | | | | Reset a port's resources only if they're actually in an error state Signed-off-by: Thomas Klein <tklein@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* ks8842: Add platform data for setting mac addressRichard Röjfors2010-04-21
| | | | | | | | | | | | | This patch adds platform data to the ks8842 driver. Via the platform data a MAC address, to be used by the controller, can be passed. To ensure this MAC address is used, the MAC address is written after each hardware reset. Signed-off-by: Richard Röjfors <richard.rojfors@pelagicore.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* net: small cleanup of lib8390Nikanth Karthikesan2010-04-21
| | | | | | | | Remove the always true #if 1. Also the unecessary re-test of ei_local->irqlock and the unreachable printk format string. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
* fasync: RCU and fine grained lockingEric Dumazet2010-04-21
| | | | | | | | | | | | | | | | | | | | | kill_fasync() uses a central rwlock, candidate for RCU conversion, to avoid cache line ping pongs on SMP. fasync_remove_entry() and fasync_add_entry() can disable IRQS on a short section instead during whole list scan. Use a spinlock per fasync_struct to synchronize kill_fasync_rcu() and fasync_{remove|add}_entry(). This spinlock is IRQ safe, so sock_fasync() doesnt need its own implementation and can use fasync_helper(), to reduce code size and complexity. We can remove __kill_fasync() direct use in net/socket.c, and rename it to kill_fasync_rcu(). Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: Mark v6 response packets as CHECKSUM_PARTIALDavid S. Miller2010-04-21
| | | | | | | | Otherwise we only get the checksum right for data-less TCP responses. Noticed by Herbert Xu. Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: Fix ipv6 checksumming on response packets for real.David S. Miller2010-04-21
| | | | | | | | | | | | | | | | | Commit 6651ffc8e8bdd5fb4b7d1867c6cfebb4f309512c ("ipv6: Fix tcp_v6_send_response transport header setting.") fixed one half of why ipv6 tcp response checksums were invalid, but it's not the whole story. If we're going to use CHECKSUM_PARTIAL for these things (which we are since commit 2e8e18ef52e7dd1af0a3bd1f7d990a1d0b249586 "tcp: Set CHECKSUM_UNNECESSARY in tcp_init_nondata_skb"), we can't be setting buff->csum as we always have been here in tcp_v6_send_response. We need to leave it at zero. Kill that line and checksums are good again. Signed-off-by: David S. Miller <davem@davemloft.net>
* Merge branch 'master' of ↵David S. Miller2010-04-21
|\ | | | | | | | | | | | | | | master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl-6000.c net/core/dev.c
| * net: Fix an RCU warning in dev_pick_tx()David Howells2010-04-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix the following RCU warning in dev_pick_tx(): =================================================== [ INFO: suspicious rcu_dereference_check() usage. ] --------------------------------------------------- net/core/dev.c:1993 invoked rcu_dereference_check() without protection! other info that might help us debug this: rcu_scheduler_active = 1, debug_locks = 0 2 locks held by swapper/0: #0: (&idev->mc_ifc_timer){+.-...}, at: [<ffffffff81039e65>] run_timer_softirq+0x17b/0x278 #1: (rcu_read_lock_bh){.+....}, at: [<ffffffff812ea3eb>] dev_queue_xmit+0x14e/0x4dc stack backtrace: Pid: 0, comm: swapper Not tainted 2.6.34-rc5-cachefs #4 Call Trace: <IRQ> [<ffffffff810516c4>] lockdep_rcu_dereference+0xaa/0xb2 [<ffffffff812ea4f6>] dev_queue_xmit+0x259/0x4dc [<ffffffff812ea3eb>] ? dev_queue_xmit+0x14e/0x4dc [<ffffffff81052324>] ? trace_hardirqs_on+0xd/0xf [<ffffffff81035362>] ? local_bh_enable_ip+0xbc/0xc1 [<ffffffff812f0954>] neigh_resolve_output+0x24b/0x27c [<ffffffff8134f673>] ip6_output_finish+0x7c/0xb4 [<ffffffff81350c34>] ip6_output2+0x256/0x261 [<ffffffff81052324>] ? trace_hardirqs_on+0xd/0xf [<ffffffff813517fb>] ip6_output+0xbbc/0xbcb [<ffffffff8135bc5d>] ? fib6_force_start_gc+0x2b/0x2d [<ffffffff81368acb>] mld_sendpack+0x273/0x39d [<ffffffff81368858>] ? mld_sendpack+0x0/0x39d [<ffffffff81052099>] ? mark_held_locks+0x52/0x70 [<ffffffff813692fc>] mld_ifc_timer_expire+0x24f/0x288 [<ffffffff81039ed6>] run_timer_softirq+0x1ec/0x278 [<ffffffff81039e65>] ? run_timer_softirq+0x17b/0x278 [<ffffffff813690ad>] ? mld_ifc_timer_expire+0x0/0x288 [<ffffffff81035531>] ? __do_softirq+0x69/0x140 [<ffffffff8103556a>] __do_softirq+0xa2/0x140 [<ffffffff81002e0c>] call_softirq+0x1c/0x28 [<ffffffff81004b54>] do_softirq+0x38/0x80 [<ffffffff81034f06>] irq_exit+0x45/0x47 [<ffffffff810177c3>] smp_apic_timer_interrupt+0x88/0x96 [<ffffffff810028d3>] apic_timer_interrupt+0x13/0x20 <EOI> [<ffffffff810488dd>] ? __atomic_notifier_call_chain+0x0/0x86 [<ffffffff810096bf>] ? mwait_idle+0x6e/0x78 [<ffffffff810096b6>] ? mwait_idle+0x65/0x78 [<ffffffff810011cb>] cpu_idle+0x4d/0x83 [<ffffffff81380b05>] rest_init+0xb9/0xc0 [<ffffffff81380a4c>] ? rest_init+0x0/0xc0 [<ffffffff8168dcf0>] start_kernel+0x392/0x39d [<ffffffff8168d2a3>] x86_64_start_reservations+0xb3/0xb7 [<ffffffff8168d38b>] x86_64_start_kernel+0xe4/0xeb An rcu_dereference() should be an rcu_dereference_bh(). Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller2010-04-21
| |\
| | * Merge branch 'for_linus' of ↵Linus Torvalds2010-04-20
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: quota: Convert __DQUOT_PARANOIA symbol to standard config option
| | | * quota: Convert __DQUOT_PARANOIA symbol to standard config optionJan Kara2010-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make __DQUOT_PARANOIA define from the old days a standard config option and turn it off by default. This gets rid of a quota warning about writes before quota is turned on for systems with ext4 root filesystem. Currently there's no way to legally solve this because /etc/mtab has to be written before quota is turned on on most systems. Signed-off-by: Jan Kara <jack@suse.cz>
| | * | Merge branch 'urgent' of ↵Linus Torvalds2010-04-20
| | |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6 * 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6: pcmcia: fix error handling in cm4000_cs.c drivers/pcmcia: Add missing local_irq_restore serial_cs: MD55x support (PCMCIA GPRS/EDGE modem) (kernel 2.6.33) pcmcia: avoid late calls to pccard_validate_cis pcmcia: fix ioport size calculation in rsrc_nonstatic pcmcia: re-start on MFC override pcmcia: fix io_probe due to parent (PCI) resources pcmcia: use previously assigned IRQ for all card functions
| | | * | pcmcia: fix error handling in cm4000_cs.cDan Carpenter2010-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the original code we used -ENODEV as the number of bytes to copy_to_user() and we didn't release the locks. Signed-off-by: Dan Carpenter <error27@gmail.com> Acked-by: Harald Welte <laforge@gnumonks.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | | * | drivers/pcmcia: Add missing local_irq_restoreJulia Lawall2010-04-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use local_irq_restore in this error-handling case just like in the one just below. A simplified version of the semantic patch that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r exists@ expression E1; identifier f; @@ f (...) { <+... * local_irq_save (E1,...); ... when != E1 * return ...; ...+> } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | | * | serial_cs: MD55x support (PCMCIA GPRS/EDGE modem) (kernel 2.6.33)Timur Maximov2010-04-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Many PCMCIA GPRS modems like: Onda Edge N100E, Novaway PC98 (OEM SPC98Z), Rovermate Edgus Adaptmate-039 and others have same construction and identification: lspcmcia -vvv Product Name: Generic Modem: MD55x 1.00 Serial number: xxxxx-xxx Identification: manf_id: 0x015d card_id: 0x4c45 function: 2 (serial) prod_id(1): "Generic" (0xc49e4731) prod_id(2): "Modem: MD55x" (0x8913b110) prod_id(3): "1.00" (0x83dbf271) prod_id(4): "Serial number: xxxxx-xxx" (0x73ee9514) Serial connection to GSM module based on Elan VPU16551 PCMCIA UART with datasheet recommeded 14.7456MHz crystal oscillator. By default serial_cs set UART clock == 1843200 Hz For correct work need set clock 14745600 Hz. This quirk already present in driver, only need add device in quirk list. Signed-off-by: Timur Maximov <xcom.org@gmail.com> Acked-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | | * | pcmcia: avoid late calls to pccard_validate_cisDominik Brodowski2010-04-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | pccard_validate_cis() nowadays destroys the CIS cache. Therefore, calling it after card setup should be avoided. We can't control the deprecated PCMCIA ioctl (which is only used on ARM nowadays), but we can avoid -- and report -- any other calls. Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | | * | pcmcia: fix ioport size calculation in rsrc_nonstaticDominik Brodowski2010-04-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Size needs to be calculated after manipulating with the start value. Reported-by: Komuro <komurojun-mbn@nifty.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | | * | pcmcia: re-start on MFC overrideDominik Brodowski2010-04-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there are changes to the number of socket devices, we need to start over in all cases: else pcmcia_request_configuration() might get confused. Reported-by: Alexander Kurz <linux@kbdbabel.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | | * | pcmcia: fix io_probe due to parent (PCI) resourcesDominik Brodowski2010-04-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to commit 7a96e87d, we need to be aware of any parent PCI device when requesting IO regions, even only for testing ("probing"). Reported-by: Komuro <komurojun-mbn@nifty.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | | * | pcmcia: use previously assigned IRQ for all card functionsDominik Brodowski2010-04-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use a previously assigned IRQ for all card functions, not only if CONFIG_PCMCIA_PROBE is set. Reported-by: Alexander Kurz <linux@kbdbabel.org> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
| | * | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6Linus Torvalds2010-04-20
| | |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-2.6: sparc64: Fix hardirq tracing in trap return path. sparc64: Use correct pt_regs in decode_access_size() error paths. sparc64: Fix PREEMPT_ACTIVE value. sparc64: Run NMIs on the hardirq stack. sparc64: Allocate sufficient stack space in ftrace stubs. sparc: Fix forgotten kmemleak headers inclusion
| | | * | | sparc64: Fix hardirq tracing in trap return path.David S. Miller2010-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We can overflow the hardirq stack if we set the %pil here so early, just let the normal control flow do it. This is fine as we are allowed to do the actual IRQ enable at any point after we call trace_hardirqs_on. Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * | | sparc64: Use correct pt_regs in decode_access_size() error paths.David S. Miller2010-04-19
| | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * | | sparc64: Fix PREEMPT_ACTIVE value.David S. Miller2010-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It currently overlaps the NMI bit. Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * | | sparc64: Run NMIs on the hardirq stack.David S. Miller2010-04-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Otherwise we can overflow the main stack with the function tracer enabled. Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * | | sparc64: Allocate sufficient stack space in ftrace stubs.David S. Miller2010-04-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 128 bytes is sufficient for the register window save area, but the calling conventions allow the callee to save up to 6 incoming argument registers into the stack frame after the register window save area. This means a minimal stack frame is 176 bytes (128 + (6 * 8)). This fixes random crashes when using the function tracer. Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | | * | | sparc: Fix forgotten kmemleak headers inclusionFrederic Weisbecker2010-04-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix forgotten kmemleak headers inclusion for kmemleak_not_leak() declaration. This fixes the following build error: arch/sparc/kernel/irq_64.c: In function ‘sun4v_build_virq’: arch/sparc/kernel/irq_64.c:657: error: implicit declaration of function ‘kmemleak_not_leak’ Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | | | Merge branch 'perf-fixes-for-linus' of ↵Linus Torvalds2010-04-20
| | |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf: Fix unsafe frame rewinding with hot regs fetching
| | | * | | | perf: Fix unsafe frame rewinding with hot regs fetchingFrederic Weisbecker2010-04-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we fetch the hot regs and rewind to the nth caller, it might happen that we dereference a frame pointer outside the kernel stack boundaries, like in this example: perf_trace_sched_switch+0xd5/0x120 schedule+0x6b5/0x860 retint_careful+0xd/0x21 Since we directly dereference a userspace frame pointer here while rewinding behind retint_careful, this may end up in a crash. Fix this by simply using probe_kernel_address() when we rewind the frame pointer. This issue will have a much more proper fix in the next version of the perf_arch_fetch_caller_regs() API that will only need to rewind to the first caller. Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Paul Mackerras <paulus@samba.org> Cc: David Miller <davem@davemloft.net> Cc: Archs <linux-arch@vger.kernel.org>
| | * | | | | Merge branch 'drm-linus' of ↵Linus Torvalds2010-04-20
| | |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6 * 'drm-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: drm: delay vblank cleanup until after driver unload
| | | * | | | | drm: delay vblank cleanup until after driver unloadJesse Barnes2010-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Drivers may use vblank calls now (e.g. drm_vblank_off) in their unload paths, so don't clean up the vblank related structures until after driver unload. Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Reviewed-by: Kristian Høgsberg <krh@bitplanet.net> Signed-off-by: Dave Airlie <airlied@redhat.com>
| | * | | | | | x86: correctly wire up the newuname system callChristoph Hellwig2010-04-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before commit e28cbf22933d0c0ccaf3c4c27a1a263b41f73859 ("improve sys_newuname() for compat architectures") 64-bit x86 had a private implementation of sys_uname which was just called sys_uname, which other architectures used for the old uname. Due to some merge issues with the uname refactoring patches we ended up calling the old uname version for both the old and new system call slots, which lead to the domainname filed never be set which caused failures with libnss_nis. Reported-and-tested-by: Andy Isaacson <adi@hexapodia.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| | * | | | | | Linux 2.6.34-rc5v2.6.34-rc5Linus Torvalds2010-04-19
| | | | | | | |
| | * | | | | | rmap: add exclusively owned pages to the newest anon_vmaRik van Riel2010-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The recent anon_vma fixes cause many anonymous pages to end up in the parent process anon_vma, even when the page is exclusively owned by the current process. Adding exclusively owned anonymous pages to the top anon_vma reduces rmap scanning overhead, especially in workloads with forking servers. This patch adds a parameter to __page_set_anon_rmap that can be used to indicate whether or not the added page is exclusively owned by the current process. Pages added through page_add_new_anon_rmap are exclusively owned by the current process, and can be added to the top anon_vma. Pages added through page_add_anon_rmap can be either shared or exclusively owned, so we do the conservative thing and add it to the oldest anon_vma. A next step would be to add the exclusive parameter to page_add_anon_rmap, to be used from functions where we do know for sure whether a page is exclusively owned. Signed-off-by: Rik van Riel <riel@redhat.com> Reviewed-by: Johannes Weiner <hannes@cmpxchg.org> Lightly-tested-by: Borislav Petkov <bp@alien8.de> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> [ Edited to look nicer - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| | * | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2010-04-19
| | |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ecryptfs/ecryptfs-2.6: eCryptfs: Turn lower lookup error messages into debug messages eCryptfs: Copy lower directory inode times and size on link ecryptfs: fix use with tmpfs by removing d_drop from ecryptfs_destroy_inode ecryptfs: fix error code for missing xattrs in lower fs eCryptfs: Decrypt symlink target for stat size eCryptfs: Strip metadata in xattr flag in encrypted view eCryptfs: Clear buffer before reading in metadata xattr eCryptfs: Rename ecryptfs_crypt_stat.num_header_bytes_at_front eCryptfs: Fix metadata in xattr feature regression
| | | * | | | | | eCryptfs: Turn lower lookup error messages into debug messagesTyler Hicks2010-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Vaugue warnings about ENAMETOOLONG errors when looking up an encrypted file name have caused many users to become concerned about their data. Since this is a rather harmless condition, I'm moving this warning to only be printed when the ecryptfs_verbosity module param is 1. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| | | * | | | | | eCryptfs: Copy lower directory inode times and size on linkTyler Hicks2010-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The timestamps and size of a lower inode involved in a link() call was being copied to the upper parent inode. Instead, we should be copying lower parent inode's timestamps and size to the upper parent inode. I discovered this bug using the POSIX test suite at Tuxera. Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| | | * | | | | | ecryptfs: fix use with tmpfs by removing d_drop from ecryptfs_destroy_inodeJeff Mahoney2010-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since tmpfs has no persistent storage, it pins all its dentries in memory so they have d_count=1 when other file systems would have d_count=0. ->lookup is only used to create new dentries. If the caller doesn't instantiate it, it's freed immediately at dput(). ->readdir reads directly from the dcache and depends on the dentries being hashed. When an ecryptfs mount is mounted, it associates the lower file and dentry with the ecryptfs files as they're accessed. When it's umounted and destroys all the in-memory ecryptfs inodes, it fput's the lower_files and d_drop's the lower_dentries. Commit 4981e081 added this and a d_delete in 2008 and several months later commit caeeeecf removed the d_delete. I believe the d_drop() needs to be removed as well. The d_drop effectively hides any file that has been accessed via ecryptfs from the underlying tmpfs since it depends on it being hashed for it to be accessible. I've removed the d_drop on my development node and see no ill effects with basic testing on both tmpfs and persistent storage. As a side effect, after ecryptfs d_drops the dentries on tmpfs, tmpfs BUGs on umount. This is due to the dentries being unhashed. tmpfs->kill_sb is kill_litter_super which calls d_genocide to drop the reference pinning the dentry. It skips unhashed and negative dentries, but shrink_dcache_for_umount_subtree doesn't. Since those dentries still have an elevated d_count, we get a BUG(). This patch removes the d_drop call and fixes both issues. This issue was reported at: https://bugzilla.novell.com/show_bug.cgi?id=567887 Reported-by: Árpád Bíró <biroa@demasz.hu> Signed-off-by: Jeff Mahoney <jeffm@suse.com> Cc: Dustin Kirkland <kirkland@canonical.com> Cc: stable@kernel.org Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| | | * | | | | | ecryptfs: fix error code for missing xattrs in lower fsChristian Pulvermacher2010-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the lower file system driver has extended attributes disabled, ecryptfs' own access functions return -ENOSYS instead of -EOPNOTSUPP. This breaks execution of programs in the ecryptfs mount, since the kernel expects the latter error when checking for security capabilities in xattrs. Signed-off-by: Christian Pulvermacher <pulvermacher@gmx.de> Cc: stable@kernel.org Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| | | * | | | | | eCryptfs: Decrypt symlink target for stat sizeTyler Hicks2010-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Create a getattr handler for eCryptfs symlinks that is capable of reading the lower target and decrypting its path. Prior to this patch, a stat's st_size field would represent the strlen of the encrypted path, while readlink() would return the strlen of the decrypted path. This could lead to confusion in some userspace applications, since the two values should be equal. https://bugs.launchpad.net/bugs/524919 Reported-by: Loïc Minier <loic.minier@canonical.com> Cc: stable@kernel.org Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| | | * | | | | | eCryptfs: Strip metadata in xattr flag in encrypted viewTyler Hicks2010-03-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ecryptfs_encrypted_view mount option provides a unified way of viewing encrypted eCryptfs files. If the metadata is stored in a xattr, the metadata is moved to the file header when the file is read inside the eCryptfs mount. Because of this, we should strip the ECRYPTFS_METADATA_IN_XATTR flag from the header's flag section. This allows eCryptfs to treat the file as an eCryptfs file with a header at the front. Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| | | * | | | | | eCryptfs: Clear buffer before reading in metadata xattrTyler Hicks2010-03-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We initially read in the first PAGE_CACHE_SIZE of a file to if the eCryptfs header marker can be found. If it isn't found and ecryptfs_xattr_metadata was given as a mount option, then the user.ecryptfs xattr is read into the same buffer. Since the data from the first page of the file wasn't cleared, it is possible that we think we've found a second tag 3 or tag 1 packet and then error out after the packet contents aren't as expected. This patch clears the buffer before filling it with metadata from the user.ecryptfs xattr. Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| | | * | | | | | eCryptfs: Rename ecryptfs_crypt_stat.num_header_bytes_at_frontTyler Hicks2010-03-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch renames the num_header_bytes_at_front variable to metadata_size since it now contains the max size of the metadata. Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>
| | | * | | | | | eCryptfs: Fix metadata in xattr feature regressionTyler Hicks2010-03-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes regression in 8faece5f906725c10e7a1f6caf84452abadbdc7b When using the ecryptfs_xattr_metadata mount option, eCryptfs stores the metadata (normally stored at the front of the file) in the user.ecryptfs xattr. This causes ecryptfs_crypt_stat.num_header_bytes_at_front to be 0, since there is no header data at the front of the file. This results in too much memory being requested and ENOMEM being returned from ecryptfs_write_metadata(). This patch fixes the problem by using the num_header_bytes_at_front variable for specifying the max size of the metadata, despite whether it is stored in the header or xattr. Reviewed-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Tyler Hicks <tyhicks@linux.vnet.ibm.com>