aboutsummaryrefslogtreecommitdiffstats
path: root/include
Commit message (Collapse)AuthorAge
* netfilter: xtables: give xt_ecn its own nameJan Engelhardt2011-12-27
| | | | | | | | Use the new macro and struct names in xt_ecn.h, and put the old definitions into a definition-forwarding ipt_ecn.h. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* netfilter: xtables: move ipt_ecn to xt_ecnJan Engelhardt2011-12-27
| | | | | | | | | Prepare the ECN match for augmentation by an IPv6 counterpart. Since no symbol dependencies to ipv6.ko are added, having a single ecn match module is the more so welcome. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* Merge branch 'nf-next' of git://1984.lsi.us.es/net-nextDavid S. Miller2011-12-25
|\
| * netfilter: xtables: add nfacct match to support extended accountingPablo Neira Ayuso2011-12-24
| | | | | | | | | | | | | | | | | | | | This patch adds the match that allows to perform extended accounting. It requires the new nfnetlink_acct infrastructure. # iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic # iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: add extended accounting infrastructure over nfnetlinkPablo Neira Ayuso2011-12-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently have two ways to account traffic in netfilter: - iptables chain and rule counters: # iptables -L -n -v Chain INPUT (policy DROP 3 packets, 867 bytes) pkts bytes target prot opt in out source destination 8 1104 ACCEPT all -- lo * 0.0.0.0/0 0.0.0.0/0 - use flow-based accounting provided by ctnetlink: # conntrack -L tcp 6 431999 ESTABLISHED src=192.168.1.130 dst=212.106.219.168 sport=58152 dport=80 packets=47 bytes=7654 src=212.106.219.168 dst=192.168.1.130 sport=80 dport=58152 packets=49 bytes=66340 [ASSURED] mark=0 use=1 While trying to display real-time accounting statistics, we require to pool the kernel periodically to obtain this information. This is OK if the number of flows is relatively low. However, in case that the number of flows is huge, we can spend a considerable amount of cycles to iterate over the list of flows that have been obtained. Moreover, if we want to obtain the sum of the flow accounting results that match some criteria, we have to iterate over the whole list of existing flows, look for matchings and update the counters. This patch adds the extended accounting infrastructure for nfnetlink which aims to allow displaying real-time traffic accounting without the need of complicated and resource-consuming implementation in user-space. Basically, this new infrastructure allows you to create accounting objects. One accounting object is composed of packet and byte counters. In order to manipulate create accounting objects, you require the new libnetfilter_acct library. It contains several examples of use: libnetfilter_acct/examples# ./nfacct-add http-traffic libnetfilter_acct/examples# ./nfacct-get http-traffic = { pkts = 000000000000, bytes = 000000000000 }; Then, you can use one of this accounting objects in several iptables rules using the new nfacct match (which comes in a follow-up patch): # iptables -I INPUT -p tcp --sport 80 -m nfacct --nfacct-name http-traffic # iptables -I OUTPUT -p tcp --dport 80 -m nfacct --nfacct-name http-traffic The idea is simple: if one packet matches the rule, the nfacct match updates the counters. Thanks to Patrick McHardy, Eric Dumazet, Changli Gao for reviewing and providing feedback for this contribution. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: ctnetlink: remove dead NAT codePatrick McHardy2011-12-23
| | | | | | | | | | | | | | | | | | | | The NAT range to nlattr conversation callbacks and helpers are entirely dead code and are also useless since there are no NAT ranges in conntrack context, they are only used for initially selecting a tuple. The final NAT information is contained in the selected tuples of the conntrack entry. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: nat: remove module reference counting from NAT protocolsPatrick McHardy2011-12-23
| | | | | | | | | | | | | | | | | | | | | | | | The only remaining user of NAT protocol module reference counting is NAT ctnetlink support. Since this is a fairly short sequence of code, convert over to use RCU and remove module reference counting. Module unregistration is already protected by RCU using synchronize_rcu(), so no further changes are necessary. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: nf_nat: export NAT definitions to userspacePatrick McHardy2011-12-23
| | | | | | | | | | | | | | | | | | | | | | Export the NAT definitions to userspace. So far userspace (specifically, iptables) has been copying the headers files from include/net. Also rename some structures and definitions in preparation for IPv6 NAT. Since these have never been officially exported, this doesn't affect existing userspace code. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: rework user-space expectation helper supportPablo Neira Ayuso2011-12-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This partially reworks bc01befdcf3e40979eb518085a075cbf0aacede0 which added userspace expectation support. This patch removes the nf_ct_userspace_expect_list since now we force to use the new iptables CT target feature to add the helper extension for conntracks that have attached expectations from userspace. A new version of the proof-of-concept code to implement userspace helpers from userspace is available at: http://people.netfilter.org/pablo/userspace-conntrack-helpers/nf-ftp-helper-POC.tar.bz2 This patch also modifies the CT target to allow to set the conntrack's userspace helper status flags. This flag is used to tell the conntrack system to explicitly allocate the helper extension. This helper extension is useful to link the userspace expectations with the master conntrack that is being tracked from one userspace helper. This feature fixes a problem in the current approach of the userspace helper support. Basically, if the master conntrack that has got a userspace expectation vanishes, the expectations point to one invalid memory address. Thus, triggering an oops in the expectation deletion event path. I decided not to add a new revision of the CT target because I only needed to add a new flag for it. I'll document in this issue in the iptables manpage. I have also changed the return value from EINVAL to EOPNOTSUPP if one flag not supported is specified. Thus, in the future adding new features that only require a new flag can be added without a new revision. There is no official code using this in userspace (apart from the proof-of-concept) that uses this infrastructure but there will be some by beginning 2012. Reported-by: Sam Roberts <vieuxtech@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: nf_conntrack: use atomic64 for accounting countersEric Dumazet2011-12-17
| | | | | | | | | | | | | | | | | | | | | | We can use atomic64_t infrastructure to avoid taking a spinlock in fast path, and remove inaccuracies while reading values in ctnetlink_dump_counters() and connbytes_mt() on 32bit arches. Suggested by Pablo. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * ipv6: add ip6_route_lookupFlorian Westphal2011-12-04
| | | | | | | | | | | | | | | | | | | | like rt6_lookup, but allows caller to pass in flowi6 structure. Will be used by the upcoming ipv6 netfilter reverse path filter match. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
| * netfilter: add ipv4 reverse path filter matchFlorian Westphal2011-12-04
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This tries to do the same thing as fib_validate_source(), but differs in several aspects. The most important difference is that the reverse path filter built into fib_validate_source uses the oif as iif when performing the reverse lookup. We do not do this, as the oif is not yet known by the time the PREROUTING hook is invoked. We can't wait until FORWARD chain because by the time FORWARD is invoked ipv4 forward path may have already sent icmp messages is response to to-be-discarded-via-rpfilter packets. To avoid the such an additional lookup in PREROUTING, Patrick McHardy suggested to attach the path information directly in the match (i.e., just do what the standard ipv4 path does a bit earlier in PREROUTING). This works, but it also has a few caveats. Most importantly, when using marks in PREROUTING to re-route traffic based on the nfmark, -m rpfilter would have to be used after the nfmark has been set; otherwise the nfmark would have no effect (because the route is already attached). Another problem would be interaction with -j TPROXY, as this target sets an nfmark and uses ACCEPT instead of continue, i.e. such a version of -m rpfilter cannot be used for the initial to-be-intercepted packets. In case in turns out that the oif is required, we can add Patricks suggestion with a new match option (e.g. --rpf-use-oif) to keep ruleset compatibility. Another difference to current builtin ipv4 rpfilter is that packets subject to ipsec transformation are not automatically excluded. If you want this, simply combine -m rpfilter with the policy match. Packets arriving on loopback interfaces always match. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
* | rfs: better sizing of dev_flow_tableEric Dumazet2011-12-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Aim of this patch is to provide full range of rps_flow_cnt on 64bit arches. Theorical limit on number of flows is 2^32 Fix some buggy RPS/RFS macros as well. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> CC: Tom Herbert <therbert@google.com> CC: Xi Wang <xi.wang@gmail.com> CC: Laurent Chavey <chavey@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2011-12-23
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | Conflicts: net/bluetooth/l2cap_core.c Just two overlapping changes, one added an initialization of a local variable, and another change added a new local variable. Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: relax rcvbuf limitsEric Dumazet2011-12-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | skb->truesize might be big even for a small packet. Its even bigger after commit 87fb4b7b533 (net: more accurate skb truesize) and big MTU. We should allow queueing at least one packet per receiver, even with a low RCVBUF setting. Reported-by: Michal Simek <monstr@monstr.eu> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | net: introduce DST_NOPEER dst flagEric Dumazet2011-12-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Chris Boot reported crashes occurring in ipv6_select_ident(). [ 461.457562] RIP: 0010:[<ffffffff812dde61>] [<ffffffff812dde61>] ipv6_select_ident+0x31/0xa7 [ 461.578229] Call Trace: [ 461.580742] <IRQ> [ 461.582870] [<ffffffff812efa7f>] ? udp6_ufo_fragment+0x124/0x1a2 [ 461.589054] [<ffffffff812dbfe0>] ? ipv6_gso_segment+0xc0/0x155 [ 461.595140] [<ffffffff812700c6>] ? skb_gso_segment+0x208/0x28b [ 461.601198] [<ffffffffa03f236b>] ? ipv6_confirm+0x146/0x15e [nf_conntrack_ipv6] [ 461.608786] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.614227] [<ffffffff81271d64>] ? dev_hard_start_xmit+0x357/0x543 [ 461.620659] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.626440] [<ffffffffa0379745>] ? br_parse_ip_options+0x19a/0x19a [bridge] [ 461.633581] [<ffffffff812722ff>] ? dev_queue_xmit+0x3af/0x459 [ 461.639577] [<ffffffffa03747d2>] ? br_dev_queue_push_xmit+0x72/0x76 [bridge] [ 461.646887] [<ffffffffa03791e3>] ? br_nf_post_routing+0x17d/0x18f [bridge] [ 461.653997] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.659473] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.665485] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.671234] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.677299] [<ffffffffa0379215>] ? nf_bridge_update_protocol+0x20/0x20 [bridge] [ 461.684891] [<ffffffffa03bb0e5>] ? nf_ct_zone+0xa/0x17 [nf_conntrack] [ 461.691520] [<ffffffffa0374760>] ? br_flood+0xfa/0xfa [bridge] [ 461.697572] [<ffffffffa0374812>] ? NF_HOOK.constprop.8+0x3c/0x56 [bridge] [ 461.704616] [<ffffffffa0379031>] ? nf_bridge_push_encap_header+0x1c/0x26 [bridge] [ 461.712329] [<ffffffffa037929f>] ? br_nf_forward_finish+0x8a/0x95 [bridge] [ 461.719490] [<ffffffffa037900a>] ? nf_bridge_pull_encap_header+0x1c/0x27 [bridge] [ 461.727223] [<ffffffffa0379974>] ? br_nf_forward_ip+0x1c0/0x1d4 [bridge] [ 461.734292] [<ffffffff81291c4d>] ? nf_iterate+0x41/0x77 [ 461.739758] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.746203] [<ffffffff81291cf6>] ? nf_hook_slow+0x73/0x111 [ 461.751950] [<ffffffffa03748cc>] ? __br_deliver+0xa0/0xa0 [bridge] [ 461.758378] [<ffffffffa037533a>] ? NF_HOOK.constprop.4+0x56/0x56 [bridge] This is caused by bridge netfilter special dst_entry (fake_rtable), a special shared entry, where attaching an inetpeer makes no sense. Problem is present since commit 87c48fa3b46 (ipv6: make fragment identifications less predictable) Introduce DST_NOPEER dst flag and make sure ipv6_select_ident() and __ip_select_ident() fallback to the 'no peer attached' handling. Reported-by: Chris Boot <bootc@bootc.net> Tested-by: Chris Boot <bootc@bootc.net> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds2011-12-21
| |\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net: Add a flow_cache_flush_deferred function ipv4: reintroduce route cache garbage collector net: have ipconfig not wait if no dev is available sctp: Do not account for sizeof(struct sk_buff) in estimated rwnd asix: new device id davinci-cpdma: fix locking issue in cpdma_chan_stop sctp: fix incorrect overflow check on autoclose r8169: fix Config2 MSIEnable bit setting. llc: llc_cmsg_rcv was getting called after sk_eat_skb. net: bpf_jit: fix an off-one bug in x86_64 cond jump target iwlwifi: update SCD BC table for all SCD queues Revert "Bluetooth: Revert: Fix L2CAP connection establishment" Bluetooth: Clear RFCOMM session timer when disconnecting last channel Bluetooth: Prevent uninitialized data access in L2CAP configuration iwlwifi: allow to switch to HT40 if not associated iwlwifi: tx_sync only on PAN context mwifiex: avoid double list_del in command cancel path ath9k: fix max phy rate at rate control init nfc: signedness bug in __nci_request() iwlwifi: do not set the sequence control bit is not needed
| | * | net: Add a flow_cache_flush_deferred functionSteffen Klassert2011-12-21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | flow_cach_flush() might sleep but can be called from atomic context via the xfrm garbage collector. So add a flow_cache_flush_deferred() function and use this if the xfrm garbage colector is invoked from within the packet path. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Acked-by: Timo Teräs <timo.teras@iki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
| | * | sctp: fix incorrect overflow check on autocloseXi Wang2011-12-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 8ffd3208 voids the previous patches f6778aab and 810c0719 for limiting the autoclose value. If userspace passes in -1 on 32-bit platform, the overflow check didn't work and autoclose would be set to 0xffffffff. This patch defines a max_autoclose (in seconds) for limiting the value and exposes it through sysctl, with the following intentions. 1) Avoid overflowing autoclose * HZ. 2) Keep the default autoclose bound consistent across 32- and 64-bit platforms (INT_MAX / HZ in this patch). 3) Keep the autoclose value consistent between setsockopt() and getsockopt() calls. Suggested-by: Vlad Yasevich <vladislav.yasevich@hp.com> Signed-off-by: Xi Wang <xi.wang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
| * | | Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds2011-12-20
| |\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: time/clocksource: Fix kernel-doc warnings rtc: m41t80: Workaround broken alarm functionality rtc: Expire alarms after the time is set.
| | * | | time/clocksource: Fix kernel-doc warningsKusanagi Kouichi2011-12-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix various KernelDoc build warnings. Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Cc: John Stultz <johnstul@us.ibm.com> Link: http://lkml.kernel.org/r/20111219091320.0D5AF6FC03D@msa105.auone-net.jp Signed-off-by: Ingo Molnar <mingo@elte.hu>
| * | | | Merge branch 'stable/for-linus-fixes-3.2' of ↵Linus Torvalds2011-12-20
| |\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen * 'stable/for-linus-fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel"
| | * | | | Revert "xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from ↵Konrad Rzeszutek Wilk2011-12-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | old kernel" This reverts commit ddacf5ef684a655abe2bb50c4b2a5b72ae0d5e05. As when booting the kernel under Amazon EC2 as an HVM guest it ends up hanging during startup. Reverting this we loose the fix for kexec booting to the crash kernels. Fixes Canonical BZ #901305 (http://bugs.launchpad.net/bugs/901305) Tested-by: Alessandro Salvatori <sandr8@gmail.com> Reported-by: Stefan Bader <stefan.bader@canonical.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
| * | | | | Merge branch 'v4l_for_linus' of ↵Linus Torvalds2011-12-20
| |\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (31 commits) Revert "[media] af9015: limit I2C access to keep FW happy" [media] s5p-fimc: Fix camera input configuration in subdev operations [media] m5mols: Fix logic in sanity check [media] ati_remote: switch to single-byte scancodes [media] V4L: mt9m111: fix uninitialised mutex [media] V4L: omap1_camera: fix missing <linux/module.h> include [media] V4L: mt9t112: use after free in mt9t112_probe() [media] V4L: soc-camera: fix compiler warnings on 64-bit platforms [media] s5p_mfc_enc: fix s/H264/H263/ typo [media] omap_vout: Fix compile error in 3.1 [media] au0828: add missing models 72101, 72201 & 72261 to the model matrix [media] au0828: add missing USB ID 2040:7213 [media] au0828: add missing USB ID 2040:7260 [media] [trivial] omap24xxcam-dma: Fix logical test [media] omap_vout: fix crash if no driver for a display [media] media: video: s5p-tv: fix build break [media] omap3isp: fix compilation of ispvideo.c [media] m5mols: Fix set_fmt to return proper pixel format code [media] s5p-fimc: Use correct fourcc for RGB565 colour format [media] s5p-fimc: Fail driver probing when sensor configuration is wrong ...
| | * | | | | [media] V4L: soc-camera: fix compiler warnings on 64-bit platformsGuennadi Liakhovetski2011-12-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On 64-bit platforms assigning a pointer to a 32-bit variable causes a compiler warning and cannot actually work. Soc-camera currently doesn't support any 64-bit systems, but such platforms can be added in the and in any case compiler warnings should be avoided. Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Acked-by: Janusz Krzysztofik <jkrzyszt@tis.icnet.pl> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
| * | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6Linus Torvalds2011-12-18
| |\ \ \ \ \ \ | | |_|_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6: (22 commits) [SCSI] fcoe: fix fcoe in a DCB environment by adding DCB notifiers to set skb priority [SCSI] bnx2i: Fixed kernel panic caused by unprotected task->sc->request deref [SCSI] qla4xxx: check for failed conn setup [SCSI] qla4xxx: a small loop fix [SCSI] qla4xxx: fix flash/ddb support [SCSI] zfcp: return early from slave_destroy if slave_alloc returned early [SCSI] fcoe: Fix preempt count leak in fcoe_filter_frames() [SCSI] qla2xxx: Update version number to 8.03.07.12-k. [SCSI] qla2xxx: Submit all chained IOCBs for passthrough commands on request queue 0. [SCSI] qla2xxx: Correct fc_host port_state display. [SCSI] qla2xxx: Disable generating pause frames when firmware hang detected for ISP82xx. [SCSI] qla2xxx: Clear mailbox busy flag during premature mailbox completion for ISP82xx. [SCSI] qla2xxx: Encapsulate prematurely completing mailbox commands during ISP82xx firmware hang. [SCSI] qla2xxx: Display IPE error message for ISP82xx. [SCSI] qla2xxx: Return the correct value for a mailbox command if 82xx is in reset recovery. [SCSI] qla2xxx: Enable Minidump by default with default capture mask 0x1f. [SCSI] qla2xxx: Stop unconditional completion of mailbox commands issued in interrupt mode during firmware hang. [SCSI] qla2xxx: Revert back the request queue mapping to request queue 0. [SCSI] qla2xxx: Don't call alloc_fw_dump for ISP82XX. [SCSI] qla2xxx: Check for SCSI status on underruns. ...
| | * | | | | [SCSI] fcoe: fix fcoe in a DCB environment by adding DCB notifiers to set ↵john fastabend2011-12-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | skb priority Use DCB notifiers to set the skb priority to allow packets to be steered and tagged correctly over DCB enabled drivers that setup traffic classes. This allows queue_mapping() routines to be removed in these drivers that were previously inspecting the ethertype of every skb to mark FCoE/FIP frames. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com>
| * | | | | | Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~keithp/linuxLinus Torvalds2011-12-16
| |\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'drm-intel-fixes' of git://people.freedesktop.org/~keithp/linux: drm/i915/dp: Dither down to 6bpc if it makes the mode fit drm/i915: enable semaphores on per-device defaults drm/i915: don't set unpin_work if vblank_get fails drm/i915: By default, enable RC6 on IVB and SNB when reasonable iommu: Export intel_iommu_enabled to signal when iommu is in use drm/i915/sdvo: Include LVDS panels for the IS_DIGITAL check drm/i915: prevent division by zero when asking for chipset power drm/i915: add PCH info to i915_capabilities drm/i915: set the right SDVO transcoder for CPT drm/i915: no-lvds quirk for ASUS AT5NM10T-I drm/i915: Treat pre-gen4 backlight duty cycle value consistently drm/i915: Hook up Ivybridge eDP drm/i915: add multi-threaded forcewake support
| | * | | | | | iommu: Export intel_iommu_enabled to signal when iommu is in useEugeni Dodonov2011-12-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In i915 driver, we do not enable either rc6 or semaphores on SNB when dmar is enabled. The new 'intel_iommu_enabled' variable signals when the iommu code is in operation. Cc: Ted Phelps <phelps@gnusto.com> Cc: Peter <pab1612@gmail.com> Cc: Lukas Hejtmanek <xhejtman@fi.muni.cz> Cc: Andrew Lutomirski <luto@mit.edu> CC: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Eugeni Dodonov <eugeni.dodonov@intel.com> Signed-off-by: Keith Packard <keithp@keithp.com>
| * | | | | | | Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds2011-12-16
| |\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * 'for-linus' of git://git.kernel.dk/linux-block: block: don't kick empty queue in blk_drain_queue() block/swim3: Locking fixes loop: Fix discard_alignment default setting cfq-iosched: fix cfq_cic_link() race confition cfq-iosched: free cic_index if blkio_alloc_blkg_stats fails cciss: fix flush cache transfer length cciss: Add IRQF_SHARED back in for the non-MSI(X) interrupt handler loop: fix loop block driver discard and encryption comment block: initialize request_queue's numa node during
| | * | | | | | | block: initialize request_queue's numa node duringMike Snitzer2011-11-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | struct request_queue is allocated with __GFP_ZERO so its "node" field is zero before initialization. This causes an oops if node 0 is offline in the page allocator because its zonelists are not initialized. From Dave Young's dmesg: SRAT: Node 1 PXM 2 0-d0000000 SRAT: Node 1 PXM 2 100000000-330000000 SRAT: Node 0 PXM 1 330000000-630000000 Initmem setup node 1 0000000000000000-000000000affb000 ... Built 1 zonelists in Node order, mobility grouping on. ... BUG: unable to handle kernel paging request at 0000000000001c08 IP: [<ffffffff8111c355>] __alloc_pages_nodemask+0xb5/0x870 and __alloc_pages_nodemask+0xb5 translates to a NULL pointer on zonelist->_zonerefs. The fix is to initialize q->node at the time of allocation so the correct node is passed to the slab allocator later. Since blk_init_allocated_queue_node() is no longer needed, merge it with blk_init_allocated_queue(). [rientjes@google.com: changelog, initializing q->node] Cc: stable@vger.kernel.org [2.6.37+] Reported-by: Dave Young <dyoung@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: David Rientjes <rientjes@google.com> Tested-by: Dave Young <dyoung@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
| * | | | | | | | drm/radeon/kms: add some new pci idsAlex Deucher2011-12-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=43739 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com>
| * | | | | | | | linux/log2.h: Fix rounddown_pow_of_two(1)Linus Torvalds2011-12-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Exactly like roundup_pow_of_two(1), the rounddown version was buggy for the case of a compile-time constant '1' argument. Probably because it originated from the same code, sharing history with the roundup version from before the bugfix (for that one, see commit 1a06a52ee1b0: "Fix roundup_pow_of_two(1)"). However, unlike the roundup version, the fix for rounddown is to just remove the broken special case entirely. It's simply not needed - the generic code 1UL << ilog2(n) does the right thing for the constant '1' argment too. The only reason roundup needed that special case was because rounding up does so by subtracting one from the argument (and then adding one to the result) causing the obvious problems with "ilog2(0)". But rounddown doesn't do any of that, since ilog2() naturally truncates (ie "rounds down") to the right rounded down value. And without the ilog2(0) case, there's no reason for the special case that had the wrong value. tl;dr: rounddown_pow_of_two(1) should be 1, not 0. Acked-by: Dmitry Torokhov <dtor@vmware.com> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2011-12-12
| |\ \ \ \ \ \ \ \ | | |_|_|/ / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: mmc: core: Fix deadlock when the CONFIG_MMC_UNSAFE_RESUME is not defined mmc: sdhci-s3c: Remove old and misprototyped suspend operations mmc: tmio: fix clock gating on platforms with a .set_pwr() method mmc: sh_mmcif: fix clock gating on platforms with a .down_pwr() method mmc: core: Fix typo at mmc_card_sleep mmc: core: Fix power_off_notify during suspend mmc: core: Fix setting power notify state variable for non-eMMC mmc: core: Add quirk for long data read time mmc: Add module.h include to sdhci-cns3xxx.c mmc: mxcmmc: fix falling back to PIO mmc: omap_hsmmc: DMA unmap only once in case of MMC error
| | * | | | | | | mmc: core: Add quirk for long data read timeStefan Nilsson XK2011-12-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Adds a quirk that sets the data read timeout to a fixed value instead of relying on the information in the CSD. The timeout value chosen is 300ms since that has proven enough for the problematic cards found, but could be increased if other cards require this. This patch also enables this quirk for certain Micron cards known to have this problem. Signed-off-by: Stefan Nilsson XK <stefan.xk.nilsson@stericsson.com> Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Cc: <stable@kernel.org> Signed-off-by: Chris Ball <cjb@laptop.org>
| * | | | | | | | Merge git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tileLinus Torvalds2011-12-09
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: arch/tile: use new generic {enable,disable}_percpu_irq() routines drivers/net/ethernet/tile: use skb_frag_page() API asm-generic/unistd.h: support new process_vm_{readv,write} syscalls arch/tile: fix double-free bug in homecache_free_pages() arch/tile: add a few #includes and an EXPORT to catch up with kernel changes.
| | * | | | | | | | asm-generic/unistd.h: support new process_vm_{readv,write} syscallsChris Metcalf2011-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Also prototype the "compat" functions so they can be referenced from C code. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Acked-by: Arnd Bergmann <arnd@arndb.de>
| * | | | | | | | | vmscan: use atomic-long for shrinker batchingKonstantin Khlebnikov2011-12-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Use atomic-long operations instead of looping around cmpxchg(). [akpm@linux-foundation.org: massage atomic.h inclusions] Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
| * | | | | | | | | Merge branch '3.2-rc-fixes' of ↵Linus Torvalds2011-12-07
| |\ \ \ \ \ \ \ \ \ | | |_|/ / / / / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending * '3.2-rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (25 commits) iscsi-target: Fix hex2bin warn_unused compile message target: Don't return an error if disabling unsupported features target/rd: fix or rewrite the copy routine target/rd: simplify the page/offset computation target: remove the unused se_dev_list target/file: walk properly over sg list target: remove unused struct fields target: Fix page length in emulated INQUIRY VPD page 86h target: Handle 0 correctly in transport_get_sectors_6() target: Don't return an error status for 0-length READ and WRITE iscsi-target: Use kmemdup rather than duplicating its implementation iscsi-target: Add missing F_BIT for iscsi_tm_rsp iscsi-target: Fix residual count hanlding + remove iscsi_cmd->residual_count target: Reject SCSI data overflow for fabrics using transport_generic_map_mem_to_cmd target: remove the unused t_task_pt_sgl and t_task_pt_sgl_num se_cmd fields target: remove the t_tasks_bidi se_cmd field target: remove the t_tasks_fua se_cmd field target: remove the se_ordered_node se_cmd field target: remove the se_obj_ptr and se_orig_obj_ptr se_cmd fields target: Drop config_item_name usage in fabric TFO->free_wwn() ...
| | * | | | | | | | target: remove the unused se_dev_listChristoph Hellwig2011-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| | * | | | | | | | target: remove unused struct fieldsJörn Engel2011-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some are never used, some are set but never read, dev_hoq_count is incremented and decremented, but never read. Signed-off-by: Joern Engel <joern@logfs.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| | * | | | | | | | target: remove the unused t_task_pt_sgl and t_task_pt_sgl_num se_cmd fieldsChristoph Hellwig2011-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| | * | | | | | | | target: remove the t_tasks_bidi se_cmd fieldChristoph Hellwig2011-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And use a SCF_BIDI flag instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| | * | | | | | | | target: remove the t_tasks_fua se_cmd fieldChristoph Hellwig2011-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And use a SCF_FUA flag instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| | * | | | | | | | target: remove the se_ordered_node se_cmd fieldChristoph Hellwig2011-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We never walk ordered_cmd_list in the se_device, so remove all code related to supporting it. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| | * | | | | | | | target: remove the se_obj_ptr and se_orig_obj_ptr se_cmd fieldsChristoph Hellwig2011-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We already have a perfectly valid se_device pointer in the command, so remove the mostly useless duplicates. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| | * | | | | | | | target: Avoid compiler warnings about signed one-bit bitfieldsBart Van Assche2011-12-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert to unsigned bit fields for active I/O shutdown fields. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| | * | | | | | | | target: Address legacy PYX_TRANSPORT_* return code breakageNicholas Bellinger2011-12-06
| | |/ / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes legacy usage of PYX_TRANSPORT_* return codes in a number of locations and addresses cases where transport_generic_request_failure() was returning the incorrect sense upon CHECK_CONDITION status after the v3.1 converson to use errno return codes. This includes the conversion of transport_generic_request_failure() to process cmd->scsi_sense_reason and handle extra TCM_RESERVATION_CONFLICT before calling transport_send_check_condition_and_sense() to queue up response status. It also drops PYX_TRANSPORT_OUT_OF_MEMORY_RESOURCES legacy usgae, and returns TCM_LOGICAL_UNIT_COMMUNICATION_FAILURE w/ a response for these cases. transport_generic_allocate_tasks(), transport_generic_new_cmd(), backend SCF_SCSI_DATA_SG_IO_CDB ->do_task(), and emulated ->execute_task() have all been updated to set se_cmd->scsi_sense_reason and return errno codes universally upon failure. This includes cmd->scsi_sense_reason assignment in target_core_alua.c, target_core_pr.c and target_core_cdb.c emulation code. Finally it updates fabric modules to remove the legacy usage, and for TFO->new_cmd_map() callers forwards return values outside of fabric code. iscsi-target has also been updated to remove a handful of special cases related to the cleanup and signaling QUEUE_FULL handling w/ ft_write_pending() (v2: Drop extra SCF_SCSI_CDB_EXCEPTION check during failure from transport_generic_new_cmd, and re-add missing task->task_error_status assignment in transport_complete_task) Cc: Christoph Hellwig <hch@lst.de> Cc: stable@kernel.org Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
| * | | | | | | | Merge branch 'for-linus' of ↵Linus Torvalds2011-12-07
| |\ \ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix apparmor dereferencing potentially freed dentry, sanitize __d_path() API
| | * | | | | | | | fix apparmor dereferencing potentially freed dentry, sanitize __d_path() APIAl Viro2011-12-06
| | |/ / / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | __d_path() API is asking for trouble and in case of apparmor d_namespace_path() getting just that. The root cause is that when __d_path() misses the root it had been told to look for, it stores the location of the most remote ancestor in *root. Without grabbing references. Sure, at the moment of call it had been pinned down by what we have in *path. And if we raced with umount -l, we could have very well stopped at vfsmount/dentry that got freed as soon as prepend_path() dropped vfsmount_lock. It is safe to compare these pointers with pre-existing (and known to be still alive) vfsmount and dentry, as long as all we are asking is "is it the same address?". Dereferencing is not safe and apparmor ended up stepping into that. d_namespace_path() really wants to examine the place where we stopped, even if it's not connected to our namespace. As the result, it looked at ->d_sb->s_magic of a dentry that might've been already freed by that point. All other callers had been careful enough to avoid that, but it's really a bad interface - it invites that kind of trouble. The fix is fairly straightforward, even though it's bigger than I'd like: * prepend_path() root argument becomes const. * __d_path() is never called with NULL/NULL root. It was a kludge to start with. Instead, we have an explicit function - d_absolute_root(). Same as __d_path(), except that it doesn't get root passed and stops where it stops. apparmor and tomoyo are using it. * __d_path() returns NULL on path outside of root. The main caller is show_mountinfo() and that's precisely what we pass root for - to skip those outside chroot jail. Those who don't want that can (and do) use d_path(). * __d_path() root argument becomes const. Everyone agrees, I hope. * apparmor does *NOT* try to use __d_path() or any of its variants when it sees that path->mnt is an internal vfsmount. In that case it's definitely not mounted anywhere and dentry_path() is exactly what we want there. Handling of sysctl()-triggered weirdness is moved to that place. * if apparmor is asked to do pathname relative to chroot jail and __d_path() tells it we it's not in that jail, the sucker just calls d_absolute_path() instead. That's the other remaining caller of __d_path(), BTW. * seq_path_root() does _NOT_ return -ENAMETOOLONG (it's stupid anyway - the normal seq_file logics will take care of growing the buffer and redoing the call of ->show() just fine). However, if it gets path not reachable from root, it returns SEQ_SKIP. The only caller adjusted (i.e. stopped ignoring the return value as it used to do). Reviewed-by: John Johansen <john.johansen@canonical.com> ACKed-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org