| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
| |
Sending host command with CMD_WANT_SKB flag demands the release of the
response buffer with iwl_free_resp function.
The patch adds the memory release in all the relevant places
Signed-off-by: Shahar S Matityahu <shahar.s.matityahu@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
|
|
|
|
|
|
|
| |
In a previous commit, we removed support for API versions earlier than
22 for these NICs. By mistake, the *_UCODE_API_MIN definitions were
set to 17. Fix that.
Fixes: 4b87e5af638b ("iwlwifi: remove support for fw older than -17 and -22")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
|
|
|
|
|
|
| |
Clear the struct so that all reserved fields are zero when we
send the struct down to the device.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The iwl_mvm_remove_sta_key() function handles removing a key when the
sta doesn't exist anymore. Mistakenly, this was changed to return an
error while fixing another bug.
If the mvm_sta doesn't exist, we continue normally, but just don't try
to remove the igtk key.
Fixes: cd4d23c1ea9b ("iwlwifi: mvm: Fix removal of IGTK")
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
We only need to handle d0i3 entry and exit during suspend resume if
system_pm is set to IWL_PLAT_PM_MODE_D0I3, otherwise d0i3 entry
failures will cause suspend to fail.
This fixes https://bugzilla.kernel.org/show_bug.cgi?id=194791
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we want to stop the recording of the firmware debug
and restart it later without reloading the firmware we
don't need to resend the configuration that comes with
host commands.
Sending those commands confused the hardware and led to
an NMI 0x66.
Change the flow as following:
* read the relevant registers (DBGC_IN_SAMPLE, DBGC_OUT_CTRL)
* clear those registers
* wait for the hardware to complete its write to the buffer
* get the data
* restore the value of those registers (to restart the
recording)
For early start (where the configuration is already
compiled in the firmware), we don't need to set those
registers after the firmware has been loaded, but only
when we want to restart the recording without having
restarted the firmware.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
The ucode_loaded check should be under the mutex, since it can
otherwise change state after we looked at it and before we got
the mutex. Fix that.
Fixes: 5c89e7bc557e ("iwlwifi: mvm: add registration to cooling device")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
Allow working IBSS also when working in DQA mode.
This is done by setting it to treat the queues the
same as a BSS AP treats the queues.
Fixes: 7948b87308a4 ("iwlwifi: mvm: enable dynamic queue allocation mode")
Signed-off-by: Liad Kaufman <liad.kaufman@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During d0i3 flow we flush all the queue except from the command queue.
Currently, in this flow the command queue is hard coded to 9.
In DQA the command queue number has changed from 9 to 0.
Fix that.
This fixes a problem in runtime PM resume flow.
Fixes: 097129c9e625 ("iwlwifi: mvm: move cmd queue to be #0 in dqa mode")
Signed-off-by: Haim Dreyfuss <haim.dreyfuss@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Up until now, the driver was comparing the rate reported by the FW and
the rate of the latest LQ command to avoid processing data belonging
to the old LQ command. Recently, FW changed the meaning of the initial
rate field in tx response and it holds the actual rate (which is not
necessarily the initial rate of LQ's rate table). Use instead LQ cmd
color to be able to filter out tx responses/BA notifications which
where sent during earlier LQ commands' time frame.
This fixes some throughput degradation in noisy environments.
Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since commit 76b91c32dd86 ("bridge: stp: when using userspace stp stop
kernel hello and hold timers"), bridge would not start hello_timer if
stp_enabled is not KERNEL_STP when br_dev_open.
The problem is even if users set stp_enabled with KERNEL_STP later,
the timer will still not be started. It causes that KERNEL_STP can
not really work. Users have to re-ifup the bridge to avoid this.
This patch is to fix it by starting br->hello_timer when enabling
KERNEL_STP in br_stp_start.
As an improvement, it's also to start hello_timer again only when
br->stp_enabled is KERNEL_STP in br_hello_timer_expired, there is
no reason to start the timer again when it's NO_STP.
Fixes: 76b91c32dd86 ("bridge: stp: when using userspace stp stop kernel hello and hold timers")
Reported-by: Haidong Li <haili@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Ivan Vecera <cera@cera.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When TX checksum offload is used, if the computed checksum is 0 the
LAN95xx device do not alter the checksum to 0xffff. In the case of ipv4
UDP checksum, it indicates to receiver that no checksum is calculated.
Under ipv6, UDP checksum yields a result of zero must be changed to
0xffff. Hence disabling checksum offload for ipv6 packets.
Signed-off-by: Nisar Sayed <Nisar.Sayed@microchip.com>
Reported-by: popcorn mix <popcornmix@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Ihar Hrachyshka says:
====================
arp: always override existing neigh entries with gratuitous ARP
This patchset is spurred by discussion started at
https://patchwork.ozlabs.org/patch/760372/ where we figured that there is no
real reason for enforcing override by gratuitous ARP packets only when
arp_accept is 1. Same should happen when it's 0 (the default value).
changelog v2: handled review comments by Julian Anastasov
- fixed a mistake in a comment;
- postponed addr_type calculation to as late as possible.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently, when arp_accept is 1, we always override existing neigh
entries with incoming gratuitous ARP replies. Otherwise, we override
them only if new replies satisfy _locktime_ conditional (packets arrive
not earlier than _locktime_ seconds since the last update to the neigh
entry).
The idea behind locktime is to pick the very first (=> close) reply
received in a unicast burst when ARP proxies are used. This helps to
avoid ARP thrashing where Linux would switch back and forth from one
proxy to another.
This logic has nothing to do with gratuitous ARP replies that are
generally not aligned in time when multiple IP address carriers send
them into network.
This patch enforces overriding of existing neigh entries by all incoming
gratuitous ARP packets, irrespective of their time of arrival. This will
make the kernel honour all incoming gratuitous ARP packets.
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The addr_type retrieval can be costly, so it's worth trying to avoid its
calculation as much as possible. This patch makes it calculated only
for gratuitous ARP packets. This is especially important since later we
may want to move is_garp calculation outside of arp_accept block, at
which point the costly operation will be executed for all setups.
The patch is the result of a discussion in net-dev:
http://marc.info/?l=linux-netdev&m=149506354216994
Suggested-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The code is quite involving already to earn a separate function for
itself. If anything, it helps arp_process readability.
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|/
|
|
|
|
|
|
|
|
|
|
| |
the is_garp code deals just with gratuitous ARP packets, not every
unsolicited packet.
This patch is a result of a discussion in netdev:
http://marc.info/?l=linux-netdev&m=149506354216994
Suggested-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Ihar Hrachyshka <ihrachys@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When tcp_disconnect() is called, inet_csk_delack_init() sets
icsk->icsk_ack.rcv_mss to 0.
This could potentially cause tcp_recvmsg() => tcp_cleanup_rbuf() =>
__tcp_select_window() call path to have division by 0 issue.
So this patch initializes rcv_mss to TCP_MIN_MSS instead of 0.
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains Netfilter/IPVS fixes for your net tree,
they are:
1) When using IPVS in direct-routing mode, normal traffic from the LVS
host to a back-end server is sometimes incorrectly NATed on the way
back into the LVS host. Patch to fix this from Julian Anastasov.
2) Calm down clang compilation warning in ctnetlink due to type
mismatch, from Matthias Kaehlcke.
3) Do not re-setup NAT for conntracks that are already confirmed, this
is fixing a problem that was introduced in the previous nf-next batch.
Patch from Liping Zhang.
4) Do not allow conntrack helper removal from userspace cthelper
infrastructure if already in used. This comes with an initial patch
to introduce nf_conntrack_helper_put() that is required by this fix.
From Liping Zhang.
5) Zero the pad when copying data to userspace, otherwise iptables fails
to remove rules. This is a follow up on the patchset that sorts out
the internal match/target structure pointer leak to userspace. Patch
from the same author, Willem de Bruijn. This also comes with a build
failure when CONFIG_COMPAT is not on, coming in the last patch of
this series.
6) SYNPROXY crashes with conntrack entries that are created via
ctnetlink, more specifically via conntrackd state sync. Patch from
Eric Leblond.
7) RCU safe iteration on set element dumping in nf_tables, from
Liping Zhang.
8) Missing sanitization of immediate date for the bitwise and cmp
expressions in nf_tables.
9) Refcounting logic for chain and objects from set elements does not
integrate into the nf_tables 2-phase commit protocol.
10) Missing sanitization of target verdict in ebtables arpreply target,
from Gao Feng.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The patch in the Fixes references COMPAT_XT_ALIGN in the definition
of XT_DATA_TO_USER, outside an #ifdef CONFIG_COMPAT block.
Split XT_DATA_TO_USER into separate compat and non compat variants and
define the first inside an CONFIG_COMPAT block.
This simplifies both variants by removing branches inside the macro.
Fixes: 324318f0248c ("netfilter: xtables: zero padding in data_to_user")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The info->target comes from userspace and it would be used directly.
So we need to add the sanity check to make sure it is a valid standard
target, although the ebtables tool has already checked it. Kernel needs
to validate anything coming from userspace.
If the target is set as an evil value, it would break the ebtables
and cause a panic. Because the non-standard target is treated as one
offset.
Now add one helper function ebt_invalid_target, and we would replace
the macro INVALID_TARGET later.
Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Andreas reports that the following incremental update using our commit
protocol doesn't work.
# nft -f incremental-update.nft
delete element ip filter client_to_any { 10.180.86.22 : goto CIn_1 }
delete chain ip filter CIn_1
... Error: Could not process rule: Device or resource busy
The existing code is not well-integrated into the commit phase protocol,
since element deletions do not result in refcount decrement from the
preparation phase. This results in bogus EBUSY errors like the one
above.
Two new functions come with this patch:
* nft_set_elem_activate() function is used from the abort path, to
restore the set element refcounting on objects that occurred from
the preparation phase.
* nft_set_elem_deactivate() that is called from nft_del_setelem() to
decrement set element refcounting on objects from the preparation
phase in the commit protocol.
The nft_data_uninit() has been renamed to nft_data_release() since this
function does not uninitialize any data store in the data register,
instead just releases the references to objects. Moreover, a new
function nft_data_hold() has been introduced to be used from
nft_set_elem_activate().
Reported-by: Andreas Schultz <aschultz@tpip.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Do not assume userspace always sends us NFT_DATA_VALUE for bitwise and
cmp expressions. Although NFT_DATA_VERDICT does not make any sense, it
is still possible to handcraft a netlink message using this incorrect
data type.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When dumping the elements related to a specified set, we may invoke the
nf_tables_dump_set with the NFNL_SUBSYS_NFTABLES lock not acquired. So
we should use the proper rcu operation to avoid race condition, just
like other nft dump operations.
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch fixes the creation of connection tracking entry from
netlink when synproxy is used. It was missing the addition of
the synproxy extension.
This was causing kernel crashes when a conntrack entry created by
conntrackd was used after the switch of traffic from active node
to the passive node.
Signed-off-by: Eric Leblond <eric@regit.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When looking up an iptables rule, the iptables binary compares the
aligned match and target data (XT_ALIGN). In some cases this can
exceed the actual data size to include padding bytes.
Before commit f77bc5b23fb1 ("iptables: use match, target and data
copy_to_user helpers") the malloc()ed bytes were overwritten by the
kernel with kzalloced contents, zeroing the padding and making the
comparison succeed. After this patch, the kernel copies and clears
only data, leaving the padding bytes undefined.
Extend the clear operation from data size to aligned data size to
include the padding bytes, if any.
Padding bytes can be observed in both match and target, and the bug
triggered, by issuing a rule with match icmp and target ACCEPT:
iptables -t mangle -A INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT
iptables -t mangle -D INPUT -i lo -p icmp --icmp-type 1 -j ACCEPT
Fixes: f77bc5b23fb1 ("iptables: use match, target and data copy_to_user helpers")
Reported-by: Paul Moore <pmoore@redhat.com>
Reported-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| |\
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
http://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs
Simon Horman says:
====================
IPVS Fixes for v4.12
please consider this fix to IPVS for v4.12.
* It is a fix from Julian Anastasov to only SNAT SNAT packet replies only for
NATed connections
My understanding is that this fix is appropriate for 4.9.25, 4.10.13, 4.11
as well as the nf tree. Julian has separately posted backports for other
-stable kernels; please see:
* [PATCH 3.2.88,3.4.113 -stable 1/3] ipvs: SNAT packet replies only for
NATed connections
* [PATCH 3.10.105,3.12.73,3.16.43,4.1.39 -stable 2/3] ipvs: SNAT packet
replies only for NATed connections
* [PATCH 4.4.65 -stable 3/3] ipvs: SNAT packet replies only for NATed
connections
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We do not check if packet from real server is for NAT
connection before performing SNAT. This causes problems
for setups that use DR/TUN and allow local clients to
access the real server directly, for example:
- local client in director creates IPVS-DR/TUN connection
CIP->VIP and the request packets are routed to RIP.
Talks are finished but IPVS connection is not expired yet.
- second local client creates non-IPVS connection CIP->RIP
with same reply tuple RIP->CIP and when replies are received
on LOCAL_IN we wrongly assign them for the first client
connection because RIP->CIP matches the reply direction.
As result, IPVS SNATs replies for non-IPVS connections.
The problem is more visible to local UDP clients but in rare
cases it can happen also for TCP or remote clients when the
real server sends the reply traffic via the director.
So, better to be more precise for the reply traffic.
As replies are not expected for DR/TUN connections, better
to not touch them.
Reported-by: Nick Moriarty <nick.moriarty@york.ac.uk>
Tested-by: Nick Moriarty <nick.moriarty@york.ac.uk>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We can still delete the ct helper even if it is in use, this will cause
a use-after-free error. In more detail, I mean:
# nfct helper add ssdp inet udp
# iptables -t raw -A OUTPUT -p udp -j CT --helper ssdp
# nfct helper delete ssdp //--> oops, succeed!
BUG: unable to handle kernel paging request at 000026ca
IP: 0x26ca
[...]
Call Trace:
? ipv4_helper+0x62/0x80 [nf_conntrack_ipv4]
nf_hook_slow+0x21/0xb0
ip_output+0xe9/0x100
? ip_fragment.constprop.54+0xc0/0xc0
ip_local_out+0x33/0x40
ip_send_skb+0x16/0x80
udp_send_skb+0x84/0x240
udp_sendmsg+0x35d/0xa50
So add reference count to fix this issue, if ct helper is used by
others, reject the delete request.
Apply this patch:
# nfct helper delete ssdp
nfct v1.4.3: netlink error: Device or resource busy
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
And convert module_put invocation to nf_conntrack_helper_put, this is
prepared for the followup patch, which will add a refcnt for cthelper,
so we can reject the deleting request when cthelper is in use.
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
We cannot setup nat info if the ct has been confirmed already, else,
different cpu may race to handle the same ct. In extreme situation,
we may hit the "BUG_ON(nf_nat_initialized(ct, maniptype))" in the
nf_nat_setup_info.
Also running the following commands will easily hit NF_CT_ASSERT in
nf_conntrack_alter_reply:
# nft flush ruleset
# ping -c 2 -W 1 1.1.1.111 &
# nft add table t
# nft add chain t c {type nat hook postrouting priority 0 \;}
# nft add rule t c snat to 4.5.6.7
WARNING: CPU: 1 PID: 10065 at net/netfilter/nf_conntrack_core.c:1472
nf_conntrack_alter_reply+0x9a/0x1a0 [nf_conntrack]
[...]
Call Trace:
nf_nat_setup_info+0xad/0x840 [nf_nat]
? deactivate_slab+0x65d/0x6c0
nft_nat_eval+0xcd/0x100 [nft_nat]
nft_do_chain+0xff/0x5d0 [nf_tables]
? mark_held_locks+0x6f/0xa0
? __local_bh_enable_ip+0x70/0xa0
? trace_hardirqs_on_caller+0x11f/0x190
? ipt_do_table+0x310/0x610
? trace_hardirqs_on+0xd/0x10
? __local_bh_enable_ip+0x70/0xa0
? ipt_do_table+0x32b/0x610
? __lock_acquire+0x2ac/0x1580
? ipt_do_table+0x32b/0x610
nft_nat_do_chain+0x65/0x80 [nft_chain_nat_ipv4]
nf_nat_ipv4_fn+0x1ae/0x240 [nf_nat_ipv4]
nf_nat_ipv4_out+0x4a/0xf0 [nf_nat_ipv4]
nft_nat_ipv4_out+0x15/0x20 [nft_chain_nat_ipv4]
nf_hook_slow+0x2c/0xf0
ip_output+0x154/0x270
So for the confirmed ct, just ignore it and return NF_ACCEPT.
Fixes: 9a08ecfe74d7 ("netfilter: don't attach a nat extension by default")
Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Not all parameters passed to ctnetlink_parse_tuple() and
ctnetlink_exp_dump_tuple() match the enum type in the signatures of these
functions. Since this is intended change the argument type of to be an
unsigned integer value.
Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|\ \ \
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Pull MD fixes from Shaohua Li:
- Several bug fixes for raid5-cache from Song Liu, mainly handle
journal disk error
- Fix bad block handling in choosing raid1 disk from Tomasz Majchrzak
- Simplify external metadata array sysfs handling from Artur
Paszkiewicz
- Optimize raid0 discard handling from me, now raid0 will dispatch
large discard IO directly to underlayer disks.
* tag 'md/4.12-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
raid1: prefer disk without bad blocks
md/r5cache: handle sync with data in write back cache
md/r5cache: gracefully handle journal device errors for writeback mode
md/raid1/10: avoid unnecessary locking
md/raid5-cache: in r5l_do_submit_io(), submit io->split_bio first
md/md0: optimize raid0 discard handling
md: don't return -EAGAIN in md_allow_write for external metadata arrays
md/raid5: make use of spin_lock_irq over local_irq_disable + spin_lock
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
If an array consists of two drives and the first drive has the bad
block, the read request to the region overlapping the bad block chooses
the same disk (with bad block) as device to read from over and over and
the request gets stuck. If the first disk only partially overlaps with
bad block, it becomes a candidate ("best disk") for shorter range of
sectors. The second disk is capable of reading the entire requested
range and it is updated accordingly, however it is not recorded as a
best device for the request. In the end the request is sent to the first
disk to read entire range of sectors. It fails and is re-tried in a
moment but with the same outcome.
Actually it is quite likely scenario but it had little exposure in my
test until commit 715d40b93b10 ("md/raid1: add failfast handling for
reads.") removed preference for idle disk. Such scenario had been
passing as second disk was always chosen when idle.
Reset a candidate ("best disk") to read from if disk can read entire
range. Do it only if other disk has already been chosen as a candidate
for a smaller range. The head position / disk type logic will select
the best disk to read from - it is fine as disk with bad block won't be
considered for it.
Signed-off-by: Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Currently, sync of raid456 array cannot make progress when hitting
data in writeback r5cache.
This patch fixes this issue by flushing cached data of the stripe
before processing the sync request. This is achived by:
1. In handle_stripe(), do not set STRIPE_SYNCING if the stripe is
in write back cache;
2. In r5c_try_caching_write(), handle the stripe in sync with write
through;
3. In do_release_stripe(), make stripe in sync write out and send
it to the state machine.
Shaohua: explictly set STRIPE_HANDLE after write out completed
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
For the raid456 with writeback cache, when journal device failed during
normal operation, it is still possible to persist all data, as all
pending data is still in stripe cache. However, it is necessary to handle
journal failure gracefully.
During journal failures, the following logic handles the graceful shutdown
of journal:
1. raid5_error() marks the device as Faulty and schedules async work
log->disable_writeback_work;
2. In disable_writeback_work (r5c_disable_writeback_async), the mddev is
suspended, set to write through, and then resumed. mddev_suspend()
flushes all cached stripes;
3. All cached stripes need to be flushed carefully to the RAID array.
This patch fixes issues within the process above:
1. In r5c_update_on_rdev_error() schedule disable_writeback_work for
journal failures;
2. In r5c_disable_writeback_async(), wait for MD_SB_CHANGE_PENDING,
since raid5_error() updates superblock.
3. In handle_stripe(), allow stripes with data in journal (s.injournal > 0)
to make progress during log_failed;
4. In delay_towrite(), if log failed only process data in the cache (skip
new writes in dev->towrite);
5. In __get_priority_stripe(), process loprio_list during journal device
failures.
6. In raid5_remove_disk(), wait for all cached stripes are flushed before
calling log_exit().
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
If we add bios to block plugging list, locking is unnecessry, since the block
unplug is guaranteed not to run at that time.
Reviewed-by: NeilBrown <neilb@suse.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
In r5l_do_submit_io(), it is necessary to check io->split_bio before
submit io->current_bio. This is because, endio of current_bio may
free the whole IO unit, and thus change io->split_bio.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
There are complaints that raid0 discard handling is slow. Currently we
divide discard request into chunks and dispatch to underlayer disks. The
block layer will do merge to form big requests. This causes a lot of
request split/merge and uses significant CPU time.
A simple idea is to calculate the range for each raid disk for an IO
request and send a discard request to raid disks, which will avoid the
split/merge completely. Previously Coly tried the approach, but the
implementation was too complex because of raid0 zones. This patch always
split bio in zone boundary and handle bio within one zone. It simplifies
the implementation a lot.
Reviewed-by: NeilBrown <neilb@suse.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Shaohua Li <shli@fb.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
This essentially reverts commit b5470dc5fc18 ("md: resolve external
metadata handling deadlock in md_allow_write") with some adjustments.
Since commit 6791875e2e53 ("md: make reconfig_mutex optional for writes
to md sysfs files.") changing array_state to 'active' does not use
mddev_lock() and will not cause a deadlock with md_allow_write(). This
revert simplifies userspace tools that write to sysfs attributes like
"stripe_cache_size" or "consistency_policy" because it removes the need
for special handling for external metadata arrays, checking for EAGAIN
and retrying the write.
Signed-off-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
On mainline, there is no functional difference, just less code, and
symmetric lock/unlock paths.
On PREEMPT_RT builds, this fixes the following warning, seen by
Alexander GQ Gerasiov, due to the sleeping nature of spinlocks.
BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:993
in_atomic(): 0, irqs_disabled(): 1, pid: 58, name: kworker/u12:1
CPU: 5 PID: 58 Comm: kworker/u12:1 Tainted: G W 4.9.20-rt16-stand6-686 #1
Hardware name: Supermicro SYS-5027R-WRF/X9SRW-F, BIOS 3.2a 10/28/2015
Workqueue: writeback wb_workfn (flush-253:0)
Call Trace:
dump_stack+0x47/0x68
? migrate_enable+0x4a/0xf0
___might_sleep+0x101/0x180
rt_spin_lock+0x17/0x40
add_stripe_bio+0x4e3/0x6c0 [raid456]
? preempt_count_add+0x42/0xb0
raid5_make_request+0x737/0xdd0 [raid456]
Reported-by: Alexander GQ Gerasiov <gq@redlab-i.ru>
Tested-by: Alexander GQ Gerasiov <gq@redlab-i.ru>
Signed-off-by: Julia Cartwright <julia@ni.com>
Signed-off-by: Shaohua Li <shli@fb.com>
|
|\ \ \ \
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Pull networking fixes from David Miller:
1) Don't allow negative TCP reordering values, from Soheil Hassas
Yeganeh.
2) Don't overflow while parsing ipv6 header options, from Craig Gallek.
3) Handle more cleanly the case where an individual route entry during
a dump will not fit into the allocated netlink SKB, from David
Ahern.
4) Add missing CONFIG_INET dependency for mlx5e, from Arnd Bergmann.
5) Allow neighbour updates to converge more quickly via gratuitous
ARPs, from Ihar Hrachyshka.
6) Fix compile error from CONFIG_INET is disabled, from Eric Dumazet.
7) Fix use after free in x25 protocol init, from Lin Zhang.
8) Valid VLAN pvid ranges passed into br_validate(), from Tobias
Jungel.
9) NULL out address lists in child sockets in SCTP, this is similar to
the fix we made for inet connection sockets last week. From Eric
Dumazet.
10) Fix NULL deref in mlxsw driver, from Ido Schimmel.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (27 commits)
mlxsw: spectrum: Avoid possible NULL pointer dereference
sh_eth: Do not print an error message for probe deferral
sh_eth: Use platform device for printing before register_netdev()
mlxsw: spectrum_router: Fix rif counter freeing routine
mlxsw: spectrum_dpipe: Fix incorrect entry index
cxgb4: update latest firmware version supported
qmi_wwan: add another Lenovo EM74xx device ID
sctp: do not inherit ipv6_{mc|ac|fl}_list from parent
udp: make *udp*_queue_rcv_skb() functions static
bridge: netlink: check vlan_default_pvid range
net: ethernet: faraday: To support device tree usage.
net: x25: fix one potential use-after-free issue
bpf: adjust verifier heuristics
ipv6: Check ip6_find_1stfragopt() return value properly.
selftests/bpf: fix broken build due to types.h
bnxt_en: Check status of firmware DCBX agent before setting DCB_CAP_DCBX_HOST.
bnxt_en: Call bnxt_dcb_init() after getting firmware DCBX configuration.
net: fix compile error in skb_orphan_partial()
ipv6: Prevent overrun when parsing v6 header options
neighbour: update neigh timestamps iff update is effective
...
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
In case we got an FDB notification for a port that doesn't exist we
execute an FDB entry delete to prevent it from re-appearing the next
time we poll for notifications.
If the operation failed we would trigger a NULL pointer dereference as
'mlxsw_sp_port' is NULL.
Fix it by reporting the error using the underlying bus device instead.
Fixes: 12f1501e7511 ("mlxsw: spectrum: remove FDB entry in case we get unknown object notification")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
EPROBE_DEFER is not an error, hence printing an error message like
sh-eth ee700000.ethernet: failed to initialise MDIO
may confuse the user.
To fix this, suppress the error message in case of probe deferral.
While at it, shorten the message, and add the actual error code.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
The MDIO initialization failure message is printed using the network
device, before it has been registered, leading to:
(null): failed to initialise MDIO
Use the platform device instead to fix this:
sh-eth ee700000.ethernet: failed to initialise MDIO
Fixes: daacf03f0bbfefee ("sh_eth: Register MDIO bus before registering the network device")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |\ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Jiri Pirko says:
====================
mlxsw: couple of fixes
Couple of fixes from Arkadi
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
During rif counter freeing the counter index can be invalid. Add check
of validity before freeing the counter.
Fixes: e0c0afd8aa4e ("mlxsw: spectrum: Support for counters on router interfaces")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| |/ / / /
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
In case of disabled counters the entry index will be incorrect. Fix this
by moving the entry index set before the counter status check.
Fixes: 2ba5999f009d ("mlxsw: spectrum: Add Support for erif table entries access")
Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Change t4fw_version.h to update latest firmware version
number to 1.16.43.0.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
In their infinite wisdom, and never ending quest for end user frustration,
Lenovo has decided to use a new USB device ID for the wwan modules in
their 2017 laptops. The actual hardware is still the Sierra Wireless
EM7455 or EM7430, depending on region.
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
|