diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2012-04-06 13:37:38 -0400 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2012-04-06 13:37:38 -0400 |
commit | 23f347ef63aa36b5a001b6791f657cd0e2a04de3 (patch) | |
tree | ce06ebdccd16b99265b3e74f8e9b7bd1e29cf465 /Documentation | |
parent | 314489bd4c7780fde6a069783d5128f6cef52919 (diff) | |
parent | 110c43304db6f06490961529536c362d9ac5732f (diff) |
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking updates from David Miller:
1) Fix inaccuracies in network driver interface documentation, from Ben
Hutchings.
2) Fix handling of negative offsets in BPF JITs, from Jan Seiffert.
3) Compile warning, locking, and refcounting fixes in netfilter's
xt_CT, from Pablo Neira Ayuso.
4) phonet sendmsg needs to validate user length just like any other
datagram protocol, fix from Sasha Levin.
5) Ipv6 multicast code uses wrong loop index, from RongQing Li.
6) Link handling and firmware fixes in bnx2x driver from Yaniv Rosner
and Yuval Mintz.
7) mlx4 erroneously allocates 4 pages at a time, regardless of page
size, fix from Thadeu Lima de Souza Cascardo.
8) SCTP socket option wasn't extended in a backwards compatible way,
fix from Thomas Graf.
9) Add missing address change event emissions to bonding, from Shlomo
Pongratz.
10) /proc/net/dev regressed because it uses a private offset to track
where we are in the hash table, but this doesn't track the offset
pullback that the seq_file code does resulting in some entries being
missed in large dumps.
Fix from Eric Dumazet.
11) do_tcp_sendpage() unloads the send queue way too fast, because it
invokes tcp_push() when it shouldn't. Let the natural sequence
generated by the splice paths, and the assosciated MSG_MORE
settings, guide the tcp_push() calls.
Otherwise what goes out of TCP is spaghetti and doesn't batch
effectively into GSO/TSO clusters.
From Eric Dumazet.
12) Once we put a SKB into either the netlink receiver's queue or a
socket error queue, it can be consumed and freed up, therefore we
cannot touch it after queueing it like that.
Fixes from Eric Dumazet.
13) PPP has this annoying behavior in that for every transmit call it
immediately stops the TX queue, then calls down into the next layer
to transmit the PPP frame.
But if that next layer can take it immediately, it just un-stops the
TX queue right before returning from the transmit method.
Besides being useless work, it makes several facilities unusable, in
particular things like the equalizers. Well behaved devices should
only stop the TX queue when they really are full, and in PPP's case
when it gets backlogged to the downstream device.
David Woodhouse therefore fixed PPP to not stop the TX queue until
it's downstream can't take data any more.
14) IFF_UNICAST_FLT got accidently lost in some recent stmmac driver
changes, re-add. From Marc Kleine-Budde.
15) Fix link flaps in ixgbe, from Eric W. Multanen.
16) Descriptor writeback fixes in e1000e from Matthew Vick.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (47 commits)
net: fix a race in sock_queue_err_skb()
netlink: fix races after skb queueing
doc, net: Update ndo_start_xmit return type and values
doc, net: Remove instruction to set net_device::trans_start
doc, net: Update netdev operation names
doc, net: Update documentation of synchronisation for TX multiqueue
doc, net: Remove obsolete reference to dev->poll
ethtool: Remove exception to the requirement of holding RTNL lock
MAINTAINERS: update for Marvell Ethernet drivers
bonding: properly unset current_arp_slave on slave link up
phonet: Check input from user before allocating
tcp: tcp_sendpages() should call tcp_push() once
ipv6: fix array index in ip6_mc_add_src()
mlx4: allocate just enough pages instead of always 4 pages
stmmac: re-add IFF_UNICAST_FLT for dwmac1000
bnx2x: Clear MDC/MDIO warning message
bnx2x: Fix BCM57711+BCM84823 link issue
bnx2x: Clear BCM84833 LED after fan failure
bnx2x: Fix BCM84833 PHY FW version presentation
bnx2x: Fix link issue for BCM8727 boards.
...
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/networking/driver.txt | 31 | ||||
-rw-r--r-- | Documentation/networking/ip-sysctl.txt | 11 | ||||
-rw-r--r-- | Documentation/networking/netdevices.txt | 25 |
3 files changed, 29 insertions, 38 deletions
diff --git a/Documentation/networking/driver.txt b/Documentation/networking/driver.txt index 03283daa64fe..da59e2884130 100644 --- a/Documentation/networking/driver.txt +++ b/Documentation/networking/driver.txt | |||
@@ -2,16 +2,16 @@ Document about softnet driver issues | |||
2 | 2 | ||
3 | Transmit path guidelines: | 3 | Transmit path guidelines: |
4 | 4 | ||
5 | 1) The hard_start_xmit method must never return '1' under any | 5 | 1) The ndo_start_xmit method must not return NETDEV_TX_BUSY under |
6 | normal circumstances. It is considered a hard error unless | 6 | any normal circumstances. It is considered a hard error unless |
7 | there is no way your device can tell ahead of time when it's | 7 | there is no way your device can tell ahead of time when it's |
8 | transmit function will become busy. | 8 | transmit function will become busy. |
9 | 9 | ||
10 | Instead it must maintain the queue properly. For example, | 10 | Instead it must maintain the queue properly. For example, |
11 | for a driver implementing scatter-gather this means: | 11 | for a driver implementing scatter-gather this means: |
12 | 12 | ||
13 | static int drv_hard_start_xmit(struct sk_buff *skb, | 13 | static netdev_tx_t drv_hard_start_xmit(struct sk_buff *skb, |
14 | struct net_device *dev) | 14 | struct net_device *dev) |
15 | { | 15 | { |
16 | struct drv *dp = netdev_priv(dev); | 16 | struct drv *dp = netdev_priv(dev); |
17 | 17 | ||
@@ -23,7 +23,7 @@ Transmit path guidelines: | |||
23 | unlock_tx(dp); | 23 | unlock_tx(dp); |
24 | printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n", | 24 | printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n", |
25 | dev->name); | 25 | dev->name); |
26 | return 1; | 26 | return NETDEV_TX_BUSY; |
27 | } | 27 | } |
28 | 28 | ||
29 | ... queue packet to card ... | 29 | ... queue packet to card ... |
@@ -35,6 +35,7 @@ Transmit path guidelines: | |||
35 | ... | 35 | ... |
36 | unlock_tx(dp); | 36 | unlock_tx(dp); |
37 | ... | 37 | ... |
38 | return NETDEV_TX_OK; | ||
38 | } | 39 | } |
39 | 40 | ||
40 | And then at the end of your TX reclamation event handling: | 41 | And then at the end of your TX reclamation event handling: |
@@ -58,15 +59,12 @@ Transmit path guidelines: | |||
58 | TX_BUFFS_AVAIL(dp) > 0) | 59 | TX_BUFFS_AVAIL(dp) > 0) |
59 | netif_wake_queue(dp->dev); | 60 | netif_wake_queue(dp->dev); |
60 | 61 | ||
61 | 2) Do not forget to update netdev->trans_start to jiffies after | 62 | 2) An ndo_start_xmit method must not modify the shared parts of a |
62 | each new tx packet is given to the hardware. | ||
63 | |||
64 | 3) A hard_start_xmit method must not modify the shared parts of a | ||
65 | cloned SKB. | 63 | cloned SKB. |
66 | 64 | ||
67 | 4) Do not forget that once you return 0 from your hard_start_xmit | 65 | 3) Do not forget that once you return NETDEV_TX_OK from your |
68 | method, it is your driver's responsibility to free up the SKB | 66 | ndo_start_xmit method, it is your driver's responsibility to free |
69 | and in some finite amount of time. | 67 | up the SKB and in some finite amount of time. |
70 | 68 | ||
71 | For example, this means that it is not allowed for your TX | 69 | For example, this means that it is not allowed for your TX |
72 | mitigation scheme to let TX packets "hang out" in the TX | 70 | mitigation scheme to let TX packets "hang out" in the TX |
@@ -74,8 +72,9 @@ Transmit path guidelines: | |||
74 | This error can deadlock sockets waiting for send buffer room | 72 | This error can deadlock sockets waiting for send buffer room |
75 | to be freed up. | 73 | to be freed up. |
76 | 74 | ||
77 | If you return 1 from the hard_start_xmit method, you must not keep | 75 | If you return NETDEV_TX_BUSY from the ndo_start_xmit method, you |
78 | any reference to that SKB and you must not attempt to free it up. | 76 | must not keep any reference to that SKB and you must not attempt |
77 | to free it up. | ||
79 | 78 | ||
80 | Probing guidelines: | 79 | Probing guidelines: |
81 | 80 | ||
@@ -85,10 +84,10 @@ Probing guidelines: | |||
85 | 84 | ||
86 | Close/stop guidelines: | 85 | Close/stop guidelines: |
87 | 86 | ||
88 | 1) After the dev->stop routine has been called, the hardware must | 87 | 1) After the ndo_stop routine has been called, the hardware must |
89 | not receive or transmit any data. All in flight packets must | 88 | not receive or transmit any data. All in flight packets must |
90 | be aborted. If necessary, poll or wait for completion of | 89 | be aborted. If necessary, poll or wait for completion of |
91 | any reset commands. | 90 | any reset commands. |
92 | 91 | ||
93 | 2) The dev->stop routine will be called by unregister_netdevice | 92 | 2) The ndo_stop routine will be called by unregister_netdevice |
94 | if device is still UP. | 93 | if device is still UP. |
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index ad3e80e17b4f..bd80ba5847d2 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt | |||
@@ -604,15 +604,8 @@ IP Variables: | |||
604 | ip_local_port_range - 2 INTEGERS | 604 | ip_local_port_range - 2 INTEGERS |
605 | Defines the local port range that is used by TCP and UDP to | 605 | Defines the local port range that is used by TCP and UDP to |
606 | choose the local port. The first number is the first, the | 606 | choose the local port. The first number is the first, the |
607 | second the last local port number. Default value depends on | 607 | second the last local port number. The default values are |
608 | amount of memory available on the system: | 608 | 32768 and 61000 respectively. |
609 | > 128Mb 32768-61000 | ||
610 | < 128Mb 1024-4999 or even less. | ||
611 | This number defines number of active connections, which this | ||
612 | system can issue simultaneously to systems not supporting | ||
613 | TCP extensions (timestamps). With tcp_tw_recycle enabled | ||
614 | (i.e. by default) range 1024-4999 is enough to issue up to | ||
615 | 2000 connections per second to systems supporting timestamps. | ||
616 | 609 | ||
617 | ip_local_reserved_ports - list of comma separated ranges | 610 | ip_local_reserved_ports - list of comma separated ranges |
618 | Specify the ports which are reserved for known third-party | 611 | Specify the ports which are reserved for known third-party |
diff --git a/Documentation/networking/netdevices.txt b/Documentation/networking/netdevices.txt index 89358341682a..c7ecc7080494 100644 --- a/Documentation/networking/netdevices.txt +++ b/Documentation/networking/netdevices.txt | |||
@@ -47,26 +47,25 @@ packets is preferred. | |||
47 | 47 | ||
48 | struct net_device synchronization rules | 48 | struct net_device synchronization rules |
49 | ======================================= | 49 | ======================================= |
50 | dev->open: | 50 | ndo_open: |
51 | Synchronization: rtnl_lock() semaphore. | 51 | Synchronization: rtnl_lock() semaphore. |
52 | Context: process | 52 | Context: process |
53 | 53 | ||
54 | dev->stop: | 54 | ndo_stop: |
55 | Synchronization: rtnl_lock() semaphore. | 55 | Synchronization: rtnl_lock() semaphore. |
56 | Context: process | 56 | Context: process |
57 | Note1: netif_running() is guaranteed false | 57 | Note: netif_running() is guaranteed false |
58 | Note2: dev->poll() is guaranteed to be stopped | ||
59 | 58 | ||
60 | dev->do_ioctl: | 59 | ndo_do_ioctl: |
61 | Synchronization: rtnl_lock() semaphore. | 60 | Synchronization: rtnl_lock() semaphore. |
62 | Context: process | 61 | Context: process |
63 | 62 | ||
64 | dev->get_stats: | 63 | ndo_get_stats: |
65 | Synchronization: dev_base_lock rwlock. | 64 | Synchronization: dev_base_lock rwlock. |
66 | Context: nominally process, but don't sleep inside an rwlock | 65 | Context: nominally process, but don't sleep inside an rwlock |
67 | 66 | ||
68 | dev->hard_start_xmit: | 67 | ndo_start_xmit: |
69 | Synchronization: netif_tx_lock spinlock. | 68 | Synchronization: __netif_tx_lock spinlock. |
70 | 69 | ||
71 | When the driver sets NETIF_F_LLTX in dev->features this will be | 70 | When the driver sets NETIF_F_LLTX in dev->features this will be |
72 | called without holding netif_tx_lock. In this case the driver | 71 | called without holding netif_tx_lock. In this case the driver |
@@ -87,20 +86,20 @@ dev->hard_start_xmit: | |||
87 | o NETDEV_TX_LOCKED Locking failed, please retry quickly. | 86 | o NETDEV_TX_LOCKED Locking failed, please retry quickly. |
88 | Only valid when NETIF_F_LLTX is set. | 87 | Only valid when NETIF_F_LLTX is set. |
89 | 88 | ||
90 | dev->tx_timeout: | 89 | ndo_tx_timeout: |
91 | Synchronization: netif_tx_lock spinlock. | 90 | Synchronization: netif_tx_lock spinlock; all TX queues frozen. |
92 | Context: BHs disabled | 91 | Context: BHs disabled |
93 | Notes: netif_queue_stopped() is guaranteed true | 92 | Notes: netif_queue_stopped() is guaranteed true |
94 | 93 | ||
95 | dev->set_rx_mode: | 94 | ndo_set_rx_mode: |
96 | Synchronization: netif_tx_lock spinlock. | 95 | Synchronization: netif_addr_lock spinlock. |
97 | Context: BHs disabled | 96 | Context: BHs disabled |
98 | 97 | ||
99 | struct napi_struct synchronization rules | 98 | struct napi_struct synchronization rules |
100 | ======================================== | 99 | ======================================== |
101 | napi->poll: | 100 | napi->poll: |
102 | Synchronization: NAPI_STATE_SCHED bit in napi->state. Device | 101 | Synchronization: NAPI_STATE_SCHED bit in napi->state. Device |
103 | driver's dev->close method will invoke napi_disable() on | 102 | driver's ndo_stop method will invoke napi_disable() on |
104 | all NAPI instances which will do a sleeping poll on the | 103 | all NAPI instances which will do a sleeping poll on the |
105 | NAPI_STATE_SCHED napi->state bit, waiting for all pending | 104 | NAPI_STATE_SCHED napi->state bit, waiting for all pending |
106 | NAPI activity to cease. | 105 | NAPI activity to cease. |