litmus-rt.git - The LITMUS^RT kernel.

	Commit message	Author	Age
*	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next	David S. Miller	2013-02-14
\|\ 	Steffen Klassert says:
	====================
	1) Remove a duplicated call to skb_orphan() in pf_key, from Cong Wang.
	2) Prepare xfrm and pf_key for algorithms without pf_key support, from Jussi Kivilinna.
	3) Fix an unbalanced lock in xfrm_output_one(), from Li RongQing.
	4) Add an IPsec state resolution packet queue to handle packets that are sent before the states are resolved.
	5) xfrm4_policy_fini() is unused since 2.6.11, time to remove it. From Michal Kubecek.
	6) The xfrm gc threshold was configurable just in the initial namespace, make it configurable in all namespaces. From Michal Kubecek.
	7) We currently cannot insert policies with mark and mask such that some flows would be matched from both policies. Allow this if the priorities of these policies are different, the one with the higher priority is used in this case.
	====================
	Signed-off-by: David S. Miller <davem@davemloft.net>
\| *	xfrm: Allow inserting policies with matching mark and different priorities	Steffen Klassert	2013-02-11
\| \|	We currently cannot insert policies with mark and mask such that some flows would be matched from both policies. We make this possible when the priorities of these policies are different. If both policies match a flow, the one with the higher priority is used. Reported-by: Emmanuel Thierry <emmanuel.thierry@telecom-bretagne.eu> Reported-by: Romain Kuntz <r.kuntz@ipflavors.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
\| *	xfrm: make gc_thresh configurable in all namespaces	Michal Kubecek	2013-02-06
\| \|	The xfrm gc threshold can be configured via the xfrm{4,6}_gc_thresh sysctl, but currently only in init_net; other namespaces always use the default value. This can substantially limit the number of IPsec tunnels that can be effectively used. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
\| *	xfrm: remove unused xfrm4_policy_fini()	Michal Kubecek	2013-02-06
\| \|	Function xfrm4_policy_fini() is unused since xfrm4_fini() was removed in 2.6.11. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
\| *	xfrm: Add a state resolution packet queue	Steffen Klassert	2013-02-06
\| \|	By default, we blackhole packets until the key manager resolves the states. This patch implements a packet queue where IPsec packets are queued until the states are resolved. We generate a dummy xfrm bundle; the output routine of the returned route enqueues the packet to a per-policy queue and arms a timer that checks for state resolution when dst_output() is called. Once the states are resolved, the packets are sent out of the queue. If the states are not resolved after some time, the queue is flushed. This patch keeps the default behaviour of blackholing packets as long as we have no states. To enable the packet queue, the sysctl xfrm_larval_drop must be switched off. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
\| *	xfrm: fix a unbalanced lock	Li RongQing	2013-02-01
\| \|	Signed-off-by: Li RongQing <roy.qing.li@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
\| *	pf_key/xfrm_algo: prepare pf_key and xfrm_algo for new algorithms without pfkey support	Jussi Kivilinna	2013-02-01
\| \|	Mark existing algorithms as pfkey supported and make pfkey only use algorithms that have pfkey_supported set. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
\| *	af_key: remove a duplicated skb_orphan()	Cong Wang	2013-01-28
\| \|	skb_set_owner_r() will call skb_orphan(), I don't see any reason to call it twice. Cc: Steffen Klassert <steffen.klassert@secunet.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
* \|	bridge: make ifla_br_policy and br_af_ops static	Cong Wang	2013-02-14
\| \|	They are only used within this file. Cc: Vlad Yasevich <vyasevic@redhat.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	bgmac: add read of interrupt mask after disabling interrupts	Nathan Hintz	2013-02-14
\| \|	The specs prescribe an immediate read of the interrupt mask after disabling interrupts. This patch updates the driver to match the specs. Signed-off-by: Nathan Hintz <nlhintz@hotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	bridge: use __u16 in if_bridge.h	Cong Wang	2013-02-14
\| \|	We should use "__u16" instead of "u16" in the user-space visible header. Cc: Vlad Yasevich <vyasevic@redhat.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge branch 'bridge_vlan'	David S. Miller	2013-02-13
\|\ \ 	Vlad Yasevich says:
	====================
	VLAN filtering/VLAN aware bridge
	Changes since v10:
	* Updated implementation of ndo_fdb_del in emulex and qlogic drivers.
	Changes since v9:
	* Series re-ordering to make functionality more distinct. Basic vlan filtering is patches 1-4. Support for PVID/untagged vlans is patches 5 and 6. VLAN support for FDB/MDB is patches 7-11. Patch 12 is still additional egress policy.
	* Slight simplification to code that extracts the VID from skb. Since we now depend on the vlan module, at the time of input skb_tci is guaranteed to be set if the packet had an 8021q header. We can simply refer to it.
	* Changed the opaque 'parent' pointer from prior patches to a union so we can be much more explicit in our assignments.
	* Lots of additional testing with STP turned on. No issues were observed.
	Changes since v8:
	* Unified vlans_to_* calls into a single interface.
	* Fixed the rest of the issues reported by Michal Miroslaw.
	* Fixed a bug where fdb entries were not created for all added vlans.
	Changes since v7:
	* Rebased on the latest net-next and removed the vlan wrapper patch from the series.
	* Fixed a crash in br_fdb_add/br_fdb_delete.
	Changes since v6:
	* VLANs are now stored in a VLAN bitmap per port. This allows for O(1) lookup at ingress and egress. We simply check to see if the bit associated with the vlan id is set in the map. The drawback to this approach is that it wastes some space when there is only a small number of VLANs.
	* In addition to the build time configuration option, VLAN filtering also has a configuration parameter in sysfs. By default the filtering is turned off and all traffic is permitted. When the filtering is turned on, we do strict matching to the filter configured. Thus, if there is no configuration, all packets are rejected. This was done to make the behavior more straightforward. Without this (and if the egress policy patch is rejected), the decision for how to forward untagged traffic that was not filtered at ingress is almost impossible to make. It would not be right to deliver to every port that has PVID set, as each port may have a different PVID.
	* Separate egress policy bitmap patch has been isolated and is provided last in the series. This has been a more contentious piece of functionality and I wanted to isolate it so that it could easily be dropped and not block the whole series.
	Changes since v5:
	- Pulled VLAN filtering into its own file and made it a configuration option.
	- Made new vlan filtering option dependent on VLAN_8021Q.
	- Got rid of HW filter inlines and moved them to vlan_core.c. (All of the above suggested by Stephen Hemminger)
	Changes since v4:
	- Pull per-port vlan data into its own structures and give it to the bridge device, thus making the bridge device behave like a regular port for vlan configuration.
	- Add a per-vlan 'untagged' bitmap that determines egress policy. If a port is part of this bitmap, traffic egresses untagged.
	- PVID is now used for ingress policy only. Incoming frames without a VLAN tag are assigned to the PVID vlan. Egress is determined via bitmap memberships.
	- Allow for incremental config of a vlan. Now, PVID and untagged memberships may be set on existing vlans. They however can NOT be cleared separately.
	- VLAN deletion is now done via RTM_DELLINK command for PF_BRIDGE family. This cleans up the netlink interface.
	Changes since v3:
	- Re-integrated compiler problems that got left out last time. Apologies.
	- checkpatch.pl errors fixed.
	Changes since v2:
	- Added inline functions to manipulate vlan hw filters and re-use in 8021q and bridge code.
	- Use rtnl_dereference (Michael Tsirkin)
	- Remove synchronize_net() call (Eric Dumazet)
	- Fix NULL ptr deref bug I introduced in br_ifinfo_notify.
	Changes since v1:
	- Fixed some forwarding bugs.
	- Add vlan to local fdb entries. New local entries are created per vlan to facilitate correct forwarding to the bridge interface.
	- Allow configuration of vlans directly on the bridge master device in addition to ports.
	Changes since rfc v2:
	- Per-port vlan bitmap is gone and is replaced with a vlan list.
	- Added bridge vlan list, which is referenced by each port. Entries in the bridge vlan list have a port bitmap that shows which ports are part of which vlan.
	- Netlink API changes.
	- Dropped sysfs support for now. If people think this is really useful, can add it back.
	- Support for native/untagged vlans.
	Changes since rfc v1:
	- Comments addressed regarding formatting and RCU usage.
	- ioctls have been removed and changed over to the netlink interface.
	- Added support of user added ndb entries.
	- Changed sysfs interface to export a bitmap. Also added a write interface. I am not sure how much I like it, but it made my testing easier/faster. I might change the write interface to take text instead of binary.
	====================
	Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bridge: Separate egress policy bitmap	Vlad Yasevich	2013-02-13
\| \| \|	Add an ability to configure a separate "untagged" egress policy to the VLAN information of the bridge. This supersedes PVID policy and makes PVID ingress-only. The policy is configured with a new flag and is represented as a port bitmap per vlan. Egress frames with a VLAN id in the "untagged" policy bitmap would egress the port without a VLAN header. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bridge: Add vlan support for local fdb entries	Vlad Yasevich	2013-02-13
\| \| \|	When VLAN is added to the port, a local fdb entry for that port (the entry with the mac address of the port) is added for that VLAN. This way we can correctly determine if the traffic is for the bridge itself. If the address of the port changes, we try to change all the local fdb entries we have for that port. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bridge: Add vlan support to static neighbors	Vlad Yasevich	2013-02-13
\| \| \|	When a user adds bridge neighbors, allow them to specify a VLAN id. If the VLAN id is not specified, the neighbor will be added for VLANs currently in the port's filter list. If no VLANs are configured on the port, we use vlan 0 and only add 1 entry. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Acked-by: Jitendra Kalsaria <jitendra.kalsaria@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bridge: Add vlan id to multicast groups	Vlad Yasevich	2013-02-13
\| \| \|	Add vlan_id to multicast groups so that we know which vlan each group belongs to and can correctly forward to the appropriate vlan. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bridge: Add vlan to unicast fdb entries	Vlad Yasevich	2013-02-13
\| \| \|	This patch adds vlan to unicast fdb entries that are created for learned addresses (not the manually configured ones). It adds vlan id into the hash mix and uses vlan as an additional parameter for an entry match. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bridge: Add the ability to configure pvid	Vlad Yasevich	2013-02-13
\| \| \|	A user may designate a certain vlan as PVID. This means that any ingress frame that does not contain a vlan tag is assigned to this vlan and any forwarding decisions are made with this vlan in mind. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bridge: Implement vlan ingress/egress policy with PVID.	Vlad Yasevich	2013-02-13
\| \| \|	At ingress, any untagged traffic is assigned to the PVID. Any tagged traffic is filtered according to membership bitmap. At egress, if the vlan matches the PVID, the frame is sent untagged. Otherwise the frame is sent tagged. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
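The ingress and egress rules described in this commit message can be sketched in plain C. This is only an illustration and not the bridge code itself: `struct port`, `NO_TAG`, `ingress()` and `egress_untagged()` are hypothetical names, membership is a simple array rather than a bitmap, and the sketch assumes every port has a PVID configured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VID_MAX 4096u    /* 12-bit VLAN id space */
#define NO_TAG  0xFFFFu  /* hypothetical marker for "frame carried no VLAN tag" */

/* Hypothetical per-port state: a PVID plus per-VLAN membership. */
struct port {
	uint16_t pvid;
	bool member[VID_MAX];
};

/* Ingress: untagged frames are assigned to the PVID, tagged frames must
 * belong to a VLAN the port is a member of. Returns false to drop. */
static bool ingress(const struct port *p, uint16_t tag, uint16_t *vid)
{
	assert(p && vid);
	if (tag == NO_TAG) {
		*vid = p->pvid;
		return true;
	}
	if (tag >= VID_MAX || !p->member[tag])
		return false;
	*vid = tag;
	return true;
}

/* Egress: a frame whose VLAN equals the PVID leaves untagged,
 * every other frame leaves tagged. */
static bool egress_untagged(const struct port *p, uint16_t vid)
{
	return vid == p->pvid;
}
```

A port with PVID 1 that is also a member of VLAN 20 would accept untagged frames as VLAN 1, accept frames tagged 20, and drop frames tagged 30.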
\| * \|	bridge: Dump vlan information from a bridge port	Vlad Yasevich	2013-02-13
\| \| \|	Use RTM_GETLINK to dump the vlan filter list of a given bridge port. The information depends on setting the filter flag, similar to how nic VF info is dumped. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bridge: Add netlink interface to configure vlans on bridge ports	Vlad Yasevich	2013-02-13
\| \| \|	Add a netlink interface to add and remove vlan configuration on a bridge port. The interface uses the RTM_SETLINK message and encodes the vlan configuration inside the IFLA_AF_SPEC. It is possible to include multiple vlans to either add or remove in a single message. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bridge: Verify that a vlan is allowed to egress on given port	Vlad Yasevich	2013-02-13
\| \| \|	When bridge forwards a frame, make sure that a frame is allowed to egress on that port. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bridge: Validate that vlan is permitted on ingress	Vlad Yasevich	2013-02-13
\| \| \|	When a frame arrives on a port or is transmitted by the bridge, if we have VLANs configured, validate that a given VLAN is allowed to enter the bridge. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	bridge: Add vlan filtering infrastructure	Vlad Yasevich	2013-02-13
\|/ / 	Adds an optional infrastructure component to bridge that would allow native vlan filtering in the bridge. Each bridge port (as well as the bridge device) now gets a VLAN bitmap. Each bit in the bitmap is associated with a vlan id. This way, if the bit corresponding to the vid is set in the bitmap, then the packet with that vid is allowed to enter and exit the port. Write access to the bitmap is protected by RTNL and read access is protected by RCU. Vlan functionality is disabled by default. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
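A per-port VLAN bitmap with one bit per vlan id, as this commit message describes, might look like the following userspace sketch. The names (`struct port_vlans`, `port_vlan_add()` and so on) are hypothetical, and the RTNL/RCU protection mentioned above is deliberately left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define VLAN_N_VID    4096u  /* 12-bit VLAN id space */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define VLAN_WORDS    ((VLAN_N_VID + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical per-port filter: bit N set means vlan id N is allowed. */
struct port_vlans {
	unsigned long bitmap[VLAN_WORDS];
};

static void port_vlans_init(struct port_vlans *pv)
{
	memset(pv, 0, sizeof(*pv));  /* empty filter: no vlan allowed */
}

static void port_vlan_add(struct port_vlans *pv, unsigned int vid)
{
	assert(vid < VLAN_N_VID);
	pv->bitmap[vid / BITS_PER_WORD] |= 1UL << (vid % BITS_PER_WORD);
}

static void port_vlan_del(struct port_vlans *pv, unsigned int vid)
{
	assert(vid < VLAN_N_VID);
	pv->bitmap[vid / BITS_PER_WORD] &= ~(1UL << (vid % BITS_PER_WORD));
}

/* O(1) membership test used at both ingress and egress. */
static bool port_vlan_allowed(const struct port_vlans *pv, unsigned int vid)
{
	if (vid >= VLAN_N_VID)
		return false;
	return (pv->bitmap[vid / BITS_PER_WORD] >> (vid % BITS_PER_WORD)) & 1UL;
}
```

The fixed-size array costs 512 bytes per port regardless of how many VLANs are configured, which is the space trade-off the cover letter for this series mentions.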
* \|	net: sctp: add build check for sctp_sf_eat_sack_6_2/jsctp_sf_eat_sack	Daniel Borkmann	2013-02-13
\| \|	In order to avoid any future surprises of kernel panics due to jprobes function mismatches (as e.g. fixed in 4cb9d6eaf85ecd: sctp: jsctp_sf_eat_sack: fix jprobes function signature mismatch), we should check both function types during build and scream loudly if they do not match. __same_type resolves to __builtin_types_compatible_p, which is 1 in case both types are the same and 0 otherwise, qualifiers are ignored. Tested by myself. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
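The build-time check described here can be approximated outside the kernel with the same GCC/Clang builtin. In this sketch `SAME_TYPE` is a stand-in for the kernel's `__same_type`, and `target()`, `probe()` and `mismatched()` are hypothetical functions; C11 `_Static_assert` plays the role of the kernel's build check.

```c
#include <assert.h>

/* Stand-in for the kernel's __same_type(); requires GCC or Clang. */
#define SAME_TYPE(a, b) \
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

/* Hypothetical probed function and its handler. */
static int target(int a, long b)     { return a + (int)b; }
static int probe(int a, long b)      { (void)a; (void)b; return 0; }

/* A handler whose first parameter type has drifted from the target's. */
static int mismatched(long a, long b) { (void)a; (void)b; return 0; }

/* Fails the build as soon as the two signatures diverge. */
_Static_assert(SAME_TYPE(target, probe),
	       "probe handler signature must match the probed function");
```

Swapping `probe` for `mismatched` in the `_Static_assert` would stop compilation instead of leaving the mismatch to surface at run time.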
* \|	net: sctp: minor: make jsctp_sf_eat_sack static	Daniel Borkmann	2013-02-13
\| \|	The function jsctp_sf_eat_sack can be made static, no need to extend its visibility. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	bgmac: return error on failed PHY write	Rafał Miłecki	2013-02-13
\| \|	Some callers may want to know if a PHY write succeeded. Also make PHY functions static, they are not exported anywhere. Signed-off-by: Rafał Miłecki <zajec5@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	be2net: remove BUG_ON() in be_mcc_compl_is_new()	Sathya Perla	2013-02-13
\| \|	The current code expects that the last word (with valid bit) of an MCC compl is DMAed in one shot. This may not be the case. Remove this assertion. Signed-off-by: Sathya Perla <sathya.perla@emulex.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	net: ethernet: ti: remove redundant NULL check.	Cyril Roelandt	2013-02-13
\| \|	cpdma_chan_destroy() on a NULL pointer is a no-op, so the NULL check in cpdma_ctlr_destroy() can safely be removed. Signed-off-by: Cyril Roelandt <tipecaml@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	net: Fix possible wrong checksum generation.	Pravin B Shelar	2013-02-13
\| \|	Patch cef401de7be8c4e (net: fix possible wrong checksum generation) fixed a wrong checksum calculation, but it broke TSO by defining a new GSO type without a netdev feature for that type. net_gso_ok() would not allow hardware checksum/segmentation offload of such packets without the feature. The following patch fixes both TSO and the wrong checksum. This patch uses the same logic that Eric Dumazet used. It introduces a new flag, SKBTX_SHARED_FRAG, which is set if at least one frag can be modified by the user, but the SKBTX_SHARED_FRAG flag is kept in the skb shared info tx_flags rather than in gso_type. tx_flags is better than gso_type since we can have an skb with a shared frag that is not a gso packet. It does not link SHARED_FRAG to GSO, so there is no need to define a netdev feature for this. Signed-off-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge branch 'tcp_tsoffset'	David S. Miller	2013-02-13
\|\ \ 	Andrey Vagin says:
	====================
	If a TCP socket gets live-migrated from one box to another, the timestamps (which are typically ON) will get screwed up -- the new kernel will generate TS values that have nothing to do with what they were on dump. The solution is to yet again fix the kernel and put a "timestamp offset" on a socket. A socket offset is added in places where the externally visible tcp timestamp option is parsed/initialized. Connections in the SYN_RECV state are not supported; the global tcp_time_stamp is used for them, because repair mode doesn't support this state. In the future it can be implemented in a similar way as for TIME_WAIT sockets. For time-wait sockets the offset is inherited from a proper tcp_sock. A per-socket offset can be set only for sockets in repair mode.
	====================
	Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	tcp: send packets with a socket timestamp	Andrey Vagin	2013-02-13
\| \| \|	A socket timestamp is a sum of the global tcp_time_stamp and a per-socket offset. A socket offset is added in places where the externally visible tcp timestamp option is parsed/initialized. Connections in the SYN_RECV state are not supported; the global tcp_time_stamp is used for them, because repair mode doesn't support this state. In the future it can be implemented in a similar way as for TIME_WAIT sockets. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
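The "global clock plus per-socket offset" arithmetic from this commit message can be illustrated with unsigned 32-bit values. This is a sketch under assumptions: `struct sock_ts`, `socket_timestamp()` and `offset_for_restore()` are hypothetical names, and the restore helper shows one plausible way a checkpoint/restore tool could pick the offset, not the kernel's actual code path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-socket state holding only the timestamp offset. */
struct sock_ts {
	uint32_t tsoffset;
};

/* Externally visible timestamp: global clock plus per-socket offset.
 * Unsigned arithmetic wraps modulo 2^32, like the TCP timestamp itself. */
static uint32_t socket_timestamp(uint32_t global_ts, const struct sock_ts *sk)
{
	assert(sk);
	return (uint32_t)(global_ts + sk->tsoffset);
}

/* On restore, choose the offset so the new host continues from the
 * timestamp value that was dumped on the old host. */
static uint32_t offset_for_restore(uint32_t dumped_ts, uint32_t new_host_ts)
{
	return (uint32_t)(dumped_ts - new_host_ts);
}
```

Because the subtraction and addition both wrap, the scheme still works when the new host's clock is numerically ahead of the dumped value.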
\| * \|	tcp: set and get per-socket timestamp	Andrey Vagin	2013-02-13
\| \| \|	A timestamp can be set only if a socket is in repair mode. This patch adds a new socket option, TCP_TIMESTAMP, which allows getting and setting the current tcp timestamp. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	tcp: adding a per-socket timestamp offset	Andrey Vagin	2013-02-13
\|/ / 	This functionality is used for restoring tcp sockets. A tcp timestamp depends on how long a system has been running, so it differs for each host. The solution is to set a per-socket offset. A per-socket offset for a TIME_WAIT socket is inherited from a proper tcp socket. tcp_request_sock doesn't have a timestamp offset, because repair mode is not implemented for it. Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Pavel Emelyanov <xemul@parallels.com> Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
* \|	Merge branch 'gfar-ethtool-atomic' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux	David S. Miller	2013-02-13
\|\ \ 	Paul Gortmaker says:
	====================
	Eric noticed that the handling of local u64 ethtool counters for this driver commonly found on Freescale ppc-32 boards was racy. However, before converting them over to atomic64_t, I noticed that an internal struct was being used to determine the offsets for exporting this data into the ethtool buffer, and in doing so, it assumed that the counters would always be u64. Rather than keep this implicit assumption, a simple code cleanup gets rid of the struct completely, and leaves fewer conversion sites. The alternative solution would have been to take advantage of the fact that the counters are all relating to error conditions, and hence make them internally u32. In doing so, we'd be assuming that U32_MAX of any particular error condition is highly unlikely. This might have made sense if any increments were in a hot path. Tested with "ethtool -S eth0" on sbc8548 board.
	====================
	Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \|	gianfar: convert u64 status counters to atomic64_t	Paul Gortmaker	2013-02-12
\| \| \|	While looking at some asm dump for an unrelated change, Eric noticed in the following stats count increment code:
	50b8: 81 3c 01 f8 lwz r9,504(r28)
	50bc: 81 5c 01 fc lwz r10,508(r28)
	50c0: 31 4a 00 01 addic r10,r10,1
	50c4: 7d 29 01 94 addze r9,r9
	50c8: 91 3c 01 f8 stw r9,504(r28)
	50cc: 91 5c 01 fc stw r10,508(r28)
	that a 64 bit counter was used on ppc-32 without sync and hence the "ethtool -S" output was racy. Here we convert all the values to use atomic64_t so that the output will always be consistent. Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
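The race comes from updating the two 32-bit halves of a 64-bit counter with separate loads and stores. A userspace analogue of the fix, using C11 atomics in place of the kernel's atomic64_t, might look like this. `struct extra_stats`, `stat_inc()` and `stat_read()` are hypothetical names for illustration; on some 32-bit targets, 64-bit C11 atomics may need linking against libatomic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical error counter block with one 64-bit atomic counter. */
struct extra_stats {
	_Atomic uint64_t rx_errors;
};

/* Increment as one indivisible 64-bit operation, so a concurrent reader
 * never observes a half-updated value (low word bumped, carry pending). */
static void stat_inc(_Atomic uint64_t *counter)
{
	atomic_fetch_add_explicit(counter, 1, memory_order_relaxed);
}

/* Read the full 64-bit value in one indivisible load. */
static uint64_t stat_read(_Atomic uint64_t *counter)
{
	return atomic_load_explicit(counter, memory_order_relaxed);
}
```

Relaxed ordering is enough in this sketch because the counters are independent statistics and only each individual value needs to be consistent.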
\| * \|	gianfar: remove largely unused gfar_stats struct	Paul Gortmaker	2013-02-12
\| \| \|	The gfar_stats struct is only used in copying out data via ethtool. It is declared as the extra stats, followed by the rmon stats. However, the rmon stats are never actually used in the driver; instead the rmon data is a u32 register read that is cast directly into the ethtool buf. It seems the only reason rmon is in the struct at all is to give the offset(s) at which it should be exported into the ethtool buffer. But note gfar_stats doesn't contain a gfar_extra_stats as a substruct -- instead it contains a u64 array of equal element count. This implicitly means we have two independent declarations of what gfar_extra_stats really is. Rather than have this duality, we already have defines which give us the offset directly, and hence do not need the struct at all. Further, since we know the extra_stats is always present, we can write it out to the ethtool buf first, and then optionally write out the rmon data. There is no need for two independent loops, both of which are simply copying out the extra_stats to buf offset zero. This also helps pave the way towards allowing the extra stats fields to be converted to atomic64_t values, without having their types directly influencing the ethtool stats export code (gfar_fill_stats) that expects to deal with u64. Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
* \| \|	netpoll: fix smatch warnings in netpoll core code	Neil Horman	2013-02-13
\| \| \|	Dan Carpenter contacted me with some notes regarding some smatch warnings in the netpoll code, some of which I introduced with my recent netpoll locking fixes, some of which were there prior. Specifically they were:
	net-next/net/core/netpoll.c:243 netpoll_poll_dev() warn: inconsistent returns mutex:&ni->dev_lock: locked (213,217) unlocked (210,243)
	net-next/net/core/netpoll.c:706 netpoll_neigh_reply() warn: potential pointer math issue ('skb_transport_header(send_skb)' is a 128 bit pointer)
	This patch corrects the locking imbalance (the first error), and adds some parentheses to correct the second error. Tested by myself. Applies to net-next. Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: Dan Carpenter <dan.carpenter@oracle.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	net: skbuff: fix compile error in skb_panic()	James Hogan	2013-02-13
\| \| \|	I get the following build error on next-20130213 due to the following commit: commit f05de73bf82fbbc00265c06d12efb7273f7dc54a ("skbuff: create skb_panic() function and its wrappers"). It adds an argument called panic to a function that uses the BUG() macro which tries to call panic, but the argument masks the panic() function declaration, resulting in the following error (gcc 4.2.4):
	net/core/skbuff.c In function 'skb_panic':
	net/core/skbuff.c +126 : error: called object 'panic' is not a function
	This is fixed by renaming the argument to msg. Signed-off-by: James Hogan <james.hogan@imgtec.com> Cc: Jean Sacren <sakiwit@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
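The shadowing problem can be reproduced in a few lines of ordinary C. Everything here is a stand-in: `panic()` only records its message, `BUG()` is defined as a plain call to it (as it is on some kernel configurations), and `report()` is a hypothetical function in the position of skb_panic().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char last_msg[64];

/* Stand-in for the kernel's panic(): only records the message. */
static void panic(const char *why)
{
	snprintf(last_msg, sizeof(last_msg), "%s", why);
}

/* Stand-in for a BUG() implementation that expands to a panic() call. */
#define BUG() panic("BUG!")

/*
 * Broken variant, kept as a comment because it does not compile:
 *
 *     static void report(const char *panic) { BUG(); }
 *
 * Inside the body, the parameter 'panic' hides the function of the same
 * name, so the call produced by BUG() targets a 'const char *' and the
 * compiler reports that the called object is not a function.
 */

/* Fixed variant: the parameter no longer shadows panic(). */
static void report(const char *msg)
{
	(void)msg;
	BUG();
}
```

Renaming the parameter is enough because the macro expands inside the function body, where the innermost declaration of the identifier wins.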
* \| \|	act_police: improved accuracy at high rates	Jiri Pirko	2013-02-12
\| \| \|	Current act_police uses rate table computed by the "tc" userspace program, which has the following issue: The rate table has 256 entries to map packet lengths to token (time units). With TSO sized packets, the 256 entry granularity leads to loss/gain of rate, making the token bucket inaccurate. Thus, instead of relying on rate table, this patch explicitly computes the time and accounts for packet transmission times with nanosecond granularity. This is a followup to 56b765b79e9a78dc7d3f8850ba5e5567205a3ecd ("htb: improved accuracy at high rates"). Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	act_police: move struct tcf_police to act_police.c	Jiri Pirko	2013-02-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It's not used anywhere else, so move it. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	tbf: improved accuracy at high rates	Jiri Pirko	2013-02-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current TBF uses rate table computed by the "tc" userspace program, which has the following issue: The rate table has 256 entries to map packet lengths to token (time units). With TSO sized packets, the 256 entry granularity leads to loss/gain of rate, making the token bucket inaccurate. Thus, instead of relying on rate table, this patch explicitly computes the time and accounts for packet transmission times with nanosecond granularity. This is a followup to 56b765b79e9a78dc7d3f8850ba5e5567205a3ecd ("htb: improved accuracy at high rates"). Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
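To illustrate why a 256-entry table loses accuracy, the sketch below compares a simplified table-style lookup with a direct nanosecond computation. It is an assumption-laden model, not the kernel or tc implementation: `len_to_ns()`, `table_to_ns()`, and the `cell_log` parameter are illustrative names, and the table is modelled as charging each packet the time of the upper bound of its cell.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Exact transmission time in ns for `len` bytes at `rate_bytes_ps` bytes/s. */
static uint64_t len_to_ns(uint32_t len, uint64_t rate_bytes_ps)
{
    assert(rate_bytes_ps > 0);
    return (uint64_t)len * NSEC_PER_SEC / rate_bytes_ps;
}

/*
 * Simplified model of a 256-entry rate table: packet lengths are grouped
 * into cells of (1 << cell_log) bytes, and every length in a cell is
 * charged the time of the cell's upper bound.
 */
static uint64_t table_to_ns(uint32_t len, uint64_t rate_bytes_ps,
                            unsigned int cell_log)
{
    uint32_t slot = len >> cell_log;

    if (slot > 255)
        slot = 255; /* only 256 entries are available */
    return len_to_ns((slot + 1) << cell_log, rate_bytes_ps);
}
```

At 125000000 bytes/s (1 Gbit/s) with 256-byte cells, a 1500-byte packet costs 12000 ns when computed directly but 12288 ns in this table model, because it is rounded up to 1536 bytes. Larger cells, as needed to cover TSO-sized packets, widen that gap.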
* \| \|	sch_api: introduce qdisc_watchdog_schedule_ns()	Jiri Pirko	2013-02-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tbf will need to schedule watchdog in ns. No need to convert it twice. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	sch: make htb_rate_cfg and functions around that generic	Jiri Pirko	2013-02-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As it is going to be used in tbf as well, push these to generic code. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	htb: initialize cl->tokens and cl->ctokens correctly	Jiri Pirko	2013-02-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These are in ns so convert from ticks to ns. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	htb: remove pointless first initialization of buffer and cbuffer	Jiri Pirko	2013-02-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These are initialized correctly a couple of lines later in the function. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	htb: use PSCHED_TICKS2NS()	Jiri Pirko	2013-02-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
* \| \|	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	2013-02-12
\|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Conflicts: drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c The bnx2x gso_type setting bug fix in 'net' conflicted with changes in 'net-next' that broke the gso_* setting logic out into a separate function, which also fixes the bug in question. Thus, use the 'net-next' version. Signed-off-by: David S. Miller <davem@davemloft.net>
\| * \| \|	htb: fix values in opt dump	Jiri Pirko	2013-02-12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In htb_change_class(), cl->buffer and cl->cbuffer are stored in ns. So in dump, convert them back to psched ticks. Note this was introduced by: commit 56b765b79e9a78dc7d3f8850ba5e5567205a3ecd htb: improved accuracy at high rates Please consider this for -net/-stable. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
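The unit mismatch fixed here can be sketched as a store/dump round trip. This is an illustrative model only: `struct sketch_class`, `change_class()`, `dump_class()`, and the shift value of 6 are assumptions made for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed tick-to-ns scale for this sketch: one tick is 64 ns. */
#define SKETCH_SHIFT 6

static int64_t ticks_to_ns(uint32_t ticks)
{
    return (int64_t)ticks << SKETCH_SHIFT;
}

static uint32_t ns_to_ticks(int64_t ns)
{
    assert(ns >= 0);
    return (uint32_t)(ns >> SKETCH_SHIFT);
}

/* Hypothetical class state: both values are kept internally in ns. */
struct sketch_class {
    int64_t buffer_ns;
    int64_t cbuffer_ns;
};

/* Userspace supplies ticks; convert to ns when storing. */
static void change_class(struct sketch_class *cl,
                         uint32_t buffer_ticks, uint32_t cbuffer_ticks)
{
    cl->buffer_ns = ticks_to_ns(buffer_ticks);
    cl->cbuffer_ns = ticks_to_ns(cbuffer_ticks);
}

/* Dump must convert back to ticks so userspace sees what it configured. */
static void dump_class(const struct sketch_class *cl,
                       uint32_t *buffer_ticks, uint32_t *cbuffer_ticks)
{
    *buffer_ticks = ns_to_ticks(cl->buffer_ns);
    *cbuffer_ticks = ns_to_ticks(cl->cbuffer_ns);
}
```

Without the conversion in the dump path, a value configured as 1000 ticks would be reported as 64000 under this sketch's scale.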
\| * \| \|	Merge branch 'for-davem' of ↵	David S. Miller	2013-02-12
\| \|\ \ \ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== Here is another handful of late-breaking fixes intended for the 3.8 stream... Hopefully the will still make it! :-) There are three mac80211 fixes pulled from Johannes: "Here are three fixes still for the 3.8 stream, the fix from Cong Ding for the bad sizeof (Stephen Hemminger had pointed it out before but I'd promptly forgotten), a mac80211 managed-mode channel context usage fix where a downgrade would never stop until reaching non-HT and a bug in the channel determination that could cause invalid channels like HT40+ on channel 11 to be used." Also included is a mwl8k fix that avoids an oops when using mwl8k devices that only support the 5 GHz band. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>