path: root/net/ipv4/route.c
Commit message (Author, Date)
...
* ipv4: Remove all RTCF_DIRECTSRC handling. (David S. Miller, 2012-07-23)
  The last and final kernel user, ICMP address replies, has been removed.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Kill rt->fi (David S. Miller, 2012-07-20)
  It's not really needed. We only grabbed a reference to the fib_info for the sake of fib_info local metrics. However, fib_info objects are freed using RCU, as are therefore their private metrics (if any). We would have triggered a route cache flush if we eliminated a reference to a fib_info object in the routing tables. Therefore, any existing cached routes will first check and see that they have been invalidated before an errant reference to these metric values would occur.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Turn rt->rt_route_iif into rt->rt_is_input. (David S. Miller, 2012-07-20)
  That is this value's only use, as a boolean to indicate whether a route is an input route or not. So implement it that way, using a u16 gap present in the struct already.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Kill rt->rt_oif (David S. Miller, 2012-07-20)
  Never actually used. It was being set on output routes to the original OIF specified in the flow key used for the lookup. Adjust the only user, ipmr_rt_fib_lookup(), for greater correctness of the flowi4_oif and flowi4_iif values, thanks to feedback from Julian Anastasov.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Dirty less cache lines in route caching paths. (David S. Miller, 2012-07-20)
  Don't bother incrementing dst->__use and setting dst->lastuse, they are completely pointless and just slow things down.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Kill FLOWI_FLAG_RT_NOCACHE and associated code. (David S. Miller, 2012-07-20)
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Cache input routes in fib_info nexthops. (David S. Miller, 2012-07-20)
  Caching input routes is slightly simpler than output routes, since we don't need to be concerned with nexthop exceptions. (locally destined, and routed packets, never trigger PMTU events or redirects that will be processed by us). However, we have to elide caching for the DIRECTSRC and non-zero itag cases.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Cache output routes in fib_info nexthops. (David S. Miller, 2012-07-20)
  If we have an output route that lacks nexthop exceptions, we can cache it in the FIB info nexthop. Such routes will have DST_HOST cleared because such routes refer to a family of destinations, rather than just one.
  The sequence of the handling of exceptions during route lookup is adjusted to make the logic work properly. Before we allocate the route, we lookup the exception. Then we know if we will cache this route or not, and therefore whether DST_HOST should be set on the allocated route. Then we use DST_HOST to key off whether we should store the resulting route, during rt_set_nexthop(), in the FIB nexthop cache.
  With help from Eric Dumazet.
  Signed-off-by: David S. Miller <davem@davemloft.net>
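Note: the ordering described in the message above (exception lookup first, then the DST_HOST decision, then the caching decision) can be sketched in user-space C roughly as follows. This is only an illustration under stated assumptions, not the kernel code: `struct route_sketch`, `struct nexthop_sketch`, `set_nexthop_sketch()` and the `DST_HOST_SKETCH` bit value are hypothetical stand-ins, and the exception lookup is reduced to a boolean supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DST_HOST_SKETCH 0x1u	/* hypothetical flag bit, not the kernel's value */

struct route_sketch {
	unsigned int flags;
};

struct nexthop_sketch {
	struct route_sketch *cached_output;	/* NULL until a route is cached */
};

/*
 * Step 1 (done by the caller): look up the nexthop exception first.
 * Step 2: set DST_HOST only when an exception exists, i.e. the route is
 *         specific to a single destination.
 * Step 3: key off DST_HOST to decide whether the route is stored in the
 *         nexthop cache.
 */
static void set_nexthop_sketch(struct route_sketch *rt,
			       struct nexthop_sketch *nh,
			       bool has_exception)
{
	assert(rt != NULL && nh != NULL);

	rt->flags = has_exception ? DST_HOST_SKETCH : 0u;

	if (!(rt->flags & DST_HOST_SKETCH))
		nh->cached_output = rt;	/* shared by a family of destinations */
}
```

A route built with `has_exception == true` is left out of the cache, so later lookups for other destinations behind the same nexthop do not inherit its per-destination state.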
* ipv4: Kill routes during PMTU/redirect updates. (David S. Miller, 2012-07-20)
  Mark them obsolete so there will be a re-lookup to fetch the FIB nexthop exception info.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Document dst->obsolete better. (David S. Miller, 2012-07-20)
  Add a big comment explaining how the field works, and use defines instead of magic constants for the values assigned to it.
  Suggested by Joe Perches.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Adjust semantics of rt->rt_gateway. (David S. Miller, 2012-07-20)
  In order to allow prefixed routes, we have to adjust how rt_gateway is set and interpreted. The new interpretation is:
    1) rt_gateway == 0, destination is on-link, nexthop is iph->daddr
    2) rt_gateway != 0, destination requires a nexthop gateway
  Abstract the fetching of the proper nexthop value using a new inline helper, rt_nexthop(), as suggested by Joe Perches.
  Signed-off-by: David S. Miller <davem@davemloft.net>
  Tested-by: Vijay Subramanian <subramanian.vijay@gmail.com>
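Note: the two rt_gateway cases listed in the message above amount to a small selection helper. The following is a minimal user-space sketch of that idea, not the kernel source: `struct rtable_sketch` models only the one field discussed, addresses are plain 32-bit integers instead of `__be32`, and the `_sketch` names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rtable: only rt_gateway is modelled. */
struct rtable_sketch {
	uint32_t rt_gateway;	/* 0 means the destination is on-link */
};

/* Pick the address that should be resolved as the next hop. */
static inline uint32_t rt_nexthop_sketch(const struct rtable_sketch *rt,
					 uint32_t daddr)
{
	assert(rt != NULL);

	if (rt->rt_gateway)
		return rt->rt_gateway;	/* case 2: go via the gateway */
	return daddr;			/* case 1: on-link, use iph->daddr */
}
```

Routing every caller through one helper keeps the "zero means on-link" convention in a single place.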
* ipv4: Remove 'rt_dst' from 'struct rtable' (David S. Miller, 2012-07-20)
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Remove 'rt_mark' from 'struct rtable' (David Miller, 2012-07-20)
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Kill 'rt_src' from 'struct rtable' (David Miller, 2012-07-20)
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Remove rt_key_{src,dst,tos} from struct rtable. (David Miller, 2012-07-20)
  They are always used in contexts where they can be reconstituted, or where the finally resolved rt->rt_{src,dst} is semantically equivalent.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Kill ip_route_input_noref(). (David Miller, 2012-07-20)
  The "noref" argument to ip_route_input_common() is now always ignored because we do not cache routes, and in that case we must always grab a reference to the resulting 'dst'.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Delete routing cache. (David S. Miller, 2012-07-20)
  The ipv4 routing cache is non-deterministic, performance wise, and is subject to reasonably easy to launch denial of service attacks.
  The routing cache works great for well behaved traffic, and the world was a much friendlier place when the tradeoffs that led to the routing cache's design were considered.
  What it boils down to is that the performance of the routing cache is a product of the traffic patterns seen by a system rather than being a product of the contents of the routing tables. The former of which is controllable by external entities.
  Even for "well behaved" legitimate traffic, high volume sites can see hit rates in the routing cache of only ~10%.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: show pmtu in route list (Julian Anastasov, 2012-07-20)
  Override the metrics with rt_pmtu
  Signed-off-by: Julian Anastasov <ja@ssi.bg>
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Fix again the time difference calculation (Julian Anastasov, 2012-07-19)
  Fix the diff value in rt_bind_exception again after a collision between the two latest patches; my original commit had already fixed the same problem.
  Signed-off-by: Julian Anastasov <ja@ssi.bg>
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: use seqlock for nh_exceptions (Julian Anastasov, 2012-07-19)
  Use global seqlock for the nh_exceptions. Call fnhe_oldest with the right hash chain. Correct the diff value for dst_set_expires.
  v2: after suggestions from Eric Dumazet:
    * get rid of spin lock fnhe_lock, rearrange update_or_create_fnhe
    * continue daddr search in rt_bind_exception
  v3:
    * remove the daddr check before seqlock in rt_bind_exception
    * restart lookup in rt_bind_exception on detected seqlock change, as suggested by David Miller
  Signed-off-by: Julian Anastasov <ja@ssi.bg>
  Signed-off-by: David S. Miller <davem@davemloft.net>
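Note: the "restart lookup ... on detected seqlock change" behaviour mentioned above follows the usual sequence-lock reader pattern. The sketch below shows that pattern in portable C11 atomics. It is an illustration only and makes several assumptions: it is not the kernel's seqlock API, it protects a single hypothetical `struct seq_exception` rather than a global lock over hash chains, and it assumes at most one writer at a time (the kernel primitive serialises writers itself).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical exception record guarded by its own sequence counter. */
struct seq_exception {
	atomic_uint seq;		/* odd while a writer is mid-update */
	_Atomic uint32_t pmtu;
	_Atomic uint32_t gw;
};

/* Writer side: bump to odd, update the fields, bump to even. */
static void exc_write(struct seq_exception *e, uint32_t pmtu, uint32_t gw)
{
	unsigned int s = atomic_load_explicit(&e->seq, memory_order_relaxed);

	assert((s & 1u) == 0u);	/* single-writer assumption */
	atomic_store_explicit(&e->seq, s + 1u, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&e->pmtu, pmtu, memory_order_relaxed);
	atomic_store_explicit(&e->gw, gw, memory_order_relaxed);
	atomic_store_explicit(&e->seq, s + 2u, memory_order_release);
}

/* Reader side: retry whenever a writer was active or the counter moved. */
static void exc_read(struct seq_exception *e, uint32_t *pmtu, uint32_t *gw)
{
	unsigned int begin, end;

	do {
		begin = atomic_load_explicit(&e->seq, memory_order_acquire);
		*pmtu = atomic_load_explicit(&e->pmtu, memory_order_relaxed);
		*gw = atomic_load_explicit(&e->gw, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&e->seq, memory_order_relaxed);
	} while ((begin & 1u) || begin != end);
}
```

Readers never block the writer; they simply repeat the copy until they observe a consistent snapshot, which is why a lookup has to be restarted when the sequence changes underneath it.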
* ipv4: Fix time difference calculation in rt_bind_exception(). (David S. Miller, 2012-07-19)
  Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: fix rcu splat (Eric Dumazet, 2012-07-17)
  free_nh_exceptions() should use rcu_dereference_protected(..., 1) since it's called after one RCU grace period.
  Also add some const-ification in recent code.
  Signed-off-by: Eric Dumazet <edumazet@google.com>
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Fix nexthop exception hash computation. (David S. Miller, 2012-07-17)
  Need to mask it with (FNHE_HASH_SIZE - 1).
  Signed-off-by: David S. Miller <davem@davemloft.net>
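Note: masking with (FNHE_HASH_SIZE - 1) keeps the bucket index inside the table, and it only works as a cheap modulo because the table size is a power of two (2048 entries, per the commit that introduced the exceptions). A minimal sketch, in which the mixing step and the `fnhe_hashfun_sketch()` name are illustrative assumptions rather than the kernel's definition:

```c
#include <assert.h>
#include <stdint.h>

#define FNHE_HASH_SIZE 2048u	/* table size stated in the log */

/* The mask trick is only valid for power-of-two sizes. */
static_assert((FNHE_HASH_SIZE & (FNHE_HASH_SIZE - 1u)) == 0u,
	      "FNHE_HASH_SIZE must be a power of two");

/* Fold the destination address, then mask it down to a bucket index. */
static uint32_t fnhe_hashfun_sketch(uint32_t daddr)
{
	uint32_t hval = daddr;

	hval ^= (hval >> 11) ^ (hval >> 22);	/* illustrative mixing step */
	return hval & (FNHE_HASH_SIZE - 1u);	/* the mask this fix adds */
}
```

Without the final mask the folded value can exceed the table size and index past the end of the bucket array.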
* ipv4: Add FIB nexthop exceptions. (David S. Miller, 2012-07-17)
  In a regime where we have subnetted route entries, we need persistent storage for destination-specific learned values such as redirects and PMTU values.
  This is implemented here via nexthop exceptions.
  The initial implementation is a 2048 entry hash table with reclaiming starting at chain length 5. A more sophisticated scheme can be devised if that proves necessary.
  Signed-off-by: David S. Miller <davem@davemloft.net>
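Note: a chained hash table that recycles the oldest entry once a chain reaches a fixed depth can be sketched as below. This is a user-space illustration under assumptions, not the kernel implementation: all `fnhe_*` names here are hypothetical, the bucket function is a plain mask, there is no locking or RCU, and the exact threshold comparison (`>=` versus `>`) is a guess at what "starting at chain length 5" means.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define FNHE_HASH_SIZE     2048u	/* "2048 entry hash table" */
#define FNHE_RECLAIM_DEPTH 5		/* reclaim once a chain is this long */

static_assert((FNHE_HASH_SIZE & (FNHE_HASH_SIZE - 1u)) == 0u,
	      "table size must be a power of two");

struct fnhe_sketch {
	struct fnhe_sketch *next;
	uint32_t daddr;		/* destination the exception applies to */
	uint32_t pmtu;		/* learned value, e.g. a path MTU */
	unsigned long stamp;	/* time of last update */
};

static struct fnhe_sketch *fnhe_table[FNHE_HASH_SIZE];

static uint32_t fnhe_bucket(uint32_t daddr)
{
	return daddr & (FNHE_HASH_SIZE - 1u);
}

static struct fnhe_sketch *fnhe_lookup(uint32_t daddr)
{
	struct fnhe_sketch *e;

	for (e = fnhe_table[fnhe_bucket(daddr)]; e; e = e->next)
		if (e->daddr == daddr)
			return e;
	return NULL;
}

static int fnhe_chain_len(uint32_t daddr)
{
	struct fnhe_sketch *e;
	int n = 0;

	for (e = fnhe_table[fnhe_bucket(daddr)]; e; e = e->next)
		n++;
	return n;
}

/* Update the entry for daddr, creating or recycling one if needed. */
static struct fnhe_sketch *fnhe_update_or_create(uint32_t daddr,
						 uint32_t pmtu,
						 unsigned long now)
{
	struct fnhe_sketch **head = &fnhe_table[fnhe_bucket(daddr)];
	struct fnhe_sketch *e, *oldest = NULL;
	int depth = 0;

	for (e = *head; e; e = e->next) {
		if (e->daddr == daddr)
			break;
		if (!oldest || e->stamp < oldest->stamp)
			oldest = e;
		depth++;
	}
	if (!e) {
		if (depth >= FNHE_RECLAIM_DEPTH) {
			e = oldest;		/* recycle in place */
		} else {
			e = calloc(1, sizeof(*e));
			if (!e)
				return NULL;
			e->next = *head;	/* push onto the chain head */
			*head = e;
		}
		e->daddr = daddr;
	}
	e->pmtu = pmtu;
	e->stamp = now;
	return e;
}
```

Bounding the chain length caps both lookup cost and memory per bucket, at the price of forgetting the least recently updated destination when a bucket is crowded.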
* net: Pass optional SKB and SK arguments to dst_ops->{update_pmtu,redirect}() (David S. Miller, 2012-07-17)
  This will be used so that we can compose a full flow key. Even though we have a route in this context, we need more. In the future the routes will be without destination address, source address, etc. keying. One ipv4 route will cover entire subnets, etc.
  In this environment we have to have a way to possess persistent storage for redirects and PMTU information. This persistent storage will exist in the FIB tables, and that's why we'll need to be able to rebuild a full lookup flow key here. Using that flow key will do a fib_lookup() and create/update the persistent entry.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Don't store a rule pointer in fib_result. (David S. Miller, 2012-07-13)
  We only use it to fetch the rule's tclassid, so just store the tclassid there instead.
  This also decreases the size of fib_result by a full 8 bytes on 64-bit. On 32-bit it's a wash.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Fix warnings in ip_do_redirect() for some configurations. (David S. Miller, 2012-07-12)
  Reported-by: Fengguang Wu <fengguang.wu@intel.com>
  Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add dummy dst_ops->redirect method where needed. (David S. Miller, 2012-07-12)
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Kill ip_rt_redirect(). (David S. Miller, 2012-07-12)
  No longer needed, as the protocol handlers now all properly propagate the redirect back into the routing code.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Add ipv4_redirect() and ipv4_sk_redirect() helper functions. (David S. Miller, 2012-07-12)
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Generalize ip_do_redirect() and hook into new dst_ops->redirect. (David S. Miller, 2012-07-11)
  All of the redirect acceptance policy is now contained within.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Rearrange arguments to ip_rt_redirect() (David S. Miller, 2012-07-11)
  Pass in the SKB rather than just the IP addresses, so that policy and other aspects can reside in ip_rt_redirect() rather than icmp_redirect().
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Pull redirect instantiation out into a helper function. (David S. Miller, 2012-07-11)
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Remove inetpeer from routes. (David S. Miller, 2012-07-11)
  No longer used.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Calling ->cow_metrics() now is a bug. (David S. Miller, 2012-07-11)
  Nothing ever writes to ipv4 metrics any longer. PMTU is stored in rt->rt_pmtu. Dynamic TCP metrics are stored in a special TCP metrics cache, completely outside of the routes.
  Therefore ->cow_metrics() can simply be nothing more than a WARN_ON trigger so we can catch anyone who tries to add new writes to ipv4 route metrics.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Kill dst_copy_metrics() call from ipv4_blackhole_route(). (David S. Miller, 2012-07-11)
  Blackhole routes have a COW metrics operation that returns NULL always, therefore this dst_copy_metrics() call did absolutely nothing.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Enforce max MTU metric at route insertion time. (David S. Miller, 2012-07-11)
  Rather than at every struct rtable creation.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Maintain redirect and PMTU info in struct rtable again. (David S. Miller, 2012-07-11)
  Maintaining this in the inetpeer entries was not the right way to do this at all.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* rtnetlink: Remove ts/tsage args to rtnl_put_cacheinfo(). (David S. Miller, 2012-07-11)
  Nobody provides non-zero values any longer.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* inet: Kill FLOWI_FLAG_PRECOW_METRICS. (David S. Miller, 2012-07-11)
  No longer needed. TCP writes metrics, but now in its own special cache that does not dirty the route metrics. Therefore there is no longer any reason to pre-cow metrics in this way.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* inet: Minimize use of cached route inetpeer. (David S. Miller, 2012-07-11)
  Only use it in the absolutely required cases:
    1) COW'ing metrics
    2) ipv4 PMTU
    3) ipv4 redirects
  Signed-off-by: David S. Miller <davem@davemloft.net>
* tcp: Move timestamps from inetpeer to metrics cache. (David S. Miller, 2012-07-11)
  With help from Lin Ming.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Don't report route RTT metric value in cache dumps. (David S. Miller, 2012-07-11)
  We don't maintain it dynamically any longer, so reporting it would be extremely misleading. Report zero instead.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: No need to set generic neighbour pointer. (David S. Miller, 2012-07-05)
  Nobody reads it any longer.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* net: Add optional SKB arg to dst_ops->neigh_lookup(). (David S. Miller, 2012-07-05)
  Causes the handler to use the daddr in the ipv4/ipv6 header when the route gateway is unspecified (local subnet).
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Don't report neigh uptodate state in rtcache procfs. (David S. Miller, 2012-07-05)
  Soon routes will not have a cached neigh attached, nor will we be able to necessarily go directly to a neigh from an arbitrary route.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Make neigh lookups directly in output packet path. (David S. Miller, 2012-07-05)
  Do not use the dst cached neigh, we'll be getting rid of that.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Remove extraneous assignment of dst->tclassid. (David S. Miller, 2012-06-29)
  We already set it several lines above.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Adjust in_dev handling in fib_validate_source() (David S. Miller, 2012-06-28)
  Checking for in_dev being NULL is pointless. In fact, all of our callers have in_dev precomputed already, so just pass it in and remove the NULL checking.
  Signed-off-by: David S. Miller <davem@davemloft.net>
* ipv4: Kill rt->rt_spec_dst, no longer used. (David S. Miller, 2012-06-28)
  Signed-off-by: David S. Miller <davem@davemloft.net>