author     Eric Dumazet <edumazet@google.com>      2012-07-13 23:16:27 -0400
committer  David S. Miller <davem@davemloft.net>   2012-07-17 02:08:33 -0400
commit     5a308f40bfe27fcfd1db3970afe18b635f23c182 (patch)
tree       2d2a596e36a3f6c46405c18982d9f4536b24c204 /net/sched
parent     7ff65cdea72bdf9af0b9c57cd003211875ca4142 (diff)
netem: refine early skb orphaning
netem does an early orphaning of skbs. Doing so breaks TCP Small Queue
or any mechanism relying on socket sk_wmem_alloc feedback.

Ideally, we should perform this orphaning after the rate module and
before the delay module, to mimic what happens on a real link: skb
orphaning is indeed normally done at TX completion, before the transit
on the link.

+-------+   +--------+  +---------------+  +-----------------+
+ Qdisc +---> Device +--> TX completion +--> links / hops    +->
+       +   +  xmit  +  + skb orphaning +  + propagation     +
+-------+   +--------+  +---------------+  +-----------------+
      < rate limiting >                  < delay, drops, reorders >

If netem is used without the delay feature (drops, reorders, rate
limiting), then we should avoid early skb orphaning, to keep pressure
on sockets as long as packets are still in the qdisc queue.

Ideally, netem should be refactored to implement the delay module as
the last stage. The current algorithm merges the two phases (rate
limiting + delay), so it's not correct.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hagen Paul Pfeifer <hagen@jauu.net>
Cc: Mark Gordon <msg@google.com>
Cc: Andreas Terzis <aterzis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
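To make the back-pressure argument concrete, the following is a minimal
userspace sketch, not kernel code: fake_sock, fake_skb, fake_wfree() and
enqueue() are made-up stand-ins for struct sock, struct sk_buff,
sock_wfree() and netem_enqueue(). It only illustrates that orphaning a
buffer releases the bytes charged to sk_wmem_alloc, which is exactly the
feedback TCP Small Queues relies on while packets sit in the qdisc.

/*
 * Userspace sketch of why deferring skb orphaning keeps sk_wmem_alloc
 * back-pressure alive.  All types and helpers below are simplified
 * stand-ins, not the kernel's real structures or functions.
 */
#include <stdio.h>

struct fake_sock {
	int wmem_alloc;			/* bytes charged to the sender */
};

struct fake_skb {
	struct fake_sock *sk;
	int truesize;
	void (*destructor)(struct fake_skb *skb);
};

/* analogue of sock_wfree(): return the charged bytes to the socket */
static void fake_wfree(struct fake_skb *skb)
{
	skb->sk->wmem_alloc -= skb->truesize;
}

/* analogue of skb_orphan(): run the destructor, detach from the socket */
static void fake_orphan(struct fake_skb *skb)
{
	if (skb->destructor)
		skb->destructor(skb);
	skb->sk = NULL;
	skb->destructor = NULL;
}

/* analogue of the patched decision: orphan only when delay is configured */
static void enqueue(struct fake_skb *skb, int latency, int jitter)
{
	if (latency || jitter)
		fake_orphan(skb);
	/* otherwise the skb keeps charging wmem_alloc while queued,
	 * so the sending socket still feels the pressure */
}

int main(void)
{
	struct fake_sock sk = { .wmem_alloc = 1500 };
	struct fake_skb skb = { .sk = &sk, .truesize = 1500,
				.destructor = fake_wfree };

	enqueue(&skb, 0, 0);		/* loss/reorder-only netem */
	printf("no delay  : wmem_alloc = %d\n", sk.wmem_alloc);	/* 1500 */

	skb.sk = &sk;
	skb.destructor = fake_wfree;
	enqueue(&skb, 100000, 0);	/* delay configured: orphan now */
	printf("with delay: wmem_alloc = %d\n", sk.wmem_alloc);	/* 0 */
	return 0;
}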
Diffstat (limited to 'net/sched')
-rw-r--r--  net/sched/sch_netem.c  9
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index c412ad0d0308..298c0ddfb57e 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -380,7 +380,14 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch)
 		return NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
 	}
 
-	skb_orphan(skb);
+	/* If a delay is expected, orphan the skb. (orphaning usually takes
+	 * place at TX completion time, so _before_ the link transit delay)
+	 * Ideally, this orphaning should be done after the rate limiting
+	 * module, because this breaks TCP Small Queue, and other mechanisms
+	 * based on socket sk_wmem_alloc.
+	 */
+	if (q->latency || q->jitter)
+		skb_orphan(skb);
 
 	/*
 	 * If we need to duplicate packet, then re-insert at top of the