linux/include/linux
Eric Dumazet 46d3ceabd8 tcp: TCP Small Queues
This introduces TSQ (TCP Small Queues).

The goal of TSQ is to reduce the number of TCP packets in xmit queues
(qdisc & device queues), to reduce RTT and cwnd bias, part of the
bufferbloat problem.

sk->sk_wmem_alloc is not allowed to grow above a given limit,
allowing no more than ~128KB [1] per tcp socket in qdisc/dev layers at a
given time.

TSO packets are sized/capped to half the limit, so that we have two
TSO packets in flight, allowing better bandwidth use.

As a side effect, setting the limit to 40000 automatically reduces the
standard gso max limit (65536) to 40000/2. This can help reduce the
latency of high prio packets, since TSO packets are smaller.
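
The sizing rule above can be checked with a little arithmetic. The
following is a minimal userspace sketch, not the kernel code: the helper
name tsq_tso_cap() is hypothetical, and it only assumes what the text
states, namely that a TSO packet is capped at half the limit and never
exceeds the standard gso max limit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): largest TSO packet
 * allowed when TSO packets are capped to half of the TSQ limit and
 * must also stay within the standard gso max limit.
 */
static uint32_t tsq_tso_cap(uint32_t gso_max, uint32_t limit_bytes)
{
	uint32_t half;

	assert(gso_max > 0);
	half = limit_bytes / 2;
	return half < gso_max ? half : gso_max;
}
```

With a limit of 128 * 1024 bytes, tsq_tso_cap(65536, 128 * 1024) leaves
the cap at 65536, so two full-sized TSO packets fit under the limit.
With a limit of 40000, the cap drops to 20000.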

This means we divert sock_wfree() to a tcp_wfree() handler, to
queue/send following frames when skb_orphan() [2] is called for the
already queued skbs.
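
The throttle/resume cycle described above can be modelled in userspace.
This is only an illustrative sketch: struct tsq_model and the model_*()
functions are hypothetical names, the counters stand in for
sk->sk_wmem_alloc and the limit, and the "resume" step is a simple
counter where the real code would queue and send the following frames.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one TCP socket under TSQ. */
struct tsq_model {
	size_t wmem_alloc;  /* bytes still held in qdisc/device queues */
	size_t limit;       /* per-socket cap on wmem_alloc */
	bool throttled;     /* sender stopped because the cap was hit */
	unsigned resumed;   /* times the destructor restarted sending */
};

/* Try to hand one packet of `truesize` bytes to the lower layers. */
static bool model_xmit(struct tsq_model *sk, size_t truesize)
{
	if (sk->wmem_alloc >= sk->limit) {
		sk->throttled = true;  /* wait for queued packets to be freed */
		return false;
	}
	sk->wmem_alloc += truesize;
	return true;
}

/* Stand-in for the destructor run when a queued packet is released. */
static void model_wfree(struct tsq_model *sk, size_t truesize)
{
	assert(sk->wmem_alloc >= truesize);
	sk->wmem_alloc -= truesize;
	if (sk->throttled) {
		sk->throttled = false;
		sk->resumed++;  /* the real handler would send following frames */
	}
}
```

With limit set to 128 * 1024, two 64KB packets are accepted, a third is
refused and marks the socket throttled, and the first model_wfree()
call clears the throttle so that sending can continue.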

Results on my dev machines (tg3/ixgbe nics) are really impressive,
using standard pfifo_fast, and with or without TSO/GSO.

Without reduction of nominal bandwidth, we have reduction of buffering
per bulk sender :
< 1ms on Gbit (instead of 50ms with TSO)
< 8ms on 100Mbit (instead of 132 ms)

I no longer have 4 MBytes backlogged in qdisc by a single netperf
session, and socket autotuning on both sides no longer uses 4 MBytes.

As the skb destructor cannot restart xmit itself (as the qdisc lock
might be taken at this point), we delegate the work to a tasklet. We
use one tasklet per cpu for performance reasons.

If the tasklet finds a socket owned by the user, it sets the TSQ_OWNED
flag. This flag is tested in a new protocol method called from
release_sock(), to eventually send new segments.
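
The deferral described above can be sketched as a small userspace
model. All names here (struct defer_model, model_tasklet(),
model_release_sock()) are hypothetical stand-ins, and the `sent` counter
replaces the actual transmission of new segments; only the TSQ_OWNED
idea is taken from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the socket state involved in the deferral. */
struct defer_model {
	bool owned_by_user;  /* a process currently owns the socket */
	bool tsq_owned;      /* deferred-work flag, modelled on TSQ_OWNED */
	unsigned sent;       /* times "send new segments" was performed */
};

/* Stand-in for the tasklet: send now, or leave a note for the owner. */
static void model_tasklet(struct defer_model *sk)
{
	assert(sk != NULL);
	if (sk->owned_by_user)
		sk->tsq_owned = true;  /* owner will do the work on release */
	else
		sk->sent++;
}

/* Stand-in for release_sock(): run any work deferred by the tasklet. */
static void model_release_sock(struct defer_model *sk)
{
	assert(sk != NULL);
	sk->owned_by_user = false;
	if (sk->tsq_owned) {
		sk->tsq_owned = false;
		sk->sent++;
	}
}
```

In this model, a tasklet run against an owned socket sends nothing and
only sets the flag; the send happens once, when the owner releases the
socket.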

[1] New /proc/sys/net/ipv4/tcp_limit_output_bytes tunable
[2] skb_orphan() is usually called at TX completion time,
  but some drivers call it in their start_xmit() handler.
  These drivers should at least use BQL, or else a single TCP
  session can still fill the whole NIC TX ring, since TSQ will
  have no effect.
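
To inspect the tunable from [1] in a program, something like the
following userspace sketch could be used. The function name
read_limit_output_bytes() is hypothetical, and the sketch assumes the
file holds a single decimal integer, as sysctl entries of this kind
usually do.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read one decimal integer from `path`.
 * Returns 0 and stores the value in *out on success, -1 if the file
 * cannot be opened or does not start with a number.
 */
static int read_limit_output_bytes(const char *path, long *out)
{
	FILE *f;
	int parsed;

	assert(path != NULL && out != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	parsed = fscanf(f, "%ld", out);
	fclose(f);
	return parsed == 1 ? 0 : -1;
}
```

Called with "/proc/sys/net/ipv4/tcp_limit_output_bytes", it returns -1
on kernels that do not provide the tunable, so callers can fall back
gracefully.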

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Dave Taht <dave.taht@bufferbloat.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Matt Mathis <mattmathis@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-07-11 18:12:59 -07:00