Dynamic Right Sizing
Dynamic Right-Sizing provides automatic tuning of TCP flow control windows to support high bandwidth over high-latency (WAN) links. It can improve TCP throughput by orders of magnitude over links with a high bandwidth-delay product. It also keeps windows small for low-bandwidth and low-latency connections so they don't consume unnecessary amounts of memory.
Over the past decade, TCP has become the ubiquitous transport protocol
for the Internet. However, stock TCP performs abysmally over high-bandwidth
or high-delay links (or more generally, networks with high bandwidth-delay
products). As a result, the performance of application infrastructures such
as computational grids and high-volume web servers, which are built on TCP,
suffers. The usual workaround is to tune system buffers by hand. Dynamic
right-sizing is a technique that eliminates the need for this manual process of tuning
system buffers. This technique, which can be implemented in the operating-system
kernel or in user space, results in nearly an order-of-magnitude increase
in throughput over a WAN while still abiding by TCP semantics.
In 2001, Feng and Fisk described modifications to a Linux 2.4.8 kernel that
allow the kernel to tune the buffer size advertised by the TCP receiver.
The receiver's kernel estimates the bandwidth from the amount of data received
in each round-trip time and uses that bandwidth estimate to derive the receiver's
window. The sender's window is not constrained by the system default window
size but is allowed to grow, throttled only by the receiver's advertised
window. (The Linux 2.4 kernel allows the sender's window buffer to grow.
For other operating systems, the sender would have to be configured with a "big
enough" buffer.) The growth of the sender's congestion window will
be limited by currently available bandwidth. High delay, high bandwidth
flows will automatically use larger buffers (within the limits of the initial
window scale factor advertised by the receiver). No modifications are required
to either client or server network applications. Also see the Allman/Paxson '99
paper on receiver-side bandwidth estimation.
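The receiver-side estimate described above can be sketched roughly as follows. This is only an illustration, not the kernel patch itself: the function name, the `headroom` factor of two (picked on the assumption that a sender in slow start can at most double its window in one round trip) and the rule of never shrinking the window are all assumptions of this sketch.

```python
def drs_next_window(bytes_last_rtt, current_window, max_window, headroom=2):
    """Return the receive window (in bytes) to advertise for the next RTT.

    bytes_last_rtt  -- data that arrived during the most recent round trip
    current_window  -- window being advertised at the moment
    max_window      -- ceiling set by the window scale factor sent at setup
    headroom        -- assumed growth allowance (hypothetical parameter)
    """
    # Bytes per RTT is the receiver's bandwidth estimate; leave room to grow.
    target = headroom * bytes_last_rtt
    # Assumption of this sketch: never shrink a window already advertised.
    grown = max(current_window, target)
    # Stay within what the initial window scale factor can express.
    return min(grown, max_window)


# Made-up per-RTT byte counts, for illustration only.
window = 64 * 1024
for received in (40_000, 80_000, 160_000, 320_000, 375_000, 375_000):
    window = drs_next_window(received, window, max_window=1 << 20)
    print(received, window)
```

The window follows the measured rate upwards and then settles once the amount received per round trip stops growing, which is the capping behaviour one would expect from a receiver-driven scheme.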
Effects of DRS (dynamic right sizing - Feng)
The graph on the left shows the transfer WITHOUT DRS; notice that the cwnd (in red) fluctuates quite wildly (the green is the average value of cwnd so far...). The graph on the right shows that with DRS the cwnd value is capped - in this case to just under 800,000 bytes - which is roughly the bandwidth-delay product of this link: 6e-3 s * 1e9 bit/s / 8 ~= 750,000 bytes (the rtt reported by web100 is actually higher than the 6ms).
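As a quick check of the arithmetic, here is a small sketch of the bandwidth-delay product calculation. The function names are made up for illustration; the 1 Gbit/s rate and 6 ms RTT are the figures quoted above. The second helper turns the question around and gives the RTT at which a given window exactly fills the path.

```python
def bdp_bytes(rate_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes (integer arithmetic, RTT in ms)."""
    return rate_bits_per_s * rtt_ms // (8 * 1000)


def rtt_ms_for_window(window_bytes, rate_bits_per_s):
    """RTT in milliseconds at which window_bytes exactly fills the path."""
    return window_bytes * 8 * 1000 / rate_bits_per_s


# 1 Gbit/s link with a 6 ms round-trip time.
print(bdp_bytes(1_000_000_000, 6))

# RTT that an 800,000 byte window would correspond to at 1 Gbit/s.
print(rtt_ms_for_window(800_000, 1_000_000_000))
```

A cap just under 800,000 bytes would correspond to an RTT a little under 6.4 ms at line rate, which at least fits with web100 reporting an RTT higher than 6 ms.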
What I don't understand at the moment is why it keeps on going into slow start (I have graphs elsewhere). I suspect it's to do with what web100 calls OtherReductions... I'm investigating.
Moving onto cwnd dynamics:
Plot of Cwnd and DataBytesOut - I think the delta of DataBytesOut is more meaningful in this context: the cwnd is dynamic, so we should be looking at what effect it has on the number of bytes leaving the sender...
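For what it's worth, the delta can be computed along these lines. This is a sketch which assumes the log is a time-ordered list of (cwnd, DataBytesOut) samples with DataBytesOut as a running total; the function name and the sample values are made up.

```python
def bytes_out_per_interval(samples):
    """Turn cumulative DataBytesOut samples into per-interval deltas.

    samples -- time-ordered (cwnd, data_bytes_out) pairs
    Returns (cwnd, delta) pairs, using the cwnd seen at the end of each
    interval (a choice made for this sketch).
    """
    pairs = []
    for (_, prev_out), (cwnd, out) in zip(samples, samples[1:]):
        pairs.append((cwnd, out - prev_out))
    return pairs


# Made-up samples, for illustration only.
log = [(10_000, 0), (20_000, 15_000), (40_000, 50_000), (40_000, 90_000)]
print(bytes_out_per_interval(log))
```

Plotting the delta against cwnd, rather than the running total, is what gives the scatter plots discussed here.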
Again, the DRS-enabled one is on the right and the standard one is on the left.
For the standard one, we appear to have two main regions: the linear increase, where there is a direct relation between the cwnd and the number of bytes allowed out, and then a second region where we've reached the maximum line rate (notice that there are some points beyond this region - it happens). The other bits of the graph are usually to do with the way TCP handles errors and recovery. A filtering process on some other variables would show this.
The DRS one is much nicer - we have the same linear bit, but as DRS caps the cwnd value, we get a small cluster and spread at the top. There is also a plateau at the bottom - you tell me! It says to me that for a given value of cwnd, it's going to send out a range of bytes... does it make sense?
But then again, the picture on the right looks like an emu to me....
I've included a variable called InstaneousWindow in the logvars list, it is snd_nxt-snd_una and represents the size of the sliding window of tcp. I'm not sure if this variable works very well (bad coding?) as it's often 0. But looking at the value of the variables, there is a difference.
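For reference, the quantity itself is just the amount of sent-but-unacknowledged data. A minimal sketch of the calculation follows; the function name is made up, and the modulo step assumes the raw values are 32-bit TCP sequence numbers that can wrap around, which may or may not be how the logging code sees them.

```python
SEQ_SPACE = 1 << 32  # TCP sequence numbers are 32 bits wide and wrap around


def instantaneous_window(snd_nxt, snd_una):
    """Bytes sent but not yet acknowledged: snd_nxt - snd_una, modulo 2**32."""
    return (snd_nxt - snd_una) % SEQ_SPACE


print(instantaneous_window(1_000, 400))           # ordinary case
print(instantaneous_window(100, SEQ_SPACE - 50))  # snd_nxt has wrapped
print(instantaneous_window(5_000, 5_000))         # nothing outstanding
```

A value of 0 here just means that everything sent so far had been acknowledged at the moment the sample was taken.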
© 2001-2003, Yee-Ting Li, email: email@example.com,
Tel: +44 (0) 20 7679 1376, Fax: +44 (0) 20 7679 7145
Room D14, High Energy Particle Physics, Dept. of Physics & Astronomy, UCL, Gower St, London, WC1E 6BT