Transfer Tests between London UCL HEP and Bristol Physics
Conducted on Wednesday 28th November 2001
PipeChar
Initial pipechar results are here
and here.
Both results show wild fluctuations and a severe bottleneck at man-gw-1.bwe.net.uk,
ub-gw-1.bwe.net.uk and br1-gr1.nwpp.bris.ac.uk, with ub-gw-1.bwe.net.uk
suffering from very high delay and jitter.
It should be noted that these routers may have some form of queue management
that drops our packets as they pass through the router, resulting
in low throughput.
Iperf
Repeating the measurements of the 26th, the socket buffer size on bsesrv1
was increased to allow a window of 1024k. The transfer was then throttled
from the UCL end by using various socket buffer sizes on the sender. Results
of the web100 monitoring can be seen here.
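As a minimal sketch of what "using various socket buffer sizes" means on the sending host, the snippet below requests a send buffer with SO_SNDBUF and reads back what the kernel actually granted. It is an illustration only, not the command used for these tests; the function name open_sender_socket is hypothetical, and the kernel is free to round, double or cap the requested value.

```python
import socket

def open_sender_socket(buffer_bytes):
    """Create a TCP socket and request a send buffer of buffer_bytes.

    Returns the socket and the buffer size the kernel reports back,
    which may differ from the request (rounding, doubling or a system cap).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buffer_bytes)
    granted = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    return sock, granted

if __name__ == "__main__":
    # A few of the buffer sizes used in the tests
    for size in (4096, 65536, 1048576):
        sock, granted = open_sender_socket(size)
        print(f"requested {size} bytes, kernel reports {granted} bytes")
        sock.close()
```

Comparing the requested and granted values is a quick way to spot a system-wide limit that silently caps the larger buffer sizes.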
From | IP | To | IP | Remote Port # | Protocol | Socket Buffer Size (bytes) | Timestamp (Unix epoch) | | Duration (s) | Bytes Transferred | Bandwidth (bits/sec) | Web100 Log
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 4096 | 1006948008 | 1 | 10.4 | 1900544 | 1465941 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 8192 | 1006948018 | 1 | 10.1 | 3137536 | 2495611 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 16384 | 1006948028 | 1 | 10.2 | 3899392 | 3049979 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 32768 | 1006948039 | 1 | 10.1 | 3670016 | 2912847 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 65536 | 1006948049 | 1 | 10.2 | 4243456 | 3338756 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 98304 | 1006948059 | 1 | 10.4 | 4202496 | 3237012 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 131072 | 1006948070 | 1 | 11 | 2957312 | 2147693 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 163840 | 1006948081 | 1 | 10.8 | 4014080 | 2960389 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 196608 | 1006948092 | 1 | 10.6 | 4997120 | 3778039 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 262144 | 1006948102 | 1 | 10.4 | 5300224 | 4087309 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 327680 | 1006948116 | 1 | 11 | 4562944 | 3332446 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 491520 | 1006948127 | 1 | 11.5 | 5414912 | 3756565 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 524288 | 1006948138 | 1 | 11.2 | 3751936 | 2691278 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 1048576 | 1006948150 | 1 | 14 | 4595712 | 2632052 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 4096 | 1006948190 | 1 | 13.1 | 1998848 | 1224555 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 8192 | 1006948203 | 1 | 10 | 2867200 | 2289011 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 16384 | 1006948213 | 1 | 10.1 | 3645440 | 2881898 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 32768 | 1006948224 | 1 | 10.3 | 3743744 | 2901637 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 65536 | 1006948234 | 1 | 10.2 | 4833280 | 3807993 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 98304 | 1006948244 | 1 | 10.2 | 6660096 | 5247936 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 131072 | 1006948255 | 1 | 10.3 | 3784704 | 2952726 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 163840 | 1006948265 | 1 | 10.6 | 4153344 | 3134457 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 196608 | 1006948276 | 1 | 10.3 | 4898816 | 3808933 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 262144 | 1006948286 | 1 | 11.3 | 5136384 | 3642646 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 327680 | 1006948297 | 1 | 12.1 | 3825664 | 2530755 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 491520 | 1006948310 | 1 | 11.7 | 3530752 | 2409956 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 524288 | 1006948325 | 1 | 12.5 | 3170304 | 2030526 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 1048576 | 1006948337 | 1 | 16.8 | 4014080 | 1916900 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 4096 | 1006948381 | 1 | 10.1 | 1581056 | 1257700 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 8192 | 1006948391 | 1 | 10 | 3956736 | 3155924 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 16384 | 1006948401 | 1 | 10 | 5898240 | 4698495 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 32768 | 1006948411 | 1 | 10.2 | 5398528 | 4246999 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 65536 | 1006948421 | 1 | 10.3 | 6062080 | 4724529 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 98304 | 1006948435 | 1 | 10.5 | 3629056 | 2767654 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 131072 | 1006948445 | 1 | 10.5 | 4096000 | 3123778 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 163840 | 1006948456 | 1 | 10.6 | 3923968 | 2952067 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 196608 | 1006948467 | 1 | 11.1 | 4644864 | 3333353 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 262144 | 1006948478 | 1 | 11.1 | 5169152 | 3741078 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 327680 | 1006948489 | 1 | 11.9 | 4530176 | 3038800 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 491520 | 1006948501 | 1 | 12.6 | 4325376 | 2753880 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 524288 | 1006948514 | 1 | 12.3 | 4038656 | 2626114 | Web100
pc35.hep.ucl.ac.uk | 128.40.4.35 | bsesrv1.phy.bris.ac.uk | 137.222.74.1 | 20006 | tcp | 1048576 | 1006948526 | 1 | 12.3 | 5152768 | 3340226 | Web100
Averaging the three runs for each socket buffer size gives,
Socket Buffer Size (Bytes) | Average Link Utilisation (Bits/sec) | Average Link Utilisation (MBits/sec, 1 MBit = 2^20 bits)
4096 | 1316065.333 | 1.255097707
8192 | 2646848.667 | 2.524231593
16384 | 3543457.333 | 3.37930425
32768 | 3353827.667 | 3.198459307
65536 | 3957092.667 | 3.773777644
98304 | 3750867.333 | 3.57710584
131072 | 2741399 | 2.614401817
163840 | 3015637.667 | 2.87593619
196608 | 3640108.333 | 3.471477826
262144 | 3823677.667 | 3.646543185
327680 | 2967333.667 | 2.829869906
491520 | 2973467 | 2.835719109
524288 | 2449306 | 2.335840225
1048576 | 2629726 | 2.507902145
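The averages can be reproduced from the per-run bandwidth figures in the results table. The sketch below does this for three of the buffer sizes; the function name average_utilisation is hypothetical, and the second value assumes that the megabit column is the bits/sec figure divided by 2^20, which is what the tabulated numbers are consistent with.

```python
# Bandwidth (bits/sec) for the three runs, copied from the results table
runs = {
    4096: [1465941, 1224555, 1257700],
    65536: [3338756, 3807993, 4724529],
    1048576: [2632052, 1916900, 3340226],
}

def average_utilisation(samples):
    """Return (mean in bits/sec, mean in Mbit/s using 1 Mbit = 2**20 bits)."""
    mean_bits = sum(samples) / len(samples)
    return mean_bits, mean_bits / 2**20

if __name__ == "__main__":
    for size, samples in runs.items():
        mean_bits, mean_mbits = average_utilisation(samples)
        print(f"{size:>8} bytes: {mean_bits:.3f} bits/sec = {mean_mbits:.9f} Mbit/s")
```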
Again this shows the odd peak, followed by lower throughput at higher
buffer sizes than there should be. This suggests that a router is possibly dropping
the packets from pc35.
However, this does not explain why there is a dip when the buffer size
is 128k and 160k.
Web100 Analysis
Just to check that the rmem and wmem values have been adjusted;

This seems okay: it shows that the rwin on bsesrv1 is set to 1 MB.
Trying to explain the cause of the lower throughput for the 128k
and 160k windows,
For set 1;

For set 2;

For set 3;

The extra time that the 1024k socket buffer transfers take to complete can
be attributed to the extra data in the network as a result of having
such a large buffer size. As expected, the smallest socket buffer size
has the fewest errors. However, there seems to be no correlation between
the number of recoveries and the throughput. What is unexpected (at least
initially, without the pipechar results) is the number of recoveries: traffic
through the Janet network should typically be relatively 'clean', with low
jitter and almost constant RTT. This would result in few recoveries, certainly
not as many as are seen on this link.
Looking at the specific Web100 variables shows whether the link is being
limited by the cwnd, the sender or the receiver.
Looking first at the Sender Limited time (remember that 1e+6 is one second)...
For the first set;

Second set;

Third set;

This shows explicitly that the link is limited by pc35 (the sender)
only for small buffer sizes, where the sender-limited time is high.
Conversely, the amount of time that the link remains limited by the cwnd
is,
For the first set;

Second set;

Third set;

As we are only sending data, the receiver window has no meaning.
So again, it seems to be the network that is at fault.
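To compare the three limiting states across transfers of different length, the limited times can be turned into fractions of the total. This is a minimal sketch: the function name and the sample values are hypothetical, not figures from these logs, and the only thing taken from the text is that the times are in microseconds (1e+6 is one second).

```python
def limit_fractions(sender_us, cwnd_us, rwin_us):
    """Return the share of time spent in each limiting state.

    All three inputs are times in microseconds, as reported by Web100.
    """
    total = sender_us + cwnd_us + rwin_us
    if total == 0:
        return {"sender": 0.0, "cwnd": 0.0, "rwin": 0.0}
    return {
        "sender": sender_us / total,
        "cwnd": cwnd_us / total,
        "rwin": rwin_us / total,
    }

if __name__ == "__main__":
    # Hypothetical transfer: 1.5 s sender limited, 8.5 s cwnd limited
    fractions = limit_fractions(1_500_000, 8_500_000, 0)
    for state, share in fractions.items():
        print(f"{state:>6} limited: {share:.1%}")
```

A transfer that is mostly cwnd limited points at the network rather than at either end host, which is the reading made above.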
In order to look into the slow start/congestion avoidance scheme of TCP,
we need to look at cwnd and ssthresh;
And at cwnd;
Taking the difference between ssthresh and cwnd;
TCP re-enters slow start after an error whenever cwnd falls below ssthresh,
so one can see that slow start occurs occasionally along this link
(i.e. whenever the y value is positive).
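The same test can be applied directly to logged samples rather than read off a graph. The sketch below flags each sample where ssthresh minus cwnd is positive; the function name and the sample values are hypothetical and only illustrate the comparison.

```python
def in_slow_start(cwnd_samples, ssthresh_samples):
    """Return one flag per sample: True where ssthresh - cwnd is positive."""
    return [
        (ssthresh - cwnd) > 0
        for cwnd, ssthresh in zip(cwnd_samples, ssthresh_samples)
    ]

if __name__ == "__main__":
    # Hypothetical samples (bytes): growth, a loss, then a restart
    cwnd = [1460, 2920, 5840, 11680, 8760, 1460, 2920]
    ssthresh = [65535, 65535, 65535, 5840, 5840, 4380, 4380]
    flags = in_slow_start(cwnd, ssthresh)
    share = sum(flags) / len(flags)
    print(flags)
    print(f"share of samples in slow start: {share:.0%}")
```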
So how full is the sending socket buffer? We can analyse the window size
of the transfer from two variables: snd_una (the oldest unack'ed sequence
number) and snd_nxt (the next sequence number to be sent).
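The amount of data outstanding at any instant is the distance between these two sequence numbers. A minimal sketch follows, assuming both values are raw 32-bit TCP sequence numbers, so the subtraction has to be taken modulo 2^32; the function name is hypothetical.

```python
SEQ_SPACE = 2**32  # TCP sequence numbers are 32-bit and wrap around

def bytes_in_flight(snd_una, snd_nxt):
    """Bytes sent but not yet acknowledged, allowing for wraparound."""
    return (snd_nxt - snd_una) % SEQ_SPACE

if __name__ == "__main__":
    print(bytes_in_flight(1000, 66536))          # a full 64k window outstanding
    print(bytes_in_flight(SEQ_SPACE - 100, 200)) # sequence number has wrapped
```

Plotting this value against time gives the effective window used by the transfer, which can then be compared with the socket buffer size requested.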
From these graphs, it is quite difficult to differentiate between the
different socket buffer sizes. The only obvious sign is that the 64k socket
buffer size, in all three sets of tests, allows the window size to grow quite
large (compared to the others). This would explain the comparatively higher
throughput achieved by using a 64k socket buffer size. Pumping
any more packets into the network would cause more problems in the network,
to which TCP reacts by decreasing the size of its cwnd. If this
value gets below the ssthresh, slow start occurs.
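As a rough guide to why the window size matters, a sender can have at most one window of data outstanding per round trip, so throughput is bounded by window / RTT. The sketch below evaluates that bound; the RTT of 10 ms is purely an assumed figure for illustration (no RTT is quoted for this link here), and the function name is hypothetical.

```python
def window_limited_bps(window_bytes, rtt_seconds):
    """Upper bound on throughput (bits/sec) for a given window and RTT."""
    return window_bytes * 8 / rtt_seconds

if __name__ == "__main__":
    assumed_rtt = 0.010  # seconds; an assumption, not a measured value
    for window in (4096, 65536, 1048576):
        bound = window_limited_bps(window, assumed_rtt)
        print(f"{window:>8} byte window: at most {bound / 1e6:.1f} Mbit/s")
```

The bound only says what the window permits; once it exceeds what the path can carry, losses and the resulting cwnd reductions determine the throughput instead.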
We can see how TCP is reacting to the receiver by analysing the ACK packets
that are being received by the sender. As TCP cannot advance its window
until the oldest unack'ed packet is acknowledged, the window cannot
progress while it is at its maximum size.
Wed, 28 November, 2001 20:09