Living in Australia generally means that you're at the end of a Long Fat Network (LFN), internet-wise. That's a serious technical term for a path with a large bandwidth-delay product, and it matters to the networking stack when it works out how much data to keep in flight.
Two of my colleagues down in Melbourne are also with Aussie Broadband and using the top (100Mbit down, 40Mbit up) NBN speed tier. We also have company-issued hardware VPN units because we work from home full-time. I was delighted with the bandwidth available from Aussie for our connections to work systems in the SF Bay Area, and when I had cause to update my systems to a new build I observed that it now took about 55 minutes on our media server, rather than the 80-90 minutes it took with the SkyMesh connection.
There was a fly in the ointment, however: my colleagues and I calculated that while we should be getting 1MB/s or more as a sustained transfer rate from the internal pkg server, we'd often get around 400KB/s. Since networking is supposed to be something Solaris is good at, we started digging.
The first thing we looked at was the receive buffer size, which defaults to 1MB. Greg found https://fasterdata.es.net/host-tuning/other/ so we changed that for tcp, udp and sctp. While the fasterdata document talked about /usr/sbin/ndd, the Proper Way(tm) to do this in Solaris 11.x is with ipadm:
# for pp in tcp udp sctp; do ipadm show-prop -p max-buf $pp; done
PROTO PROPERTY  PERM CURRENT  PERSISTENT  DEFAULT  POSSIBLE
tcp   max-buf   rw   1048576  --          1048576  1048576-1073741824
PROTO PROPERTY  PERM CURRENT  PERSISTENT  DEFAULT  POSSIBLE
udp   max-buf   rw   2097152  --          2097152  65536-1073741824
PROTO PROPERTY  PERM CURRENT  PERSISTENT  DEFAULT  POSSIBLE
sctp  max-buf   rw   1048576  --          1048576  102400-1073741824
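To see why a 1MB ceiling hurts on a link like this, it helps to work out the bandwidth-delay product: the amount of data that has to be in flight to keep the pipe full. The sketch below does the arithmetic in shell. The 180ms round-trip time is an assumed figure for an Australia to US West Coast path, not a measurement from these systems, and the function names bdp_bytes and max_rate_bytes are purely illustrative.

```shell
# bdp_bytes MBIT_PER_SEC RTT_MS
# Prints the bandwidth-delay product in bytes.
# 1 Mbit/s is 125000 bytes/s, i.e. 125 bytes per millisecond.
bdp_bytes() {
    echo $(( $1 * 125 * $2 ))
}

# max_rate_bytes WINDOW_BYTES RTT_MS
# Prints the throughput ceiling in bytes/s for a given window size:
# at most one window of data can be delivered per round trip.
max_rate_bytes() {
    echo $(( $1 * 1000 / $2 ))
}

bdp_bytes 100 180            # 100Mbit/s link, ASSUMED 180ms round trip
max_rate_bytes 1048576 180   # ceiling with a 1048576-byte window
```

With those assumed numbers the link holds about 2.25MB in flight, and a 1048576-byte window cannot sustain much more than roughly 5.8MB/s no matter how fast the link is.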
To effect a quick and persistent change, we uttered:
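The command itself is not reproduced here. A minimal sketch of what such a change looks like with ipadm set-prop follows. The value (the top of the POSSIBLE range shown for all three protocols) and the helper name set_max_buf are assumptions, not necessarily what was used. Without the -t flag, ipadm set-prop applies the change immediately and keeps it across reboots. The sketch defaults to a dry run that only prints the commands.

```shell
# Hypothetical helper: raise max-buf for tcp, udp and sctp.
# IPADM defaults to "echo ipadm", so the commands are only printed;
# set IPADM=ipadm (as root, on Solaris 11) to actually apply them.
IPADM=${IPADM:-echo ipadm}

set_max_buf() {
    size=$1
    for pp in tcp udp sctp; do
        # No -t flag: the change is active now and persists across reboots.
        $IPADM set-prop -p max-buf="$size" "$pp"
    done
}

set_max_buf 1073741824   # ASSUMED value: top of the POSSIBLE range
```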
While that did seem to make a positive difference, the transfer rate for a large sample file from across the Pacific still cycled up and down. The cycling was really annoying. We kept digging.
The next thing we investigated was the congestion window, which is where the aforementioned LFN comes into play. That property is cwnd-max:
# for pp in tcp sctp; do ipadm show-prop -p cwnd-max $pp; done
PROTO PROPERTY  PERM CURRENT  PERSISTENT  DEFAULT  POSSIBLE
tcp   cwnd-max  rw   1048576  --          1048576  128-1073741824
PROTO PROPERTY  PERM CURRENT  PERSISTENT  DEFAULT  POSSIBLE
sctp  cwnd-max  rw   1048576  --          1048576  128-1073741824
Figuring that if it was worth doing, it was worth overdoing, we bumped that parameter up too:
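Again the exact command is not shown. A sketch of the equivalent ipadm set-prop call for cwnd-max, under the same assumptions as before: the value is simply the top of the POSSIBLE range, the helper name set_cwnd_max is made up for illustration, and the default is a dry run that prints rather than executes.

```shell
# Hypothetical helper: raise cwnd-max for tcp and sctp.
# IPADM defaults to "echo ipadm" (dry run); set IPADM=ipadm to apply.
IPADM=${IPADM:-echo ipadm}

set_cwnd_max() {
    for pp in tcp sctp; do
        # Persistent change, since -t (temporary) is not given.
        $IPADM set-prop -p cwnd-max="$1" "$pp"
    done
}

set_cwnd_max 1073741824   # ASSUMED value: top of the POSSIBLE range
```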
$ curl -o moz.bz2 http://ftp.mozilla.org/pub/mozilla/VMs/CentOS5-ReferencePlatform.tar.bz2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  3 3091M    3  102M    0     0  4542k      0  0:11:36  0:00:23  0:11:13 5747k^C
While that speed cycled around a lot, it mostly remained above 5MB/s.
Another large improvement. Yay!
However... we still saw the cycling. Intriguingly, the period was about 20 seconds, so there was still something else to twiddle.
In the meantime, however, I decided to update our media server.
I was blown away.
23 minutes 1 second
Not bad at all, even considering that when pkg(1) is transferring lots of small files it's difficult to keep the pipes filled.
Now that both Greg and I had several interesting data points to consider, I asked some of our network gurus for advice on what else we could look at. N suggested looking at the actual congestion algorithm in use, and pointed me to this article on High speed TCP.
High-speed TCP (HS-TCP). HS-TCP is an update of TCP that reacts better when using large congestion windows on high-bandwidth, high-latency networks.
The Solaris default is the newreno algorithm:
# ipadm show-prop -p cong-default,cong-enabled tcp
PROTO PROPERTY      PERM CURRENT      PERSISTENT   DEFAULT  POSSIBLE
tcp   cong-default  rw   newreno      --           newreno  newreno,cubic,
                                                            dctcp,
                                                            highspeed,
                                                            vegas
tcp   cong-enabled  rw   newreno,     newreno,     newreno  newreno,cubic,
                         cubic,dctcp, cubic,dctcp,          dctcp,
                         highspeed,   highspeed,            highspeed,
                         vegas        vegas                 vegas
Changing that was easy:
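The command is not shown here either. A sketch of how the default congestion algorithm can be switched with ipadm set-prop follows. It assumes highspeed was the algorithm chosen, since that is what the article describes, but any name from the POSSIBLE column would be passed the same way. The helper name set_cong_default is illustrative, and the default is a dry run.

```shell
# Hypothetical helper: change the default TCP congestion control algorithm.
# IPADM defaults to "echo ipadm" (dry run); set IPADM=ipadm to apply.
IPADM=${IPADM:-echo ipadm}

set_cong_default() {
    # $1 must be one of the algorithms listed under POSSIBLE,
    # e.g. newreno, cubic, dctcp, highspeed or vegas.
    $IPADM set-prop -p cong-default="$1" tcp
}

set_cong_default highspeed   # ASSUMED choice of algorithm
```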
Off to pull down that bz2 from mozilla.org again:
$ curl -o blah.tar.bz2 http://ftp.mozilla.org/pub/mozilla/VMs/CentOS5-ReferencePlatform.tar.bz2
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3091M  100 3091M    0     0  5866k      0  0:08:59  0:08:59 --:--:-- 8684k
For a more local test (within Australia) I made use of Internode's facility:
$ curl -o t.test http://mirror.internode.on.net/pub/test/1000meg.test
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  953M  100  953M    0     0  10.0M      0  0:01:35  0:01:35 --:--:-- 11.0M
And finally, updating my global zone.
# time pkg update --be-name $NEWBE core-os@$version *incorporation@$version
            Packages to update: 291
       Create boot environment: Yes
Create backup boot environment:  No

DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
Completed                            291/291     2025/2025  116.9/116.9  317k/s

PHASE                                          ITEMS
Removing old actions                       1544/1544
Installing new actions                     1552/1552
Updating modified actions                  2358/2358
Updating package state database                 Done
Updating package cache                       291/291
Updating image state                            Done
Creating fast lookup database                   Done
Reading search index                            Done
Building new search index                  1932/1932

A clone of $oldbe exists and has been updated and activated.
On the next boot the Boot Environment be://rpool/$newbe will be
mounted on '/'.  Reboot when ready to switch to this updated BE.

real    12m30.391s
user    4m4.173s
sys     0m21.496s
I think that's sufficient.