Optimizing TCP settings for F5 BIG-IP


Below are the default TCP profile settings for the BIG-IP. The problematic settings are covered in detail below, along with an explanation of why each should be changed.

For better TCP performance, the following changes should be made (a sketch of the tuned profile appears after the default profile below):

- Disable nagle
- Enable ack on push
- Set proxy buffer low and high to 131072
- Set send buffer to 65536
- Set recv window to 65535

profile tcp CPC-TCP {
   defaults from tcp
   reset on timeout enable
   time wait recycle enable
   delayed acks enable
   proxy mss disable
   proxy options disable
   deferred accept disable
   selective acks enable
   dsack disable
   ecn disable
   limited transmit enable
   rfc1323 enable
   slow start enable
   bandwidth delay enable
   nagle enable
   abc enable
   ack on push disable
   md5 sign disable
   cmetrics cache enable
   md5 sign passphrase none
   proxy buffer low 4096
   proxy buffer high 16384
   idle timeout 1800
   time wait 2K
   fin wait 5
   close wait 5
   send buffer 32768
   recv window 32768
   keep alive interval 1800
   max retrans syn 3
   max retrans 8
   ip tos 0
   link qos 0
   congestion control newreno
}
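
Applying the recommendations above, a tuned version of this profile might look like the following sketch. Only the modified settings are listed; everything else is inherited from the parent tcp profile through "defaults from".

profile tcp CPC-TCP {
   defaults from tcp
   nagle disable
   ack on push enable
   proxy buffer low 131072
   proxy buffer high 131072
   send buffer 65536
   recv window 65535
}
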
proxy buffer high
proxy buffer low

These settings control the TCP proxy buffering thresholds on BIG-IP. All in-order received data is pushed up through the framework and buffered at the proxy layer, which glues the receiving and transmitting filter chains together. The proxy layer buffers data that cannot yet be transmitted or held in the send queue. If proxy buffer high bytes of data are queued up in the proxy, the proxy layer sends an "xoff" message to the TCP stack on the receiving side of the chain, which causes it to stop advancing its receive window, effectively acting as a flow-control mechanism that throttles the sender. Once the buffered data drains below proxy buffer low bytes, an "xon" message is sent to the TCP stack on the receiving side, which causes it to start opening up its receive window again, restoring the prior level of throughput.

When proxy buffer high is equal to proxy buffer low, flow control engages and releases at the same threshold, so traffic is throttled to a fairly constant rate based on the amount of data buffered in the proxy. When proxy buffer high is greater than proxy buffer low, the idea is for traffic to be pulled into the BIG-IP as quickly as possible and then spooled out to slower clients. In practice, keep the two values close to equal if the links on both sides of the BIG-IP are low-congestion, and keep them separated if at least one side is a high-congestion link.
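
To illustrate the second case, a profile meant to pull data in quickly and spool it out to slow clients might separate the watermarks, for example (a sketch; the profile name and the low watermark value are illustrative, not a recommendation from this article):

profile tcp slow-client-spool {
   defaults from tcp
   proxy buffer high 131072
   proxy buffer low 32768
}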

BIG-IP v9's default tcp profile specifies 16K for proxy buffer high and 4K for proxy buffer low, very low defaults intended to maximize connection concurrency. In many situations these values reduce the throughput of connections passing through the system. Later versions introduced the tcp-wan-optimized and tcp-lan-optimized profiles, which significantly increase the buffering settings for TCP connections.
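
On versions that ship those built-in profiles, a custom profile can simply inherit from one of them and override only what it needs, as in this sketch (the profile name my-lan-tcp is illustrative, and the built-in profiles' exact values vary by version):

profile tcp my-lan-tcp {
   defaults from tcp-lan-optimized
   proxy buffer high 131072
   proxy buffer low 131072
}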

ExtraHop shows when these values can be increased using the "send window throttle", "receive window throttle", and "zero window" stats on its TCP analysis page.

send window

This is the absolute maximum amount of data in bytes that BIG-IP will send on the wire in any round trip per TCP flow, regardless of congestion (it is governed by the send buffer setting in the profile). It should be increased for high-latency links. Here's an example of why:

Let's say you have a dedicated leased T3 link to a remote site that's a 200ms ping away. There's no congestion at all on this link, since it's leased and private. A T3 should give you 45Mbps, or about 5.6MB/second. If your send window is set to 32K, you'll send 32K every 200ms (based on the 200ms round trip). So the maximum amount of data BIG-IP will send on any given TCP flow on this link is 32K * (1000 / 200), or 160K/second. Even though the link should theoretically give us over 5MB per second, we top out at a small fraction of that per connection, regardless of congestion. That's why we recommend increasing the value.

ExtraHop shows when this value should be increased using the "send window throttle" stat on its TCP analysis page.

receive window

This is the maximum amount of data in bytes that BIG-IP will agree to receive in any round trip per TCP flow, regardless of buffering state or system load. It is the counterpart of the send window above. Be careful about increasing the receive window too much, and never increase it beyond proxy buffer high: doing so can render the proxy flow-control mechanism ineffective, leading to memory exhaustion on the BIG-IP. The advertised receive window is a 16-bit value in TCP, so increasing it beyond 64K results in Window Scaling (RFC1323) being implicitly activated in BIG-IP's TCP stack.

ExtraHop shows when this value should be increased using the "receive window throttle" stat on its TCP analysis page.
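
Putting the send and receive settings together, a profile tuned for the 200ms T3 example above might look like the following sketch. The buffers are sized at about 1MB, close to the link's roughly 1.1MB bandwidth-delay product (about 5.6MB/second x 0.2 seconds); the receive window is kept no larger than proxy buffer high; and rfc1323 stays enabled since the window exceeds 64K. The profile name and exact values are illustrative, not part of the recommendations above.

profile tcp high-latency-tcp {
   defaults from tcp
   nagle disable
   ack on push enable
   rfc1323 enable
   proxy buffer high 1048576
   proxy buffer low 1048576
   send buffer 1048576
   recv window 1048576
}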

ack on push

Ack on Push is a feature on BIG-IP that lets you reap the benefits of delayed ACKs without the 200ms delay at the end of a transaction. The "delayed ack" mechanism of TCP reduces the number of bare-ACK packets transmitted by the stack, resulting in less system and network load. It does this by ACKing every other packet, or after 200ms, whichever comes first. A side effect of this mechanism is that if a transaction ends on an odd packet count rather than an even one, the stack may wait 200ms (actually, 100ms in BIG-IP's stack) before acknowledging receipt of the payload, holding up the transaction. This is made worse when Nagle's algorithm is on, since if the packets are small, the sender won't send any more data until it receives that acknowledgement.

To mitigate this, BIG-IP does something quite clever when ack on push is enabled. The TCP "push" flag is traditionally used by a sender to indicate that it is done (for the moment) flushing its socket buffer and that no more data is currently in the pipe. When BIG-IP sees a push flag on a packet and this setting is on, it immediately ACKs the data, regardless of the delayed ACK setting or the 100ms timer. This speeds up network traffic even if the remote end has Nagle's algorithm activated.
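
In profile terms, this just means keeping delayed acks enabled while turning ack on push on, as in this minimal sketch (the profile name is illustrative; the same settings already appear in the tuned profile earlier in this article):

profile tcp ack-on-push-example {
   defaults from tcp
   delayed acks enable
   ack on push enable
   nagle disable
}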

Interestingly, turning on this feature can make other systems on your network behave themselves without software patches. Take a look at the following page from Microsoft's support site: http://support.microsoft.com/kb/816627. They recommend a hotfix, but if you can't get the hotfix installed, BIG-IP can seamlessly work around the problem just by turning on the ack on push feature in the active TCP profile.

ExtraHop shows when this setting should be activated using the "Nagle delays" stat on its TCP analysis page.