- Clavister CorePlus / cOS Core all versions
I'm experiencing poor performance when using an ALG (HTTP, for example). Is there anything I can do about it, and how do I troubleshoot it?
Note: This FAQ discusses the HTTP ALG, as it is the ALG with which users most commonly experience performance issues.
A performance decrease is normally to be expected when an ALG is activated. The reason is that the SGW starts to examine and scan the HTTP traffic matching the rule the ALG is attached to. It verifies that the traffic is what it claims to be and also performs the tasks it has been assigned, such as Web Content Filtering, Anti-Virus scanning, removing Java, and blocking specific file types. Another important aspect to keep in mind is that an ALG creates a second set of connections: first a connection from the client to the ALG, then a connection from the ALG to the server. In effect, the ALG acts as a kind of proxy.
All these functions require system resources and will reduce performance. The more functionality that is activated, the more work the system must do. That is expected behavior, however, and not the subject of this FAQ.
Sometimes a large drop in network performance is observed when the HTTP ALG is in use. Some of this can be normal, for instance if the hardware is not powerful enough to handle the amount of traffic in the network. This is usually easy to spot because the CPU load on the SGW reaches very high levels. There are also other, more unusual network anomalies that can cause similar behavior.
Possible cause: Retransmissions / CRC errors.
One such anomaly is packet loss and retransmissions. An example is a damaged port on the internal network. The damaged port causes CRC errors and packet loss, which in turn cause clients on the inside to retransmit a large number of packets.
In this situation the HTTP ALG can actually make things worse. The ALG buffers some of the data requested by the client, so if the client requests a retransmission, the ALG immediately retransmits its buffered data to the client. This sends even more packets to the problematic port and worsens the problem further.
It is therefore very important to analyze the internal network more closely if there are many retransmissions and lost packets on the inside of the network.
Incorrect speed/duplex settings can also result in the behavior described above. For instance, if one side is set to Auto and the other to 100 FD, the Auto side will end up at 100 HD. The resulting duplex mismatch causes severe throughput degradation.
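The duplex fallback described above can be sketched in a few lines of Python. This is only an illustration of the rule, not a Clavister tool: the function names and the setting strings "auto", "full" and "half" are hypothetical, and the sketch assumes that both sides support full duplex when both negotiate.

```python
from typing import Tuple


def resolve_duplex(side_a: str, side_b: str) -> Tuple[str, str]:
    """Return the effective duplex ("full" or "half") used by each side of a link.

    Each argument is the configured setting of one side: "auto", "full" or "half".
    """
    valid = {"auto", "full", "half"}
    if side_a not in valid or side_b not in valid:
        raise ValueError("setting must be 'auto', 'full' or 'half'")

    def effective(own: str, peer: str) -> str:
        if own != "auto":
            # A forced setting is used exactly as configured.
            return own
        if peer == "auto":
            # Both sides negotiate (assumption: both support full duplex).
            return "full"
        # The peer is forced and does not negotiate, so the auto side
        # cannot learn the peer's duplex and falls back to half duplex.
        return "half"

    return effective(side_a, side_b), effective(side_b, side_a)


def has_duplex_mismatch(side_a: str, side_b: str) -> bool:
    """True if the two sides end up running with different duplex modes."""
    a, b = resolve_duplex(side_a, side_b)
    return a != b
```

For the example in the text, `has_duplex_mismatch("auto", "full")` reports a mismatch, while Auto on both sides or 100 FD on both sides does not.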
Possible cause: Number of sessions
An ALG is attached to a Service, which in turn is used in a Policy/Rule. The service has a setting called "Max Sessions", described as: "Specifies how many concurrent sessions that are permitted using this service."
If this value is set to, for example, 200 sessions and there are 1000 users behind the SGW, the limit will be reached very quickly, because a single user can allocate many sessions when loading a web page (pictures, banner files etc.). The number of sessions in use can be monitored, for instance, in InControl's Dashboard through the "Total ALG session" value.
The recommended value varies considerably depending on the size of the network, the number of users, the expected traffic etc. For 1000 users, for example, you might set the value to 20 000 and then monitor the ALG session value in the dashboard to see whether it needs to be increased.
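The sizing example above works out to 20 sessions per user (20 000 / 1000). A minimal Python sketch of that arithmetic follows. The function names are hypothetical, the figure of 20 sessions per user is only what the example implies and not an official recommendation, and the 80 % warning threshold is an assumption chosen for illustration.

```python
import math


def estimate_max_sessions(users: int, sessions_per_user: int = 20,
                          headroom: float = 1.0) -> int:
    """Rough starting value for "Max Sessions".

    sessions_per_user=20 is derived from the FAQ example (1000 users -> 20 000).
    headroom is an optional safety multiplier (assumption, default: none).
    """
    if users < 0 or sessions_per_user < 0 or headroom <= 0:
        raise ValueError("inputs must be non-negative and headroom positive")
    return math.ceil(users * sessions_per_user * headroom)


def needs_increase(observed_peak: int, max_sessions: int,
                   threshold: float = 0.8) -> bool:
    """True if the monitored peak is close to the configured limit.

    threshold=0.8 (80 % of the limit) is an illustrative assumption.
    """
    return observed_peak >= threshold * max_sessions
```

With the numbers from the text, `estimate_max_sessions(1000)` gives 20000, and a monitored peak of 190 sessions against a limit of 200 would be flagged by `needs_increase(190, 200)`.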
Question: Why are single downloads slower than multiple download streams?
The reason is that the ALG is primarily designed to handle thousands of users/connections. Also, some of the features in the ALG are stream based, which means they scan the data streams in real time instead of buffering data. This is why a single download stream will not reach the same bandwidth as, for example, 3 simultaneous ones.
Question: Is that why online bandwidth testing pages report bad performance?
Yes, the reason is the same. Bandwidth testing pages usually use a single download/upload stream to measure performance.
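The single-stream effect can be pictured with a toy model in Python. This is only a sketch under stated assumptions, not a description of how the ALG is implemented: it assumes every stream through the ALG has the same fixed throughput ceiling and that the total is capped by the link speed. The function name and all numbers are hypothetical.

```python
def aggregate_throughput(streams: int, per_stream_mbps: float,
                         link_mbps: float) -> float:
    """Total throughput of several parallel streams in a toy model.

    Assumptions: each stream is limited to per_stream_mbps, and the sum
    can never exceed link_mbps.
    """
    if streams < 0 or per_stream_mbps < 0 or link_mbps < 0:
        raise ValueError("inputs must be non-negative")
    return min(streams * per_stream_mbps, link_mbps)


# Hypothetical figures: 30 Mbit/s per stream on a 100 Mbit/s link.
single = aggregate_throughput(1, 30.0, 100.0)   # one stream, as a speed test uses
triple = aggregate_throughput(3, 30.0, 100.0)   # three simultaneous downloads
```

In this model a speed test that uses one stream only ever sees the per-stream ceiling, even though the same link carries considerably more when several users download at once.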