Low ALG throughput.

Peter
Posts: 668
Joined: 10 Apr 2008, 14:14
Location: Clavister HQ - Örnsköldsvik

Low ALG throughput.

Post by Peter » 25 Sep 2012, 20:41

This FAQ applies to:
  • Clavister CorePlus / cOS Core all versions
Question:

I'm seeing poor performance when I'm using an ALG (HTTP, for example). Is there anything I can do about it, and how do I troubleshoot it?

Answer:

Note: This FAQ discusses the HTTP ALG, as it is the ALG on which users most commonly experience performance issues.

A performance decrease is normally to be expected when activating the ALG. The reason is that the SGW starts to examine and scan the HTTP traffic matching the rule the ALG is attached to. It verifies that the traffic is what it claims to be and also performs the tasks it has been assigned, such as Web Content Filtering, Anti-Virus, removing Java, blocking specific file types etc. Another important aspect to keep in mind is that an ALG creates a second set of connections: first a connection from the client to the ALG, then a connection from the ALG to the server. The ALG is basically acting as a sort of proxy.

All these functions require system resources and will cause a performance decrease. The more functionality that is activated, the more work the system must do to perform the required operations. That is expected behavior, however, and not the focus of this FAQ.

Sometimes a large drop in network performance is observed when the HTTP ALG is in use. Some of these drops can be normal, for instance if the hardware is not powerful enough to handle the amount of traffic in the network. This is usually easy to spot, as the CPU load on the SGW reaches very high levels. There are also other, more unusual network anomalies that can cause similar behavior.

Possible cause: Retransmissions / CRC errors.

One such anomaly is packet loss and retransmissions. An example is a damaged port on the internal network. The damaged port causes CRC errors and packet loss, which in turn cause clients on the inside to retransmit a large number of packets.

Combined with the HTTP ALG, this can actually make things worse. The reason is that the ALG buffers some of the data requested by the client, so if the client requests a retransmission, the ALG immediately retransmits its buffered data to the client. This sends even more packets to the problematic port, further worsening the problem.

It is therefore very important to analyze the internal network more deeply if there are many retransmissions and packet losses on the inside of the network.

Incorrect speed/duplex settings may also result in the behavior described above. For instance, if you have Auto on one side and 100 FD on the other, the Auto side will fall back to 100 HD, and the resulting duplex mismatch causes severe throughput degradation.
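
The Auto versus 100 FD case can be sketched in a few lines of Python. This is only a simplified model for illustration, not anything taken from the SGW itself: the function names and the setting strings ("auto", "100FD", "100HD") are made up for this example, and it assumes both sides support 100 Mbit/s.

```python
# Simplified model of how two link partners end up configured.
# Assumption: both sides support 100 Mbit/s; names are hypothetical.

def effective_settings(side_a, side_b):
    """Return the settings each side actually ends up using."""
    if side_a == "auto" and side_b == "auto":
        # Both sides negotiate and agree on full duplex.
        return ("100FD", "100FD")
    if side_a == "auto":
        # The Auto side gets no negotiation partner and falls back to half duplex.
        return ("100HD", side_b)
    if side_b == "auto":
        return (side_a, "100HD")
    # Both sides forced: each keeps its own setting.
    return (side_a, side_b)


def has_duplex_mismatch(side_a, side_b):
    """True if the two sides end up with different duplex modes."""
    eff_a, eff_b = effective_settings(side_a, side_b)
    return eff_a[-2:] != eff_b[-2:]


if __name__ == "__main__":
    for pair in [("auto", "100FD"), ("auto", "auto"), ("100FD", "100FD")]:
        print(pair, "mismatch:", has_duplex_mismatch(*pair))
```

The point of the sketch is that forcing one side only works if the other side is forced to the same value, or if both sides are left on Auto.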

Possible cause: Number of sessions.

An ALG is attached to a Service, which in turn is used by a Policy/Rule. The service has a setting called "Max Sessions", which is described as: "Specifies how many concurrent sessions that are permitted using this service."

If this value is set to, for example, 200 sessions and you have 1000 users behind the SGW, the limit will be reached very quickly, since a single user can allocate many sessions when loading a web page (pictures, banner files etc.). The number of sessions in use can be monitored, for instance in InControl's Dashboard via the "Total ALG session" value.

The recommended value varies considerably depending on the size of the network, number of users, expected traffic etc. For 1000 users, for example, you might set a session value of 20 000 and then monitor the ALG session dashboard value to see whether it needs to be increased.
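
As a rough illustration of the sizing reasoning above, here is a small Python sketch. It is only back-of-the-envelope arithmetic: the 20 sessions per user figure is simply what the 1000 users / 20 000 sessions example implies, and the function names and the 80% warning threshold are assumptions made for this example, not Clavister recommendations.

```python
import math

# Back-of-the-envelope sizing for the "Max Sessions" value.
# All names and the 0.8 threshold are hypothetical, for illustration only.

def suggested_max_sessions(users, sessions_per_user):
    """Starting point for Max Sessions, given an assumed per-user session count."""
    return math.ceil(users * sessions_per_user)


def needs_increase(observed_sessions, max_sessions, threshold=0.8):
    """True if the observed ALG session count is close to the configured limit."""
    return observed_sessions / max_sessions >= threshold


if __name__ == "__main__":
    limit = suggested_max_sessions(users=1000, sessions_per_user=20)
    print("Suggested Max Sessions:", limit)
    # Observed value would come from monitoring, e.g. the dashboard figure.
    print("Increase needed at 17000 sessions:", needs_increase(17000, limit))
```

The observed value still has to come from real monitoring; the sketch only shows how to relate it to the configured limit.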

Question: Why are single downloads slower than multiple download streams?

The reason is that the ALG is primarily designed to handle thousands of users/connections. Also, some of the features in the ALG are stream based, which means they scan the data streams in real time instead of buffering data. This is why a single download stream will not reach the same bandwidth as, for example, 3 simultaneous ones.

Question: Is that why online bandwidth testing pages report bad performance?

Yes, the reason is the same. Bandwidth testing pages usually use a single download/upload stream to measure performance.
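
To make the single stream versus multiple streams point concrete, here is a deliberately simplified Python model. It assumes each stream through the ALG is limited to some fixed rate and that streams add up until the link is full. The function name and all numbers are invented for illustration and are not measurements from any Clavister product.

```python
# Simplified model: each stream is capped individually, the link caps the total.
# The per-stream cap and link capacity are hypothetical example values.

def aggregate_throughput(streams, per_stream_mbps, link_mbps):
    """Total throughput for a number of parallel streams, in Mbit/s."""
    return min(streams * per_stream_mbps, link_mbps)


if __name__ == "__main__":
    for n in (1, 3, 5):
        total = aggregate_throughput(n, per_stream_mbps=30, link_mbps=100)
        print(n, "stream(s):", total, "Mbit/s")
```

Under these assumptions a single-stream speed test only ever measures the per-stream limit, not what the link can carry with many users active.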

Re: Low ALG throughput.

Post by Peter » 18 Feb 2016, 16:01

Update:

In version 11.00.00 a new type of HTTP ALG was implemented, called the "Lightweight HTTP ALG" (LW-ALG). This ALG has much higher performance than the standard HTTP ALG.

However, Anti-Virus cannot be used with the LW-ALG.

TobiasE
Posts: 5
Joined: 13 Sep 2016, 12:01

Re: Low ALG throughput.

Post by TobiasE » 10 Nov 2017, 11:10

Update:

In Core version 11.20.00 we implemented Anti-Virus scanning on the LW-HTTP-ALG, which should greatly improve performance when Anti-Virus is used.

Re: Low ALG throughput.

Post by Peter » 14 Mar 2018, 11:34

Update:

In version 12.00.10 and up, all features/functions that existed in the old HTTP/HTTPS ALG are now available in the LW-ALG. This means a very large performance boost, so this problem should hopefully be reduced significantly.

Important: There will be no future updates for the old ALG + IP Rule combination. To get access to the performance boost as well as future updates and improvements in this area, the Protocol + IP Policy combination must be used. It is also recommended to change older IP Rules to IP Policies, as we are moving more and more towards the use of IP Policies.

We will not remove the possibility to configure IP Rules, but any new configuration or rules should be created using IP Policies instead.
