Route failover on IPsec tunnels (12.x)

Anton
Location: Clavister HQ - Örnsköldsvik

Post by Anton » 20 Jun 2018, 15:18

This How-to applies to:
  • Clavister Security Gateway 12.x.
Problem:

I have two installations at two different locations that need route failover combined with IPsec.
  • Scenario1: An SGW with one ISP has an IPsec tunnel towards an SGW with two ISPs.
I want to be able to use Route Failover on the SGW with two ISPs. If the primary ISP/route goes down, it should be able to establish the IPsec tunnel using the secondary/backup ISP.
  • Scenario2: An SGW with two ISPs has an IPsec tunnel towards an SGW which also has two ISPs.
This is a bit more complex but gives more redundancy: if the primary ISP on either side goes down, a backup IPsec tunnel should be established and used.

Solution:
  • Scenario1.
[Figure: One-ISPnew.png]
Since we do not have two ISPs on both sides, there is no need to use two IPsec tunnels. We simply set up RFO (Route Failover) towards the ISPs only and then configure the IPsec tunnel to accept this. It is possible to use two IPsec tunnels in this scenario, but that only adds another layer of complexity that is not really needed.
  • Base configuration:
Site-A (One ISP)

Code:

Lannet=10.10.10.0/24
Wannet=80.80.80.0/24
IP_Wan=80.80.80.10
Routing Table(Site-A):

Code:

Route Lan Lannet
Route Wan Wannet
Route IPSecTunnel SiteBLannet
Route Wan All-nets Gw-World
Site-B (Two ISPs)

Code:

Lannet=192.168.200.0/24
Wannet=90.90.90.0/24
Wannet2=100.100.100.0/24
IP_Wan=90.90.90.10
IP_Wan2=100.100.100.10
Routing Table(Site-B):

Code:

Route Lan Lannet
Route Wan Wannet
Route Wan2 Wannet2
Route IPSecTunnel SiteALannet
Route Wan All-nets Gw-World
Route Wan2 All-nets Gw-World2
In its current state the routing table on Site-B is incorrect, because it contains two “identical” all-nets routes. Without a metric definition the SGW cannot determine which ISP it should use when, for example, surfing the web. Sometimes it may use Wan and sometimes Wan2, so this is not a good setup for the moment.

Since we want redundancy we first need to set up the RFO, which is fairly simple in this particular scenario. We edit the routing table on Site-B to look like this:

Code:

Route Lan Lannet
Route Wan Wannet
Route Wan2 Wannet2
Route IPSecTunnel SiteALannet
Route Wan All-nets Gw-World Metric=10 Monitor=Yes
Route Wan2 All-nets Gw-World2 Metric=20
What you choose to monitor here is up to you. It can be ARP, Link or HostMonitor, depending on the requirement.
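
The effect of the metrics and the monitor can be sketched as a small model. This is only an illustration of the selection logic described above, not how the SGW is implemented; the `Route` class, the `healthy` flag and `active_route` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class Route:
    interface: str
    network: str
    gateway: str
    metric: int
    monitored: bool = False
    healthy: bool = True  # monitor result; only matters when monitored=True


def active_route(routes):
    """Return the lowest-metric route that monitoring has not disabled."""
    usable = [r for r in routes if not (r.monitored and not r.healthy)]
    return min(usable, key=lambda r: r.metric) if usable else None


# The two all-nets routes from the Site-B table above.
site_b = [
    Route("Wan", "all-nets", "Gw-World", metric=10, monitored=True),
    Route("Wan2", "all-nets", "Gw-World2", metric=20),
]

print(active_route(site_b).interface)  # Wan (lowest metric wins)
site_b[0].healthy = False              # the monitor declares the primary ISP down
print(active_route(site_b).interface)  # Wan2 (backup route takes over)
```

With both routes healthy the lower metric decides, which removes the ambiguity of the original table. Once the monitored route is disabled, the lookup falls through to Wan2.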

As you may notice, we have not configured any monitoring on the IPsec tunnel. We do not need to: there is only one IPsec tunnel, so no monitoring on the tunnel itself is needed in THIS scenario. When the primary ISP fails, the all-nets route fails over to the secondary ISP. The monitoring that is enabled by default on the IPsec tunnel (DPD – Dead Peer Detection) will detect that the tunnel is no longer alive and will try to establish it anew. When the SGW performs a route lookup for the remote gateway, it finds a matching route on Wan2, since the Wan all-nets route is now disabled, and the tunnel is then established via ISP2.

So are we done? No, there are some more things that need to be taken into account. One of them is the IPsec tunnel configuration on Site-A. We need to configure this tunnel to accept tunnel negotiations from two different IPs. Normally you define an IPsec tunnel between two IPs which are statically defined. In this scenario we do not know if the tunnel negotiation will arrive from 90.90.90.10 (Wan on Site-B) or 100.100.100.10 (Wan2 on Site-B).

So on Site-A we create a new host group that contains both these IPs and use it as Remote Gateway on the IPsec tunnel configured at Site-A. By doing this we accept tunnel negotiations from both public IPs that exist on Site-B.

There is however a drawback: you can never initiate the tunnel from Site-A. It must always be initiated from Site-B. This IPsec RFO scenario is also not as fast as normal interface failover (or multiple ISPs at both sites), since DPD is not that fast in declaring a tunnel as down. Usually the failover takes around 60-100 seconds in this scenario.

Lastly, you need to specify a Local ID on the IPsec tunnel on Site-B so that the tunnel is identified as the "same" tunnel. If you do not, the tunnel will be established after a failover but no traffic will go through it until the Conn has been re-established. For example, if you are constantly sending a ping through the tunnel, you need to stop the ping, wait a few seconds and then start it again. Setting a Local ID solves that problem. The value can be pretty much anything, as it is only used as a tunnel "identifier".
  • Scenario2.
In this scenario we have two ISPs on both sides. This means that in order to account for all possible scenarios we need 4 IPsec tunnels between Site-A and Site-B, as follows:

Code:

SiteA-ISP1 <-> ISP1-SiteB
SiteA-ISP2 <-> ISP1-SiteB
SiteA-ISP1 <-> ISP2-SiteB
SiteA-ISP2 <-> ISP2-SiteB
[Figure: Two-ISPnew.png]
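
The four tunnels are simply every combination of one ISP on each side. A short sketch that enumerates them in the same order as the list above (the function name `tunnel_pairs` is made up for this example):

```python
from itertools import product

SITE_A_ISPS = ["ISP1", "ISP2"]
SITE_B_ISPS = ["ISP1", "ISP2"]


def tunnel_pairs(site_a_isps, site_b_isps):
    """Return every Site-A/Site-B ISP combination as a tunnel description."""
    # Site-B is the outer loop so the order matches the list in this post.
    return [
        f"SiteA-{a} <-> {b}-SiteB"
        for b, a in product(site_b_isps, site_a_isps)
    ]


for line in tunnel_pairs(SITE_A_ISPS, SITE_B_ISPS):
    print(line)
```

Two ISPs on each side give 2 x 2 = 4 tunnels. With a single ISP on one side the same enumeration gives only two combinations.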
  • Base configuration:
Site-A (Two ISPs)

Code:

Lannet=10.10.10.0/24
IP_Lan=10.10.10.1
Wannet=70.70.70.0/24
Wannet2=80.80.80.0/24
IP_Wan=70.70.70.10
IP_Wan2=80.80.80.10
Routing Table(Site-A):

Code:

Route Lan Lannet
Route Wan Wannet
Route Wan2 Wannet2
Route IPSecTunnel1 SiteBLannet
Route IPSecTunnel2 SiteBLannet
Route IPSecTunnel3 SiteBLannet
Route IPSecTunnel4 SiteBLannet
Route Wan All-nets Gw-World Metric=10 Monitor=Yes
Route Wan2 All-nets Gw-World2 Metric=20
Site-B (Two ISPs)

Code:

Lannet=192.168.200.0/24
IP_Lan=192.168.200.1
Wannet=90.90.90.0/24
Wannet2=100.100.100.0/24
IP_Wan=90.90.90.10
IP_Wan2=100.100.100.10

Routing Table (Site-B):

Code:

Route Lan Lannet
Route Wan Wannet
Route Wan2 Wannet2
Route IPSecTunnel1 SiteALannet
Route IPSecTunnel2 SiteALannet
Route IPSecTunnel3 SiteALannet
Route IPSecTunnel4 SiteALannet
Route Wan All-nets Gw-World Metric=10 Monitor=Yes
Route Wan2 All-nets Gw-World2 Metric=20
In its current state the above setup works fine for the standard RFO scenario (physical interface), but the IPsec tunnel RFO still needs some modifications. First of all we need to add metrics to the tunnel routes. So on both sites we add Metric=10 for the primary, Metric=20 for the secondary, Metric=30 for the tertiary and Metric=40 for the fourth tunnel.

We also need to monitor the primary, secondary and tertiary tunnels. Since these are IPsec tunnels we cannot use ARP or LinkState; the only method we can use is HostMonitor. What host to monitor is up to you, but in this example we use the IPsec interface IP on the other side of the IPsec tunnel.

In order for this to work we need to specify the originator IP on the tunnels on both sides (this can be found under the Advanced tab on the IPsec tunnel: choose “Specify address manually” and specify an IP).

Note: Keep in mind that the IP configured as originator IP will be core routed, meaning that it cannot be used by clients on the LAN side of either Site-A or Site-B.

Note2: While it is possible to select e.g. Link for the monitored tunnel, monitoring an IPsec tunnel using Link does not work. The link will always be reported as OK, so it is a false response that could give the impression that the monitoring actually works.

So Site-A’s primary IPsec tunnel route monitors the IP 192.168.200.1 and Site-B’s primary IPsec tunnel route monitors the IP 10.10.10.1.
Site-A’s secondary IPsec tunnel route monitors the IP 192.168.200.2 and Site-B’s secondary IPsec tunnel route monitors the IP 10.10.10.2.
Site-A’s tertiary IPsec tunnel route monitors the IP 192.168.200.3 and Site-B’s tertiary IPsec tunnel route monitors the IP 10.10.10.3.
The fourth tunnel on Site-A and Site-B should not be monitored, since it is the last resort and will only be used if the primary ISP goes down on both Site-A and Site-B.
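
The metric and monitor assignments above follow a simple pattern, sketched below. The helper `tunnel_route_plan` and its dictionary layout are hypothetical and only summarise the values from this post; they are not a configuration format the SGW understands.

```python
import ipaddress


def tunnel_route_plan(first_monitor_ip, tunnels=4):
    """Build the metric and monitored host for each IPsec tunnel route.

    The last tunnel is the last resort and therefore gets no monitor.
    """
    first = ipaddress.ip_address(first_monitor_ip)
    plan = []
    for index in range(tunnels):
        is_last = index == tunnels - 1
        plan.append({
            "route": f"IPSecTunnel{index + 1}",
            "metric": 10 * (index + 1),
            "monitor_host": None if is_last else str(first + index),
        })
    return plan


# Site-A monitors hosts on Site-B's side, Site-B monitors hosts on Site-A's side.
for row in tunnel_route_plan("192.168.200.1"):
    print("Site-A", row)
for row in tunnel_route_plan("10.10.10.1"):
    print("Site-B", row)
```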

If the primary ISP goes down, the monitors on the primary and secondary IPsec tunnel routes stop receiving replies to their HostMon queries. The primary all-nets route then fails over to the secondary ISP, and the IPsec traffic fails over from the primary tunnel route to the tertiary tunnel route, which uses the secondary WAN interface as its local endpoint.

Note: The primary, secondary, tertiary and fourth tunnel configurations are identical in terms of local and remote network. The only things that differ are the remote gateway and the originator IP of the tunnels.

So far so good: the failover works and a backup tunnel takes over if the primary goes down. But what happens when the primary ISP comes back up?

When the primary ISP comes back, the monitoring on the physical interface towards this ISP will notice it, declare the primary route as alive again and move traffic from the secondary ISP back to the primary.

But this causes a problem for the IPsec monitoring. We monitor a host on the other side of the IPsec tunnel, and monitoring packets are ONLY sent out on the route the monitor is defined on. When the primary ISP is back up, the primary tunnel will be established again, but the primary tunnel route can never be declared as up: the monitoring packets are received (on the other side of the tunnel) on a disabled route, so they are dropped by the “default_access_rule”.

In order to solve this we define the source address of the monitor traffic on the primary route. This may sound a bit cryptic but if we update the routing table on Site-A to reflect this it will look like this:

Code:

Route Lan Lannet
Route Wan Wannet
Route Wan2 Wannet2
Route IPSecTunnel1 SiteBLannet Metric=10 Monitor=Yes
Route IPSecTunnel1 MonitorSource1
Route IPSecTunnel2 SiteBLannet Metric=20 Monitor=Yes
Route IPSecTunnel2 MonitorSource2
Route IPSecTunnel3 SiteBLannet Metric=30 Monitor=Yes
Route IPSecTunnel3 MonitorSource3
Route IPSecTunnel4 SiteBLannet Metric=40
Route Wan All-nets Gw-World Metric=10 Monitor=Yes
Route Wan2 All-nets Gw-World2 Metric=20
The MonitorSource1 object is the IP 192.168.200.1, which is Site-B’s first IPsec interface IP. A similar single-host route needs to be defined in Site-B’s routing table, where the MonitorSource1 object is 10.10.10.1.

The MonitorSource2 object is the IP 192.168.200.2, which is Site-B’s second IPsec interface IP. A similar single-host route needs to be defined in Site-B’s routing table, where the MonitorSource2 object is 10.10.10.2.

The MonitorSource3 object is the IP 192.168.200.3, which is Site-B’s third IPsec interface IP. A similar single-host route needs to be defined in Site-B’s routing table, where the MonitorSource3 object is 10.10.10.3.

So what does this mean?


This means that when a failover has occurred, the HostMon defined on the primary route will constantly send out monitoring packets to the defined host. Once the primary tunnel is back up, these packets start to arrive on the primary tunnel. Since our routing principle is “smallest route first”, they match the “Route IPSecTunnel1 MonitorSource1” route and are accepted. They are no longer dropped by the “Default_Access_Rule”, and the primary IPsec tunnel route can now be declared as up when the primary ISP recovers.
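
The “smallest route first” behaviour can be illustrated with a simplified model of Site-A’s table after a failover. This is a sketch under stated assumptions: the `lookup` and `accepted` functions are hypothetical, the route metric of 100 on the host route is arbitrary, and treating the access check as “the route for the source must point to the receiving interface” is a simplification of what the post describes.

```python
import ipaddress


def lookup(routes, address):
    """Pick the enabled route for an address: longest prefix, then lowest metric."""
    addr = ipaddress.ip_address(address)
    candidates = [
        (ipaddress.ip_network(net).prefixlen, -metric, iface)
        for iface, net, metric, enabled in routes
        if enabled and addr in ipaddress.ip_network(net)
    ]
    return max(candidates)[2] if candidates else None


def accepted(routes, receive_iface, source):
    """Accept a packet only if the route for its source uses the receiving interface."""
    return lookup(routes, source) == receive_iface


# Site-A after a failover: the primary and secondary tunnel routes are disabled.
after_failover = [
    ("IPSecTunnel1", "192.168.200.0/24", 10, False),
    ("IPSecTunnel2", "192.168.200.0/24", 20, False),
    ("IPSecTunnel3", "192.168.200.0/24", 30, True),
    ("IPSecTunnel4", "192.168.200.0/24", 40, True),
]
# Same table plus the single-host route for MonitorSource1 (metric is arbitrary).
with_host_route = after_failover + [("IPSecTunnel1", "192.168.200.1/32", 100, True)]

print(accepted(after_failover, "IPSecTunnel1", "192.168.200.1"))   # False
print(accepted(with_host_route, "IPSecTunnel1", "192.168.200.1"))  # True
```

Without the host route, the only enabled route for 192.168.200.1 points to another tunnel, so a monitor packet arriving on IPSecTunnel1 is rejected. The /32 route is more specific than the /24 routes, so it wins regardless of its metric.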

Important note: The IP defined as MonitorSource will always be routed on that specific IPsec tunnel, so no traffic other than the monitor should be sent towards this IP. If something other than the monitor needs to be able to reach this IP, it is better to define and use a new/different IP.

We now have only one more problem to solve: with this setup there can, depending on the situation, be a long delay before the new route is used when a failover occurs. This is because all four tunnels cannot be up at the same time. For example, the tunnel between Site-A’s ISP2 and Site-B’s ISP1 cannot be established until RFO triggers, since all-nets is routed over ISP1 on Site-A. To fix this we need to add two new routing tables and one “Access rule” on both sites.

Let’s start with the routing tables. Create two routing tables with ordering “only”, called ISP1 and ISP2, and add the following routes to them:
ISP1:

Code:

Route Wan All-nets Gw-World
ISP2:

Code:

Route Wan2 All-nets Gw-World2
Now we need to set the IPsec tunnels to use these routing tables when sending IKE/ESP packets towards the remote endpoint. This is done under the “IKE (Phase-1)” tab on the IPsec tunnel, where the setting is called “Outgoing Routing Table”. Do this for all IPsec tunnels.

The tunnel between Site-A’s ISP1 and Site-B’s ISP1 should use “ISP1”
The tunnel between Site-A’s ISP1 and Site-B’s ISP2 should use “ISP1”
The tunnel between Site-A’s ISP2 and Site-B’s ISP1 should use “ISP2”
The tunnel between Site-A’s ISP2 and Site-B’s ISP2 should use “ISP2”
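
The assignment above, as seen from Site-A, follows one rule: the outgoing routing table is named after the local ISP the tunnel uses. A minimal sketch (the function name and data layout are made up for this example; that Site-B mirrors the rule with its own local ISP is an assumption, since the list above only shows Site-A’s view):

```python
# (local ISP on Site-A, remote ISP on Site-B) for each of the four tunnels.
SITE_A_TUNNELS = [
    ("ISP1", "ISP1"),
    ("ISP1", "ISP2"),
    ("ISP2", "ISP1"),
    ("ISP2", "ISP2"),
]


def outgoing_tables(tunnels):
    """Map each tunnel to its Outgoing Routing Table: always the local ISP."""
    return {
        f"SiteA-{local} <-> SiteB-{remote}": local
        for local, remote in tunnels
    }


for tunnel, table in outgoing_tables(SITE_A_TUNNELS).items():
    print(f"{tunnel}: {table}")
```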

Only one more thing remains: add an “Access rule” that allows packets from Site-B’s ISP1 and ISP2 IPs to arrive on Site-A’s Wan2 interface at all times, even though all-nets is routed over Wan. Go to the Threat Prevention tab in the WebUI and add an “Access rule” that looks like this:

Code:

Action=Accept Interface=Wan2 Network=90.90.90.10, 100.100.100.10
Do the same on Site-B.
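
What the rule permits can be sketched as follows. The `AccessRule` class is a hypothetical model for illustration, and the Site-B rule is an inferred mirror of the Site-A rule (Site-A’s public IPs arriving on Site-B’s Wan2); the post itself only says to do the same on Site-B.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRule:
    interface: str
    networks: tuple

    def matches(self, receive_iface, source):
        """True if the packet arrives on the rule's interface from a listed address."""
        addr = ipaddress.ip_address(source)
        return receive_iface == self.interface and any(
            addr in ipaddress.ip_network(net) for net in self.networks
        )


# Rule from this post (Site-A) and its assumed mirror on Site-B.
SITE_A_RULE = AccessRule("Wan2", ("90.90.90.10", "100.100.100.10"))
SITE_B_RULE = AccessRule("Wan2", ("70.70.70.10", "80.80.80.10"))  # assumption

print(SITE_A_RULE.matches("Wan2", "90.90.90.10"))  # True: Site-B ISP1 via Wan2
print(SITE_A_RULE.matches("Wan2", "8.8.8.8"))      # False: not a Site-B gateway
```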
