Passive Virtual System on Check Point VSX ARPs using physical interface IP address instead of Cluster IP

I ran into this problem and at first could not find any useful information about it, so hopefully this post will help speed up someone else's troubleshooting.

Problem statement: The passive VS on VSX sends ARP requests for its default GW using the physical interface IP instead of the cluster IP, and no traffic flows from the passive Virtual System. If a static ARP entry is configured manually, everything works fine.
Consequence: The passive VS reports an error because it fails to connect to updates.checkpoint.com for Anti-Virus (AV) and Anti-Bot (AB) updates.
Setup: 2x Open Servers, running R80.10 VSX with VSLS.
Software level: R80.10 – Build 083 / HFA Take 121

Outbound interface
This is the interface the VS uses to communicate with everything external.

bond0.1011 Link encap:Ethernet HWaddr 48:DF:37:38:19:B9
inet addr:10.1.1.2 Bcast:10.1.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1

Symptoms
As the tcpdump on bond0.1011 shows, the VS sends ARP requests using the physical address (intended for internal communication between the VSX hosts), and the default GW never responds to such a request. Therefore the ARP table is never populated with the IP/MAC of the default gateway.

10:54:27.155698 04:eb:40:2f:f7:92 > 01:00:0c:cc:cc:cd SNAP Unnumbered, ui, Flags [Command], length 50
10:54:27.252529 arp who-has 10.1.1.1 tell 192.168.196.2
10:54:28.252612 arp who-has 10.1.1.1 tell 192.168.196.2
10:54:29.157385 04:eb:40:2f:f7:92 > 01:00:0c:cc:cc:cd SNAP Un
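To spot this symptom quickly in a longer capture, a small filter along these lines may help. It is only a sketch: the function name flag_unexpected_arp_senders is made up for illustration, and it assumes tcpdump prints ARP requests in the "who-has X tell Y" form shown above.

```shell
# Print only the ARP requests whose sender ("tell") address differs
# from the IP the Virtual System is expected to use on this interface.
# Hypothetical helper, not a Check Point tool.
flag_unexpected_arp_senders() {
    awk -v ip="$1" '
        tolower($0) ~ /who-has/ {
            for (i = 1; i < NF; i++) {
                if ($i == "tell") {
                    sender = $(i + 1)
                    sub(/,$/, "", sender)   # some tcpdump versions print a trailing comma
                    if (sender != ip) print
                    break
                }
            }
        }'
}

# Example use on the passive host (expert mode), with the VS IP from this post:
# tcpdump -l -n -i bond0.1011 arp | flag_unexpected_arp_senders 10.1.1.2
```

With the capture above, only the requests sent from 192.168.196.2 would be printed.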

Diagnostic
Cluster_hide_fold is enabled, as shown in fwd.elg after policy installation (fw ctl debug -m fw + filter)
:perform_cluster_hide_fold (true)
:perform_cluster_hide_fold (true)

With a static ARP entry configured as below, everything works as intended.
node2-fw-01:1> add arp static ipv4-address 10.1.1.1 macaddress 78:72:5d:1c:17:01

tcpdump after adding the static ARP entry
10:57:40.208071 IP 10.1.1.2.10294 > 84.39.152.31.http: . ack 435797 win 137 <nop,nop,timestamp 434154076 3269636925>
10:57:40.225471 IP 10.1.1.2.10294 > 84.39.152.31.http: P 23594:24424(830) ack 435797 win 137 <nop,nop,timestamp 434154093 3269636925>
10:57:40.268951 IP 10.1.1.2.10294 > 84.39.152.31.http: . ack 436048 win 137 <nop,nop,timestamp 434154136 3269637025>
10:57:40.276747 IP 10.1.1.2.10312 > 8.8.8.8.domain: 43708+ A? download.ctmail.com. (37)
10:57:40.326280 IP 10.1.1.2.10371 > 8.8.8.8.domain: 14476+ PTR? 45.188.163.216.in-addr.arpa. (45)

The Solution

The solution was to be found in the following SKs
sk123712 – Traffic originated from Standby VS fails, except for traffic from the lowest VLAN interface.
This is due to a design change in how Check Point handles ARP in clustered environments, as described in sk111956 – ARP Forwarding in Check Point ClusterXL.

By setting the following parameter, all ARP requests are forwarded over the sync interface rather than over the interface involved.

[Expert@node1-fw-01:0]# fw ctl get int fwha_arp_fwd_ccp_via_sync_if
fwha_arp_fwd_ccp_via_sync_if = 0
[Expert@node1-fw-01:0]# fw ctl set int fwha_arp_fwd_ccp_via_sync_if 1

[Expert@node2-fw-01:0]# fw ctl set int fwha_arp_fwd_ccp_via_sync_if 1
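To confirm the value on each host after setting it, the output of "fw ctl get int" can be parsed with something like the following. This is a sketch only: kernel_param_value is a hypothetical helper name, and it assumes the "name = value" output format shown above.

```shell
# Extract the value from `fw ctl get int <param>` output read on stdin,
# e.g. "fwha_arp_fwd_ccp_via_sync_if = 1" prints "1".
# Hypothetical helper, not a Check Point tool.
kernel_param_value() {
    awk -v p="$1" '$1 == p && $2 == "=" { print $3 }'
}

# Example use on each VSX host (expert mode):
# fw ctl get int fwha_arp_fwd_ccp_via_sync_if | kernel_param_value fwha_arp_fwd_ccp_via_sync_if
```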

To make the change permanent, it needs to be added to $FWDIR/boot/modules/fwkern.conf on both hosts.
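A minimal sketch of persisting the setting, assuming fwkern.conf uses one "name=value" entry per line without spaces (check the SKs above for the exact syntax on your version). The helper name persist_kernel_param is hypothetical. It appends the entry only when the parameter is not already present, so running it twice does not create duplicates.

```shell
# Append "<param>=<value>" to the given config file unless the
# parameter already has an entry there. Hypothetical helper.
persist_kernel_param() {
    conf_file="$1"
    param="$2"
    value="$3"
    # Create the file if it does not exist yet.
    touch "$conf_file"
    if grep -q "^${param}=" "$conf_file"; then
        echo "${param} is already set in ${conf_file}; review it manually" >&2
    else
        printf '%s=%s\n' "$param" "$value" >> "$conf_file"
    fi
}

# Example use on each VSX host (expert mode):
# persist_kernel_param "$FWDIR/boot/modules/fwkern.conf" fwha_arp_fwd_ccp_via_sync_if 1
```

The entry should be added on both hosts so that the setting is still in place after a reboot.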

tcpdump while the parameter is being applied to both VSX hosts

[Expert@node2-fw-01:0]# tcpdump -i bond0.1011 -nvvv 'arp'

13:20:26.171552 arp who-has 10.1.1.1 tell 192.168.196.2
13:20:27.831631 arp who-has 10.1.1.1 tell 192.168.196.2
13:20:28.831676 arp who-has 10.1.1.1 tell 192.168.196.2
13:20:29.831700 arp who-has 10.1.1.1 tell 192.168.196.2
13:20:31.173769 arp who-has 10.1.1.1 tell 192.168.196.2
13:20:32.174376 arp who-has 10.1.1.1 tell 10.1.1.2 (Parameter changed on one host)
13:20:33.174400 arp who-has 10.1.1.1 tell 10.1.1.2 (Parameter changed on both hosts)
13:20:33.175001 arp reply 10.1.1.1 is-at 78:72:5d:1c:17:01
13:20:44.161011 arp who-has 10.1.1.2 tell 10.1.1.2
13:21:31.309374 arp reply 10.1.1.1 is-at 78:72:5d:1c:17:01
13:21:44.467631 arp who-has 10.1.1.2 tell 10.1.1.2
13:22:22.258694 arp reply 10.1.1.1 is-at 78:72:5d:1c:17:01
13:22:44.764361 arp who-has 10.1.1.2 tell 10.1.1.2
13:23:40.118178 arp reply 10.1.1.1 is-at 78:72:5d:1c:17:01
13:23:45.052863 arp who-has 10.1.1.2 tell 10.1.1.2

Hope this helps you with your troubleshooting!

–Gos


One Response to Passive Virtual System on Check Point VSX ARPs using physical interface IP address instead of Cluster IP

  1. Thanks Bro !!
    After all day troubleshooting and research, CP TAC engineer could not find the solution to the very same problem…

    Google directed me to this webpage that fixed it in no time!!

    Thank you so much for sharing the solution.

    James.
