The Problem
We had a Kubernetes cluster running in the cloud, but in a completely private network (VPC). Only the firewall nodes expose public IPs to the internet. The cluster needed a stateful front-end that could:
- Provide HA — a single firewall failure must not take everything down
- Forward traffic to Kubernetes NodePorts without cloud magic
- Survive failover without dropping established connections
- Load-balance across multiple worker nodes
Cloud load balancers (AWS ALB, NLB, etc.) felt like overkill — we’d be paying for managed services when we wanted full control over traffic rules and failover behavior. We looked at HAProxy in HA mode with Keepalived, but Keepalived (VRRP) only moves the virtual IP; it doesn’t replicate connection state, so every established connection would drop on failover.
We settled on OpenBSD running CARP (Common Address Redundancy Protocol) with pfsync state synchronization. It’s been in production for years and remains the pragmatic choice for stateful, sub-second failover.
Status: Still running, proven stable, no plans to change.
Why OpenBSD + CARP
OpenBSD’s CARP is not a new protocol, but it solves a specific problem well: stateful, synchronized failover between two firewalls.
Here’s what makes it different from Keepalived:
- CARP — handles virtual IP failover (master/backup), with the backup taking over automatically
- pfsync — synchronizes firewall state (all connections, NAT tables, etc.) in real-time
- Together — when master fails, backup inherits all active connections without dropping a single packet
The alternative (HAProxy + Keepalived) would require application-level reconnect logic or connection pooling. With stateful firewall sync, TCP connections just keep working.
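On OpenBSD, both pieces are configured as network interfaces. A minimal sketch of the master's hostname.if files — the interface names (em0, em2), vhid, and password are placeholders, and pf.conf additionally needs a `pass on em2 proto pfsync` rule:

```
# /etc/hostname.carp0 — the shared public VIP on the external NIC
inet 72.X.X.100 255.255.255.0 NONE vhid 1 carpdev em0 pass mysecret advskew 0

# /etc/hostname.pfsync0 — replicate the state table over the dedicated sync link
syncdev em2
up
```

The backup runs the same carp0 line with a higher advskew, so it only wins the master election when the primary stops advertising.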
Other options we considered and rejected:
- Cisco/Juniper — overkill, expensive, requires vendor support
- pfSense — also runs pf, but it’s built on FreeBSD, and the commercial appliances and support are pricey; OpenBSD itself was free
- Keepalived + HAProxy — stateless failover, requires app-level reconnect handling
- Cloud load balancers — not an option for on-prem
OpenBSD was the only thing that gave us stateful sync without paying enterprise fees.
The Architecture
Two OpenBSD VMs in separate availability zones (for fault isolation).
            Internet
                ↓
    [Firewall-Primary] ← CARP Master - public IP
    [Firewall-Backup]  ← CARP Backup - public IP
                ↓  (both share the CARP VIP - public)
      Private Network (VPC)
                ↓
      [Kubernetes Cluster]
                ↓
  [Traefik Ingress Controller]
Network Design:
- External interface — public IP, handles internet traffic
- Internal interface — private network to K8s cluster
- Dedicated sync network — high-speed link between firewalls for pfsync (critical for low-latency state sync)
- CARP VIP (external) — 72.X.X.100, load balancing incoming traffic
- CARP VIP (internal) — 10.X.X.1, cluster access
The sync network was crucial. If pfsync packets got queued behind regular traffic, state sync would lag, and failover wouldn’t be clean.
pf Configuration (simplified):
# CARP VIP for external HTTP/HTTPS traffic
pass in on egress proto tcp to <vip-public> port { 80, 443 } \
	rdr-to <worker-pool> round-robin
# NAT: all outbound traffic from cluster to public IP
pass out on egress from <cluster-net> nat-to <vip-public>
The firewall does simple port forwarding: traffic on public IPs (72.X.X.100) gets redirected to private IPs in the cluster. Inside the cluster, Traefik (Kubernetes ingress controller) handles routing. This separation of concerns is clean:
- pf (firewall): Public ↔ Private IP translation, HA failover, NAT
- Traefik (cluster): HTTP routing, TLS termination, rate limiting, path-based routing
No load balancing logic at the firewall level. The rdr-to pool distributes new connections, but Traefik is what actually routes requests to backend services.
Real-Time Failover Demo
Here’s what happens when the master firewall is shut down mid-connection. Notice: zero packet loss.
The demo shows:
- Before failover: pings going through master firewall (RTT ~5ms)
- Failover happens: master goes down, backup takes over in <100ms
- After failover: pings continue, new RTT shows traffic now goes through backup
- No failures: not a single ping was dropped
This works because pfsync keeps the backup’s state table synchronized. When the backup becomes master, all existing connections are already in its state table.
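A rough sketch of how to reproduce the demo by hand (the VIP, interface, and group names follow the placeholders used above):

```sh
# client: keep a continuous ping running against the public VIP
ping 72.X.X.100

# master firewall: demote all CARP groups so the backup takes over
ifconfig -g carp carpdemote 50

# restore the master's eligibility afterwards
ifconfig -g carp -carpdemote 50

# either node: check who is master right now
ifconfig carp0 | grep 'carp:'
```

Demoting via the carp interface group is gentler than pulling the plug, but the observable result is the same: the ping never stops.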
Port Forwarding Strategy
We forward public ports to private cluster ports:
- 80 → cluster port 80 (HTTP)
- 443 → cluster port 443 (HTTPS)
The firewall does simple 1:1 forwarding. Traefik inside the cluster routes requests to backend services.
This separation is clean: the firewall doesn’t care what service is running. It just translates public IPs to private IPs. Traefik handles everything else.
Load Balancing at the Firewall Level
The pf rdr-to rule points at a pool of backend worker IPs. When the pool is a table, pf hands new connections out round-robin; listing the addresses inline instead allows the source-hash pool option, which pins clients to a worker by hashing the source address.
rdr-to <worker-pool> port 80 round-robin
This spreads traffic across workers, but Traefik does the actual routing. pf just ensures the connection gets to one worker; Traefik then handles:
- Layer-7 routing (by hostname, path, headers)
- TLS termination
- Service discovery
Traefik Redundancy and Failover
Traefik itself runs redundantly across multiple worker nodes. We deploy Traefik with multiple replicas (typically 3+), so if one Traefik pod crashes or a node fails, other Traefik instances immediately take over routing.
This creates a complete HA stack:
- Firewall layer: CARP handles public IP failover (OpenBSD VMs)
- Ingress layer: Traefik replicas handle traffic distribution (Kubernetes)
- Application layer: App pods run with multiple replicas
If a single component fails at any layer, the others absorb the traffic. The firewall doesn’t know (or care) which Traefik pod handles a request—it just forwards to any worker running Traefik.
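A sketch of the ingress-layer redundancy (names, namespace, and image tag are illustrative; in practice this is usually installed via the Traefik Helm chart):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: traefik
  namespace: ingress
spec:
  replicas: 3                  # survive a pod or node failure
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      labels:
        app: traefik
    spec:
      # spread replicas across nodes so a single node failure
      # cannot take out all ingress capacity at once
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: traefik
                topologyKey: kubernetes.io/hostname
      containers:
        - name: traefik
          image: traefik:v2.10
          ports:
            - containerPort: 80
            - containerPort: 443
```

The anti-affinity hint matters: three replicas on the same worker would still be a single point of failure from the firewall's perspective.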
Sticky Sessions with Traefik
For stateful applications that need session affinity, Traefik uses sticky session cookies. Services configured in Kubernetes can enable sticky cookies via annotations:
apiVersion: v1
kind: Service
metadata:
  annotations:
    traefik.ingress.kubernetes.io/service.sticky.cookie: "true"
    traefik.ingress.kubernetes.io/service.sticky.cookie.httponly: "true"
    traefik.ingress.kubernetes.io/service.sticky.cookie.samesite: "none"
    traefik.ingress.kubernetes.io/service.sticky.cookie.secure: "true"
  name: my-service
spec:
  ports:
    - port: 80
  selector:
    app: my-service
Traefik sets a cookie with a session ID. Subsequent requests from the same client automatically route to the same backend pod. No application changes needed.
So: pf distributes to workers, Traefik distributes to pods (with session stickiness if needed).
Updating OpenBSD VMs: Trivial
One of the best parts: updates are dead simple.
OpenBSD’s syspatch command patches the OS in ~30 seconds. Reboot takes ~1 minute. When the primary firewall reboots:
- CARP automatically fails over to the backup
- pfsync keeps all connection state in sync
- Existing connections continue without interruption
- New connections route through the backup
Zero downtime. No coordination needed. Just reboot and move on.
Then update the backup. Done.
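The whole cycle, sketched as commands (run on the primary first, then repeated on the backup once CARP has settled):

```sh
syspatch        # fetch and apply binary patches to -stable (~30 s)
reboot          # CARP fails over to the backup for the duration
# once this node is back in the CARP pair, run the same two
# commands on the other firewall
```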
This simplicity is a huge win over cloud load balancers, which often require scheduled maintenance windows or blue-green deployments for updates.
What We Learned
1. State synchronization is non-negotiable for failover
Stateless failover (Keepalived) is fine if your application handles reconnects. Most of ours didn’t. pfsync solved this by keeping state synchronized, so the backup could take over transparently.
2. Dedicated sync networks matter
If pfsync shares bandwidth with data traffic, state sync gets queued and lags. A fast, direct link between firewalls is worth the extra networking.
3. OpenBSD/pf is simple but powerful
pf syntax is clearer than iptables. Configuration lives in one file. Rules are easier to audit and modify. We didn’t need a GUI — pfctl and a text editor were enough.
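In practice that means a handful of pfctl invocations against the default ruleset path:

```sh
pfctl -nf /etc/pf.conf    # syntax-check the ruleset without loading it
pfctl -f /etc/pf.conf     # load the ruleset
pfctl -s rules            # show the active rules
pfctl -s states           # show the state table (what pfsync replicates)
pfctl -vsi                # verbose filter statistics and counters
```

The `-n` dry run is the habit worth building: it catches typos before they become an outage.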
4. CARP works, but requires planning
CARP VIPs work great, but you need to think about:
- Which interface is the VIP on?
- What’s the priority (master vs backup)?
- How fast do you want failover? (advskew tuning)
- Is your sync network fast enough?
Getting this wrong means either failover doesn’t happen, or the master and backup both think they’re the master (split-brain).
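For reference, the knobs that answer those questions, sketched for the backup node (addresses, interface names, and the password are placeholders):

```
# backup: /etc/hostname.carp0 — higher advskew means less preferred,
# so this node only advertises as master when the primary goes quiet
inet 72.X.X.100 255.255.255.0 NONE vhid 1 carpdev em0 pass mysecret advskew 100

# both nodes, /etc/sysctl.conf: a recovered master reclaims the VIP,
# and all CARP interfaces fail over together as a group
net.inet.carp.preempt=1
```

Without preempt enabled, a node that loses only its external interface can keep mastership of the internal VIP — exactly the half-failed state you want to avoid.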
5. Load balancing at the firewall has limits
Round-robin distribution across NodePorts worked fine for HTTP/HTTPS, but it’s not session-aware. Sticky sessions had to happen at the application layer (cookies, JSESSIONID, etc.) or via Kubernetes ingress rules.
The Trade-off
What we gave up:
- Automatic scaling (firewall is fixed capacity)
- Cloud flexibility (bound to on-prem hardware)
- SLA guarantees (no vendor support — we own the operational risk)
- Automatic updates (OpenBSD stable releases, manual patching)
What we gained:
- Full control over traffic rules
- No cloud egress charges
- Stateful failover without dropped connections
- Simplicity (a few small files: pf.conf plus the hostname.if CARP/pfsync configs)
- Cost (free OS, standard hardware)
For a private cluster serving internal users, this trade-off made sense. For a public SaaS platform, cloud load balancers might be worth the cost.
Would We Do It Again?
Absolutely. This setup has been running for years in production. Zero regrets.
CARP + pfsync is the right tool for stateful HA failover when you control the infrastructure. The setup is straightforward, the failover is sub-second, and the operational overhead is minimal. Updating firewalls is easier than updating most managed cloud services.
The trade-offs are favorable:
- ✅ Full control over traffic rules
- ✅ No cloud egress charges
- ✅ Sub-second stateful failover (unbeatable for connection continuity)
- ✅ Simple to operate and update
- ❌ You own it (no vendor support)
- ❌ Bound to your infrastructure provider’s network
For a private cluster where you already own the infrastructure, OpenBSD CARP is the pragmatic choice. It works. It keeps working. Updates are trivial. Failover is transparent.
If you’re considering this, you need comfort with:
- Network architecture (VIPs, multicast, ARP)
- Command-line firewall management (pf syntax is learnable)
- Monitoring for split-brain scenarios (rare but possible)
- Testing failover regularly
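The split-brain check from that list is a one-liner per node (carp0 is the placeholder interface name used throughout): exactly one firewall should report MASTER per vhid.

```sh
ifconfig carp0 | grep 'carp:'
# healthy pair: "carp: MASTER ..." on one node, "carp: BACKUP ..." on the other
# both reporting MASTER usually means CARP advertisements are being
# blocked between the two firewalls
```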
If you have those skills and control your own infrastructure, stop paying for cloud load balancers. CARP is simpler and better for your use case.