2025-09-25: Pritunl VPN IP Change Incident
Root cause analysis for unexpected Pritunl VPN IP change on September 25, 2025
Incident Summary
Date & Time
- Date: September 25, 2025
- Time: Around 2:30 AM IST
- Duration: ~30 minutes
- Severity: Medium
What Happened
The Pritunl VPN server became inaccessible when its VM restarted and was assigned a new ephemeral public IP. Since the old IP was used in allowlists and client profiles, users were unable to connect until corrective actions were taken.
Impact
Services Affected
- Completely Down: VPN access for all developers, SREs, and operators.
Business Impact
- Teams temporarily lost access to AKS and GKE clusters over VPN.
- No production customer-facing services were directly impacted.
Root Cause
- The VM hosting Pritunl was configured with an ephemeral public IP.
- When the VM restarted, Google Cloud released the old IP and allocated a new one.
- The old IP was hardcoded in allowlists (AKS, GKE) and client profiles, breaking connectivity.
Resolution
Immediate Fix
- Promoted the Pritunl VM IP from ephemeral to static to prevent future changes.
- Updated authorized IPs in Azure Kubernetes Service (AKS) and Google Kubernetes Engine (GKE).
- Users were guided to re-import their VPN profiles via https://vpn53.wealthy.systems.
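For reference, the promotion and allowlist updates above could be scripted roughly as follows. This is a minimal sketch, not a record of the exact commands run during the incident: it only builds and prints the gcloud and az command lines. The address name, cluster names, resource group, region, and the IP 203.0.113.10 (a documentation-range address) are all hypothetical placeholders, and the GKE cluster is assumed to be regional.

```python
import shlex


def promote_ip_command(address_name, ip, region):
    """Build the gcloud command that reserves an in-use ephemeral IP as static."""
    return [
        "gcloud", "compute", "addresses", "create", address_name,
        f"--addresses={ip}", f"--region={region}",
    ]


def aks_allowlist_command(resource_group, cluster, cidrs):
    """Build the az command that sets the AKS API server authorized IP ranges."""
    return [
        "az", "aks", "update",
        "--resource-group", resource_group,
        "--name", cluster,
        "--api-server-authorized-ip-ranges", ",".join(cidrs),
    ]


def gke_allowlist_command(cluster, region, cidrs):
    """Build the gcloud command that sets GKE control plane authorized networks."""
    return [
        "gcloud", "container", "clusters", "update", cluster,
        f"--region={region}",
        "--enable-master-authorized-networks",
        "--master-authorized-networks", ",".join(cidrs),
    ]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    vpn_cidr = "203.0.113.10/32"
    commands = [
        promote_ip_command("pritunl-vpn-ip", "203.0.113.10", "us-central1"),
        aks_allowlist_command("example-rg", "example-aks", [vpn_cidr]),
        gke_allowlist_command("example-gke", "us-central1", [vpn_cidr]),
    ]
    for cmd in commands:
        print(shlex.join(cmd))  # print only; nothing is executed
```

Both allowlist flags are understood to set the full list rather than append to it, so any existing authorized ranges would need to be included alongside the VPN address.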
Validation
- Confirmed VPN connectivity restored for all users.
- Verified AKS and GKE allowlists accepted traffic from the new static IP.
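The allowlist part of this validation can also be checked offline before testing live connectivity. The sketch below, assuming the authorized ranges have been exported as a list of CIDR strings, tests whether a given source IP would be covered. The addresses are hypothetical documentation-range values, not the real VPN IPs.

```python
import ipaddress


def is_allowed(ip, allowlist):
    """Return True if ip falls inside at least one CIDR range in allowlist."""
    addr = ipaddress.ip_address(ip)
    return any(
        addr in ipaddress.ip_network(cidr, strict=False) for cidr in allowlist
    )


if __name__ == "__main__":
    old_ip = "198.51.100.7"   # hypothetical former ephemeral IP
    new_ip = "203.0.113.10"   # hypothetical new static IP

    stale_allowlist = ["198.51.100.7/32"]
    updated_allowlist = ["203.0.113.10/32"]

    # Before the fix: only the old IP is authorized, so the new one is rejected.
    print(is_allowed(new_ip, stale_allowlist))    # False
    # After the fix: the new static IP is authorized.
    print(is_allowed(new_ip, updated_allowlist))  # True
```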
Contributing Factors
- Reliance on ephemeral IPs for critical infrastructure.
Future Mitigation Plan
- Always allocate static IPs for production-critical infrastructure (VPNs, gateways, bastions).
- Implement monitoring and alerts on VM restarts and public IP changes.
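One possible shape for the public IP check is sketched below. It assumes the check runs on the Pritunl VM itself and reads the external IP from the Google Compute Engine metadata server, then compares it with the expected static IP. The expected IP and the alert text are hypothetical, and delivering the alert (email, chat, pager) is left out.

```python
import urllib.request

# GCE metadata endpoint for the first interface's external IP.
METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/network-interfaces/0/access-configs/0/external-ip"
)


def fetch_external_ip(timeout=2.0):
    """Read this VM's external IP from the metadata server (works only on GCE)."""
    req = urllib.request.Request(METADATA_URL, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode().strip()


def check_public_ip(expected_ip, fetch=fetch_external_ip):
    """Return an alert message if the observed IP differs, otherwise None."""
    observed = fetch()
    if observed != expected_ip:
        return f"ALERT: public IP changed from {expected_ip} to {observed}"
    return None


if __name__ == "__main__":
    # Stubbed fetcher so the example runs anywhere; on the VM, omit `fetch`.
    message = check_public_ip("203.0.113.10", fetch=lambda: "198.51.100.7")
    if message:
        print(message)
```

Passing the fetcher as a parameter keeps the comparison logic testable without network access. Run from a scheduler after boot and at intervals, a check like this would surface an IP change within minutes of a restart.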
Lessons Learned
- Critical access services (VPNs, jump hosts) must never rely on ephemeral IPs.
- Even small infrastructure changes (like VM restarts) can cause widespread developer downtime.