Patch Management SOP

Our process for applying, testing, and tracking patches across applications and infrastructure.

Field            Value
Document ID      SOP-005
Classification   Internal
Owner            SRE Team
Effective Date   April 2026
Review Cycle     Semi-Annual

This document describes how we manage patches and updates across our applications, infrastructure, and third-party software.


Patch Classification

Patches are prioritized based on severity:

Priority    When to Apply           Examples
Emergency   Immediately             Critical security vulnerability, active exploit, data breach risk
High        Within 7 days           High-severity security fix, major bug affecting users
Routine     Next scheduled window   Minor bug fixes, feature updates, dependency upgrades
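
The priority table can be read as a simple deadline rule. The sketch below is illustrative only and is not part of our tooling: the names Priority and apply_by are hypothetical, and it assumes the caller already knows when the patch was identified and when the next scheduled window starts.

```python
from datetime import datetime, timedelta
from enum import Enum


class Priority(Enum):
    EMERGENCY = "Emergency"
    HIGH = "High"
    ROUTINE = "Routine"


def apply_by(priority: Priority, identified_at: datetime,
             next_window: datetime) -> datetime:
    """Return the latest time by which a patch should be applied.

    Hypothetical helper: mirrors the priority table, nothing more.
    """
    if priority is Priority.EMERGENCY:
        # "Immediately": the deadline is the moment the patch is identified.
        return identified_at
    if priority is Priority.HIGH:
        # "Within 7 days" of identification.
        return identified_at + timedelta(days=7)
    # Routine patches wait for the next scheduled window.
    return next_window
```

For example, a High patch identified on 1 April at 09:00 would be due by 8 April at 09:00, regardless of when the next window falls.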

Application Patches

Release Process

  • Patches are released via a dedicated branch in GitHub.
  • All changes go through code review and are merged via pull request.
  • Patches are deployed to staging first for testing and validation.
  • After staging validation, the patch is deployed to production.
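
The release steps above form a strict sequence of gates. As a rough sketch (the PatchRelease fields and the next_step function are hypothetical names, not an existing tool), the ordering could be modelled like this:

```python
from dataclasses import dataclass


@dataclass
class PatchRelease:
    """Hypothetical record of where a patch is in the release process."""
    branch: str
    review_approved: bool = False
    merged_via_pull_request: bool = False
    staging_validated: bool = False


def next_step(release: PatchRelease) -> str:
    """Return the next action for a patch, following the order of the steps above."""
    if not release.review_approved:
        return "code review"
    if not release.merged_via_pull_request:
        return "merge pull request"
    if not release.staging_validated:
        return "deploy to staging and validate"
    # Production is reachable only after every earlier gate has passed.
    return "deploy to production"
```

The point of the sketch is that production deployment is never the next step until review, merge, and staging validation are all complete.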

Rollback

  • If a major issue is found after deployment, a rollback is performed by redeploying the previous stable version.
  • Rollback decisions are made by the on-call engineer or team lead based on the severity of the issue.
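
Choosing "the previous stable version" can be made explicit. The following is a minimal sketch under an assumption this SOP does not state: that deployment history is available as an ordered list of (version, was_stable) pairs, oldest first, with the current deployment last. The function name rollback_target and the version strings are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def rollback_target(history: Sequence[Tuple[str, bool]]) -> Optional[str]:
    """Return the most recent stable version deployed before the current one.

    `history` is ordered oldest first; the last entry is the current deployment.
    Returns None when no earlier stable version exists.
    """
    # Skip the current (last) entry and walk backwards through earlier ones.
    for version, was_stable in reversed(history[:-1]):
        if was_stable:
            return version
    return None
```

Whether to roll back at all remains a human decision by the on-call engineer or team lead; the sketch only identifies the version to redeploy.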

Infrastructure Patches

Kubernetes (GKE) Upgrades

  • Node upgrades are first applied to the management cluster.
  • The management cluster is monitored for 2 days to ensure stability.
  • After successful validation, node pools in the main cluster are upgraded one pool at a time.
  • GKE version upgrades follow the same process: management cluster first, then the main cluster in phases.
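
The phased order can be written down as a plan before any upgrade starts. This is a sketch only: it builds an ordered list of steps and does not call any GKE API. The cluster and group names passed in are placeholders, and "node group" here means what GKE calls a node pool.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List

# Monitoring period for the management cluster, as stated in this SOP.
SOAK_PERIOD = timedelta(days=2)


@dataclass(frozen=True)
class Step:
    target: str
    action: str


def rollout_plan(management_cluster: str, main_node_groups: List[str]) -> List[Step]:
    """Build the ordered upgrade plan: management cluster, soak, then each group."""
    steps = [
        Step(management_cluster, "upgrade nodes"),
        Step(management_cluster, f"monitor for {SOAK_PERIOD.days} days"),
    ]
    # Main-cluster groups are upgraded one at a time, in the order given.
    for group in main_node_groups:
        steps.append(Step(group, "upgrade nodes"))
    return steps
```

Calling rollout_plan("mgmt", ["group-a", "group-b"]) yields four steps: two for the management cluster, then one per main-cluster group.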

Other Infrastructure Software

  • For software that requires downtime, a maintenance window is scheduled and communicated to the team.
  • A backup is taken before the upgrade.
  • The upgrade is performed during the maintenance window and validated before resuming normal operations.
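
The ordering matters here: the backup comes before the upgrade, and normal operations resume only after validation. A minimal sketch of that sequence follows. The three callables are hypothetical hooks supplied by whoever runs the upgrade, and staying in maintenance on a failed validation is an assumption, since this SOP does not say what happens in that case.

```python
from typing import Callable, List


def run_maintenance(
    take_backup: Callable[[], None],
    perform_upgrade: Callable[[], None],
    validate: Callable[[], bool],
) -> List[str]:
    """Run the maintenance steps in order and return a log of what happened."""
    log: List[str] = []
    # The backup always runs first so there is something to restore from.
    take_backup()
    log.append("backup taken")
    perform_upgrade()
    log.append("upgrade performed")
    if validate():
        log.append("validated, resuming normal operations")
    else:
        # Assumption: do not resume until someone has looked at the failure.
        log.append("validation failed, staying in maintenance")
    return log
```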

Testing & Validation

  • All patches (application and infrastructure) are tested in a non-production environment before being applied to production.
  • Automated tests and manual validation are used to verify that the patch does not introduce regressions.
  • Infrastructure patches are validated through a phased rollout (management cluster → main cluster).

Tracking

  • All infrastructure changes are managed via Terraform and tracked in a GitHub repository (GitOps). Every change is version-controlled and auditable.
  • Application deployments are managed via Helm and Argo CD. All deployment configurations and changes are stored in GitHub.
  • Application patches are tracked through GitHub pull requests.
  • All patches are linked to the relevant issue or vulnerability that triggered them.
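
The linking requirement could be checked automatically against a pull request description. The sketch below is one possible approach, not an existing check in our pipeline: the function names are hypothetical, it recognizes only "#123"-style issue references and CVE identifiers, and the CVE in the usage note is a made-up placeholder.

```python
import re
from typing import List

# "#123"-style issue references that are not part of a longer word.
ISSUE_REF = re.compile(r"(?<!\w)#(\d+)\b")
# CVE identifiers: "CVE-", a four-digit year, then four or more digits.
CVE_REF = re.compile(r"\bCVE-\d{4}-\d{4,}\b")


def find_references(pr_description: str) -> List[str]:
    """Return issue references first, then CVE IDs, found in the description."""
    issues = [f"#{number}" for number in ISSUE_REF.findall(pr_description)]
    cves = CVE_REF.findall(pr_description)
    return issues + cves


def is_linked(pr_description: str) -> bool:
    """True when the description mentions at least one issue or CVE."""
    return bool(find_references(pr_description))
```

For instance, a description reading "Fixes #482, addresses CVE-2099-00001" would count as linked, while "Bump dependency" would not.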