Incident Response SOP

Standard operating procedure for detecting, responding to, and recovering from security incidents.

Field | Value
Document ID | SOP-004
Classification | Internal
Owner | SRE + CTO
Effective Date | April 2026
Review Cycle | Semi-Annual

This SOP defines how Wealthy detects, classifies, responds to, and recovers from security incidents across GCP, AWS, and endpoint infrastructure.


Scope

Applies to all Wealthy information assets:

  • GCP: GKE clusters, Cloud SQL, Cloud Storage, Compute Engine, Cloud Logging
  • AWS: CloudFront (WAF, Bot Control, country blocking), S3, CloudWatch, Lambda; used for insurance infrastructure
  • Network: Cloudflare CDN/WAF, GCP Load Balancers, StarkNet Gateway, Pritunl VPN
  • Applications: All production services, APIs, customer/partner-facing apps
  • Endpoints: Employee laptops, mobile devices, portable storage

Incident Types

Type | Examples
Electronic/Software | Unauthorized access, malware, compromised credentials, phishing, DDoS, unauthorized code changes
Infrastructure | GKE pod compromise, Cloud SQL breach, CloudFront/WAF bypass, API auth failures at scale, AWS resource misuse
Physical | Lost/stolen laptops, mobile devices, portable storage

Severity Classification

Severity | Response Time | Examples
Critical | 15 min | Active data breach, ransomware, full production outage
High | 1 hour | Suspected unauthorized access, DDoS, partial outage
Medium | 4 hours | Single account compromise, endpoint malware
Low | 24 hours | Suspicious email, minor policy violation
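
The response-time targets above can be encoded so that tooling computes a response deadline from the detection time. The following is a minimal Python sketch, not existing Wealthy tooling; the names `RESPONSE_TIME` and `response_deadline` are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Response-time targets copied from the severity table above.
RESPONSE_TIME = {
    "Critical": timedelta(minutes=15),
    "High": timedelta(hours=1),
    "Medium": timedelta(hours=4),
    "Low": timedelta(hours=24),
}


def response_deadline(severity: str, detected_at: datetime) -> datetime:
    """Return the latest time by which the response must have started."""
    if severity not in RESPONSE_TIME:
        raise ValueError(f"unknown severity: {severity!r}")
    return detected_at + RESPONSE_TIME[severity]


# Example with a made-up detection time.
detected = datetime(2026, 4, 1, 9, 0, tzinfo=timezone.utc)
print(response_deadline("High", detected).isoformat())  # 2026-04-01T10:00:00+00:00
```

Using timezone-aware timestamps avoids ambiguity when responders are in different time zones.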

Detection & Alerting

GCP Stack

OTEL SDKs → OTEL Agent → Victoria Metrics → Grafana → Slack/Telegram
Uptime Kuma → Slack (availability monitoring)
GCP Security Command Center → Real-time threat findings

AWS Stack

CloudFront Access Logs → CloudWatch Logs → CloudWatch Alarms → SNS → Slack/Email
AWS WAF Logs → CloudWatch → Alerts on rule matches, bot activity, blocked requests
AWS GuardDuty → Threat detection → SNS notifications

Alert Thresholds

Metric | Threshold
CPU | >80%
Memory | >85%
Disk | >90%
API Response Time | >5 seconds
Error Rate | >5%
Service Availability | <99%
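
As an illustration of how the thresholds above could be evaluated against a metrics snapshot, here is a hedged Python sketch. The metric keys (`cpu_pct`, `api_response_s`, and so on) and the function `breached` are hypothetical names; the real alert rules live in Grafana and CloudWatch, and their configuration is not shown in this SOP.

```python
import operator

# (comparison, limit) pairs copied from the threshold table above.
# Percentages are plain numbers (80 means 80%); latency is in seconds.
THRESHOLDS = {
    "cpu_pct": (operator.gt, 80),
    "memory_pct": (operator.gt, 85),
    "disk_pct": (operator.gt, 90),
    "api_response_s": (operator.gt, 5),
    "error_rate_pct": (operator.gt, 5),
    "availability_pct": (operator.lt, 99),
}


def breached(metrics: dict[str, float]) -> list[str]:
    """Return the names of metrics that cross their alert threshold."""
    alerts = []
    for name, value in metrics.items():
        if name not in THRESHOLDS:
            continue  # ignore metrics that have no threshold defined
        compare, limit = THRESHOLDS[name]
        if compare(value, limit):
            alerts.append(name)
    return alerts


# Made-up sample snapshot.
sample = {"cpu_pct": 91.0, "memory_pct": 40.0, "availability_pct": 98.5}
print(breached(sample))  # ['cpu_pct', 'availability_pct']
```

Note that the comparisons are strict, matching the table: a CPU reading of exactly 80% does not alert.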

Response Procedure

Step 1: Detect & Acknowledge (0–5 min)

  • Alert received via Slack/Telegram or reported to security@wealthy.in
  • First responder acknowledges in Slack thread

Step 2: Assess & Classify (5–15 min)

  • Determine severity using the Severity Classification table
  • Identify affected systems (GCP, AWS, or both)
  • Check if CERT-In reportable (data breach, malware, DDoS, defacement, phishing, identity theft)
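
The reportability check in Step 2 can be sketched as a simple lookup. This is an illustrative Python example only: the set mirrors the categories named in the bullet above, not the full regulatory list, and `is_cert_in_reportable` is a hypothetical helper name.

```python
# Categories this SOP names as CERT-In reportable in Step 2. Treat the set as
# a mirror of that bullet only; it is not the authoritative regulatory list.
CERT_IN_REPORTABLE = frozenset({
    "data breach",
    "malware",
    "ddos",
    "defacement",
    "phishing",
    "identity theft",
})


def is_cert_in_reportable(category: str) -> bool:
    """Case-insensitive membership check against the Step 2 list."""
    return category.strip().lower() in CERT_IN_REPORTABLE


print(is_cert_in_reportable("DDoS"))              # True
print(is_cert_in_reportable("policy violation"))  # False
```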

Step 3: Contain (15–60 min)

  • GCP: Isolate affected GKE pods/nodes, revoke IAM credentials, block IPs at Cloudflare
  • AWS: Update WAF rules, block IPs via CloudFront geo-restriction or WAF IP sets, revoke compromised IAM keys, isolate affected resources
  • Common: Disable compromised accounts, revoke API keys/tokens, preserve logs for forensics

Step 4: Escalate

L1: DevOps Engineers → initial triage and containment
L2: CTO → technical decisions (Critical/High or unresolved)
L3: CEO → business-critical, regulatory, or PR impact
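
The escalation levels above can be expressed as a small decision function. The sketch below is a hedged Python example; `escalation_path` and its flags are hypothetical, and it assumes the L2 and L3 conditions are evaluated independently, which the SOP does not state explicitly.

```python
def escalation_path(severity: str, unresolved: bool = False,
                    business_impact: bool = False) -> list[str]:
    """Return the escalation levels to engage, lowest first."""
    path = ["L1: DevOps Engineers"]  # L1 always performs initial triage
    if severity in {"Critical", "High"} or unresolved:
        path.append("L2: CTO")
    if business_impact:  # business-critical, regulatory, or PR impact
        path.append("L3: CEO")
    return path


print(escalation_path("Medium"))
print(escalation_path("Critical", business_impact=True))
```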

Step 5: Eradicate & Recover

  • Remove malware/malicious artifacts
  • Patch exploited vulnerabilities
  • Restore from clean backups if needed
  • GCP: Redeploy affected GKE workloads, rotate Cloud SQL credentials
  • AWS: Rotate IAM keys, update CloudFront/WAF configs, redeploy Lambda functions if compromised
  • Verify system integrity before returning to production

Step 6: Document

  • Update incident timeline in Slack channel
  • Preserve all logs, screenshots, evidence
  • Create Plane task for follow-up actions

Physical Incident Response

Actions for each scenario, in order:

Lost/Stolen Laptop (Mac / Windows)

  • Report to security@wealthy.in
  • Sign out Workspace sessions (admin.google.com → User → Sign out all sessions)
  • Revoke SSO / VPN / all app sessions
  • Rotate any stored credentials on that device
  • If MDM-enrolled in Fleet (DEP for Mac, Azure AD for Windows): trigger remote lock + remote wipe via Fleet UI → Hosts → select host → Wipe
  • Disable Wazuh agent enrollment
  • For non-MDM hosts (manual / BYOD): rely on FileVault / BitLocker encryption + key non-disclosure to protect data at rest
  • Guide user to trigger iCloud Find My Mac (if enabled) for additional remote lock/erase
  • Check access logs for post-incident activity
  • File police report if theft
  • Treat as data breach only if disk encryption was unconfirmed

Lost/Stolen Mobile

  • Report to security@wealthy.in
  • Sign out Workspace sessions + revoke app sessions (Workspace Basic Endpoint Mgmt supports this for Android/iOS)
  • Reset auth credentials
  • Guide user to trigger Find My iPhone / Find My Device
  • If device contained KYC documents / customer PII beyond Workspace apps: treat as data breach

Lost/Stolen Storage (USB / external drive)

  • Report to security@wealthy.in
  • Assess what was on it
  • If unencrypted sensitive data: treat as data breach (POL-008)
  • Use of unencrypted external drives for Wealthy data is itself a policy violation under POL-009 Data Classification; investigate
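
The data-breach conditions stated for the three physical scenarios above can be summarised in one decision function. This is a minimal Python sketch under stated assumptions: the `LostAsset` type, its field names, and `treat_as_data_breach` are hypothetical and exist only to make the rules explicit.

```python
from dataclasses import dataclass


@dataclass
class LostAsset:
    kind: str  # "laptop", "mobile", or "storage"
    # Laptop: FileVault / BitLocker confirmed on. Storage: drive is encrypted.
    encryption_confirmed: bool = False
    # Mobile: KYC documents / customer PII beyond Workspace apps.
    # Storage: any sensitive data on the drive.
    holds_sensitive_data: bool = False


def treat_as_data_breach(asset: LostAsset) -> bool:
    """Apply the per-scenario breach rule from the physical incident actions."""
    if asset.kind == "laptop":
        return not asset.encryption_confirmed
    if asset.kind == "mobile":
        return asset.holds_sensitive_data
    if asset.kind == "storage":
        return asset.holds_sensitive_data and not asset.encryption_confirmed
    raise ValueError(f"unknown asset kind: {asset.kind!r}")


print(treat_as_data_breach(LostAsset("laptop", encryption_confirmed=True)))  # False
print(treat_as_data_breach(LostAsset("storage", holds_sensitive_data=True)))  # True
```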

Notification SLAs: Every Clock in One Place

Multiple clocks run in parallel when a breach is detected. Each has a different owner and a different escalation path. Missing any of them is a regulatory / contractual finding.

Clock | Starts when | Owner | Deadline | Driver
Internal: security@wealthy.in | First-responder detects suspicious event | Anyone who spots it | Immediately, no delay | POL-022 §3, POL-008
Internal: CTO + Compliance notified | security@wealthy.in receives first alert | SRE on-call | Within 2 hours | POL-022 §3
Vendor-to-Wealthy (inbound) | Vendor detects a breach that affects Wealthy data | Vendor (per contract) | Within 2 hours of vendor’s detection | STD-015 §4.1, SOP-010 §3.4
Wealthy → CERT-In | Wealthy-side detection (direct or via vendor) | CTO + Compliance | Within 6 hours | CERT-In Directions 2022
Wealthy → DPDP Board | Personal-data breach affecting Data Principals confirmed | CTO + Privacy Officer | Without undue delay (Wealthy internal SLA: 24h) | DPDP Act 2023 §8(6)
Wealthy → affected Data Principals | Personal-data breach confirmed | CTO + CEO (public-facing) | Without undue delay (Wealthy internal SLA: 24h, coordinated with DPDP Board notification) | DPDP Act 2023 §8(6)
Wealthy → cyber-insurance carrier | Breach requiring forensics | Legal | Per policy (typically 72h) | Insurance contract
Wealthy → Insurance/Broking partners | Breach affecting a partner’s data | Ops Manager + PM | Per MSA (typically 24h) | Partner MSAs

The 2-hour vendor-to-Wealthy clock exists specifically so that Wealthy can still meet its 6-hour CERT-In clock when the detection originated at a vendor. Without that SLA in the vendor MSA, Wealthy’s own 6-hour window is at risk.
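
The arithmetic behind that risk can be made concrete. The Python sketch below is illustrative only and takes the conservative reading, as an explicit assumption, that the 6-hour window is counted from the vendor's detection time; `time_left_for_cert_in` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

VENDOR_SLA = timedelta(hours=2)   # vendor-to-Wealthy clock
CERT_IN_SLA = timedelta(hours=6)  # Wealthy-to-CERT-In clock


def time_left_for_cert_in(vendor_detected_at: datetime,
                          wealthy_notified_at: datetime) -> timedelta:
    """Time remaining to report once the vendor has told Wealthy.

    Assumption: the 6-hour window is counted from the vendor's detection,
    which is the worst case for Wealthy.
    """
    return vendor_detected_at + CERT_IN_SLA - wealthy_notified_at


# Made-up detection time.
t0 = datetime(2026, 4, 1, 9, 0, tzinfo=timezone.utc)
print(time_left_for_cert_in(t0, t0 + VENDOR_SLA))          # 4:00:00
print(time_left_for_cert_in(t0, t0 + timedelta(hours=5)))  # 1:00:00
```

Under that assumption, a vendor that honours the 2-hour SLA leaves Wealthy at least 4 hours to assess, draft, and submit; a vendor that takes 5 hours leaves only 1.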

CERT-In: the 6-hour clock in detail

Mandatory: report every CERT-In-reportable incident within 6 hours of detection. See CERT-In Compliance (SOP-002) for submission format and the list of reportable categories.

Hour 0:  Detect → Alert security@wealthy.in
Hour 1:  SRE on-call assesses severity + reportability (is it in scope for CERT-In?)
Hour 2:  CTO + Compliance notified; parallel: vendor MSA 2h clock (if vendor-originated)
Hour 4:  Compliance drafts CERT-In report; CTO reviews
Hour 6:  Submit CERT-In report at cert-in.org.in
Hour 6+: Follow-up updates per CERT-In's request cadence
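
The hourly checkpoints above translate directly into wall-clock due times once the detection time is known. The following Python sketch shows one way to generate that schedule; `CHECKPOINTS` and `cert_in_schedule` are illustrative names, and the detection time is made up.

```python
from datetime import datetime, timedelta, timezone

# (hours after detection, task), taken from the timeline above.
CHECKPOINTS = [
    (0, "Detect; alert security@wealthy.in"),
    (1, "SRE on-call assesses severity and reportability"),
    (2, "CTO + Compliance notified"),
    (4, "Compliance drafts CERT-In report; CTO reviews"),
    (6, "Submit CERT-In report"),
]


def cert_in_schedule(detected_at: datetime) -> list[tuple[datetime, str]]:
    """Return (due time, task) pairs for one incident."""
    return [(detected_at + timedelta(hours=h), task) for h, task in CHECKPOINTS]


detected = datetime(2026, 4, 1, 22, 30, tzinfo=timezone.utc)
for due, task in cert_in_schedule(detected):
    print(due.strftime("%Y-%m-%d %H:%M"), task)
```

An incident detected late in the evening has its submission deadline in the early hours of the next day, which is worth making visible to whoever is on call.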

Notification Matrix (who, what, how)

Stakeholder | When | How | SLA (from “Notification SLAs” above)
security@wealthy.in | All incidents | Email | Immediately
DevOps / SRE on-call | All technical incidents | Slack (+ Telegram alerts from Wazuh / custom-ai for sev 10+) | Immediately
CTO | High/Critical, all data breaches | Slack + Phone | Within 2h
CEO | Business-critical / regulatory / public-impact | Direct call | Within 2h
Compliance | Regulatory-reportable | Email | Within 2h
Legal | Data breach / contractual implications | Email | Within 2h
CERT-In | Reportable cyber incidents | cert-in.org.in submission | Within 6h
DPDP Board | Personal-data breach affecting Data Principals | Email per DPDP rules | 24h internal SLA
Affected Data Principals | Personal-data breach | Email + in-app notice | 24h internal SLA
Insurance/Broking partners | Partner-data breach | Per MSA | 24h (typical)
Cyber-insurance carrier | Breach requiring forensics | Per policy | 72h (typical)
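
The matrix above can be read as a routing table: given the attributes of an incident, it yields who must be told, how, and by when. The Python sketch below is a hedged illustration of that lookup. The trigger flag names (`technical`, `data_breach`, and so on) are hypothetical labels invented for this example, and the partner and insurer rows are left out because their SLAs are set by contract rather than by this SOP.

```python
# Rows mirror the notification matrix above: (stakeholder, triggers, channel,
# SLA in hours). An empty trigger tuple means "every incident". The trigger
# flag names are hypothetical labels invented for this sketch.
MATRIX = [
    ("security@wealthy.in", (), "Email", 0),
    ("DevOps / SRE on-call", ("technical",), "Slack", 0),
    ("CTO", ("high_or_critical", "data_breach"), "Slack + Phone", 2),
    ("CEO", ("business_critical", "regulatory", "public_impact"), "Direct call", 2),
    ("Compliance", ("regulatory",), "Email", 2),
    ("Legal", ("data_breach", "contractual"), "Email", 2),
    ("CERT-In", ("cert_in_reportable",), "cert-in.org.in submission", 6),
    ("DPDP Board", ("personal_data_breach",), "Email per DPDP rules", 24),
    ("Affected Data Principals", ("personal_data_breach",), "Email + in-app notice", 24),
]


def recipients(flags: set[str]) -> list[tuple[str, str, int]]:
    """Return (stakeholder, channel, sla_hours) for every matching row."""
    return [
        (who, channel, sla_hours)
        for who, triggers, channel, sla_hours in MATRIX
        if not triggers or flags.intersection(triggers)
    ]


for who, channel, sla_hours in recipients({"technical", "high_or_critical"}):
    print(f"{who}: {channel}, within {sla_hours}h")
```

A personal-data breach would normally carry both the `data_breach` and `personal_data_breach` flags in this sketch, so that the CTO and Legal rows match alongside the DPDP rows.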

All incident information is confidential until resolved and a communication plan is approved by CTO.


Post-Incident Review

  • Critical/High: RCA within 48 hours
  • Medium/Low: RCA within 1 week

RCA must include: timeline, root cause, contributing factors, prevention measures with owners/deadlines, and lessons learned. Store in RCA repository.
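
The RCA deadlines and required contents above lend themselves to a simple completeness check. The sketch below is a minimal Python example under stated assumptions: this SOP does not say when the RCA clock starts, so the start time is passed in explicitly, and the section keys and function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# RCA windows from the Post-Incident Review list above.
RCA_WINDOW = {
    "Critical": timedelta(hours=48),
    "High": timedelta(hours=48),
    "Medium": timedelta(weeks=1),
    "Low": timedelta(weeks=1),
}

# Required RCA contents; the key names are illustrative.
REQUIRED_SECTIONS = (
    "timeline",
    "root_cause",
    "contributing_factors",
    "prevention_measures",  # must carry owners and deadlines
    "lessons_learned",
)


def rca_due(severity: str, clock_start: datetime) -> datetime:
    """Return the RCA due time; the caller decides when the clock starts."""
    return clock_start + RCA_WINDOW[severity]


def missing_sections(rca: dict) -> list[str]:
    """List required sections that are absent or empty in an RCA draft."""
    return [name for name in REQUIRED_SECTIONS if not rca.get(name)]


# Made-up draft for illustration.
draft = {"timeline": "09:00 alert fired", "root_cause": "expired certificate"}
print(missing_sections(draft))
# ['contributing_factors', 'prevention_measures', 'lessons_learned']
```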


Existing Controls

Layer | Control
Edge | Cloudflare WAF, AWS CloudFront WAF (Bot Control + geo-blocking)
Network | Pritunl VPN, GCP firewall rules, AWS security groups
Auth | Google SSO + 2FA (internal), OTP + PIN/Biometric (customers/partners)
Monitoring | Grafana, OTEL, Uptime Kuma (GCP); CloudWatch, GuardDuty (AWS)
Logging | GCP Cloud Logging (GCP); CloudWatch Logs (AWS)
Endpoints | Device management with remote wipe


Contact: security@wealthy.in
Next Review: April 2027