Incident Response SOP

Standard operating procedure for detecting, responding to, and recovering from security incidents.

Field	Value
Document ID	SOP-004
Classification	Internal
Owner	SRE + CTO
Effective Date	April 2026
Review Cycle	Semi-Annual

This SOP defines how Wealthy detects, classifies, responds to, and recovers from security incidents across GCP, AWS, and endpoint infrastructure.

Scope

Applies to all Wealthy information assets:

GCP: GKE clusters, Cloud SQL, Cloud Storage, Compute Engine, Cloud Logging
AWS: CloudFront (WAF, Bot Control, country blocking), S3, CloudWatch, Lambda — used for insurance infrastructure
Network: Cloudflare CDN/WAF, GCP Load Balancers, StarkNet Gateway, Pritunl VPN
Applications: All production services, APIs, customer/partner-facing apps
Endpoints: Employee laptops, mobile devices, portable storage

Incident Types

Type	Examples
Electronic/Software	Unauthorized access, malware, compromised credentials, phishing, DDoS, unauthorized code changes
Infrastructure	GKE pod compromise, Cloud SQL breach, CloudFront/WAF bypass, API auth failures at scale, AWS resource misuse
Physical	Lost/stolen laptops, mobile devices, portable storage

Severity Classification

Severity	Response Time	Examples
Critical	15 min	Active data breach, ransomware, full production outage
High	1 hour	Suspected unauthorized access, DDoS, partial outage
Medium	4 hours	Single account compromise, endpoint malware
Low	24 hours	Suspicious email, minor policy violation
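The response-time targets in the table above can be turned into a concrete deadline per incident. The following is a minimal sketch, not part of the current tooling; the function name `response_deadline` and the `RESPONSE_TARGETS` mapping are hypothetical, and only the four durations come from this SOP.

```python
from datetime import datetime, timedelta, timezone

# Response-time targets copied from the Severity Classification table.
RESPONSE_TARGETS = {
    "Critical": timedelta(minutes=15),
    "High": timedelta(hours=1),
    "Medium": timedelta(hours=4),
    "Low": timedelta(hours=24),
}


def response_deadline(severity: str, detected_at: datetime) -> datetime:
    """Return the latest time by which the response must have started."""
    try:
        return detected_at + RESPONSE_TARGETS[severity]
    except KeyError:
        raise ValueError(f"unknown severity: {severity!r}") from None


detected = datetime(2026, 4, 1, 9, 0, tzinfo=timezone.utc)
print(response_deadline("High", detected).isoformat())
# 2026-04-01T10:00:00+00:00
```

Rejecting unknown severities outright, rather than defaulting to Low, keeps a typo from silently relaxing the deadline.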

Detection & Alerting

GCP Stack

OTEL SDKs → OTEL Agent → Victoria Metrics → Grafana → Slack/Telegram
Uptime Kuma → Slack (availability monitoring)
GCP Security Command Center → Real-time threat findings

AWS Stack

CloudFront Access Logs → CloudWatch Logs → CloudWatch Alarms → SNS → Slack/Email
AWS WAF Logs → CloudWatch → Alerts on rule matches, bot activity, blocked requests
AWS GuardDuty → Threat detection → SNS notifications

Alert Thresholds

Metric	Threshold
CPU	>80%
Memory	>85%
Disk	>90%
API Response Time	>5 seconds
Error Rate	>5%
Service Availability	<99%
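As an illustration of how the thresholds above combine, the sketch below evaluates one metrics sample against the table. It is an assumption-laden example: the metric key names (`cpu_pct`, `availability_pct`, and so on) are hypothetical, and in practice these rules would live in Grafana or CloudWatch alert definitions rather than in application code.

```python
import operator

# (comparison, limit) per metric, taken from the Alert Thresholds table.
# Key names are hypothetical. Values are percentages except
# api_response_time_s, which is in seconds.
THRESHOLDS = {
    "cpu_pct": (operator.gt, 80),
    "memory_pct": (operator.gt, 85),
    "disk_pct": (operator.gt, 90),
    "api_response_time_s": (operator.gt, 5),
    "error_rate_pct": (operator.gt, 5),
    "availability_pct": (operator.lt, 99),
}


def breached(sample: dict[str, float]) -> list[str]:
    """Return the sorted names of metrics in `sample` that cross their threshold."""
    return sorted(
        name
        for name, value in sample.items()
        if name in THRESHOLDS and THRESHOLDS[name][0](value, THRESHOLDS[name][1])
    )


print(breached({"cpu_pct": 91.0, "memory_pct": 60.0, "availability_pct": 98.5}))
# ['availability_pct', 'cpu_pct']
```

Note that availability is the only metric that alerts when the value falls below its limit; all others alert when the value exceeds it.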

Response Procedure

Step 1: Detect & Acknowledge (0–5 min)

Alert received via Slack/Telegram or reported to security@wealthy.in
First responder acknowledges in Slack thread

Step 2: Assess & Classify (5–15 min)

Determine severity using the Severity Classification table
Identify affected systems (GCP, AWS, or both)
Check if CERT-In reportable (data breach, malware, DDoS, defacement, phishing, identity theft)
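The three assessment outputs of Step 2 (severity, affected platforms, CERT-In reportability) can be captured in one small record. This is a sketch only: the `Triage` class and its field names are hypothetical, and the category set lists just the six categories named in this step, while the authoritative list is in SOP-002.

```python
from dataclasses import dataclass, field

# Categories named in Step 2. The authoritative list lives in SOP-002.
CERT_IN_REPORTABLE = frozenset(
    {"data breach", "malware", "ddos", "defacement", "phishing", "identity theft"}
)


@dataclass
class Triage:
    severity: str                                     # Critical / High / Medium / Low
    category: str                                     # e.g. "DDoS", "phishing"
    platforms: set[str] = field(default_factory=set)  # e.g. {"GCP", "AWS"}

    @property
    def cert_in_reportable(self) -> bool:
        """True if the category matches one of the reportable categories."""
        return self.category.strip().lower() in CERT_IN_REPORTABLE


t = Triage(severity="High", category="DDoS", platforms={"AWS"})
print(t.cert_in_reportable)
# True
```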

Step 3: Contain (15–60 min)

GCP: Isolate affected GKE pods/nodes, revoke IAM credentials, block IPs at Cloudflare
AWS: Update WAF rules, block IPs via WAF IP sets or whole countries via CloudFront geo-restriction, revoke compromised IAM keys, isolate affected resources
Common: Disable compromised accounts, revoke API keys/tokens, preserve logs for forensics
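For the WAF IP set option in the AWS containment step, one possible shape of the update is sketched below. It assumes the blocklist is an existing WAFv2 IP set referenced by a block rule; the function name `block_ip_in_waf` and the IP set name and ID passed to it are hypothetical. The `get_ip_set` and `update_ip_set` calls follow the boto3 WAFv2 client, and the client is passed in so the function can be exercised without AWS access.

```python
def block_ip_in_waf(client, name: str, ip_set_id: str, cidr: str,
                    scope: str = "CLOUDFRONT") -> list[str]:
    """Add `cidr` to a WAFv2 IP set and return the resulting address list.

    `client` is expected to behave like boto3.client("wafv2").
    `cidr` must be in CIDR notation, e.g. "203.0.113.7/32".
    """
    current = client.get_ip_set(Name=name, Scope=scope, Id=ip_set_id)
    addresses = list(current["IPSet"]["Addresses"])
    if cidr in addresses:
        return addresses  # already blocked, nothing to update

    addresses.append(cidr)
    # update_ip_set replaces the whole address list, and WAFv2 uses
    # optimistic locking: the update must carry the token from the read.
    client.update_ip_set(
        Name=name,
        Scope=scope,
        Id=ip_set_id,
        Addresses=addresses,
        LockToken=current["LockToken"],
    )
    return addresses
```

With real credentials the client would be created as `boto3.client("wafv2", region_name="us-east-1")`, since CloudFront-scoped WAF resources are managed through the us-east-1 endpoint. Record each blocked address in the incident thread so the change can be reviewed during recovery.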

Step 4: Escalate

L1: DevOps Engineers → initial triage and containment
L2: CTO → technical decisions (Critical/High or unresolved)
L3: CEO → business-critical, regulatory, or PR impact
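The escalation ladder can be read as a simple decision rule. The sketch below is one interpretation, with a hypothetical function name and flags; in particular, it assumes the L2 and L3 conditions are evaluated independently, which this SOP does not state explicitly.

```python
def escalation_levels(severity: str, unresolved: bool = False,
                      business_impact: bool = False) -> list[str]:
    """Return the escalation levels to engage for an incident.

    `business_impact` stands for business-critical, regulatory, or PR impact.
    """
    levels = ["L1: DevOps Engineers"]  # L1 always performs initial triage
    if severity in ("Critical", "High") or unresolved:
        levels.append("L2: CTO")
    if business_impact:
        levels.append("L3: CEO")
    return levels


print(escalation_levels("Medium", unresolved=True))
# ['L1: DevOps Engineers', 'L2: CTO']
```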

Step 5: Eradicate & Recover

Remove malware/malicious artifacts
Patch exploited vulnerabilities
Restore from clean backups if needed
GCP: Redeploy affected GKE workloads, rotate Cloud SQL credentials
AWS: Rotate IAM keys, update CloudFront/WAF configs, redeploy Lambda functions if compromised
Verify system integrity before returning to production

Step 6: Document

Update incident timeline in Slack channel
Preserve all logs, screenshots, evidence
Create Plane task for follow-up actions

Physical Incident Response

Scenario	Actions
Lost/Stolen Laptop (Mac / Windows)	Report to security@wealthy.in → Sign out Workspace sessions (admin.google.com → User → Sign out all sessions) → Revoke SSO / VPN / all app sessions → Rotate any stored credentials on that device → If MDM-enrolled in Fleet (DEP for Mac, Azure AD for Windows): trigger remote lock + remote wipe via Fleet UI → Hosts → select host → Wipe → Disable Wazuh agent enrollment → For non-MDM hosts (manual / BYOD): rely on FileVault / BitLocker encryption + key non-disclosure to protect data at rest → Guide user to trigger iCloud Find My Mac (if enabled) for additional remote lock/erase → Check access logs for post-incident activity → File police report if theft → Treat as data breach only if disk encryption was unconfirmed.
Lost/Stolen Mobile	Report to security@wealthy.in → Sign out Workspace sessions + revoke app sessions (Workspace Basic Endpoint Mgmt supports this for Android/iOS) → Reset auth credentials → Guide user to trigger Find My iPhone / Find My Device → If device contained KYC documents / customer PII beyond Workspace apps: treat as data breach.
Lost/Stolen Storage (USB / external drive)	Report to security@wealthy.in → Assess what was on it → If unencrypted sensitive data: treat as data breach (POL-008). Use of unencrypted external drives for Wealthy data is itself a policy violation under POL-009 Data Classification — investigate.

Notification SLAs — Every Clock in One Place

Multiple clocks run in parallel when a breach is detected. Each has a different owner and a different escalation path. Missing any of them is a regulatory / contractual finding.

Clock	Starts when	Owner	Deadline	Driver
Internal — security@wealthy.in	First-responder detects suspicious event	Anyone who spots it	Immediately — no delay	POL-022 §3, POL-008
Internal — CTO + Compliance notified	security@wealthy.in receives first alert	SRE on-call	Within 2 hours	POL-022 §3
Vendor-to-Wealthy (inbound)	Vendor detects a breach that affects Wealthy data	Vendor (per contract)	Within 2 hours of vendor’s detection	STD-015 §4.1, SOP-010 §3.4
Wealthy → CERT-In	Wealthy-side detection (direct or via vendor)	CTO + Compliance	Within 6 hours	CERT-In Directions 2022
Wealthy → DPDP Board	Personal-data breach affecting Data Principals confirmed	CTO + Privacy Officer	Without undue delay (Wealthy internal SLA: 24h)	DPDP Act 2023 §8(6)
Wealthy → affected Data Principals	Personal-data breach confirmed	CTO + CEO (public-facing)	Without undue delay (Wealthy internal SLA: 24h, coordinated with DPDP Board notification)	DPDP Act 2023 §8(6)
Wealthy → cyber-insurance carrier	Breach requiring forensics	Legal	Per policy (typically 72h)	Insurance contract
Wealthy → Insurance/Broking partners	Breach affecting a partner’s data	Ops Manager + PM	Per MSA (typically 24h)	Partner MSAs

The 2-hour vendor-to-Wealthy clock exists specifically so that Wealthy can still meet its 6-hour CERT-In clock when the detection originated at a vendor. Without that SLA in the vendor MSA, Wealthy’s own 6-hour window is at risk.
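Because each clock starts from a different event, it can help to compute all deadlines from the recorded event times. The following sketch covers only the clocks with a fixed window in the table above; the event keys (`first_alert`, `detected`, `breach_confirmed`) and the `deadlines` function are hypothetical names, and the contractual clocks (insurer, partners, vendor) are omitted because their windows vary per contract.

```python
from datetime import datetime, timedelta, timezone

# (clock, start event, window) from the Notification SLAs table.
CLOCKS = [
    ("CTO + Compliance notified", "first_alert", timedelta(hours=2)),
    ("CERT-In report", "detected", timedelta(hours=6)),
    ("DPDP Board (internal SLA)", "breach_confirmed", timedelta(hours=24)),
    ("Data Principals (internal SLA)", "breach_confirmed", timedelta(hours=24)),
]


def deadlines(events: dict[str, datetime]) -> dict[str, datetime]:
    """Map each clock to its deadline, skipping clocks whose start event is not recorded."""
    return {
        name: events[start] + window
        for name, start, window in CLOCKS
        if start in events
    }


detected = datetime(2026, 4, 1, 9, 0, tzinfo=timezone.utc)
for clock, due in deadlines(
    {"detected": detected, "first_alert": detected + timedelta(minutes=5)}
).items():
    print(f"{clock}: {due.isoformat()}")
# CTO + Compliance notified: 2026-04-01T11:05:00+00:00
# CERT-In report: 2026-04-01T15:00:00+00:00
```

The DPDP clocks appear only once a personal-data breach has been confirmed and `breach_confirmed` is recorded.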

CERT-In — the 6-hour clock in detail

Mandatory: report every CERT-In-reportable incident within 6 hours of detection. See CERT-In Compliance (SOP-002) for submission format and the list of reportable categories.

Hour 0:  Detect → Alert security@wealthy.in
Hour 1:  SRE on-call assesses severity + reportability (is it in scope for CERT-In?)
Hour 2:  CTO + Compliance notified; parallel: vendor MSA 2h clock (if vendor-originated)
Hour 4:  Compliance drafts CERT-In report; CTO reviews
Hour 6:  Submit CERT-In report at cert-in.org.in
Hour 6+: Follow-up updates per CERT-In's request cadence
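The hourly checkpoints above translate directly into wall-clock times once the detection time is known. This is a minimal sketch with hypothetical names (`CHECKPOINTS`, `schedule`, `time_remaining`); only the hour offsets and step descriptions come from the timeline.

```python
from datetime import datetime, timedelta, timezone

# Hour offsets and steps from the CERT-In timeline above.
CHECKPOINTS = [
    (0, "Detect and alert security@wealthy.in"),
    (1, "SRE on-call assesses severity and reportability"),
    (2, "CTO + Compliance notified"),
    (4, "Compliance drafts CERT-In report; CTO reviews"),
    (6, "Submit CERT-In report"),
]


def schedule(detected_at: datetime) -> list[tuple[datetime, str]]:
    """Return each checkpoint as (due time, step) for a given detection time."""
    return [(detected_at + timedelta(hours=h), step) for h, step in CHECKPOINTS]


def time_remaining(detected_at: datetime, now: datetime) -> timedelta:
    """Time left on the 6-hour clock; a negative value means the deadline has passed."""
    return detected_at + timedelta(hours=6) - now


detected = datetime(2026, 4, 1, 9, 0, tzinfo=timezone.utc)
print(time_remaining(detected, detected + timedelta(hours=4, minutes=30)))
# 1:30:00
```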

Notification Matrix (who, what, how)

Stakeholder	When	How	SLA (from “Notification SLAs” above)
security@wealthy.in	All incidents	Email	Immediately
DevOps / SRE on-call	All technical incidents	Slack (+ Telegram alerts from Wazuh / custom-ai for sev 10+)	Immediately
CTO	High/Critical, all data breaches	Slack + Phone	Within 2h
CEO	Business-critical / regulatory / public-impact	Direct call	Within 2h
Compliance	Regulatory-reportable	Email	Within 2h
Legal	Data breach / contractual implications	Email	Within 2h
CERT-In	Reportable cyber incidents	cert-in.org.in submission	Within 6h
DPDP Board	Personal-data breach affecting Data Principals	Email per DPDP rules	24h internal SLA
Affected Data Principals	Personal-data breach	Email + in-app notice	24h internal SLA
Insurance/Broking partners	Partner-data breach	Per MSA	24h (typical)
Cyber-insurance carrier	Breach requiring forensics	Per policy	72h (typical)

All incident information is confidential until the incident is resolved and a communication plan is approved by the CTO.

Post-Incident Review

Critical/High: RCA within 48 hours
Medium/Low: RCA within 1 week

The RCA must include: timeline, root cause, contributing factors, prevention measures with owners/deadlines, and lessons learned. Store it in the RCA repository.
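The RCA deadlines and required contents can be checked mechanically before an RCA is filed. The sketch below is illustrative: the section keys and function names are hypothetical, and it assumes the RCA window starts at incident resolution, which this SOP does not specify.

```python
from datetime import datetime, timedelta, timezone

# RCA windows from the Post-Incident Review section.
RCA_WINDOWS = {
    "Critical": timedelta(hours=48),
    "High": timedelta(hours=48),
    "Medium": timedelta(weeks=1),
    "Low": timedelta(weeks=1),
}

# Required RCA contents; key names are hypothetical.
REQUIRED_SECTIONS = (
    "timeline",
    "root_cause",
    "contributing_factors",
    "prevention_measures",  # each with an owner and a deadline
    "lessons_learned",
)


def rca_due(severity: str, resolved_at: datetime) -> datetime:
    """Return the RCA due time, assuming the window starts at resolution."""
    return resolved_at + RCA_WINDOWS[severity]


def missing_sections(rca: dict) -> list[str]:
    """Return required sections that are absent or empty in `rca`."""
    return [s for s in REQUIRED_SECTIONS if not rca.get(s)]


print(missing_sections({"timeline": "09:00 alert received", "root_cause": "leaked API key"}))
# ['contributing_factors', 'prevention_measures', 'lessons_learned']
```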

Existing Controls

Layer	Control
Edge	Cloudflare WAF, AWS CloudFront WAF (Bot Control + geo-blocking)
Network	Pritunl VPN, GCP firewall rules, AWS security groups
Auth	Google SSO + 2FA (internal), OTP + PIN/Biometric (customers/partners)
Monitoring	Grafana, OTEL, Uptime Kuma (GCP); CloudWatch, GuardDuty (AWS)
Logging	GCP Cloud Logging (GCP); CloudWatch Logs (AWS)
Endpoints	Device management with remote wipe

Contact: security@wealthy.in
Next Review: April 2027