Incident Response SOP
| Field | Value |
|---|---|
| Document ID | SOP-004 |
| Classification | Internal |
| Owner | SRE + CTO |
| Effective Date | April 2026 |
| Review Cycle | Semi-Annual |
This SOP defines the procedure for detecting, classifying, responding to, and recovering from security incidents across GCP, AWS, and endpoint infrastructure.
Scope
Applies to all Wealthy information assets:
- GCP: GKE clusters, Cloud SQL, Cloud Storage, Compute Engine, Cloud Logging
- AWS: CloudFront (WAF, Bot Control, country blocking), S3, CloudWatch, Lambda - used for insurance infrastructure
- Network: Cloudflare CDN/WAF, GCP Load Balancers, StarkNet Gateway, Pritunl VPN
- Applications: All production services, APIs, customer/partner-facing apps
- Endpoints: Employee laptops, mobile devices, portable storage
Incident Types
| Type | Examples |
|---|---|
| Electronic/Software | Unauthorized access, malware, compromised credentials, phishing, DDoS, unauthorized code changes |
| Infrastructure | GKE pod compromise, Cloud SQL breach, CloudFront/WAF bypass, API auth failures at scale, AWS resource misuse |
| Physical | Lost/stolen laptops, mobile devices, portable storage |
Severity Classification
| Severity | Response Time | Examples |
|---|---|---|
| Critical | 15 min | Active data breach, ransomware, full production outage |
| High | 1 hour | Suspected unauthorized access, DDoS, partial outage |
| Medium | 4 hours | Single account compromise, endpoint malware |
| Low | 24 hours | Suspicious email, minor policy violation |
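The response-time targets in the table can be turned into concrete deadlines. The following is a minimal sketch, not an existing Wealthy tool: the names `RESPONSE_TARGETS` and `response_deadline` are hypothetical, and it assumes the clock starts at the detection timestamp.

```python
from datetime import datetime, timedelta, timezone

# Response-time targets copied from the Severity Classification table.
RESPONSE_TARGETS = {
    "Critical": timedelta(minutes=15),
    "High": timedelta(hours=1),
    "Medium": timedelta(hours=4),
    "Low": timedelta(hours=24),
}


def response_deadline(severity: str, detected_at: datetime) -> datetime:
    """Return the latest time by which a responder should have engaged.

    Assumption: the clock starts when the incident is detected.
    """
    if severity not in RESPONSE_TARGETS:
        raise ValueError(f"unknown severity: {severity!r}")
    return detected_at + RESPONSE_TARGETS[severity]


# Illustrative timestamp only.
detected = datetime(2026, 4, 1, 9, 0, tzinfo=timezone.utc)
print(response_deadline("Critical", detected).isoformat())
```

Using timezone-aware timestamps avoids ambiguity when responders are in different time zones.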
Detection & Alerting
GCP Stack
OTEL SDKs → OTEL Agent → Victoria Metrics → Grafana → Slack/Telegram
Uptime Kuma → Slack (availability monitoring)
GCP Security Command Center → Real-time threat findings
AWS Stack
CloudFront Access Logs → CloudWatch Logs → CloudWatch Alarms → SNS → Slack/Email
AWS WAF Logs → CloudWatch → Alerts on rule matches, bot activity, blocked requests
AWS GuardDuty → Threat detection → SNS notifications
Alert Thresholds
| Metric | Threshold |
|---|---|
| CPU | >80% |
| Memory | >85% |
| Disk | >90% |
| API Response Time | >5 seconds |
| Error Rate | >5% |
| Service Availability | <99% |
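As an illustration of how these thresholds might be evaluated against a metrics sample, here is a small sketch. The metric keys (`cpu_pct`, `availability_pct`, and so on) and the `breached` helper are hypothetical names chosen for this example; the actual alert rules live in Grafana and CloudWatch and are not reproduced here.

```python
import operator

# Thresholds from the Alert Thresholds table.
# Percentages are plain numbers (80 means 80%); response time is in seconds.
# Key names are illustrative, not the real metric names.
THRESHOLDS = {
    "cpu_pct": (operator.gt, 80),
    "memory_pct": (operator.gt, 85),
    "disk_pct": (operator.gt, 90),
    "api_response_s": (operator.gt, 5),
    "error_rate_pct": (operator.gt, 5),
    "availability_pct": (operator.lt, 99),
}


def breached(sample: dict[str, float]) -> list[str]:
    """Return the names of metrics in `sample` that cross their threshold."""
    hits = []
    for name, value in sample.items():
        if name not in THRESHOLDS:
            continue  # ignore metrics this table does not cover
        compare, limit = THRESHOLDS[name]
        if compare(value, limit):
            hits.append(name)
    return hits


print(breached({"cpu_pct": 81, "memory_pct": 85, "availability_pct": 98.5}))
```

Note that every threshold is a strict comparison, so a memory reading of exactly 85% does not alert, while availability alerts when it drops below 99%.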
Response Procedure
Step 1: Detect & Acknowledge (0–5 min)
- Alert received via Slack/Telegram or reported to security@wealthy.in
- First responder acknowledges in Slack thread
Step 2: Assess & Classify (5–15 min)
- Determine severity using classification matrix
- Identify affected systems (GCP, AWS, or both)
- Check if CERT-In reportable (data breach, malware, DDoS, defacement, phishing, identity theft)
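The reportability check in the last bullet can be expressed as a simple lookup. This sketch uses only the categories named in that bullet; the authoritative list is in CERT-In Compliance (SOP-002), and the tag strings and the `is_cert_in_reportable` function are hypothetical.

```python
# Categories named in Step 2 of this SOP. SOP-002 holds the authoritative list,
# so treat this set as illustrative and incomplete.
CERT_IN_REPORTABLE = frozenset(
    {"data breach", "malware", "ddos", "defacement", "phishing", "identity theft"}
)


def is_cert_in_reportable(incident_tags: list[str]) -> bool:
    """Return True if any tag matches a reportable category (case-insensitive)."""
    return any(tag.strip().lower() in CERT_IN_REPORTABLE for tag in incident_tags)


print(is_cert_in_reportable(["DDoS", "partial outage"]))
print(is_cert_in_reportable(["minor policy violation"]))
```

When in doubt, a responder should treat the incident as reportable and confirm against SOP-002, since the 6-hour clock is already running.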
Step 3: Contain (15–60 min)
- GCP: Isolate affected GKE pods/nodes, revoke IAM credentials, block IPs at Cloudflare
- AWS: Update WAF rules, block IPs via CloudFront geo-restriction or WAF IP sets, revoke compromised IAM keys, isolate affected resources
- Common: Disable compromised accounts, revoke API keys/tokens, preserve logs for forensics
Step 4: Escalate
L1: DevOps Engineers (initial triage and containment)
L2: CTO (technical decisions; Critical/High or unresolved)
L3: CEO (business-critical, regulatory, or PR impact)
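The escalation ladder can be sketched as a small decision function. This is one possible reading, not the definitive rule: it assumes L1 is always engaged, and that the L2 and L3 conditions are evaluated independently of each other. The function and parameter names are hypothetical.

```python
def escalation_levels(
    severity: str,
    unresolved: bool = False,
    business_regulatory_or_pr_impact: bool = False,
) -> list[str]:
    """Return the escalation levels to engage for an incident.

    Assumptions (not stated in the SOP): L1 is always engaged, and the
    L2 and L3 conditions are checked independently.
    """
    levels = ["L1: DevOps Engineers"]
    if severity in {"Critical", "High"} or unresolved:
        levels.append("L2: CTO")
    if business_regulatory_or_pr_impact:
        levels.append("L3: CEO")
    return levels


print(escalation_levels("High"))
print(escalation_levels("Medium", unresolved=True, business_regulatory_or_pr_impact=True))
```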
Step 5: Eradicate & Recover
- Remove malware/malicious artifacts
- Patch exploited vulnerabilities
- Restore from clean backups if needed
- GCP: Redeploy affected GKE workloads, rotate Cloud SQL credentials
- AWS: Rotate IAM keys, update CloudFront/WAF configs, redeploy Lambda functions if compromised
- Verify system integrity before returning to production
Step 6: Document
- Update incident timeline in Slack channel
- Preserve all logs, screenshots, evidence
- Create Plane task for follow-up actions
Physical Incident Response
| Scenario | Actions |
|---|---|
| Lost/Stolen Laptop (Mac / Windows) | Report to security@wealthy.in → Sign out Workspace sessions (admin.google.com → User → Sign out all sessions) → Revoke SSO / VPN / all app sessions → Rotate any stored credentials on that device → If MDM-enrolled in Fleet (DEP for Mac, Azure AD for Windows): trigger remote lock + remote wipe via Fleet UI → Hosts → select host → Wipe → Disable Wazuh agent enrollment → For non-MDM hosts (manual / BYOD): rely on FileVault / BitLocker encryption + key non-disclosure to protect data at rest → Guide user to trigger iCloud Find My Mac (if enabled) for additional remote lock/erase → Check access logs for post-incident activity → File police report if theft → Treat as data breach only if disk encryption was unconfirmed. |
| Lost/Stolen Mobile | Report to security@wealthy.in → Sign out Workspace sessions + revoke app sessions (Workspace Basic Endpoint Mgmt supports this for Android/iOS) → Reset auth credentials → Guide user to trigger Find My iPhone / Find My Device → If device contained KYC documents / customer PII beyond Workspace apps: treat as data breach. |
| Lost/Stolen Storage (USB / external drive) | Report to security@wealthy.in → Assess what was on it → If unencrypted sensitive data: treat as data breach (POL-008). Use of unencrypted external drives for Wealthy data is itself a policy violation under POL-009 Data Classification; investigate. |
Notification SLAs: Every Clock in One Place
Multiple clocks run in parallel when a breach is detected. Each has a different owner and a different escalation path. Missing any of them is a regulatory / contractual finding.
| Clock | Starts when | Owner | Deadline | Driver |
|---|---|---|---|---|
| Internal → security@wealthy.in | First-responder detects suspicious event | Anyone who spots it | Immediately, no delay | POL-022 §3, POL-008 |
| Internal → CTO + Compliance notified | security@wealthy.in receives first alert | SRE on-call | Within 2 hours | POL-022 §3 |
| Vendor-to-Wealthy (inbound) | Vendor detects a breach that affects Wealthy data | Vendor (per contract) | Within 2 hours of vendor’s detection | STD-015 §4.1, SOP-010 §3.4 |
| Wealthy → CERT-In | Wealthy-side detection (direct or via vendor) | CTO + Compliance | Within 6 hours | CERT-In Directions 2022 |
| Wealthy → DPDP Board | Personal-data breach affecting Data Principals confirmed | CTO + Privacy Officer | Without undue delay (Wealthy internal SLA: 24h) | DPDP Act 2023 §8(6) |
| Wealthy → affected Data Principals | Personal-data breach confirmed | CTO + CEO (public-facing) | Without undue delay (Wealthy internal SLA: 24h, coordinated with DPDP Board notification) | DPDP Act 2023 §8(6) |
| Wealthy → cyber-insurance carrier | Breach requiring forensics | Legal | Per policy (typically 72h) | Insurance contract |
| Wealthy → Insurance/Broking partners | Breach affecting a partner’s data | Ops Manager + PM | Per MSA (typically 24h) | Partner MSAs |
The 2-hour vendor-to-Wealthy clock exists specifically so that Wealthy can still meet its 6-hour CERT-In clock when the detection originated at a vendor. Without that SLA in the vendor MSA, Wealthy’s own 6-hour window is at risk.
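The arithmetic behind the vendor clock can be made explicit. The sketch below takes the conservative reading that the 6-hour CERT-In window is counted from the vendor's detection time; that reading is an assumption made for illustration, and the function name `cert_in_budget` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

CERT_IN_WINDOW = timedelta(hours=6)      # CERT-In Directions 2022
VENDOR_NOTICE_SLA = timedelta(hours=2)   # vendor-to-Wealthy clock in the MSA


def cert_in_budget(vendor_detected_at: datetime, wealthy_notified_at: datetime) -> timedelta:
    """Time Wealthy has left to file with CERT-In once the vendor has notified it.

    Assumption: the 6-hour window is counted from the vendor's detection,
    which is the most conservative interpretation.
    """
    return (vendor_detected_at + CERT_IN_WINDOW) - wealthy_notified_at


# Illustrative timestamps only.
vendor_detected = datetime(2026, 4, 1, 9, 0, tzinfo=timezone.utc)

# Vendor meets its 2-hour SLA: 4 hours remain for assessment, drafting, and filing.
print(cert_in_budget(vendor_detected, vendor_detected + VENDOR_NOTICE_SLA))

# Vendor takes 5 hours: only 1 hour remains.
print(cert_in_budget(vendor_detected, vendor_detected + timedelta(hours=5)))
```

A zero or negative result means the window was already exhausted before Wealthy was told, which is the situation the vendor SLA is meant to prevent.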
CERT-In: the 6-hour clock in detail
Mandatory: report every CERT-In-reportable incident within 6 hours of detection. See CERT-In Compliance (SOP-002) for submission format and the list of reportable categories.
Hour 0: Detect → Alert security@wealthy.in
Hour 1: SRE on-call assesses severity + reportability (is it in scope for CERT-In?)
Hour 2: CTO + Compliance notified; parallel: vendor MSA 2h clock (if vendor-originated)
Hour 4: Compliance drafts CERT-In report; CTO reviews
Hour 6: Submit CERT-In report at cert-in.org.in
Hour 6+: Follow-up updates per CERT-In's request cadence
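The hour-by-hour timeline above can be projected onto wall-clock times for a specific incident. This is a rough sketch with hypothetical names (`MILESTONES`, `schedule`, `next_milestone`); the hour offsets and labels are taken from the timeline, and the open-ended "Hour 6+" follow-up step is left out because it has no fixed deadline.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# (hours after detection, milestone) from the CERT-In timeline.
MILESTONES = [
    (0, "Detect and alert security@wealthy.in"),
    (1, "SRE on-call assesses severity and reportability"),
    (2, "CTO + Compliance notified"),
    (4, "Compliance drafts CERT-In report; CTO reviews"),
    (6, "Submit CERT-In report"),
]


def schedule(detected_at: datetime) -> list[tuple[datetime, str]]:
    """Return each milestone with its absolute due time."""
    return [(detected_at + timedelta(hours=h), label) for h, label in MILESTONES]


def next_milestone(detected_at: datetime, now: datetime) -> Optional[tuple[datetime, str]]:
    """Return the first milestone that is not yet past due, or None if all are."""
    for due, label in schedule(detected_at):
        if due >= now:
            return due, label
    return None


# Illustrative timestamps only.
detected = datetime(2026, 4, 1, 9, 0, tzinfo=timezone.utc)
for due, label in schedule(detected):
    print(due.strftime("%H:%M"), label)
```

A `None` result from `next_milestone` means the 6-hour submission deadline has already passed and the incident should be escalated immediately.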
Notification Matrix (who, what, how)
| Stakeholder | When | How | SLA (from “Notification SLAs” above) |
|---|---|---|---|
| security@wealthy.in | All incidents | Email | Immediately |
| DevOps / SRE on-call | All technical incidents | Slack (+ Telegram alerts from Wazuh / custom-ai for sev 10+) | Immediately |
| CTO | High/Critical, all data breaches | Slack + Phone | Within 2h |
| CEO | Business-critical / regulatory / public-impact | Direct call | Within 2h |
| Compliance | Regulatory-reportable | | Within 2h |
| Legal | Data breach / contractual implications | | Within 2h |
| CERT-In | Reportable cyber incidents | cert-in.org.in submission | Within 6h |
| DPDP Board | Personal-data breach affecting Data Principals | Email per DPDP rules | 24h internal SLA |
| Affected Data Principals | Personal-data breach | Email + in-app notice | 24h internal SLA |
| Insurance/Broking partners | Partner-data breach | Per MSA | 24h (typical) |
| Cyber-insurance carrier | Breach requiring forensics | Per policy | 72h (typical) |
All incident information is confidential until resolved and a communication plan is approved by CTO.
Post-Incident Review
- Critical/High: RCA within 48 hours
- Medium/Low: RCA within 1 week
RCA must include: timeline, root cause, contributing factors, prevention measures with owners/deadlines, and lessons learned. Store in RCA repository.
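A completeness check for an RCA draft might look like the sketch below. The dictionary layout, field names, and the `rca_gaps` helper are assumptions made for illustration; the RCA repository's real format is not specified in this SOP.

```python
# Sections required by this SOP. Key names are illustrative.
REQUIRED_SECTIONS = (
    "timeline",
    "root_cause",
    "contributing_factors",
    "prevention_measures",
    "lessons_learned",
)


def rca_gaps(rca: dict) -> list[str]:
    """List what is missing from an RCA draft before it can be filed."""
    gaps = [f"missing section: {name}" for name in REQUIRED_SECTIONS if not rca.get(name)]
    # Each prevention measure needs both an owner and a deadline.
    for i, measure in enumerate(rca.get("prevention_measures") or [], start=1):
        for field in ("owner", "deadline"):
            if not measure.get(field):
                gaps.append(f"prevention measure {i} has no {field}")
    return gaps


draft = {
    "timeline": "09:00 alert raised; 09:10 acknowledged",
    "root_cause": "Leaked API key",
    "contributing_factors": ["Key stored in a shared document"],
    "prevention_measures": [{"action": "Rotate keys quarterly", "owner": "SRE"}],
    "lessons_learned": "",
}
print(rca_gaps(draft))
```

An empty list means the draft covers every required section and every prevention measure has an owner and a deadline.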
Existing Controls
| Layer | Control |
|---|---|
| Edge | Cloudflare WAF, AWS CloudFront WAF (Bot Control + geo-blocking) |
| Network | Pritunl VPN, GCP firewall rules, AWS security groups |
| Auth | Google SSO + 2FA (internal), OTP + PIN/Biometric (customers/partners) |
| Monitoring | Grafana, OTEL, Uptime Kuma (GCP); CloudWatch, GuardDuty (AWS) |
| Logging | GCP Cloud Logging (GCP); CloudWatch Logs (AWS) |
| Endpoints | Device management with remote wipe |
Related Documents
- Incident Management
- CERT-In Compliance (SOP-002)
- Data Breach Response Policy (POL-008)
- Vulnerability Management (SOP-009)
Contact: security@wealthy.in
Next Review: April 2027