# Soteria Supply Chain Defense Skills
## Derived from Vercel Breach (April 19, 2026) + OSINT Intelligence Tools

These skills were created in direct response to the Vercel breach and to the analysis of the security intelligence tools in `ImagesToAnalyze`.

---

## 5. Environment Security Auditor (`audit_env_security`)
**Objective**: Ensure NO sensitive environment variables are exposed, stored in plaintext, or accessible without encryption.
**Origin**: Direct response to Vercel breach where non-encrypted env vars were exfiltrated.

### Technical Logic:
- **Repository Scan**: Search for `.env`, `.env.local`, `.env.production` files in version control history (not just current state).
- **Hardcoded Secret Detection**: Regex patterns for API keys, database URIs, OAuth tokens, JWT secrets in source code.
- **Platform Config Audit**:
  - Vercel: Verify all secrets use the "Sensitive" flag (encrypted at rest and not readable after creation).
  - Cloudflare: Verify secrets are set with `wrangler secret put` rather than as plaintext `[vars]` in `wrangler.toml`.
  - Netlify: Verify environment variable encryption settings.
- **Runtime Secret Verification**: Confirm secrets are injected at runtime, not baked into build artifacts.
- **Rotation Policy Check**: Verify that a secret rotation policy exists and is enforced.
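
The hardcoded secret detection step could be sketched as follows. This is a minimal illustration, not the Soteria implementation: the `SECRET_PATTERNS` table and the `scan_text` helper are hypothetical names, and the regexes are deliberately small examples rather than a complete rule set.

```python
import re

# Illustrative patterns only (assumption): dedicated secret scanners ship
# far larger, tuned rule sets than these four.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "database_uri": re.compile(
        r"\b(?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?)://[^\s:/@]+:[^\s@]+@[^\s/]+"
    ),
    "jwt": re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
    "generic_assignment": re.compile(
        r"\b(?:api[_-]?key|secret|token|password)\b\s*[:=]\s*['\"]([^'\"]{12,})['\"]",
        re.IGNORECASE,
    ),
}


def scan_text(text: str) -> list:
    """Return (line_number, pattern_name) for every line matching a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

To cover version control history rather than only the current tree, the same function could be fed the output of `git log -p --all`, at the cost of more false positives from deleted test fixtures.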

### Checks Matrix:
| Check | Prudent | Reckless |
|-------|---------|----------|
| `.env` in `.gitignore` | ✅ Present | ❌ Missing |
| Secrets in source code | ✅ Zero matches | ❌ Any match |
| Platform secrets encrypted | ✅ All "sensitive" | ❌ Any plaintext |
| `.env` publicly accessible | ✅ 404 response | ❌ Any content returned |
| Secret rotation < 90 days | ✅ Policy enforced | ❌ No policy |
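
The matrix above can be expressed as a small evaluator. The sketch below assumes the raw observations (scan counts, HTTP probe status, rotation age) have already been collected by other steps; `EnvAuditObservations`, `evaluate_checks`, `verdict`, and the accepted `.gitignore` entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: these literal .gitignore entries count as covering .env files.
ENV_IGNORE_PATTERNS = {".env", ".env*", ".env.*", "/.env", "*.env"}


@dataclass
class EnvAuditObservations:
    gitignore_lines: list               # raw lines of the repository's .gitignore
    secret_matches: int                 # hits from the hardcoded-secret scan
    plaintext_platform_secrets: int     # platform secrets not marked sensitive
    env_probe_status: int               # HTTP status returned for GET /.env
    rotation_max_age_days: Optional[int]  # None when no rotation policy exists


def evaluate_checks(obs: EnvAuditObservations) -> dict:
    """Map each matrix row to True (Prudent) or False (Reckless)."""
    ignored = any(line.strip() in ENV_IGNORE_PATTERNS for line in obs.gitignore_lines)
    return {
        ".env in .gitignore": ignored,
        "Secrets in source code": obs.secret_matches == 0,
        "Platform secrets encrypted": obs.plaintext_platform_secrets == 0,
        ".env publicly accessible": obs.env_probe_status == 404,
        "Secret rotation < 90 days": (
            obs.rotation_max_age_days is not None and obs.rotation_max_age_days < 90
        ),
    }


def verdict(results: dict) -> str:
    """A single failing row makes the overall result Reckless."""
    return "Prudent" if all(results.values()) else "Reckless"
```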

**Validation**: Cross-reference with Vercel incident timeline and OWASP Secrets Management Cheat Sheet.

---

## 6. OAuth Surface Auditor (`audit_oauth_surface`)
**Objective**: Map ALL active OAuth integrations and evaluate their risk profile.
**Origin**: Vercel was breached through a compromised OAuth app (Context.ai).

### Technical Logic:
- **OAuth App Discovery**: Enumerate all connected apps in Google Workspace, GitHub, GitLab, Slack.
- **Scope Analysis**: For each app, verify that permissions follow the principle of least privilege.
  - 🔴 CRITICAL: Apps with `drive.readonly`, `gmail.compose`, or `admin.directory` scopes.
  - 🟠 HIGH: AI tools with any workspace access (code assistants, summarizers).
  - 🟢 LOW: Apps with only `userinfo.email` scope.
- **Shadow IT Detection**: Identify apps connected by individual employees without admin consent.
- **Token Usage Monitoring**: Flag tokens used from unusual geolocations or accessing bulk data.
- **AI Tool Inventory**: Specific focus on AI-powered integrations (Context.ai, Cursor, Codeium, Copilot) and their access levels.
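
The scope analysis tiers could be applied roughly as in the sketch below. It assumes scopes arrive as Google-style URLs ending in `/auth/<name>` and that the caller already knows whether an app is an AI tool; `classify_app`, `short_scope`, and the fallback `REVIEW` tier for apps the tiers above do not cover are hypothetical.

```python
# Scope name prefixes listed as CRITICAL in the scope analysis tiers.
CRITICAL_PREFIXES = ("drive.readonly", "gmail.compose", "admin.directory")
# Scopes that, when they are the only ones granted, make an app LOW risk.
LOW_ONLY_SCOPES = {"userinfo.email"}


def short_scope(scope: str) -> str:
    """Strip a Google-style URL prefix, leaving the bare scope name."""
    return scope.rsplit("/auth/", 1)[-1]


def classify_app(scopes: list, is_ai_tool: bool) -> str:
    """Return CRITICAL, HIGH, LOW, or REVIEW for one connected OAuth app."""
    names = {short_scope(s) for s in scopes}
    if any(name.startswith(CRITICAL_PREFIXES) for name in names):
        return "CRITICAL"
    if names and names <= LOW_ONLY_SCOPES:
        return "LOW"
    if is_ai_tool and names:
        return "HIGH"
    # Assumption: anything not covered by the tiers is queued for manual review.
    return "REVIEW"
```

Checking the LOW tier before the AI-tool tier is a design choice made here: an AI tool that only reads the user's email address is treated as having no meaningful workspace access.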

### Risk Classification:
```
CRITICAL → AI tools with Google Workspace OAuth (Vercel-type vector)
HIGH     → Third-party CI/CD tools with code repository access  
MEDIUM   → Analytics/monitoring tools with API access
LOW      → Verified apps with minimal scopes
```

**Validation**: Compare against NIST SP 800-207 Zero Trust principles and recent OAuth exploitation techniques.

---

## 7. Tech Stack Reconnaissance (`recon_tech_stack`)
**Objective**: Map the complete technology surface of a target domain to identify supply chain risks.
**Origin**: BuiltWith/Wappalyzer tool analysis from `ImagesToAnalyze`.

### Technical Logic:
- **Technology Detection**: Using BuiltWith or Wappalyzer, enumerate:
  - Hosting provider (check if provider had recent breaches like Vercel)
  - CMS/Framework (cross-reference with active CVE databases)
  - Analytics/Trackers (evaluate data exposure to third parties)
  - Third-party JavaScript (assess supply chain injection risk)
  - CDN provider and configuration
- **Dependency Graph**: Map external service dependencies → single points of failure.
- **Version Fingerprinting**: Detect outdated libraries/frameworks with known vulnerabilities.
- **Supply Chain Risk Score**: Calculate aggregate risk based on number of third-party dependencies.
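
One way to turn the detected technologies into an aggregate score is a weighted count per category. The weights, the doubling for outdated components, and the cap at 100 in this sketch are illustrative assumptions, not values defined by Soteria; `supply_chain_risk_score` is a hypothetical helper.

```python
# Hypothetical per-category weights; unknown categories default to 1.
CATEGORY_WEIGHTS = {
    "hosting": 5,
    "cms_framework": 4,
    "third_party_js": 3,
    "analytics": 2,
    "cdn": 2,
}


def supply_chain_risk_score(detected: list, outdated: frozenset = frozenset()) -> int:
    """Sum category weights over (name, category) pairs, capped at 100.

    Components named in `outdated` (for example, flagged by version
    fingerprinting) count double.
    """
    score = 0
    for name, category in detected:
        weight = CATEGORY_WEIGHTS.get(category, 1)
        if name in outdated:
            weight *= 2
        score += weight
    return min(score, 100)
```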

### Integration Points:
- **BuiltWith** (free): Domain-level technology detection
- **Wappalyzer** (freemium): Real-time browser-based detection
- **Shodan**: Infrastructure-level exposure detection

**Validation**: Cross-reference detected technologies with NVD (National Vulnerability Database) and vendor security advisories.

---

## 8. Attack Pattern Analyzer (`monitor_attack_patterns`)
**Objective**: Analyze WAF/CDN logs to detect reconnaissance and exploit attempts.
**Origin**: Cloudflare Security Events analysis from aisolutionshub.org screenshots (April 16-17, 2026).

### Technical Logic:
- **WordPress Probe Detection**: Alert on access attempts to:
  - `/wp-admin/`, `/wp-includes/`, `/wp-content/`, `/wp-login.php`
  - WordPress-specific PHP files when target is NOT WordPress
- **Secrets Hunting Detection**: CRITICAL alert on:
  - `/.env`, `/.git/`, `/.gitignore`, `/config.php`
  - `/phpunit/`, `/vendor/phpunit/`
  - `/.well-known/pki-validation/`
- **PHP Exploit Scanning**: Alert on access to suspect PHP files:
  - `/a5.php`, `/ws55.php`, `/term.php`, `/ioxi-o.php`
  - `/admin.php`, `/autoload_classmap/function.php`
- **False User Agent Detection**: Flag requests where:
  - IP belongs to cloud provider (Azure, AWS, GCP)
  - User agent claims to be a browser but the request uses HTTP/1.1 (modern browsers normally negotiate HTTP/2 or HTTP/3 over TLS)
  - Verified Bot Category = "None"
- **Burst Pattern Detection**: Treat requests for multiple distinct paths from the same IP within 2 seconds as automated scanning.
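
The path-based rules and the burst rule above can be sketched as two small functions. The event format `(timestamp_seconds, ip, path)`, the function names, and the threshold of three distinct paths are assumptions for illustration; real WAF/CDN log schemas will differ.

```python
from collections import defaultdict

WORDPRESS_PREFIXES = ("/wp-admin/", "/wp-includes/", "/wp-content/", "/wp-login.php")
SECRET_PREFIXES = (
    "/.env", "/.git/", "/.gitignore", "/config.php",
    "/phpunit/", "/vendor/phpunit/", "/.well-known/pki-validation/",
)
SUSPECT_PHP = {
    "/a5.php", "/ws55.php", "/term.php", "/ioxi-o.php",
    "/admin.php", "/autoload_classmap/function.php",
}


def classify_path(path: str):
    """Return the matching pattern name, or None for unremarkable paths."""
    if path.startswith(SECRET_PREFIXES):
        return "secrets_hunting"
    if path.startswith(WORDPRESS_PREFIXES):
        return "wordpress_probe"
    if path in SUSPECT_PHP:
        return "php_exploit_scan"
    return None


def find_bursts(events, window_seconds: float = 2.0, min_paths: int = 3) -> set:
    """Return IPs that requested >= min_paths distinct paths in < window_seconds."""
    by_ip = defaultdict(list)
    for ts, ip, path in events:
        by_ip[ip].append((ts, path))
    flagged = set()
    for ip, hits in by_ip.items():
        hits.sort()
        start = 0
        for end in range(len(hits)):
            # Shrink the window until it spans strictly less than window_seconds.
            while hits[end][0] - hits[start][0] >= window_seconds:
                start += 1
            if len({p for _, p in hits[start:end + 1]}) >= min_paths:
                flagged.add(ip)
                break
    return flagged
```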

### Severity Classification (from real aisolutionshub.org data):
| Pattern | Example Source | Origin | Severity |
|---------|----------------|--------|----------|
| `.env` access | 140.245.100.195 | Unknown | 🔴 CRITICAL |
| PHP exploit scan | 20.219.138.200 | India (Azure) | 🟠 HIGH |
| WordPress probe | 20.194.110.188 | Korea (Azure) | 🟠 HIGH |
| PKI validation | 20.219.138.200 | India (Azure) | 🟡 MEDIUM |
| SEO bot (verified) | AhrefsBot (OVH) | Canada | 🟢 LOW |

**Validation**: Compare patterns with MITRE ATT&CK framework (TA0043: Reconnaissance).

---

## 9. Digital Evidence Preserver (`evidence_preservation`)
**Objective**: Create immutable, timestamped copies of web evidence for forensic and legal purposes.
**Origin**: Archive.ph / PageFreezer tool analysis from `ImagesToAnalyze`.

### Technical Logic:
- **Snapshot Trigger**: Automatically archive target pages when:
  - Anomalous traffic spike detected
  - Defacement suspected (content hash mismatch)
  - Breach disclosure published
- **Chain of Custody**: Generate a SHA-256 hash of the captured content and record it together with a UTC capture timestamp.
- **Multi-Source Archival**: Use Archive.ph + Wayback Machine + local snapshot.
- **EXIF/Metadata Stripping**: Before sharing screenshots, strip GPS and device metadata (verify with Jimpl or ExifTool).
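
The chain of custody step might look like the sketch below, which hashes the captured bytes and stores the digest alongside a UTC timestamp. The record layout and the `custody_record` / `verify_record` names are hypothetical. A locally generated timestamp is only as trustworthy as the machine that produced it, so the external archives listed above remain the independent witness.

```python
import hashlib
from datetime import datetime, timezone


def custody_record(url: str, content: bytes, captured_at: datetime = None) -> dict:
    """Build a custody entry: SHA-256 of the captured bytes plus a UTC timestamp."""
    if captured_at is None:
        captured_at = datetime.now(timezone.utc)
    return {
        "url": url,
        "sha256": hashlib.sha256(content).hexdigest(),
        "captured_at": captured_at.isoformat(),
        "size_bytes": len(content),
    }


def verify_record(record: dict, content: bytes) -> bool:
    """Re-hash the content and compare against the stored digest."""
    return hashlib.sha256(content).hexdigest() == record["sha256"]
```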

**Validation**: Ensure archived evidence meets digital forensics standards (RFC 3227).

---

## 10. Metadata Forensics Scanner (`forensic_metadata_scan`)
**Objective**: Detect information leakage through image and document metadata.
**Origin**: Jimpl/ExifTool tool analysis from `ImagesToAnalyze`.

### Technical Logic:
- **EXIF Extraction**: For all public-facing images on target domain:
  - GPS coordinates → Does the image reveal physical locations?
  - Camera/device model → Does it reveal employee devices?
  - Software used → Does it reveal internal tools (Photoshop version, OS)?
  - Creation timestamp → Does it reveal timezone/working hours?
- **Document Metadata**: For PDFs, DOCXs, XLSXs:
  - Author field → Does it reveal employee names?
  - Company field → Does it reveal internal org structure?
  - Revision history → Does it reveal draft/sensitive versions?
- **Social Media Audit**: Check if company social media posts contain unstripped metadata.
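
For the document metadata checks, DOCX and XLSX files are ZIP archives whose properties live in `docProps/core.xml` and `docProps/app.xml`, so the author and company fields can be read with the standard library alone. The `ooxml_metadata` helper below is a hypothetical sketch; it does not cover PDFs or image EXIF, which need a tool such as ExifTool.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by OOXML document property parts.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}


def ooxml_metadata(data: bytes) -> dict:
    """Extract author, last editor, and company from a DOCX/XLSX byte string.

    Note: for untrusted files, a hardened XML parser is advisable;
    ElementTree is used here only to keep the sketch dependency-free.
    """
    found = {}
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = set(zf.namelist())
        if "docProps/core.xml" in names:
            core = ET.fromstring(zf.read("docProps/core.xml"))
            found["author"] = core.findtext("dc:creator", namespaces=NS)
            found["last_modified_by"] = core.findtext("cp:lastModifiedBy", namespaces=NS)
        if "docProps/app.xml" in names:
            app = ET.fromstring(zf.read("docProps/app.xml"))
            found["company"] = app.findtext("ep:Company", namespaces=NS)
    # Drop fields that are absent or empty.
    return {key: value for key, value in found.items() if value}
```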

### Tools:
- **Jimpl** (free): Web-based EXIF reader with GPS mapping
- **ExifTool** (free, open source): CLI-based comprehensive metadata tool

**Validation**: Cross-reference with OPSEC best practices and NIST SP 800-88 (Media Sanitization).

---

## Integration with Existing Skills

These new skills extend the original Soteria pipeline:

```
Original Pipeline:
  ai_fingerprinting → reckless_action_scanner → cyber_defense_validator → intelligence_sync

Enhanced Pipeline (Post-Vercel):
  recon_tech_stack → audit_env_security → audit_oauth_surface → ai_fingerprinting → 
  reckless_action_scanner → cyber_defense_validator → monitor_attack_patterns → 
  forensic_metadata_scan → evidence_preservation → intelligence_sync
```

The enhanced pipeline adds **pre-audit reconnaissance** (skill 7), **supply chain defense** (skills 5-6), and **post-audit monitoring** (skills 8-10).
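
The enhanced ordering can be sketched as a simple sequential runner. This assumes each skill is a callable taking the target and the findings collected so far; that interface, `run_pipeline`, and the `skipped` status are hypothetical, and only the stage names come from the pipeline above.

```python
from typing import Callable, Dict

# Stage order copied from the enhanced pipeline.
ENHANCED_PIPELINE = (
    "recon_tech_stack",
    "audit_env_security",
    "audit_oauth_surface",
    "ai_fingerprinting",
    "reckless_action_scanner",
    "cyber_defense_validator",
    "monitor_attack_patterns",
    "forensic_metadata_scan",
    "evidence_preservation",
    "intelligence_sync",
)


def run_pipeline(target: str, skills: Dict[str, Callable[[str, dict], dict]]) -> dict:
    """Run registered skills in pipeline order; unregistered stages are skipped."""
    findings = {}
    for name in ENHANCED_PIPELINE:
        skill = skills.get(name)
        if skill is None:
            findings[name] = {"status": "skipped"}
            continue
        # Each skill sees the findings of every earlier stage.
        findings[name] = skill(target, findings)
    return findings
```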
