Claude Code as a Security Control in the DevOps Workflow
DevOps runbook for automated security reviews with Claude Code: installation, custom commands, audit scripts, and CI integration.
Modern self-hosted app stacks consist of many components: Supabase platform, Next.js app layer, Edge Functions, background jobs, and external APIs. With each component, the likelihood of configuration errors, secret leaks, missing policies, and unsecured endpoints increases.
Traditional security scanners check static rules. They find service_role in an .env file, but they do not recognize that a new Server Action is missing the same ownership check that exists in all other Server Actions. They report open ports, but they do not recognize that a firewall change last week, combined with a new container, opened an unintended access path.
Claude Code closes this gap as a contextual analysis layer across the entire stack.
Claude Code does not replace deterministic checks. It supplements them by interpreting results, identifying connections, and prioritizing recommendations.
This runbook describes how Claude Code is concretely installed, configured, and integrated into the DevOps workflow.
At a Glance - Part 5 of 6 in the DevOps Runbook Series
- Claude Code runs exclusively on the audit server (never on production)
- Three-layer model: deterministic checks, runbook rules, Claude contextual analysis
- Custom commands in .claude/commands/ define the review scope
- Headless mode with --allowedTools restricted to Read, Grep, Glob (no Bash)
- Weekly cron audit plus ad-hoc reviews on pull requests
Series Table of Contents
This guide is part of our DevOps runbook series for self-hosted app stacks.
- Supabase Self-Hosting Runbook
- Running Next.js on Supabase Securely
- Deploying Supabase Edge Functions Securely
- Running Trigger.dev Background Jobs Securely
- Claude Code as a Security Control in the DevOps Workflow - this article
- Security Baseline for the Entire Stack
The first four articles describe the individual components. This article describes the security control layer on top of them.
Architecture Overview
Claude Code does not run on the production server but on the audit server (see Article 1). It has read-only access to the data it analyzes.
Production Server (supabase-prod)
|
+-- Supabase Stack
+-- Next.js App
+-- Edge Functions
+-- Trigger.dev
|
+---- SSH (read-only) ----> Audit Server (audit-runner)
|
+-- Deterministic Checks
| +-- Port Scans (nmap)
| +-- Firewall Diff (iptables-save)
| +-- Container Versions (docker images)
| +-- RLS Status (psql)
| +-- npm audit
| +-- grep-based Code Checks
|
+-- Claude Code (headless)
| +-- Analyzes check results
| +-- Reads Git diffs
| +-- Validates configs against runbooks
| +-- Creates prioritized report
|
+-- Report -> DevOps Team (human decides)
Claude Code does not execute changes on the production system. No deployments, no secret rotation, no container stops.
Core Principle: Three Layers of Security Review
Layer 1: Deterministic Checks (Scripts)
-> Objective, repeatable results
-> Example: "Port 5432 is reachable from outside" = fact
Layer 2: Runbook Rules (Markdown files)
-> Defined target state of the stack
-> Example: "Postgres must only listen on 10.0.1.10"
Layer 3: Claude Code Contextual Analysis (headless)
-> Interprets results, identifies connections
-> Example: "Port 5432 is open AND the new container
has a direct DB connection. This is a problem."
The layers build on each other. Claude receives the results from Layer 1 and the rules from Layer 2 as input and creates its contextual review from them.
Research shows that contextual code reviews identify up to 31% more security issues than purely rule-based scanners (GitHub Security Lab 2024).
Comparison: Three Layers of Security Review
| Property | Deterministic Checks | Runbook Rules | Claude Code Analysis |
|---|---|---|---|
| Execution | Automated (scripts) | Reference document | Headless mode (CLI) |
| Result | Fact (yes/no) | Desired state | Interpretation + priority |
| Frequency | Daily | Static (on change) | Weekly / on PR |
| Strength | Reliable for known patterns | Defined standards | Detects unknown patterns |
| Weakness | Cannot see relationships | Not self-checking | Non-deterministic |
| Example | "Port 5432 open" | "Postgres only on 10.0.1.10" | "Port open AND new container = problem" |
Part A - Setting Up Claude Code
A1 - Installation and Configuration on the Audit Server
Implementation
Claude Code is installed on the audit server, not on the production server.
# On the audit-runner server
# Install Node.js (if not already present)
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo bash -
sudo apt-get install -y nodejs
# Install Claude Code
npm install -g @anthropic-ai/claude-code
# Configure the API key
# Dedicated API key with a budget limit for CI/CD
export ANTHROPIC_API_KEY="sk-ant-..."
# Persist it in .bashrc or .env for the audit user
# (cron does not read .bashrc, so cron-driven runs need the key from another source)
echo 'export ANTHROPIC_API_KEY="sk-ant-..."' >> /home/audit/.bashrc
Testing headless mode:
# Simple test: does Claude Code work non-interactively?
echo "Hello" | claude -p "Reply with OK if you can read this" \
  --output-format text
# Expected: "OK" or a similar confirmation
Verifiable Condition
# Claude Code installed?
claude --version
# Expected: version number
# API key set?
test -n "$ANTHROPIC_API_KEY" && echo "OK" || echo "MISSING"
# Headless mode works?
claude -p "Say only: OK" --output-format text --max-turns 1 2>/dev/null
# Expected: "OK"
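Because cron does not source .bashrc, a key stored only there is not visible to scheduled runs. One possible alternative is a dedicated env file that the audit scripts load themselves. The following is a minimal sketch, not the series' prescribed setup: the function name `load_api_key` and the expected file mode 600 are assumptions, and `stat -c` requires GNU coreutils.

```shell
#!/bin/bash
# Hypothetical helper: load the API key from a dedicated env file.
# The file is expected to contain one line: ANTHROPIC_API_KEY=sk-ant-...
load_api_key() {
  local env_file="$1"
  if [ ! -f "$env_file" ]; then
    echo "MISSING: $env_file" >&2
    return 1
  fi
  # Refuse files that group or others can read (assumed policy: mode 600).
  local mode
  mode=$(stat -c %a "$env_file")
  if [ "$mode" != "600" ]; then
    echo "REFUSED: $env_file has mode $mode, expected 600" >&2
    return 1
  fi
  set -a   # export every variable assigned while sourcing the file
  . "$env_file"
  set +a
  # Succeed only if the key is actually set afterwards.
  [ -n "${ANTHROPIC_API_KEY:-}" ]
}
```

A script started by cron could call `load_api_key` with the path of that file before invoking `claude`, and abort if the call fails.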
Failure Scenario
If Claude Code runs on the production server and has Bash access there, a faulty prompt could theoretically execute commands on the production system. On the audit server, Claude Code only has access to the collected reports and config copies, not to the live system.
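The rule can also be enforced in the scripts themselves with a guard that aborts on any host other than the audit server. This is a minimal sketch that assumes the hostname `audit-runner` from the architecture overview. The function name `assert_audit_host` is hypothetical.

```shell
#!/bin/bash
# Hypothetical guard: refuse to run Claude Code anywhere but the audit server.
# Assumes the audit server's short hostname is "audit-runner".
assert_audit_host() {
  # Optional first argument overrides the detected hostname (useful for tests).
  local current_host="${1:-$(hostname -s)}"
  if [ "$current_host" != "audit-runner" ]; then
    echo "REFUSED: Claude Code must not run on '$current_host'" >&2
    return 1
  fi
  echo "OK: running on audit server"
}
```

Calling `assert_audit_host || exit 1` at the top of a script stops it before any `claude` invocation if the script was copied to the wrong machine.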
A2 - Setting Up a Custom Security Review Command
Implementation
Claude Code supports custom commands via Markdown files in .claude/commands/. These commands define the context and the review rules for the security review.
# In the infrastructure repository
mkdir -p .claude/commands
# .claude/commands/security-review.md
You are the security reviewer for a self-hosted stack consisting of:
- Supabase (PostgreSQL, PostgREST, GoTrue, Kong, Edge Functions)
- Next.js (app layer, Server Actions, Route Handlers)
- Trigger.dev v3 (background jobs, self-hosted)
- All on EU-based infrastructure (e.g. Hetzner, OVH)
You receive the output of the deterministic security checks and the
current configuration files.
Check the following:
## Infrastructure (Article 1)
- Are unexpected ports reachable from outside?
- Has the firewall changed compared to the baseline?
- Are container versions pinned and up to date?
- Is the TLS certificate valid for at least 14 more days?
- Is the latest backup less than 26 hours old?
## Next.js (Article 2)
- Does service_role appear in NEXT_PUBLIC variables or in the client bundle?
- Do all Server Actions have a getUser() auth check?
- Do all Route Handlers have a getUser() auth check?
- Does middleware.ts exist and use getUser() (not getSession())?
- Are security headers set in next.config.js?
## Edge Functions (Article 3)
- Do all webhook functions verify signatures?
- Are there functions that contain business logic instead of integration logic?
- Is CORS configured correctly (no wildcard in production)?
- Are there hardcoded secrets in the function code?
## Trigger.dev (Article 4)
- Do all tasks have maxDuration and concurrencyLimit?
- Do tasks with external API calls use idempotency keys?
- Do tasks read only the DB fields they need (no SELECT *)?
- Is console.log used instead of the Trigger.dev logger?
## Cross-stack
- Is there architecture drift? (business logic in Edge Functions,
  long-running work in Server Actions, etc.)
- Are last week's changes consistent with the
  existing architecture?
- Are there new tables without RLS?
- Are there patterns that point to systematic problems?
Create a prioritized report. Use the severity labels exactly as written,
because the audit script matches on them:
- KRITISCH: act immediately (data leak possible, port open, secret exposed)
- WARNUNG: resolve this week (missing checks, drift, outdated packages)
- INFO: improve when convenient (code quality, missing tests)
For each finding: what the problem is, why it is a risk,
and what the concrete fix is.
Verifiable Condition
# Custom command exists?
test -f .claude/commands/security-review.md && echo "OK" || echo "MISSING"
# Is the command recognized by Claude Code?
claude -p "/security-review" --max-turns 1 2>/dev/null | head -5
# Expected: Claude starts the analysis
A3 - CLAUDE.md as Project Context
Implementation
Claude Code automatically reads the CLAUDE.md file in the project directory. This file gives Claude persistent context about your stack.
# CLAUDE.md (in the root of the infrastructure repo)
## Stack Overview
Self-hosted Supabase + Next.js + Edge Functions + Trigger.dev v3
on EU-based infrastructure (e.g. Hetzner, OVH).
## Servers
- supabase-prod: 10.0.1.10 (Supabase, Next.js, Edge Functions)
- audit-runner: 10.0.1.11 (security checks, Claude Code, monitoring)
- trigger-server: separate Docker stack for Trigger.dev v3
## Security Rules
- The service_role key is allowed ONLY in: lib/supabase/admin.ts and trigger/lib/supabase.ts
- NEXT_PUBLIC_* may contain ONLY: SUPABASE_URL and SUPABASE_ANON_KEY
- All public tables must have RLS enabled
- All Server Actions and Route Handlers must call getUser()
- All webhook Edge Functions must verify signatures
- All Trigger.dev tasks must have maxDuration and concurrencyLimit
- PostgreSQL only on 10.0.1.10:5432 (internal interface)
- Supabase Studio not reachable from outside
- Trigger.dev dashboard not reachable from outside
## Deployment
- Infrastructure is deployed only via Git (no manual editing over SSH)
- Edge Functions are stored as files in volumes/functions/
- Trigger.dev tasks are deployed via npx trigger.dev deploy --self-hosted
## Directory Structure
- app/ = Next.js app (Server Actions, Route Handlers, pages)
- lib/supabase/ = Supabase client configuration
- middleware.ts = auth middleware
- volumes/functions/ = Supabase Edge Functions
- trigger/tasks/ = Trigger.dev tasks
- infra/ = docker-compose, nginx/caddy, firewall configs
- scripts/ = audit scripts, backup, restore
- runbooks/ = security runbooks
Combining the CLAUDE.md approach with the security rules from Article 2 creates a seamless context base for automated reviews.
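Rules written down in CLAUDE.md can often be backed by a deterministic check as well. As an illustration, the allowlist for the service_role key could be checked with a sketch like the following. The function name `find_service_role_violations` is hypothetical. The two allowed paths are taken from the security rules in CLAUDE.md, and GNU grep is assumed.

```shell
#!/bin/bash
# Hypothetical check: list TypeScript files that mention the service_role key
# outside the two files allowed by the CLAUDE.md security rules.
find_service_role_violations() {
  local repo_root="$1"
  local hits
  # The subshell created by $(...) keeps the caller's working directory unchanged.
  hits=$(cd "$repo_root" && grep -rlE --include='*.ts' --include='*.tsx' \
           --exclude-dir=node_modules 'SERVICE_ROLE|service_role' . 2>/dev/null || true)
  printf '%s\n' "$hits" \
    | sed 's|^\./||' \
    | { grep -vxF -e 'lib/supabase/admin.ts' -e 'trigger/lib/supabase.ts' || true; } \
    | sed '/^$/d' \
    | sort
}
```

An empty result means the rule holds. Any printed path is a candidate finding that can go into the audit input for Claude to assess in context.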
Part B - The Full Audit Workflow
This is where everything prepared in Articles 1 through 4 comes together, from the Supabase infrastructure checks to the Trigger.dev task audits. Each of those articles has its own check scripts. This article combines them and passes the output to Claude Code.
B1 - The Full Audit Script
Implementation
This script collects all results from the deterministic checks and passes them to Claude Code in headless mode.
#!/bin/bash
# scripts/full-security-audit.sh
# Runs on the audit-runner server
set -euo pipefail
# Run from the repository root so that relative paths (runbooks/, .claude/,
# CLAUDE.md) also resolve when cron starts the script.
# Assumes this script lives in scripts/ directly below the repository root.
cd "$(dirname "$0")/.."
PROD_HOST="10.0.1.10"
REPORT_DIR="/opt/audit/reports"
DATE=$(date +%Y-%m-%d)
REPORT_FILE="${REPORT_DIR}/audit-input-${DATE}.md"
mkdir -p "$REPORT_DIR"
echo "=== Full security audit ${DATE} ===" | tee "$REPORT_FILE"
# -------------------------------------------
# LAYER 1: Deterministic checks
# -------------------------------------------
echo -e "\n# Deterministic check results\n" >> "$REPORT_FILE"
# --- Infrastructure (Article 1) ---
echo "## Infrastructure" >> "$REPORT_FILE"
echo "### Open ports (external)" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
nmap -p 22,80,443,3000,3040,5432,5433,8000,9000 app.example.com \
  --open -oG - 2>/dev/null >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
echo "### Firewall diff against baseline" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
# diff exits with 1 when differences exist; '|| true' keeps set -e from aborting
ssh deploy@${PROD_HOST} "iptables-save" | \
  diff /opt/baselines/firewall-baseline.txt - >> "$REPORT_FILE" 2>&1 || true
echo '```' >> "$REPORT_FILE"
echo "### Container status" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
ssh deploy@${PROD_HOST} "docker compose ps --format 'table {{.Name}}\t{{.Status}}\t{{.Image}}'" \
  >> "$REPORT_FILE" 2>&1
echo '```' >> "$REPORT_FILE"
echo "### TLS certificate" >> "$REPORT_FILE"
CERT_EXPIRY=$(echo | openssl s_client -connect app.example.com:443 2>/dev/null | \
  openssl x509 -noout -enddate 2>/dev/null | cut -d= -f2)
# Falls back to epoch 0 (a large negative day count) if the date cannot be parsed
CERT_EPOCH=$(date -d "$CERT_EXPIRY" +%s 2>/dev/null || echo 0)
NOW_EPOCH=$(date +%s)
DAYS_LEFT=$(( (CERT_EPOCH - NOW_EPOCH) / 86400 ))
echo "Expires: ${CERT_EXPIRY} (${DAYS_LEFT} days)" >> "$REPORT_FILE"
echo "### Backup status" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
ssh deploy@${PROD_HOST} "ls -lh /opt/backups/*.gpg 2>/dev/null | tail -3" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
echo "### RLS status" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
ssh deploy@${PROD_HOST} "docker compose exec -T postgres psql -U postgres -c \
  \"SELECT tablename, rowsecurity FROM pg_tables WHERE schemaname = 'public' ORDER BY tablename;\"" \
  >> "$REPORT_FILE" 2>&1
echo '```' >> "$REPORT_FILE"
echo "### Disk usage" >> "$REPORT_FILE"
ssh deploy@${PROD_HOST} "df -h / | tail -1" >> "$REPORT_FILE"
# --- Next.js (Article 2) ---
echo -e "\n## Next.js app layer" >> "$REPORT_FILE"
echo "### service_role in client code" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
ssh deploy@${PROD_HOST} "cd /opt/app && grep -rn 'SERVICE_ROLE\|service_role' app/ \
  --include='*.ts' --include='*.tsx' | grep -v 'lib/supabase/admin.ts' | grep -v node_modules" \
  >> "$REPORT_FILE" 2>&1 || echo "No matches" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
echo "### NEXT_PUBLIC with secrets" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
ssh deploy@${PROD_HOST} "cd /opt/app && grep 'NEXT_PUBLIC_' .env* 2>/dev/null | \
  grep -iE 'service_role|secret|private|database'" \
  >> "$REPORT_FILE" 2>&1 || echo "No matches" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
echo "### Server Actions without auth" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
# Flags every 'use server' file that never mentions getUser
ssh deploy@${PROD_HOST} "cd /opt/app && for file in \$(grep -rl \"'use server'\" app/ --include='*.ts'); do
  if ! grep -q 'getUser' \"\$file\"; then
    echo \"WARNUNG: \$file\"
  fi
done" >> "$REPORT_FILE" 2>&1 || echo "No matches" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
echo "### Route Handlers without auth" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
ssh deploy@${PROD_HOST} "cd /opt/app && for file in \$(find app/api -name 'route.ts' 2>/dev/null); do
  if ! grep -q 'getUser' \"\$file\"; then
    echo \"WARNUNG: \$file\"
  fi
done" >> "$REPORT_FILE" 2>&1 || echo "No matches" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
echo "### npm audit" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
ssh deploy@${PROD_HOST} "cd /opt/app && npm audit --audit-level=high 2>&1 | tail -20" \
  >> "$REPORT_FILE" 2>&1
echo '```' >> "$REPORT_FILE"
# --- Edge Functions (Article 3) ---
echo -e "\n## Edge Functions" >> "$REPORT_FILE"
echo "### Webhook functions without signature verification" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
# Heuristic: a webhook function counts as verified if its index.ts mentions
# signature, verify, hmac, or crypto
ssh deploy@${PROD_HOST} "for dir in /opt/supabase/volumes/functions/*-webhook/; do
  [ -d \"\$dir\" ] || continue
  name=\$(basename \"\$dir\")
  if ! grep -qE 'signature|verify|hmac|crypto' \"\$dir/index.ts\" 2>/dev/null; then
    echo \"KRITISCH: \$name has no signature verification\"
  else
    echo \"OK: \$name\"
  fi
done" >> "$REPORT_FILE" 2>&1
echo '```' >> "$REPORT_FILE"
echo "### Hardcoded secrets in functions" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
ssh deploy@${PROD_HOST} "grep -rn 'sk_live\|sk_test\|whsec_\|Bearer ey' \
  /opt/supabase/volumes/functions/ --include='*.ts' 2>/dev/null" \
  >> "$REPORT_FILE" 2>&1 || echo "No matches" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
# --- Trigger.dev (Article 4) ---
echo -e "\n## Trigger.dev tasks" >> "$REPORT_FILE"
echo "### Tasks without maxDuration" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
ssh deploy@${PROD_HOST} "cd /opt/app && for file in trigger/tasks/*.ts; do
  name=\$(basename \"\$file\" .ts)
  if ! grep -q 'maxDuration' \"\$file\" 2>/dev/null; then
    echo \"WARNUNG: \$name\"
  fi
done" >> "$REPORT_FILE" 2>&1
echo '```' >> "$REPORT_FILE"
echo "### Tasks without concurrency limit" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
ssh deploy@${PROD_HOST} "cd /opt/app && for file in trigger/tasks/*.ts; do
  name=\$(basename \"\$file\" .ts)
  if ! grep -qE 'concurrencyLimit|queue:' \"\$file\" 2>/dev/null; then
    echo \"WARNUNG: \$name\"
  fi
done" >> "$REPORT_FILE" 2>&1
echo '```' >> "$REPORT_FILE"
# --- Git changes ---
echo -e "\n## Code changes (last 7 days)" >> "$REPORT_FILE"
echo '```' >> "$REPORT_FILE"
ssh deploy@${PROD_HOST} "cd /opt/app && git log --oneline --since='7 days ago'" \
  >> "$REPORT_FILE" 2>&1
# HEAD~10 does not exist in repositories with fewer than 11 commits;
# '|| true' keeps set -e from aborting the audit in that case
ssh deploy@${PROD_HOST} "cd /opt/app && git diff HEAD~10 --stat 2>/dev/null" \
  >> "$REPORT_FILE" 2>&1 || true
echo '```' >> "$REPORT_FILE"
# -------------------------------------------
# LAYER 2: Runbook rules (as context)
# -------------------------------------------
echo -e "\n# Active runbook rules\n" >> "$REPORT_FILE"
cat runbooks/security-baseline.md >> "$REPORT_FILE" 2>/dev/null || \
  echo "security-baseline.md not found" >> "$REPORT_FILE"
# -------------------------------------------
# LAYER 3: Claude Code analysis
# -------------------------------------------
echo ""
echo "=== Deterministic checks completed ==="
echo "=== Starting Claude Code analysis ==="
echo ""
# Call Claude Code in headless mode
# Restrict --allowedTools: read-only tools, no Bash, no writing
CLAUDE_OUTPUT=$(cat "$REPORT_FILE" | claude -p \
  "You receive the complete security audit report of our self-hosted stack.
Analyze it according to the rules in .claude/commands/security-review.md.
Create a prioritized report with KRITISCH / WARNUNG / INFO.
For each finding: problem, risk, concrete fix." \
  --allowedTools "Read,Grep,Glob" \
  --append-system-prompt "You are a senior security engineer. Respond in German. Be specific and prioritize strictly." \
  --output-format text \
  --max-turns 5 \
  2>/dev/null)
# Save the report
CLAUDE_REPORT="${REPORT_DIR}/claude-review-${DATE}.md"
echo "$CLAUDE_OUTPUT" > "$CLAUDE_REPORT"
echo ""
echo "=== Claude Code review completed ==="
echo "Report: $CLAUDE_REPORT"
# On critical findings: send an alert
if echo "$CLAUDE_OUTPUT" | grep -qi "KRITISCH"; then
  echo "$CLAUDE_OUTPUT" | mail -s "KRITISCH: Security Review ${DATE}" ops@example.com
  echo "Alert sent to ops@example.com"
fi
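The alert condition at the end of the script matches the word anywhere in the report and ignores case, so a sentence such as "Keine kritischen Findings" would also trigger a mail. A stricter variant counts only lines that start with the label. The sketch below assumes that each finding begins a line with `KRITISCH:`, optionally preceded by Markdown markers. That output format is an assumption and would have to be requested in the prompt. The function name `count_critical_findings` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count findings instead of matching the word anywhere.
# Reads a report on stdin and prints the number of lines that start with
# "KRITISCH:" after optional Markdown markers (#, -, *) and whitespace.
count_critical_findings() {
  # grep -c prints 0 and exits with 1 when nothing matches; '|| true' keeps
  # callers that use 'set -e' from aborting in that case.
  grep -cE '^[[:space:]#*-]*KRITISCH:' || true
}
```

The alert could then be sent only when the printed count is greater than zero, and the count itself could go into the mail subject.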
Verifiable Condition
# Script executable?
test -x scripts/full-security-audit.sh && echo "OK" || echo "NOT EXECUTABLE"
# Latest audit report present?
ls -la /opt/audit/reports/claude-review-$(date +%Y-%m-%d).md 2>/dev/null
# Expected: file from today
# Does the report contain the expected sections?
grep -c "KRITISCH\|WARNUNG\|INFO" /opt/audit/reports/claude-review-$(date +%Y-%m-%d).md
# Expected: at least 1
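The RLS status that the audit script collects is a raw psql table. It can be reduced to a list of offending tables before Claude sees it, which turns "tables without RLS" into a Layer 1 fact. The following is a sketch assuming psql's default aligned output with the two columns `tablename` and `rowsecurity`. The function name `tables_without_rls` is hypothetical.

```shell
#!/bin/bash
# Hypothetical filter for the output of:
#   SELECT tablename, rowsecurity FROM pg_tables WHERE schemaname = 'public';
# Reads psql's aligned table on stdin and prints every table name whose
# rowsecurity column is "f" (false).
tables_without_rls() {
  awk -F'|' '
    NF == 2 {                       # data and header rows have exactly two columns
      name = $1; flag = $2
      gsub(/[[:space:]]/, "", name) # strip the padding psql adds for alignment
      gsub(/[[:space:]]/, "", flag)
      if (flag == "f") print name
    }'
}
```

The header row is skipped automatically because its second column reads `rowsecurity`, not `f`. The separator line and the row-count footer contain no `|` and are ignored.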
B2 - When the Audit Runs
Three Trigger Points
1. Weekly Cron (Sunday 6:00 AM)
-> Full audit across the entire stack
-> Result: Prioritized report for the week
2. On Pull Requests (CI/CD)
-> Only the changed files
-> Result: Review comment on the PR
3. Ad-hoc (manual)
-> After incidents, major deployments, infrastructure changes
-> Result: Immediate analysis
Weekly cron:
# crontab on the audit-runner
0 6 * * 0 /opt/audit/scripts/full-security-audit.sh >> /var/log/security-audit.log 2>&1
PR review (piping Git diff to Claude):
#!/bin/bash
# scripts/pr-security-review.sh
# Called by CI on every PR
DIFF=$(git diff origin/main...HEAD)
if [ -z "$DIFF" ]; then
  echo "No diff, no review needed"
  exit 0
fi
echo "$DIFF" | claude -p \
  "Review this Git diff for security problems.
Check in particular:
- service_role usage in client code
- Server Actions without an auth check
- Edge Functions without signature verification
- Trigger.dev tasks without maxDuration
- Hardcoded secrets
- New API endpoints without rate limiting
Report ONLY concrete problems that you find.
If everything is fine, say: No security issues found." \
  --allowedTools "Read,Grep,Glob" \
  --output-format text \
  --max-turns 3
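The prompt asks for one fixed sentence when nothing is found, which gives CI something to branch on. The sketch below turns the review text into an exit status. The function name `review_is_clean` is hypothetical, and the exact "all clear" sentence is passed in as an argument rather than hardcoded. Because model output is not deterministic, anything other than that sentence alone is treated as "needs a human look".

```shell
#!/bin/bash
# Hypothetical helper: map the review text to a CI exit status.
# $1    = the exact "all clear" sentence the prompt asks Claude to return
# stdin = the review text
# Returns 0 only if the review consists of that sentence alone, ignoring
# surrounding whitespace and a final period.
review_is_clean() {
  local sentinel="$1"
  local review
  # Collapse all whitespace runs to single spaces, then trim both ends
  # and drop one trailing period.
  review=$(cat | tr -s '[:space:]' ' ' | sed -e 's/^ //' -e 's/ $//' -e 's/\.$//')
  [ "$review" = "${sentinel%.}" ]
}
```

A CI step could save the review to a file, pipe it into `review_is_clean`, and mark the PR for manual review whenever the function returns non-zero.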
Ad-hoc review:
# Manually after an incident
cd /opt/infra-repo
claude -p "Check the current state of the repository for security problems. \
Focus on the last 5 commits." \
  --allowedTools "Read,Grep,Glob,Bash(git log*),Bash(git diff*)" \
  --max-turns 10
Verifiable Condition
# Cron job active?
crontab -l | grep "full-security-audit"
# Expected: entry present
# Last audit no more than 8 days ago?
LAST_REPORT=$(ls -t /opt/audit/reports/claude-review-*.md 2>/dev/null | head -1)
if [ -n "$LAST_REPORT" ]; then
  AGE=$(( ($(date +%s) - $(stat -c %Y "$LAST_REPORT")) / 86400 ))
  echo "Last report: $AGE days old"
  [ "$AGE" -gt 8 ] && echo "WARNUNG: audit overdue"
fi
Part C - What Claude Can and Cannot Do
C1 - Where Claude Code Excels
Claude identifies connections that deterministic checks cannot see:
Detecting architecture drift: The deterministic checks find that a new Edge Function exists. Claude recognizes that this function contains business logic that belongs in Next.js because it processes user input and transforms data rather than receiving an external event.
Patterns across file boundaries: A grep finds that getUser() is missing in a Server Action. Claude recognizes that this action is the only one of 15 actions missing the check, and that it was added last week in a commit that also changed three other files, all of which are correct. This points to an oversight, not a systematic problem.
Prioritization: The deterministic checks produce a flat list of 30 findings. Claude groups them: “The 5 service-role findings are all in the admin library and correct. The 2 missing auth checks in the route handlers are the real risk because they affect public API endpoints.”
Configuration context: A port scan shows that port 8000 is open. Claude knows from the stack context (CLAUDE.md) that port 8000 is the internal Kong port and checks whether it should only be listening on localhost.
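The localhost question for port 8000 can itself be answered deterministically and handed to Claude as a fact. The following sketch parses the output of `ss -tln` and prints every listener on a given port that is not bound to loopback. The function name `non_loopback_listeners` is hypothetical, and the column layout assumed is the standard one (local address and port in the fourth column).

```shell
#!/bin/bash
# Hypothetical helper: reads `ss -tln` output on stdin and prints every
# listening socket on port $1 whose local address is not a loopback address.
non_loopback_listeners() {
  local port="$1"
  awk -v port="$port" '
    $1 == "LISTEN" {
      addr = $4                       # e.g. 127.0.0.1:8000 or [::]:22
      n = split(addr, parts, ":")     # the port is the last colon-separated part
      if (parts[n] != port) next
      host = substr(addr, 1, length(addr) - length(parts[n]) - 1)
      if (host != "127.0.0.1" && host != "[::1]") print addr
    }'
}
```

On the production host this would be fed with something like `ss -tln | non_loopback_listeners 8000`. Empty output means the port is only reachable via loopback.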
C2 - What Claude Code Cannot Do and Must Not Do
Claude cannot actively scan. It cannot perform network scans, check running processes, or test ports from outside. The deterministic tools do that (nmap, ss, docker ps). Claude analyzes their output.
Claude must not write to production. The --allowedTools restriction in headless mode ensures that Claude can only read:
# CORRECT: only read tools allowed
claude -p "..." --allowedTools "Read,Grep,Glob"
# WRONG: Bash allowed = Claude can execute arbitrary commands
claude -p "..." --allowedTools "Read,Grep,Glob,Bash"
When Bash is needed (e.g., for git log), only allow specific commands:
--allowedTools "Read,Grep,Glob,Bash(git log*),Bash(git diff*)"
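This rule lends itself to a small lint over the audit scripts. The sketch below flags every `--allowedTools` value that contains a bare `Bash` entry, meaning `Bash` without a parenthesized command pattern. The function name `find_unrestricted_bash` is hypothetical. It assumes that the flag and its quoted value sit on the same line, and that GNU grep is available for `\b`.

```shell
#!/bin/bash
# Hypothetical lint: prints "file:line:text" for every --allowedTools value
# that contains a bare Bash entry (Bash not followed by an opening parenthesis).
# Arguments: the script files to check.
find_unrestricted_bash() {
  # '--' ends option parsing so the pattern may start with dashes.
  # '|| true' turns "no match" (exit status 1) into success.
  grep -nHE -- '--allowedTools[= ]+"[^"]*\bBash([^(]|$)' "$@" || true
}
```

Run against `scripts/*.sh`, empty output means that no script grants unrestricted Bash access. Scoped entries such as `Bash(git log*)` are not reported.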
Claude is not deterministic. The same input can produce slightly different reports. That is why Claude does not replace deterministic checks. It interprets their results.
Claude does not know real-time data. It works with the snapshots provided at the time of the audit. Between the audit and reading the review, the state may have changed.
C3 - Security Rules for Claude Code in CI
# On the audit server: Claude Code configuration
# 1. Dedicated API key with a budget limit
# Separate key, not the developer key
# Budget: max. 50 USD/month for CI reviews
# 2. Always restrict --allowedTools
# Never allow unrestricted Bash access in automated scripts
# 3. Limit --max-turns
# Prevents endless loops on confusing inputs
# Recommendation: 3-5 for PR reviews, 5-10 for full audits
# 4. Always save the output to a file
# Not only to stdout, so the report stays traceable
# 5. Claude Code has no SSH access to production
# The deterministic scripts collect the data
# Claude only receives the text output for analysis
Deployment Checklist
Installation
[ ] Claude Code installed on audit-runner
[ ] API key configured (separate CI key with budget limit)
[ ] Headless mode works (test with claude -p)
Configuration
[ ] .claude/commands/security-review.md created
[ ] CLAUDE.md present in the infrastructure repo
[ ] --allowedTools restricted to Read,Grep,Glob
[ ] --max-turns limited to 5-10
Automation
[ ] Full audit script present and executable
[ ] Weekly cron job active
[ ] PR review script integrated in CI
[ ] Alert on critical findings configured
Security
[ ] Claude Code runs ONLY on the audit-runner
[ ] Claude Code has NO SSH access to production
[ ] No unrestricted Bash access in --allowedTools
[ ] Reports are saved and archived
[ ] Budget limit set on the API key
Conclusion
Claude Code is not an automatic security scanner and not a replacement for deterministic checks. It is a contextual analyst that interprets the results of scanners and grep checks, identifies connections, and prioritizes recommendations.
The workflow is always the same: deterministic tools collect facts, Claude Code interprets them, a human decides. This works because each layer does what it does best. Scripts are reliable with known patterns. Claude identifies unknown patterns. Humans make decisions.
The combination of the check scripts from Articles 1-4, the custom security review command, and the weekly audit cron creates a security control that uncovers both known and unexpected risks.
Those who pursue these principles together with a Cert-Ready-by-Design architecture build verifiable security instead of retroactive audits.
Series Audit Checklists
Prepared prompts for Claude Code. Each checklist automatically verifies the security points from its respective runbook and reports PASS, WARNING, or CRITICAL.
The next and final article describes the Security Baseline for the full stack, which consolidates all rules from Articles 1-5 into a single verifiable YAML file.

Bert Gogolin
CEO & Founder, Gosign