# Trigger.dev Background Jobs Security Audit

You are a Senior DevOps Security Auditor. Your task is to verify whether the
Trigger.dev setup has been correctly implemented according to the runbook (Article 4).

Execute the following checks in order. Use the available tools.
For each check: Report PASS, WARNING, or CRITICAL with a brief justification.
At the end, produce a summary with recommended actions.

---

## A1: Trigger.dev as a Separate Service

Verify that Trigger.dev runs separately from Next.js and Supabase.

```bash
# Trigger.dev containers/processes
echo "=== Trigger.dev containers ==="
docker ps --format '{{.Names}}\t{{.Image}}\t{{.Status}}' 2>/dev/null | grep -i "trigger"

# Dedicated database?
echo "=== Trigger.dev database ==="
ss -tlnp 2>/dev/null | grep 5433 || echo "Port 5433 not listening"

# Supabase DB on a different port?
echo "=== Supabase database ==="
ss -tlnp 2>/dev/null | grep 5432 || echo "Port 5432 not listening"

# Trigger.dev docker-compose present?
echo "=== Trigger.dev compose ==="
find /opt -name "docker-compose*" -path "*trigger*" 2>/dev/null | head -5

# Dashboard internal only?
echo "=== Dashboard port ==="
ss -tlnp 2>/dev/null | grep 3040
```

Expectation:
- Trigger.dev containers running (platform + DB)
- Dedicated PostgreSQL on port 5433 (not the Supabase DB on 5432)
- Dashboard (port 3040) only on 127.0.0.1 (not 0.0.0.0)

Risk: Trigger.dev on the Supabase DB = job queue queries compete with user requests. Dashboard externally reachable = access to run history and logs.

---

## A2: Database Access Controlled

Verify that tasks properly restrict database access.

```bash
# Locate the trigger tasks directory
TASKS_DIR=$(find . -path "*/trigger/tasks" -type d 2>/dev/null | grep -v node_modules | head -1)
TASKS_DIR=${TASKS_DIR:-"trigger/tasks"}

# Shared Supabase client present?
echo "=== Shared client ==="
find . -path "*/trigger/lib/*" -name "*.ts" 2>/dev/null | grep -v node_modules

# Where are service_role / DATABASE_URL used?
echo "=== service_role in tasks ==="
grep -rn "SERVICE_ROLE\|service_role\|createTaskClient\|createAdminClient" \
  ${TASKS_DIR}/ --include="*.ts" 2>/dev/null | head -10

# Direct DB connection?
echo "=== Direct DB connection ==="
grep -rn "DATABASE_URL\|postgres(\|pg\.connect\|new Pool" \
  trigger/ --include="*.ts" 2>/dev/null | grep -v node_modules | head -10

# Connection pool limits?
echo "=== Connection pool config ==="
grep -rn "max:\|pool\|idle_timeout\|connect_timeout" \
  trigger/ --include="*.ts" 2>/dev/null | grep -v node_modules | head -10

# SELECT * patterns (overly broad access)?
echo "=== SELECT * patterns ==="
grep -rn "\.select('\\*')\|\.select(\`\\*\`)\|SELECT \\*" \
  ${TASKS_DIR}/ --include="*.ts" 2>/dev/null | head -10
```

Expectation:
- Shared client factory in trigger/lib/ (not created individually in each task)
- service_role only in the client factory, not directly in tasks
- Connection pool with max limit (5-10)
- No SELECT * (only required fields)

Risk: Tasks bypass RLS. Without scope restrictions = full DB access. Without pool limits = connection starvation.
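For reference, a minimal sketch of what such a shared client factory in trigger/lib/ could look like. The file path, env var names, and the `createTaskClient` name are illustrative assumptions, not taken from the runbook:

```typescript
// trigger/lib/supabase.ts -- hypothetical shared factory; names are illustrative.
import { createClient } from "@supabase/supabase-js";

export function createTaskClient() {
  const url = process.env.SUPABASE_URL;
  const key = process.env.SUPABASE_SERVICE_ROLE_KEY; // service_role stays ONLY in this file
  if (!url || !key) throw new Error("Missing Supabase environment variables");
  return createClient(url, key, {
    auth: { persistSession: false }, // background workers hold no user session
  });
}

// For direct Postgres access (e.g. with postgres.js), cap the pool so a burst
// of concurrent tasks cannot starve the database of connections:
//   const sql = postgres(process.env.DATABASE_URL!, { max: 5, idle_timeout: 20 });
```

Tasks then import this one factory instead of constructing their own clients, which keeps the service_role key out of individual task files.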

---

## B1: Task Definitions Correct

Verify that all tasks are correctly defined per the v3 API.

```bash
TASKS_DIR=$(find . -path "*/trigger/tasks" -type d 2>/dev/null | grep -v node_modules | head -1)
TASKS_DIR=${TASKS_DIR:-"trigger/tasks"}

# All task files
echo "=== Task files ==="
ls ${TASKS_DIR}/*.ts 2>/dev/null

# Tasks exported?
echo "=== Non-exported tasks ==="
for file in ${TASKS_DIR}/*.ts; do
  [ -f "$file" ] || continue
  name=$(basename "$file" .ts)
  if ! grep -q "export const" "$file"; then
    echo "CRITICAL: $name is not exported"
  fi
done

# Task IDs present and unique? (counts > 1 indicate duplicates)
echo "=== Task IDs ==="
grep -rhoE 'id: "[^"]+"' ${TASKS_DIR}/ --include="*.ts" 2>/dev/null | \
  awk -F'"' '{print $2}' | sort | uniq -c | sort -rn | head -10

# Are tasks triggered from client code? (heuristic; review matches manually)
echo "=== Tasks triggered from client code ==="
grep -rn "tasks\.trigger\|\.trigger(" app/ --include="*.tsx" 2>/dev/null | \
  grep -v "use server" | grep -v node_modules | head -5
```

Expectation:
- All tasks exported (export const)
- All tasks have unique IDs
- Tasks are ONLY triggered from server context (not from .tsx client components)

Risk: Non-exported tasks are silently ignored on deploy. Tasks triggered from client code expose the Trigger API.
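A minimal sketch of a correctly defined v3 task (the task name and payload shape are illustrative, not from the runbook):

```typescript
import { task } from "@trigger.dev/sdk/v3";

// Must be `export const` -- a task that is not exported is silently
// skipped during deploy and never registered.
export const generateReport = task({
  id: "generate-report", // must be unique across the whole project
  run: async (payload: { reportId: string }) => {
    // ... task body runs only in the worker, never in the browser
  },
});
```

The task should only ever be triggered from server context (server actions, route handlers, other tasks), never from a client component.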

---

## B2: Idempotency

Verify that tasks are idempotent, especially those with external API calls.

```bash
TASKS_DIR=$(find . -path "*/trigger/tasks" -type d 2>/dev/null | grep -v node_modules | head -1)
TASKS_DIR=${TASKS_DIR:-"trigger/tasks"}

# Tasks with external API calls
echo "=== Tasks with external calls ==="
for file in ${TASKS_DIR}/*.ts; do
  [ -f "$file" ] || continue
  name=$(basename "$file" .ts)
  if grep -qE "fetch\(|stripe\.|resend\.|sendgrid\.|openai\.|anthropic\." "$file"; then
    if grep -q "idempotencyKey" "$file"; then
      echo "OK: $name has external calls + idempotency keys"
    else
      echo "WARNING: $name has external calls WITHOUT idempotency keys"
    fi
  fi
done

# DB operations: upsert instead of insert?
echo "=== Insert vs upsert ==="
grep -rn "\.insert(" ${TASKS_DIR}/ --include="*.ts" 2>/dev/null | head -5
grep -rn "\.upsert(" ${TASKS_DIR}/ --include="*.ts" 2>/dev/null | head -5

# Idempotency at trigger time (from Next.js)?
echo "=== Idempotency at trigger time ==="
grep -rn "idempotencyKey" app/ --include="*.ts" 2>/dev/null | grep -v node_modules | head -10
```

Expectation:
- Tasks with external API calls (Stripe, email, etc.) have idempotencyKeys
- DB operations use upsert where possible (instead of insert)
- Trigger invocations in Next.js have idempotencyKey where double-triggers are possible

Risk: Without idempotency on retry = duplicate payments, duplicate emails, duplicate DB records.
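A sketch of idempotent triggering from a Next.js server action, assuming a hypothetical `process-payment` task; the key derivation and names are illustrative:

```typescript
// app/actions/payments.ts -- hypothetical server action
"use server";

import { tasks } from "@trigger.dev/sdk/v3";

export async function startPayment(orderId: string) {
  // A key derived from the order means a double-submit (or a retried
  // request) collapses into a single task run instead of two charges.
  await tasks.trigger(
    "process-payment",
    { orderId },
    { idempotencyKey: `payment-${orderId}` }
  );
}
```

Inside the task, the same principle applies to the external provider: pass a stable idempotency key (e.g. derived from the order ID) to APIs like Stripe that support one, so a task retry does not repeat the charge.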

---

## B3: Timeouts and Concurrency

Verify that all tasks have maxDuration and concurrency limits.

```bash
TASKS_DIR=$(find . -path "*/trigger/tasks" -type d 2>/dev/null | grep -v node_modules | head -1)
TASKS_DIR=${TASKS_DIR:-"trigger/tasks"}

# Tasks without maxDuration
echo "=== maxDuration ==="
for file in ${TASKS_DIR}/*.ts; do
  [ -f "$file" ] || continue
  name=$(basename "$file" .ts)
  if grep -q "maxDuration" "$file"; then
    DURATION=$(grep "maxDuration" "$file" | grep -oP '\d+' | head -1)
    echo "OK: $name (maxDuration: ${DURATION}s)"
  else
    echo "WARNING: $name has NO maxDuration"
  fi
done

# Tasks without a concurrency limit
echo "=== Concurrency ==="
for file in ${TASKS_DIR}/*.ts; do
  [ -f "$file" ] || continue
  name=$(basename "$file" .ts)
  if grep -qE "concurrencyLimit|queue:" "$file"; then
    LIMIT=$(grep "concurrencyLimit" "$file" | grep -oP '\d+' | head -1)
    echo "OK: $name (concurrency: ${LIMIT:-shared queue})"
  else
    echo "WARNING: $name has NO concurrency limit"
  fi
done

# Shared queues?
echo "=== Shared queues ==="
grep -rn "queue({" trigger/ --include="*.ts" 2>/dev/null | grep -v node_modules | head -5
```

Expectation:
- All tasks have maxDuration (email: 30s, PDF: 120s, AI: 300s)
- All tasks have concurrencyLimit or use a shared queue
- Shared queues for tasks that hit the same external service

Risk: Without maxDuration = task hangs indefinitely on API timeout. Without concurrency limits = all worker slots blocked.
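A sketch of a task with both limits set; the task, queue name, and values are illustrative (the durations in the expectation above are the runbook's guidance):

```typescript
import { task } from "@trigger.dev/sdk/v3";

export const sendWelcomeEmail = task({
  id: "send-welcome-email",
  maxDuration: 30, // seconds -- enough for one email API round trip
  // Tasks that declare the same queue name share this limit, so all
  // email tasks together never exceed 5 concurrent runs against the provider.
  queue: { name: "email", concurrencyLimit: 5 },
  run: async (payload: { userId: string }) => {
    // ...
  },
});
```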

---

## B4: Retry Strategy

Verify that tasks have a sensible retry configuration.

```bash
TASKS_DIR=$(find . -path "*/trigger/tasks" -type d 2>/dev/null | grep -v node_modules | head -1)
TASKS_DIR=${TASKS_DIR:-"trigger/tasks"}

echo "=== Retry configuration ==="
for file in ${TASKS_DIR}/*.ts; do
  [ -f "$file" ] || continue
  name=$(basename "$file" .ts)
  if grep -q "retry:" "$file"; then
    ATTEMPTS=$(grep -A5 "retry:" "$file" | grep "maxAttempts" | grep -oP '\d+' | head -1)
    echo "OK: $name (maxAttempts: ${ATTEMPTS:-?})"
  else
    echo "WARNING: $name has NO retry configuration (falls back to the project default)"
  fi
done

# onFailure hooks for critical tasks?
echo "=== onFailure hooks ==="
for file in ${TASKS_DIR}/*.ts; do
  [ -f "$file" ] || continue
  name=$(basename "$file" .ts)
  if grep -q "onFailure" "$file"; then
    echo "OK: $name has an onFailure hook"
  else
    echo "INFO: $name has no onFailure hook"
  fi
done
```

Expectation:
- All tasks have retry configuration with maxAttempts and exponential backoff
- Critical tasks (payments, important sync) have onFailure hooks

Risk: Without retry = a transient API error leads to permanent data loss. Without onFailure = failed tasks go unnoticed.
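A sketch of exponential backoff with an onFailure hook; the task name and the concrete values are illustrative:

```typescript
import { logger, task } from "@trigger.dev/sdk/v3";

export const syncInvoices = task({
  id: "sync-invoices",
  retry: {
    maxAttempts: 3,
    factor: 2,              // exponential backoff between attempts
    minTimeoutInMs: 1_000,
    maxTimeoutInMs: 30_000,
    randomize: true,        // jitter, so retries don't stampede the API
  },
  run: async (payload: { accountId: string }) => {
    // ... transient API errors here are retried automatically
  },
  onFailure: async (payload, error) => {
    // Runs only after ALL attempts are exhausted -- the place to alert.
    logger.error("sync-invoices permanently failed", { error: String(error) });
  },
});
```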

---

## B5: Secrets Management

Verify that secrets are managed correctly.

```bash
# Hardcoded secrets in tasks?
echo "=== Hardcoded secrets ==="
grep -rn "sk_live\|sk_test\|SG\.\|sk-proj\|sk-ant\|Bearer ey\|whsec_" \
  trigger/ --include="*.ts" 2>/dev/null | grep -v node_modules || \
  echo "No hardcoded secrets found (good)"

# Secrets loaded via process.env?
echo "=== process.env usage ==="
grep -rn "process\.env\." trigger/ --include="*.ts" 2>/dev/null | \
  grep -v node_modules | head -10

# Is .env.trigger tracked in Git?
echo "=== .env.trigger in Git? ==="
if git ls-files --error-unmatch .env.trigger >/dev/null 2>&1; then
  echo "CRITICAL: .env.trigger is tracked in Git"
else
  echo "Not in Git (good)"
fi

# .env.trigger permissions
echo "=== .env.trigger permissions ==="
ENV_TRIGGER=$(find . /opt -name ".env.trigger" 2>/dev/null | head -1)
if [ -n "$ENV_TRIGGER" ]; then
  stat -c "%a %U" "$ENV_TRIGGER"
fi
```

Expectation:
- No hardcoded API keys in the code
- Secrets loaded via process.env
- .env.trigger not in Git, permissions 600

Risk: Secrets in code end up in the Git repo and in worker images.

---

## B6: Logging

Verify that tasks use the Trigger.dev logger and do not log sensitive data.

```bash
TASKS_DIR=$(find . -path "*/trigger/tasks" -type d 2>/dev/null | grep -v node_modules | head -1)
TASKS_DIR=${TASKS_DIR:-"trigger/tasks"}

# console.log instead of logger?
echo "=== console.log in tasks ==="
grep -rn "console\.log\|console\.error\|console\.warn" \
  ${TASKS_DIR}/ --include="*.ts" 2>/dev/null || echo "No console.log found (good)"

# Trigger.dev logger imported?
echo "=== logger import ==="
grep -rn "from.*@trigger.dev.*logger\|import.*logger" \
  trigger/ --include="*.ts" 2>/dev/null | grep -v node_modules | head -5

# Sensitive data in logs?
echo "=== Sensitive data in logger calls ==="
grep -rnE "logger\.(info|error|warn)" trigger/ --include="*.ts" 2>/dev/null | \
  grep -iE "password|secret|key|token|email.*:" | grep -v node_modules | head -5
```

Expectation:
- No console.log in tasks (always use the Trigger.dev logger)
- No sensitive data (passwords, tokens, user emails) in log calls

Risk: Trigger.dev stores all logs in the dashboard. Sensitive data is visible there to anyone with dashboard access, even months later.
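A sketch of dashboard-safe logging; the task and payload fields are illustrative:

```typescript
import { logger, task } from "@trigger.dev/sdk/v3";

export const notifyUser = task({
  id: "notify-user",
  run: async (payload: { userId: string; email: string }) => {
    // Log stable identifiers only -- never the raw email, token, or
    // password. Everything logged here is persisted in the dashboard
    // and remains readable to anyone with dashboard access.
    logger.info("sending notification", { userId: payload.userId });
    // ...
  },
});
```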

---

## C1: Dead Letter Handling

Verify that failed tasks are properly caught.

```bash
# failed_jobs table present?
echo "=== failed_jobs table ==="
docker compose exec -T postgres psql -U postgres -c \
  "SELECT tablename FROM pg_tables WHERE tablename = 'failed_jobs';" 2>/dev/null || \
  echo "Could not check the DB"

# Unresolved failed jobs?
echo "=== Open failed jobs ==="
docker compose exec -T postgres psql -U postgres -c \
  "SELECT count(*) AS open_count FROM failed_jobs WHERE resolved = false;" 2>/dev/null || \
  echo "Could not check (table missing or DB unreachable)"

# onFailure hooks in critical tasks?
TASKS_DIR=$(find . -path "*/trigger/tasks" -type d 2>/dev/null | grep -v node_modules | head -1)
TASKS_DIR=${TASKS_DIR:-"trigger/tasks"}

echo "=== Tasks WITHOUT onFailure ==="
for file in ${TASKS_DIR}/*.ts; do
  [ -f "$file" ] || continue
  name=$(basename "$file" .ts)
  if ! grep -q "onFailure" "$file"; then
    # Is it a critical task?
    if grep -qE "payment|invoice|order|billing|charge|subscription" "$file"; then
      echo "WARNING: critical task $name has no onFailure hook"
    else
      echo "INFO: $name has no onFailure hook"
    fi
  fi
done
```

Expectation:
- failed_jobs table exists (or equivalent system)
- No old unresolved failed jobs
- Critical tasks (payments, orders) have onFailure hooks

Risk: Failed tasks without alerting go undetected until a customer reports the issue.
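A sketch of an onFailure hook that records permanent failures into the failed_jobs table; the task name, table columns, and the `createTaskClient` import are illustrative assumptions:

```typescript
import { task } from "@trigger.dev/sdk/v3";
import { createTaskClient } from "../lib/supabase"; // hypothetical shared factory

export const processOrder = task({
  id: "process-order",
  run: async (payload: { orderId: string }) => {
    // ...
  },
  onFailure: async (payload, error) => {
    // Persist the permanent failure outside the Trigger.dev dashboard,
    // so it can be picked up by the team's own monitoring/alerting.
    const db = createTaskClient();
    await db.from("failed_jobs").insert({
      task_id: "process-order",
      payload,
      error: String(error),
      resolved: false,
    });
  },
});
```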

---

## C2: Monitoring

Verify that Trigger.dev monitoring is configured.

```bash
# Dashboard reachable (internally)?
echo "=== Dashboard ==="
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3040 2>/dev/null || echo "Not reachable"

# Dashboard NOT reachable externally?
echo "=== Dashboard external ==="
EXTERNAL_HOST=$(hostname -I | awk '{print $1}')
curl -s -o /dev/null -w "%{http_code}\n" --connect-timeout 3 "http://${EXTERNAL_HOST}:3040" 2>/dev/null || \
  echo "Not reachable externally (good)"

# Health check script present?
echo "=== Health check ==="
find /opt -name "*trigger*health*" -o -name "*check*trigger*" 2>/dev/null | head -5

# Cron job for monitoring?
echo "=== Cron ==="
crontab -l 2>/dev/null | grep -i "trigger"
```

Expectation:
- Dashboard reachable internally
- Dashboard NOT reachable externally
- Health check or monitoring configured

---

## Summary

Now produce a summary in the following format:

```
# Trigger.dev Security Audit - [DATE]

## Result

PASS:     X of Y checks
WARNING:  X checks
CRITICAL: X checks

## Critical Findings (act immediately)
- ...

## Warnings (resolve this week)
- ...

## Passed
- ...

## Recommended Next Steps
1. ...
2. ...
3. ...
```

Prioritize strictly: Critical findings first, then warnings.
For each finding: What is the problem, why is it a risk, what is the fix.
