AI Infrastructure: Governance Handbook for the CTO

Build, Buy, Hybrid - EU AI Act-compliant infrastructure before August 2026

Author: Bert Gogolin, Managing Director
Publisher: Gosign GmbH, Hamburg
Date: March 2026
Length: 28 pages

Contents

1 - Why the CTO must lead AI Infrastructure Governance
2 - Build, Buy, Hybrid: the B/B/H framework
3 - EU AI Act: 6 technical requirements
4 - Security & Data Sovereignty
5 - 4 infrastructure patterns in production
6 - Infrastructure Readiness Assessment
7 - Next steps

USD 644 bn: AI infrastructure spending by 2027 (Gartner 2024)
28%: Cloud spending wasted (Flexera 2024)
40%: Security incidents from misconfiguration (ENISA 2024)

1 - Why the CTO must lead AI Infrastructure Governance

AI infrastructure is growing faster than the governance structures that control it.

According to HashiCorp (2024), 82% of enterprises operate multi-cloud environments - but only 31% have a centralized governance strategy. The result: Shadow AI. Business units use external LLM APIs without approval. Data science teams deploy models on uncontrolled endpoints.

The Stanford HAI AI Index Report (2025) documents that investment in AI infrastructure is growing 29% annually, while governance budgets grow only 8%. This gap creates technical debt.

Five governance layers

Layer | Responsibility | Who
Architecture Governance | Approved patterns, models, APIs | CTO + Enterprise Architecture
Operations Governance | SLAs, monitoring, incident response, cost mgmt. | Infrastructure + DevOps
Compliance Governance | EU AI Act, GDPR, audit trail, data residency | CTO + CISO + Legal
Cost Governance | Budgeting, chargeback, waste detection | FinOps + CTO
Security Governance | Zero Trust, encryption, access management | CISO + Platform

CTO checklist

Before the first AI agent goes to production:

According to Flexera (2024), enterprises waste an average of 28% of their cloud spending. For AI workloads with GPU instances, the waste rate is even higher.

2 - Build, Buy, Hybrid: the B/B/H framework

Every AI infrastructure component requires a fundamental decision: build in-house, buy, or combine.

Criterion | Build | Buy | Hybrid
Control | Full | Limited | Differentiated
Data Residency | Guaranteed | Contract-dependent | Controllable
Time-to-Value | 3-6 months | 1-4 weeks | 4-8 weeks
Operating costs | Fixed + personnel | Variable (pay-per-use) | Mixed
Vendor lock-in | None | High | Medium

Decision matrix by workload

Workload | Recommendation | Rationale
LLM inference (standard) | Buy | Cost-efficient at variable volume
LLM inference (sensitive) | Build | Data must not leave the EU
Agent Orchestration | Hybrid | Framework self-hosted, LLM calls routed
Document Intelligence | Build | Documents contain PII
Vector Database | Hybrid | Managed for non-sensitive, self-hosted for PII
Monitoring | Buy | Specialized tools with EU region

Hidden costs & hidden risks

Build - hidden costs

GPU hardware: An NVIDIA H100 costs USD 25,000-40,000 per card. A production cluster needs at least 4-8 cards.

Personnel: MLOps engineers, platform engineers, security specialists. 40% of enterprises lack the skills (Gartner 2024).

Maintenance: Model updates, security patches, infrastructure upgrades. Ongoing.
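
As a rough worked example of the hardware figures above, the following sketch multiplies the quoted per-card price range by the quoted minimum cluster size. It covers the GPU cards only; servers, networking, power, personnel and maintenance are not included, and the function name is purely illustrative.

```python
def gpu_card_cost(cards: int, price_per_card_usd: int) -> int:
    """Cost of the GPU cards alone, in USD."""
    return cards * price_per_card_usd


# Card counts and per-card prices are the ranges quoted above.
low = gpu_card_cost(cards=4, price_per_card_usd=25_000)
high = gpu_card_cost(cards=8, price_per_card_usd=40_000)
print(f"GPU cards alone: USD {low:,} to USD {high:,}")
```

Under these figures the cards alone come to between USD 100,000 and USD 320,000 before any of the other build costs are added.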

Buy - hidden risks

Data Residency: Where are prompts processed? Are they used for training?

Vendor lock-in: Proprietary APIs and embedding formats. A migration takes 3-6 months.

Availability: An average of 12 hours of downtime per quarter (Stanford HAI 2025).

The hybrid recommendation

For most enterprise scenarios, a hybrid approach is recommended: a self-hosted Model Gateway as the central control layer, routing by data sensitivity, a fallback strategy for provider outages, and cost optimization through intelligent model routing.

3 - EU AI Act: 6 technical requirements

What lawyers read as compliance obligations are infrastructure requirements for the CTO.

From August 2026, six mandatory requirements apply to high-risk AI systems (subject to the Digital Omnibus Package, which may postpone them to December 2027):

Requirement | Art. | Infrastructure measure
Risk management | 9 | Confidence routing, circuit breaker, canary deployments
Data governance | 10 | Data lineage, immutable storage, data catalog
Record-keeping | 12 | Structured logging, retention 10y+, tamper-proof
Transparency | 13 | Observability stack, decision explanation API, model cards
Human oversight | 14 | HITL gateway (architectural), kill switch < 1s, auditor portal
Accuracy/robustness | 15 | Benchmark pipeline, adversarial testing, multi-region redundancy

Risk management (Art. 9) - technical

Confidence routing: Every agent output receives a confidence score. Below threshold: escalation.

Circuit breaker: Automatic deactivation upon anomalies.

Canary deployments: New model versions rolled out gradually, automatic rollback upon degradation.
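
Confidence routing and the circuit breaker could be combined roughly as follows. This is a minimal sketch, not a reference implementation: the threshold of 0.8, the window size, the "share of low-confidence outputs" anomaly rule and all names are assumptions made for illustration. Canary deployments are not covered here.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CircuitBreaker:
    """Opens (deactivates the agent) when too many recent outputs were low-confidence."""
    window: int = 20             # number of recent outputs considered (assumed value)
    max_low_ratio: float = 0.5   # open above this share of low-confidence outputs (assumed)
    recent: deque = field(default_factory=deque)
    is_open: bool = False

    def record(self, low_confidence: bool) -> None:
        self.recent.append(low_confidence)
        if len(self.recent) > self.window:
            self.recent.popleft()
        window_full = len(self.recent) == self.window
        if window_full and sum(self.recent) / self.window > self.max_low_ratio:
            self.is_open = True  # stays open until an operator resets it


def route(confidence: float, breaker: CircuitBreaker, threshold: float = 0.8) -> str:
    """Return 'auto', 'escalate' or 'blocked' for one agent output."""
    if breaker.is_open:
        return "blocked"
    low = confidence < threshold
    breaker.record(low)
    if breaker.is_open:
        return "blocked"
    return "escalate" if low else "auto"
```

Keeping the breaker open until a human resets it is a deliberate choice in this sketch: an automatic re-close would let a degraded agent resume without review.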

Record-keeping (Art. 12) - technical

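Structured logging: Every API call, every agent decision, every HITL intervention.

Retention: Lifetime of the system + 10 years as a conservative target. Art. 19 itself sets a minimum of six months for automatically generated logs; the 10-year period comes from Art. 18 and applies to technical documentation.

Tamper-proof: Append-only logs in immutable storage.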
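
One common way to make an append-only log tamper-evident is to chain entries by hash, so that changing any earlier entry invalidates everything after it. The sketch below illustrates that idea in memory; the class and field names are assumptions, and a hash chain only detects tampering. Preventing it still requires immutable (WORM) storage underneath.

```python
import hashlib
import json
from typing import Any


def _digest(prev_hash: str, payload: dict[str, Any]) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the event."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class AuditLog:
    GENESIS = "0" * 64  # placeholder hash before the first entry

    def __init__(self) -> None:
        self._entries: list[dict[str, Any]] = []

    def append(self, event: dict[str, Any]) -> dict[str, Any]:
        prev = self._entries[-1]["hash"] if self._entries else self.GENESIS
        entry = {"event": event, "prev": prev, "hash": _digest(prev, event)}
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; False if any entry was altered or reordered."""
        prev = self.GENESIS
        for entry in self._entries:
            if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["event"]):
                return False
            prev = entry["hash"]
        return True
```

An auditor can run `verify()` over an exported copy of the log without needing write access to the production system.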

Human oversight (Art. 14) - technical

HITL gateway: Architecturally enforced human approval. No bypass.

Kill switch: Immediate deactivation, latency < 1 second.

Auditor portal: Read-only dashboard for compliance auditors.
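
"Architecturally enforced" can be read as: the only code path that executes an agent action runs through a gateway that checks for a recorded human approval and for the kill switch. A minimal single-process sketch under that reading follows; all class and method names are hypothetical, and in a distributed system the hard part is propagating the stop signal to every worker within the latency budget, which this sketch does not address.

```python
import threading
from dataclasses import dataclass
from typing import Any, Callable


class KillSwitch:
    """Process-wide stop flag; checking it is a single in-memory read."""

    def __init__(self) -> None:
        self._stopped = threading.Event()

    def trigger(self) -> None:
        self._stopped.set()

    @property
    def active(self) -> bool:
        return self._stopped.is_set()


@dataclass(frozen=True)
class Approval:
    action_id: str
    approver: str  # identity of the human reviewer


class HitlGateway:
    def __init__(self, kill_switch: KillSwitch) -> None:
        self._kill_switch = kill_switch
        self._approvals: dict[str, Approval] = {}

    def approve(self, action_id: str, approver: str) -> None:
        self._approvals[action_id] = Approval(action_id, approver)

    def execute(self, action_id: str, action: Callable[[], Any]) -> Any:
        # The kill switch is checked first so it overrides existing approvals.
        if self._kill_switch.active:
            raise RuntimeError("kill switch active: all agent actions are halted")
        if action_id not in self._approvals:
            raise PermissionError(f"action {action_id!r} has no human approval")
        return action()
```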

Compliance checklist

Sanctions: Up to EUR 15 million or 3% of global annual turnover.

4 - Security & Data Sovereignty

40% of security incidents in cloud environments are caused by misconfiguration - not by attacks (ENISA 2024).

4 pillars of Data Sovereignty

Pillar | Requirement | Implementation
Data Residency | All processing in EU data centers | Self-hosted models or EU region at provider
Encryption | At rest, in transit, in use | AES-256, TLS 1.3, mTLS, Confidential Computing
Zero Trust | No implicit trust | Identity-based access, least privilege, micro-segmentation
Supply Chain | Model and software provenance verified | Model provenance, SBOM, container scanning, signed artifacts

Data Residency in detail

Component | EU requirement | Implementation
LLM inference | Prompts must not leave the EU | Self-hosted or EU region at provider
Vector Database | Embeddings contain encoded knowledge | EU region or self-hosted
Logging | Logs contain PII | EU storage with WORM policy
Backups | Same rules as production data | EU region, encrypted

GDPR compliance for LLM usage

Scenario | Risk | Measure
PII in prompts | Art. 6 - Legal basis | PII stripping before API call
Customer data in RAG | Art. 5 - Purpose limitation | Access control at document level
Logs with user data | Art. 17 - Right to erasure | Pseudonymization + retention policy
Embeddings with PII | Art. 22 - Automated decisions | Transparency documentation
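
The "PII stripping before API call" measure could look roughly like the sketch below, which replaces e-mail addresses and phone numbers with placeholders before a prompt leaves the controlled environment. The two regular expressions are deliberately simple illustrations and will miss many PII types (names, addresses, IDs); a production setup would typically rely on a dedicated detection component. All names are assumptions.

```python
import re

# Illustrative patterns only; they are not exhaustive PII detectors.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s/()-]{7,}\d")


def strip_pii(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace matches with placeholders; return redacted text and the mapping."""
    mapping: dict[str, str] = {}

    def _replace(kind: str):
        def inner(match: re.Match) -> str:
            token = f"<{kind}_{len(mapping) + 1}>"
            mapping[token] = match.group(0)
            return token
        return inner

    redacted = EMAIL.sub(_replace("EMAIL"), prompt)
    redacted = PHONE.sub(_replace("PHONE"), redacted)
    return redacted, mapping


redacted, mapping = strip_pii(
    "Contact anna@example.com or +49 40 1234567 about the invoice."
)
print(redacted)
```

The mapping itself contains the original PII, so it has to stay inside the same boundary as the source data; only the redacted text is sent to the external API.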

5 - 4 infrastructure patterns in production

Pattern 1: Agent Orchestration

Agent frameworks are built for experiments, not for production. Enterprise agents need defined permissions, audit trails, rollback and cost control.

Component | Function | Technology
Orchestrator | Workflow, task routing, parallelization | Temporal, Prefect, Custom
Permission Layer | Agent permissions for tools/APIs | OPA, Cedar
State Management | Context, memory, task progress | Redis, PostgreSQL
Observability | Traces, token usage, latency | OpenTelemetry, Langfuse

Result: MTTR for agent failures reduced by 70%. Uncontrolled API costs reduced by 40-60% (Gosign projects).
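
To illustrate what the permission layer decides, here is a plain-Python, default-deny check that combines a tool allow-list with a per-agent budget and records every decision. It does not use OPA or Cedar and does not mirror their APIs; the policy fields and function names are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    allowed_tools: frozenset[str]
    max_cost_usd: float               # budget for this agent (hypothetical field)
    spent_usd: float = 0.0
    audit_trail: list[tuple[str, str, bool]] = field(default_factory=list)


def authorize(agent: str, policy: AgentPolicy, tool: str, est_cost_usd: float) -> bool:
    """Default deny: allow only listed tools while the budget is not exceeded."""
    within_budget = policy.spent_usd + est_cost_usd <= policy.max_cost_usd
    allowed = tool in policy.allowed_tools and within_budget
    policy.audit_trail.append((agent, tool, allowed))  # denied calls are logged too
    if allowed:
        policy.spent_usd += est_cost_usd
    return allowed
```

Logging denied calls as well as allowed ones is what makes the trail useful for spotting agents that repeatedly attempt actions outside their permissions.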

Pattern 2: Document Intelligence

80% of enterprise data is unstructured (IDC 2024). A Document Intelligence pipeline classifies, extracts and vectorizes documents automatically.

Stage | Function | Technology
Ingestion | Ingest PDF, Word, scans, email | Tika, Unstructured.io
OCR | Convert scans to text | Tesseract, PaddleOCR
Classification | Identify document type | Fine-tuned classifier
Extraction | Extract structured data | LLM + schema validation
Embedding | Vectorize documents | Sentence Transformers
Storage | Vectors + metadata | pgvector, Qdrant

Result: 92-97% classification accuracy. Manual processing reduced by 60-80%.
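
The "LLM + schema validation" extraction stage means model output is never trusted as-is: it is parsed and checked against an expected structure before it enters downstream systems. A minimal sketch using only the standard library follows. The invoice schema, its field names and the validation rules are hypothetical examples, and the model call itself is left out.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    invoice_number: str
    total: float
    currency: str


def parse_invoice(llm_output: str) -> Invoice:
    """Validate raw model output; raise ValueError so the caller can retry or escalate."""
    try:
        data = json.loads(llm_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model did not return valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    number = data.get("invoice_number")
    total = data.get("total")
    currency = data.get("currency")
    if not isinstance(number, str) or not number.strip():
        raise ValueError("invoice_number must be a non-empty string")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(total, bool) or not isinstance(total, (int, float)) or total < 0:
        raise ValueError("total must be a non-negative number")
    if not isinstance(currency, str) or len(currency) != 3:
        raise ValueError("currency must be a 3-letter code")
    return Invoice(number.strip(), float(total), currency.upper())
```

A validation failure is a natural escalation point: the document can be retried with a different prompt or routed to manual processing instead of silently storing bad data.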

Pattern 3: Model Gateway

Central layer between applications and LLM providers. Routing, PII detection, rate limiting, caching, fallback, logging.

Request type | Routing | Rationale
Contains PII | Self-hosted model | Data stays in the EU
Standard classification | Cheapest model | Cost optimization
Complex analysis | Most capable model | Quality prioritized
Provider A down | Provider B | Availability

Result: LLM costs reduced by 30-50% through routing and caching. Compliance through centralized PII screening.
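
The routing table above can be sketched as a small dispatch function. The provider names, the boolean request flags and the fallback order between external providers are assumptions made for illustration; PII detection, rate limiting, caching and logging are left out. Providers are modelled as plain callables so the example stays self-contained.

```python
from typing import Callable

Provider = Callable[[str], str]  # takes a prompt, returns a completion


def choose_route(contains_pii: bool, complex_analysis: bool) -> list[str]:
    """Return provider names in order of preference for one request."""
    if contains_pii:
        # No external fallback: PII requests only ever go to the self-hosted model.
        return ["self_hosted"]
    if complex_analysis:
        return ["most_capable", "cheapest"]
    return ["cheapest", "most_capable"]


def dispatch(prompt: str, contains_pii: bool, complex_analysis: bool,
             providers: dict[str, Provider]) -> tuple[str, str]:
    """Try each candidate in order; fall back when a provider is unreachable."""
    for name in choose_route(contains_pii, complex_analysis):
        try:
            return name, providers[name](prompt)
        except ConnectionError:
            continue  # provider down, try the next candidate
    raise RuntimeError("no provider available for this request")
```

Note the asymmetry: non-sensitive requests degrade gracefully to another provider, while PII requests fail closed rather than leaving the self-hosted environment.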

Pattern 4: Monitoring & Observability

AI systems fail silently. An LLM with poor responses does not throw an error.

Layer | What is measured | Tools
Infrastructure | CPU, GPU, memory, network | Prometheus, Grafana
Application | Latency, error rate, throughput | OpenTelemetry, Jaeger
Model | Confidence, tokens, hallucination | Langfuse, WhyLabs
Business | Zero-touch rate, escalation | Custom dashboards
Cost | API costs per team/project | Infracost, Custom
Compliance | Audit completeness, HITL rate | Custom + SIEM

Result: Quality issues detected 4x faster. Incident impact duration reduced by 65% (Gartner 2024).
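
For the cost layer, attributing API spend per team comes down to tagging every call with its owner and aggregating tokens times price. The sketch below shows that aggregation; the model names and prices are invented placeholders, and real pricing differs by provider and usually distinguishes input from output tokens.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1,000 tokens; real prices vary by provider and model.
PRICE_PER_1K = {"small-model": 0.5, "large-model": 10.0}


def cost_by_team(calls: list[dict]) -> dict[str, float]:
    """Sum API cost per team from tagged call records."""
    totals: dict[str, float] = defaultdict(float)
    for call in calls:
        totals[call["team"]] += call["tokens"] / 1000 * PRICE_PER_1K[call["model"]]
    return dict(totals)


calls = [
    {"team": "sales", "model": "small-model", "tokens": 2000},
    {"team": "sales", "model": "large-model", "tokens": 500},
    {"team": "legal", "model": "large-model", "tokens": 1000},
]
print(cost_by_team(calls))
```

The same records can feed chargeback reports, provided the gateway refuses untagged calls so that no spend ends up unattributed.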

6 - Infrastructure Readiness Assessment

10 questions for the CTO. Rate each with 0 (no), 1 (partially) or 2 (yes).

# | Question | 0 | 1 | 2
1 | Complete inventory of all AI systems and APIs (including Shadow AI). |  |  |
2 | Approved reference architecture for AI workloads with defined patterns. |  |  |
3 | All AI data processing verifiably in EU data centers. |  |  |
4 | Model Gateway with PII screening and centralized logging. |  |  |
5 | Structured logging for every API call and every agent decision. |  |  |
6 | Kill switch for individual agents and entire AI system (< 1s). |  |  |
7 | GPU/API costs tracked per team, project and use case. |  |  |
8 | Automated benchmark and adversarial evaluation before deployment. |  |  |
9 | Backup and DR strategy specific to AI infrastructure. |  |  |
10 | All 6 EU AI Act requirements (Art. 9-15) verifiably met. |  |  |

Score | Rating | Recommendation
16-20 | Production-ready | Optimization and scaling. Ready for regulated workloads.
10-15 | Foundation in place | Close gaps: logging, PII screening, kill switch.
5-9 | Catching up needed | Reference architecture, Model Gateway, Shadow AI inventory.
0-4 | Action required | Start immediately. Inventory + reference architecture.
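
The scoring can be expressed as a small function: ten answers of 0, 1 or 2 are summed and mapped to the rating bands above. The function name is illustrative; the bands are taken directly from the score table.

```python
def readiness_rating(answers: list[int]) -> tuple[int, str]:
    """Sum ten answers (0 = no, 1 = partially, 2 = yes) and map the score to a rating."""
    if len(answers) != 10 or any(a not in (0, 1, 2) for a in answers):
        raise ValueError("expected exactly 10 answers, each 0, 1 or 2")
    score = sum(answers)
    if score >= 16:
        rating = "Production-ready"
    elif score >= 10:
        rating = "Foundation in place"
    elif score >= 5:
        rating = "Catching up needed"
    else:
        rating = "Action required"
    return score, rating


# Example: strong on inventory and architecture, no kill switch or DR strategy yet.
print(readiness_rating([2, 2, 1, 1, 2, 0, 1, 1, 0, 1]))
```

The example answers sum to 11, which falls in the "Foundation in place" band.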

Investment distribution (recommendation vs. actual)

Item | Actual | Recommendation
Models & compute | 70% | 35-40%
Infrastructure platform | 15% | 25-30%
Governance & compliance | 5% | 15-20%
Observability & monitoring | 5% | 10-15%
Security | 5% | 10-15%

7 - Next steps

The 90-day plan

Month | Focus | Outcome
1 | Inventory & architecture | AI inventory, reference architecture, data residency verified, cost baseline
2 | Gateway & governance | Model Gateway live, structured logging, kill switch, observability stack
3 | Compliance & pilot | EU AI Act checklist, benchmark pipeline, adversarial testing, compliance audit

Recommended infrastructure stack

Layer | Recommendation | Alternatives
Model Gateway | LiteLLM, Portkey | Custom (Go/Python)
Agent Orchestration | Temporal + Custom | Prefect, Airflow
Vector Database | pgvector (PostgreSQL) | Qdrant, Weaviate
Observability | OpenTelemetry + Grafana | Datadog, Langfuse
Policy Engine | OPA | Cedar, Casbin
Secret Management | Vault | AWS KMS, SOPS
Container Runtime | Kubernetes | Nomad, ECS
CI/CD | GitHub Actions | GitLab CI, Tekton

Consultation

We analyze your AI infrastructure and identify the critical gaps.

Compliance, security and cost governance - 30 minutes, free of charge, no obligation.

Bert Gogolin - Managing Director, Gosign GmbH

Contact: www.gosign.de/en/contact

Web: www.gosign.de