A candidate is rejected. A special payment is denied. A salary-grade classification comes out lower than expected. In all three cases an AI agent was involved - and in all three cases the affected person can soon say a sentence that used to be reserved for lawyers: “Explain to me the role of the AI system and the main elements of this decision.”
That is not a hypothetical scenario. It is the core wording of Article 86 of Regulation (EU) 2024/1689 - the most underestimated article of the EU AI Act. The industry debates model documentation, risk classes, and deadlines. Article 86 asks a different question: can you explain one single, specific decision? Not your system. This one decision.
Most companies cannot. Not because their AI is bad - but because their architecture documents the wrong unit. They record conversations, tokens, and system behaviour. Yet only what was captured as a decision can be contested. This article describes the missing artefact: the decision act.
At a Glance - The Decision Act
- Only what was captured as a decision can be contested. The decision act is the atomic, immutable record of a single micro-decision: input, applied business rule with version and source, result, confidence, model version, timestamp, contestation path. Logs answer "What happened?" - the decision act answers "Why was it decided this way?".
- Art. 86 of the EU AI Act gives affected persons a right to an explanation of the individual decision - and the European Court of Justice already demands the same case-level justification under the GDPR today. A system description is not enough.
- Chat logs, observability traces, and model cards document interaction, infrastructure, and system - but not the substantive justification. Microsoft's own documentation confirms that the Copilot audit log captures neither prompt content nor model version. And according to Anthropic's research, reasoning text from language models reflects the actual basis of a decision in only about a quarter of cases.
- An AI agent never makes "one decision" - it makes dozens of micro-decisions per case. Each one can be legally relevant: the ECJ treats the automated score itself as a decision, not just the final outcome.
- Contestability cannot be retrofitted: a decision that was never modelled as a decision cannot be justified as one afterwards. That is why the decision act is an architectural principle of the Decision Layer - not a compliance feature.
Agentic AI means transferring decision rights
The numbers from the major consultancies tell one consistent story in 2026. McKinsey calls it the gen AI paradox: nearly eight in ten companies use generative AI, while almost as many see no measurable bottom-line effect - because horizontal copilots are rolled out broadly while the value-creating vertical use cases remain stuck in pilot mode. Gartner predicts that over 40 percent of agentic AI projects will be cancelled by the end of 2027 - explicitly including inadequate risk controls among the causes. The same forecast expects at least 15 percent of day-to-day work decisions to be made autonomously by AI agents by 2028.
Taken together, those two statements define the real tension: companies are transferring decision rights to machines without owning the infrastructure to answer for those decisions. McKinsey frames the agentic era in exactly those terms in its 2026 trust study - the governing question shifts from “Is the model accurate?” to “Who is accountable when the system acts?” (State of AI trust in 2026).
Every source quantifies the same governance gap:
| Finding | Figure | Source |
|---|---|---|
| Agentic AI projects cancelled by end of 2027 - including for inadequate risk controls | over 40% | Gartner, 2025 |
| Enterprises with mature governance for autonomous agents | only 21% | Deloitte, 2025 |
| Technology executives whose AI adoption is already outpacing their governance capability | 77% | IBM IBV, 2026 |
| CIOs/CTOs held accountable for AI systems they do not fully control | around two thirds | IBM IBV, 2026 |
| Reported AI incidents in 2024 - a record high, +56.4% year over year | 233 | Stanford HAI AI Index, 2025 |
| Fewer AI incidents where governance is embedded directly in the system rather than enforced by manual oversight | 25% | IBM IBV, 2026 |
The last row is the most important one: embedded governance measurably outperforms after-the-fact control. That raises the concrete question of what “embedded” means technically. The question is sharpest in HR: transferring decision rights to agents shifts the operating model, and every one of those decisions lands on a specific person: their sick note, their pay, their grading. The answer starts with the law.
What Article 86 requires - and why system documentation cannot satisfy it
The EU AI Act distinguishes two fundamentally different transparency levels that compliance debates constantly conflate:
| Level | Legal basis | Addressee | Point in time | Answers the question |
|---|---|---|---|---|
| System transparency | Art. 11 (technical documentation), Art. 13 (instructions for use) | authority, deployer | before operation | “How does the system work in general?” |
| Individual-case explanation | Art. 86 AI Act, Art. 15(1)(h) and Art. 22 GDPR | the affected person | after the decision | “Why did this one decision come out this way?” |
Article 86 gives any person subject to a decision based on a high-risk AI system listed in Annex III, with legal or similarly significant effect, a claim against the deployer: to “clear and meaningful explanations of the role of the AI system in the decision-making procedure and the main elements of the decision taken”. That is case-specific language, not system-specific language. A model card describes the model. Article 86 asks about the case.
The European Court of Justice has already spelled out the same principle for the GDPR - in two rulings that anyone responsible for AI decisions should know:
SCHUFA, C-634/21 (December 2023): the automated calculation of a score is itself an automated individual decision within the meaning of Art. 22 GDPR, as soon as a third party draws strongly on it. The consequence reaches far beyond credit scoring: the traceability obligation attaches to the machine evaluation itself - not only to the downstream final decision that a human signs.
Dun & Bradstreet Austria, C-203/22 (February 2025): the controller must explain the procedure actually applied in a way that lets the person understand which of their data was used, and how, in the decision. A counterfactual explanation can suffice - for instance: which variation in the input data would have changed the result? And trade secrets do not justify blanket refusal; in case of conflict, the information must be disclosed to the competent authority or court, which then weighs the interests.
A counterfactual explanation per individual case presupposes that the actually used inputs and the applied rule are stored, structured, per decision. It cannot be reconstructed from a global system description.
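A minimal sketch of why the stored input and rule matter: once both are captured per decision, a counterfactual for a simple threshold rule becomes a small computation. The `StoredDecision` type and its field names are hypothetical, and the `>=` comparison mirrors the wording of the sick-note example rather than stating how any statute must be read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredDecision:
    """Hypothetical minimal record: the input and rule actually applied."""
    input_name: str
    input_value: int
    rule_id: str
    rule_version: str
    threshold: int  # assumption: the rule fires when input_value >= threshold

    @property
    def triggered(self) -> bool:
        return self.input_value >= self.threshold


def counterfactual(d: StoredDecision) -> str:
    """State the smallest input change that would have flipped the result."""
    if d.triggered:
        flip_value = d.threshold - 1
        outcome = "would not have been triggered"
    else:
        flip_value = d.threshold
        outcome = "would have been triggered"
    delta = flip_value - d.input_value
    return (
        f"Under rule {d.rule_id} ({d.rule_version}), with {d.input_name} = "
        f"{flip_value} instead of {d.input_value} ({delta:+d}), the result {outcome}."
    )


decision = StoredDecision("sick days in 12 months", 42, "SGB-IX-§167", "v2025-07", 42)
print(counterfactual(decision))
```

Without the stored `input_value` and `rule_version`, the same sentence could only be guessed from a system description.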
Add to this a point that compliance debates routinely miss: Art. 12 obliges high-risk systems to log automatically over their lifetime, and Art. 26(6) obliges deployers to retain those logs for at least six months. But the law defines the required log content only for biometric systems. For HR and finance systems, the AI Act mandates the logging capability - and leaves the content schema open. What gets recorded per decision is a design responsibility of the deployer. Logging token counts and timestamps satisfies the letter of Art. 12 and fails Art. 86.
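The gap between a telemetry log and a decision record can be made visible with a simple completeness check. This is a sketch only: the field names are hypothetical and follow this article's own list of what a record should carry, not a schema defined by the AI Act.

```python
# Hypothetical field names, taken from the article's list of record contents.
# Confidence is left out because only AI-decided steps carry one.
REQUIRED_FIELDS = (
    "input", "rule_id", "rule_version", "rule_source",
    "result", "model_version", "timestamp", "contestation_path",
)


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or empty in a log record."""
    return [f for f in REQUIRED_FIELDS if record.get(f) in (None, "", [], {})]


# A typical telemetry entry: it has a timestamp, but nothing about the decision.
telemetry_only = {"timestamp": "2026-05-14T07:12:00Z", "tokens": 1834, "latency_ms": 912}
print(missing_fields(telemetry_only))
```

A check like this could run at write time, so that an incomplete record is rejected before it is stored rather than discovered during a contestation.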
On timing: under current law the high-risk obligations apply from 2 August 2026; the postponement to 2 December 2027 provisionally agreed in the Digital Omnibus on 7 May 2026 has not yet been formally adopted. It changes nothing about the high-risk classification of HR processes (Annex III No. 4) - nor about the civil-law duty to explain decisions, which applies worldwide already. For UK readers: Article 86 does not apply domestically - but UK GDPR Art. 22 and the ICO’s guidance on automated decision-making impose the equivalent duty to explain, and Art. 86 applies in full to UK companies deploying high-risk AI in the EU. Deadlines and obligations in detail: EU AI Act 2026: status, deadlines, what to do.
There is no such thing as “an AI decision” - there are dozens of micro-decisions
Anyone taking Article 86 seriously in operations immediately hits a granularity problem: what exactly is “the decision”?
The answer is that every case decomposes into micro-decisions, and this decomposition is not tied to any jurisdiction. A credit decision breaks down by the same mechanics: identity check (rulebook), income extraction from documents (AI, with confidence), score calculation (rulebook, versioned), threshold routing (rulebook), borderline vote (human). Which law binds the individual micro-decision - GDPR Art. 22, UK GDPR, LGPD Art. 20 or the US Equal Credit Opportunity Act - depends on the market. The decomposition itself is the same everywhere. How deep it reaches in a regulated process is shown by an example that keeps European HR departments busy every day:
Processing a sick note - that sounds like one transaction. In fact it contains more than a dozen separate decisions: Was the electronic certificate retrieved correctly under §109 SGB IV (German Social Code)? Does continued pay apply under §3 EFZG (Continued Remuneration Act) - and at what amount under §4? Is this a repeat illness with offsetting under §3(1) sentence 2? Has the six-week threshold been reached that triggers the mandatory reintegration-management invitation under §167 SGB IX? Must the representative body for severely disabled employees be involved under §178 SGB IX? Who gets informed about what - and what, explicitly, not?
The Decision Layer decomposes every business process into exactly these micro-decisions and assigns each one, in advance, to one of three decider types - human, rulebook, or AI:
| Micro-decision (sick note) | Decider | Why |
|---|---|---|
| eAU retrieval under §109 SGB IV incl. blocking period | Rulebook | deterministic, no room for interpretation |
| Calculate continued pay under §3, §4 EFZG | Rulebook | fixed deadline and amount arithmetic |
| Check offsetting for repeat illness | Rulebook | deterministic 12-month rule |
| Trigger reintegration (BEM) invitation at six weeks, §167 SGB IX | Rulebook | statutory threshold |
| Flag unusual absence patterns statistically | AI (indicator) | anomaly detection; output is a review hint, not a personnel decision |
| Conduct the reintegration meeting | Human | relationship of trust, §167 procedure |
| Illness-related termination | Human | three-step test under dismissal-protection law, reversed burden of proof under AGG §22 |
This decomposition is not an academic exercise - it is the precondition for contestability. The ECJ’s SCHUFA logic applies at every level: if an automated score is already a decision in the legal sense, then the machine deadline check, the machine document classification, the machine grading recommendation are each a potentially contestable act - not just the final outcome. A company that documents only the final outcome can answer the question “Why was my application rejected?” only with: “That is what the process produced.” Which is precisely the black-box answer no employment court accepts.
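The advance assignment of deciders can be sketched as a plain lookup that is fixed before any case runs. The step identifiers are hypothetical; the assignments mirror the sick-note table above, and this is an illustration rather than the Decision Layer's actual data model.

```python
from enum import Enum


class Decider(Enum):
    HUMAN = "human"
    RULEBOOK = "rulebook"
    AI_INDICATOR = "ai (indicator)"


# Hypothetical step names; deciders as assigned in the sick-note table.
SICK_NOTE_PROCESS = {
    "retrieve_eau": Decider.RULEBOOK,
    "calculate_continued_pay": Decider.RULEBOOK,
    "check_repeat_illness_offsetting": Decider.RULEBOOK,
    "trigger_bem_invitation": Decider.RULEBOOK,
    "flag_absence_pattern": Decider.AI_INDICATOR,
    "conduct_reintegration_meeting": Decider.HUMAN,
    "illness_related_termination": Decider.HUMAN,
}


def decider_for(step: str) -> Decider:
    """Look up the pre-assigned decider; unknown steps are refused, not guessed."""
    try:
        return SICK_NOTE_PROCESS[step]
    except KeyError:
        raise ValueError(f"step {step!r} has no assigned decider") from None


print(decider_for("flag_absence_pattern").value)
```

Refusing unknown steps is a deliberate choice in this sketch: a step that was never modelled should stop the process instead of silently defaulting to automation.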
How deep this decomposition goes in practice is documented in the HR agent catalog: 48 agents, each with a complete micro-decision table - decider type, justification, legal basis, and contestation path per step.
Why logs, traces, and model cards do not answer the question
The industry has responded to the documentation need - but on three levels that all miss the individual decision:
| Level | Tooling | Documents | Does NOT answer |
|---|---|---|---|
| Interaction | chat logs, conversation history, Copilot audit events | who talked to the system when, which resources were touched | which business rule carried the decision |
| Infrastructure | LLM observability (traces, spans), OpenTelemetry GenAI | prompts, tool calls, tokens, latency along the execution | why the result came out this way - and whether it is correct |
| System | model cards, technical documentation, governance platforms (model inventory, risk register) | how the model works in general, what risks the system has | any specific individual case |
This is not a polemic - it is what the vendor documentation says. Microsoft’s Purview documentation states that the Copilot audit log does not record the actual user prompts and responses, and that model name and model version are not available in Microsoft 365 Copilot scenarios. Within the Salesforce ecosystem the gap is named openly: classic audit trails were built for humans clicking buttons - you can see that something changed, but not why. And governance platforms such as IBM watsonx.governance or Credo AI answer the question “Is this system inventoried, risk-assessed, and policy-compliant?” - an important question, but a different one from “What was Ms M.’s grading on 14 June at 10:42 based on?”.
That leaves the seemingly obvious way out: let the model justify itself. Reasoning models produce justification chains - why not archive those as evidence? Because they demonstrably fail to reflect what the decision was actually based on. Anthropic’s research on chain-of-thought faithfulness shows that even under favourable conditions, reasoning models verbalise the decision cues they actually used in only about a quarter of cases - relevant influences stay hidden even from someone who reads the entire reasoning chain. A generated justification is text about the decision. It is not the basis of the decision.
Contestability research draws the decisive conclusion: explainability and contestability are two means to the same end - and explanation alone, without a path to contest, misses it (Schmude et al., 2025). Contestability presumes a decision may be incorrect, and demands the evidence to overturn it. Post-hoc explanation methods show, at best, that errors may exist somewhere in the neighbourhood of a decision; they do not prove that this one decision was flawed. And Moreira et al. (2025) shift the standard from mere comprehension to agency: an explanation must enable affected stakeholders to actively challenge, scrutinise, and influence an outcome - not merely understand why it was reached.
Nobody can do that with a token trace. With a decision act, they can.
The decision act: the piece of evidence per decision
The decision act is the atomic, immutable, structured record of a single substantive micro-decision. It is created at the moment of decision - not as an after-the-fact reconstruction - and contains everything that Art. 86, Art. 12, and the ECJ case law require per individual case:
| Field | Content | What it serves legally |
|---|---|---|
| Input | which data, which document, which hash | C-203/22: “which data was used, and how” |
| Business rule + version + source | e.g. collective agreement §12(3), version 2024-Q3, effective 2024-07-01 | the carrying justification - the field logs and traces do not have |
| Decider type | human, rulebook, or AI - with routing justification | Art. 14: proof of effective human oversight |
| Confidence | confidence value and threshold at decision time | evidence of risk management (Art. 9), escalation traceability |
| Model and prompt version | which language model, which prompt template, which version | reproducibility; the field missing from the Copilot audit log |
| Result + timestamp | what was decided, when (UTC) | retention under Art. 26(6), GoBD-style write-once requirements |
| Human-in-the-loop note | who reviewed, whether they confirmed or deviated | Art. 14(4): override, reverse, stop - documented |
| Contestation path | who can contest this decision, and where | makes the Art. 86 claim operationally answerable |
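As a rough illustration, the fields in the table could map onto an immutable record type like the following. All names are hypothetical, the contact address is a placeholder, and the UTC timestamp is assumed for the example; this is a sketch of the idea, not the Decision Layer's actual schema.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class Decider(str, Enum):
    HUMAN = "human"
    RULEBOOK = "rulebook"
    AI = "ai"


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class DecisionAct:
    act_id: str
    input_refs: tuple[str, ...]   # which data and documents were used
    input_hash: str               # hash of the exact input payload
    rule_id: str                  # applied business rule
    rule_version: str
    rule_source: str              # e.g. statute or collective agreement
    decider: Decider
    result: str
    decided_at: datetime          # stored in UTC
    contestation_path: str
    confidence: Optional[float] = None      # only set for AI deciders
    threshold: Optional[float] = None
    model_version: Optional[str] = None
    prompt_version: Optional[str] = None
    reviewed_by: Optional[str] = None       # human-in-the-loop note
    review_outcome: Optional[str] = None    # "confirmed" or "deviated"

    def to_json(self) -> str:
        """Serialise deterministically so the record can be hashed or signed."""
        data = asdict(self)
        data["decider"] = self.decider.value
        data["decided_at"] = self.decided_at.isoformat()
        return json.dumps(data, sort_keys=True, ensure_ascii=False)


payload = json.dumps({"eau_period": "2026-05-14/2026-05-27"}, sort_keys=True)
act = DecisionAct(
    act_id="case-0001/02",
    input_refs=("eAU period 14-27 May", "master data EMP_0x52a8"),
    input_hash=hashlib.sha256(payload.encode("utf-8")).hexdigest(),
    rule_id="EFZG-§3+§4",
    rule_version="v2026-01",
    rule_source="EFZG §3, §4",
    decider=Decider.RULEBOOK,
    result="6 weeks from 14 May 2026, 100% gross",
    decided_at=datetime(2026, 5, 14, 7, 12, tzinfo=timezone.utc),
    contestation_path="hr-appeals@example.org",  # placeholder address
)
print(act.to_json())
```

A frozen dataclass only prevents accidental reassignment inside the process; durable immutability would still have to come from the storage layer.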
This is what it looks like in the record of a real case - four of 13 micro-decisions of a sick-note process (German example):
eAU received 14 May 2026, 07:12. Case processed in 54 seconds - decomposed into 13 documented micro-decisions.
| No. | Micro-decision | Decider | Result | Record |
|---|---|---|---|---|
| 02 | Calculate continued pay under §3, §4 EFZG | Rule | 6 weeks from 14 May 2026, 100% gross | input: eAU period 14-27 May · master data EMP_0x52a8; rule: EFZG-§3+§4 · v2026-01 |
| 06 | Check reintegration threshold (§167 SGB IX) | Rule | 42 sick days within 12 months reached - invitation triggered | input: absence history, 12 months (3 periods); rule: SGB-IX-§167 · v2025-07 |
| 09 | Flag absence pattern | AI, 91% | Notice to HR operations: third short-term absence in six months | input: anonymised absence metadata - no diagnosis (GDPR Art. 9); model-reason: pattern match short-term repeat absence · indicator, not a personnel decision; formally contestable · Art. 14 EU AI Act |
| 13 | Conduct reintegration meeting | Human | Return-to-work plan agreed | decided-by: reintegration officer - 17 May 2026, 14:10 |
German administrative law offers a precise analogy for this artefact: the Verwaltungsakt, the formal administrative act. An authority deciding about a citizen does not issue “output” - it issues an act: individually addressed, justified, with legal basis and instructions on appeal, individually contestable. Nobody would accept an agency answering an appeal with: “Unfortunately we only have the server log.” The decision act transfers exactly this standard of form to machine decisions. A trail is a track you leave behind. An act is a document you issue - and against which an appeal is possible.
Accounting has known the same logic for centuries: no auditor accepts a total without journal entries and supporting documents. German GoBD rules require individual records, immutability, and traceability per business transaction - with retention periods of six to ten years under §147 AO and §257 HGB, far beyond the six months of Art. 26(6). The decision act is nothing other than double-entry bookkeeping for decisions: one document per transaction, one act per decision.
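One common way to make such per-transaction records tamper-evident is a hash chain, where each entry commits to the one before it. The sketch below illustrates that general technique under simple assumptions (in-memory list, SHA-256, hypothetical function names); it is not a description of how the Decision Layer or any GoBD-compliant archive implements write-once storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder predecessor hash for the first entry


def entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the record together with the hash of its predecessor."""
    body = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, record: dict) -> None:
    """Append a record, linking it to the current end of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "hash": entry_hash(prev, record)})


def verify(chain: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


chain = []
append(chain, {"act_id": "case-0001/02", "result": "100% gross"})
append(chain, {"act_id": "case-0001/06", "result": "invitation triggered"})
print(verify(chain))

chain[0]["record"]["result"] = "80% gross"  # simulate after-the-fact tampering
print(verify(chain))
```

The chain detects changes but does not prevent them; in practice it would be combined with write-once storage or externally anchored signatures.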
That no standard term exists yet for this unit is evident in the market itself: “decision trail”, “decision provenance”, “decision trace”, “accountability record” all circulate - TechTarget declares AI decision trails the new audit trail, the FINOS AI Governance Framework calls for “Agent Decision Audit and Explainability”. The requirement has arrived in the discourse. The artefact has no name yet. Since the beginning of the Decision Layer, we have called it the decision act.
Where the act is created becomes clear in the seven layers of a production agent stack - the decision act is the artefact of layer 04:
- 01 Presentation Layer: User interfaces, prompt gateways, access portals for business units and employee representation.
- 02 Orchestration Layer: Multi-agent coordination, event routing, state management across process boundaries.
- 03 Agent Layer (Gosign core): Document, workflow and knowledge agents - the executing layer with AI judgement.
- 04 Decision Layer (Gosign core): This is where the decision act is created: one act per micro-decision, versioned rulebooks, human-in-the-loop routing for discretion, contestability under Art. 14.
- 05 Governance Layer: Audit trail, signed records, cert-ready controls for EU AI Act Art. 9/12/13/14, ISO 27001, ISA.
- 06 Model Layer: Model registry, versioning, confidence calibration - model-agnostic.
- 07 Integration Layer: Connectors to SAP, DATEV, Workday - posting-ready outputs into the target systems.
Contestability is an architecture decision - not a feature
The uncomfortable consequence of all this: contestability cannot be retrofitted. Three reasons why an after-the-fact compliance tool structurally fails:
First: what was never modelled as a decision cannot be justified as one. An agent that processes a case “in one go” produces a result without structure. The decomposition into micro-decisions must happen before execution - it defines where an act is created, which rule applies there, and who decides there. No decision structure can be extracted from a conversation log afterwards, any more than cost-centre accounting can be reconstructed from a bank statement.
Second: effective human oversight needs the decision context - per case. Art. 14(4) requires that the overseeing person can correctly interpret the output, disregard or reverse it, and stop the system - while remaining aware of the tendency to rely automatically on machine output. The legislator names automation bias explicitly. The US NIST warns in the same spirit against ceremonial human-in-the-loop: an approval button without context, authority, and time protects nobody. Whoever is supposed to override needs the input, the applied rule, the confidence, and the escalation reason in front of them - precisely the fields of the decision act. That is why human-in-the-loop in the Decision Layer is an enforced routing, not an optional click: for defined decision types the workflow pauses, the target system remains untouched, and the human decision is itself documented as an act - including whether it followed the recommendation or deviated from it.
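Enforced routing of this kind can be sketched as a function that decides, before anything is written to the target system, whether a result may be applied or must wait for a person. The decision-type names, status strings, and threshold handling are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Routing:
    status: str  # "auto_applied" or "pending_human_review"
    reason: str  # escalation reason shown to the reviewer


# Hypothetical decision types that always require a human, whatever the confidence.
HUMAN_ONLY_TYPES = {"illness_related_termination", "salary_grading"}


def route(decision_type: str, confidence: Optional[float], threshold: float) -> Routing:
    """Decide whether a result may be written to the target system."""
    if decision_type in HUMAN_ONLY_TYPES:
        return Routing("pending_human_review", "decision type requires a human")
    if confidence is None:
        return Routing("pending_human_review", "no confidence value available")
    if confidence < threshold:
        return Routing(
            "pending_human_review",
            f"confidence {confidence:.2f} below threshold {threshold:.2f}",
        )
    return Routing("auto_applied", "confidence at or above threshold")


print(route("document_classification", 0.72, 0.85))
print(route("salary_grading", 0.99, 0.85))
```

The second call shows the point of the design: for a human-only decision type, even a very high confidence value does not bypass the pause.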
Third: correction needs the rule reference. Contestation is not an end in itself - it is meant to fix errors. When a decision is successfully contested, the immediate question is: was it an individual error or a rule error? A decision act that carries the applied rule version answers that with a query: all acts under the same rule version are identifiable, the rule is corrected in a new version, the affected cases are re-decided in a targeted way. A contested individual case becomes a fixed error class. Without the rule reference, the only option is manually reviewing every case - the opposite of scaling.
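The "answer with a query" step might look like this in its simplest form. The `ActRef` type and the sample data are hypothetical; a production system would run the equivalent filter against its act store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActRef:
    act_id: str
    rule_id: str
    rule_version: str


def affected_by(acts, rule_id: str, rule_version: str) -> list:
    """Return the ids of all acts decided under one specific rule version."""
    return [
        a.act_id
        for a in acts
        if a.rule_id == rule_id and a.rule_version == rule_version
    ]


# Hypothetical sample data: two cases under the faulty version, one under the fix.
acts = [
    ActRef("case-0001/02", "EFZG-§3+§4", "v2026-01"),
    ActRef("case-0002/02", "EFZG-§3+§4", "v2026-01"),
    ActRef("case-0003/02", "EFZG-§3+§4", "v2026-02"),
    ActRef("case-0001/06", "SGB-IX-§167", "v2025-07"),
]

to_redecide = affected_by(acts, "EFZG-§3+§4", "v2026-01")
print(to_redecide)
```

Only the two acts under the contested version are selected for re-decision; cases decided under the corrected version or under other rules stay untouched.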
Research calls this principle contestability by design (Alfrink et al., 2022): systems must remain open to human intervention across their lifecycle - as a procedural relationship between decision subjects and human controllers, built in rather than bolted on. And Deloitte (2025) names the three governance building blocks enterprises lack for agents almost verbatim to this architecture: clear boundaries defining which decisions agents may take independently versus which require human approval; real-time monitoring that tracks agent behaviour and flags anomalies; and audit trails that capture the full chain of agent actions to ensure accountability. The three-decider framework, confidence routing, and the decision act are the technical answer to exactly that list - as a design principle, not as a retrofitted compliance layer.
Who contests - and what happens next
Contestability sounds abstract but has four very concrete faces. The Decision Layer defines, for every micro-decision, to whom it must justify itself:
| Who contests | Legal basis | What they see | What follows |
|---|---|---|---|
| The affected person | Art. 86 AI Act, Art. 15, 22 GDPR | the explanation of their own decision: role of the AI system, main elements, counterfactual on request | correction of the individual case; on a rule error, re-run of all affected cases |
| The works council | §87(1) No. 6 BetrVG (German Works Constitution Act), works agreements | acts of personnel-relevant decisions, proof that works-agreement rules were technically enforced | co-determination based on evidence instead of distrust |
| The auditor | GoBD, IDW PS, ISA | complete acts per posting: document, rule, version, confidence, approval | system audit instead of case-by-case reconstruction; any sample directly evidenced |
| The supervisory authority | Art. 26(6), Art. 12 AI Act; market surveillance | exportable, machine-readable logs across the required retention period | ability to respond within days instead of a forensics project |
Note what this table does to an organisation: it reverses the burden-of-proof dynamic. Without decision acts, every contestation is a forensics project with an open outcome - and under AGG §22, in discrimination cases the employer carries the burden of proof as soon as indications exist. With decision acts, the answer is a query. The works council that blocks AI projects today because it gets no transparency becomes the beneficiary of the same infrastructure that serves the auditor. Others promise transparency. The decision act enforces it technically.
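To illustrate what "the answer is a query" can mean for the affected person, the sketch below renders a short plain-language explanation from a stored act. The dictionary keys, the wording of the sentences, and the contact address are hypothetical; whether such a text meets the legal standard in a given case remains a legal assessment.

```python
# Hypothetical wording for the "role of the AI system" element, per decider type.
ROLE_TEXT = {
    "rulebook": "No AI model decided this step; a versioned rule was applied.",
    "ai": "An AI model produced this result as an indicator for human review.",
    "human": "A human decided this step.",
}


def explain(act: dict) -> str:
    """Render the main elements of one decision from its stored act."""
    lines = [
        f"Decision: {act['result']}",
        f"Role of the AI system: {ROLE_TEXT[act['decider']]}",
        f"Rule applied: {act['rule_id']} ({act['rule_version']})",
        f"Data used: {', '.join(act['inputs'])}",
        f"How to contest: {act['contestation_path']}",
    ]
    return "\n".join(lines)


sample_act = {
    "result": "42 sick days within 12 months reached - invitation triggered",
    "decider": "rulebook",
    "rule_id": "SGB-IX-§167",
    "rule_version": "v2025-07",
    "inputs": ["absence history, 12 months (3 periods)"],
    "contestation_path": "hr-appeals@example.org",  # placeholder address
}
print(explain(sample_act))
```

Every line of the output is read from the record; nothing is generated after the fact by a language model.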
The table has a quiet flip side: who may contest is set by law and agreement - but no norm by itself ensures that decision acts come into existence, are complete and endure. That responsibility needs a clear owner in the organisation; it does not happen as a by-product of operations.
Context: a principle is taking hold
An honest look at the 2026 market: the insight that agent decisions must be individually provable is not ours alone. Gartner has created a market category of its own for control layers that monitor and block agent actions - guardian agents - and predicts they will account for 10 to 15 percent of the agentic AI market by 2030. The decision intelligence field - explicitly modelling, evaluating, and improving decisions - got its first Magic Quadrant in January 2026. The WEF calls for progressive governance: oversight intensity coupled to the agent’s autonomy level. Research is working on decision provenance. The direction is unambiguous - and it is exactly the direction of the Decision Layer.
What these approaches do not deliver is the binding to the business substance. A guardian agent blocks at runtime; it leaves no act that holds up in an employment court three years later. Observability preserves the execution chain; it does not decompose decisions into individually justified, rule-bound artefacts. The defensible difference of the decision act lies in three properties: the attribution of every micro-decision to a versioned business rule with its source - collective agreement, works agreement, tax law - rather than to a generated reasoning chain; architecturally enforced human-in-the-loop per decision type rather than a configurable setting; and the mapping of acts onto the audit standards that auditors and internal audit actually work with - ISA, IDW PS, GoBD. To our knowledge, no horizontal vendor documents the first field; it is also the field on which contestability hinges.
Precision also requires the disclaimer: the decision act is an architecture statement, not a certificate of conformity. Whether a specific system meets the legal requirements remains a case-by-case assessment by the deployer and its legal advisers. What the architecture delivers is something else - and more valuable: it makes the requirements satisfiable. A company with decision acts can answer the Article 86 question. A company with chat logs cannot, however well-documented its intentions.
The real question
The EU AI Act did not invent the explainability of individual decisions - it codified it for AI systems and armed it with a right of the affected person. The ECJ set the standard before the first high-risk deadline even bites. And the consultancies have quantified what happens when companies transfer decision rights without being able to prove decisions: cancelled projects, blocked rollouts, personally accountable CIOs.
The question for any company deploying AI agents is therefore not “Are we compliant?”. It is: if tomorrow an employee, a works council, an auditor, or an authority contests one single decision made today - can you produce the act? Input, rule, version, confidence, result, approval?
Whoever can answer that with yes no longer has an AI risk problem. They have a decision infrastructure.

Bert Gogolin
CEO & Founder, Gosign