What Are AI Agents? Document Agents, Workflow Agents, Knowledge Agents Explained
AI agents: Document Agents, Workflow Agents, Knowledge Agents. How they execute domain tasks autonomously and differ from chatbots and RPA
What Is an AI Agent?
An AI agent is a specialized software component, built on a Large Language Model (LLM), that autonomously executes domain-specific tasks. Unlike a chatbot, which only answers questions, an agent takes action: it reads a document, evaluates the content, makes a decision, and triggers an action in a target system.
AI agents in an enterprise context do not operate freely. They work within defined boundaries: rule sets, scopes, confidence thresholds, escalation rules. Every agent decision is traceable and auditable.
The foundation of an AI agent is a language model such as Claude, ChatGPT, Gemini, Llama, Mistral, DeepSeek, or gpt-oss. The model provides language comprehension. The agent provides the domain logic, system integration, and governance.
Distinction: AI Agent vs. Chatbot vs. RPA
These three concepts are frequently confused. They solve different problems.
Chatbot: A chatbot answers questions in natural language. It has no capacity for action. When an employee asks “How many vacation days do I have left?”, the chatbot answers. It does not book vacation, it does not check rules, it does not generate an audit trail.
RPA (Robotic Process Automation): RPA automates rule-based, repetitive tasks via user interfaces. An RPA bot clicks through SAP, copies data from A to B, fills out forms. RPA has no language comprehension. When the form changes, the bot breaks.
AI Agent: An AI agent understands context, interprets unstructured data, and makes decisions. It reads an invoice, regardless of format, understands the content, applies rule sets, and generates a booking proposal. When the invoice format changes, the agent continues to work because it understands the content, not the layout.
| Property | Chatbot | RPA | AI Agent |
|---|---|---|---|
| Language comprehension | Yes | No | Yes |
| Capacity for action | No | Yes (rule-based) | Yes (context-based) |
| Unstructured data | Yes | No | Yes |
| Governance/Audit | No | Partial | Yes (via Decision Layer) |
| System integration | Superficial | Via UI | Via API/Integration Layer |
| Adaptation to changes | Change prompt | Reprogram bot | Agent understands context |
Three Types of Enterprise AI Agents
In the Gosign architecture, there are three agent types. Each type has a defined scope of responsibility and operates within the boundaries set by the Decision Layer.
Document Agents
Document Agents read, understand, and process documents. Invoices, sick leave certificates, contracts, attestations, receipts, credit notes.
The key difference from OCR or template recognition: Document Agents have genuine language comprehension. They do not recognize fields at specific positions on a page but understand the content of the document. Whether an invoice arrives as a PDF, as a scanned image, or as an email attachment, the Document Agent understands all three.
A Document Agent for invoice processing reads an incoming invoice and extracts: vendor, amount, service description, date, tax rate, banking details. It produces a structured data record that is handed off to the Decision Layer.
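The structured data record described above could look like the following sketch. The `InvoiceRecord` type, its field names, and the `confidence` score are illustrative assumptions, not the Gosign schema, and the sample values are made up.

```python
from dataclasses import dataclass, asdict
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class InvoiceRecord:
    """Hypothetical structured output of a Document Agent for one invoice."""
    vendor: str
    amount: Decimal
    service_description: str
    invoice_date: date
    tax_rate: Decimal   # e.g. Decimal("0.19") for 19 %
    iban: str           # banking details, reduced to an IBAN in this sketch
    confidence: float   # assumed extraction confidence in [0, 1]

    def __post_init__(self) -> None:
        # Reject obviously invalid extractions before they reach the Decision Layer.
        if self.amount < 0:
            raise ValueError("amount must not be negative")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


# Made-up sample values for illustration only.
record = InvoiceRecord(
    vendor="Example Supplies GmbH",
    amount=Decimal("1190.00"),
    service_description="Office furniture",
    invoice_date=date(2025, 1, 15),
    tax_rate=Decimal("0.19"),
    iban="DE00 0000 0000 0000 0000 00",
    confidence=0.97,
)
payload = asdict(record)  # plain dict, ready to serialize for the next layer
```

Validating the record at construction time keeps malformed extractions from ever reaching the rule sets downstream.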
Document Agents do not work in isolation. They are the entry point of a workflow - after reading the document, the Workflow Agent takes over.
Workflow Agents
Workflow Agents orchestrate processes across systems. They coordinate the flow between Document Agent, Decision Layer, and target system.
A Workflow Agent for invoice processing coordinates the following sequence: the Document Agent reads the invoice; the Decision Layer reviews the booking proposal; at high confidence the booking goes to the target system, and at low confidence it is escalated to the case worker; after approval the booking is finalized; and the entire operation is documented in the audit trail.
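The confidence-based branch in this sequence can be sketched as follows. This is a minimal illustration, not the actual workflow definition: `BookingProposal`, `AuditTrail`, `route_proposal`, and the threshold value are hypothetical names and parameters chosen for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Route(Enum):
    AUTO_BOOK = "auto_book"
    ESCALATE = "escalate_to_case_worker"


@dataclass
class BookingProposal:
    invoice_id: str
    account: str
    confidence: float  # assumed score in [0, 1] attached to the proposal


@dataclass
class AuditTrail:
    entries: list[str] = field(default_factory=list)

    def log(self, message: str) -> None:
        self.entries.append(message)


def route_proposal(proposal: BookingProposal, threshold: float, audit: AuditTrail) -> Route:
    """Send high-confidence proposals to the target system, escalate the rest."""
    route = Route.AUTO_BOOK if proposal.confidence >= threshold else Route.ESCALATE
    # Every routing decision is recorded, regardless of the outcome.
    audit.log(f"{proposal.invoice_id}: confidence={proposal.confidence:.2f} -> {route.value}")
    return route
```

With a threshold of 0.9, a proposal scored 0.95 would be booked automatically, while one scored 0.60 would go to the case worker. Both decisions end up in the audit trail.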
Workflow Agents also handle exceptions: What happens when the target system is unreachable? What happens during a timeout? What happens when the case worker does not respond? The Workflow Agent has escalation rules, retry mechanisms, and timeout logic.
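A retry mechanism with escalation, as mentioned above, might be sketched like this. The function name `call_with_retry`, the `EscalationRequired` exception, and the default values are assumptions for illustration. In practice a workflow engine would typically provide this behavior through configuration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class EscalationRequired(Exception):
    """Raised when all retries are exhausted and a human has to take over."""


def call_with_retry(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry a failing call with exponential backoff, then escalate."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except (ConnectionError, TimeoutError) as exc:
            if attempt == max_attempts:
                raise EscalationRequired(f"gave up after {attempt} attempts") from exc
            # Wait longer after each failure: base, 2x base, 4x base, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so that the backoff can be tested without real waiting.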
Workflow orchestration runs on n8n or Camunda, depending on the complexity and compliance requirements of the client. Workflows are visually represented, versioned, and adjustable without programming.
Knowledge Agents
Knowledge Agents provide context-based answers from organizational knowledge: company agreements, collective bargaining agreements, compliance rules, and internal policies.
The difference from a search function: a Knowledge Agent understands the question, searches the relevant context, and delivers an answer with source citation and rule version. When a case worker asks “Does the night shift bonus also apply to part-time employees in the Western tariff region?”, the Knowledge Agent provides the answer with a reference to the applicable company agreement in the current version.
Knowledge Agents use RAG (Retrieval Augmented Generation): organizational knowledge is indexed in a vector database. The agent retrieves the relevant passages and generates an answer based on these sources, not based on its training data.
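Every answer includes the source, the rule version, and the validity date. Because answers are grounded in retrieved passages rather than in training data, the agent is designed to avoid hallucinations and fabricated rule references.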
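The retrieval and citation steps can be illustrated with a deliberately simplified sketch. It replaces the embedding model and vector database with a bag-of-words cosine similarity over an in-memory list, and it omits the answer generation by the language model. `Passage`, `retrieve`, and `cite` are hypothetical names, and the two indexed passages are made-up examples.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Passage:
    text: str
    source: str        # document the passage comes from
    rule_version: str
    valid_from: date


def _vector(text: str) -> Counter:
    # Bag-of-words stand-in for a real embedding model.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, index: list[Passage], top_k: int = 1) -> list[Passage]:
    """Return the top_k passages most similar to the question."""
    q = _vector(question)
    ranked = sorted(index, key=lambda p: _cosine(q, _vector(p.text)), reverse=True)
    return ranked[:top_k]


def cite(passage: Passage) -> str:
    """Format the citation that accompanies an answer."""
    return f"{passage.source}, version {passage.rule_version}, valid from {passage.valid_from.isoformat()}"


# Made-up index entries for illustration only.
index = [
    Passage("Night shift bonus applies to part-time employees.",
            "Company Agreement Shift Work", "3.1", date(2024, 1, 1)),
    Passage("Travel expenses are reimbursed within 30 days.",
            "Travel Policy", "1.4", date(2023, 6, 1)),
]
best = retrieve("Does the night shift bonus apply to part-time employees?", index)[0]
```

Carrying the source, version, and validity date on each indexed passage is what allows the citation to be attached mechanically instead of being generated by the model.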
How AI Agents Work Together in Enterprise Architecture
The three agent types do not work in isolation. In a typical enterprise implementation, the Workflow Agent orchestrates the overall process and delegates to specialized Document and Knowledge Agents.
An example from HR case management: a sick leave certificate arrives as an email with a PDF attachment. The Document Agent reads the certificate and extracts the employee, the time period, and whether it is an initial medical certificate or a follow-up certificate. The Knowledge Agent checks which rules apply to this employee: collective agreement, company agreement, individual supplementary agreements. The Decision Layer evaluates whether the continued pay calculation is correct and whether the deadlines are met. The Workflow Agent coordinates the entire flow and ensures all systems are updated.
Each agent has its defined scope. No agent “decides” alone on business-critical operations. The Decision Layer sits between the agent and the target system and ensures governance.
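The position of the Decision Layer between agent and target system can be sketched as a gate that every write must pass. This is an assumption-laden illustration, not the Gosign implementation: the `DecisionLayer` class, the `Decision` type, and the amount limit in `max_amount_rule` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Decision:
    approved: bool
    reason: str


class DecisionLayer:
    """Hypothetical gate: every write to a target system passes through here."""

    def __init__(self, rules: list[Callable[[dict], Decision]]) -> None:
        self._rules = rules
        self.audit_log: list[tuple[dict, Decision]] = []

    def submit(self, action: dict, execute: Callable[[dict], None]) -> Decision:
        # Evaluate rules in order; the first rejection blocks the action.
        decision = Decision(True, "all rules passed")
        for rule in self._rules:
            result = rule(action)
            if not result.approved:
                decision = result
                break
        self.audit_log.append((action, decision))  # logged whether approved or not
        if decision.approved:
            execute(action)
        return decision


def max_amount_rule(action: dict) -> Decision:
    # Assumed example rule: amounts above 10,000 require a human decision.
    if action["amount"] > 10_000:
        return Decision(False, "amount exceeds auto-approval limit")
    return Decision(True, "within limit")
```

Because agents only ever call `submit`, they have no direct path to the target system, and rejected actions still leave an audit entry.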
Model-Agnosticism: The Agent Is Not the Model
A common misconception is that the agent is the language model. It is not. The model (Claude, ChatGPT, Llama) provides language comprehension. The agent provides the domain logic, the system integration, and the governance.
In the Gosign architecture, the Model Layer is interchangeable. When a new model becomes available that is more powerful, more cost-effective, or better licensed, it can be integrated without changing the layers above. The business logic in the Decision Layer, the workflows, and the rule sets remain unchanged.
An agent can also use multiple models: a cost-effective open-source model for pre-classification and a more powerful model for complex decisions. The routing between models is configurable.
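Configurable routing between models could be sketched as below. The `LanguageModel` protocol, the `ModelRouter` class, the task type names, and the `EchoModel` stand-in are assumptions for illustration. A real setup would wrap actual model clients behind the same interface.

```python
from typing import Protocol


class LanguageModel(Protocol):
    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in for a real model client; it only tags the prompt with its name."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class ModelRouter:
    """Maps task types to models; swapping a model only changes this mapping."""

    def __init__(self, routes: dict[str, LanguageModel], default: LanguageModel) -> None:
        self._routes = routes
        self._default = default

    def complete(self, task_type: str, prompt: str) -> str:
        # Fall back to the default model for task types without an explicit route.
        model = self._routes.get(task_type, self._default)
        return model.complete(prompt)


router = ModelRouter(
    routes={"pre_classification": EchoModel("small-open-model")},
    default=EchoModel("large-model"),
)
```

Since the layers above only call `router.complete`, replacing a model means editing the route mapping and nothing else.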
This model-agnosticism prevents vendor lock-in. No organization is dependent on a single model provider.
Prerequisites for Enterprise AI Agents
AI agents in enterprise environments require more than a language model:
Governance: Every agent decision must be traceable and auditable. The Decision Layer ensures this.
Integration: Agents must be integrated with existing systems - SAP, DATEV, Sage, Workday, SuccessFactors, SharePoint. The Integration Layer decouples the agent logic from the target system.
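The decoupling provided by an Integration Layer can be pictured as an adapter interface. The sketch below is an assumption, not a description of any real connector: `TargetSystem`, `InMemoryLedger`, and `finalize_booking` are hypothetical names, and the in-memory class merely stands in for a system such as an ERP.

```python
from abc import ABC, abstractmethod


class TargetSystem(ABC):
    """Hypothetical adapter interface the agent logic depends on."""

    @abstractmethod
    def post_booking(self, booking: dict) -> str:
        """Write a booking and return the target system's reference ID."""


class InMemoryLedger(TargetSystem):
    """Test double standing in for a real ERP connector."""

    def __init__(self) -> None:
        self.bookings: list[dict] = []

    def post_booking(self, booking: dict) -> str:
        self.bookings.append(booking)
        return f"LEDGER-{len(self.bookings)}"


def finalize_booking(target: TargetSystem, booking: dict) -> str:
    # Agent-side code only knows the interface, never the concrete system.
    return target.post_booking(booking)
```

Supporting another target system then means adding one more `TargetSystem` subclass while the agent logic stays untouched.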
Infrastructure: Agents need a runtime environment: LLM hosting, vector databases for RAG, workflow engine, API gateway. This infrastructure can be operated in the cloud, self-hosted, or hybrid.
Employee Representation: In many jurisdictions, employee representative bodies (such as works councils in Germany) have co-determination rights when introducing AI systems. The architecture must account for this from the outset.