How to Audit an HR Workflow for Agent Readiness
Break down HR processes into decision points and classify each for human, rule-based, or AI automation - with a scoring method to quantify the result.
What this methodology does
This article describes a repeatable method for evaluating whether an HR workflow - or specific parts of it - can be handled by an AI agent. The method produces three outputs: a decision map showing every individual decision point in the workflow, a classification of each decision point by type, and an agent readiness score that quantifies how much of the workflow is automatable.
The methodology is tool-agnostic. It does not assume any specific AI platform, vendor, or technology stack. It works equally for organizations evaluating their first agent deployment and for those expanding an existing one to new domains.
McKinsey (2024) estimates that 60-70% of work activities in HR departments can be partially automated with existing technology. The workflow audit identifies exactly which parts - decision point by decision point.
At a Glance - Workflow Audit for Agent Readiness
- A workflow audit decomposes HR processes into individual decision points and classifies each as human-only, rule-based, or AI-eligible.
- The output is three deliverables: a decision map, a classification per point, and an Agent Readiness Score quantifying automation potential.
- Typical HR workflows contain 10-30 decision points - most organizations underestimate this number significantly.
- The methodology is tool-agnostic: no specific AI platform, vendor, or technology stack is assumed.
- One workflow audit takes 2-4 hours. An organization can audit its top-10 workflows in approximately two weeks.
Step 1: Identify the workflow boundary
Before analyzing decision points, define what the workflow includes and excludes.
A workflow has a trigger (what starts it), a set of processing steps, and one or more endpoints (what constitutes completion). It also has a scope boundary - where it interfaces with other workflows.
Example: Sick leave processing
- Trigger: Employee submits a sick note (paper, email, portal upload, or manager notification)
- Endpoints: SAP record updated, manager notified, payroll adjusted if applicable, return-to-work action scheduled if threshold exceeded
- Scope boundary: This workflow does NOT include the return-to-work interview itself (that is a separate workflow). It does NOT include the BEM process (betriebliches Eingliederungsmanagement) - that is triggered by this workflow but managed separately.
Common mistake: defining the boundary too broadly. “Onboarding” is not one workflow - it is a collection of 5-8 sub-workflows (contract generation, IT provisioning, compliance documentation, workspace setup, training enrollment, buddy assignment, probation tracking). Audit each sub-workflow separately.
Step 2: Map every decision point
Walk through the workflow from trigger to endpoint and identify every point where a decision is made. A decision point is any moment where the process requires choosing between two or more paths based on input data.
Most organizations underestimate the number of decision points in their processes. The reason: experienced employees make many decisions unconsciously. They don’t perceive “checking if the sick note has a start date” as a decision - they just do it. But for an agent, every such check is an explicit decision that must be specified.
Method: The “what could go wrong” technique
For each step in the workflow, ask: “What could go wrong here? What would cause this step to fail or require a different action?” Each answer reveals a decision point.
Sick leave processing - decision map:
| # | Decision point | Question the process answers |
|---|---|---|
| 1 | Document receipt | Is this a sick note, a doctor’s certificate, a rehabilitation notice, or something else? |
| 2 | Completeness check | Does the document contain: employee name, incapacity period start, incapacity period end, doctor’s name, doctor’s signature? |
| 3 | Employee identification | Which employee does this belong to? (Name matching, employee ID, department) |
| 4 | Entity assignment | Which legal entity is this employee in? (Determines applicable rules) |
| 5 | Collective agreement lookup | Which collective agreement (Tarifvertrag) applies to this employee? |
| 6 | Continued pay eligibility | Is the employee within the 6-week continued pay period (§ 3 EFZG)? Has the waiting period (§ 3 Abs. 3 EFZG) been completed? |
| 7 | Duration assessment | Is this a single day, a short absence (≤3 days), or an extended absence? |
| 8 | Pattern detection | Does this absence, combined with previous absences, trigger any threshold? (e.g., >6 weeks total in 12 months → BEM obligation) |
| 9 | Payroll impact | Does this absence affect payroll? (Overtime cancellation, shift differential, bonus proration) |
| 10 | System update | What needs to be updated in SAP/SuccessFactors? (Absence record, time account, payroll flag) |
| 11 | Notification routing | Who needs to be informed? (Direct manager, HR business partner, payroll, works council if BEM triggered) |
| 12 | Follow-up scheduling | Is a follow-up action required? (Return-to-work date, BEM invitation, occupational health referral) |
Twelve decision points in a process that most organizations describe as “employee submits sick note, we process it.” The gap between perceived simplicity and actual complexity is typical.
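A decision map like the one above can be captured as structured data from the start, which makes the classification and scoring steps later in the audit mechanical. A minimal sketch - the field names are illustrative, not a prescribed schema:

```python
# Each decision point carries exactly one question; the classification
# is filled in during Step 3. Field names are illustrative only.
decision_map = [
    {"id": 1, "name": "Document receipt",
     "question": "Is this a sick note, a certificate, or a rehab notice?"},
    {"id": 2, "name": "Completeness check",
     "question": "Does the document contain all required fields?"},
    {"id": 3, "name": "Employee identification",
     "question": "Which employee does this belong to?"},
    # ... decision points 4-12 follow the same shape
]

# Sanity check from Step 2: every decision point answers exactly one question.
for point in decision_map:
    assert point["question"].count("?") == 1, point["name"]
```

Keeping the map as data also makes the "one question per point" discipline checkable rather than a matter of reviewer attention.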
Step 3: Classify each decision point
For each decision point, determine which of three types it belongs to. The classification criteria must be applied strictly - the most common error is classifying a decision as “rule-based” when it actually requires interpretation.
Type H: Human-only
The decision requires one or more of the following:
- Empathy or individual judgment - the outcome depends on understanding a specific person’s situation, not on applying a general rule
- Legal risk from automation - an incorrect automated decision would create significant legal, financial, or reputational exposure
- Co-determination mandate - the works council has explicitly required human involvement for this decision type
- Ethical sensitivity - the decision involves weighing competing interests where reasonable people would disagree
Test question: “If two different experienced professionals looked at the same case, would they reliably reach the same conclusion?” If no - it is Type H.
Type R: Rule-based (deterministic)
The decision follows explicit logic that can be written as an unambiguous if-then statement. All of the following must be true:
- The rule exists in writing - in a law, collective agreement, company policy, or documented procedure
- The input data is structured - names, dates, numbers, codes, categories - not free text requiring interpretation
- The output is deterministic - given the same input, the rule always produces the same output
- There are no exceptions that require judgment - or the exceptions are themselves rule-based (“if exception X, then apply rule Y instead”)
Test question: “Could I write this decision as a formula in a spreadsheet, with no human interpretation required?” If yes - it is Type R.
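The spreadsheet test can be taken literally: a Type R decision is one you can express as a pure function of structured inputs. A sketch of decision point 6 (continued pay eligibility), simplified to the two checks named above - it is not a complete implementation of § 3 EFZG:

```python
from datetime import date

def continued_pay_eligible(absence_start: date,
                           employment_start: date,
                           weeks_already_paid_this_illness: float) -> bool:
    """Type R sketch: deterministic, same input always gives same output.

    Simplified to two checks; the full § 3 EFZG logic (repeat illness,
    new-illness windows) has more rules, but they are themselves rule-based.
    """
    # § 3 Abs. 3 EFZG: four-week waiting period after employment starts
    waiting_period_done = (absence_start - employment_start).days >= 28
    # § 3 Abs. 1 EFZG: up to six weeks of continued pay per illness
    within_six_weeks = weeks_already_paid_this_illness < 6
    return waiting_period_done and within_six_weeks
```

If a decision cannot be reduced to a function like this without a human resolving ambiguity along the way, it is not Type R.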
Type A: AI-eligible (probabilistic with boundaries)
The decision involves processing unstructured or semi-structured information - but within defined boundaries and with verifiable outcomes. All of the following must be true:
- The task is classification, extraction, or matching - not creation, judgment, or evaluation
- The boundaries are defined - the set of possible outcomes is known and finite
- The output can be verified - there is a way to check whether the AI’s decision was correct
- A confidence threshold can be set - the system can express certainty, and low-certainty cases can be escalated to a human
Test question: “Is this interpreting information against known categories, or is it making a judgment about a unique situation?” If interpreting against categories - it is Type A. If judging a unique situation - it is Type H.
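The boundary and confidence requirements for Type A can be expressed directly in code: the classifier returns a label from a closed set plus a confidence score, and anything outside the set or below the threshold is escalated rather than decided. A sketch of the routing logic - the classifier itself is assumed to exist, and the threshold value is illustrative:

```python
KNOWN_CATEGORIES = {"sick_note", "doctors_certificate", "rehab_notice"}
CONFIDENCE_THRESHOLD = 0.85  # illustrative; set per decision point

def route_document(label: str, confidence: float) -> str:
    """Type A pattern: decide autonomously only within known boundaries
    and above the confidence threshold; otherwise escalate to a human."""
    if label not in KNOWN_CATEGORIES:
        return "escalate:unknown_category"
    if confidence < CONFIDENCE_THRESHOLD:
        return "escalate:low_confidence"
    return f"auto:{label}"
```

The escalation branches are what make a probabilistic decision acceptable in a governed workflow: the agent never silently guesses outside its boundaries.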
Applied to our sick leave example:
| # | Decision point | Type | Reasoning |
|---|---|---|---|
| 1 | Document receipt | A | Unstructured input (scanned document), but classifying into known categories (sick note, certificate, rehab notice). Confidence-scoreable. |
| 2 | Completeness check | A | Extracting fields from unstructured document. Known required fields. Verifiable. |
| 3 | Employee identification | A | Name matching against employee database. May require fuzzy matching. Confidence-scoreable. |
| 4 | Entity assignment | R | Employee ID → entity mapping is deterministic. Lookup table. |
| 5 | Collective agreement lookup | R | Entity + employee category → collective agreement. Deterministic. |
| 6 | Continued pay eligibility | R | Start date + absence history + § 3 EFZG rules. Pure calculation. |
| 7 | Duration assessment | R | Calendar calculation. Deterministic. |
| 8 | Pattern detection | R | Absence history + threshold rules. Deterministic (though the response to a triggered threshold may be Type H). |
| 9 | Payroll impact | R | Absence type + duration + pay rules. Deterministic. |
| 10 | System update | R | Determined by decisions 4-9. Execution step. |
| 11 | Notification routing | R | Routing rules based on entity, absence type, threshold triggers. Deterministic. |
| 12 | Follow-up scheduling | R | Based on duration, threshold, and policy. Deterministic - though the follow-up action itself (e.g., the BEM conversation) is Type H. |
Result: 0 human-only, 8 rule-based, 3 AI-eligible, 1 conditional (R for scheduling, H for the follow-up action itself).
Note: Decision 8 (pattern detection) is Type R for the detection itself, but the BEM process it triggers contains Type H decisions. This is important - the audit is about the decision point, not the downstream workflow.
Step 4: Calculate the agent readiness score
The score is a simple ratio that quantifies how much of the workflow is automatable.
Formula:
Agent Readiness Score = (Type R count + Type A count) / Total decision points × 100
For our sick leave example: (8 + 3) / 12 × 100 = 91.7%
This is a high score. It indicates that almost the entire workflow can be handled by an agent, with human involvement only for downstream actions (the BEM conversation, the return-to-work interview) that are separate workflows.
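Once the classification table exists as data, the score is a one-line computation. A sketch reproducing the sick leave result - the conditional point 12 is excluded from the automatable count, matching the (8 + 3) / 12 tally above:

```python
def agent_readiness_score(classifications: list[str]) -> float:
    """(Type R count + Type A count) / total decision points * 100."""
    automatable = sum(1 for c in classifications if c in ("R", "A"))
    return automatable / len(classifications) * 100

# Sick leave example: points 1-3 are A, points 4-11 are R,
# point 12 is conditional ("C") and not counted as automatable.
sick_leave = ["A"] * 3 + ["R"] * 8 + ["C"]
score = agent_readiness_score(sick_leave)  # -> 91.666..., i.e. 91.7%
```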
Score interpretation:
| Score | Interpretation | Deployment recommendation |
|---|---|---|
| >80% | High agent readiness | Strong candidate for immediate deployment. Most decision points can be automated. Focus governance design on the few Type H handoffs. |
| 60-80% | Moderate agent readiness | Good candidate for phased deployment. Automate Type R and Type A decisions first, keep Type H manual. Expect meaningful human-in-the-loop workload. |
| 40-60% | Mixed readiness | Consider automating only the rule-based sub-processes. The workflow may not justify full agent deployment but benefits from targeted automation of high-volume Type R decisions. |
| <40% | Low agent readiness | Not recommended for agent deployment in the near term. This workflow is judgment-heavy. Invest in process documentation and standardization first. |
Typical scores by HR domain (based on practitioner experience):
| Domain | Typical score | Reason |
|---|---|---|
| Payroll & compensation | 85-95% | Almost entirely rule-based |
| Time & attendance | 80-90% | High rule density, few judgment calls |
| Expense processing | 75-85% | Rule-based with some document classification |
| Onboarding administration | 60-75% | Mix of rule-based and semi-structured |
| Benefits enrollment | 65-80% | Rule-heavy but with eligibility edge cases |
| Recruiting screening | 40-55% | Significant judgment in evaluation |
| Performance management | 20-35% | Heavily judgment-based |
| Employee relations | 15-30% | Almost entirely empathy and judgment |
Step 5: Document the audit result
The audit produces a structured output that serves three purposes: it guides agent design (which decisions does the agent handle?), it informs the works council agreement (which decisions remain human?), and it supports the EU AI Act documentation requirement (why was this decision classified as automated?).
Minimum documentation per decision point:
- Decision point ID and description
- Classification (H, R, or A) with reasoning
- Rule source (for Type R): Which law, collective agreement, or policy defines this rule? Which version?
- Confidence threshold (for Type A): At what confidence level does the AI decide autonomously vs. escalate?
- Escalation path (for Type A and edge cases in Type R): Who handles the case when the agent cannot decide?
- Audit trail requirement: What must be logged for this decision? (Input data, rule applied, output, timestamp, agent or human)
This documentation becomes the specification for agent configuration. It is also the artifact that the works council reviews and the auditor examines.
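The minimum documentation above maps naturally onto one structured record per decision point, which can then be exported for the works council and the auditor. A sketch with illustrative field names - fields that do not apply to a given type simply stay empty:

```python
import json

def audit_record(point_id, description, classification, reasoning,
                 rule_source=None, confidence_threshold=None,
                 escalation_path=None, log_fields=()):
    """One record per decision point. rule_source applies to Type R;
    confidence_threshold and escalation_path apply to Type A."""
    return {
        "id": point_id,
        "description": description,
        "classification": classification,  # "H", "R", or "A"
        "reasoning": reasoning,
        "rule_source": rule_source,
        "confidence_threshold": confidence_threshold,
        "escalation_path": escalation_path,
        "audit_trail": list(log_fields),
    }

record = audit_record(
    6, "Continued pay eligibility", "R",
    "Start date + absence history + statutory rules. Pure calculation.",
    rule_source="§ 3 EFZG",
    log_fields=("input_data", "rule_applied", "output", "timestamp", "actor"),
)
print(json.dumps(record, indent=2, ensure_ascii=False))
```

A flat record like this doubles as the agent configuration spec: each field answers one of the documentation questions listed above.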
Common classification errors
Error 1: Classifying as R when the rule has exceptions requiring judgment
“Overtime is paid at 150%.” Sounds rule-based. But: Does overtime during a public holiday get 200%? What about overtime during a shift that spans midnight and crosses into a public holiday? What if the employee has an individual contract that overrides the collective agreement?
If the exceptions require human judgment to resolve, the decision point is Type H for the exception path - even if the standard case is Type R. The correct approach: classify the standard case as R and add a separate decision point for exception handling, classified as H or A.
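The standard-case/exception-path split can be made explicit in code: the deterministic rule handles the standard case, and any input matching an exception condition is routed out rather than forced through the rule. A sketch using the overtime example - the rates and conditions are illustrative, not taken from any actual collective agreement:

```python
def overtime_pay_factor(is_public_holiday: bool,
                        has_individual_contract: bool):
    """Standard case stays Type R; exception paths escalate.
    Rates and conditions are illustrative only."""
    if has_individual_contract:
        # Contract terms may override the collective agreement:
        # resolving the conflict requires judgment, so escalate (Type H).
        return "escalate:individual_contract"
    if is_public_holiday:
        return 2.0   # 200% - itself an unambiguous rule, so still Type R
    return 1.5       # 150% standard overtime
```

Note that the public-holiday case stays in the function (it is itself rule-based), while the individual-contract case leaves it - that is exactly the R-versus-H line the audit has to draw.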
Error 2: Classifying as A when the AI output cannot be verified
“The AI evaluates whether the candidate is a cultural fit.” This is not Type A - there is no verifiable correct answer. Cultural fit is inherently subjective. Type A requires that the output can be checked: “The AI classified this document as a sick note” can be verified by a human looking at the same document.
Error 3: Conflating the decision with the downstream action
“Pattern detection triggers a BEM process” - the detection is Type R (threshold calculation), but the BEM process contains Type H decisions (reintegration planning). The audit must separate the trigger decision from the triggered workflow.
Error 4: Ignoring implicit decisions
“We check the sick note.” This sounds like one step. But it contains at least three decisions: Is it a valid document? Is it complete? Does it match the employee? Each must be classified separately. The audit must decompose until each decision point has exactly one question and one classification.
What this methodology does not cover
This audit determines agent readiness at the decision-point level. It does not determine:
- Sequencing and prioritization - which workflows to automate first (this depends on strategic criteria beyond readiness: transaction volume, governance complexity, organizational appetite)
- Technology selection - which AI platform, model, or vendor to use
- Organizational change - how to manage the transition for employees whose roles change
- Works council negotiation strategy - how to structure the co-determination process
These are strategic questions that sit above the workflow audit. The audit provides the fact base for those decisions.