
AI EventStorming: Agent Workflow Discovery


AI EventStorming / Agent Workflow Discovery Explained

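Audience: Senior BA / Product Architect / Solution Architect / AI PM / Process Owner / Domain Expert.

Core question: An AI agent workflow cannot be designed from a PRD and flowcharts alone. An agent triggers commands, calls tools, produces events, and runs into policy gates and exception compensation. EventStorming lays out the real business events, decision hotspots, system boundaries, and AI insertion points so the team can find both risks and design opportunities.

Learning objective: Use the EventStorming elements (event, command, actor, policy, external system, hotspot, read model) to map the AI workflow, tool calls, HITL, exceptions, compensation, and eval traces.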

Source Anchors

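Source	Link	Purpose
EventStorming	https://www.eventstorming.com/	Reference for EventStorming's events, timeline, collaborative modeling, and exploration of complex business domains
Domain Language / DDD	https://www.domainlanguage.com/ddd/	Connects EventStorming to bounded context, domain event, and ubiquitous language
NIST AI RMF	https://www.nist.gov/itl/ai-risk-management-framework	Connects AI workflow risk, evaluation, and controls to a governance loop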

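In one sentence:

AI EventStorming uses the chain of business events to discover where AI should assist, recommend, decide, act, monitor, or step back, and turns every agent action into a traceable event and a control point.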

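1. Why an AI Workflow Cannot Rely on the PRD Alone

A PRD typically describes:

What features the user wants.
What screens the system shows.
What content the AI generates.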

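An AI workflow also needs to specify:

Which business event triggers the AI.
Which command the AI issues.
Which actor is responsible for confirmation.
Which policy decides whether the action is allowed.
Which external system is called.
Which new event marks the action as complete.
How to compensate after a failure.
Which events feed eval and audit.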

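Traditional flowcharts tend to hide:

Non-happy paths.
Cross-department handoffs.
Manual workarounds.
Event ordering.
State changes.
Tool side effects.
Audit evidence.

EventStorming's strength is that it puts "what has happened" at the center, which makes it a good fit for designing agentic workflows.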

2. EventStorming-to-AI Mapping

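EventStorming Element	AI Workflow Mapping	Example
Domain Event	A business fact that has already happened	`DisputeSubmitted`, `AlertEscalated`
Command	A request that triggers an action by a system or a person	`DraftResponse`, `RequestDocuments`
Actor	The person or system that issues the command or bears responsibility	Customer, Analyst, AI Agent, Supervisor
Policy	A rule that decides when a command is triggered	high amount -> manual review
External System	The system being called	CRM, case system, core banking
Read Model	The view needed to make a decision	customer profile, case summary
Hotspot	A point of uncertainty, conflict, or risk	advice boundary, missing data
Time Axis	The order in which events occur	SLA, waiting, escalation, compensation

The AI insertion point is not "where can text be generated" but "which command can AI assist with, recommend, decide, or execute".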
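The mapping above can be sketched as a small data model in which the AI role is attached to a command rather than to a screen or a text field. This is a minimal illustration, not a prescribed schema: the class names, the `ai_role` field, and the `ApproveRefund` command are assumptions introduced here.

```python
from dataclasses import dataclass, field
from enum import Enum


class AIRole(Enum):
    """How far AI may go on a given command (hypothetical enum)."""
    NONE = "none"
    ASSIST = "assist"
    RECOMMEND = "recommend"
    DECIDE = "decide"
    ACT = "act"


@dataclass(frozen=True)
class DomainEvent:
    name: str  # past-tense business fact, e.g. "DisputeSubmitted"


@dataclass
class Command:
    name: str                  # e.g. "RequestDocuments"
    actor: str                 # who issues it or is accountable
    triggered_by: DomainEvent  # the event that makes this command relevant
    external_systems: list[str] = field(default_factory=list)
    ai_role: AIRole = AIRole.NONE  # the AI insertion point lives on the command


def ai_insertion_points(commands: list[Command]) -> list[str]:
    """List the commands where AI participates, together with its role."""
    return [
        f"{c.name}:{c.ai_role.value}"
        for c in commands
        if c.ai_role is not AIRole.NONE
    ]


submitted = DomainEvent("DisputeSubmitted")
board = [
    Command("DraftResponse", "AI Agent", submitted, ["case system"], AIRole.ASSIST),
    Command("RequestDocuments", "Analyst", submitted, ["CRM"], AIRole.RECOMMEND),
    # Hypothetical command with no AI involvement at all.
    Command("ApproveRefund", "Supervisor", submitted, ["core banking"]),
]
print(ai_insertion_points(board))
```

Keeping the role on the command makes it easy to review a board and ask, command by command, whether the chosen role is justified.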

3. AI Insertion Patterns on Event Chain

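Pattern	Position in the Event Chain	Control Focus
Assist	Provides a summary or evidence before the actor issues a command	citation, completeness, human review
Recommend	Recommends the next step ahead of the policy	rationale, counterfactual, override
Decide	The policy selects the path automatically	DMN/rule alignment, audit, appeal
Act	The command calls a tool	permission, approval, idempotency, rollback
Monitor	Watches for anomalies after an event	alert threshold, false positive, triage
Simulate	Predicts impact before an event	scenario validity, assumption record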
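One way to use the table during a design review is to treat the control focus of each pattern as a checklist and report what a proposed insertion point still lacks. The sketch below copies the control names from the table; the dictionary name and the `missing_controls` helper are assumptions made for illustration.

```python
# Control focus per insertion pattern, taken from the table above.
REQUIRED_CONTROLS: dict[str, set[str]] = {
    "assist": {"citation", "completeness", "human review"},
    "recommend": {"rationale", "counterfactual", "override"},
    "decide": {"DMN/rule alignment", "audit", "appeal"},
    "act": {"permission", "approval", "idempotency", "rollback"},
    "monitor": {"alert threshold", "false positive", "triage"},
    "simulate": {"scenario validity", "assumption record"},
}


def missing_controls(pattern: str, designed: set[str]) -> set[str]:
    """Return the controls the pattern calls for that the design lacks."""
    return REQUIRED_CONTROLS[pattern] - designed


# A hypothetical "act" insertion point that only covers access control so far.
gaps = missing_controls("act", {"permission", "approval"})
print(sorted(gaps))
```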

Example event chain:

CustomerDisputeSubmitted
  -> AI drafts dispute type recommendation
  -> Policy checks amount/risk/customer segment
  -> Human approves high-risk route
  -> CaseCreated
  -> AI requests missing evidence
  -> DocumentsReceived
  -> Agent drafts resolution note
  -> Supervisor approves
  -> CustomerNotified
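The policy step in this chain checks amount, risk, and customer segment before deciding whether a human must approve the route. A minimal sketch of such a gate follows; the threshold value, the field labels, and the returned step name `HumanApprovalRequired` are illustrative assumptions, not values from any real policy.

```python
from dataclasses import dataclass

HIGH_AMOUNT_THRESHOLD = 1000.0  # assumption: illustrative value only


@dataclass
class Dispute:
    amount: float
    risk: str              # "low" or "high" (hypothetical labels)
    customer_segment: str  # e.g. "standard" or "vulnerable" (hypothetical)


def policy_gate(dispute: Dispute) -> str:
    """Return the next step after the AI's dispute type recommendation."""
    needs_human = (
        dispute.amount >= HIGH_AMOUNT_THRESHOLD
        or dispute.risk == "high"
        or dispute.customer_segment == "vulnerable"
    )
    # High-risk routes wait for a human; everything else proceeds to CaseCreated.
    return "HumanApprovalRequired" if needs_human else "CaseCreated"


print(policy_gate(Dispute(amount=50.0, risk="low", customer_segment="standard")))
print(policy_gate(Dispute(amount=50.0, risk="low", customer_segment="vulnerable")))
```

The gate is deliberately a plain rule function: the AI recommends, but the routing decision stays in code that can be reviewed and audited.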

4. Agent Workflow Trace

Every agent action should be traceable along this path:

trigger event
  -> command
  -> context read
  -> model/tool route
  -> policy gate
  -> human checkpoint
  -> output/action
  -> resulting event
  -> evidence

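Trace Field	Purpose
Trigger event	Why the AI was triggered
Command	What the AI is about to do
Context read	Which data or knowledge the AI read
Policy gate	Which rules allowed, denied, or escalated the action
Tool call	Which system was called and with what side effects
HITL checkpoint	Who approved or overrode
Resulting event	How the business state changed
Evidence	Records kept for audit and eval

This trace ties together the PRD, BPMN, DMN, ADR, eval, and audit binder.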
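The trace fields can be captured as one record per agent action. The following is a sketch under assumptions: the field names mirror the table, while the `trace_id` field, the choice of which fields count as required, and the sample values are introduced here for illustration.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class AgentTrace:
    trigger_event: str
    command: str
    context_read: list[str]
    policy_gate: str                # "allow", "deny", or "escalate"
    tool_call: Optional[str]        # None when no tool was invoked
    hitl_checkpoint: Optional[str]  # None when no human was involved
    resulting_event: str
    evidence: list[str]
    trace_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def missing_fields(trace: AgentTrace) -> list[str]:
    """Names of required fields left empty; such a trace cannot be audited."""
    required = ("trigger_event", "command", "policy_gate",
                "resulting_event", "evidence")
    return [name for name in required if not getattr(trace, name)]


trace = AgentTrace(
    trigger_event="CustomerDisputeSubmitted",
    command="RequestDocuments",
    context_read=["customer profile", "case summary"],
    policy_gate="escalate",
    tool_call="case system: create document request",
    hitl_checkpoint="Analyst",
    resulting_event="EvidenceRequested",
    evidence=["model input/output log", "analyst approval record"],
)
print(missing_fields(trace))
print(json.dumps(asdict(trace), indent=2))
```

Serializing the record as JSON keeps it usable both as an audit artifact and as an input row for eval cases.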

5. Exception and Compensation

An AI agent needs explicitly designed exception and compensation paths; without them it cannot go to production.

Failure	Compensation
Tool timeout	retry with idempotency key, then queue manual repair
Wrong document request	cancel request, notify customer, record correction
Policy conflict	stop automation, escalate to policy owner
Missing evidence	ask for clarification or route to human
Low confidence	assist-only mode, human decision
Unauthorized action	deny, security alert, incident review
Bad output sent	correction message, complaint path, postmortem

During the event storm, put these events on the board explicitly:

AIActionRejected
HumanOverrideRecorded
CompensationStarted
IncidentRaised
PolicyConflictDetected
CustomerCorrectionSubmitted
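The first row of the compensation table (tool timeout: retry with an idempotency key, then queue manual repair) can be sketched as follows. `CompensationStarted` comes from the list above; `ToolCallSucceeded`, `ToolTimedOut`, `ManualRepairQueued`, and the `FlakyTool` test double are hypothetical names used only for this example.

```python
import uuid
from typing import Callable


class FlakyTool:
    """Test double (hypothetical): times out `failures` times, then succeeds."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.keys_seen: list[str] = []

    def __call__(self, idempotency_key: str) -> str:
        self.keys_seen.append(idempotency_key)
        if self.failures > 0:
            self.failures -= 1
            raise TimeoutError("simulated tool timeout")
        return "ok"


def call_with_compensation(
    tool: Callable[[str], str], max_attempts: int = 3
) -> list[str]:
    """Call a side-effecting tool and return the events emitted along the way."""
    # One key for all attempts, so the target system can deduplicate retries.
    idempotency_key = str(uuid.uuid4())
    events: list[str] = []
    for attempt in range(1, max_attempts + 1):
        try:
            tool(idempotency_key)
            events.append("ToolCallSucceeded")
            return events
        except TimeoutError:
            events.append(f"ToolTimedOut(attempt={attempt})")
    # Retries exhausted: stop automating and hand over to a human queue.
    events.append("CompensationStarted")
    events.append("ManualRepairQueued")
    return events


print(call_with_compensation(FlakyTool(failures=2)))
print(call_with_compensation(FlakyTool(failures=10)))
```

Reusing one idempotency key across retries is what makes the retry safe: a timeout does not tell the caller whether the side effect already happened.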

6. Financial Retail Case: Payment Dispute

Big picture event storm:

TransactionFlagged
CustomerDisputeSubmitted
EvidenceRequested
EvidenceReceived
DisputeTypeClassified
CaseRouted
ProvisionalCreditAssessed
MerchantContacted
ResolutionDrafted
SupervisorApproved
CustomerNotified
AppealSubmitted
CaseClosed

AI opportunities:

Event / Command	AI Role	Risk
EvidenceRequested	recommend missing docs	wrong burden on customer
DisputeTypeClassified	classify type	wrong path, SLA impact
ResolutionDrafted	draft explanation	policy breach, wrong promise
AppealSubmitted	summarize new evidence	missed evidence
CaseClosed	monitor complaints/reopen	false closure

Design decision:

AI can assist and recommend in early stages.
AI cannot auto-close high-value or vulnerable-customer cases.
Supervisor approval is required before any customer-facing resolution.
Every AI suggestion creates a traceable event and evidence.
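These design decisions can be expressed as guard functions that the workflow consults before an agent closes a case or sends a resolution. This is a sketch only: the `Case` fields, the function names, and the high-value threshold are assumptions for illustration.

```python
from dataclasses import dataclass

HIGH_VALUE_THRESHOLD = 5000.0  # assumption: illustrative value only


@dataclass
class Case:
    amount: float
    customer_vulnerable: bool
    supervisor_approved: bool = False


def may_auto_close(case: Case) -> bool:
    """AI may not auto-close high-value or vulnerable-customer cases."""
    return case.amount < HIGH_VALUE_THRESHOLD and not case.customer_vulnerable


def may_send_resolution(case: Case) -> bool:
    """A customer-facing resolution needs supervisor approval first."""
    return case.supervisor_approved


routine = Case(amount=120.0, customer_vulnerable=False)
sensitive = Case(amount=120.0, customer_vulnerable=True)
print(may_auto_close(routine), may_auto_close(sensitive))
print(may_send_resolution(routine))
```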

7. Artifact Templates

AI Event Storm Board Schema

Element	Color / Tag	Example
Domain Event	event	`CaseEscalated`
Command	command	`DraftResolution`
Actor	actor	AI Agent / Analyst / Customer
Policy	policy	high risk requires review
External System	system	CRM / core banking
Hotspot	hotspot	advice boundary
Eval Tag	eval	groundedness, tool safety
Control Tag	control	HITL, audit, rollback

Hotspot-to-Eval Map

Hotspot	Risk	Eval Case	Control	Owner
AI may ask wrong documents	customer friction	missing-doc scenario set	human review	Ops Lead

8. ADR Draft

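Item	Content
Decision	For high-risk agent workflows, run AI EventStorming before the PRD and output an event-command-policy-tool-HITL trace
Context	Agent workflows involve events, commands, tool side effects, exception compensation, and audit evidence, which a traditional PRD tends to miss
Alternatives	Write user stories only; draw BPMN only; go straight to a prototype
Rationale	EventStorming exposes domain events, hotspots, cross-system boundaries, and exception paths
Consequences	Requires joint participation from domain experts, BA, PM, architects, risk, and ops
Reversal condition	Low-risk, single-step assistive tools can use a lightweight event trace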

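9. Interview Talking Points

30-second version

AI EventStorming restores an agent workflow from a "feature description" to a chain of business events. We use domain events, commands, actors, policies, external systems, and hotspots to find where AI should assist, recommend, decide, act, or step back, and at the same time design the tool gates, HITL, compensation, and evidence trace.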

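2-minute version

I would not start from "build an agent". I would run an event storm with business, risk, operations, and engineering: first put up the real business events, then find the triggering commands, the responsible actors, the policy rules, the external systems, and the hotspots. Next, I map each AI insertion point to assist/recommend/decide/act/monitor and define the policy gate, the human checkpoint, and the resulting event. Take a payment dispute flow: AI can recommend missing documents, classify the dispute type, and draft the explanation, but it must not auto-close cases that are high-amount or involve vulnerable customers. Every AI action needs a trace id, source evidence, a human override, and a compensation path.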

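Solution Architect / BA Lead version

For AI, the value of EventStorming is that it surfaces architectural risk. It links domain events, tool side effects, policy rules, HITL queues, exception compensation, and eval cases into a single chain. The BA is then not just writing requirements but designing an AI workflow that is operable, auditable, and recoverable.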