AI Agent Autonomy: Delegation Architecture

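AI Agent Autonomy / Delegation Architecture: A Reading Guide

Audience: AI Product Architect / AI PM / Solutions Architect / Senior BA / AI Governance Lead.
Core question: the risk of an agent lies not in whether it can chat, but in whether it can take action on behalf of a person or an organization. Autonomy must therefore be broken down into a delegation architecture that can be authorized, revoked, audited, and downgraded.
Learning goal: build a complete design vocabulary covering AI autonomy levels, delegation contract, tool authority, human escalation, kill switch, runtime monitoring, and evidence.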

Source Anchors

Source	Link	Purpose
NIST AI RMF	https://www.nist.gov/itl/ai-risk-management-framework	Organize autonomous-agent risk using Govern / Map / Measure / Manage
EU AI Act, Regulation (EU) 2024/1689	https://eur-lex.europa.eu/eli/reg/2024/1689/oj	Reference for high-risk AI, human oversight, transparency, and risk management
OECD AI Principles	https://oecd.ai/en/ai-principles	Reference for human-centred values, transparency, robustness, and accountability
ISO/IEC 42001	https://www.iso.org/standard/81230.html	Use an AI management system to manage responsibility, operational control, and continual improvement
OWASP LLM Top 10	https://owasp.org/www-project-top-10-for-large-language-model-applications/	Reference for agent security risks such as excessive agency, tool misuse, and prompt injection

In one sentence:

Autonomy is delegated authority, not model intelligence.

1. Why Autonomous Agents Are a Product Architecture Problem

Many AI products go through three stages:

assistant answers
  -> copilot suggests
  -> agent acts

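The risk changes along with them:

Stage	AI output	Main risks
Assistant	explanations, summaries, drafts	wrong answer, hallucination, unsafe advice
Copilot	recommended actions, generated tickets, pre-filled fields	automation bias, weak review, hidden assumptions
Agent	calls APIs, changes state, sends notifications, triggers workflows	unauthorized action, financial/customer harm, accountability gap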

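So autonomy is not a UI toggle. It is an architectural boundary:

Which tasks can be delegated.
Who authorizes the agent to do them.
What data the agent can read.
Which tools the agent can call.
Whether the agent can write to system state.
When it must escalate to a human.
When it must stop automatically.
How to prove afterwards what it did.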

2. Autonomy Level Taxonomy

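Level	Name	Agent authority	Typical use	Typical controls
L0	Human-only	AI is not involved	legal decisions, final credit approval	no AI action
L1	Answer-only	explain and summarize only	policy Q&A, knowledge retrieval	citation, refusal, source freshness
L2	Draft-only	generates drafts, a human submits	customer service replies, first draft of a SAR narrative	reviewer approval, diff view
L3	Recommend-action	recommends an action, a human clicks to execute	case routing, next-best-action	confidence threshold, reason code
L4	Bounded-execute	executes automatically within low-risk bounds	updating non-critical CRM fields, creating internal tasks	allowlist, budget, rollback
L5	Conditional-autonomy	executes automatically when policy conditions are met, escalates exceptions	standard refunds, KYC document tracking	policy gate, exception queue
L6	Supervised-multi-step	multi-step planning and tool calls, monitored throughout	AML case triage, loan servicing workflows	supervisor, trace, checkpoint
L7	Restricted-domain autonomy	high autonomy within a domain, with hard boundaries	internal operations optimization, inventory/scheduling suggestions	kill switch, SLO/KRI, periodic validation
L8	Enterprise autonomy	cross-domain action	should usually not be enabled by default for high-impact retail finance scenarios	board-level risk acceptance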

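PMs and architects should avoid one-line requirements such as:

Let the agent handle customer issues automatically

Rewrite it as:

The agent may recommend routing actions at L3 and automatically create internal case tasks at L4, but it must not automatically promise compensation, change account status, or send regulatory-sensitive notifications.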
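
One way to make the rewritten requirement testable is to encode the taxonomy as an enum and map each action to the level it has been granted. The following is a minimal sketch under stated assumptions: the action names (`recommend_routing`, `create_internal_case_task`, and the prohibited ones) and the returned mode strings are hypothetical labels chosen for illustration, not part of any real system.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """Levels L0-L8 from the taxonomy table."""
    HUMAN_ONLY = 0
    ANSWER_ONLY = 1
    DRAFT_ONLY = 2
    RECOMMEND_ACTION = 3
    BOUNDED_EXECUTE = 4
    CONDITIONAL_AUTONOMY = 5
    SUPERVISED_MULTI_STEP = 6
    RESTRICTED_DOMAIN = 7
    ENTERPRISE = 8


# Hypothetical action names; the levels mirror the rewritten requirement.
GRANTED_LEVEL = {
    "recommend_routing": AutonomyLevel.RECOMMEND_ACTION,
    "create_internal_case_task": AutonomyLevel.BOUNDED_EXECUTE,
}

# Actions the requirement explicitly forbids the agent from automating.
PROHIBITED = {
    "promise_compensation",
    "change_account_status",
    "send_regulatory_notification",
}


def handling_mode(action: str) -> str:
    """Return how the agent may handle an action under the requirement."""
    if action in PROHIBITED:
        return "prohibited"
    # Anything not explicitly granted falls back to human-only.
    level = GRANTED_LEVEL.get(action, AutonomyLevel.HUMAN_ONLY)
    if level >= AutonomyLevel.BOUNDED_EXECUTE:
        return "auto_execute"
    if level == AutonomyLevel.RECOMMEND_ACTION:
        return "recommend_only"
    if level >= AutonomyLevel.ANSWER_ONLY:
        return "no_action"  # may answer or draft, never act
    return "human_only"
```

The default of `HUMAN_ONLY` for unlisted actions is a design choice in this sketch: an action nobody has classified is treated as not delegated.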

3. Delegation Contract

The delegation contract is the core artifact of agent autonomy.

Field	Question	Example
Delegator	Who delegates	customer service manager, AML operations lead
Agent role	On whose behalf the agent works	complaint triage agent
Task boundary	What it does	classify complaint, create case, draft response
Prohibited task	What it must not do	deny claim, waive fee above threshold
Data boundary	What it can see	customer profile, complaint history, product policy
Tool boundary	Which tools it can call	CRM read, case create, policy search
Write authority	What it can write	case note, task assignment
Budget/time	Resource limits	max 3 model calls, max 20 seconds
Confidence gate	When it may continue	source support >= threshold
Human escalation	When to escalate	vulnerable customer, legal threat, low confidence
Revocation	How to revoke	disable tool scope, revoke session token
Evidence	How to prove it	trace id, tool call log, reviewer decision

An agent without a delegation contract is in effect a shadow employee with unclear accountability boundaries.
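
The contract fields above can be carried as a typed, immutable record that a runtime consults before each step. This is a sketch only: the class name `DelegationContract`, the `check` method, and the threshold value `0.8` are assumptions made for illustration (the table only says "source support >= threshold"); the field values are taken from the example column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelegationContract:
    delegator: str
    agent_role: str
    allowed_tasks: frozenset
    prohibited_tasks: frozenset
    data_scope: frozenset
    tool_scope: frozenset
    write_scope: frozenset
    max_model_calls: int
    max_seconds: float
    min_source_support: float  # confidence gate
    escalation_triggers: frozenset
    revocation: tuple
    evidence: tuple

    def check(self, task, tool, source_support, flags=frozenset()):
        """Return 'deny', 'escalate', or 'proceed' for one proposed step."""
        if task in self.prohibited_tasks or task not in self.allowed_tasks:
            return "deny"
        if tool not in self.tool_scope:
            return "deny"
        # Any raised escalation flag sends the case to a human.
        if self.escalation_triggers & set(flags):
            return "escalate"
        if source_support < self.min_source_support:
            return "escalate"
        return "proceed"


COMPLAINT_TRIAGE = DelegationContract(
    delegator="customer service manager",
    agent_role="complaint triage agent",
    allowed_tasks=frozenset({"classify complaint", "create case", "draft response"}),
    prohibited_tasks=frozenset({"deny claim", "waive fee above threshold"}),
    data_scope=frozenset({"customer profile", "complaint history", "product policy"}),
    tool_scope=frozenset({"CRM read", "case create", "policy search"}),
    write_scope=frozenset({"case note", "task assignment"}),
    max_model_calls=3,
    max_seconds=20.0,
    min_source_support=0.8,  # assumed value; the source does not give a number
    escalation_triggers=frozenset({"vulnerable customer", "legal threat"}),
    revocation=("disable tool scope", "revoke session token"),
    evidence=("trace id", "tool call log", "reviewer decision"),
)
```

For example, `COMPLAINT_TRIAGE.check("draft response", "policy search", 0.9, {"legal threat"})` escalates even though task, tool, and confidence are all within bounds, because the escalation trigger takes precedence over the confidence gate.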

4. Architecture Patterns

4.1 Approval-before-action

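The AI may plan and draft, but write actions must wait for human confirmation.

Suitable for:

Customer notifications.
Account status changes.
Fee waivers.
Complaint outcomes.

Key design points:

The human sees the original evidence, the AI's rationale, and the impact of the action.
The approval UI states explicitly what the action will change.
The approval record goes into the evidence trail.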
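
The pattern can be sketched as a gate that refuses to run a write action until a reviewer decision has been recorded. The names `ProposedAction` and `ApprovalGate`, and the status strings, are hypothetical and chosen for this illustration; a real system would persist the trail rather than keep it in memory.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class ProposedAction:
    action: str
    evidence: list        # original evidence shown to the reviewer
    ai_rationale: str     # why the AI proposes this action
    impact: str           # what the action will change
    status: str = "pending"
    reviewer: Optional[str] = None
    decided_at: Optional[datetime] = None


class ApprovalGate:
    def __init__(self) -> None:
        self.evidence_trail: list = []

    def review(self, proposal: ProposedAction, reviewer: str, approve: bool) -> None:
        """Record the human decision and append it to the evidence trail."""
        proposal.status = "approved" if approve else "rejected"
        proposal.reviewer = reviewer
        proposal.decided_at = datetime.now(timezone.utc)
        self.evidence_trail.append({
            "action": proposal.action,
            "reviewer": reviewer,
            "decision": proposal.status,
            "impact": proposal.impact,
            "decided_at": proposal.decided_at.isoformat(),
        })

    def execute(self, proposal: ProposedAction, effect: Callable[[], None]) -> None:
        """Run the write action only if a human has approved it."""
        if proposal.status != "approved":
            raise PermissionError("write action requires human approval")
        effect()
        proposal.status = "executed"
```

Keeping `evidence`, `ai_rationale`, and `impact` on the proposal itself means the approval UI and the audit record read from the same object, which is one way to keep what the reviewer saw consistent with what is later retained.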

4.2 Bounded execution

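The AI may automatically execute actions that are low-risk, reversible, and monitorable.

Examples:

Create an internal follow-up task.
Add a non-final note to a case.
Classify a customer issue into a queue.

Boundaries:

allowlist only.
no irreversible action.
no customer commitment.
no hidden financial decision.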
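
The four boundaries can be checked mechanically before any automatic execution. The sketch below assumes each allowlisted action is described by a small `ActionSpec` record; the record, its attribute names, and the action names are hypothetical. The attribute checks repeat what the allowlist should already guarantee, as a defence against a misconfigured entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionSpec:
    reversible: bool
    customer_commitment: bool
    financial_decision: bool


# Hypothetical allowlist built from the three examples above.
ALLOWLIST = {
    "create_followup_task": ActionSpec(True, False, False),
    "add_nonfinal_case_note": ActionSpec(True, False, False),
    "classify_to_queue": ActionSpec(True, False, False),
}


def may_auto_execute(name: str, catalogue: dict = ALLOWLIST) -> bool:
    """Apply the four boundaries: allowlist only, reversible,
    no customer commitment, no financial decision."""
    spec = catalogue.get(name)
    if spec is None:
        return False  # not on the allowlist
    return (
        spec.reversible
        and not spec.customer_commitment
        and not spec.financial_decision
    )
```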

4.3 Supervisor agent

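A supervisor checks the worker agent's plan, permissions, evidence, and exceptions.

Note:

The supervisor must not be just another LLM without permission boundaries.
The supervisor should combine a policy engine, rules, traces, and thresholds.
High-impact actions still require a human gate.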
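
To illustrate the point that a supervisor should rest on deterministic checks rather than on another unconstrained model, here is a rule-based sketch that reviews a worker's plan step by step. `PlanStep`, `supervise`, the verdict strings, and the tool names in the usage note are hypothetical; a real supervisor would also read traces and call a policy engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanStep:
    tool: str
    high_impact: bool
    source_support: float  # how well the step is backed by evidence, 0..1


def supervise(plan, tool_scope, min_support):
    """Return one verdict per step, using rules only (no model call)."""
    verdicts = []
    for step in plan:
        if step.tool not in tool_scope:
            verdicts.append("block")        # permission check
        elif step.high_impact:
            verdicts.append("human_gate")   # high impact always goes to a human
        elif step.source_support < min_support:
            verdicts.append("escalate")     # threshold check
        else:
            verdicts.append("allow")
    return verdicts
```

With a tool scope of `{"crm_read", "case_create", "send_notice"}` and a threshold of `0.7`, a plan that reads the CRM with strong support, sends a high-impact notice, and then calls an out-of-scope tool would yield `["allow", "human_gate", "block"]`.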

4.4 Circuit breaker / kill switch

A kill switch is not manually deleting code after a failure. It should be a product capability:

disable specific tool scope.
disable specific agent role.
stop a workflow class.
force all actions to draft-only.
route all high-risk cases to human queue.
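
The five capabilities above can be modelled as runtime state that every action is routed through. This is a sketch under assumptions: the `KillSwitch` class, its field names, the order in which the switches are evaluated, and the returned strings are all choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class KillSwitch:
    disabled_tools: set = field(default_factory=set)
    disabled_roles: set = field(default_factory=set)
    stopped_workflows: set = field(default_factory=set)
    force_draft_only: bool = False
    high_risk_to_human: bool = False

    def route(self, role: str, workflow: str, tool: str,
              is_write: bool, high_risk: bool) -> str:
        """Decide what happens to one proposed action right now."""
        # Hard stops first: role, workflow class, or tool scope is disabled.
        if (role in self.disabled_roles
                or workflow in self.stopped_workflows
                or tool in self.disabled_tools):
            return "blocked"
        if high_risk and self.high_risk_to_human:
            return "human_queue"
        # Downgrade: writes become drafts, reads continue unchanged.
        if is_write and self.force_draft_only:
            return "draft_only"
        return "proceed"
```

Because the switch is plain state rather than code, an operator can flip `force_draft_only` or add a tool to `disabled_tools` without a deployment, which is the sense in which the kill switch is a product capability.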

5. Retail Finance Cases

5.1 Customer service copilot

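Recommended autonomy:

L1: look up policies, explain terms.
L2: draft replies.
L3: recommend a case category.
L4: automatically create internal follow-up tasks.
Prohibited: automatically promising compensation, amending contracts, closing complaints.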

5.2 AML investigation assistant

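Recommended autonomy:

L1/L2: summarize transactions, draft the case narrative.
L3: recommend red flags and next investigation steps.
L4: automatically pull public evidence or internal read-only data.
Prohibited: automatically filing a SAR, automatically clearing a customer risk flag.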

5.3 Lending policy assistant

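Recommended autonomy:

L1: retrieve policies and explain conditions.
L2: draft the policy rationale.
L3: recommend gaps that need human review.
Prohibited: automatically approving or rejecting loans, automatically changing the risk grade.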

5.4 Wealth advisory assistant

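Recommended autonomy:

L1: educational explanations.
L2: draft meeting notes.
L3: recommend advisor review items.
Prohibited: giving personalized investment advice without authorization, or placing orders automatically.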

6. Risk and Control Mapping

Risk	Symptom	Control
Excessive agency	agent can call tools beyond what the task needs	least privilege tool scope
Unauthorized action	AI submits a high-impact decision on a human's behalf	approval-before-action
Silent escalation failure	a case that should have escalated did not	escalation rules + monitoring
Automation bias	human review is not substantive	reviewer UX + override tracking
Accountability gap	unclear who approved what	delegation contract + trace
Prompt injection	external content induces a tool call	tool gateway + instruction hierarchy
Model/vendor drift	behavior changes on the same task	validation + release notes + revalidation
Weak revocation	cannot be disabled quickly after an incident	kill switch hierarchy

7. PM / BA / Architect Division of Responsibilities

Role	Key deliverables
PM	autonomy level decision, value/risk tradeoff, customer impact boundary
Senior BA	task decomposition, exception taxonomy, escalation trigger, evidence requirement
Solution Architect	tool boundary, workflow state, policy gate, audit trail, rollback
Risk / Compliance	prohibited use, residual risk, approval criteria
Operations	reviewer queue, SLA, training, incident playbook

The value of a senior BA is not in writing user stories, but in connecting tasks, permissions, evidence, exceptions, and accountability.

8. Autonomy Decision Record

# AI Autonomy Decision Record

Use case:
Agent role:
Business owner:
Risk tier:

Approved autonomy level:
Allowed tasks:
Prohibited tasks:
Read data scope:
Write/action scope:
Tool allowlist:
Human approval points:
Escalation triggers:
Kill switch owner:
Monitoring signals:
Evidence retained:
Review cadence:
Residual risk:
Decision:

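9. Interview Talking Points

30-second version:

I don't start by asking whether the agent can do something; I start by defining what authority it has been delegated. My approach is to break agent autonomy into task boundary, data boundary, tool boundary, write authority, human escalation, revocation mechanism, and runtime evidence. Low-risk, reversible tasks can use bounded execution; high-impact customer or financial actions must go through approval-before-action or human escalation.

2-minute version:

Agent autonomy is fundamentally delegated authority, not how smart the model is. In design, I first assign the use case a risk tier, then make an autonomy level decision. For example, a customer service agent can read policies, summarize customer history, draft replies, and create internal follow-up tasks, but it cannot automatically promise compensation or close complaints. Architecturally, I use a delegation contract that specifies the delegator, agent role, allowed task, prohibited task, data scope, tool scope, write authority, confidence gate, escalation trigger, revocation, and evidence. For write actions, I use a tool gateway, policy gate, approval-before-action, trace logging, and a kill switch. After launch, monitoring looks beyond usage to unauthorized action, escalation miss, override rate, incident signal, tool failure, customer complaint, and reviewer load. This turns an AI agent from a demo into a governable production system.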

10. Portfolio Exercise

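Choose a retail finance AI use case:

Map the current process.
Mark which steps are read, recommend, draft, or execute.
Assign an autonomy level to each step.
Write the delegation contract.
Design the escalation triggers.
Design the tool allowlist.
Design the kill switch.
Design the evidence query: how to reconstruct what happened when an incident occurs.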
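
As a starting point for the last step, an evidence query can be as simple as pulling every event for one trace id and ordering it in time. The sketch assumes a flat event log of dictionaries with `trace_id`, `ts`, `kind`, and `detail` keys; that schema and the sample events are invented for illustration and are not taken from any real system.

```python
def reconstruct(events, trace_id):
    """Return the events for one trace, ordered by timestamp.

    Timestamps are assumed to be UTC ISO-8601 strings in one consistent
    format, so lexical order matches chronological order.
    """
    matching = [e for e in events if e["trace_id"] == trace_id]
    return sorted(matching, key=lambda e: e["ts"])


# Hypothetical sample log, deliberately out of order.
SAMPLE_LOG = [
    {"trace_id": "t-1", "ts": "2025-01-01T10:00:05Z", "kind": "reviewer_decision", "detail": "approved"},
    {"trace_id": "t-2", "ts": "2025-01-01T10:00:01Z", "kind": "tool_call", "detail": "policy search"},
    {"trace_id": "t-1", "ts": "2025-01-01T10:00:02Z", "kind": "tool_call", "detail": "CRM read"},
    {"trace_id": "t-1", "ts": "2025-01-01T10:00:09Z", "kind": "tool_call", "detail": "case create"},
]

timeline = [e["kind"] + ": " + e["detail"] for e in reconstruct(SAMPLE_LOG, "t-1")]
```

The three event kinds correspond to the evidence listed in the delegation contract: trace id, tool call log, and reviewer decision.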

Deliverables:

A 1-page autonomy decision record.
One tool authority matrix.
One escalation flow.
One interview narrative.