AI Contract-First Tool/API: Contract-First Integration

Contract-First AI Tool / API Design with OpenAPI / AsyncAPI: A Guided Reading

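Audience: AI Solutions Architect / Platform Architect / AI Product Manager / Integration Architect / Senior BA.

Core question: An AI agent cannot be safely integrated with enterprise systems merely because the prompt text says "please call the right tool". Tools, APIs, events, structured outputs, permissions, approvals, idempotency, and audit all need contract-first design.

Learning goal: Use OpenAPI, AsyncAPI, JSON Schema, CloudEvents, and a contract-testing mindset to design AI tool/API/event integrations as verifiable, versionable, governable enterprise capabilities.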

Source Anchors

Source	Link	Purpose
OpenAPI Specification	https://spec.openapis.org/oas/latest.html	Reference for HTTP API contracts; used to design the operation, schema, security, and response of a tool/API
AsyncAPI Specification	https://www.asyncapi.com/docs/reference/specification/latest	Reference for event-driven and messaging contracts; used for agent events, workflow events, and notification events
JSON Schema	https://json-schema.org/	Reference for structured input/output, tool arguments, event payloads, and model response schemas
CloudEvents	https://cloudevents.io/	Reference for consistent event metadata; supports cross-system event tracing and routing
OpenTelemetry	https://opentelemetry.io/docs/	Reference for traces, metrics, and logs; connects contract execution to runtime observability and evidence

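In one sentence:

Contract-first AI integration means first defining the structure, permissions, side effects, errors, versions, and evidence requirements of every tool/API/event, and only then letting the model or agent act inside those controlled contracts.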

1. Why AI Tool Calls Cannot Rely on Prompt-Text Conventions

Low-maturity approach:

System prompt: You can call account lookup, refund, email, and case update tools.

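Problems with this approach:

Risk	Description
Blurry tool boundary	The model does not know each tool's real permissions, side effects, and limits
Unstable arguments	Natural-language conventions cannot guarantee fields, types, enums, or required items
Weak error handling	timeout, partial failure, validation error, and business error have no unified semantics
Missing approval	Whether a high-risk action needs human approval cannot be guaranteed by the prompt alone
Uncontrolled versions	After a tool changes, the prompt, evals, and callers may fall out of sync
Hard to audit	There is no way to prove which contract, policy, payload, and approval applied at call time
Security holes	Prompt injection can induce calls to unauthorized tools or leak data

Integration principles for enterprise AI agents: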

Prompt can propose.
Contract constrains.
Policy authorizes.
Telemetry proves.
Human approves when risk requires.
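
The five principles above can be read as an ordered gate in front of every tool call. The following is a minimal sketch, not a reference implementation: the registry layout, the `case.send_customer_note` tool, the `dispute_agent` role, and the in-memory `AUDIT_LOG` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolProposal:
    """A tool call the model has proposed; nothing has been executed yet."""
    tool_id: str
    arguments: dict
    actor_role: str


# Hypothetical registry: required arguments, permitted roles, approval flag.
REGISTRY = {
    "transaction.lookup": {
        "required": {"caseId"}, "roles": {"dispute_agent"}, "approval": False,
    },
    "case.send_customer_note": {
        "required": {"caseId", "noteText"}, "roles": {"dispute_agent"}, "approval": True,
    },
}

AUDIT_LOG = []  # stand-in for real telemetry (assumption)


def decide(proposal):
    """Return 'block', 'needs_approval', or 'execute' for a proposed call."""
    contract = REGISTRY.get(proposal.tool_id)
    if contract is None:
        decision = "block"  # contract constrains: unregistered tools never run
    elif not contract["required"].issubset(proposal.arguments):
        decision = "block"  # contract constrains: required arguments missing
    elif proposal.actor_role not in contract["roles"]:
        decision = "block"  # policy authorizes
    elif contract["approval"]:
        decision = "needs_approval"  # human approves when risk requires
    else:
        decision = "execute"
    # telemetry proves: every decision leaves a record, including blocks
    AUDIT_LOG.append({"tool_id": proposal.tool_id, "decision": decision})
    return decision
```

The model's output only ever becomes a `ToolProposal`; whether it runs is decided outside the prompt, which is the property the principles are asking for.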

2. Contract-First Taxonomy

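An AI system has at least six kinds of contract:

Contract type	Description	Representative technology / artifact
Tool contract	Which tools the agent may call, with what inputs, outputs, and side effects	OpenAPI operation, JSON Schema, tool card
Event contract	Which events the agent/workflow publishes or consumes	AsyncAPI, CloudEvents
Structured output contract	What structure the model must return	JSON Schema, constrained decoding, validator
Policy contract	Which roles, data, scenarios, and risk levels are allowed to invoke a tool	OPA/Cedar/DMN, policy matrix
Eval contract	How the contract is tested and regression-verified	test cases, mock server, trajectory eval
Evidence contract	Which trace, log, approval, and audit records a call must produce	OpenTelemetry spans, audit event schema

If you define only the API schema, without side effects, permissions, approvals, and evidence, it is still not an AI-ready contract.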

3. OpenAPI for AI Tools

OpenAPI is well suited to describing synchronous HTTP APIs and tool operations.

3.1 Tool Operation Card

Field	Example
tool_id	`case.create_note`
operationId	`CreateCaseNote`
business purpose	Add an operations note to a dispute case
input schema	`caseId`, `noteText`, `source`, `confidence`, `traceId`
output schema	`noteId`, `createdAt`, `status`
side effect	Writes to the case management system
risk level	medium
required permission	`case:write_note`
approval	not required for internal note; required if note is customer-visible
idempotency	`Idempotency-Key` required
audit event	`case.note.created`
eval coverage	parameter correctness, unsafe note refusal, duplicate prevention

3.2 AI Tool Contract Extension Fields

An ordinary API contract is not enough; an AI tool contract must additionally state:

Extension	Description
`x-ai-risk-level`	read, draft, low-risk-write, high-risk-write, irreversible
`x-ai-human-approval`	none, sampled, required, dual-control
`x-ai-side-effect`	none, reversible, compensatable, irreversible
`x-ai-data-classification`	public, internal, confidential, restricted
`x-ai-policy-profile`	Which policy set decides allow/block/approve
`x-ai-evidence-required`	trace id, approval id, actor, source citations, reason
`x-ai-eval-suite`	The tool trajectory eval that must pass

These can be embedded in API governance as internal conventions, or maintained in a tool registry.
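
OpenAPI permits `x-` prefixed specification extensions on an operation, so the fields above can sit directly next to `operationId`. The sketch below shows one possible shape as a Python dict, plus a lint function that reports operations missing any of the seven extensions. The path, the policy profile name, the eval suite name, and the choice of `low-risk-write` for the note operation are illustrative assumptions, not values taken from a real API.

```python
REQUIRED_AI_EXTENSIONS = (
    "x-ai-risk-level",
    "x-ai-human-approval",
    "x-ai-side-effect",
    "x-ai-data-classification",
    "x-ai-policy-profile",
    "x-ai-evidence-required",
    "x-ai-eval-suite",
)

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}

# Fragment of an OpenAPI document as a dict; all concrete values are hypothetical.
OPENAPI_FRAGMENT = {
    "paths": {
        "/cases/{caseId}/notes": {
            "post": {
                "operationId": "CreateCaseNote",
                "x-ai-risk-level": "low-risk-write",
                "x-ai-human-approval": "none",
                "x-ai-side-effect": "reversible",
                "x-ai-data-classification": "internal",
                "x-ai-policy-profile": "dispute-case-write",
                "x-ai-evidence-required": ["trace id", "actor", "reason"],
                "x-ai-eval-suite": "case-note-trajectory",
            }
        }
    }
}


def missing_ai_extensions(spec):
    """Map each operationId to the x-ai-* extensions it fails to declare."""
    gaps = {}
    for path_item in spec.get("paths", {}).values():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as 'parameters'
            missing = [key for key in REQUIRED_AI_EXTENSIONS if key not in operation]
            if missing:
                gaps[operation.get("operationId", method)] = missing
    return gaps
```

A check like this could run in CI so that an operation cannot enter the tool registry with only a schema and no declared risk, approval, or evidence requirement.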

3.3 Error Contract

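An AI agent must understand error classes; otherwise it will retry blindly or fabricate results.

Error class	Agent behavior
validation_error	Fix the arguments or ask for clarification
authorization_error	Stop calling and escalate to a human
policy_block	Tell the user the action cannot be performed, or enter approval
business_rule_violation	Suggest an alternative path based on the rule
transient_error	Retry within a budget
dependency_timeout	Degrade or queue
conflict_duplicate	Use the idempotency result

The error contract must go into the eval suite, not just into developer documentation.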
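
One way to make the error contract testable is to encode the table as data and route every tool failure through a single function. This is a sketch under assumptions: the behavior labels, the default retry budget of 3, and the rule that unknown error classes escalate are choices made here, not part of any specification.

```python
# Error class -> agent behavior, mirroring the error contract table.
ERROR_BEHAVIOR = {
    "validation_error": "repair_or_clarify",
    "authorization_error": "escalate_to_human",
    "policy_block": "explain_or_request_approval",
    "business_rule_violation": "suggest_alternative",
    "transient_error": "retry",
    "dependency_timeout": "degrade_or_queue",
    "conflict_duplicate": "reuse_idempotent_result",
}


def next_action(error_class, attempts_so_far, retry_budget=3):
    """Pick the agent's next step for a failed tool call.

    Unknown error classes escalate instead of being guessed at, and
    retries stop once the budget is spent (both are assumptions).
    """
    behavior = ERROR_BEHAVIOR.get(error_class, "escalate_to_human")
    if behavior == "retry" and attempts_so_far >= retry_budget:
        return "escalate_to_human"
    return behavior
```

Because the mapping is plain data, an eval case can assert for each error class that the agent's observed behavior matches the contract.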

4. AsyncAPI and Event Contracts for Agents

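Agent workflows are not always synchronous requests. Many enterprise AI scenarios are event-driven:

A document upload triggers extraction.
A risk alert triggers an investigation assistant.
A customer complaint triggers a triage workflow.
A completed human approval resumes the agent workflow.
A model/prompt/eval version change triggers a regression run.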

4.1 Event Contract

Field	Example
channel	`case.dispute.triage.requested`
producer	case management system
consumer	dispute agent workflow
payload schema	case id, customer segment, amount band, reason code, priority
CloudEvents metadata	id, source, type, subject, time, traceparent
ordering	per case id
retry	exponential backoff, DLQ after 5 attempts
idempotency	event id + case id
privacy	no full PAN, no unnecessary PII
audit	event stored with workflow trace
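
The event contract above can be sketched as a CloudEvents-style envelope plus a consumer that deduplicates on event id and case id. The `source` value, the payload field names, and the forbidden-field list are assumptions for illustration; `traceparent` follows the W3C Trace Context format that CloudEvents carries as an extension attribute.

```python
import uuid
from datetime import datetime, timezone

# Payload keys the privacy rule forbids; the exact names are hypothetical.
FORBIDDEN_PAYLOAD_FIELDS = {"pan", "cardNumber"}


def make_triage_event(case_id, payload, traceparent):
    """Build a CloudEvents 1.0 style envelope for a triage request."""
    leaked = FORBIDDEN_PAYLOAD_FIELDS & set(payload)
    if leaked:
        raise ValueError(f"payload contains forbidden fields: {sorted(leaked)}")
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": "/case-management",  # hypothetical producer identifier
        "type": "case.dispute.triage.requested",
        "subject": case_id,
        "time": datetime.now(timezone.utc).isoformat(),
        "traceparent": traceparent,
        "datacontenttype": "application/json",
        "data": payload,
    }


class TriageConsumer:
    """Consumer that processes each (event id, case id) pair at most once."""

    def __init__(self):
        self._seen = set()
        self.processed = []

    def handle(self, event):
        key = (event["id"], event["subject"])  # idempotency: event id + case id
        if key in self._seen:
            return False  # redelivery, e.g. after a retry: side effect skipped
        self._seen.add(key)
        self.processed.append(event["subject"])
        return True
```

In a real deployment the seen-set would live in durable storage and the retry and DLQ behavior would be configured on the broker; this sketch only shows why the contract names an idempotency key at all.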

4.2 Agent Event Types

Event	Meaning
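`agent.workflow.started`	Agent workflow started
`agent.tool.proposed`	The model proposed a tool call
`agent.policy.blocked`	Blocked by policy
`agent.approval.requested`	Human approval required
`agent.approval.completed`	Approval completed
`agent.tool.executed`	Tool executed
`agent.workflow.completed`	Workflow completed
`agent.workflow.failed`	Workflow failed
`agent.workflow.escalated`	Escalated to a human

These events turn the agent from black-box reasoning into an observable workflow.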

5. JSON Schema for Structured IO

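Structured output contracts can be used for:

Tool arguments.
Model answers.
Classification results.
Risk grading.
Eval judge output.
Human review forms.
Event payloads.

Example field design:

Field	Design rule
enum	Use enums for key decisions; do not let the model coin its own terms
required	Mark fields needed for audit and execution as required
format	Specify formats for dates, email, uri, id, and similar values
min/max	Bound amounts, scores, confidence, and step counts
additionalProperties	Disabled by default for high-risk payloads
reason	An explanation is allowed, but it must not be used as the basis for execution
source_refs	When citations are needed, require a list of source id/version

The point of structured output is not "easy parsing". It is turning AI output into a contract-bound object that can be validated, audited, regression-tested, and versioned.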
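
To make the design rules concrete, here is a small sketch of a risk-grading output schema written with JSON Schema keywords, together with a deliberately tiny checker. The checker covers only the keywords used here and is not a JSON Schema implementation; a real project would use a full validator. The field names and enum values are assumptions.

```python
RISK_SCHEMA = {
    "type": "object",
    "required": ["decision", "confidence", "source_refs"],
    "additionalProperties": False,  # high-risk payload: no free-form extras
    "properties": {
        "decision": {"enum": ["approve", "escalate", "reject"]},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
        "reason": {"type": "string"},  # explanation only, never an execution input
        "source_refs": {"type": "array", "minItems": 1},
    },
}


def violations(obj, schema):
    """Return human-readable contract violations (empty list means valid)."""
    errors = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in obj:
            errors.append(f"missing required field: {name}")
    if schema.get("additionalProperties") is False:
        for name in obj:
            if name not in props:
                errors.append(f"unexpected field: {name}")
    for name, rule in props.items():
        if name not in obj:
            continue
        value = obj[name]
        if "enum" in rule and value not in rule["enum"]:
            errors.append(f"{name} must be one of {rule['enum']}")
        kind = rule.get("type")
        if kind == "number":
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                errors.append(f"{name} must be a number")
            elif not rule.get("minimum", value) <= value <= rule.get("maximum", value):
                errors.append(f"{name} is out of range")
        elif kind == "string" and not isinstance(value, str):
            errors.append(f"{name} must be a string")
        elif kind == "array":
            if not isinstance(value, list):
                errors.append(f"{name} must be an array")
            elif len(value) < rule.get("minItems", 0):
                errors.append(f"{name} needs at least {rule['minItems']} item(s)")
    return errors
```

A model response is accepted only when `violations(...)` returns an empty list; anything else is rejected before it reaches a tool or a reviewer.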

6. Contract Governance

6.1 Versioning

Change	Compatibility	Required action
Add optional field	backward compatible	update docs and tests
Add required field	breaking	new version, agent update, eval rerun
Rename enum value	breaking	migration and compatibility layer
Relax validation	risky	security/risk review
Tighten validation	may break agent	regression eval
Change side effect	high risk	ADR, approval, release gate
Change policy requirement	high risk	risk signoff, evidence update
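
The first three rows of the versioning table are mechanical enough to check automatically. The sketch below compares two versions of an input schema and labels each difference; it treats a renamed enum value as a removal and ignores every other kind of change, so it is a starting point under those assumptions rather than a complete compatibility checker.

```python
def classify_changes(old, new):
    """Compare two object schemas and label each detected change.

    Returns a list of (compatibility, description) tuples.
    """
    findings = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    old_required = set(old.get("required", []))
    new_required = set(new.get("required", []))

    # A field that is newly required breaks callers that do not send it.
    for name in sorted(new_required - old_required):
        findings.append(("breaking", f"new required field: {name}"))

    # A new field that stays optional is backward compatible.
    for name in sorted(set(new_props) - set(old_props) - new_required):
        findings.append(("compatible", f"new optional field: {name}"))

    # A removed (or renamed) enum value breaks callers that still send it.
    for name in sorted(set(old_props) & set(new_props)):
        old_enum = old_props[name].get("enum")
        new_enum = new_props[name].get("enum")
        if old_enum is not None and new_enum is not None:
            for value in sorted(set(old_enum) - set(new_enum)):
                findings.append(("breaking", f"enum value removed from {name}: {value}"))
    return findings
```

Any `breaking` finding would then trigger the actions in the table: a new version, an agent update, and an eval rerun.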

6.2 Contract Testing

Test	Purpose
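Schema validation	Arguments and responses conform to the contract
Mock server test	The agent can complete the task under the contract
Negative test	Invalid arguments are rejected
Policy test	allow/block/approve is correct across role/risk/data combinations
Idempotency test	Retries do not repeat the side effect
Error behavior test	The agent does not fabricate results and can escalate or degrade
Telemetry test	Every call has a trace, span, and audit event
Compatibility test	A new version does not break already-approved workflows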
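
As an example of one row, an idempotency test can run against an in-memory fake of the note service. `FakeCaseNoteService` and its method signature are hypothetical stand-ins for a mock server; the point is only the assertion pattern.

```python
class FakeCaseNoteService:
    """In-memory stand-in for a note-creation endpoint that honors Idempotency-Key."""

    def __init__(self):
        self.notes = []     # the side effect we want to happen exactly once
        self._results = {}  # idempotency key -> first response

    def create_note(self, idempotency_key, case_id, note_text):
        if idempotency_key in self._results:
            # Replay: return the stored response without writing again.
            return self._results[idempotency_key]
        note_id = f"note-{len(self.notes) + 1}"
        self.notes.append((case_id, note_text))
        result = {"noteId": note_id, "status": "created"}
        self._results[idempotency_key] = result
        return result


def test_retry_does_not_duplicate_side_effect():
    service = FakeCaseNoteService()
    first = service.create_note("key-1", "case-42", "Customer disputes charge")
    retry = service.create_note("key-1", "case-42", "Customer disputes charge")
    assert first == retry            # same response on replay
    assert len(service.notes) == 1   # side effect happened once


def test_new_key_creates_new_note():
    service = FakeCaseNoteService()
    service.create_note("key-1", "case-42", "first")
    service.create_note("key-2", "case-42", "second")
    assert len(service.notes) == 2
```

The same pattern extends to the agent level: replay a trajectory in which the tool call times out and is retried, then assert on the number of side effects rather than on the model's text.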

6.3 Contract Review Board

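Review of AI tool/API/event contracts should not be decided by engineering alone:

Role	Review focus
Product	Whether the tool supports a real user task and whether it introduces excessive automation
Architect	Boundaries, versions, reusability, dependencies, and evolution
Security	Permissions, data leakage, attack surface
Risk/compliance	High-risk actions, approval, evidence
Operations	Error handling, retry, SLO, incident
Data owner	Classification and minimization of input/output data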

7. Financial Retail Case: Dispute Resolution Agent

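The agent may:

Read transaction details.
Read the dispute policy.
Generate a case summary.
Create an internal note.
Request human approval.

The agent may not:

Issue refunds automatically.
Send customer notifications directly.
Modify regulatory reports.
View unrelated customer data.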

7.1 Tool Contract Matrix

Tool	Risk	Contract	Policy	Evidence
`transaction.lookup`	read	OpenAPI read operation	customer/case scope	trace span
`policy.search`	read	retrieval schema	approved source only	citation ids
`case.summarize`	draft	structured output schema	no customer send	summary eval
`case.create_note`	medium write	OpenAPI write operation	case:write_note	audit event
`approval.request`	workflow	event contract	required for high-risk	approval id
`refund.execute`	high-risk write	unavailable to agent	human-only	separate workflow
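
The matrix also implies a discovery rule: the agent should only ever see tools that are both exposed to agents and within the risk band of the current task. A minimal sketch follows; the task profile names and their risk sets are assumptions, while the tool ids and risk labels come from the matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolEntry:
    tool_id: str
    risk: str
    agent_exposed: bool


TOOL_MATRIX = [
    ToolEntry("transaction.lookup", "read", True),
    ToolEntry("policy.search", "read", True),
    ToolEntry("case.summarize", "draft", True),
    ToolEntry("case.create_note", "medium write", True),
    ToolEntry("approval.request", "workflow", True),
    ToolEntry("refund.execute", "high-risk write", False),  # human-only
]

# Hypothetical task profiles: which risk labels a task is allowed to use.
TASK_PROFILES = {
    "evidence_gathering": {"read"},
    "triage": {"read", "draft", "medium write", "workflow"},
}


def discover_tools(task):
    """List the tool ids the agent may see for a given task."""
    allowed = TASK_PROFILES.get(task, set())  # unknown task: expose nothing
    return [
        entry.tool_id
        for entry in TOOL_MATRIX
        if entry.agent_exposed and entry.risk in allowed
    ]
```

Because `refund.execute` is marked as not agent-exposed, no task profile can surface it, even one that mistakenly lists `high-risk write`.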

7.2 Event Flow

case.dispute.triage.requested
  -> agent.workflow.started
  -> agent.tool.proposed(transaction.lookup)
  -> agent.tool.executed
  -> agent.tool.proposed(policy.search)
  -> agent.tool.executed
  -> agent.output.generated(case_summary)
  -> agent.approval.requested if high amount
  -> agent.workflow.completed or escalated

Every event carries:

traceparent.
case id.
workflow id.
contract version.
policy decision.
actor / agent identity.
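
A recorded event stream can be checked against these two expectations: every event carries the required metadata, and every `agent.tool.executed` follows a matching `agent.tool.proposed`. The sketch below assumes snake_case field names and a `tool` field naming the tool on proposed/executed events; neither is defined in the flow above.

```python
# Field names are assumed spellings of the six items listed above.
REQUIRED_FIELDS = (
    "traceparent",
    "case_id",
    "workflow_id",
    "contract_version",
    "policy_decision",
    "actor",
)


def audit_flow(events):
    """Return a list of problems found in an ordered event stream."""
    problems = []
    pending_tool = None  # tool named by the most recent unexecuted proposal
    for index, event in enumerate(events):
        event_type = event.get("type")
        missing = [name for name in REQUIRED_FIELDS if name not in event]
        if missing:
            problems.append(f"event {index} ({event_type}) missing {missing}")
        if event_type == "agent.tool.proposed":
            pending_tool = event.get("tool")
        elif event_type == "agent.tool.executed":
            if pending_tool is None or event.get("tool") != pending_tool:
                problems.append(f"event {index}: execution without matching proposal")
            pending_tool = None
    return problems
```

An empty result is the kind of artifact a telemetry test or release gate could attach as a trace sample.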

7.3 Release Gate

Gate	Required evidence
Contract review	tool cards, schemas, side-effect matrix
Security review	permission test, prompt injection tests
Eval review	trajectory eval, negative cases, error behavior
Ops review	retry, DLQ, idempotency, dashboard
Risk approval	high-risk action policy, HITL rule
Release	trace sample, audit event sample, rollback plan
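
The gate table can be treated as a checklist that a release pipeline evaluates. The following sketch assumes evidence is submitted as a set of labels per gate; how each label is actually proven (links, reports, sign-offs) is out of scope here.

```python
# Gate -> required evidence, copied from the release gate table.
GATES = {
    "Contract review": {"tool cards", "schemas", "side-effect matrix"},
    "Security review": {"permission test", "prompt injection tests"},
    "Eval review": {"trajectory eval", "negative cases", "error behavior"},
    "Ops review": {"retry", "DLQ", "idempotency", "dashboard"},
    "Risk approval": {"high-risk action policy", "HITL rule"},
    "Release": {"trace sample", "audit event sample", "rollback plan"},
}


def missing_evidence(submitted):
    """Map each incomplete gate to the evidence items still missing."""
    gaps = {}
    for gate, required in GATES.items():
        missing = required - submitted.get(gate, set())
        if missing:
            gaps[gate] = sorted(missing)
    return gaps


def release_allowed(submitted):
    """A release passes only when no gate has missing evidence."""
    return not missing_evidence(submitted)
```

For example, a submission whose Ops review lists only `retry` and `DLQ` would be reported as missing `dashboard` and `idempotency`, and the release would be held.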

8. Product Requirements for Contract-First AI

When senior PMs/BAs write requirements, they should not stop at "the agent can query transactions"; they should write:

Requirement dimension	Example
Business capability	Agent can support dispute triage by collecting relevant transaction and policy evidence
Allowed action	Read transaction details for the active case only
Prohibited action	Agent cannot initiate refund or customer notification
Contract	Must use approved `transaction.lookup` OpenAPI operation
Data boundary	No full card number in prompt or trace
Error behavior	If lookup fails, escalate to human and do not infer transaction details
Eval	Tool selection and argument correctness >= threshold
Evidence	Tool call span includes trace id, case id, policy decision, response status

Requirements of this kind suit AI systems better than ordinary user stories because they put "what it can do" and "how to do it safely" side by side.

9. Common Failure Modes

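Failure mode	Symptom	Fix
Prompt-only tools	Tool boundaries are written only in the prompt	Build a tool registry + schema + policy
Schema without side effect	Only fields are defined, not risk	Add risk, approval, idempotency, audit
No negative cases	Only successful calls are tested	Test unauthorized access, bad arguments, timeouts, business-rule rejections
Breaking change surprise	The API changed and the agent was not regression-tested	versioning + compatibility eval
Event without trace	Events flow but cannot be verified afterwards	CloudEvents metadata + trace context
Agent sees too many tools	Tool exposure is too broad	capability discovery by role/risk/task
Audit afterthought	Logs are added only after launch	evidence contract before implementation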

10. Templates

10.1 Tool Contract Card

Field	Content
Tool name	Name of the tool
Business purpose	The business task it supports
Contract link	OpenAPI operation / schema
Input / output	Fields, types, enums, required
Data classification	Data level of inputs and outputs
Side effect	none / reversible / compensatable / irreversible
Risk level	read / draft / write / high-risk
Authorization	role, scope, policy profile
Approval	none / sampled / required / dual-control
Idempotency	key, dedupe window
Error behavior	validation, auth, policy, business, transient
Telemetry	required spans, metrics, audit event
Eval suite	positive, negative, trajectory, regression

10.2 Event Contract Card

Field	Content
Event type	Name of the event
Producer / consumer	Producer and consumer
Trigger	When it is emitted
Payload schema	JSON Schema / AsyncAPI message
Metadata	id, source, subject, time, traceparent
Ordering	ordering key
Retry / DLQ	Retry and dead-letter policy
Privacy	Data that must not appear
Evidence	Event storage, trace, audit
Compatibility	Versioning policy

10.3 Compatibility Policy

Rule	Policy
Required field change	New major version
Enum removal	New major version
Side-effect change	ADR + security/risk review
Policy relaxation	risk approval
Error contract change	eval regression
Deprecation	Notification, migration window, consumer inventory

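11. Interview Talking Points

30-second version:

I would use contract-first design for an AI agent's tool and event integrations. The prompt can let the model propose a tool call, but the real boundary is defined by OpenAPI/AsyncAPI/JSON Schema, policy, approval, idempotency, and telemetry. That way tool calls can be validated, versioned, audited, and regression-tested, and enterprise systems are not exposed to uncontrolled natural-language conventions.

2-minute version:

Taking a dispute-handling agent as the example, I would turn transaction lookup, policy search, case note, and approval request into tool or event contracts. Each contract states inputs and outputs, data classification, side effects, permissions, approval, idempotency, error behavior, and audit events. The agent can only discover tools that match the current role/risk/task; high-risk actions are either not exposed or must go through HITL. Before launch we run schema validation, negative tests, policy tests, trajectory eval, and a telemetry evidence check. That makes the AI integration an enterprise-grade integration architecture rather than demo glue code.

Follow-up questions:

Question	Key points for the answer
How do OpenAPI and AsyncAPI divide the work	OpenAPI for synchronous HTTP tools, AsyncAPI for event-driven workflows
The value of JSON Schema	Turns model inputs and outputs into verifiable objects
How to control tool side effects	side-effect classification, policy, approval, idempotency, audit
How to handle version changes	compatibility policy, consumer inventory, regression eval
How to defend against prompt injection	Minimal tool exposure, policy enforcement, schema validation, negative tests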

12. Practice Assignment

Design a contract pack for one AI agent use case:

3 tool contract cards.
2 event contract cards.
1 structured output schema.
1 side-effect matrix.
1 set of negative contract tests.
1 release gate: contract evidence completeness.

Completion criteria:

Every tool has risk, permission, approval, idempotency, and audit.
Every event has schema, metadata, retry, and trace.
At least 5 negative cases cover unauthorized access, bad arguments, policy block, timeout, and business rule.
You can explain why a given high-risk action should not be exposed to the agent.