AI Contract-First Tool/API Design / OpenAPI / AsyncAPI Playbook
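Audience: AI Product Architect, AI Platform PM, Enterprise Architect, Solution Architect, Integration Architect, senior BA, and owners of AI tools and platforms in financial retail.
Core question: When an AI agent calls tools, APIs, event streams, workflow tasks and external systems, how do you use a contract-first approach to define boundaries, permissions, side effects, output structure, evals, audit and version governance, instead of relying on conventions written in prompt text?
Learning objective: Combine OpenAPI, AsyncAPI, JSON Schema, CloudEvents and OpenTelemetry into a contract governance system for AI agent tools, covering tool boundary, structured IO, event-driven integration, approval, idempotency, audit evidence, compatibility policy, contract testing and portfolio presentation for financial retail.
Important note: This document is architecture learning, product design and portfolio material. It is not legal advice, a compliance conclusion, an audit opinion or a production approval. For a formal financial retail project, the Business Owner, Architecture, Engineering, Security, Privacy, Legal, Compliance, Model Risk, Operational Risk, Internal Audit and system owners must jointly confirm the applicable boundaries, customer impact, regulatory obligations and release gates.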
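Purpose, Audience and Core Viewpoints
Purpose
This playbook addresses a hard problem in enterprise AI adoption:
What an agent may read, write and send, which events it may trigger, and which customer rights it may affect:
these boundaries must be machine-verifiable, testable, auditable and versionable contracts,
not informal conventions hidden in prompts, meeting minutes, wikis or connector code.
The goal is not to teach OpenAPI or AsyncAPI basics. It is to practice applying these standards to an AI agent's tool boundaries, event-driven integration, structured output, permissions, approval, evals and audit.
Audience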
| Role | What this playbook helps you produce |
|---|---|
| AI Platform PM | Tool catalog, contract intake, release gate, adoption metric, tool reuse strategy. |
| Enterprise Architect | Agent integration principles, API/event taxonomy, governance model, cross-domain contract standard. |
| Solution Architect | Implementation design for OpenAPI / AsyncAPI / JSON Schema / CloudEvents, gateway, observability, versioning. |
| Advanced BA / CBAP | Turn business capabilities, decision boundaries, information needs, exception paths and audit evidence into contract assets. |
| Risk / Compliance Technology Lead | Approval rule, policy evidence, audit trail, replay scope, change control and evidence for regulatory inquiries. |
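Core Viewpoints
| Viewpoint | Explanation |
|---|---|
| Prompt is not a contract | A prompt can describe behavioral preferences, but it cannot carry interface validation, permission enforcement, side effect control, compatibility governance or audit evidence. |
| Tool is a governed capability | An agent tool is not a function wrapper. It is a business capability with an owner, schema, policy, side effects, audit and lifecycle. |
| API is for commands and queries | OpenAPI suits synchronous queries and commands, especially capabilities that need an immediate result, explicit failure and a verifiable error model. |
| Event is for facts and decoupling | AsyncAPI + CloudEvents suits facts that have already happened, multiple consumers, asynchronous processing, replay and event-driven agents. |
| JSON Schema is the IO control surface | Both input and output should be schematized so that model output, tool arguments, observations, evals and contract tests share one structural language. |
| Audit starts at contract design | Audit is not logging bolted on after launch. The contract itself defines correlation, causation, actor, policy decision, version and evidence refs. |
| Compatibility is a product policy | A contract change affects agent behavior, eval sets, workflows, downstream consumers and operational risk, so a clean technical build is not enough. |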
Source Anchors
The following official sources anchor the terminology and design boundaries. This document translates them into the language of enterprise AI tools, APIs, events and financial retail governance.
| Source | Official link | How this document uses it |
|---|---|---|
| OpenAPI Specification | https://spec.openapis.org/oas/latest.html | Describes synchronous HTTP APIs, command / query operations, request / response schemas, auth, errors, versioning and tool mapping. |
| AsyncAPI Specification | https://www.asyncapi.com/docs/reference/specification/latest | Describes event channels, messages, operations, server bindings, producer / consumer ownership and asynchronous contracts. |
| JSON Schema | https://json-schema.org/ | Unifies tool input, tool output, event payload, structured model output, eval fixtures and schema compatibility checks. |
| CloudEvents | https://cloudevents.io/ | Standardizes the event envelope, including id, source, type, subject, time, dataschema and extension attributes. |
| OpenTelemetry Documentation | https://opentelemetry.io/docs/ | Designs traces, spans, metrics, logs, baggage and cross-system observability, connecting model call, tool call, API call, event and audit. |
One-Sentence Positioning
Contract-first AI tool design means every agent capability is specified before execution:
what it accepts, what it returns, who may use it, what side effects it can create,
which approvals it needs, how it fails, how it emits events, how it is tested,
how it is observed, and how it changes without breaking consumers.
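In short:
Contract-first AI tool design defines every external capability of an agent as a verifiable, authorizable, auditable, testable and evolvable contract first, and only then hands it to the model for orchestration and execution.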
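1. Why AI Tools and APIs Cannot Rely on Prompt Text Conventions
A common early practice in AI agent projects looks like this:
System prompt: Call the refund tool only when necessary.
Tool description: refund_customer can issue a refund to the customer.
Developer note: Amounts above 100 USD must go to a supervisor.
Wiki: Create a ticket when a refund fails.
Conventions like these work for a demo but are not enough for a financial retail production system. The reason is not that prompts are useless, but that prompts lack the following engineering and governance capabilities.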
| Prompt-only convention | Missing capability | Contract-first design |
|---|---|---|
| “Call only when authorized” | The model does not know the real user, tenant, case relationship, purpose or approval state | The tool gateway calls the policy decision point (PDP) / policy enforcement point (PEP) and checks user, resource, purpose, risk tier and policy version. |
| “Output JSON” | The model may drop fields, change enums, add explanatory text, or be unable to express an error state | JSON Schema + structured output validation + repair / reject policy. |
| “Do not refund twice” | Retries, timeouts, concurrency and event replay can duplicate side effects | The command API uses an idempotency key, request hash and replayed result. |
| “High-risk actions need approval” | The model may misjudge risk, split actions or bypass approval | The contract declares side effect tier, approval subject, approval scope, expiry and binding. |
| “Trigger the agent on events” | Event semantics, schema, ordering, retry, DLQ and replay are undefined | AsyncAPI + CloudEvents + inbox dedupe + consumer contract. |
| “Write logs” | Log fields are unstable and cannot link model, tool, API, event and approval | OpenTelemetry trace + audit event schema + correlation / causation ID. |
| “Stay compatible later” | Field changes may break agent prompts, evals, consumers and dashboards | Compatibility policy + schema registry + consumer contract tests. |
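Key judgment:
A prompt can direct how the model reasons and expresses itself.
A contract determines which inputs, outputs, side effects, permissions, events, approvals, audit records and version changes the system allows.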
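1.1 What a Prompt Can Do
| Capability | Appropriate use |
|---|---|
| Task framing | State the current business task, e.g. summarize case, draft response, propose next action. |
| Reasoning style | Require the model to list evidence, assumptions, confidence and exceptions. |
| Tool selection hint | Tell the model to prefer read-only tools or to dry-run first. |
| Output formatting | Guide the model to produce schema-conformant fields and explanations. |
| Conversation UX | Control tone, clarifying questions and user-facing wording. |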
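1.2 What a Prompt Should Not Carry
| Should not carry | Reason |
|---|---|
| Authorization | Permissions must be enforced through identity, resource relationship, purpose, policy and the PEP. |
| Financial side effect control | Money movement needs idempotency, approval, ledger consistency and audit. |
| Regulatory decision boundary | High-risk boundaries such as KYC, AML, credit, complaint and dispute need external rules and an accountable human owner. |
| Schema compatibility | Compatibility is a producer / consumer contract, not a style of generated text. |
| Replay and recovery | Replay needs event IDs, a causation chain, inbox state, version freeze and compensation rules. |
| Audit evidence | Audit needs immutable evidence, versions, policy decisions, approvers and traces. |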
2. Contract-First Taxonomy for AI Agents
Contract-first is more than writing OpenAPI files. It is a taxonomy that models AI capabilities in layers.
2.1 Capability Types
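| Type | Semantics | Recommended contract | AI role | High-risk signals |
|---|---|---|---|---|
| Resource | A readable evidence or context object | URI + metadata + JSON Schema | Retrieve, cite, summarize | PII, PCI, policy doc, external untrusted content. |
| Query API | Reads current system state | OpenAPI GET / safe POST search | Pull facts, populate evidence | Overly broad fields, missing entitlement, unclear freshness. |
| Command API | Asks a system to perform an action | OpenAPI command operation | Generate an action proposal or execute under control | Writes, money movement, customer rights, state changes. |
| Event | Notification of a fact that has already happened | AsyncAPI message + CloudEvents | Trigger agent, enrichment, workflow | Multiple consumers, replay, ordering, schema break. |
| Workflow task | A controlled step in a long-running process | Workflow contract + task IO schema | Organize evidence, draft, recommend | State transitions, human review, SLA, compensation. |
| Decision API | A rule or policy judgment | Decision contract + JSON Schema | Supply facts, receive decision | Eligibility, KYC, AML, credit, advice boundary. |
| Tool | A capability the agent can call | Tool Contract Card + OpenAPI / JSON Schema | Call external capabilities | Side effect, approval, injection, rate abuse. |
| Observation | The result a tool returns to the agent | JSON Schema + trust label | Continue reasoning and generation | Tool output mistaken for system instructions. |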
2.2 Query, Command, Event, Decision
Splitting capabilities into four categories (query, command, event, decision, abbreviated CQED) avoids a single do_everything tool carrying all the risk.
| Category | Naming tendency | Typical examples | Contract priority |
|---|---|---|---|
| Query | getCaseSummary, searchTransactions | Query a dispute case, KYC profile, policy snippet | Entitlement, field minimization, freshness, pagination, source reference. |
| Command | createRefundProposal, sendCustomerNotice | Create a refund proposal, update case status, send a customer message | Idempotency, approval, validation, structured error, audit. |
| Event | payment.dispute.opened.v1 | Dispute opened, KYC status changed, AML evidence requested | CloudEvents metadata, schema, producer, consumer, replay, DLQ. |
| Decision | evaluateRefundEligibility | Judge eligibility, policy route, advice boundary, tool authorization | Rule id, policy version, input facts hash, reason, explainability. |
2.3 Side Effect Tiers
Tier every agent capability by side effect, and write the tier into the tool contract and the release gate.
| Tier | Description | Financial retail example | Default controls |
|---|---|---|---|
| S0 Read public | Reads public or non-sensitive material | Public FAQ on product fees and rates | logging, source citation. |
| S1 Read restricted | Reads restricted internal or customer data | Query a case summary or KYC document summary | entitlement, data minimization, audit. |
| S2 Analyze / draft | Generates internal drafts or recommendations without writing to the source of truth | dispute memo draft, AML narrative draft | output schema, citation, human review sampling. |
| S3 Internal write | Writes to internal systems without sending anything externally or moving funds | Create a task, update a case note | approval by policy, idempotency, audit event. |
| S4 Customer-visible / financial | Sends customer messages, issues refunds, adjusts customer rights | send notice, create provisional credit | human approval, dual control by threshold, compensation, full trace. |
| S5 Regulated final decision | Final conclusions that affect KYC / AML / credit / complaint / legal records | close AML alert, reject KYC, adverse action notice | agent assist only, deterministic decision service, named human owner. |
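One way to make the tier table enforceable outside the model is a small gate function in the tool gateway. The sketch below is a simplified assumption, not a complete policy: `Tier`, `CallContext` and `gate` are hypothetical names, and the three context fields stand in for a PDP result, an idempotency header and an approval record that a real gateway would resolve itself.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Tier(IntEnum):
    S0 = 0  # read public
    S1 = 1  # read restricted
    S2 = 2  # analyze / draft
    S3 = 3  # internal write
    S4 = 4  # customer-visible / financial
    S5 = 5  # regulated final decision


@dataclass(frozen=True)
class CallContext:
    # Hypothetical inputs resolved by the gateway, never by the model.
    entitled: bool = False
    idempotency_key: Optional[str] = None
    human_approval_id: Optional[str] = None


def gate(tier: Tier, ctx: CallContext) -> str:
    """Return 'allow', 'deny' or 'assist_only' for a proposed tool call."""
    if tier >= Tier.S5:
        return "assist_only"  # the agent may draft, a named human decides
    if tier >= Tier.S1 and not ctx.entitled:
        return "deny"
    if tier >= Tier.S3 and ctx.idempotency_key is None:
        return "deny"
    if tier >= Tier.S4 and ctx.human_approval_id is None:
        return "deny"
    return "allow"


print(gate(Tier.S4, CallContext(entitled=True, idempotency_key="k-1")))
```

The checks are cumulative, so a higher tier inherits every control of the tiers below it. Conditional rules from the table, such as approval by policy at S3 or dual control by threshold at S4, are deliberately left out of this sketch.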
2.4 Trust Boundaries
| Boundary | Design question | Contract artifact |
|---|---|---|
| User to agent | Does user input contain prompt injection, PII, coercion or a request beyond the user's authority? | Input classification schema, intent taxonomy. |
| Agent to tool | Does the tool call proposed by the model conform to schema, purpose and permission? | Tool Contract Card, JSON Schema, PDP policy. |
| Tool to source system | Is the connector least-privilege, auditable and idempotency-aware? | OpenAPI contract, auth scope, audit header standard. |
| Source system to event bus | Does the event represent a fact, can it be replayed, and does it have a schema owner? | AsyncAPI, CloudEvents, schema registry. |
| Event bus to agent consumer | Does the agent consumer dedupe, rate-limit, route to a DLQ and replay safely? | Consumer contract, inbox schema, replay policy. |
| Tool output to model context | Is the observation treated as evidence rather than instruction? | Observation schema, trust label, source provenance. |
3. OpenAPI for Agent Tools and Synchronous APIs
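The value of OpenAPI goes beyond generating SDKs for developers. For an AI agent, it is the machine-readable contract for synchronous capabilities, and it can directly drive the tool catalog, schema validation, policy review, mock servers, contract tests and eval fixtures.
3.1 Which Capabilities Fit OpenAPI
| Capability | Why it fits | Notes |
|---|---|---|
| Read query | Needs an immediate response, has clear parameters, can be paginated | Do not return overly broad fields. Entitlement and sensitivity labels are mandatory. |
| Search query | Needs complex filtering or structured retrieval beyond semantic search | Keep read-only semantics even when using POST /search. |
| Command request | A write action needs an explicit request, response and error | Define idempotency, approval and audit headers for every write action. |
| Dry-run / impact preview | Compute the impact of an action first, then approve or execute | A dry-run response must not modify the source of truth. |
| Decision service | Synchronously returns allow / deny / review / reason | Make the policy version, rule id and input facts hash explicit. |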
3.2 OpenAPI Operation Design for Tools
| OpenAPI element | Agent-oriented requirement |
|---|---|
| operationId | Maps stably to the tool name, e.g. paymentDisputeCreateRefundProposal, and does not change with copy edits. |
| summary | One-sentence business capability, with no authorization promises. |
| description | Includes allowed intents, non-goals, side effect tier and human review boundary. |
| requestBody | JSON Schema strictly defines fields, enums, formats, minimum and maximum values, and additionalProperties: false. |
| responses | Model success, validation error, entitlement denied, approval required, conflict and dependency timeout separately. |
| security | States user-delegated token, service identity, OAuth scope, mTLS or internal auth. |
| parameters | Required headers include correlation ID, actor, purpose, idempotency key and policy decision reference. |
| examples | Include normal, approval required, policy denied, conflict, timeout and redacted output. |
3.3 Tool Mapping Pattern
When exposing OpenAPI operations to an agent, do not hand the whole API surface to the model. Generate controlled tools through a tool gateway instead.
OpenAPI contract
-> API owner approval
-> tool capability card
-> risk tier classification
-> policy mapping
-> schema validation
-> model-facing tool definition
-> gateway execution
-> normalized observation
-> audit event
| API field | Tool-facing interpretation |
|---|---|
| operationId | Tool name baseline. |
| Request schema | Tool input schema. |
| Response schema | Observation schema. |
| Error response | Agent recoverability model. |
| Security scheme | Gateway credential and user delegation requirement. |
| Tags | Domain, owner, use case group. |
| Extensions | x-risk-tier, x-side-effect, x-approval-required, x-audit-event, x-idempotency-required. |
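The mapping above can be sketched as a small function that turns one parsed OpenAPI operation object into a model-facing tool definition. This is an illustrative assumption about how a gateway might do it: `to_tool_definition` and the output keys (`name`, `input_schema`, `side_effect_tier`, `idempotency_required`) are hypothetical, and `$ref` resolution is omitted.

```python
from typing import Any, Dict, Optional


def to_tool_definition(operation: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Map an OpenAPI operation object to a tool definition.

    Returns None when the operation is not approved for agent exposure,
    so unreviewed operations never reach the model.
    """
    meta = operation.get("x-ai-tool", {})
    if meta.get("exposure") != "approved":
        return None
    content = operation.get("requestBody", {}).get("content", {})
    schema = content.get("application/json", {}).get("schema", {})
    return {
        "name": operation["operationId"],            # tool name baseline
        "description": operation.get("summary", ""),
        "input_schema": schema,                      # request schema, unresolved
        "side_effect_tier": meta.get("sideEffectTier", "UNCLASSIFIED"),
        "idempotency_required": meta.get("idempotency", {}).get("required", False),
    }


operation = {
    "operationId": "paymentDisputeCreateRefundProposal",
    "summary": "Create a refund proposal for a payment dispute case",
    "requestBody": {"content": {"application/json": {"schema": {"type": "object"}}}},
    "x-ai-tool": {
        "exposure": "approved",
        "sideEffectTier": "S3_INTERNAL_WRITE",
        "idempotency": {"required": True},
    },
}
print(to_tool_definition(operation)["name"])
```

Defaulting to `None` for anything without an explicit `exposure: approved` marker keeps the tool catalog opt-in rather than opt-out.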
3.4 Recommended OpenAPI Extensions
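OpenAPI allows specification extensions (vendor extensions). An enterprise AI platform can agree on the following x- fields to make architecture review and automated checks more consistent.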
x-ai-tool:
  exposure: approved
  sideEffectTier: S3_INTERNAL_WRITE
  allowedIntents:
    - payment_dispute_investigation
    - customer_complaint_resolution
  disallowedIntents:
    - marketing_retention
    - general_goodwill_refund
  approval:
    required: conditional
    policy: refund_proposal_approval_policy_v3
  idempotency:
    required: true
    keySemantics: case_id + action_type + proposal_hash
  audit:
    eventType: com.momofinance.agent.tool.executed.v1
    requiredHeaders:
      - X-Correlation-Id
      - X-Actor-Id
      - X-Purpose
      - X-Policy-Decision-Id
  observation:
    trustLevel: trusted_system
    sourceSystem: payment-dispute-platform
3.5 Example: Refund Proposal Command
This is a compact OpenAPI-style excerpt for an agent-safe command. It creates a proposal, not a direct refund execution.
openapi: 3.1.0
info:
  title: Payment Dispute Action API
  version: 2026.06.1
paths:
  /disputes/{case_id}/refund-proposals:
    post:
      operationId: paymentDisputeCreateRefundProposal
      summary: Create a refund proposal for a payment dispute case
      description: Creates an internal proposal for human review. It does not post ledger entries or send customer notifications.
      tags:
        - payment-dispute
        - agent-tool
      parameters:
        - name: case_id
          in: path
          required: true
          schema:
            type: string
            pattern: "^DSP-[0-9]{4}-[0-9]{5}$"
        - name: Idempotency-Key
          in: header
          required: true
          schema:
            type: string
            minLength: 32
            maxLength: 128
        - name: X-Correlation-Id
          in: header
          required: true
          schema:
            type: string
        - name: X-Actor-Id
          in: header
          required: true
          schema:
            type: string
        - name: X-Purpose
          in: header
          required: true
          schema:
            type: string
            enum:
              - dispute_investigation
              - complaint_resolution
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/CreateRefundProposalRequest"
      responses:
        "201":
          description: Proposal created or idempotently returned
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/RefundProposalObservation"
        "400":
          description: Validation failure
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ToolError"
        "403":
          description: Entitlement or policy denied
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ToolError"
        "409":
          description: Idempotency conflict or case state conflict
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ToolError"
        "428":
          description: Additional approval is required before execution
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ApprovalRequired"
      x-ai-tool:
        exposure: approved
        sideEffectTier: S3_INTERNAL_WRITE
        approval:
          required: conditional
          policy: dispute_refund_proposal_policy_v3
        idempotency:
          required: true
          keySemantics: case_id + action_type + proposal_hash
components:
  schemas:
    CreateRefundProposalRequest:
      type: object
      required:
        - amount
        - currency
        - reason_code
        - evidence_refs
      additionalProperties: false
      properties:
        amount:
          type: string
          pattern: "^[0-9]+(\\.[0-9]{2})$"
        currency:
          type: string
          enum:
            - USD
        reason_code:
          type: string
          enum:
            - goods_not_received
            - duplicate_charge
            - merchant_error
            - fraud_claim
        evidence_refs:
          type: array
          minItems: 1
          items:
            type: string
        agent_rationale:
          type: string
          maxLength: 2000
    RefundProposalObservation:
      type: object
      required:
        - status
        - proposal_id
        - case_id
        - approval_status
        - audit_ref
        - source_system
      additionalProperties: false
      properties:
        status:
          type: string
          enum:
            - created
            - returned_existing
        proposal_id:
          type: string
        case_id:
          type: string
        approval_status:
          type: string
          enum:
            - pending_review
            - supervisor_required
            - policy_denied
        audit_ref:
          type: string
        source_system:
          type: string
          const: payment-dispute-platform
    ToolError:
      type: object
      required:
        - error_code
        - message
        - recoverability
        - audit_ref
      additionalProperties: false
      properties:
        error_code:
          type: string
          enum:
            - validation_failed
            - entitlement_denied
            - policy_denied
            - idempotency_conflict
            - case_state_conflict
            - dependency_timeout
        message:
          type: string
        recoverability:
          type: string
          enum:
            - agent_can_retry
            - human_review_required
            - do_not_retry
        audit_ref:
          type: string
    ApprovalRequired:
      type: object
      required:
        - required_approval_type
        - reason_code
        - approval_packet_ref
      additionalProperties: false
      properties:
        required_approval_type:
          type: string
          enum:
            - supervisor
            - dual_control
            - compliance_review
        reason_code:
          type: string
        approval_packet_ref:
          type: string
3.6 OpenAPI Review Questions
| Area | Review question |
|---|---|
| Business boundary | Does the operation represent one clear query or command, not a hidden workflow? |
| Tool exposure | Is this operation approved for agent use, or only for internal service-to-service use? |
| Input schema | Are fields typed, bounded, enumerated, and protected from extra parameters? |
| Output schema | Can an agent reliably distinguish success, pending, denied, conflict, and retryable failure? |
| Permission | Does execution depend on user, tenant, case relationship, purpose, and policy decision? |
| Side effect | Is the side effect tier explicit and enforced outside the model? |
| Idempotency | Does every write command accept an idempotency key and request hash behavior? |
| Audit | Are correlation, actor, purpose, policy decision and approval references mandatory? |
| Version | Is the operation versioned and covered by contract tests before tool exposure changes? |
4. AsyncAPI for Events and Event-Driven Agents
AsyncAPI describes asynchronous APIs: channels, messages, operations, servers and bindings. For AI agents, it is the backbone of event-driven integration.
4.1 When to Use Events
Use events when the message is a fact or state change that multiple consumers may need independently.
| Use event when | Financial retail example |
|---|---|
| A business fact has happened | payment.dispute.opened.v1 after a dispute case is created. |
| Multiple consumers react differently | Case dashboard, notification workflow, analytics, agent evidence worker. |
| Processing can be asynchronous | Agent creates case summary after event, not inside the customer transaction path. |
| Replay is useful | Rebuild evidence packs or regenerate summaries after a model or policy change. |
| Loose coupling matters | KYC status change should not require KYC service to call every downstream system synchronously. |
Do not use an event to hide a command. If the semantic meaning is “please perform a refund”, use a command API or workflow task. If the semantic meaning is “refund approved”, use an event.
4.2 Event Types for Agents
| Event category | Meaning | Agent usage |
|---|---|---|
| Domain event | Business fact already happened | Trigger summary, enrichment, risk triage, workflow step. |
| Command result event | A command completed or failed | Observe execution and update case status. |
| Audit event | A controlled action or decision occurred | Support investigation, monitoring, policy evidence. |
| Evaluation event | Model/tool output evaluated | Feed EvalOps and release gates. |
| Control event | Platform-level state changed | Disable tool, rotate schema, pause consumer. |
4.3 AsyncAPI Design Elements
| AsyncAPI element | Agent-oriented requirement |
|---|---|
| servers | Broker endpoints and environment boundaries, with security scheme and region. |
| channels | Domain-oriented stream names, e.g. payments.disputes.events.v1. |
| operations | Explicit publish / subscribe ownership. |
| messages | CloudEvents envelope plus payload schema. |
| components.schemas | JSON Schema for event data and headers. |
| bindings | Kafka, AMQP, SNS/SQS, EventBridge or broker-specific constraints. |
| examples | Include normal, redacted, high-risk, replay marker and schema evolution cases. |
4.4 CloudEvents Envelope Standard
CloudEvents provides a common event metadata envelope. It does not replace domain payload schema; it wraps it.
{
  "specversion": "1.0",
  "id": "evt_20260629_140501_00091",
  "source": "urn:service:payment-dispute",
  "type": "com.momofinance.payment.dispute.opened.v1",
  "subject": "dispute/DSP-2026-09291",
  "time": "2026-06-29T14:05:01Z",
  "datacontenttype": "application/json",
  "dataschema": "https://schemas.example.com/payment/dispute-opened/1.0.0",
  "correlationid": "corr_7d21d6c4",
  "causationid": "cmd_create_dispute_48391",
  "tenantid": "retail-us",
  "actorid": "service:dispute-case-api",
  "riskclass": "customer-impacting",
  "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
  "data": {
    "case_id": "DSP-2026-09291",
    "customer_ref": "cust_ref_9f22",
    "amount": "128.40",
    "currency": "USD",
    "reason_code": "goods_not_received",
    "opened_channel": "mobile_app"
  }
}
Recommended extension attributes:
| Attribute | Purpose |
|---|---|
| correlationid | Connect user request, model call, tool call, API command, workflow and event. |
| causationid | Identify the command, event or workflow step that caused this event. |
| tenantid | Enforce multi-tenant isolation and audit filtering. |
| actorid | Identify human, service, agent or workflow actor. |
| policydecisionid | Link to authorization, approval or guardrail decision. |
| schemaid | Immutable schema registry identifier. |
| riskclass | Low, restricted, customer-impacting, financial, regulated. |
| traceparent | Propagate OpenTelemetry distributed trace context. |
| replayid | Mark controlled replay and connect to replay approval. |
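As a rough illustration of the envelope rules, the sketch below builds a CloudEvents-style dictionary and checks two things: the four required context attributes are present, and every attribute name uses only lowercase letters and digits, which is why the extensions above are spelled `correlationid` rather than `correlation_id`. The helper names `make_event` and `envelope_problems` are hypothetical, and this is not a full CloudEvents validator.

```python
import re
from datetime import datetime, timezone
from uuid import uuid4

REQUIRED = ("specversion", "id", "source", "type")
# CloudEvents attribute names: lowercase ASCII letters and digits only.
ATTRIBUTE_NAME = re.compile(r"[a-z0-9]+")
# JSON-format members that carry the payload rather than context attributes.
PAYLOAD_MEMBERS = {"data", "data_base64"}


def make_event(event_type, source, data, **extensions):
    """Build an envelope; extensions become top-level attributes."""
    event = {
        "specversion": "1.0",
        "id": f"evt_{uuid4().hex}",
        "source": source,
        "type": event_type,
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }
    event.update(extensions)
    return event


def envelope_problems(event):
    """Return a list of human-readable problems; empty means acceptable."""
    problems = [f"missing required attribute: {n}" for n in REQUIRED if n not in event]
    for name in event:
        if name not in PAYLOAD_MEMBERS and not ATTRIBUTE_NAME.fullmatch(name):
            problems.append(f"invalid attribute name: {name}")
    return problems


event = make_event(
    "com.momofinance.payment.dispute.opened.v1",
    "urn:service:payment-dispute",
    {"case_id": "DSP-2026-09291"},
    correlationid="corr_7d21d6c4",
)
print(envelope_problems(event))
```

A producer-side check like this can run in contract tests so that a misnamed extension attribute is caught before any consumer depends on it.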
4.5 AsyncAPI Excerpt
asyncapi: 3.0.0
info:
  title: Payment Dispute Event API
  version: 2026.06.1
servers:
  production:
    host: events.internal.example.com
    protocol: kafka
    description: Production event broker for payment dispute domain
channels:
  payments.disputes.events.v1:
    address: payments.disputes.events.v1
    messages:
      DisputeOpened:
        $ref: "#/components/messages/DisputeOpened"
operations:
  publishDisputeOpened:
    action: send
    channel:
      $ref: "#/channels/payments.disputes.events.v1"
    messages:
      - $ref: "#/channels/payments.disputes.events.v1/messages/DisputeOpened"
  subscribeDisputeOpenedForAgentEvidence:
    action: receive
    channel:
      $ref: "#/channels/payments.disputes.events.v1"
    messages:
      - $ref: "#/channels/payments.disputes.events.v1/messages/DisputeOpened"
    x-agent-consumer:
      owner: ai-case-evidence-platform
      consumerGroup: dispute-evidence-agent-v1
      inboxDedupe: required
      dlqOwner: ai-integration-operations
      replayAllowed: approved_replay_only
components:
  messages:
    DisputeOpened:
      name: com.momofinance.payment.dispute.opened.v1
      title: Payment dispute opened
      contentType: application/cloudevents+json
      payload:
        $ref: "#/components/schemas/DisputeOpenedCloudEvent"
      examples:
        - name: mobile_goods_not_received
          payload:
            specversion: "1.0"
            id: evt_20260629_140501_00091
            source: urn:service:payment-dispute
            type: com.momofinance.payment.dispute.opened.v1
            subject: dispute/DSP-2026-09291
            time: "2026-06-29T14:05:01Z"
            datacontenttype: application/json
            dataschema: https://schemas.example.com/payment/dispute-opened/1.0.0
            correlationid: corr_7d21d6c4
            data:
              case_id: DSP-2026-09291
              customer_ref: cust_ref_9f22
              amount: "128.40"
              currency: USD
              reason_code: goods_not_received
              opened_channel: mobile_app
  schemas:
    DisputeOpenedCloudEvent:
      type: object
      required:
        - specversion
        - id
        - source
        - type
        - subject
        - time
        - datacontenttype
        - dataschema
        - data
      additionalProperties: true
      properties:
        specversion:
          type: string
          const: "1.0"
        id:
          type: string
        source:
          type: string
        type:
          type: string
          const: com.momofinance.payment.dispute.opened.v1
        subject:
          type: string
        time:
          type: string
          format: date-time
        datacontenttype:
          type: string
          const: application/json
        dataschema:
          type: string
          format: uri
        correlationid:
          type: string
        causationid:
          type: string
        traceparent:
          type: string
        data:
          $ref: "#/components/schemas/DisputeOpenedData"
    DisputeOpenedData:
      type: object
      required:
        - case_id
        - customer_ref
        - amount
        - currency
        - reason_code
        - opened_channel
      additionalProperties: false
      properties:
        case_id:
          type: string
        customer_ref:
          type: string
        amount:
          type: string
          pattern: "^[0-9]+(\\.[0-9]{2})$"
        currency:
          type: string
          enum:
            - USD
        reason_code:
          type: string
          enum:
            - goods_not_received
            - duplicate_charge
            - fraud_claim
            - merchant_error
        opened_channel:
          type: string
          enum:
            - mobile_app
            - branch
            - call_center
            - web
4.6 Event Consumer Contract
Every agent consumer should have a consumer contract, not just a subscribed topic.
| Field | Example decision |
|---|---|
| Consumer name | dispute-evidence-agent-v1 |
| Business purpose | Generate evidence summary draft for newly opened dispute cases. |
| Owner | AI Case Evidence Platform. |
| Input events | com.momofinance.payment.dispute.opened.v1. |
| Output artifacts | Internal evidence summary, no customer-visible output. |
| Output events | com.momofinance.agent.evidence_summary.generated.v1. |
| Dedupe | Inbox table keyed by CloudEvents id and consumer name. |
| Ordering | Per case_id; stale events rechecked against case API before action. |
| Retry | Exponential backoff for dependency timeout; no retry for schema or policy denied. |
| DLQ | Classified as schema, entitlement, dependency, policy, poison. |
| Replay | Requires replay approval, fixed schema version, fixed model / prompt / tool versions. |
| Kill switch | Can pause consumer without stopping dispute domain events. |
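The dedupe and retry rows can be sketched with an inbox table keyed by CloudEvents id and consumer name. The example uses an in-memory SQLite database purely for illustration. The table name `inbox`, the helpers `open_inbox`, `claim` and `should_retry`, and the failure-class strings are assumptions that mirror the DLQ classification above.

```python
import sqlite3


def open_inbox(path=":memory:"):
    """Create the inbox table; the composite key enforces dedupe."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS inbox ("
        " event_id TEXT NOT NULL,"
        " consumer TEXT NOT NULL,"
        " PRIMARY KEY (event_id, consumer))"
    )
    return conn


def claim(conn, event_id, consumer):
    """Return True only the first time this consumer sees the event."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO inbox (event_id, consumer) VALUES (?, ?)",
                (event_id, consumer),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery or replay of a processed event


# Only dependency failures are retried; schema, entitlement, policy and
# poison failures go to the DLQ without retry.
RETRYABLE = {"dependency"}


def should_retry(failure_class):
    return failure_class in RETRYABLE


conn = open_inbox()
print(claim(conn, "evt_20260629_140501_00091", "dispute-evidence-agent-v1"))
print(claim(conn, "evt_20260629_140501_00091", "dispute-evidence-agent-v1"))
```

In a real consumer the inbox insert would normally share a transaction with the consumer's own write, so that a crash between claiming and processing does not silently drop the event.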
5. JSON Schema for Structured IO
JSON Schema is the common structure layer across OpenAPI, AsyncAPI, tool calls, model output, observations and eval tests.
5.1 Where JSON Schema Should Be Used
| Surface | Why schema matters |
|---|---|
| Tool arguments | Prevent extra fields, wrong enums, prompt-injected parameters and malformed requests. |
| Tool observations | Make success, failure, partial data, source, sensitivity and freshness machine-readable. |
| Model structured output | Let orchestration validate extraction, classification, recommendation and draft metadata. |
| Event payloads | Stabilize producer / consumer contract and replay. |
| Decision service input | Freeze facts used for allow, deny, review and reason codes. |
| Eval fixtures | Use same schema for golden outputs, regression tests and red-team cases. |
| Audit events | Ensure evidence fields are consistently populated across tools and workflows. |
5.2 Schema Design Principles for Agent IO
| Principle | Practice |
|---|---|
| Close the shape | Use additionalProperties: false for agent-facing request and output schemas where feasible. |
| Prefer enums | Use controlled values for status, reason_code, risk_tier, decision, recoverability. |
| Separate data and explanation | Keep machine decision fields separate from human-readable rationale. |
| Include provenance | Every observation should identify source system, source record, retrieved time and data classification. |
| Distinguish unknown and absent | Use explicit values like unknown, not_applicable, not_authorized, not empty strings. |
| Bound free text | Use maxLength for rationales, notes and summaries. |
| Model money safely | Use string decimal plus currency, not floating point number. |
| Make partial results explicit | Include completeness, missing_fields, source_errors for degraded paths. |
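To show what "close the shape", "prefer enums", "bound free text" and "model money safely" mean at the gateway, here is a deliberately tiny hand-rolled check. It covers only string fields with `enum`, `pattern` and `maxLength`, and it is not a JSON Schema implementation; a real platform would more likely use a full validator. `REQUEST_SCHEMA` is a reduced, hypothetical version of the refund proposal request.

```python
import re
from decimal import Decimal

REQUEST_SCHEMA = {
    "required": ["amount", "currency", "reason_code"],
    "properties": {
        "amount": {"pattern": r"[0-9]+\.[0-9]{2}"},
        "currency": {"enum": ["USD"]},
        "reason_code": {"enum": ["goods_not_received", "duplicate_charge"]},
        "agent_rationale": {"maxLength": 2000},
    },
}


def validate(args, schema=REQUEST_SCHEMA):
    """Return a list of errors for model-proposed tool arguments."""
    errors = [
        f"missing required field: {name}"
        for name in schema["required"]
        if name not in args
    ]
    for name, value in args.items():
        rule = schema["properties"].get(name)
        if rule is None:
            errors.append(f"unexpected field: {name}")  # closed shape
        elif not isinstance(value, str):
            errors.append(f"{name}: expected string")
        elif "enum" in rule and value not in rule["enum"]:
            errors.append(f"{name}: value not in enum")
        elif "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            errors.append(f"{name}: does not match pattern")
        elif "maxLength" in rule and len(value) > rule["maxLength"]:
            errors.append(f"{name}: longer than {rule['maxLength']}")
    return errors


proposed = {"amount": "128.40", "currency": "USD", "reason_code": "goods_not_received"}
print(validate(proposed))
# Money stays exact when parsed from the string decimal.
print(Decimal(proposed["amount"]) + Decimal("0.10"))
```

Rejecting unknown fields is what stops a prompt-injected argument such as `override_approval` from ever reaching the source system.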
5.3 Observation Schema
Tool output should be a normalized observation. It should not be treated as instruction.
{
  "$id": "https://schemas.example.com/agent/tool-observation/1.0.0",
  "type": "object",
  "required": [
    "status",
    "tool_name",
    "tool_version",
    "source_system",
    "trust_level",
    "retrieved_at",
    "data_classification",
    "result",
    "audit_ref"
  ],
  "additionalProperties": false,
  "properties": {
    "status": {
      "type": "string",
      "enum": ["success", "partial", "denied", "failed"]
    },
    "tool_name": { "type": "string" },
    "tool_version": { "type": "string" },
    "source_system": { "type": "string" },
    "trust_level": {
      "type": "string",
      "enum": ["trusted_system", "trusted_policy", "untrusted_user_content", "vendor_content"]
    },
    "retrieved_at": {
      "type": "string",
      "format": "date-time"
    },
    "data_classification": {
      "type": "string",
      "enum": ["public", "internal", "pii", "pci", "restricted", "regulated"]
    },
    "result": { "type": "object" },
    "citations": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["source_ref", "record_type", "effective_time"],
        "additionalProperties": false,
        "properties": {
          "source_ref": { "type": "string" },
          "record_type": { "type": "string" },
          "effective_time": {
            "type": "string",
            "format": "date-time"
          }
        }
      }
    },
    "audit_ref": { "type": "string" }
  }
}
5.4 Structured Model Output Schema
For agent planning and recommendation, schema should separate extraction, proposal and confidence.
{
  "$id": "https://schemas.example.com/agent/action-proposal/1.0.0",
  "type": "object",
  "required": [
    "intent",
    "risk_tier",
    "proposed_action",
    "required_tools",
    "evidence_refs",
    "approval_expectation",
    "confidence",
    "rationale"
  ],
  "additionalProperties": false,
  "properties": {
    "intent": {
      "type": "string",
      "enum": [
        "dispute_investigation",
        "kyc_gap_review",
        "aml_evidence_summary",
        "customer_service_response",
        "policy_lookup"
      ]
    },
    "risk_tier": {
      "type": "string",
      "enum": ["S0", "S1", "S2", "S3", "S4", "S5"]
    },
    "proposed_action": {
      "type": "string",
      "enum": [
        "read_case",
        "draft_summary",
        "create_internal_task",
        "create_refund_proposal",
        "send_customer_notice",
        "route_to_human"
      ]
    },
    "required_tools": {
      "type": "array",
      "items": { "type": "string" }
    },
    "evidence_refs": {
      "type": "array",
      "minItems": 1,
      "items": { "type": "string" }
    },
    "approval_expectation": {
      "type": "string",
      "enum": ["none", "policy_check", "human_review", "dual_control", "not_allowed"]
    },
    "confidence": {
      "type": "number",
      "minimum": 0,
      "maximum": 1
    },
    "rationale": {
      "type": "string",
      "maxLength": 1500
    }
  }
}
5.5 Schema Review Questions
| Area | Question |
|---|---|
| Meaning | Does every field have clear business meaning, not just technical name? |
| Authority | Which system is authoritative for this field? |
| Sensitivity | Is PII, PCI, confidential, restricted or regulated data labeled? |
| Enum | Are status and reason values controlled and documented? |
| Evolution | Can optional fields be added without breaking current consumers? |
| Defaults | Are defaults safe, or do they hide missing data? |
| Validation | Are numeric, date, string and array bounds explicit? |
| Prompt risk | Can a free text field carry untrusted instructions into model context? |
| Evidence | Can output be traced to source records and policy versions? |
| Eval | Can the schema be used in golden set and regression tests? |
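The "Evolution" question can be partly automated. The sketch below compares two versions of a request (input) schema from the caller's point of view and flags three changes that would break an existing agent: a newly required field, a removed field and a removed enum value. The function name `breaking_changes` is hypothetical, the rules assume a closed request schema, and response schemas need the opposite reasoning for enums (there, added values are the risk).

```python
def breaking_changes(old, new):
    """List changes in a request schema that break existing callers."""
    problems = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})

    # Callers built against the old version do not send new required fields.
    for name in sorted(set(new.get("required", [])) - set(old.get("required", []))):
        problems.append(f"new required field: {name}")

    # With a closed shape, callers still sending a removed field are rejected.
    for name in sorted(set(old_props) - set(new_props)):
        problems.append(f"removed field: {name}")

    # Callers may still send enum values that the new version no longer accepts.
    for name in sorted(set(old_props) & set(new_props)):
        old_enum = old_props[name].get("enum")
        new_enum = new_props[name].get("enum")
        if old_enum is not None and new_enum is not None:
            for value in sorted(set(old_enum) - set(new_enum)):
                problems.append(f"removed enum value: {name}={value}")
    return problems


v1 = {
    "required": ["amount"],
    "properties": {"amount": {}, "reason_code": {"enum": ["fraud_claim", "merchant_error"]}},
}
v2 = {
    "required": ["amount", "currency"],
    "properties": {"amount": {}, "currency": {}, "reason_code": {"enum": ["fraud_claim"]}},
}
print(breaking_changes(v1, v2))
```

A check like this fits in a release gate next to consumer contract tests; it does not replace them, because it knows nothing about how prompts or eval sets depend on a field.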
6. Idempotency, Side Effect, Approval, Audit and Versioning
These five topics decide whether agent tools are production-grade.
6.1 Idempotency
Agent systems retry. HTTP clients retry. Message brokers redeliver. Humans refresh pages. Models may propose the same action twice. Without idempotency, a temporary timeout can become duplicate refund, duplicate notification, duplicate task or duplicate evidence write.
| Action type | Idempotency key design |
|---|---|
| Create proposal | case_id + proposed_action_type + evidence_hash + policy_version |
| Execute approved action | approval_id + approved_action_hash |
| Send customer message | case_id + message_template_id + approved_content_hash + recipient_ref |
| Create internal task | source_event_id + task_type + resource_ref |
| Event consumer processing | cloud_event_id + consumer_name |
Required behavior:
| Scenario | Expected system behavior |
|---|---|
| Same key, same payload hash | Return original result, no duplicate side effect. |
| Same key, different payload hash | Reject with idempotency conflict. |
| Timeout after successful write | Retry returns original result and audit reference. |
| Event replay | Consumer inbox detects processed event and marks replay as duplicate-safe. |
| Expired key | Retention policy must match business risk and dispute / audit period. |
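The first three rows of the behavior table can be sketched with an in-memory store. This is a minimal illustration under stated assumptions: `IdempotencyStore` and `IdempotencyConflict` are hypothetical names, the payload hash is SHA-256 over canonical JSON, and a production store would be durable, concurrency-safe and subject to the retention policy in the last row.

```python
import hashlib
import json


class IdempotencyConflict(Exception):
    """Same key was reused with a different payload."""


class IdempotencyStore:
    def __init__(self):
        self._records = {}  # key -> (payload_hash, result)

    @staticmethod
    def _hash(payload):
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def execute(self, key, payload, action):
        payload_hash = self._hash(payload)
        record = self._records.get(key)
        if record is not None:
            stored_hash, result = record
            if stored_hash != payload_hash:
                raise IdempotencyConflict(key)
            # Retry or replay: return the original result, no new side effect.
            return {**result, "status": "returned_existing"}
        result = action(payload)  # the side effect happens exactly once
        self._records[key] = (payload_hash, result)
        return {**result, "status": "created"}


store = IdempotencyStore()
create = lambda payload: {"proposal_id": "prop_001"}
print(store.execute("key-1", {"amount": "128.40"}, create))
print(store.execute("key-1", {"amount": "128.40"}, create))
```

The `status` values match the `created` and `returned_existing` enum of the refund proposal observation, so an agent can tell a replayed result from a fresh one.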
6.2 Side Effect
Every contract should declare side effect in a way both humans and automation can enforce.
| Side effect attribute | Example |
|---|---|
| sideEffectTier | S3_INTERNAL_WRITE |
| sourceOfTruthChange | case_management.case_note.created |
| customerVisible | false for internal note, true for outbound notice. |
| financialMovement | false for proposal, true for ledger posting. |
| reversible | compensating_action_required for incorrect notice. |
| compensationOwner | Payment Dispute Operations. |
Design rule:
If a tool can alter funds, account status, customer rights, KYC/AML state, complaint record or customer-visible communication,
it must not be exposed as a low-friction model tool. It needs policy gate, approval binding, audit and recovery design.
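One way to make this design rule enforceable by automation is to evaluate the declared side effect attributes before a tool is registered. The sketch below is illustrative only: `SideEffectDeclaration`, `must_be_gated` and the `alters_protected_state` flag are hypothetical names, and the flag is an assumed shorthand for account status, customer rights, KYC/AML state or complaint record changes.

```python
from dataclasses import dataclass

# Controls named in the design rule for tools that must not be low-friction.
GATED_CONTROLS = ("policy_gate", "approval_binding", "audit", "recovery_design")


@dataclass(frozen=True)
class SideEffectDeclaration:
    side_effect_tier: str
    source_of_truth_change: str
    customer_visible: bool
    financial_movement: bool
    # Hypothetical flag: account status, customer rights, KYC/AML state
    # or complaint record is altered by the tool.
    alters_protected_state: bool = False


def must_be_gated(declaration: SideEffectDeclaration) -> bool:
    """Return True when the tool may not be exposed as a low-friction model tool."""
    return (
        declaration.financial_movement
        or declaration.customer_visible
        or declaration.alters_protected_state
    )


def missing_controls(declaration: SideEffectDeclaration, declared_controls) -> list:
    """List the controls a gated tool still lacks; empty for ungated tools."""
    if not must_be_gated(declaration):
        return []
    return [c for c in GATED_CONTROLS if c not in set(declared_controls)]
```

A registration pipeline could refuse any tool for which `missing_controls` returns a non-empty list.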
6.3 Approval
Approval must bind to the exact action, parameters, evidence and version. A generic “approved by supervisor” is too weak.
| Approval field | Requirement |
|---|---|
| approval_id | Stable identifier passed into execution command。 |
| approval_subject | Action type and business object, e.g. dispute refund proposal for DSP-2026-09291。 |
| approved_parameters_hash | Hash of amount, currency, recipient, reason, evidence refs and policy version。 |
| approver_identity | Human identity, role, tenant and assignment relationship。 |
| approval_scope | Single action, case-limited, amount-limited, expiry-limited。 |
| approval_expiry | Time after which execution requires renewed approval。 |
| human_edit_diff | Difference between agent draft and approved final content。 |
| policy_decision_id | Policy result that determined approval requirement。 |
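The binding between an approval and the exact parameters can be sketched as a hash comparison at execution time. This is a minimal sketch under assumptions: `Approval`, `parameters_hash` and `check_execution` are hypothetical names, the return codes are illustrative, and a real system would also verify approver identity and approval scope.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime


def parameters_hash(params: dict) -> str:
    # Canonical JSON keeps the hash stable regardless of key order.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Approval:
    approval_id: str
    approved_parameters_hash: str
    approval_expiry: datetime


def check_execution(approval: Approval, params: dict, now: datetime) -> str:
    """Decide whether an execution command may proceed under this approval."""
    if now >= approval.approval_expiry:
        # Expired approvals require renewed approval.
        return "approval_required"
    if parameters_hash(params) != approval.approved_parameters_hash:
        # Any change to amount, recipient, evidence or policy version breaks the binding.
        return "approval_mismatch"
    return "allowed"
```

Because the hash covers every approved parameter, an amount edited after approval produces a different hash and the command is blocked rather than silently executed.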
6.4 Audit
Audit evidence must connect the chain from user request to final side effect.
user request
-> identity and purpose
-> model call
-> context resources
-> tool proposal
-> schema validation
-> policy decision
-> approval packet
-> API command
-> event emitted
-> observation returned
-> final response
Minimum audit fields:
| Field | Why it matters |
|---|---|
| correlation_id | Links full business interaction。 |
| causation_id | Explains why a tool call or event happened。 |
| actor_id | Human, agent, service or workflow actor。 |
| agent_id | Agent identity and owner。 |
| model_id | Model version used for proposal or draft。 |
| prompt_version | Prompt or orchestration version。 |
| tool_name | Tool identity and version。 |
| input_hash | Sensitive input reference without overlogging raw PII。 |
| output_ref | Durable reference to observation or artifact。 |
| policy_decision_id | PDP result and version。 |
| approval_id | Human approval or dual control reference。 |
| trace_id | OpenTelemetry trace link。 |
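A gateway could enforce the minimum field list when it assembles an audit record. The sketch below is an assumption-laden illustration: `build_audit_record` is a hypothetical helper, it stores a SHA-256 hash of the input instead of the raw payload, and it assumes that calls without approval carry an explicit placeholder such as `not_required` rather than an empty `approval_id`.

```python
import hashlib
import json

REQUIRED_AUDIT_FIELDS = (
    "correlation_id", "causation_id", "actor_id", "agent_id",
    "model_id", "prompt_version", "tool_name", "input_hash",
    "output_ref", "policy_decision_id", "approval_id", "trace_id",
)


def build_audit_record(raw_input: dict, identifiers: dict) -> dict:
    """Assemble an audit record that references the input by hash only."""
    canonical = json.dumps(raw_input, sort_keys=True, separators=(",", ":"))
    record = dict(identifiers)
    # The raw input is never copied into the record, only its hash.
    record["input_hash"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    missing = [name for name in REQUIRED_AUDIT_FIELDS if not record.get(name)]
    if missing:
        raise ValueError("missing audit fields: " + ", ".join(missing))
    return record
```

Failing fast on a missing identifier keeps incomplete records out of the audit store, which matters most on denied calls where nothing else is written.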
6.5 OpenTelemetry for Agent Contract Observability
OpenTelemetry should trace across AI orchestration, tool gateway, API, event producer and event consumer.
Recommended spans:
| Span | Key attributes |
|---|---|
| agent.request | agent.id, session.id, user.role, tenant.id, purpose |
| model.generate | model.id, prompt.version, input.tokens, output.tokens, eval.route |
| tool.validation | tool.name, tool.version, schema.id, validation.result |
| policy.evaluate | policy.id, policy.version, decision, reason_code |
| approval.create | approval.type, risk.tier, case.id, expiry |
| api.command | http.route, operation.id, idempotency.key.hash, status |
| event.publish | event.type, event.id, schema.id, topic |
| event.consume | consumer.name, event.id, inbox.status, replay.id |
Metrics:
| Metric | Product / governance use |
|---|---|
| Tool call success / denial rate | Indicates friction, misuse, missing permissions or policy drift。 |
| Approval required / approved / rejected rate | Measures automation boundary quality and operational load。 |
| Schema validation failure rate | Detects model output drift or contract mismatch。 |
| Idempotency conflict count | Finds repeated proposals, retries and unsafe clients。 |
| DLQ count by category | Separates schema, entitlement, dependency, policy and poison failures。 |
| Consumer lag | Shows event-driven agent backlog and SLA risk。 |
| Contract version adoption | Tracks migration and stale consumers。 |
| Manual override rate | Signals model recommendation quality or policy mismatch。 |
6.6 Versioning
Versioning must cover more than APIs.
| Artifact | Version policy |
|---|---|
| OpenAPI | Semantic version per service contract; breaking changes require new major or new operation。 |
| AsyncAPI | Version the document, channel and message; a breaking change in event semantics requires a new event type。 |
| JSON Schema | Immutable schema ID; compatibility checked before release。 |
| Tool contract | Version when side effect, permission, approval, input, output or failure model changes。 |
| Prompt | Version prompt templates that influence tool selection, extraction or final wording。 |
| Eval set | Version golden sets and red-team cases linked to contract releases。 |
| Policy | Version PDP policy, decision table, reason mapping and approval rule。 |
| Workflow | Version task IO schema, state transition, timeout and compensation map。 |
7. Contract Testing, Mocking and Simulation
Contract-first only works when contracts are executable in tests.
7.1 Test Layers
| Layer | Test goal |
|---|---|
| Schema validation | Reject malformed tool arguments, extra fields, wrong enum, missing evidence。 |
| Provider contract test | API / event producer conforms to OpenAPI / AsyncAPI and JSON Schema。 |
| Consumer contract test | Agent consumer and downstream services tolerate compatible changes and reject breaking ones。 |
| Tool gateway test | Policy, approval, idempotency, rate limit, audit and observation normalization work。 |
| Model-output test | Model structured output conforms to schema under normal and adversarial prompts。 |
| Event replay test | Replay uses inbox dedupe, replay marker, fixed versions and no duplicate side effect。 |
| Failure-mode test | Validation failure, entitlement denial, approval required, conflict and dependency timeout each produce a distinct, structured response。 |
| End-to-end trace test | Correlation ID links model, tool, API, event, workflow and audit store。 |
7.2 Contract Test Cases
| Case | Expected result |
|---|---|
| Agent sends unknown field override_approval | Schema validation rejects before policy or API call。 |
| Amount string uses 128.4 instead of 128.40 | Validation failure with recoverability agent_can_retry。 |
| User lacks case entitlement | PDP returns deny, tool gateway emits denied audit event, no API call。 |
| Refund proposal repeats after timeout | Same idempotency key returns original proposal and audit ref。 |
| Same idempotency key with different amount | API returns conflict and no side effect。 |
| Event payload removes required case_id | Consumer contract test fails release。 |
| New optional event field added | Existing consumer test passes, field ignored or logged。 |
| Agent consumes replayed event | Inbox marks the event as duplicate or routes it to a replay-safe processing path。 |
| Tool output contains instruction text | Context composer labels output as observation, not instruction。 |
| Approval expires before execution | Command returns approval required, execution blocked。 |
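The first two cases in this table can be expressed as executable checks. The sketch below is a hand-rolled stand-in for a JSON Schema validator configured to reject additional properties and to require a two-decimal amount pattern; `validate_refund_proposal`, `gateway_decision` and the error codes are hypothetical, and the field list follows the refund proposal example used in this playbook.

```python
import re

ALLOWED_FIELDS = {
    "case_id", "amount", "currency", "reason_code",
    "evidence_refs", "agent_rationale",
}
# Decimal string with exactly two fraction digits, e.g. "128.40".
AMOUNT_PATTERN = re.compile(r"\d+\.\d{2}")


def validate_refund_proposal(args: dict) -> list:
    """Return a list of structured validation errors (empty when valid)."""
    errors = []
    for name in sorted(set(args) - ALLOWED_FIELDS):
        errors.append({"code": "unknown_field", "field": name})
    for name in sorted(ALLOWED_FIELDS - set(args)):
        errors.append({"code": "missing_field", "field": name})
    amount = args.get("amount")
    if amount is not None and not (
        isinstance(amount, str) and AMOUNT_PATTERN.fullmatch(amount)
    ):
        errors.append({"code": "invalid_amount_format", "field": "amount"})
    refs = args.get("evidence_refs")
    if refs is not None and (not isinstance(refs, list) or len(refs) == 0):
        errors.append({"code": "missing_evidence", "field": "evidence_refs"})
    return errors


def gateway_decision(args: dict) -> dict:
    """Validate before any policy or API call is attempted."""
    errors = validate_refund_proposal(args)
    if errors:
        return {
            "status": "validation_failed",
            "recoverability": "agent_can_retry",
            "errors": errors,
        }
    return {"status": "valid"}
```

Running these checks in the gateway means a malformed or over-reaching argument set never reaches the policy engine or the backing API.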
7.3 Mocking Strategy
| Mock type | Use |
|---|---|
| OpenAPI mock server | Let agent orchestration and tool gateway test request / response without source system dependency。 |
| AsyncAPI event simulator | Publish sample CloudEvents to test consumers, DLQ and replay。 |
| Policy decision stub | Test allow, deny, approval required and redaction paths。 |
| Approval service stub | Test approval binding, expiry and human edit diff。 |
| Error injector | Simulate timeout, 429, 409 conflict, poison event and schema mismatch。 |
| Audit collector mock | Verify required fields are emitted even on denied calls。 |
7.4 Simulation for Product and Risk
Simulation should answer product and governance questions, not just pass unit tests.
| Simulation question | Evidence |
|---|---|
| How many tool calls become approval-required after new policy? | Approval workload report by queue, role, region and risk tier。 |
| Does a schema change break agent extraction or event consumers? | Consumer compatibility report and model-output schema failure rate。 |
| Does new tool exposure increase denied calls or prompt injection risk? | Red-team run, denied call taxonomy, false allow count。 |
| Can replay regenerate summaries without duplicate writes? | Replay report with inbox, idempotency and output diff。 |
| Does customer impact remain within agreed boundary? | Case-level outcome comparison and complaint / SLA risk estimate。 |
7.5 Release Gate
| Gate | Required evidence | Pass standard |
|---|---|---|
| G1 Contract completeness | OpenAPI / AsyncAPI / JSON Schema / Tool Contract Card | Owner, schema, auth, side effect, audit and version complete。 |
| G2 Risk classification | Side effect tier and customer / regulatory impact | No high-risk tool without approval and recovery design。 |
| G3 Policy enforcement | PDP / PEP map and tests | High-risk paths cannot bypass PEP。 |
| G4 Contract tests | Provider and consumer contract test run | No critical failure, breaking change reviewed。 |
| G5 Eval regression | Model output and tool selection eval | No unauthorized tool call, no schema-breaking output in critical cases。 |
| G6 Simulation | Historical and synthetic scenario report | Decision delta, approval load and customer impact explained。 |
| G7 Observability | Trace, metrics, audit sample | Correlation links model, tool, API, event and approval。 |
| G8 Rollback | Disable tool, pause consumer, revert schema route | Tested in non-prod with named owner。 |
8. Financial Retail Case Studies
8.1 Payment Dispute Agent
Scenario
An internal agent assists analysts with payment dispute cases. It reads case details, retrieves transaction evidence, drafts a summary, creates an internal refund proposal and triggers human review. It does not directly post ledger entries.
Contract Map
| Capability | Contract | Risk tier | Controls |
|---|---|---|---|
| Get dispute case | OpenAPI query | S1 | Case entitlement, field masking, freshness timestamp。 |
| Search related transactions | OpenAPI query | S1 | Customer relationship check, pagination, PCI redaction。 |
| Draft evidence summary | JSON Schema output | S2 | Citations, source refs, confidence, human review。 |
| Create refund proposal | OpenAPI command | S3 | Idempotency, approval policy, audit event。 |
| Refund approved event | AsyncAPI + CloudEvents | S4 result | Ledger system is producer, agent only observes。 |
| Send customer notice | OpenAPI command via workflow | S4 | Human-approved content, template, dual control by threshold。 |
Key Design
Agent creates a proposal.
Workflow and policy decide approval.
Ledger service executes.
Domain event confirms.
Audit chain preserves evidence.
Failure Modes
| Failure | Contract-first response |
|---|---|
| Agent proposes refund without evidence | Request schema requires evidence refs; policy denies missing evidence。 |
| Duplicate proposal after timeout | Idempotency returns original proposal。 |
| Analyst changes amount after approval | Execution command rejects because approved parameter hash changed。 |
| Event replay triggers duplicate task | Inbox dedupe prevents duplicate workflow task。 |
| Customer notice schema changes | Consumer contract test catches missing approved wording field。 |
8.2 KYC Document Review Agent
Scenario
An agent assists KYC analysts by reading document metadata, extracting fields, comparing jurisdiction / product policy and drafting gaps. It does not approve or reject customers.
Contract Map
| Capability | Contract | Risk tier | Controls |
|---|---|---|---|
| Retrieve document metadata | OpenAPI query | S1 | PII restriction, case entitlement。 |
| Extract document facts | Structured model output schema | S2 | Confidence, field provenance, low-confidence review。 |
| Evaluate document sufficiency | Decision API schema | S3 decision support | Policy version, rule id, reason code。 |
| KYC case status changed | AsyncAPI + CloudEvents | S5 domain fact | Produced by workflow, not by agent。 |
| Create reviewer task | OpenAPI command | S3 | Idempotency key by case + missing doc + policy version。 |
Key Design
The agent output is a gap analysis artifact, not a KYC decision. Final state transition remains in KYC workflow and named human / policy owner path.
8.3 AML Evidence Summarization Agent
Scenario
An AML operations team uses an agent to gather transaction patterns, customer profile references and prior alerts for investigator review. The agent drafts evidence packs and narrative candidates.
Contract Map
| Capability | Contract | Risk tier | Controls |
|---|---|---|---|
| Subscribe to aml.alert.evidence_requested.v1 | AsyncAPI | S2 trigger | Consumer contract, DLQ, replay approval。 |
| Query account activity | OpenAPI query | S1 restricted | AML purpose, data minimization, trace。 |
| Generate narrative draft | JSON Schema output | S2 | Distinguish facts, inference, uncertainty, missing evidence。 |
| Close alert | Not exposed as agent tool | S5 | Human investigator and AML system only。 |
| SAR-related artifact | Workflow task contract | S5 support | Human review, legal / compliance policy path。 |
Key Design
AML agent can assemble and explain evidence.
It cannot make the final suspicious activity determination or submit regulatory material.
8.4 Customer Service Action Agent
Scenario
A service agent helps representatives respond to customer inquiries, find policy, draft replies, create case notes and propose next actions.
Contract Map
| Capability | Contract | Risk tier | Controls |
|---|---|---|---|
| Search policy | OpenAPI / resource schema | S1 | Effective date, jurisdiction, product filter。 |
| Draft response | JSON Schema output | S2 | Policy citations, prohibited promise check。 |
| Update case note | OpenAPI command | S3 | Idempotency, audit, no customer-visible effect。 |
| Send customer message | OpenAPI command + approval | S4 | Human approval, approved template, delivery receipt event。 |
| Escalate complaint | Workflow task | S4 | Complaint policy, SLA, audit, queue owner。 |
Key Design
Customer-visible commitment is not just language generation. It is a controlled action with approved wording, policy basis, send receipt, audit and recovery path.
9. Templates and Copyable Artifacts
The following artifacts use concrete example values so they can be adapted without relying on blank fields.
9.1 Tool Contract Card
# Tool Contract Card: payment_dispute.create_refund_proposal
## Identity
- Tool name: payment_dispute.create_refund_proposal
- Tool version: 2026.06.1
- Backing API operationId: paymentDisputeCreateRefundProposal
- Business owner: Payment Dispute Operations
- Technical owner: Payment Dispute Platform
- Risk owner: Retail Banking Operational Risk
- Source system: Payment Dispute Case Platform
## Business Capability
- Creates an internal refund proposal for a dispute case.
- Does not post ledger entries.
- Does not send customer communication.
- Supports dispute investigation and complaint resolution workflows.
## Allowed Intents
- dispute_investigation
- complaint_resolution
## Disallowed Intents
- marketing_retention
- discretionary_goodwill_refund_without_case
- customer_negotiation_pressure
## Input Contract
- Schema ID: https://schemas.example.com/payment/refund-proposal-request/1.0.0
- Required fields: case_id, amount, currency, reason_code, evidence_refs, agent_rationale
- Additional fields: rejected
- Money representation: decimal string plus ISO currency
- Evidence rule: at least one source evidence reference
## Output Contract
- Schema ID: https://schemas.example.com/payment/refund-proposal-observation/1.0.0
- Status values: created, returned_existing, policy_denied, approval_required
- Required provenance: source_system, proposal_id, audit_ref, policy_decision_id
## Risk and Permission
- Side effect tier: S3_INTERNAL_WRITE
- Customer visible: false
- Financial movement: false
- Required user relationship: assigned analyst or supervisor on the dispute case
- Required purpose: dispute_investigation or complaint_resolution
- Tenant boundary: retail-us
- Human approval: required before any ledger execution
- Dual control: required when amount is 500.00 USD or above
## Idempotency
- Required: yes
- Key semantics: case_id + action_type + proposal_hash
- Same key and same payload: return original proposal
- Same key and different payload: reject with idempotency_conflict
- Retention: aligned to dispute case retention and audit policy
## Failure Model
- validation_failed: agent may correct the input to comply with the schema and retry
- entitlement_denied: do not retry without permission change
- policy_denied: route to human with reason
- approval_required: create approval packet
- case_state_conflict: refresh case state
- dependency_timeout: safe retry with same idempotency key
## Audit
- Event type: com.momofinance.agent.tool.executed.v1
- Required identifiers: correlation_id, causation_id, actor_id, agent_id, tool_version, model_id, prompt_version
- Required references: case_id, input_hash, output_ref, policy_decision_id, approval_packet_ref
- Sensitive fields: amount logged, raw customer PII not logged
## Observability
- Trace span: api.command
- Metrics: success_count, denial_count, approval_required_count, idempotency_conflict_count, p95_latency
- Alert: conflict spike or policy_denied spike above agreed threshold
## Versioning
- Breaking input schema change requires new tool major version.
- Approval policy change requires tool contract review.
- Side effect tier change requires architecture and risk sign-off.
9.2 Event Contract
# Event Contract: com.momofinance.payment.dispute.opened.v1
## Event Identity
- Event type: com.momofinance.payment.dispute.opened.v1
- Channel: payments.disputes.events.v1
- Producer: Payment Dispute Case Platform
- Business meaning: A payment dispute case has been opened and assigned an authoritative case id.
- Event tense: past-tense domain fact
- CloudEvents content type: application/cloudevents+json
## Envelope
- specversion: 1.0
- id: globally unique event id
- source: urn:service:payment-dispute
- subject: dispute/{case_id}
- time: producer event creation time
- dataschema: https://schemas.example.com/payment/dispute-opened/1.0.0
- correlationid: original customer or analyst interaction correlation id
- causationid: command or workflow step that opened the dispute
- traceparent: OpenTelemetry trace context
## Data Schema
- Required fields: case_id, customer_ref, amount, currency, reason_code, opened_channel
- PII rule: customer_ref is a reference, not raw PII
- Money rule: amount is decimal string, currency is enum
- Reason code rule: reason_code uses approved dispute taxonomy
## Consumer Contract
- AI evidence consumer: dispute-evidence-agent-v1
- Consumer action: generate internal evidence summary draft
- Consumer side effect: internal artifact only
- Inbox key: event id + consumer name
- Retry: dependency timeout only
- DLQ owner: AI Integration Operations
- Replay: approval required, replay marker required, idempotency validated
## Compatibility
- Adding optional data fields is compatible.
- Removing required fields is breaking.
- Changing reason_code meaning is breaking and requires new event type or major version.
- Moving raw PII into payload is prohibited without privacy and architecture review.
## Audit
- Event publish audit links producer command, actor, policy decision and trace id.
- Agent consumer audit links input event id, model version, prompt version, output artifact and schema version.
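The inbox key in the consumer contract above (event id + consumer name) can be sketched as follows. This is a simplified illustration: `Inbox` and `handle_event` are hypothetical names, the set stands in for a durable inbox table, and a real consumer would commit the inbox entry and its side effect in one transaction so that a failed run can be retried.

```python
class Inbox:
    """In-memory stand-in for a durable consumer inbox (assumption)."""

    def __init__(self):
        self._seen = set()

    def claim(self, event_id: str, consumer_name: str) -> bool:
        """Return True the first time this consumer sees this event id."""
        key = (event_id, consumer_name)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


def handle_event(event: dict, inbox: Inbox, consumer_name: str, process) -> str:
    # The CloudEvents `id` attribute is the dedupe key, scoped per consumer.
    if not inbox.claim(event["id"], consumer_name):
        return "duplicate_skipped"
    process(event)
    return "processed"
```

Scoping the key by consumer name lets several consumers process the same event independently while each one still ignores redeliveries and replays.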
9.3 Schema Review Checklist
| Review area | Pass standard |
|---|---|
| Business meaning | Every field has a domain definition and source of authority。 |
| Data classification | PII, PCI, confidential, restricted and regulated fields are labeled。 |
| Minimality | Payload excludes fields not needed by declared consumers or tool purpose。 |
| Validation | Required fields, enum values, numeric bounds, string lengths and formats are explicit。 |
| Money | Amount uses decimal string and currency enum。 |
| Time | Timestamps use ISO date-time and define event time vs processing time。 |
| Free text | Free text fields have max length, trust label and prompt injection handling。 |
| Provenance | Output schema includes source refs, effective time and audit ref。 |
| Error model | Failures are structured by validation, entitlement, policy, conflict, dependency and unknown。 |
| Compatibility | Additive, breaking and prohibited changes are documented。 |
| Eval reuse | Schema can be used for golden outputs and regression assertions。 |
| Audit reuse | Schema includes identifiers needed to reconstruct decision chain。 |
9.4 Compatibility Policy
# Contract Compatibility Policy: Agent Tool and Event Contracts
## Scope
This policy applies to OpenAPI operations exposed as agent tools, AsyncAPI event messages, JSON Schemas for model output, tool observations and audit events.
## Compatible Changes
- Add optional field with safe default behavior.
- Add enum value only when all consumers treat unknown values safely.
- Add response example without changing schema.
- Add non-required CloudEvents extension attribute.
- Improve description without changing semantics.
## Breaking Changes
- Remove required field.
- Rename field.
- Change field type, format, unit or money representation.
- Change enum meaning.
- Add required field.
- Change event business meaning while keeping the same event type.
- Change tool side effect tier.
- Change approval requirement in a way that reduces control strength.
- Change error code semantics.
## Prohibited Without Formal Review
- Add raw PII, PCI or regulated narrative to event payload where only references were approved.
- Expose S4 or S5 tool directly to autonomous agent execution.
- Remove audit identifiers from command, event or observation.
- Allow additional properties on high-risk tool input without schema review.
## Release Requirements
- Provider contract tests pass.
- Consumer contract tests pass for registered consumers.
- Model structured output eval passes for affected prompts.
- Policy and approval regression tests pass for affected tools.
- Simulation report explains decision delta, approval load and customer impact.
- Architecture, risk and business owners approve breaking or high-risk changes.
## Deprecation
- Deprecated versions remain observable in usage dashboard.
- Migration owner and consumer list are recorded.
- High-risk consumers receive migration tests before shutdown.
- Runtime rejects calls to retired S4 and S5 contracts after agreed sunset date.
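Part of a compatibility policy like the one above can be checked mechanically in a release pipeline. The sketch below compares two simplified schema descriptions (a `required` list and a `properties` map of field name to type); `classify_change` and `is_breaking` are hypothetical helpers, and treating every field removal as breaking is a conservative assumption, since a rename appears as a removal plus an addition.

```python
def classify_change(old: dict, new: dict) -> list:
    """Return (severity, message) findings for a schema change."""
    findings = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    old_required = set(old.get("required", []))
    new_required = set(new.get("required", []))

    for name in sorted(old_props.keys() - new_props.keys()):
        findings.append(("breaking", f"field removed: {name}"))
    for name in sorted(new_props.keys() - old_props.keys()):
        if name in new_required:
            findings.append(("breaking", f"required field added: {name}"))
        else:
            findings.append(("compatible", f"optional field added: {name}"))
    for name in sorted(old_props.keys() & new_props.keys()):
        if old_props[name] != new_props[name]:
            findings.append(("breaking", f"field type changed: {name}"))
        elif name in new_required and name not in old_required:
            findings.append(("breaking", f"field became required: {name}"))
    return findings


def is_breaking(findings: list) -> bool:
    return any(severity == "breaking" for severity, _ in findings)
```

Semantic changes, such as a new meaning for an enum value or a different side effect tier, cannot be detected from field shapes alone and still need human review.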
9.5 Portfolio Evidence Pack
# Portfolio Evidence Pack: Contract-First Agent Integration
## Case Title
Governed Payment Dispute Agent with Contract-First Tool and Event Design
## Business Problem
Dispute analysts need faster evidence gathering and proposal drafting, while refunds, customer notices and complaint handling require strong control, approval and audit.
## Architecture Decision
Use OpenAPI for synchronous case queries and refund proposal commands, AsyncAPI + CloudEvents for dispute domain events, JSON Schema for tool IO and model output, and OpenTelemetry for cross-system traceability.
## Core Artifacts
- Tool Contract Card for payment_dispute.create_refund_proposal
- OpenAPI excerpt for refund proposal command
- AsyncAPI excerpt for payment.dispute.opened event
- JSON Schema for action proposal and tool observation
- Compatibility policy for schema and tool changes
- Contract test matrix for validation, entitlement, idempotency, approval and replay
- Audit evidence chain linking user, model, tool, policy, approval, command and event
## Risk Controls
- No direct ledger posting by agent
- Human approval before customer-visible or financial execution
- Idempotency key for every write command
- Inbox dedupe for event-driven agent consumer
- Policy decision recorded for every allowed, denied and approval-required tool call
- Observability spans across model, gateway, API, event and consumer
## Business Impact Narrative
The design reduces analyst preparation time while keeping final financial and customer-visible actions under policy, workflow and human control. It also gives architecture, risk and audit teams contract evidence rather than relying on prompt behavior.
## Interview Proof Points
- Can explain why prompt guardrails are not a control boundary
- Can separate query, command, event, decision and workflow task
- Can design OpenAPI / AsyncAPI contracts for agent integration
- Can show idempotency, approval and audit design for financial retail use cases
- Can connect contract testing and eval to release governance
10. Operating Model
10.1 Contract Ownership
| Artifact | Business owner | Technical owner | Risk / governance owner |
|---|---|---|---|
| Tool Contract Card | Capability owner | Tool gateway / service owner | Operational risk / security |
| OpenAPI operation | Product domain owner | API platform / service team | Architecture review board |
| AsyncAPI event | Domain event owner | Event platform / producer team | Data governance / architecture |
| JSON Schema | Schema owner | Platform engineering | Data governance |
| Policy rule | Business / risk owner | Policy platform | Compliance / model risk |
| Approval workflow | Operations owner | Workflow platform | Risk / internal controls |
| Eval set | Product owner | EvalOps | Model risk / quality |
| Audit event | Control owner | Observability / audit platform | Internal audit / compliance |
10.2 Intake Workflow
1. Register capability and business outcome.
2. Classify capability as resource, query, command, event, decision, workflow task or tool.
3. Assign side effect tier and data classification.
4. Draft OpenAPI, AsyncAPI and JSON Schema artifacts as applicable.
5. Write Tool Contract Card with owner, permissions, approval, audit and version policy.
6. Map PDP / PEP enforcement points.
7. Define idempotency, replay, DLQ and recovery behavior.
8. Build mock and contract tests.
9. Run model-output eval and prompt injection tests.
10. Simulate customer impact, approval load and failure modes.
11. Approve release with architecture, risk and business owners.
12. Monitor adoption, denial, approval, schema failure, DLQ and incidents.
10.3 RACI
| Activity | PM | BA / CBAP | Solution Architect | Engineering | Security | Risk / Compliance | Operations |
|---|---|---|---|---|---|---|---|
| Capability taxonomy | A | R | R | C | C | C | C |
| Side effect tier | A | R | R | C | C | A/R | C |
| OpenAPI / AsyncAPI design | C | R | A/R | R | C | C | I |
| JSON Schema review | C | R | A/R | R | C | C | I |
| Policy and approval mapping | A | R | R | C | R | A/R | C |
| Contract tests | C | C | R | A/R | C | C | I |
| Simulation | A | R | R | R | C | A/R | R |
| Release approval | A | C | A/R | R | A/R | A/R | C |
| Audit evidence | C | R | R | R | C | A/R | R |
| Incident response | A | C | A/R | R | A/R | A/R | R |
10.4 Metrics
| Metric | Why it matters |
|---|---|
| Approved tools by side effect tier | Shows exposure risk and platform maturity。 |
| Tool reuse rate | Measures whether contract-first design creates reusable capabilities。 |
| Schema validation failure rate | Detects model drift, prompt drift or contract mismatch。 |
| Policy denied tool calls | Reveals misuse, unclear UX or missing entitlement。 |
| Approval queue load | Measures operational impact of control design。 |
| Idempotency conflict count | Finds unsafe retries, duplicate proposals or orchestration bugs。 |
| Event DLQ by category | Separates schema, entitlement, dependency, policy and poison message problems。 |
| Consumer lag for agent workers | Shows event-driven backlog and customer SLA risk。 |
| Contract version adoption | Tracks migration and stale consumers。 |
| Audit replay success rate | Proves evidence chain works under investigation。 |
11. Interview Expression
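Q1: Why should AI agent tool design be contract-first?
30-second version:
An agent calling a tool is not a simple function call; it touches enterprise systems, customer data, money movement and compliance processes. A prompt can guide the model, but it cannot guarantee schema, permission, idempotency, approval, audit or compatibility. Contract-first uses OpenAPI, AsyncAPI, JSON Schema, CloudEvents and tool contracts to turn those boundaries into verifiable, testable and auditable system assets.
2-minute version:
I first split agent capabilities into query, command, event, decision and workflow task. Synchronous queries and commands are described with OpenAPI, asynchronous facts and event streams with AsyncAPI + CloudEvents, all inputs and outputs are constrained by JSON Schema, and traces are linked with OpenTelemetry. Every tool then gets a contract card: owner, side effect, allowed intent, permission, approval, idempotency, failure model, audit and versioning. The model can therefore only propose a tool call; actual execution is controlled by the gateway, policy engine, approval workflow and source system.
Q2: How do OpenAPI and AsyncAPI divide responsibilities in an agent architecture?
OpenAPI fits synchronous queries and commands: the agent needs to read a case immediately, search transactions, create a proposal or call a decision service. AsyncAPI fits event-driven integration: a business fact has already happened, multiple consumers including the agent respond independently, and replay, DLQ and consumer contracts are required. Put simply, a command asks for something to be done, and an event states that something has happened. An agent should not chain synchronous API calls to imitate an enterprise process; multi-step work and human approval belong in a workflow, and state changes are announced through events.
Q3: What is the value of JSON Schema for AI products?
JSON Schema is the control plane for agent IO. It constrains tool arguments, observations, structured model output, event payloads, decision input and audit events. Without a schema, agent output is only a textual convention, which makes contract tests, eval, replay and audit hard to perform. In financial retail I pay particular attention to additionalProperties, enum, money representation, data classification, provenance, free text max length and the structured error model.
Q4: How do you design a high-risk write tool?
I do not expose a high-risk write tool directly to the model. The first step is to split it into proposal, approval and execution. The agent can create a proposal or draft, the approval workflow binds the exact parameters, evidence, policy version and approver, and the execution command must carry an approval_id and an idempotency key. The API returns a structured observation and also produces an audit event and a domain event. For funds, customer rights, KYC/AML state or outbound customer communication, human review or dual control is required by default.
Q5: How do you handle replay and DLQ for an agent event consumer?
An agent consumer must be governed like any production-grade consumer. Each CloudEvent is deduplicated in the inbox by event id + consumer name, and the DLQ is categorized by schema, entitlement, dependency, policy and poison. Replay requires approval, a replay marker and fixed schema/model/prompt/tool versions, plus confirmation that it will not trigger duplicate writes. High-risk replay also records the replay approval and a before/after diff as audit evidence.
Q6: How do you prevent contract changes from breaking the agent?
I turn the compatibility policy into a release gate. Adding an optional field is usually compatible; removing a required field, changing a type, changing enum semantics, adding a required field, changing the side effect tier or weakening approval control are all breaking or high-risk changes. Before release I run provider contract tests, consumer contract tests, model-output eval, policy regression and simulation. A change in event semantics is more than a field change and usually requires a new event type or major version.
Q7: How do you design the audit chain?
Audit design starts from the contract. Every tool call records correlation_id, causation_id, actor_id, agent_id, model_id, prompt_version, tool_version, input_hash, policy_decision_id, approval_id, output_ref and trace_id. OpenTelemetry links the model call, tool validation, policy evaluation, API command, event publish and event consume. This makes it possible to trace back from customer impact to the model recommendation, evidence source, human approval and final system action.
Q8: As a PM/BA/Architect, how do you turn this into a portfolio piece?
I pick a financial retail scenario, for example a payment dispute agent. The portfolio does not only show a chat interface; it shows the Tool Contract Card, OpenAPI command, AsyncAPI event, JSON Schema observation, compatibility policy, contract test matrix, approval design, idempotency design and audit trace. This demonstrates that I understand an AI product as enterprise architecture, solution architecture and governance assets rather than a prompt demo.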
12. Final Checklist
| Area | Advanced check |
|---|---|
| Capability taxonomy | Every agent capability is classified as resource, query, command, event, decision, workflow task or tool。 |
| Contract artifacts | OpenAPI, AsyncAPI, JSON Schema and Tool Contract Card exist for exposed capabilities。 |
| Prompt boundary | Prompt does not own authorization, side effect control, approval, audit or compatibility。 |
| Schema rigor | Tool input, model output, observation, event payload and audit event are schema validated。 |
| Side effects | Every tool has a side effect tier and enforcement path。 |
| Approval | High-risk actions bind approval to exact parameters, evidence and policy version。 |
| Idempotency | Every write command and event consumer has dedupe behavior。 |
| Event governance | Events use CloudEvents envelope, AsyncAPI contract, consumer contract, DLQ and replay policy。 |
| Observability | OpenTelemetry trace connects model, tool, API, event, workflow and audit。 |
| Testing | Provider, consumer, model-output, policy, replay and failure-mode tests are part of release。 |
| Versioning | API, event, schema, tool, prompt, policy, workflow and eval set versions are governed together。 |
| Portfolio evidence | The case study shows product value, architecture quality and governance evidence。 |
Key takeaway:
Production AI agents do not become safe because prompts ask them to behave.
They become governable when every tool, API, event and output is defined as a contract,
enforced by policy, tested before release, observable at runtime and auditable after impact.