
AI Contract-First Tool/API Design / OpenAPI / AsyncAPI Playbook

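Audience: AI Product Architect, AI Platform PM, Enterprise Architect, Solution Architect, Integration Architect, senior BA, and owners of AI tools and platforms in financial retail.

Core question: When an AI agent calls tools, APIs, event streams, workflow tasks and external systems, how do you use a contract-first approach to define boundaries, permissions, side effects, output structure, eval, audit and version governance, instead of relying on conventions written into prompt text?

Learning goal: Be able to combine OpenAPI, AsyncAPI, JSON Schema, CloudEvents and OpenTelemetry into a governance system for AI agent tool contracts, covering tool boundary, structured IO, event-driven integration, approval, idempotency, audit evidence, compatibility policy, contract testing and portfolio presentation for financial retail.

Important note: This document is architecture learning, product design and portfolio material. It is not legal advice, a compliance conclusion, an audit opinion or a production approval. For a formal financial retail project, the Business Owner, Architecture, Engineering, Security, Privacy, Legal, Compliance, Model Risk, Operational Risk, Internal Audit and the system owners must jointly confirm the applicable boundaries, customer impact, regulatory obligations and release gates.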


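Purpose, Audience and Core Ideas

Purpose

This playbook addresses a hard problem in enterprise AI delivery:

What an agent may read, write, send, which events it may trigger, and which customer entitlements it may affect:
these boundaries must be contracts that are machine-verifiable, testable, auditable and versionable,
not informal conventions hidden in prompts, meeting minutes, wikis or connector code.

The goal is not to teach OpenAPI or AsyncAPI basics. It is to practice applying these standards to AI agent tool boundaries, event-driven integration, structured output, permissions, approval, eval and audit.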

Intended Audience

| Role | What this playbook helps you produce |
| --- | --- |
| AI Platform PM | Tool catalog, contract intake, release gate, adoption metric, tool reuse strategy. |
| Enterprise Architect | Agent integration principles, API/event taxonomy, governance model, cross-domain contract standard. |
| Solution Architect | OpenAPI / AsyncAPI / JSON Schema / CloudEvents implementation design, gateway, observability, versioning. |
| Advanced BA / CBAP | Turn business capabilities, decision boundaries, information needs, exception paths and audit evidence into contract assets. |
| Risk / Compliance Technology Lead | Approval rule, policy evidence, audit trail, replay scope, change control and evidence for regulatory inquiries. |

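Core Ideas

| Idea | Explanation |
| --- | --- |
| Prompt is not a contract | A prompt can describe behavioral preferences, but it cannot carry interface validation, permission enforcement, side effect control, compatibility governance or audit evidence. |
| Tool is a governed capability | An agent tool is not a function wrapper. It is a business capability with an owner, schema, policy, side effect, audit and lifecycle. |
| API is for commands and queries | OpenAPI suits synchronous queries and commands, especially capabilities that need an immediate result, explicit failure and a verifiable error model. |
| Event is for facts and decoupling | AsyncAPI + CloudEvents suit facts that have already happened, multiple consumers, asynchronous processing, replay and event-driven agents. |
| JSON Schema is the IO control surface | Both input and output should be schematized so that model output, tool arguments, observations, evals and contract tests share one structural language. |
| Audit starts at contract design | Audit is not a matter of adding logs after launch. The contract itself defines correlation, causation, actor, policy decision, version and evidence refs. |
| Compatibility is a product policy | Contract changes affect agent behavior, eval sets, workflows, downstream consumers and operational risk, so a clean technical build is not enough. |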

Source Anchors

The following official anchors serve as the terminology and design boundaries. This playbook translates them into the language of enterprise AI tools, APIs, events and financial retail governance.

| Source | Official link | How this playbook uses it |
| --- | --- | --- |
| OpenAPI Specification | https://spec.openapis.org/oas/latest.html | Describe synchronous HTTP APIs, command / query operations, request / response schema, auth, error, versioning and tool mapping. |
| AsyncAPI Specification | https://www.asyncapi.com/docs/reference/specification/latest | Describe event channels, messages, operations, server bindings, producer / consumer ownership and asynchronous contracts. |
| JSON Schema | https://json-schema.org/ | Unify tool input, tool output, event payload, structured model output, eval fixtures and schema compatibility checks. |
| CloudEvents | https://cloudevents.io/ | Standardize the event envelope, including id, source, type, subject, time, dataschema and extension attributes. |
| OpenTelemetry Documentation | https://opentelemetry.io/docs/ | Design trace, span, metric, log, baggage and cross-system observability, linking model call, tool call, API call, event and audit. |

One-Sentence Positioning

Contract-first AI tool design means every agent capability is specified before execution:
what it accepts, what it returns, who may use it, what side effects it can create,
which approvals it needs, how it fails, how it emits events, how it is tested,
how it is observed, and how it changes without breaking consumers.

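In short:

Contract-first AI tool design defines every external capability of an agent as a verifiable, authorizable, auditable, testable and evolvable contract first, and only then hands it to the model for orchestration and execution.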

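1. Why AI Tools / APIs Cannot Rely on Prompt-Text Conventions

A common early practice in AI agent projects looks like this:

System prompt: Call the refund tool only when necessary.
Tool description: refund_customer can issue a refund to a customer.
Developer note: Amounts over 100 USD need a supervisor.
Wiki: Create a ticket when a refund fails.

Conventions like these work for a demo but are not enough for a financial retail production system. The reason is not that prompts are useless, but that a prompt lacks the following engineering and governance capabilities.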

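| Prompt-only convention | Missing capability | Contract-first design |
| --- | --- | --- |
| "Call only when authorized" | The model does not know the real user, tenant, case relationship, purpose or approval state | The tool gateway calls the PDP / PEP and checks user, resource, purpose, risk tier and policy version. |
| "Output JSON" | The model may drop fields, alter enums, add explanatory text, or be unable to express error states | JSON Schema + structured output validation + repair / reject policy. |
| "Do not refund twice" | Retries, timeouts, concurrency and event replay can duplicate side effects | The command API uses an idempotency key, request hash and replay result. |
| "High-risk actions need approval" | The model may misjudge risk, split an action or bypass approval | The contract declares side effect tier, approval subject, approval scope, expiry and binding. |
| "Trigger the agent on events" | Event semantics, schema, ordering, retry, DLQ and replay are undefined | AsyncAPI + CloudEvents + inbox dedupe + consumer contract. |
| "Write logs" | Log fields are unstable and cannot link model, tool, API, event and approval | OpenTelemetry trace + audit event schema + correlation / causation ID. |
| "Stay compatible later" | Field changes can break agent prompts, evals, consumers and dashboards | Compatibility policy + schema registry + consumer contract tests. |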

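Key takeaway:

A prompt can direct how the model reasons and expresses itself.
The contract determines which inputs, outputs, side effects, permissions, events, approvals, audit records and version changes the system allows.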

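1.1 What a Prompt Can Do

| Capability | Reasonable use |
| --- | --- |
| Task framing | State the current business task, e.g. summarize case, draft response, propose next action. |
| Reasoning style | Require the model to list evidence, assumption, confidence, exception. |
| Tool selection hint | Tell the model to prefer read-only tools or to dry-run first. |
| Output formatting | Guide the model to produce schema-conformant fields and explanations. |
| Conversation UX | Control tone, clarifying questions and user-facing wording. |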

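1.2 What a Prompt Should Not Be Responsible For

| Should not own | Reason |
| --- | --- |
| Authorization | Permissions must be enforced through identity, resource relationship, purpose, policy and the PEP. |
| Financial side effect control | Money movements need idempotency, approval, ledger consistency and audit. |
| Regulatory decision boundary | High-risk boundaries such as KYC, AML, credit, complaint and dispute need external rules and an accountable human. |
| Schema compatibility | Compatibility is a producer / consumer contract, not a text generation style. |
| Replay and recovery | Replay needs event ID, causation chain, inbox state, version freeze and compensation rules. |
| Audit evidence | Audit needs immutable evidence, versions, policy decision, approver and trace. |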

2. Contract-First Taxonomy for AI Agents

Contract-first is not just writing OpenAPI files. It is a taxonomy for modeling AI capabilities in layers.

2.1 Capability Types

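| Type | Semantics | Recommended contract | AI role | High-risk signals |
| --- | --- | --- | --- | --- |
| Resource | Readable evidence or context object | URI + metadata + JSON Schema | Retrieve, cite, summarize | PII, PCI, policy doc, external untrusted content. |
| Query API | Read current system state | OpenAPI GET / safe POST search | Fetch facts, populate evidence | Overly broad fields, missing entitlement, unclear freshness. |
| Command API | Ask the system to perform an action | OpenAPI command operation | Generate an action proposal or perform controlled execution | Writes, funds, customer entitlements, state changes. |
| Event | Notification of a fact that has already happened | AsyncAPI message + CloudEvents | Trigger agent, enrichment, workflow | Multiple consumers, replay, ordering, schema break. |
| Workflow task | Controlled step in a long-running process | Workflow contract + task IO schema | Evidence collation, drafts, recommendations | State transition, human review, SLA, compensation. |
| Decision API | Rule or policy judgment | Decision contract + JSON Schema | Provide facts, receive decision | Eligibility, KYC, AML, credit, advice boundary. |
| Tool | Capability an agent can call | Tool Contract Card + OpenAPI / JSON Schema | Invoke an external capability | Side effect, approval, injection, rate abuse. |
| Observation | Result a tool returns to the agent | JSON Schema + trust label | Continue reasoning and generation | Tool output mistaken for system instructions. |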

2.2 Query, Command, Event, Decision

Splitting capabilities into the four CQED categories (Command, Query, Event, Decision) keeps a single do_everything tool from carrying all the risk.

| Category | Naming tendency | Typical examples | Contract priority |
| --- | --- | --- | --- |
| Query | getCaseSummary, searchTransactions | Look up a dispute case, KYC profile, policy snippet | Entitlement, field minimization, freshness, pagination, source reference. |
| Command | createRefundProposal, sendCustomerNotice | Create a refund proposal, update case status, send a customer message | Idempotency, approval, validation, structured error, audit. |
| Event | payment.dispute.opened.v1 | Dispute opened, KYC status changed, AML evidence requested | CloudEvents metadata, schema, producer, consumer, replay, DLQ. |
| Decision | evaluateRefundEligibility | Determine eligibility, policy route, advice boundary, tool authorization | Rule id, policy version, input facts hash, reason, explainability. |

2.3 Side Effect Tiers

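Classify every agent capability by side effect tier, and write the tier into both the tool contract and the release gate.

| Tier | Description | Financial retail example | Default controls |
| --- | --- | --- | --- |
| S0 Read public | Read public or non-sensitive material | Public FAQ on product fees and rates | logging, source citation. |
| S1 Read restricted | Read restricted internal or customer data | Query case summary, KYC document summary | entitlement, data minimization, audit. |
| S2 Analyze / draft | Produce internal drafts or recommendations without writing to the source of truth | dispute memo draft, AML narrative draft | output schema, citation, human review sampling. |
| S3 Internal write | Write to internal systems without sending anything externally or moving funds | Create task, update case note | approval by policy, idempotency, audit event. |
| S4 Customer-visible / financial | Send customer messages, issue refunds, adjust entitlements | send notice, create provisional credit | human approval, dual control by threshold, compensation, full trace. |
| S5 Regulated final decision | Final conclusions that affect KYC / AML / credit / complaint / legal record | close AML alert, reject KYC, adverse action notice | agent assist only, deterministic decision service, named human owner. |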
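
The tier table can be turned into a small, model-independent gate. The sketch below is illustrative only: the enum member names and the four mode strings returned by `execution_mode` are assumptions made for this example, not part of any standard, and a real platform would load the mapping from policy configuration.

```python
from enum import IntEnum


class SideEffectTier(IntEnum):
    """Tiers S0..S5 from the table above, ordered by increasing risk."""
    S0_READ_PUBLIC = 0
    S1_READ_RESTRICTED = 1
    S2_ANALYZE_DRAFT = 2
    S3_INTERNAL_WRITE = 3
    S4_CUSTOMER_VISIBLE_FINANCIAL = 4
    S5_REGULATED_FINAL_DECISION = 5


def execution_mode(tier: SideEffectTier) -> str:
    """Return how a gateway might treat an agent-proposed call at this tier.

    The mode names are hypothetical labels chosen for this sketch.
    """
    if tier >= SideEffectTier.S5_REGULATED_FINAL_DECISION:
        return "agent_assist_only"          # the agent never executes the final decision
    if tier >= SideEffectTier.S4_CUSTOMER_VISIBLE_FINANCIAL:
        return "human_approval_required"    # approval happens outside the model
    if tier >= SideEffectTier.S3_INTERNAL_WRITE:
        return "policy_approval_check"      # approval by policy, plus idempotency and audit
    return "execute_with_audit"             # read and draft tiers still leave an audit trail


print(execution_mode(SideEffectTier.S3_INTERNAL_WRITE))  # policy_approval_check
```

Because the gate depends only on the declared tier, it behaves the same no matter how the model phrased or justified the call.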

2.4 Trust Boundaries

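| Boundary | Design question | Contract artifact |
| --- | --- | --- |
| User to agent | Does user input contain prompt injection, PII, coercion or requests beyond the user's authority? | Input classification schema, intent taxonomy. |
| Agent to tool | Does the tool call proposed by the model conform to schema, purpose and permission? | Tool Contract Card, JSON Schema, PDP policy. |
| Tool to source system | Is the connector least-privilege, auditable and idempotency-aware? | OpenAPI contract, auth scope, audit header standard. |
| Source system to event bus | Does the event represent a fact, can it be replayed, and does it have a schema owner? | AsyncAPI, CloudEvents, schema registry. |
| Event bus to agent consumer | Does agent consumption handle dedupe, rate limiting, DLQ and replay safety? | Consumer contract, inbox schema, replay policy. |
| Tool output to model context | Is the observation treated as evidence rather than as instruction? | Observation schema, trust label, source provenance. |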

3. OpenAPI for Agent Tools and Synchronous APIs

OpenAPI's value is not limited to generating SDKs for developers. For AI agents it is the machine-readable contract for synchronous capabilities, and it can directly drive the tool catalog, schema validation, policy review, mock server, contract tests and eval fixtures.

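3.1 Which Capabilities Suit OpenAPI

| Capability | Why it fits | Caveats |
| --- | --- | --- |
| Read query | Needs an immediate response, has clear parameters, can be paginated | Do not return overly broad fields; responses must carry entitlement and sensitivity labels. |
| Search query | Needs complex filtering or structured retrieval beyond semantic search | When using POST /search, keep read-only semantics. |
| Command request | Write actions need an explicit request, response and error | Define idempotency, approval and audit headers for every write action. |
| Dry-run / impact preview | Compute the action's impact first, then approve or execute | The dry-run response must not modify the source of truth. |
| Decision service | Synchronously returns allow / deny / review / reason | State the policy version, rule id and input facts hash explicitly. |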

3.2 OpenAPI Operation Design for Tools

| OpenAPI element | Agent-oriented requirement |
| --- | --- |
| operationId | Maps stably to the tool name, e.g. paymentDisputeCreateRefundProposal, and does not change with copy edits. |
| summary | One-sentence business capability, with no authorization promise. |
| description | Includes allowed intent, non-goals, side effect tier and human review boundary. |
| requestBody | JSON Schema strictly defines fields, enums, formats, minimum and maximum values, and additionalProperties: false. |
| responses | Model success, validation error, entitlement denied, approval required, conflict and dependency timeout separately. |
| security | States user-delegated token, service identity, OAuth scope, mTLS or internal auth. |
| parameters | Required headers include correlation ID, actor, purpose, idempotency key and policy decision reference. |
| examples | Include normal, approval required, policy denied, conflict, timeout, redacted output. |

3.3 Tool Mapping Pattern

When exposing OpenAPI operations to an agent, do not hand the entire API surface to the model. Generate controlled tools through a tool gateway instead.

OpenAPI contract
-> API owner approval
-> tool capability card
-> risk tier classification
-> policy mapping
-> schema validation
-> model-facing tool definition
-> gateway execution
-> normalized observation
-> audit event

| API field | Tool-facing interpretation |
| --- | --- |
| operationId | Tool name baseline. |
| Request schema | Tool input schema. |
| Response schema | Observation schema. |
| Error response | Agent recoverability model. |
| Security scheme | Gateway credential and user delegation requirement. |
| Tags | Domain, owner, use case group. |
| Extensions | x-risk-tier, x-side-effect, x-approval-required, x-audit-event, x-idempotency-required |

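OpenAPI allows vendor extensions. An enterprise AI platform can standardize the following x- fields so that architecture review and automated checks become more stable.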

x-ai-tool:
  exposure: approved
  sideEffectTier: S3_INTERNAL_WRITE
  allowedIntents:
    - payment_dispute_investigation
    - customer_complaint_resolution
  disallowedIntents:
    - marketing_retention
    - general_goodwill_refund
  approval:
    required: conditional
    policy: refund_proposal_approval_policy_v3
  idempotency:
    required: true
    keySemantics: case_id + action_type + proposal_hash
  audit:
    eventType: com.momofinance.agent.tool.executed.v1
    requiredHeaders:
      - X-Correlation-Id
      - X-Actor-Id
      - X-Purpose
      - X-Policy-Decision-Id
  observation:
    trustLevel: trusted_system
    sourceSystem: payment-dispute-platform
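
A gateway could use such an extension to decide which operations ever become model-facing tools. The following is a minimal sketch under stated assumptions: `to_tool_definition` and the keys of the dictionary it returns (`name`, `description`, `input_schema`, `side_effect_tier`) are hypothetical names for this example, and the operation is assumed to be an already parsed OpenAPI operation object with any `$ref` resolved.

```python
from typing import Any, Optional


def to_tool_definition(operation: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Map one parsed OpenAPI operation to a model-facing tool definition.

    Returns None unless the operation is explicitly approved for agent use.
    """
    ext = operation.get("x-ai-tool", {})
    if ext.get("exposure") != "approved":
        return None  # default deny: no extension or no approval means no tool

    # Only the JSON request body becomes the tool input. Headers such as
    # Idempotency-Key or X-Actor-Id are left for the gateway to set.
    schema = (
        operation.get("requestBody", {})
        .get("content", {})
        .get("application/json", {})
        .get("schema", {"type": "object", "properties": {}})
    )
    return {
        "name": operation["operationId"],
        "description": operation.get("summary", ""),
        "input_schema": schema,
        "side_effect_tier": ext.get("sideEffectTier", "UNCLASSIFIED"),
    }


approved = {
    "operationId": "paymentDisputeCreateRefundProposal",
    "summary": "Create a refund proposal for a payment dispute case",
    "x-ai-tool": {"exposure": "approved", "sideEffectTier": "S3_INTERNAL_WRITE"},
}
internal_only = {"operationId": "ledgerPostEntry", "summary": "Post a ledger entry"}

print(to_tool_definition(approved)["name"])
print(to_tool_definition(internal_only))
```

The design choice worth noting is default deny: an operation without an explicit approval never reaches the model, which matches the flow above where owner approval and risk classification come before the model-facing definition.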

3.5 Example: Refund Proposal Command

This is a compact OpenAPI-style excerpt for an agent-safe command. It creates a proposal, not a direct refund execution.

openapi: 3.1.0
info:
  title: Payment Dispute Action API
  version: 2026.06.1
paths:
  /disputes/{case_id}/refund-proposals:
    post:
      operationId: paymentDisputeCreateRefundProposal
      summary: Create a refund proposal for a payment dispute case
      description: Creates an internal proposal for human review. It does not post ledger entries or send customer notifications.
      tags:
        - payment-dispute
        - agent-tool
      parameters:
        - name: case_id
          in: path
          required: true
          schema:
            type: string
            pattern: "^DSP-[0-9]{4}-[0-9]{5}$"
        - name: Idempotency-Key
          in: header
          required: true
          schema:
            type: string
            minLength: 32
            maxLength: 128
        - name: X-Correlation-Id
          in: header
          required: true
          schema:
            type: string
        - name: X-Actor-Id
          in: header
          required: true
          schema:
            type: string
        - name: X-Purpose
          in: header
          required: true
          schema:
            type: string
            enum:
              - dispute_investigation
              - complaint_resolution
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: "#/components/schemas/CreateRefundProposalRequest"
      responses:
        "201":
          description: Proposal created or idempotently returned
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/RefundProposalObservation"
        "400":
          description: Validation failure
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ToolError"
        "403":
          description: Entitlement or policy denied
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ToolError"
        "409":
          description: Idempotency conflict or case state conflict
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ToolError"
        "428":
          description: Additional approval is required before execution
          content:
            application/json:
              schema:
                $ref: "#/components/schemas/ApprovalRequired"
      x-ai-tool:
        exposure: approved
        sideEffectTier: S3_INTERNAL_WRITE
        approval:
          required: conditional
          policy: dispute_refund_proposal_policy_v3
        idempotency:
          required: true
          keySemantics: case_id + action_type + proposal_hash
components:
  schemas:
    CreateRefundProposalRequest:
      type: object
      required:
        - amount
        - currency
        - reason_code
        - evidence_refs
      additionalProperties: false
      properties:
        amount:
          type: string
          pattern: "^[0-9]+(\\.[0-9]{2})$"
        currency:
          type: string
          enum:
            - USD
        reason_code:
          type: string
          enum:
            - goods_not_received
            - duplicate_charge
            - merchant_error
            - fraud_claim
        evidence_refs:
          type: array
          minItems: 1
          items:
            type: string
        agent_rationale:
          type: string
          maxLength: 2000
    RefundProposalObservation:
      type: object
      required:
        - status
        - proposal_id
        - case_id
        - approval_status
        - audit_ref
        - source_system
      additionalProperties: false
      properties:
        status:
          type: string
          enum:
            - created
            - returned_existing
        proposal_id:
          type: string
        case_id:
          type: string
        approval_status:
          type: string
          enum:
            - pending_review
            - supervisor_required
            - policy_denied
        audit_ref:
          type: string
        source_system:
          type: string
          const: payment-dispute-platform
    ToolError:
      type: object
      required:
        - error_code
        - message
        - recoverability
        - audit_ref
      additionalProperties: false
      properties:
        error_code:
          type: string
          enum:
            - validation_failed
            - entitlement_denied
            - policy_denied
            - idempotency_conflict
            - case_state_conflict
            - dependency_timeout
        message:
          type: string
        recoverability:
          type: string
          enum:
            - agent_can_retry
            - human_review_required
            - do_not_retry
        audit_ref:
          type: string
    ApprovalRequired:
      type: object
      required:
        - required_approval_type
        - reason_code
        - approval_packet_ref
      additionalProperties: false
      properties:
        required_approval_type:
          type: string
          enum:
            - supervisor
            - dual_control
            - compliance_review
        reason_code:
          type: string
        approval_packet_ref:
          type: string

3.6 OpenAPI Review Questions

| Area | Review question |
| --- | --- |
| Business boundary | Does the operation represent one clear query or command, not a hidden workflow? |
| Tool exposure | Is this operation approved for agent use, or only for internal service-to-service use? |
| Input schema | Are fields typed, bounded, enumerated, and protected from extra parameters? |
| Output schema | Can an agent reliably distinguish success, pending, denied, conflict, and retryable failure? |
| Permission | Does execution depend on user, tenant, case relationship, purpose, and policy decision? |
| Side effect | Is the side effect tier explicit and enforced outside the model? |
| Idempotency | Does every write command accept an idempotency key and define its request-hash behavior? |
| Audit | Are correlation, actor, purpose, policy decision and approval references mandatory? |
| Version | Is the operation versioned and covered by contract tests before tool exposure changes? |
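
Some of these questions can be checked mechanically before a human review. The sketch below is a hypothetical lint, not an existing tool: `lint_operation` and its finding messages are invented for illustration, it treats every POST/PUT/PATCH/DELETE as a write (so a read-only `POST /search` would need an exemption), and it compares header names exactly even though HTTP header names are case-insensitive.

```python
WRITE_METHODS = {"post", "put", "patch", "delete"}
REQUIRED_WRITE_HEADERS = {"Idempotency-Key", "X-Correlation-Id", "X-Actor-Id", "X-Purpose"}


def lint_operation(method: str, operation: dict) -> list[str]:
    """Return review findings for one parsed OpenAPI operation object."""
    findings = []
    if "operationId" not in operation:
        findings.append("missing operationId")

    ext = operation.get("x-ai-tool")
    if ext is None:
        findings.append("missing x-ai-tool extension")
    elif "sideEffectTier" not in ext:
        findings.append("missing x-ai-tool.sideEffectTier")

    if method.lower() in WRITE_METHODS:
        declared = {
            p["name"]
            for p in operation.get("parameters", [])
            if p.get("in") == "header"
        }
        for name in sorted(REQUIRED_WRITE_HEADERS - declared):
            findings.append(f"write operation missing header {name}")
    return findings


operation = {
    "operationId": "paymentDisputeCreateRefundProposal",
    "parameters": [
        {"name": "X-Correlation-Id", "in": "header"},
        {"name": "X-Actor-Id", "in": "header"},
        {"name": "X-Purpose", "in": "header"},
    ],
    "x-ai-tool": {"exposure": "approved", "sideEffectTier": "S3_INTERNAL_WRITE"},
}
for finding in lint_operation("post", operation):
    print(finding)
```

A check like this fits a release gate: the contract fails review until the findings list is empty, and the remaining questions (business boundary, permission model) stay with human reviewers.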

4. AsyncAPI for Events and Event-Driven Agents

AsyncAPI describes asynchronous APIs: channels, messages, operations, servers and bindings. For AI agents, it is the backbone of event-driven integration.

4.1 When to Use Events

Use events when the message is a fact or state change that multiple consumers may need independently.

| Use event when | Financial retail example |
| --- | --- |
| A business fact has happened | payment.dispute.opened.v1 after a dispute case is created. |
| Multiple consumers react differently | Case dashboard, notification workflow, analytics, agent evidence worker. |
| Processing can be asynchronous | Agent creates case summary after event, not inside the customer transaction path. |
| Replay is useful | Rebuild evidence packs or regenerate summaries after a model or policy change. |
| Loose coupling matters | KYC status change should not require KYC service to call every downstream system synchronously. |

Do not use an event to hide a command. If the semantic meaning is “please perform a refund”, use a command API or workflow task. If the semantic meaning is “refund approved”, use an event.

4.2 Event Types for Agents

| Event category | Meaning | Agent usage |
| --- | --- | --- |
| Domain event | Business fact already happened | Trigger summary, enrichment, risk triage, workflow step. |
| Command result event | A command completed or failed | Observe execution and update case status. |
| Audit event | A controlled action or decision occurred | Support investigation, monitoring, policy evidence. |
| Evaluation event | Model/tool output evaluated | Feed EvalOps and release gates. |
| Control event | Platform-level state changed | Disable tool, rotate schema, pause consumer. |

4.3 AsyncAPI Design Elements

| AsyncAPI element | Agent-oriented requirement |
| --- | --- |
| servers | Broker endpoints and environment boundaries, with security scheme and region. |
| channels | Domain-oriented stream names, e.g. payments.disputes.events.v1 |
| operations | Explicit publish / subscribe ownership. |
| messages | CloudEvents envelope plus payload schema. |
| components.schemas | JSON Schema for event data and headers. |
| bindings | Kafka, AMQP, SNS/SQS, EventBridge or broker-specific constraints. |
| Examples | Include normal, redacted, high-risk, replay marker and schema evolution cases. |

4.4 CloudEvents Envelope Standard

CloudEvents provides a common event metadata envelope. It does not replace domain payload schema; it wraps it.

{
  "specversion": "1.0",
  "id": "evt_20260629_140501_00091",
  "source": "urn:service:payment-dispute",
  "type": "com.momofinance.payment.dispute.opened.v1",
  "subject": "dispute/DSP-2026-09291",
  "time": "2026-06-29T14:05:01Z",
  "datacontenttype": "application/json",
  "dataschema": "https://schemas.example.com/payment/dispute-opened/1.0.0",
  "correlationid": "corr_7d21d6c4",
  "causationid": "cmd_create_dispute_48391",
  "tenantid": "retail-us",
  "actorid": "service:dispute-case-api",
  "riskclass": "customer-impacting",
  "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
  "data": {
    "case_id": "DSP-2026-09291",
    "customer_ref": "cust_ref_9f22",
    "amount": "128.40",
    "currency": "USD",
    "reason_code": "goods_not_received",
    "opened_channel": "mobile_app"
  }
}

Recommended extension attributes:

| Attribute | Purpose |
| --- | --- |
| correlationid | Connect user request, model call, tool call, API command, workflow and event. |
| causationid | Identify the command, event or workflow step that caused this event. |
| tenantid | Enforce multi-tenant isolation and audit filtering. |
| actorid | Identify human, service, agent or workflow actor. |
| policydecisionid | Link to authorization, approval or guardrail decision. |
| schemaid | Immutable schema registry identifier. |
| riskclass | Low, restricted, customer-impacting, financial, regulated. |
| traceparent | Propagate OpenTelemetry distributed trace context. |
| replayid | Mark controlled replay and connect to replay approval. |
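
The extension names above are written without separators because CloudEvents 1.0 restricts attribute names to lowercase ASCII letters and digits. The sketch below builds an envelope and enforces that rule. `build_event` is a hypothetical helper written for this example (it is not the CloudEvents SDK), and the generated `evt_` id format is an assumption.

```python
import re
import uuid
from datetime import datetime, timezone

# CloudEvents 1.0: attribute names consist of lowercase letters and digits only.
ATTRIBUTE_NAME = re.compile(r"[a-z0-9]+")


def build_event(source: str, event_type: str, subject: str, data: dict,
                **extensions: str) -> dict:
    """Build a structured-mode CloudEvents envelope as a plain dict."""
    now = datetime.now(timezone.utc).isoformat(timespec="seconds")
    event = {
        "specversion": "1.0",
        "id": f"evt_{uuid.uuid4().hex}",       # hypothetical id format
        "source": source,
        "type": event_type,
        "subject": subject,
        "time": now.replace("+00:00", "Z"),    # RFC 3339 timestamp in UTC
        "datacontenttype": "application/json",
        "data": data,
    }
    for name, value in extensions.items():
        if not ATTRIBUTE_NAME.fullmatch(name):
            raise ValueError(f"invalid CloudEvents extension name: {name!r}")
        if name in event:
            raise ValueError(f"extension would overwrite core attribute: {name!r}")
        event[name] = value
    return event


event = build_event(
    "urn:service:payment-dispute",
    "com.momofinance.payment.dispute.opened.v1",
    "dispute/DSP-2026-09291",
    {"case_id": "DSP-2026-09291", "amount": "128.40", "currency": "USD"},
    correlationid="corr_7d21d6c4",
    tenantid="retail-us",
)
print(sorted(event))
```

A name such as `correlation_id` or `correlationId` is rejected, which is why the table uses `correlationid`.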

4.5 AsyncAPI Excerpt

asyncapi: 3.0.0
info:
  title: Payment Dispute Event API
  version: 2026.06.1
servers:
  production:
    host: events.internal.example.com
    protocol: kafka
    description: Production event broker for payment dispute domain
channels:
  payments.disputes.events.v1:
    address: payments.disputes.events.v1
    messages:
      DisputeOpened:
        $ref: "#/components/messages/DisputeOpened"
operations:
  publishDisputeOpened:
    action: send
    channel:
      $ref: "#/channels/payments.disputes.events.v1"
    messages:
      - $ref: "#/channels/payments.disputes.events.v1/messages/DisputeOpened"
  subscribeDisputeOpenedForAgentEvidence:
    action: receive
    channel:
      $ref: "#/channels/payments.disputes.events.v1"
    messages:
      - $ref: "#/channels/payments.disputes.events.v1/messages/DisputeOpened"
    x-agent-consumer:
      owner: ai-case-evidence-platform
      consumerGroup: dispute-evidence-agent-v1
      inboxDedupe: required
      dlqOwner: ai-integration-operations
      replayAllowed: approved_replay_only
components:
  messages:
    DisputeOpened:
      name: com.momofinance.payment.dispute.opened.v1
      title: Payment dispute opened
      contentType: application/cloudevents+json
      payload:
        $ref: "#/components/schemas/DisputeOpenedCloudEvent"
      examples:
        - name: mobile_goods_not_received
          payload:
            specversion: "1.0"
            id: evt_20260629_140501_00091
            source: urn:service:payment-dispute
            type: com.momofinance.payment.dispute.opened.v1
            subject: dispute/DSP-2026-09291
            time: "2026-06-29T14:05:01Z"
            datacontenttype: application/json
            dataschema: https://schemas.example.com/payment/dispute-opened/1.0.0
            correlationid: corr_7d21d6c4
            data:
              case_id: DSP-2026-09291
              customer_ref: cust_ref_9f22
              amount: "128.40"
              currency: USD
              reason_code: goods_not_received
              opened_channel: mobile_app
  schemas:
    DisputeOpenedCloudEvent:
      type: object
      required:
        - specversion
        - id
        - source
        - type
        - subject
        - time
        - datacontenttype
        - dataschema
        - data
      additionalProperties: true
      properties:
        specversion:
          type: string
          const: "1.0"
        id:
          type: string
        source:
          type: string
        type:
          type: string
          const: com.momofinance.payment.dispute.opened.v1
        subject:
          type: string
        time:
          type: string
          format: date-time
        datacontenttype:
          type: string
          const: application/json
        dataschema:
          type: string
          format: uri
        correlationid:
          type: string
        causationid:
          type: string
        traceparent:
          type: string
        data:
          $ref: "#/components/schemas/DisputeOpenedData"
    DisputeOpenedData:
      type: object
      required:
        - case_id
        - customer_ref
        - amount
        - currency
        - reason_code
        - opened_channel
      additionalProperties: false
      properties:
        case_id:
          type: string
        customer_ref:
          type: string
        amount:
          type: string
          pattern: "^[0-9]+(\\.[0-9]{2})$"
        currency:
          type: string
          enum:
            - USD
        reason_code:
          type: string
          enum:
            - goods_not_received
            - duplicate_charge
            - fraud_claim
            - merchant_error
        opened_channel:
          type: string
          enum:
            - mobile_app
            - branch
            - call_center
            - web

4.6 Event Consumer Contract

Every agent consumer should have a consumer contract, not just a subscribed topic.

| Field | Example decision |
| --- | --- |
| Consumer name | dispute-evidence-agent-v1 |
| Business purpose | Generate evidence summary draft for newly opened dispute cases. |
| Owner | AI Case Evidence Platform. |
| Input events | com.momofinance.payment.dispute.opened.v1 |
| Output artifacts | Internal evidence summary, no customer-visible output. |
| Output events | com.momofinance.agent.evidence_summary.generated.v1 |
| Dedupe | Inbox table keyed by CloudEvents id and consumer name. |
| Ordering | Per case_id; stale events rechecked against case API before action. |
| Retry | Exponential backoff for dependency timeout; no retry for schema errors or policy denials. |
| DLQ | Classified as schema, entitlement, dependency, policy, poison. |
| Replay | Requires replay approval, fixed schema version, fixed model / prompt / tool versions. |
| Kill switch | Can pause consumer without stopping dispute domain events. |
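
The Dedupe row can be sketched with an inbox table whose primary key is the pair (CloudEvents id, consumer name). This is a minimal illustration using an in-memory SQLite database. The table layout and the function names `open_inbox` and `claim_event` are assumptions for this example, and a production consumer would also need to commit the claim and its side effect atomically or make the side effect itself idempotent.

```python
import sqlite3


def open_inbox() -> sqlite3.Connection:
    """Create an in-memory inbox keyed by (event_id, consumer_name)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inbox ("
        " event_id TEXT NOT NULL,"
        " consumer_name TEXT NOT NULL,"
        " PRIMARY KEY (event_id, consumer_name))"
    )
    return conn


def claim_event(conn: sqlite3.Connection, event_id: str, consumer_name: str) -> bool:
    """Return True the first time this consumer sees the event, False on redelivery."""
    cursor = conn.execute(
        "INSERT OR IGNORE INTO inbox (event_id, consumer_name) VALUES (?, ?)",
        (event_id, consumer_name),
    )
    conn.commit()
    return cursor.rowcount == 1  # 0 rows changed means the key already existed


inbox = open_inbox()
consumer = "dispute-evidence-agent-v1"
first = claim_event(inbox, "evt_20260629_140501_00091", consumer)
redelivered = claim_event(inbox, "evt_20260629_140501_00091", consumer)
print(first, redelivered)  # True False
```

Including the consumer name in the key lets several consumers process the same event independently while each one still handles it at most once.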

5. JSON Schema for Structured IO

JSON Schema is the common structure layer across OpenAPI, AsyncAPI, tool calls, model output, observations and eval tests.

5.1 Where JSON Schema Should Be Used

| Surface | Why schema matters |
| --- | --- |
| Tool arguments | Prevent extra fields, wrong enums, prompt-injected parameters and malformed requests. |
| Tool observations | Make success, failure, partial data, source, sensitivity and freshness machine-readable. |
| Model structured output | Let orchestration validate extraction, classification, recommendation and draft metadata. |
| Event payloads | Stabilize producer / consumer contract and replay. |
| Decision service input | Freeze facts used for allow, deny, review and reason codes. |
| Eval fixtures | Use same schema for golden outputs, regression tests and red-team cases. |
| Audit events | Ensure evidence fields are consistently populated across tools and workflows. |

5.2 Schema Design Principles for Agent IO

| Principle | Practice |
| --- | --- |
| Close the shape | Use additionalProperties: false for agent-facing request and output schemas where feasible. |
| Prefer enums | Use controlled values for status, reason_code, risk_tier, decision, recoverability. |
| Separate data and explanation | Keep machine decision fields separate from human-readable rationale. |
| Include provenance | Every observation should identify source system, source record, retrieved time and data classification. |
| Distinguish unknown and absent | Use explicit values like unknown, not_applicable, not_authorized rather than empty strings. |
| Bound free text | Use maxLength for rationales, notes and summaries. |
| Model money safely | Use string decimal plus currency, not floating point number. |
| Make partial results explicit | Include completeness, missing_fields, source_errors for degraded paths. |
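
The "Model money safely" row is easy to demonstrate. The sketch below parses an amount string with the same two-decimal shape used by the `amount` pattern in the contracts in this playbook and contrasts it with binary floating point. `parse_amount` is a hypothetical helper name.

```python
import re
from decimal import Decimal

# Same shape as the contract pattern "^[0-9]+(\.[0-9]{2})$": digits, a dot, two digits.
AMOUNT_PATTERN = re.compile(r"[0-9]+\.[0-9]{2}")


def parse_amount(raw: str) -> Decimal:
    """Parse a contract amount such as "128.40" into an exact Decimal."""
    if not isinstance(raw, str) or not AMOUNT_PATTERN.fullmatch(raw):
        raise ValueError(f"amount must be a decimal string with two places: {raw!r}")
    return Decimal(raw)


total = parse_amount("128.40") + parse_amount("0.10")
print(total)              # 128.50
print(0.1 + 0.2 == 0.3)   # False: binary floats cannot represent these values exactly
```

Carrying the amount as a string across the contract boundary means no JSON parser along the path can silently convert it to a float.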

5.3 Observation Schema

Tool output should be a normalized observation. It should not be treated as instruction.

{
  "$id": "https://schemas.example.com/agent/tool-observation/1.0.0",
  "type": "object",
  "required": [
    "status",
    "tool_name",
    "tool_version",
    "source_system",
    "trust_level",
    "retrieved_at",
    "data_classification",
    "result",
    "audit_ref"
  ],
  "additionalProperties": false,
  "properties": {
    "status": {
      "type": "string",
      "enum": ["success", "partial", "denied", "failed"]
    },
    "tool_name": {
      "type": "string"
    },
    "tool_version": {
      "type": "string"
    },
    "source_system": {
      "type": "string"
    },
    "trust_level": {
      "type": "string",
      "enum": ["trusted_system", "trusted_policy", "untrusted_user_content", "vendor_content"]
    },
    "retrieved_at": {
      "type": "string",
      "format": "date-time"
    },
    "data_classification": {
      "type": "string",
      "enum": ["public", "internal", "pii", "pci", "restricted", "regulated"]
    },
    "result": {
      "type": "object"
    },
    "citations": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["source_ref", "record_type", "effective_time"],
        "additionalProperties": false,
        "properties": {
          "source_ref": {
            "type": "string"
          },
          "record_type": {
            "type": "string"
          },
          "effective_time": {
            "type": "string",
            "format": "date-time"
          }
        }
      }
    },
    "audit_ref": {
      "type": "string"
    }
  }
}

5.4 Structured Model Output Schema

For agent planning and recommendation, schema should separate extraction, proposal and confidence.

{
  "$id": "https://schemas.example.com/agent/action-proposal/1.0.0",
  "type": "object",
  "required": [
    "intent",
    "risk_tier",
    "proposed_action",
    "required_tools",
    "evidence_refs",
    "approval_expectation",
    "confidence",
    "rationale"
  ],
  "additionalProperties": false,
  "properties": {
    "intent": {
      "type": "string",
      "enum": [
        "dispute_investigation",
        "kyc_gap_review",
        "aml_evidence_summary",
        "customer_service_response",
        "policy_lookup"
      ]
    },
    "risk_tier": {
      "type": "string",
      "enum": ["S0", "S1", "S2", "S3", "S4", "S5"]
    },
    "proposed_action": {
      "type": "string",
      "enum": [
        "read_case",
        "draft_summary",
        "create_internal_task",
        "create_refund_proposal",
        "send_customer_notice",
        "route_to_human"
      ]
    },
    "required_tools": {
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "evidence_refs": {
      "type": "array",
      "minItems": 1,
      "items": {
        "type": "string"
      }
    },
    "approval_expectation": {
      "type": "string",
      "enum": ["none", "policy_check", "human_review", "dual_control", "not_allowed"]
    },
    "confidence": {
      "type": "number",
      "minimum": 0,
      "maximum": 1
    },
    "rationale": {
      "type": "string",
      "maxLength": 1500
    }
  }
}
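
To illustrate the validate-then-reject step for structured model output, here is a deliberately small checker. It is a hand-rolled sketch covering only the keywords used in this playbook's schemas (`type`, `enum`, `required`, `additionalProperties: false`, `minimum`, `maximum`, `maxLength`, `minItems`, `items`). It is not a JSON Schema implementation, and a real gateway would use a complete validator. `validate` and the trimmed `PROPOSAL_SCHEMA` are names made up for this example.

```python
def validate(instance, schema: dict, path: str = "$") -> list[str]:
    """Return a list of violations for a small subset of JSON Schema keywords."""
    is_type = {
        "object": lambda v: isinstance(v, dict),
        "array": lambda v: isinstance(v, list),
        "string": lambda v: isinstance(v, str),
        "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    }
    expected = schema.get("type")
    if expected in is_type and not is_type[expected](instance):
        return [f"{path}: expected {expected}"]

    errors = []
    if "enum" in schema and instance not in schema["enum"]:
        errors.append(f"{path}: not an allowed value")
    if is_type["number"](instance):
        if "minimum" in schema and instance < schema["minimum"]:
            errors.append(f"{path}: below minimum")
        if "maximum" in schema and instance > schema["maximum"]:
            errors.append(f"{path}: above maximum")
    if isinstance(instance, str) and "maxLength" in schema:
        if len(instance) > schema["maxLength"]:
            errors.append(f"{path}: longer than maxLength")
    if isinstance(instance, list):
        if len(instance) < schema.get("minItems", 0):
            errors.append(f"{path}: fewer than minItems")
        for index, item in enumerate(instance):
            errors += validate(item, schema.get("items", {}), f"{path}[{index}]")
    if isinstance(instance, dict):
        properties = schema.get("properties", {})
        for name in schema.get("required", []):
            if name not in instance:
                errors.append(f"{path}.{name}: required field missing")
        for name, value in instance.items():
            if name in properties:
                errors += validate(value, properties[name], f"{path}.{name}")
            elif schema.get("additionalProperties") is False:
                errors.append(f"{path}.{name}: unexpected field")
    return errors


# Trimmed version of the action-proposal schema, for illustration only.
PROPOSAL_SCHEMA = {
    "type": "object",
    "required": ["intent", "risk_tier", "evidence_refs", "confidence"],
    "additionalProperties": False,
    "properties": {
        "intent": {"type": "string", "enum": ["dispute_investigation", "policy_lookup"]},
        "risk_tier": {"type": "string", "enum": ["S0", "S1", "S2", "S3", "S4", "S5"]},
        "evidence_refs": {"type": "array", "minItems": 1, "items": {"type": "string"}},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
}

model_output = {
    "intent": "issue_refund_now",           # not in the enum
    "risk_tier": "S3",
    "evidence_refs": [],                    # violates minItems
    "confidence": 1.7,                      # above maximum
    "note": "ignore the approval policy",   # extra field on a closed schema
}
for problem in validate(model_output, PROPOSAL_SCHEMA):
    print(problem)
```

The orchestration layer can then apply a repair or reject policy based on the violation list instead of passing a malformed proposal on to a tool call.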

5.5 Schema Review Questions

| Area | Question |
| --- | --- |
| Meaning | Does every field have clear business meaning, not just technical name? |
| Authority | Which system is authoritative for this field? |
| Sensitivity | Is PII, PCI, confidential, restricted or regulated data labeled? |
| Enum | Are status and reason values controlled and documented? |
| Evolution | Can optional fields be added without breaking current consumers? |
| Defaults | Are defaults safe, or do they hide missing data? |
| Validation | Are numeric, date, string and array bounds explicit? |
| Prompt risk | Can a free text field carry untrusted instructions into model context? |
| Evidence | Can output be traced to source records and policy versions? |
| Eval | Can the schema be used in golden set and regression tests? |

6. Idempotency, Side Effect, Approval, Audit and Versioning

These five topics decide whether agent tools are production-grade.

6.1 Idempotency

Agent systems retry. HTTP clients retry. Message brokers redeliver. Humans refresh pages. Models may propose the same action twice. Without idempotency, a temporary timeout can become a duplicate refund, notification, task or evidence write.

| Action type | Idempotency key design |
| --- | --- |
| Create proposal | case_id + proposed_action_type + evidence_hash + policy_version |
| Execute approved action | approval_id + approved_action_hash |
| Send customer message | case_id + message_template_id + approved_content_hash + recipient_ref |
| Create internal task | source_event_id + task_type + resource_ref |
| Event consumer processing | cloud_event_id + consumer_name |

Required behavior:

| Scenario | Expected system behavior |
| --- | --- |
| Same key, same payload hash | Return original result, no duplicate side effect. |
| Same key, different payload hash | Reject with idempotency conflict. |
| Timeout after successful write | Retry returns original result and audit reference. |
| Event replay | Consumer inbox detects processed event and marks replay as duplicate-safe. |
| Expired key | Retention policy must match business risk and dispute / audit period. |
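
The first three rows can be sketched in code. This is a minimal illustration, assuming an in-memory store and a hypothetical `IdempotentCommandHandler` wrapper; a production system would persist keys with the retention policy described above and make the lookup and write atomic.

```python
import hashlib
import json
from dataclasses import dataclass


class IdempotencyConflict(Exception):
    """The same key was reused with a different payload."""


def payload_hash(payload: dict) -> str:
    # Canonical JSON so that key order does not change the hash.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class StoredResult:
    payload_hash: str
    result: dict


class IdempotentCommandHandler:
    """Wraps a write command so that retries do not repeat the side effect."""

    def __init__(self, execute):
        self._execute = execute                      # the real side-effecting call
        self._store: dict[str, StoredResult] = {}    # assumption: in-memory only

    def handle(self, key: str, payload: dict) -> dict:
        digest = payload_hash(payload)
        existing = self._store.get(key)
        if existing is not None:
            if existing.payload_hash != digest:
                # Same key, different payload hash: reject, no side effect.
                raise IdempotencyConflict(key)
            # Same key, same payload hash: return the original result.
            return {**existing.result, "status": "returned_existing"}
        result = self._execute(payload)
        self._store[key] = StoredResult(digest, result)
        return result
```

A retry after a timeout reuses the same key, so the caller receives the stored result, including whatever audit reference the original result carried.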

6.2 Side Effect

Every contract should declare side effect in a way both humans and automation can enforce.

| Side effect attribute | Example |
| --- | --- |
| sideEffectTier | S3_INTERNAL_WRITE |
| sourceOfTruthChange | case_management.case_note.created |
| customerVisible | false for internal note, true for outbound notice. |
| financialMovement | false for proposal, true for ledger posting. |
| reversible | compensating_action_required for incorrect notice. |
| compensationOwner | Payment Dispute Operations. |

Design rule:

If a tool can alter funds, account status, customer rights, KYC/AML state, complaint record or customer-visible communication,
it must not be exposed as a low-friction model tool. It needs policy gate, approval binding, audit and recovery design.
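
Because the side effect attributes are declared as data, a gateway can apply the design rule mechanically. The sketch below assumes that the tier string starts with `S` followed by a single digit and that tier 4 and above always need gated execution; the names `SideEffectDeclaration` and `exposure_mode` and the two return values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SideEffectDeclaration:
    side_effect_tier: str       # e.g. "S3_INTERNAL_WRITE"
    customer_visible: bool
    financial_movement: bool


def exposure_mode(decl: SideEffectDeclaration) -> str:
    """Decide whether a tool may be offered to the model as a direct tool."""
    tier_level = int(decl.side_effect_tier[1])   # "S3_..." -> 3
    if decl.financial_movement or decl.customer_visible or tier_level >= 4:
        # Needs policy gate, approval binding, audit and recovery design.
        return "gated_execution"
    # Still passes through the tool gateway, but without an approval step.
    return "model_callable"
```

For example, an internal note declared as `S3_INTERNAL_WRITE` with both flags false would be `model_callable`, while any declaration with `customer_visible` set to true would be `gated_execution`.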

6.3 Approval

Approval must bind to the exact action, parameters, evidence and version. A generic “approved by supervisor” is too weak.

| Approval field | Requirement |
| --- | --- |
| approval_id | Stable identifier passed into execution command. |
| approval_subject | Action type and business object, e.g. dispute refund proposal for DSP-2026-09291. |
| approved_parameters_hash | Hash of amount, currency, recipient, reason, evidence refs and policy version. |
| approver_identity | Human identity, role, tenant and assignment relationship. |
| approval_scope | Single action, case-limited, amount-limited, expiry-limited. |
| approval_expiry | Time after which execution requires renewed approval. |
| human_edit_diff | Difference between agent draft and approved final content. |
| policy_decision_id | Policy result that determined approval requirement. |
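
Binding approval to exact parameters can be sketched as a hash comparison at execution time. This is an illustration under assumptions: the approval record is a plain dict using the field names above, `check_execution` and its return values are hypothetical, and timestamps are timezone-aware.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone


def parameters_hash(params: dict) -> str:
    # Canonical JSON so the hash is stable across key order.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def check_execution(approval: dict, command: dict, now: datetime) -> str:
    """Return "execute" only when the command matches what was approved."""
    if command.get("approval_id") != approval["approval_id"]:
        return "approval_required"
    if now >= approval["approval_expiry"]:
        return "approval_required"      # expired: renewed approval needed
    if parameters_hash(command["parameters"]) != approval["approved_parameters_hash"]:
        return "approval_mismatch"      # parameters changed after approval
    return "execute"


# Usage: the approval is bound to one exact set of parameters.
approved = {"amount": "128.40", "currency": "USD", "reason": "duplicate_charge",
            "evidence_refs": ["ev-1"], "policy_version": "2026.06"}
issued_at = datetime(2026, 6, 1, 9, 0, tzinfo=timezone.utc)
approval = {
    "approval_id": "APR-1",
    "approved_parameters_hash": parameters_hash(approved),
    "approval_expiry": issued_at + timedelta(hours=4),
}
```

Changing the amount after approval changes the hash, so the execution command is rejected rather than silently running with new parameters.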

6.4 Audit

Audit evidence must connect the chain from user request to final side effect.

user request
-> identity and purpose
-> model call
-> context resources
-> tool proposal
-> schema validation
-> policy decision
-> approval packet
-> API command
-> event emitted
-> observation returned
-> final response

Minimum audit fields:

| Field | Why it matters |
| --- | --- |
| correlation_id | Links full business interaction. |
| causation_id | Explains why a tool call or event happened. |
| actor_id | Human, agent, service or workflow actor. |
| agent_id | Agent identity and owner. |
| model_id | Model version used for proposal or draft. |
| prompt_version | Prompt or orchestration version. |
| tool_name | Tool identity and version. |
| input_hash | Sensitive input reference without overlogging raw PII. |
| output_ref | Durable reference to observation or artifact. |
| policy_decision_id | PDP result and version. |
| approval_id | Human approval or dual control reference. |
| trace_id | OpenTelemetry trace link. |
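
A completeness check over these fields is easy to automate. The sketch below is an assumption-laden illustration: the helper names are hypothetical, and it only checks that each key is present, so that a field such as `approval_id` can be explicitly null on calls that needed no approval.

```python
import hashlib
import json

MINIMUM_AUDIT_FIELDS = (
    "correlation_id", "causation_id", "actor_id", "agent_id", "model_id",
    "prompt_version", "tool_name", "input_hash", "output_ref",
    "policy_decision_id", "approval_id", "trace_id",
)


def input_hash(raw_input: dict) -> str:
    """Hash the tool input so the audit record does not carry raw PII."""
    canonical = json.dumps(raw_input, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def missing_audit_fields(record: dict) -> list[str]:
    """Return the minimum audit fields that are absent from the record."""
    return [name for name in MINIMUM_AUDIT_FIELDS if name not in record]
```

The same check can run against denied calls, which is where audit fields are most often dropped.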

6.5 OpenTelemetry for Agent Contract Observability

OpenTelemetry should trace across AI orchestration, tool gateway, API, event producer and event consumer.

Recommended spans:

| Span | Key attributes |
| --- | --- |
| agent.request | agent.id, session.id, user.role, tenant.id, purpose |
| model.generate | model.id, prompt.version, input.tokens, output.tokens, eval.route |
| tool.validation | tool.name, tool.version, schema.id, validation.result |
| policy.evaluate | policy.id, policy.version, decision, reason_code |
| approval.create | approval.type, risk.tier, case.id, expiry |
| api.command | http.route, operation.id, idempotency.key.hash, status |
| event.publish | event.type, event.id, schema.id, topic |
| event.consume | consumer.name, event.id, inbox.status, replay.id |

Metrics:

| Metric | Product / governance use |
| --- | --- |
| Tool call success / denial rate | Indicates friction, misuse, missing permissions or policy drift. |
| Approval required / approved / rejected rate | Measures automation boundary quality and operational load. |
| Schema validation failure rate | Detects model output drift or contract mismatch. |
| Idempotency conflict count | Finds repeated proposals, retries and unsafe clients. |
| DLQ count by category | Separates schema, entitlement, dependency, policy and poison failures. |
| Consumer lag | Shows event-driven agent backlog and SLA risk. |
| Contract version adoption | Tracks migration and stale consumers. |
| Manual override rate | Signals model recommendation quality or policy mismatch. |

6.6 Versioning

Versioning must cover more than APIs.

| Artifact | Version policy |
| --- | --- |
| OpenAPI | Semantic version per service contract; breaking changes require new major or new operation. |
| AsyncAPI | Version channel / message and document; breaking event semantic change requires new event type. |
| JSON Schema | Immutable schema ID; compatibility checked before release. |
| Tool contract | Version when side effect, permission, approval, input, output or failure model changes. |
| Prompt | Version prompt templates that influence tool selection, extraction or final wording. |
| Eval set | Version golden sets and red-team cases linked to contract releases. |
| Policy | Version PDP policy, decision table, reason mapping and approval rule. |
| Workflow | Version task IO schema, state transition, timeout and compensation map. |

7. Contract Testing, Mocking and Simulation

Contract-first only works when contracts are executable in tests.

7.1 Test Layers

| Layer | Test goal |
| --- | --- |
| Schema validation | Reject malformed tool arguments, extra fields, wrong enum, missing evidence. |
| Provider contract test | API / event producer conforms to OpenAPI / AsyncAPI and JSON Schema. |
| Consumer contract test | Agent consumer and downstream services tolerate compatible changes and reject breaking ones. |
| Tool gateway test | Policy, approval, idempotency, rate limit, audit and observation normalization work. |
| Model-output test | Model structured output conforms to schema under normal and adversarial prompts. |
| Event replay test | Replay uses inbox dedupe, replay marker, fixed versions and no duplicate side effect. |
| Failure-mode test | Validation, entitlement denied, approval required, conflict, dependency timeout handled differently. |
| End-to-end trace test | Correlation ID links model, tool, API, event, workflow and audit store. |

7.2 Contract Test Cases

| Case | Expected result |
| --- | --- |
| Agent sends unknown field override_approval | Schema validation rejects before policy or API call. |
| Amount string uses 128.4 instead of 128.40 | Validation failure with recoverability agent_can_retry. |
| User lacks case entitlement | PDP returns deny, tool gateway emits denied audit event, no API call. |
| Refund proposal repeats after timeout | Same idempotency key returns original proposal and audit ref. |
| Same idempotency key with different amount | API returns conflict and no side effect. |
| Event payload removes required case_id | Consumer contract test fails release. |
| New optional event field added | Existing consumer test passes, field ignored or logged. |
| Agent consumes replayed event | Inbox marks duplicate or replay-safe processing path. |
| Tool output contains instruction text | Context composer labels output as observation, not instruction. |
| Approval expires before execution | Command returns approval required, execution blocked. |
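
The first two cases can be written as executable checks. This sketch assumes the refund proposal request uses the required fields listed in the Tool Contract Card in section 9.1, and that the money rule means exactly two fraction digits; the function name and the shape of the failure object are hypothetical.

```python
import re
from typing import Optional

REQUIRED = ("case_id", "amount", "currency", "reason_code",
            "evidence_refs", "agent_rationale")
# Assumption: a decimal string with exactly two fraction digits, e.g. "128.40".
AMOUNT_PATTERN = re.compile(r"\d+\.\d{2}")


def validate_refund_request(args: dict) -> Optional[dict]:
    """Return None when valid, otherwise a structured validation failure."""
    problems = [f"missing field: {f}" for f in REQUIRED if f not in args]
    problems += [f"unknown field: {f}" for f in sorted(set(args) - set(REQUIRED))]

    amount = args.get("amount")
    if "amount" in args and not (isinstance(amount, str)
                                 and AMOUNT_PATTERN.fullmatch(amount)):
        problems.append("amount must be a decimal string with two fraction digits")

    refs = args.get("evidence_refs")
    if "evidence_refs" in args and not (isinstance(refs, list) and refs):
        problems.append("evidence_refs needs at least one reference")

    if not problems:
        return None
    return {
        "error": "validation_failed",
        "recoverability": "agent_can_retry",
        "problems": problems,
    }
```

Because the check runs before policy evaluation, a request that carries `override_approval` never reaches the PDP or the backing API.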

7.3 Mocking Strategy

| Mock type | Use |
| --- | --- |
| OpenAPI mock server | Let agent orchestration and tool gateway test request / response without source system dependency. |
| AsyncAPI event simulator | Publish sample CloudEvents to test consumers, DLQ and replay. |
| Policy decision stub | Test allow, deny, approval required and redaction paths. |
| Approval service stub | Test approval binding, expiry and human edit diff. |
| Error injector | Simulate timeout, 429, 409 conflict, poison event and schema mismatch. |
| Audit collector mock | Verify required fields are emitted even on denied calls. |

7.4 Simulation for Product and Risk

Simulation should answer product and governance questions, not just pass unit tests.

| Simulation question | Evidence |
| --- | --- |
| How many tool calls become approval-required after new policy? | Approval workload report by queue, role, region and risk tier. |
| Does a schema change break agent extraction or event consumers? | Consumer compatibility report and model-output schema failure rate. |
| Does new tool exposure increase denied calls or prompt injection risk? | Red-team run, denied call taxonomy, false allow count. |
| Can replay regenerate summaries without duplicate writes? | Replay report with inbox, idempotency and output diff. |
| Does customer impact remain within agreed boundary? | Case-level outcome comparison and complaint / SLA risk estimate. |

7.5 Release Gate

| Gate | Required evidence | Pass standard |
| --- | --- | --- |
| G1 Contract completeness | OpenAPI / AsyncAPI / JSON Schema / Tool Contract Card | Owner, schema, auth, side effect, audit and version complete. |
| G2 Risk classification | Side effect tier and customer / regulatory impact | No high-risk tool without approval and recovery design. |
| G3 Policy enforcement | PDP / PEP map and tests | High-risk paths cannot bypass PEP. |
| G4 Contract tests | Provider and consumer contract test run | No critical failure, breaking change reviewed. |
| G5 Eval regression | Model output and tool selection eval | No unauthorized tool call, no schema-breaking output in critical cases. |
| G6 Simulation | Historical and synthetic scenario report | Decision delta, approval load and customer impact explained. |
| G7 Observability | Trace, metrics, audit sample | Correlation links model, tool, API, event and approval. |
| G8 Rollback | Disable tool, pause consumer, revert schema route | Tested in non-prod with named owner. |

8. Financial Retail Case Studies

8.1 Payment Dispute Agent

Scenario

An internal agent assists analysts with payment dispute cases. It reads case details, retrieves transaction evidence, drafts a summary, creates an internal refund proposal and triggers human review. It does not directly post ledger entries.

Contract Map

| Capability | Contract | Risk tier | Controls |
| --- | --- | --- | --- |
| Get dispute case | OpenAPI query | S1 | Case entitlement, field masking, freshness timestamp. |
| Search related transactions | OpenAPI query | S1 | Customer relationship check, pagination, PCI redaction. |
| Draft evidence summary | JSON Schema output | S2 | Citations, source refs, confidence, human review. |
| Create refund proposal | OpenAPI command | S3 | Idempotency, approval policy, audit event. |
| Refund approved event | AsyncAPI + CloudEvents | S4 result | Ledger system is producer, agent only observes. |
| Send customer notice | OpenAPI command via workflow | S4 | Human-approved content, template, dual control by threshold. |

Key Design

Agent creates a proposal.
Workflow and policy decide approval.
Ledger service executes.
Domain event confirms.
Audit chain preserves evidence.

Failure Modes

| Failure | Contract-first response |
| --- | --- |
| Agent proposes refund without evidence | Request schema requires evidence refs; policy denies missing evidence. |
| Duplicate proposal after timeout | Idempotency returns original proposal. |
| Analyst changes amount after approval | Execution command rejects because approved parameter hash changed. |
| Event replay triggers duplicate task | Inbox dedupe prevents duplicate workflow task. |
| Customer notice schema changes | Consumer contract test catches missing approved wording field. |

8.2 KYC Document Review Agent

Scenario

An agent assists KYC analysts by reading document metadata, extracting fields, comparing jurisdiction / product policy and drafting gaps. It does not approve or reject customers.

Contract Map

| Capability | Contract | Risk tier | Controls |
| --- | --- | --- | --- |
| Retrieve document metadata | OpenAPI query | S1 | PII restriction, case entitlement. |
| Extract document facts | Structured model output schema | S2 | Confidence, field provenance, low-confidence review. |
| Evaluate document sufficiency | Decision API schema | S3 decision support | Policy version, rule id, reason code. |
| KYC case status changed | AsyncAPI + CloudEvents | S5 domain fact | Produced by workflow, not by agent. |
| Create reviewer task | OpenAPI command | S3 | Idempotency key by case + missing doc + policy version. |

Key Design

The agent output is a gap analysis artifact, not a KYC decision. Final state transition remains in KYC workflow and named human / policy owner path.

8.3 AML Evidence Summarization Agent

Scenario

An AML operations team uses an agent to gather transaction patterns, customer profile references and prior alerts for investigator review. The agent drafts evidence packs and narrative candidates.

Contract Map

| Capability | Contract | Risk tier | Controls |
| --- | --- | --- | --- |
| Subscribe to aml.alert.evidence_requested.v1 | AsyncAPI | S2 trigger | Consumer contract, DLQ, replay approval. |
| Query account activity | OpenAPI query | S1 restricted | AML purpose, data minimization, trace. |
| Generate narrative draft | JSON Schema output | S2 | Distinguish facts, inference, uncertainty, missing evidence. |
| Close alert | Not exposed as agent tool | S5 | Human investigator and AML system only. |
| SAR-related artifact | Workflow task contract | S5 support | Human review, legal / compliance policy path. |

Key Design

AML agent can assemble and explain evidence.
It cannot make the final suspicious activity determination or submit regulatory material.

8.4 Customer Service Action Agent

Scenario

A service agent helps representatives respond to customer inquiries, find policy, draft replies, create case notes and propose next actions.

Contract Map

| Capability | Contract | Risk tier | Controls |
| --- | --- | --- | --- |
| Search policy | OpenAPI / resource schema | S1 | Effective date, jurisdiction, product filter. |
| Draft response | JSON Schema output | S2 | Policy citations, prohibited promise check. |
| Update case note | OpenAPI command | S3 | Idempotency, audit, no customer-visible effect. |
| Send customer message | OpenAPI command + approval | S4 | Human approval, approved template, delivery receipt event. |
| Escalate complaint | Workflow task | S4 | Complaint policy, SLA, audit, queue owner. |

Key Design

Customer-visible commitment is not just language generation. It is a controlled action with approved wording, policy basis, send receipt, audit and recovery path.


9. Templates and Copyable Artifacts

The following artifacts use concrete example values so they can be adapted without relying on blank fields.

9.1 Tool Contract Card

# Tool Contract Card: payment_dispute.create_refund_proposal

## Identity
- Tool name: payment_dispute.create_refund_proposal
- Tool version: 2026.06.1
- Backing API operationId: paymentDisputeCreateRefundProposal
- Business owner: Payment Dispute Operations
- Technical owner: Payment Dispute Platform
- Risk owner: Retail Banking Operational Risk
- Source system: Payment Dispute Case Platform

## Business Capability
- Creates an internal refund proposal for a dispute case.
- Does not post ledger entries.
- Does not send customer communication.
- Supports dispute investigation and complaint resolution workflows.

## Allowed Intents
- dispute_investigation
- complaint_resolution

## Disallowed Intents
- marketing_retention
- discretionary_goodwill_refund_without_case
- customer_negotiation_pressure

## Input Contract
- Schema ID: https://schemas.example.com/payment/refund-proposal-request/1.0.0
- Required fields: case_id, amount, currency, reason_code, evidence_refs, agent_rationale
- Additional fields: rejected
- Money representation: decimal string plus ISO currency
- Evidence rule: at least one source evidence reference

## Output Contract
- Schema ID: https://schemas.example.com/payment/refund-proposal-observation/1.0.0
- Status values: created, returned_existing, policy_denied, approval_required
- Required provenance: source_system, proposal_id, audit_ref, policy_decision_id

## Risk and Permission
- Side effect tier: S3_INTERNAL_WRITE
- Customer visible: false
- Financial movement: false
- Required user relationship: assigned analyst or supervisor on the dispute case
- Required purpose: dispute_investigation or complaint_resolution
- Tenant boundary: retail-us
- Human approval: required before any ledger execution
- Dual control: required when amount is 500.00 USD or above

## Idempotency
- Required: yes
- Key semantics: case_id + action_type + proposal_hash
- Same key and same payload: return original proposal
- Same key and different payload: reject with idempotency_conflict
- Retention: aligned to dispute case retention and audit policy

## Failure Model
- validation_failed: agent may correct the input and retry with schema-compliant arguments
- entitlement_denied: do not retry without permission change
- policy_denied: route to human with reason
- approval_required: create approval packet
- case_state_conflict: refresh case state
- dependency_timeout: safe retry with same idempotency key

## Audit
- Event type: com.momofinance.agent.tool.executed.v1
- Required identifiers: correlation_id, causation_id, actor_id, agent_id, tool_version, model_id, prompt_version
- Required references: case_id, input_hash, output_ref, policy_decision_id, approval_packet_ref
- Sensitive fields: amount logged, raw customer PII not logged

## Observability
- Trace span: api.command
- Metrics: success_count, denial_count, approval_required_count, idempotency_conflict_count, p95_latency
- Alert: conflict spike or policy_denied spike above agreed threshold

## Versioning
- Breaking input schema change requires new tool major version.
- Approval policy change requires tool contract review.
- Side effect tier change requires architecture and risk sign-off.

9.2 Event Contract

# Event Contract: com.momofinance.payment.dispute.opened.v1

## Event Identity
- Event type: com.momofinance.payment.dispute.opened.v1
- Channel: payments.disputes.events.v1
- Producer: Payment Dispute Case Platform
- Business meaning: A payment dispute case has been opened and assigned an authoritative case id.
- Event tense: past-tense domain fact
- CloudEvents content type: application/cloudevents+json

## Envelope
- specversion: 1.0
- id: globally unique event id
- source: urn:service:payment-dispute
- subject: dispute/{case_id}
- time: producer event creation time
- dataschema: https://schemas.example.com/payment/dispute-opened/1.0.0
- correlationid: original customer or analyst interaction correlation id
- causationid: command or workflow step that opened the dispute
- traceparent: W3C Trace Context value, propagated through OpenTelemetry

## Data Schema
- Required fields: case_id, customer_ref, amount, currency, reason_code, opened_channel
- PII rule: customer_ref is a reference, not raw PII
- Money rule: amount is decimal string, currency is enum
- Reason code rule: reason_code uses approved dispute taxonomy

## Consumer Contract
- AI evidence consumer: dispute-evidence-agent-v1
- Consumer action: generate internal evidence summary draft
- Consumer side effect: internal artifact only
- Inbox key: event id + consumer name
- Retry: dependency timeout only
- DLQ owner: AI Integration Operations
- Replay: approval required, replay marker required, idempotency validated

## Compatibility
- Adding optional data fields is compatible.
- Removing required fields is breaking.
- Changing reason_code meaning is breaking and requires new event type or major version.
- Moving raw PII into payload is prohibited without privacy and architecture review.

## Audit
- Event publish audit links producer command, actor, policy decision and trace id.
- Agent consumer audit links input event id, model version, prompt version, output artifact and schema version.
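
The inbox key in the consumer contract (event id + consumer name) can be sketched as follows. This is a minimal illustration with an in-memory inbox and hypothetical names `Inbox` and `consume`; a production consumer would record the inbox entry and its output in one durable transaction so that a crash between the two steps cannot cause duplicate side effects.

```python
class Inbox:
    """Remembers which (event id, consumer name) pairs were already processed."""

    def __init__(self):
        self._processed = set()     # assumption: in-memory only

    def seen(self, event_id: str, consumer_name: str) -> bool:
        return (event_id, consumer_name) in self._processed

    def record(self, event_id: str, consumer_name: str) -> None:
        self._processed.add((event_id, consumer_name))


def consume(event: dict, consumer_name: str, inbox: Inbox, handler) -> str:
    """Process a CloudEvent at most once per consumer; redeliveries are skipped."""
    event_id = event["id"]          # CloudEvents "id" attribute
    if inbox.seen(event_id, consumer_name):
        return "duplicate"
    handler(event)                  # e.g. generate the evidence summary draft
    inbox.record(event_id, consumer_name)
    return "processed"
```

The same event delivered to a different consumer name is still processed, which is why the consumer name is part of the key.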

9.3 Schema Review Checklist

| Review area | Pass standard |
| --- | --- |
| Business meaning | Every field has a domain definition and source of authority. |
| Data classification | PII, PCI, confidential, restricted and regulated fields are labeled. |
| Minimality | Payload excludes fields not needed by declared consumers or tool purpose. |
| Validation | Required fields, enum values, numeric bounds, string lengths and formats are explicit. |
| Money | Amount uses decimal string and currency enum. |
| Time | Timestamps use ISO date-time and define event time vs processing time. |
| Free text | Free text fields have max length, trust label and prompt injection handling. |
| Provenance | Output schema includes source refs, effective time and audit ref. |
| Error model | Failures are structured by validation, entitlement, policy, conflict, dependency and unknown. |
| Compatibility | Additive, breaking and prohibited changes are documented. |
| Eval reuse | Schema can be used for golden outputs and regression assertions. |
| Audit reuse | Schema includes identifiers needed to reconstruct decision chain. |

9.4 Compatibility Policy

# Contract Compatibility Policy: Agent Tool and Event Contracts

## Scope
This policy applies to OpenAPI operations exposed as agent tools, AsyncAPI event messages, JSON Schemas for model output, tool observations and audit events.

## Compatible Changes
- Add optional field with safe default behavior.
- Add enum value only when all consumers treat unknown values safely.
- Add response example without changing schema.
- Add non-required CloudEvents extension attribute.
- Improve description without changing semantics.

## Breaking Changes
- Remove required field.
- Rename field.
- Change field type, format, unit or money representation.
- Change enum meaning.
- Add required field.
- Change event business meaning while keeping the same event type.
- Change tool side effect tier.
- Change approval requirement in a way that reduces control strength.
- Change error code semantics.

## Prohibited Without Formal Review
- Add raw PII, PCI or regulated narrative to event payload where only references were approved.
- Expose S4 or S5 tool directly to autonomous agent execution.
- Remove audit identifiers from command, event or observation.
- Allow additional properties on high-risk tool input without schema review.

## Release Requirements
- Provider contract tests pass.
- Consumer contract tests pass for registered consumers.
- Model structured output eval passes for affected prompts.
- Policy and approval regression tests pass for affected tools.
- Simulation report explains decision delta, approval load and customer impact.
- Architecture, risk and business owners approve breaking or high-risk changes.

## Deprecation
- Deprecated versions remain observable in usage dashboard.
- Migration owner and consumer list are recorded.
- High-risk consumers receive migration tests before shutdown.
- Runtime rejects calls to retired S4 and S5 contracts after agreed sunset date.
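
Part of this policy can be checked mechanically before release. The sketch below compares two versions of a flat JSON Schema object and reports only the structural breaking changes listed above (removed or renamed field, added required field, changed type, removed enum value). It is an illustration under assumptions: `find_breaking_changes` is a hypothetical helper, it does not recurse into nested objects, and semantic changes such as a new enum meaning still need human review.

```python
def find_breaking_changes(old: dict, new: dict) -> list[str]:
    """List structural breaking changes between two flat JSON Schema objects."""
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    old_required = set(old.get("required", []))
    new_required = set(new.get("required", []))

    findings = []
    # A rename shows up as a removal, so both are reported the same way.
    for name in sorted(old_props.keys() - new_props.keys()):
        findings.append(f"removed or renamed field: {name}")
    for name in sorted(new_required - old_required):
        findings.append(f"added required field: {name}")
    for name in sorted(old_props.keys() & new_props.keys()):
        before, after = old_props[name], new_props[name]
        if before.get("type") != after.get("type"):
            findings.append(f"changed type: {name}")
        if "enum" in before and "enum" in after:
            dropped = set(before["enum"]) - set(after["enum"])
            if dropped:
                findings.append(f"removed enum value in {name}: {sorted(dropped)}")
    return findings
```

An empty result means only that no structural break was detected; adding an optional field passes, while adding the same field to `required` is reported.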

9.5 Portfolio Evidence Pack

# Portfolio Evidence Pack: Contract-First Agent Integration

## Case Title
Governed Payment Dispute Agent with Contract-First Tool and Event Design

## Business Problem
Dispute analysts need faster evidence gathering and proposal drafting, while refunds, customer notices and complaint handling require strong control, approval and audit.

## Architecture Decision
Use OpenAPI for synchronous case queries and refund proposal commands, AsyncAPI + CloudEvents for dispute domain events, JSON Schema for tool IO and model output, and OpenTelemetry for cross-system traceability.

## Core Artifacts
- Tool Contract Card for payment_dispute.create_refund_proposal
- OpenAPI excerpt for refund proposal command
- AsyncAPI excerpt for payment.dispute.opened event
- JSON Schema for action proposal and tool observation
- Compatibility policy for schema and tool changes
- Contract test matrix for validation, entitlement, idempotency, approval and replay
- Audit evidence chain linking user, model, tool, policy, approval, command and event

## Risk Controls
- No direct ledger posting by agent
- Human approval before customer-visible or financial execution
- Idempotency key for every write command
- Inbox dedupe for event-driven agent consumer
- Policy decision recorded for every allowed, denied and approval-required tool call
- Observability spans across model, gateway, API, event and consumer

## Business Impact Narrative
The design reduces analyst preparation time while keeping final financial and customer-visible actions under policy, workflow and human control. It also gives architecture, risk and audit teams contract evidence rather than relying on prompt behavior.

## Interview Proof Points
- Can explain why prompt guardrails are not a control boundary
- Can separate query, command, event, decision and workflow task
- Can design OpenAPI / AsyncAPI contracts for agent integration
- Can show idempotency, approval and audit design for financial retail use cases
- Can connect contract testing and eval to release governance

10. Operating Model

10.1 Contract Ownership

| Artifact | Business owner | Technical owner | Risk / governance owner |
| --- | --- | --- | --- |
| Tool Contract Card | Capability owner | Tool gateway / service owner | Operational risk / security |
| OpenAPI operation | Product domain owner | API platform / service team | Architecture review board |
| AsyncAPI event | Domain event owner | Event platform / producer team | Data governance / architecture |
| JSON Schema | Schema owner | Platform engineering | Data governance |
| Policy rule | Business / risk owner | Policy platform | Compliance / model risk |
| Approval workflow | Operations owner | Workflow platform | Risk / internal controls |
| Eval set | Product owner | EvalOps | Model risk / quality |
| Audit event | Control owner | Observability / audit platform | Internal audit / compliance |

10.2 Intake Workflow

1. Register capability and business outcome.
2. Classify capability as resource, query, command, event, decision, workflow task or tool.
3. Assign side effect tier and data classification.
4. Draft OpenAPI, AsyncAPI and JSON Schema artifacts as applicable.
5. Write Tool Contract Card with owner, permissions, approval, audit and version policy.
6. Map PDP / PEP enforcement points.
7. Define idempotency, replay, DLQ and recovery behavior.
8. Build mock and contract tests.
9. Run model-output eval and prompt injection tests.
10. Simulate customer impact, approval load and failure modes.
11. Approve release with architecture, risk and business owners.
12. Monitor adoption, denial, approval, schema failure, DLQ and incidents.

10.3 RACI

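| Activity | PM | BA / CBAP | Solution Architect | Engineering | Security | Risk / Compliance | Operations |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Capability taxonomy | A | R | R | C | C | C | C |
| Side effect tier | A | R | R | C | C | A/R | C |
| OpenAPI / AsyncAPI design | C | R | A/R | R | C | C | I |
| JSON Schema review | C | R | A/R | R | C | C | I |
| Policy and approval mapping | A | R | R | C | R | A/R | C |
| Contract tests | C | C | R | A/R | C | C | I |
| Simulation | A | R | R | R | C | A/R | R |
| Release approval | A | C | A/R | R | A/R | A/R | C |
| Audit evidence | C | R | R | R | C | A/R | R |
| Incident response | A | C | A/R | R | A/R | A/R | R |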

10.4 Metrics

| Metric | Why it matters |
| --- | --- |
| Approved tools by side effect tier | Shows exposure risk and platform maturity. |
| Tool reuse rate | Measures whether contract-first design creates reusable capabilities. |
| Schema validation failure rate | Detects model drift, prompt drift or contract mismatch. |
| Policy denied tool calls | Reveals misuse, unclear UX or missing entitlement. |
| Approval queue load | Measures operational impact of control design. |
| Idempotency conflict count | Finds unsafe retries, duplicate proposals or orchestration bugs. |
| Event DLQ by category | Separates schema, entitlement, dependency, policy and poison message problems. |
| Consumer lag for agent workers | Shows event-driven backlog and customer SLA risk. |
| Contract version adoption | Tracks migration and stale consumers. |
| Audit replay success rate | Proves evidence chain works under investigation. |

11. Interview Expression

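Q1: Why should AI agent tool design be contract-first?

30-second version:

When an agent calls a tool, it is not making a simple function call; it is reaching into enterprise systems, customer data, fund movements and compliance processes. A prompt can guide the model, but it cannot guarantee schema, permission, idempotency, approval, audit or compatibility. Contract-first uses OpenAPI, AsyncAPI, JSON Schema, CloudEvents and tool contracts to turn those boundaries into verifiable, testable and auditable system assets.

2-minute version:

I first split agent capabilities into query, command, event, decision and workflow task. Synchronous queries and commands are described with OpenAPI, asynchronous facts and event streams with AsyncAPI + CloudEvents, all inputs and outputs are constrained with JSON Schema, and traces are linked with OpenTelemetry. Every tool then gets a contract card: owner, side effect, allowed intent, permission, approval, idempotency, failure model, audit and versioning. The model can only propose a tool call; actual execution is controlled by the gateway, policy engine, approval workflow and source system.

Q2: How do OpenAPI and AsyncAPI divide the work in an agent architecture?

OpenAPI fits synchronous queries and commands: the agent needs to read a case, search transactions, create a proposal or call a decision service right away. AsyncAPI fits event-driven integration: a business fact has already happened, several consumers (the agent among them) react independently, and replay, DLQ and consumer contracts are needed. In short, a command asks for something to be done; an event states that something has happened. An agent should not chain synchronous API calls to imitate an enterprise process; multi-step work and human approval belong in a workflow, and state changes are announced by events.

Q3: What value does JSON Schema bring to an AI product?

JSON Schema is the control plane for agent IO. It constrains tool arguments, observations, structured model output, event payloads, decision input and audit events. Without a schema, agent output is only a textual convention, which makes contract tests, eval, replay and audit hard. In financial retail I pay particular attention to additionalProperties, enum, money representation, data classification, provenance, free text max length and the structured error model.

Q4: How do you design a high-risk write tool?

I do not expose a high-risk write tool directly to the model. The first step is to split it into proposal, approval and execution. The agent may create a proposal or draft; the approval workflow binds the exact parameters, evidence, policy version and approver; the execution command must carry approval_id and an idempotency key. The API returns a structured observation and also produces an audit event and a domain event. For funds, customer rights, KYC/AML state or outbound customer communication, human review or dual control is the default.

Q5: How do you handle replay and DLQ for an agent event consumer?

An agent consumer is governed like any production-grade consumer. Each CloudEvent is deduplicated in the inbox by event id + consumer name, and the DLQ is categorized by schema, entitlement, dependency, policy and poison. Replay requires approval, a replay marker and pinned schema / model / prompt / tool versions, plus confirmation that it will not trigger duplicate writes. High-risk replay also records the replay approval and a before/after diff as audit evidence.

Q6: How do you keep contract changes from breaking the agent?

I make the compatibility policy a release gate. Adding an optional field is usually compatible; removing a required field, changing a type, changing enum meaning, adding a required field, changing the side effect tier or weakening approval control are all breaking or high-risk changes. Before release, run provider contract tests, consumer contract tests, model-output eval, policy regression and simulation. A change in event semantics is more than a field change and usually needs a new event type or major version.

Q7: How do you design the audit chain?

Audit design starts with the contract. Every tool call records correlation_id, causation_id, actor_id, agent_id, model_id, prompt_version, tool_version, input_hash, policy_decision_id, approval_id, output_ref and trace_id. OpenTelemetry links model call, tool validation, policy evaluation, API command, event publish and event consume. That makes it possible to trace back from customer impact to the model recommendation, evidence source, human approval and final system action.

Q8: As a PM / BA / Architect, how do you turn this into a portfolio piece?

I would pick a financial retail scenario such as the payment dispute agent. The portfolio shows more than a chat interface: it shows the Tool Contract Card, OpenAPI command, AsyncAPI event, JSON Schema observation, compatibility policy, contract test matrix, approval design, idempotency design and audit trace. That demonstrates that I understand an AI product is not a prompt demo but a set of enterprise architecture, solution architecture and governance assets.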

12. Final Checklist

| Area | Advanced check |
| --- | --- |
| Capability taxonomy | Every agent capability is classified as resource, query, command, event, decision, workflow task or tool. |
| Contract artifacts | OpenAPI, AsyncAPI, JSON Schema and Tool Contract Card exist for exposed capabilities. |
| Prompt boundary | Prompt does not own authorization, side effect control, approval, audit or compatibility. |
| Schema rigor | Tool input, model output, observation, event payload and audit event are schema validated. |
| Side effects | Every tool has a side effect tier and enforcement path. |
| Approval | High-risk actions bind approval to exact parameters, evidence and policy version. |
| Idempotency | Every write command and event consumer has dedupe behavior. |
| Event governance | Events use CloudEvents envelope, AsyncAPI contract, consumer contract, DLQ and replay policy. |
| Observability | OpenTelemetry trace connects model, tool, API, event, workflow and audit. |
| Testing | Provider, consumer, model-output, policy, replay and failure-mode tests are part of release. |
| Versioning | API, event, schema, tool, prompt, policy, workflow and eval set versions are governed together. |
| Portfolio evidence | The case study shows product value, architecture quality and governance evidence. |

Core takeaway:

Production AI agents do not become safe because prompts ask them to behave.
They become governable when every tool, API, event and output is defined as a contract,
enforced by policy, tested before release, observable at runtime and auditable after impact.