AI Agent Protocols / MCP / A2A / Tool Integration Playbook

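Positioning: a playbook on agent tool integration and protocol adoption for AI Platform PMs, AI BAs, and AI Architects. Goal: understand how enterprise AI agents move from one-off custom plugins to protocols, tool contracts, capability discovery, policy boundaries, audit trails, and vendor integration governance. Core conclusion: an agent's value comes not only from model reasoning but from whether it can connect to external tools, data, and services in a way that is authorized, auditable, reversible, and governable.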

Source Anchors

These sources are the learning anchors for this playbook, used to align terminology, protocol boundaries, and risk language. They do not constitute legal, compliance, procurement, or security audit advice.

Source	Link	Usage in this playbook
MCP Specification 2025-03-26	https://modelcontextprotocol.io/specification/2025-03-26	Protocol baseline for MCP client / server / host, tools, resources, prompts, transport, authorization, roots, and sampling
MCP Official Docs	https://modelcontextprotocol.io/docs/getting-started/intro	Understanding MCP's product positioning: standardized way to connect AI applications with external context and tools
NIST AI RMF: Generative AI Profile	https://www.nist.gov/publications/artificial-intelligence-risk-management-framework-generative-artificial-intelligence	Organizing GenAI risk management, vendor governance, evaluation, and the evidence chain around govern / map / measure / manage
OWASP LLM01: Prompt Injection	https://genai.owasp.org/llmrisk/llm01-prompt-injection/	Identifying the risks that direct / indirect prompt injection poses to tool use, retrieval, and agent autonomy
A2A Protocol Specification	https://a2a-protocol.org/latest/specification/	Reference specification for agent-to-agent interoperability, used to understand concepts such as agent card, task, message, and artifact
A2A and MCP	https://a2a-protocol.org/latest/topics/a2a-and-mcp/	Distinguishing MCP (tool and context connection) from A2A (task collaboration and delegation between agents)

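1. Positioning and Relationship to Existing Documents

This playbook does not replace the existing AI platform, security, and context engineering documents. It connects them to a new question:

When an enterprise AI agent no longer just chats but has to call CRM, ticketing, KYC, payment dispute, read-only core banking interfaces, and vendor services,
how do we design the protocols, tool contracts, discovery mechanisms, authorization policies, audit events, and platform governance?

Existing document	Capability already provided	What this playbook adds
`docs/AI_PLATFORM_PM_PLAYBOOK.md`	Defines platform capabilities such as model gateway, RAG, tool gateway, eval, audit, cost, and adoption	Goes deeper on the tool gateway and integration platform: tool catalog, capability discovery, vendor onboarding, protocol ADR, integration backlog
`docs/AI_PLATFORM_SECURITY_GATEWAY_LAB.md`	Explains prompt injection, tool authorization, policy engine, HITL, audit, and kill switch	Embeds security gateway control points into MCP / A2A / tool contracts: permission, side effect, approval, rate limit, rollback, audit event
`docs/AI_CONTEXT_ENGINEERING_PLAYBOOK.md`	Designs the context system of task, intent, evidence, tools, memory, policy, output schema, and eval	Treats tool observations, resources, prompts, and capability metadata as part of a governable context supply chain

The relationship in brief:

The AI Platform PM Playbook explains which capabilities the platform needs to build.
The AI Platform Security Gateway Lab explains how tool calls are deployed safely.
The AI Context Engineering Playbook explains how context enters the model.
This playbook explains how agents and external capabilities are connected, discovered, authorized, audited, and governed through protocols and contracts.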

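2. Why Protocol Thinking Is Needed

2.1 The Shift from Custom Plugin to Protocol

Early enterprise AI integrations commonly wrote one custom plugin per use case:

A customer service assistant connects to the CRM.
An operations assistant connects to the ticketing system.
A compliance assistant connects to the policy knowledge base.
A payments assistant connects to dispute cases.
A KYC assistant connects to a vendor API.

This approach validates a POC quickly, but at scale it exposes duplicated cost and governance problems.

Problem	How it shows up with custom plugins	Improvement with protocol / tool contract
Duplicated onboarding	Every team redefines API wrappers, schemas, and error handling	A unified contract describes tool, resource, prompt, and capability
Scattered permissions	Authentication logic is written ad hoc inside each plugin	Platform-level identity, policy, permission, and approval are enforced uniformly
Incomparable risk	Each plugin describes side effects, PII, and outbound behavior differently	The tool catalog records risk tier, side effect, data scope, and audit event in one format
Hard to discover	The agent does not know which tools are available, and people do not know who is using them	Capability discovery exposes capability cards, schemas, owners, constraints, and versions
Broken audit	Only application logs are kept, with no unified action trail	Tool calls become standard audit events that can be replayed, investigated, and evaluated
Vendor lock-in	Deep coupling to each vendor SDK	Adapters and a protocol boundary lower the cost of replacement
Hard release gating	Security review has to inspect each implementation individually	Intake, contract, ADR, and risk checklist standardize the review

2.2 Protocol Adoption Is Not "Letting the Model Freely Call More Tools"

The goal of protocol thinking is not to expose every system to the model, but to standardize how external capabilities are connected: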

Agent intent
-> capability discovery
-> tool contract selection
-> argument schema validation
-> identity and policy check
-> human approval when required
-> connector execution
-> normalized observation
-> audit event
-> eval and incident replay
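
The flow above can be sketched as a small gateway that decides before it executes. This is a minimal illustration, not a reference implementation: `ToolContract`, `Gateway`, the role names, and the decision strings are all hypothetical, and a real platform would add identity resolution, rate limiting, and persistent audit storage.

```python
from dataclasses import dataclass, field


@dataclass
class ToolContract:
    # Hypothetical, heavily reduced contract: just enough to drive the checks.
    name: str
    required_args: set
    allowed_roles: set
    needs_approval: bool = False


@dataclass
class Gateway:
    catalog: dict                              # capability discovery: tool name -> contract
    audit_log: list = field(default_factory=list)

    def call(self, user_role, tool_name, args, approved=False, execute=None):
        contract = self.catalog.get(tool_name)           # tool contract selection
        if contract is None:
            decision = "deny_unknown_tool"
        elif not contract.required_args <= set(args):    # argument schema validation
            decision = "deny_invalid_args"
        elif user_role not in contract.allowed_roles:    # identity and policy check
            decision = "deny_policy"
        elif contract.needs_approval and not approved:   # human approval when required
            decision = "require_approval"
        else:
            decision = "allow"
        # Connector execution happens only after every check has passed.
        observation = execute(args) if decision == "allow" and execute else None
        # Every decision, including denials, becomes an audit event.
        self.audit_log.append({"tool": tool_name, "role": user_role, "decision": decision})
        return decision, observation
```

The model only proposes the call; the gateway owns the decision, so a denied or approval-pending call never reaches the connector and still leaves an audit record.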

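For PMs, BAs, and Architects, protocol adoption has to answer eight questions:

Question	Deliverable
Which capabilities can the agent discover?	Capability Discovery Matrix
What can each tool do, and what can it not do?	Tool Contract Card
Who can use it, and for which purposes?	Permission and Policy Matrix
Which actions have side effects?	Side Effect and Risk Tier
How do we recover after a failure?	Idempotency, rollback, compensation design
How do we audit?	Audit Event Schema
How do we onboard vendors?	Vendor Integration Intake
Why this protocol?	Protocol ADR

2.3 Protocol Maturity for Enterprise AI

Maturity	Typical state	Risk	Next step
Level 0: Manual	People copy and paste system data into the AI	Data leakage, no audit trail, no reuse	Identify high-frequency context and tool needs
Level 1: Custom plugin	One plugin per use case	Scattered permissions, duplicated build effort, hard to govern	Define tool contracts and audit events
Level 2: Tool gateway	Tools are registered and authenticated centrally	Weak discovery, vendor onboarding still fragmented	Build a tool catalog and capability discovery
Level 3: Protocol integration	MCP / A2A / internal protocols connect tools and agents	Uneven protocol support, complex version governance	Write protocol ADRs and a compatibility matrix
Level 4: Governed agent platform	Multiple business teams reuse tools, policies, audit, eval, and vendor integrations	Platform governance and organizational change are hard	Establish an operating model, risk review, and adoption metrics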

3. MCP Deep Dive

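3.1 What Problem MCP Solves

MCP, the Model Context Protocol, can be understood as a standard interface layer through which AI applications connect to external context and tools. It splits connection logic that used to be scattered across application code into clearer roles:

Host: the AI application itself, for example an IDE, chat application, agent workbench, or enterprise AI portal.
Client: the protocol client inside the host that connects to a specific MCP server.
Server: a service that exposes capabilities such as tools, resources, and prompts.
Transport: how the client and server communicate.

For an enterprise, the product significance of MCP is not just "one more interface standard". It lets tool integrations be:

Registered.
Discovered.
Authorized.
Versioned.
Monitored.
Audited.
Replaced.
Brought under platform governance.

3.2 Host / Client / Server

Role	Responsibility	Enterprise design notes
Host	User interaction, model orchestration, session state, policy entry point, final response	The host should not hand tool authorization entirely to the model; it should control the server allowlist, user confirmation, audit, and UI friction
Client	Connects to a specific MCP server, negotiates the protocol, sends tool / resource / prompt requests	One host can have multiple clients; each client's connection, permissions, and logs should be independently observable
Server	Exposes callable tools, readable resources, and reusable prompt templates	The server is the integration boundary; it should declare owner, scope, risk tier, schema, auth, rate limit, and SLA

Common enterprise deployment patterns:

Pattern	Example	When to use	Risk controls
Local server	Developer's local files, Git, CLI tools	Developer productivity, low-risk local context	Roots restrictions, path allowlist, local secret scanning
Internal remote server	Internal CRM / case / policy MCP server	Enterprise business system integration	SSO, service identity, ABAC, network boundary, audit
Vendor remote server	KYC, ticketing, CRM SaaS vendor	Reusing vendor capabilities	Vendor risk review, OAuth scope, DPA, log return, kill switch
Gateway-managed server	A unified tool gateway proxying multiple backend systems	High-risk retail financial platforms	Policy engine, approval, rate limit, central audit, connector isolation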

3.3 MCP Tools

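MCP tools represent actions or functions that the model can request to invoke. They usually need an explicit input schema and return structured results or text observations.

PMs, BAs, and Architects should treat each tool as a reviewable product capability:

Dimension	Design question	Example
Intent	Which business task does this tool support?	Look up a customer case summary
Input schema	What are the required fields, types, enums, formats, and validation rules?	`case_id`, `customer_id`, `reason_code`
Output schema	How is the observation returned to the model structured?	status, summary, last_updated, source_system
Side effect	Does it write, send, pay, close a case, or modify customer data?	read-only / draft / write / external-send
Permission	Who can use it, and for which business purpose?	authenticated user with case entitlement
Approval	Does it need human review, dual control, or supervisor approval?	refund over threshold requires approval
Idempotency	Could a retry charge twice, send twice, or create a duplicate ticket?	idempotency key based on case and action
Audit	Which recorded fields support investigation and regulatory inquiries?	actor, tool, arguments hash, policy decision

Key principles:

Read-only tools also need permissions and audit, because they can leak sensitive data.
Write tools must explicitly declare their side effect and their rollback or compensation path.
External-send tools, such as sending email, submitting a regulatory response, or notifying a customer, should be high-risk by default.
A tool result is an observation, not a new system instruction. Text that appears in a tool result cannot bypass policy.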

3.4 MCP Resources

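MCP resources represent readable context objects, such as files, knowledge base entries, database records, API resources, or business objects. They behave more like evidence that the host or model can reference.

Enterprise resource design notes:

Resource metadata	Why it matters
`uri` / stable identifier	Supports citation, audit, caching, and replay
source system	Distinguishes CRM, policy KB, case system, vendor portal
owner	Identifies who is accountable for accuracy, permissions, and updates
sensitivity	PII, PCI, confidential, public, restricted
permission scope	Whether the user is entitled to view the resource
freshness	Whether the material is stale and needs to be re-read
jurisdiction / product	Retail financial scenarios are often restricted by region and product
injection risk	Whether external documents, emails, web pages, or vendor notes may contain malicious instructions

How resources relate to context engineering:

A resource is not "text stuffed into the prompt".
A resource should first pass a permission filter, freshness check, sensitivity labeling, and prompt injection screening,
and only then enter the model context as evidence via the context composer.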
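
As a rough sketch of that ordering, the function below applies the four checks before a resource is handed to a context composer. The field names, the scope strings, and the two-phrase marker list are assumptions made for illustration; a keyword match is far too weak to serve as real prompt injection screening and stands in here only to show where that step sits.

```python
from datetime import datetime, timedelta, timezone

# Illustrative markers only; real screening needs much more than substring matching.
SUSPICIOUS = ("ignore previous instructions", "忽略之前指令")


def screen_resource(resource, user_scopes, max_age, now):
    """Return an evidence dict for the context composer, or None if filtered out."""
    if resource["permission_scope"] not in user_scopes:
        return None                                    # permission filter
    if now - resource["retrieved_at"] > max_age:
        return None                                    # freshness check
    text = resource["text"]
    flagged = any(marker in text.lower() for marker in SUSPICIOUS)
    return {
        "uri": resource["uri"],
        "sensitivity": resource["sensitivity"],        # label travels with the evidence
        "injection_flag": flagged,                     # screening result, not a guarantee
        "role": "evidence",                            # never treated as an instruction
        "text": text,
    }
```

A flagged resource is still returned here so that the composer or a reviewer can decide what to do with it; dropping it outright is an equally defensible policy.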

3.5 MCP Prompts

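MCP prompts can expose reusable prompt templates or workflow entry points. They are a good fit for wrapping a business task in a controlled prompt pattern, for example: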

summarize_dispute_case
draft_kyc_followup_questions
compare_policy_versions
prepare_branch_complaint_response
triage_payment_exception

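Prompt design notes:

Dimension	Requirement
Owner	Every prompt has a business owner and a risk owner
Version	Prompt changes can be diffed, approved, and rolled back
Inputs	Input variables and their sources are explicit; arbitrary concatenation is not accepted
Policy	The prompt states task boundaries but does not replace the external policy engine
Output	The output schema matches the downstream process
Eval	Every high-risk prompt is bound to a golden set and a release threshold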

3.6 Transport

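Two transports in the MCP 2025-03-26 specification deserve attention:

Transport	Typical scenario	What PMs / Architects should watch
stdio	Local tools, developer workstations, local scripts	Simple and low-latency, but local file roots, environment variables, secrets, and command execution boundaries must be restricted
Streamable HTTP	Remote servers, internal enterprise services, vendor services	Needs network security, OAuth / token, tenant isolation, observability, rate limit, and SLA

Design recommendations:

Local stdio suits developers and low-risk tools; it is not suitable for directly exposing high-risk business actions.
Remote HTTP MCP servers must be covered by API security, identity, policy, logging, and vendor risk review.
Do not equate transport security with business security. TLS, OAuth, or network allowlists address only the connection layer and cannot replace tool-level authorization.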

3.7 Authorization and Security

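MCP authorization addresses whether a client may connect to a server; enterprises still need to design business authorization on top of it.

Layer	Problem it solves	What it cannot replace
Transport security	Whether the connection is encrypted and the server is trusted	Business permissions, data minimization
Client/server authorization	Whether the client is allowed to access the server	Whether the user may perform a specific tool action
User identity	Who the current user is and which tenant / role they belong to	Whether the action fits the business purpose
Tool-level policy	Whether a given user may call a given tool in a given case	Whether the arguments are safe and the output may be sent externally
Human approval	Whether a high-risk action was confirmed by a person	Underlying audit and rollback
Audit and replay	Whether what happened can be explained afterwards	The permission decision made beforehand

Connection to OWASP LLM01:

Prompt injection can induce the model to call tools it should not call.
Indirect prompt injection can hide in web pages, emails, PDFs, ticket notes, and vendor responses.
Defenses should not focus on the prompt alone; they also belong in least privilege, segmentation, approval, audit, tool result isolation, and red-team eval.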

3.8 Roots

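Roots are a set of boundaries that the client exposes to the server, most commonly for file systems or project context. In an enterprise, the core purpose of roots is to limit the scope that a server can see.

Design principles:

Roots must be minimal; do not expose an entire disk, shared drive, or user home directory by default.
Roots should carry a purpose, for example current_case_documents, policy_kb_export, repo_workspace.
Changes to roots should be recorded as audit events, and high-risk directories need user confirmation.
A server should not be able to bypass roots through path tricks, symlinks, or hidden files.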
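
One way to make the last principle concrete is to resolve every requested path before comparing it with the allowed roots. The helper below is a minimal sketch using only the Python standard library; `is_within_roots` is a hypothetical name, and this is not how any particular MCP SDK enforces roots.

```python
from pathlib import Path


def is_within_roots(candidate, roots):
    """True if candidate, after resolving symlinks and '..', lies inside one of the roots."""
    resolved = Path(candidate).resolve()
    for root in roots:
        root_resolved = Path(root).resolve()
        # Compare resolved paths component by component, not by string prefix,
        # so "/data/repo_workspace_old" does not pass for "/data/repo_workspace".
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False
```

Resolving first matters because a check on the raw string would accept `repo_workspace/../secret.txt`. Hidden files inside a root are not covered by this check and would need a separate rule.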

3.9 Sampling

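MCP sampling lets a server request model generation through the client. It can support more complex server-side agentic behavior, but enterprises should adopt it with caution.

Risks:

The server may send sensitive resources to the model.
The server may construct a manipulative prompt.
The server may bypass the host's context governance.
If sampling requests and results are not audited, they are hard to explain afterwards.

Governance recommendations:

Control	Description
Allowlist	Only reviewed servers may use sampling
User / admin consent	High-risk servers need explicit authorization
Prompt visibility	Sampling requests should be visible to the host and the audit system
Data classification	The server may not send data beyond its permissions in a sampling request
Logging	Record server, purpose, input metadata, model, output metadata
Eval	Design prompt injection and data leakage tests for sampling scenarios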

3.10 Elicitation

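As of the MCP 2025-03-26 specification baseline, this playbook does not treat elicitation as a stable, required capability of that version. The elicitation idea that appears in later MCP versions and related documentation can be understood as an interaction pattern in which the server asks the host to collect additional information from the user.

Before adopting elicitation in an enterprise deployment, confirm:

Whether the current host / client / server supports the capability.
Whether user input will change permissions, arguments, approval, or side effects.
Whether there is a risk of users being induced to provide sensitive information.
Whether elicitation requests are written to the audit log.
Whether question types, field scope, and business purpose can be restricted.

Safe framing:

Elicitation can improve the multi-turn tool calling experience, but it also expands the server's ability to influence user input.
In retail financial scenarios, elicitation should be treated as a controlled UI flow, not as the server asking the user anything it likes.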

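4. A2A / Agent-to-Agent Thinking

4.1 The Boundary Between A2A and MCP

The distinction in brief:

MCP mainly addresses how an agent / host connects to tools, resources, and prompts.
A2A mainly addresses how one agent discovers another agent, delegates a task to it, tracks that task, and receives its result.

Dimension	MCP	A2A / Agent-to-Agent
Core objects	tools, resources, prompts	agent card, task, message, artifact
Typical connection	AI app -> tool server	agent -> agent
Main question	How to call external capabilities safely	How to delegate tasks, exchange state, and return results
Enterprise governance focus	tool contract, permission, side effect, audit	identity, handoff, task envelope, policy boundary, result accountability
Risks	Excessive agency, high-risk tool calls, data leakage	Unclear accountability, state leakage, privilege escalation across agents, unauditable vendor chains

4.2 How to Describe Standard Maturity

A2A has a public specification and an active ecosystem, but each enterprise still needs to validate compatibility and vendor adoption. A suggested framing:

We can design agent interoperability using A2A concepts such as agent card, task, message, and artifact,
but before go-live we need to verify how far the target vendor, host, runtime, security gateway, and audit stack actually support them.

This acknowledges the direction of the protocol without assuming that external ecosystem maturity equals capability available inside the enterprise.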

4.3 Agent Handoff

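Agent handoff is one agent passing a task to another agent. In an enterprise, a handoff cannot simply forward the user prompt; it should carry a structured envelope.

Suggested handoff envelope fields:

Field	Description
`task_id`	Globally unique task identifier
`requesting_agent`	Identity, version, and owner of the initiating agent
`target_agent`	Identity or capability of the target agent
`user_identity`	Which user or business role the task is initiated on behalf of
`tenant`	Tenant, region, business line
`purpose`	Business purpose, for example case triage, KYC review, policy answer
`risk_tier`	read-only, draft, write, external-send, regulated decision
`allowed_data_scope`	Data scope and sensitivity level that may be shared
`allowed_tools`	Tools or capabilities available to the target agent
`deadline`	SLA or timeout policy
`approval_requirement`	Whether human review, dual control, or supervisor approval is required
`audit_correlation_id`	Links logs across agents

4.4 Task Envelope

The goal of the task envelope is to make agent collaboration traceable, retryable, and auditable.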

{
  "task_id": "case-8821-kyc-review-001",
  "task_type": "kyc_vendor_review",
  "requester": {
    "agent_id": "retail-case-assistant",
    "version": "1.4.2"
  },
  "target_capability": "kyc_document_gap_analysis",
  "purpose": "support human analyst review",
  "risk_tier": "draft_recommendation",
  "input_refs": [
    {
      "type": "resource",
      "uri": "case://8821/documents/customer-upload",
      "sensitivity": "restricted_pii"
    }
  ],
  "constraints": {
    "no_external_send": true,
    "no_customer_decision": true,
    "must_cite_sources": true
  },
  "expected_output_schema": "kyc_gap_analysis_v1",
  "audit_correlation_id": "audit-2026-06-29-0007"
}

4.5 Identity

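Agent-to-agent scenarios involve at least three kinds of identity:

Identity	Question
Human user	Who initiated the task, and are they entitled to the data and the action?
Agent identity	Which agent is acting, what version is it, and who owns it?
Service identity	Which credential does the underlying connector or vendor service use to access the system?

Design principles:

Do not use a shared, all-powerful service account.
Agent identity should be revocable, support tiered authorization, and be bound to a version and an owner.
A human user's permissions must not be amplified by an agent handoff.
What passes between agents is delegated authority, not the user's full permissions.
The audit trail should record all three identities: human, agent, and service.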
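
A simple way to read "delegated authority, not full permissions" is as a set intersection: the target agent receives only what the user holds, what the agent is allowed to hold, and what the task actually requested. The scope strings and function names below are hypothetical and serve only to illustrate the rule.

```python
def delegate_scopes(user_scopes, agent_scopes, requested_scopes):
    """Delegated authority = user entitlement AND agent allowance AND task request."""
    return set(user_scopes) & set(agent_scopes) & set(requested_scopes)


def audit_identities(human_user, agent_id, agent_version, service_identity):
    """Record the human, agent, and service identities together on every cross-agent action."""
    return {
        "human_user": human_user,
        "agent_id": agent_id,
        "agent_version": agent_version,
        "service_identity": service_identity,
    }
```

Because the result is an intersection, a handoff can only narrow authority: a scope the user lacks never appears in the delegated set, however broad the agent's own allowance is.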

4.6 Capability Card

A capability card describes what an agent or server can do, what input it needs, what result it returns, and what limits apply.

Field	Example
Capability name	`payment_dispute_evidence_summarization`
Owner	Dispute Operations Platform
Supported tasks	summarize evidence, identify missing documents, draft analyst notes
Inputs	case id, document refs, policy version
Outputs	structured summary, source citations, missing evidence list
Risk tier	draft recommendation
Data classes	PII, transaction data, dispute evidence
Tools used	case read, policy KB read, document OCR read
Disallowed actions	no refund, no customer message, no case closure
SLA	p95 under 20 seconds for cases under 30 pages
Audit events	task_started, tool_called, artifact_created, task_completed

4.7 State Transfer

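State transfer is the part of agent handoff most likely to get out of control. Do not hand the target agent the full conversation history, the full customer profile, or unrestricted tokens.

Recommended practice:

Pass resource references, and as little raw plaintext as possible.
Pass a task summary, not the full chat transcript.
Pass policy constraints, rather than relying on the target agent to guess the boundaries.
Pass an output schema that makes the target artifact explicit.
Pass data classification so that the target agent and gateway can enforce permission filtering.
Pass an audit correlation id so the full chain can be linked.

4.8 Policy Boundary

Every agent handoff is a new policy boundary. It needs to answer:

Question	Control
Is the target agent allowed to handle this tenant / region / product?	agent allowlist + tenant policy
May this material be shared?	resource-level entitlement + data minimization
Can the target agent invoke write operations?	delegated tool allowlist
Can the result be shown directly to the customer?	output risk tier + human review
Who is responsible after a failure?	owner, SLA, incident route
Can the audit trail be linked end to end?	correlation id + common audit event schema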

5. Tool Contract

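5.1 Minimum Fields of a Tool Contract Card

Section	Field	Description
Identity	Tool name	Stable name that does not change with the implementation
Identity	Owner	Business owner, technical owner, risk owner
Purpose	Business capability	The business capability it supports
Purpose	Intended users	Which agents / workflows / roles may use it
Schema	Input schema	Parameter types, required fields, enums, formats, constraints
Schema	Output schema	Structured output, errors, source refs
Safety	Side effect	read-only, draft, write, external-send, financial movement
Safety	Permission	RBAC / ABAC / purpose / tenant / case entitlement
Safety	Human approval	Trigger conditions, approvers, dual-control rules
Reliability	Idempotency	Idempotency key, retry semantics
Reliability	Rate limit	Limits per user, per agent, per tenant, per tool
Reliability	Timeout	Timeout, retry, fallback
Audit	Audit event	Event name, fields, retention period, handling of sensitive fields
Recovery	Rollback	rollback, compensation, manual remediation
Operations	Monitoring	success rate, error rate, latency, denied calls
Governance	Version	contract version, deprecation policy

5.2 Input Schema

The goal of an input schema is not to let the model "know how to fill in the parameters", but to let the platform verify that the arguments are safe, legal, and consistent with the business purpose.

Example: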

{
  "tool": "case_management.get_case_summary",
  "version": "1.0.0",
  "input_schema": {
    "type": "object",
    "required": ["case_id", "purpose"],
    "properties": {
      "case_id": {
        "type": "string",
        "pattern": "^CASE-[0-9]{8}$"
      },
      "purpose": {
        "type": "string",
        "enum": ["triage", "customer_response_draft", "quality_review"]
      },
      "include_notes": {
        "type": "boolean",
        "default": false
      }
    },
    "additionalProperties": false
  }
}

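Design principles:

Use enums to restrict purpose, reason code, and action type.
Do not let the model freely construct SQL, URLs, file paths, or system commands.
Validate the format and range of amounts, dates, account numbers, case ids, and customer ids.
Disallow additionalProperties by default, so prompt injection cannot smuggle hidden instructions into the arguments.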
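
To show what platform-side validation of these rules might look like, here is a deliberately small, standard-library-only checker for the `get_case_summary` schema above. It handles only the keywords that example uses (`required`, `pattern`, `enum`, and the no-extra-fields rule) and is not a JSON Schema implementation; `SCHEMA` and `validate_args` are hypothetical names.

```python
import re

# Hand-translated subset of the example schema; Python types stand in for JSON types.
SCHEMA = {
    "required": ["case_id", "purpose"],
    "properties": {
        "case_id": {"type": str, "pattern": r"^CASE-[0-9]{8}$"},
        "purpose": {"type": str, "enum": ["triage", "customer_response_draft", "quality_review"]},
        "include_notes": {"type": bool},
    },
}


def validate_args(args, schema=SCHEMA):
    """Return a list of violations; an empty list means the call may proceed to policy checks."""
    errors = [f"missing: {key}" for key in schema["required"] if key not in args]
    for key, value in args.items():
        rule = schema["properties"].get(key)
        if rule is None:
            errors.append(f"unexpected field: {key}")    # additionalProperties: false
            continue
        if not isinstance(value, rule["type"]):
            errors.append(f"wrong type: {key}")
        elif "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            errors.append(f"bad format: {key}")
        elif "enum" in rule and value not in rule["enum"]:
            errors.append(f"not allowed: {key}")
    return errors
```

Returning every violation rather than stopping at the first keeps the rejection explainable in the audit trail. In production a maintained JSON Schema validator would normally replace this hand-rolled check.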

5.3 Output Schema

An output schema should turn a tool result into an observation that can be cited, audited, and verified.

{
  "status": "success",
  "source_system": "case_management",
  "source_record": "CASE-20260629",
  "retrieved_at": "2026-06-29T16:30:00Z",
  "data_classification": "restricted_pii",
  "summary": {
    "case_status": "open",
    "case_type": "payment_dispute",
    "last_customer_contact": "2026-06-27",
    "next_required_action": "collect merchant evidence"
  },
  "citations": [
    {
      "field": "next_required_action",
      "source_ref": "case_note:88721"
    }
  ]
}

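Design principles:

Return source refs, not just a natural-language summary.
Label retrieved_at and source system.
Label data classification to support downstream DLP and output control.
Tool errors must be structured; returning only "it failed" is not enough.

5.4 Side Effect

Classify tools by side effect:

Tier	Meaning	Example	Default controls
S0 Read-only public	Public or low-sensitivity queries	Look up a public policy FAQ	Log the call, basic rate limit
S1 Read-only restricted	Sensitive read-only access	Look up a customer case or account summary	entitlement, PII guard, audit
S2 Draft	Generates a draft without sending or writing automatically	Draft a customer reply, generate a draft ticket note	Human confirmation, version history
S3 Internal write	Writes to internal systems	Add a case note, update task status	Permission, idempotency, approval conditions, rollback
S4 External send	Sends to customers / vendors / regulators	Send an email, submit a vendor request	Strong approval, DLP, dual control, pre-send preview
S5 Financial / regulated action	Funds, limits, KYC decisions, account restrictions	Refund, freeze an account, reject an account opening	Not executed automatically by default; only human-in-the-loop or rule-system execution is allowed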
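
The tiers can be encoded so that a gateway derives default controls from the tier instead of trusting each tool to declare its own. The sketch below assumes that controls accumulate as the tier rises, which the table does not state explicitly; the control names are illustrative labels, not an established vocabulary.

```python
from enum import IntEnum


class Tier(IntEnum):
    S0_READ_PUBLIC = 0
    S1_READ_RESTRICTED = 1
    S2_DRAFT = 2
    S3_INTERNAL_WRITE = 3
    S4_EXTERNAL_SEND = 4
    S5_FINANCIAL_REGULATED = 5


def default_controls(tier):
    """Assumed cumulative mapping: each higher tier keeps the controls of the tiers below it."""
    controls = {"audit", "rate_limit"}
    if tier >= Tier.S1_READ_RESTRICTED:
        controls |= {"entitlement", "pii_guard"}
    if tier >= Tier.S2_DRAFT:
        controls.add("human_confirmation")
    if tier >= Tier.S3_INTERNAL_WRITE:
        controls |= {"idempotency", "rollback_plan"}
    if tier >= Tier.S4_EXTERNAL_SEND:
        controls |= {"dlp", "dual_control"}
    if tier == Tier.S5_FINANCIAL_REGULATED:
        controls.add("no_autonomous_execution")
    return controls
```

Using an ordered enum means a policy rule can be written as a threshold, for example "anything at S3 or above needs an idempotency key", without listing tools one by one.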

5.5 Idempotency

Agent tool calls are prone to retries, so write operations must be designed to be idempotent.

Action	Suggested idempotency key
create ticket	`workflow_id + source_case_id + action_type`
add case note	`case_id + generated_note_hash + agent_run_id`
send vendor request	`case_id + vendor_id + request_type + document_set_hash`
draft customer response	`case_id + prompt_version + evidence_hash`
update case status	`case_id + from_status + to_status + approved_action_id`
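
A key recipe like `workflow_id + source_case_id + action_type` can be turned into a stable hash and checked before the write happens. The sketch below uses an in-memory dictionary as a stand-in for a real ticketing system; `TicketStore`, the `TCK-` id format, and the separator choice are assumptions for illustration.

```python
import hashlib


def idempotency_key(*parts):
    """Join the business fields with a separator that cannot occur in them, then hash."""
    canonical = "\x1f".join(str(part) for part in parts)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TicketStore:
    """In-memory stand-in for a ticketing backend."""

    def __init__(self):
        self._by_key = {}

    def create_ticket(self, workflow_id, source_case_id, action_type):
        key = idempotency_key(workflow_id, source_case_id, action_type)
        if key in self._by_key:
            return self._by_key[key], False        # retry: return the existing ticket
        ticket_id = f"TCK-{len(self._by_key) + 1:04d}"
        self._by_key[key] = ticket_id
        return ticket_id, True                     # first call: ticket created
```

The separator prevents two different field combinations from producing the same string before hashing, for example `("a", "bc")` and `("ab", "c")`.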

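Where a true rollback is not possible, design compensation instead:

A duplicate ticket was created: close the duplicate automatically and keep an audit note.
A wrong vendor request was sent: send a withdrawal or correction request.
A wrong case note was written: append a corrected note; do not delete history.
A wrong message was sent to a customer: escalate an incident and trigger the customer remediation process.

5.6 Audit Event

Every tool call should produce a standardized audit event.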

{
  "event_type": "agent_tool_call",
  "event_time": "2026-06-29T16:35:12Z",
  "audit_correlation_id": "audit-2026-06-29-0007",
  "human_user": "u12345",
  "agent_id": "retail-case-assistant",
  "agent_version": "1.4.2",
  "tool_name": "case_management.add_case_note",
  "tool_version": "1.0.0",
  "risk_tier": "S3_internal_write",
  "purpose": "quality_review",
  "input_hash": "sha256:...",
  "source_refs": ["case://CASE-20260629"],
  "policy_decision": "require_approval",
  "approval_id": "approval-9931",
  "execution_status": "success",
  "output_hash": "sha256:...",
  "pii_redaction_applied": true
}

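Audit design principles:

Sensitive input arguments can be stored as a hash, a masked value, or an encrypted evidence package.
Audit events should link prompt, context, model, tool, policy, approval, and output.
Denied calls should also be recorded, because they are a risk signal.
The audit schema should support incident replay and eval regression.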
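
The first and third principles can be illustrated with a small event builder that stores a hash of the arguments instead of the arguments themselves and is called for denied calls as well as successful ones. The field names follow the sample event above; `sha256_of`, `build_audit_event`, and the `"not_executed"` status value are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_of(payload):
    """Hash a canonical JSON form so that key order does not change the digest."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_audit_event(correlation_id, human_user, agent_id, tool_name,
                      risk_tier, arguments, policy_decision, execution_status):
    return {
        "event_type": "agent_tool_call",
        "event_time": datetime.now(timezone.utc).isoformat(),
        "audit_correlation_id": correlation_id,
        "human_user": human_user,
        "agent_id": agent_id,
        "tool_name": tool_name,
        "risk_tier": risk_tier,
        "input_hash": sha256_of(arguments),        # raw arguments are not stored
        "policy_decision": policy_decision,        # recorded for denials too
        "execution_status": execution_status,
    }
```

Hashing a canonical form lets an investigator confirm that a replayed call used the same arguments without the audit store ever holding the sensitive values.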

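6. Retail Financial Services Integration Scenarios

6.1 Scenario Overview

System	Agent capability	Tool / Resource examples	Risk focus
CRM	Customer profile, interaction history, sales / service suggestions	read customer profile, retrieve interaction history	PII, permissions, sales compliance, data minimization
Case management	Handling complaints, disputes, and exception tickets	get case summary, add internal note, update task	Case entitlement, idempotent writes, audit
Core banking read-only	Read-only queries for account summary, balance, and transactions	get account summary, list transactions	Highly sensitive data, read-only boundary, minimal fields
KYC vendor	Document verification, watchlist screening, gap analysis	submit document check, retrieve vendor result	Vendor risk, cross-border data, the model must not make the final decision
Payment dispute system	Dispute evidence, chargeback status, merchant responses	summarize dispute evidence, draft representment note	Financial impact, deadlines, approval for outbound messages
Policy knowledge base	Policy Q&A, clause comparison, version differences	search policy, compare policy versions	Version validity, citations, expired policies
Ticketing	IT / operations issue creation and routing	create ticket, update ticket status	Duplicate tickets, permissions, SLA, routing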

6.2 CRM Integration

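Recommended capabilities:

Look up basic customer data, returning only the fields needed for the current task.
Look up recent interaction records as customer service context.
Generate next-step suggestions without automatically changing customer status.

Tool contract notes:

Dimension	Design
Input	`customer_id`, `case_id`, `purpose`
Output	masked profile, eligible products only if purpose allows
Permission	The user must have service entitlement for the customer or case
Side effect	read-only by default; updating customer data requires a separate tool and approval
Audit	Record customer id hash, purpose, fields returned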

6.3 Case Management Integration

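Recommended capabilities:

Get a case summary.
Read case notes.
Generate internal handling suggestions.
Add internal notes, subject to approval or user confirmation.
Update task status, limited to low-risk statuses.

Design focus:

The model should not rewrite case note history; it may only append.
Show a diff before any automatic write.
Set high-risk gates on "close case", "reject customer claim", and "escalate to regulatory complaint".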

6.4 Core Banking Read-only Integration

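Principle:

In the first phase the core banking system is read-only and returns minimal fields; the agent does not directly execute fund movements or account status changes.

Recommended tools:

Tool	Purpose	Controls
`get_account_summary`	Helps customer service answer account status questions	entitlement, masked account number, audit
`list_recent_transactions`	Supports dispute and exception investigation	Date range limits, minimal amount fields
`verify_account_status`	Supports process decisions	Returns an enum, not unnecessary detail

Prohibited or subject to strong human review:

Freezing / unfreezing accounts.
Changing customer limits.
Initiating transfers.
Changing interest rates, fees, or product status.
Making final credit / KYC decisions.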

6.5 KYC Vendor Integration

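Recommended capabilities:

Submit KYC document verification requests.
Retrieve vendor results.
Summarize missing documents.
Generate an analyst review checklist.

Governance focus:

Risk	Control
The model treats the vendor result as the final decision	Label output as evidence; the human analyst decides
Cross-border transfer of sensitive data	vendor risk review, data residency, contract terms
Duplicate document submission	idempotency key, vendor request tracking
Vendor API errors	fallback, manual review route, incident log
Black-box scores cannot be explained	Require reason codes, source refs, review trail

6.6 Payment Dispute System

Typical workflow: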

Read dispute case
-> retrieve transaction and evidence
-> search policy and network rules
-> summarize missing evidence
-> draft analyst note
-> human approves
-> optionally create vendor / merchant request

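High-risk boundaries:

Do not initiate refunds automatically.
Do not reject disputes automatically.
Do not send customer or merchant notifications automatically.
Do not close cases automatically.
Every action with financial impact must go through human review or be executed by the existing rules engine.

6.7 Policy Knowledge Base

Recommended design:

Expose the policy KB as resources and search tools.
Each policy carries version, effective_date, jurisdiction, product, and owner.
Output must cite source refs.
Expired policies must not be the basis for current advice, but may be used to analyze historical cases.

Prompt injection risks:

External web pages, PDFs, and vendor documents may contain text such as "ignore previous instructions".
RAG results must be treated as evidence, not as instructions.
Apply a source allowlist, content scanning, and an approval workflow to policy KB ingestion.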
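
The "current advice versus historical case" rule can be expressed as a lookup by date. The sketch below assumes a policy is superseded by any later version for the same jurisdiction and product, which the text does not spell out; `policy_in_force` and the sample values in the usage note are hypothetical.

```python
from datetime import date


def policy_in_force(policies, jurisdiction, product, as_of):
    """Return the latest policy version effective on or before as_of, or None if there is none."""
    candidates = [
        policy for policy in policies
        if policy["jurisdiction"] == jurisdiction
        and policy["product"] == product
        and policy["effective_date"] <= as_of
    ]
    # Assumption: a later effective_date supersedes earlier versions.
    return max(candidates, key=lambda policy: policy["effective_date"], default=None)
```

Passing today's date yields the version that may back current advice, while passing a past case date, for example `date(2025, 6, 1)`, yields the version that applied when that case occurred.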

6.8 Ticketing Integration

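Recommended capabilities:

Create an internal ticket from a case.
Look up ticket status.
Add ticket comments.
Route to the appropriate queue.

Controls:

create ticket needs an idempotency key to prevent duplicate tickets.
Routing should not rely on the model's free choice alone; it should be based on queue rules and the capability matrix.
Ticket comments that contain customer information must pass the PII guard.
Bulk ticket creation needs a rate limit and anomaly detection.

7. BA / PM / Architect Deliverables

7.1 Tool Catalog

The tool catalog is the foundational asset of platform governance.

Field	Example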
Tool name	`case_management.get_case_summary`
System	Case Management
Owner	Service Operations
Risk tier	S1 read-only restricted
Input schema version	`1.0.0`
Output schema version	`1.0.0`
Permission model	case entitlement + role
Approval required	No
Audit event	`agent_tool_call`
SLA	p95 800ms
Status	pilot / approved / deprecated

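How PMs use it:

Plan which tools to bring onto the platform first.
Align use cases with tool capabilities.
Manage adoption and tool reuse rate.

How BAs use it:

Break business process actions down into tools.
Define fields, exception paths, and approval conditions.
Confirm acceptance criteria with the business owner.

How Architects use it:

Design gateway, connector, identity, policy, and logging.
Evaluate protocol, transport, version, and SLA.
Manage the integration boundary.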

7.2 Integration Backlog

Priority	Integration	User story	Acceptance criteria
P0	Case read-only MCP server	As a case analyst agent, I can read case summary by authorized case id	entitlement checked, schema validated, audit event emitted
P0	Policy KB resource server	As a support agent, I can retrieve current policies with citations	effective date, jurisdiction, product filters work
P0	Tool gateway audit schema	As risk, I can trace every tool call	correlation id links user, agent, model, tool, policy
P1	KYC vendor adapter	As a KYC analyst, I can request document gap analysis	vendor scopes reviewed, no final decision automation
P1	Dispute draft workflow	As a dispute analyst, I can generate an evidence summary draft	no external send, source citations, human review
P2	A2A agent handoff pilot	As a case agent, I can delegate KYC gap analysis to a specialist agent	task envelope, policy boundary, artifact returned

7.3 Capability Matrix

Capability	Agent / Server	Input	Output	Risk	Owner	Protocol fit
Case summary	Case MCP Server	case_id, purpose	structured summary	S1	Service Ops	MCP tool/resource
Policy search	Policy MCP Server	query, product, jurisdiction	cited policy snippets	S1	Compliance	MCP resource/tool
KYC gap analysis	KYC Specialist Agent	document refs, policy version	gap list artifact	S2	KYC Ops	A2A task + MCP tools
Dispute note draft	Dispute Agent	case id, evidence refs	analyst note draft	S2	Dispute Ops	MCP tools
Ticket creation	Ticketing MCP Server	source case, issue type	ticket id	S3	IT Ops	MCP tool via gateway

7.4 Vendor Risk Questions

Area	Questions
Protocol support	Does the vendor support MCP, A2A, OpenAPI, webhook, or only proprietary SDK?
Auth	What OAuth scopes, service identities, token lifetimes, and revocation controls exist?
Data	What data is sent, stored, retained, logged, or used for training?
Residency	Which regions process and store data?
Security	How are prompt injection, data exfiltration, secrets, and tenant isolation handled?
Audit	Can the vendor return request ids, decision reasons, timestamps, and event logs?
Reliability	SLA, timeout, retry, rate limit, incident notification?
Change management	How are schema, model, scoring, or API changes announced?
Exit	Can we export logs, configs, prompts, tool definitions, and artifacts?
Compliance	Which financial, privacy, and operational controls are contractually supported?

7.5 Protocol ADR

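A protocol ADR records why MCP, A2A, an OpenAPI adapter, an internal tool gateway, or a vendor SDK was chosen.

Key decision points:

Is this tool integration, resource access, agent handoff, or workflow orchestration?
Is capability discovery needed?
Is remote vendor support needed?
Does the current host / runtime support the protocol?
Can the security gateway insert authorization, policy, approval, and audit?
Are streaming, long-running tasks, artifacts, or push notifications needed?
How is the protocol version governed?
If the vendor does not support the protocol, do we wrap it with an adapter?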

7.6 Security Review Checklist

Check	Evidence
Tool has owner and risk tier	Tool Contract Card
Input schema rejects extra fields	JSON schema / test cases
Output schema labels source and sensitivity	Sample tool response
Permission check uses user, tenant, purpose, resource entitlement	Policy rule
High-risk actions require approval	Approval workflow screenshot or API
Tool calls emit audit event	Audit event sample
Prompt injection test includes malicious resource content	Eval report
Rate limit and anomaly detection defined	Ops config
Kill switch can disable tool/server/vendor	Runbook
Rollback or compensation path exists	Incident playbook

8. 14-Day Lab: Agent Integration Platform Blueprint

Goal: after 14 days, produce a presentable Agent Integration Platform Blueprint that covers tool catalog, capability discovery, MCP server intake, A2A handoff design, security checklist, vendor risk review, and platform governance.

Day	Task	Artifact
1	Pick one retail financial use case, for example a payment dispute assistant or a KYC analyst copilot	Use Case One-pager
2	Map the current process and mark manual actions, system queries, vendor interactions, and approval points	Current State Workflow
3	Break down the tools, resources, prompts, and external services the agent needs	Capability Decomposition
4	Define risk tier, data class, owner, and side effect for each capability	Capability Discovery Matrix
5	Choose three priority tools and write a Tool Contract Card for each	3 Tool Contract Cards
6	Design the MCP server intake: host, client, server, transport, auth, scope, audit	MCP Server Intake
7	Design the tool gateway sequence: model proposal -> policy -> approval -> execution -> audit	Tool Call Sequence Diagram
8	Design human approval and rollback / compensation for one high-risk action	Approval and Recovery Design
9	Design policy rules: RBAC, ABAC, purpose, tenant, case entitlement, risk tier	Policy Rule Matrix
10	Design the A2A handoff: specialist agent, task envelope, state transfer, artifact	A2A Handoff Design
11	Prepare vendor risk questions and review a KYC, CRM, or ticketing vendor	Vendor Risk Questionnaire
12	Design the audit event schema and dashboard metrics	Audit Schema and Metrics
13	Design a test pack for prompt injection / excessive agency / data leakage	Integration Red-team Test Pack
14	Consolidate everything into a platform blueprint with ADR, roadmap, governance, and release gate	Agent Integration Platform Blueprint

Recommended blueprint structure

# Agent Integration Platform Blueprint

## 1. Use Case and Business Outcome
- Target workflow
- Users
- Baseline pain
- Success metrics

## 2. Capability Map
- Tools
- Resources
- Prompts
- Agent handoff
- Vendors

## 3. Protocol Architecture
- MCP server boundaries
- A2A handoff boundaries
- Tool gateway
- Security gateway
- Context composer

## 4. Tool Contracts
- Input schema
- Output schema
- Risk tier
- Permission
- Approval
- Audit
- Rollback

## 5. Security and Governance
- Policy matrix
- Prompt injection controls
- Vendor risk controls
- Audit and replay
- Kill switch

## 6. Roadmap
- MVP
- Pilot
- Production gate
- Scale plan

9. Templates

9.1 MCP Server Intake

# MCP Server Intake

## Basic Information
- Server name:
- Business owner:
- Technical owner:
- Risk owner:
- Source system:
- Environment: dev / test / prod
- Protocol version:
- Transport: stdio / Streamable HTTP

## Business Purpose
- Supported workflow:
- Supported agent / host:
- User roles:
- Business value:
- Non-goals:

## Capabilities
| Capability | Type | Risk tier | Owner | Status |
|---|---|---|---|---|
|  | tool / resource / prompt |  |  | proposed / pilot / approved |

## Data and Permission
- Data classes:
- Tenant / region scope:
- Resource entitlement rule:
- User identity source:
- Service identity:
- OAuth scopes or credential model:
- Data retention:

## Security Controls
- Input validation:
- Output filtering:
- Prompt injection controls:
- Human approval triggers:
- Rate limits:
- Kill switch:
- Secrets handling:

## Audit and Observability
- Audit event names:
- Correlation id:
- Logged metadata:
- Masked or encrypted fields:
- Metrics:
- Alert rules:

## Operations
- SLA:
- Timeout:
- Retry:
- Idempotency:
- Rollback / compensation:
- Deprecation policy:

9.2 Tool Contract Card

# Tool Contract Card

## Identity
- Tool name:
- Version:
- System:
- Business owner:
- Technical owner:
- Risk owner:

## Purpose
- Business capability:
- Intended users:
- Supported workflow:
- Explicit non-goals:

## Input Schema
```json
{
  "type": "object",
  "required": [],
  "properties": {},
  "additionalProperties": false
}
```

## Output Schema
```json
{
  "status": "success",
  "source_system": "",
  "retrieved_at": "",
  "data_classification": "",
  "result": {},
  "citations": []
}
```

## Risk and Permission
- Side effect tier:
- Required role:
- Required purpose:
- Resource entitlement:
- Tenant / region boundary:
- Human approval:
- Dual control:

## Reliability
- Timeout:
- Retry:
- Idempotency key:
- Rate limit:
- Fallback:

## Audit
- Event type:
- Required fields:
- Sensitive fields handling:
- Retention:
- Replay support:

## Recovery
- Rollback:
- Compensation:
- Manual remediation owner:
- Incident severity:

9.3 Capability Discovery Matrix

Capability	Type	Provider	Protocol	Input	Output	Risk tier	Data class	Permission	Approval	Owner	Status
Case summary	Tool	Case MCP Server	MCP	case_id, purpose	summary, citations	S1	PII	case entitlement	No	Service Ops	Approved
KYC gap analysis	Agent	KYC Specialist	A2A + MCP	doc refs	gap artifact	S2	Restricted PII	analyst role	Review	KYC Ops	Pilot
Ticket creation	Tool	Ticketing Server	MCP	issue type, case ref	ticket id	S3	Internal	role + purpose	Conditional	IT Ops	Proposed

9.4 Protocol ADR

# ADR: Protocol Choice for Agent Integration

## Status
Accepted

## Context
- Use case:
- Systems involved:
- Agent / host:
- Required capabilities:
- Data classes:
- Risk tier:
- Vendor constraints:

## Decision
- Selected protocol:
- Selected transport:
- Integration boundary:
- Gateway pattern:

## Alternatives Considered
| Option | Pros | Cons | Decision |
|---|---|---|---|
| MCP server |  |  |  |
| A2A handoff |  |  |  |
| OpenAPI adapter |  |  |  |
| Vendor SDK |  |  |  |
| Internal gateway only |  |  |  |

## Security and Governance
- Identity:
- Authorization:
- Policy:
- Human approval:
- Audit:
- Prompt injection controls:
- Kill switch:

## Consequences
- Benefits:
- Trade-offs:
- Migration path:
- Deprecation plan:
- Open operational risks:

9.5 Integration Risk Checklist

Risk Area	Question	Evidence	Decision
Capability	Is the tool/resource/agent capability clearly bounded?	Contract card	approve / revise / reject
Data	Does it expose PII, PCI, confidential, or regulated data?	Data classification	approve / revise / reject
Permission	Is user, tenant, purpose, and resource entitlement checked?	Policy rule	approve / revise / reject
Side effect	Can it write, send, pay, close, decide, or change status?	Risk tier	approve / revise / reject
Prompt injection	Can untrusted content influence tool calls?	Red-team cases	approve / revise / reject
Vendor	Does a third party process data or make recommendations?	Vendor review	approve / revise / reject
Audit	Can we replay who did what, why, with which data?	Audit sample	approve / revise / reject
Reliability	Are timeout, retry, idempotency, and rate limit defined?	Ops design	approve / revise / reject
Recovery	Is rollback or compensation realistic?	Runbook	approve / revise / reject
Governance	Is owner, version, deprecation, and kill switch defined?	Governance record	approve / revise / reject

10. 面试表达

10.1 30 秒版本

企业 AI Agent 的关键不只是模型能力, 而是它如何安全连接外部工具和数据。我的设计思路是把工具集成协议化: 用 MCP 这类协议暴露 tools、resources、prompts, 用 tool contract 定义 input/output schema、权限、副作用、限流、审计和回滚; 对 agent 间协作, 用 A2A 思维设计 task envelope、identity、capability card、state transfer 和 policy boundary。这样平台可以统一做 discovery、authorization、human approval、audit、vendor governance 和 release gate。

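10.2 2-Minute Version

In an enterprise, an Agent creates real value when it can connect to systems such as CRM, case management, KYC vendors, payment dispute, and the policy KB, but that is also its largest source of risk. Early custom plugins work for a quick POC, but at scale they lead to scattered permissions, broken audit trails, vendor lock-in, duplicated integrations, and security reviews that are hard to run.

My approach is to introduce protocol thinking. The first layer is the tool contract: every tool must define an input schema, output schema, side effect tier, permission, rate limit, audit event, idempotency, rollback, and human approval rules. The second layer is MCP or an internal tool gateway, which exposes tools, resources, and prompts in a standard way and plugs into identity, the policy engine, DLP, audit, and the kill switch. The third layer is agent-to-agent handoff, where a task envelope and capability card state what the target agent can do, which data it can see, which tools it can call, and which artifact it returns.

For financial retail scenarios, I would start with read-only and draft capabilities, such as case summary, policy search, and KYC gap analysis draft, and keep actions such as refunds, closing a case, freezing an account, and final KYC decisions under strong human review or in the existing rule systems. Before launch, I would use the risk management language of the NIST GenAI Profile to organize governance evidence and run red-team tests against the prompt injection risks described in OWASP LLM01, so that tool calls are not protected by prompts alone.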
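The tool contract fields and the "read-only and draft first" rollout can be sketched as a small data model. This is a minimal illustration, not a normative schema: the class `ToolContract`, the tier names in `SideEffectTier`, the helper `make_contract`, and the example tool names are all hypothetical. Only the list of contract fields and the idea that riskier actions stay behind human review come from the text above.

```python
from dataclasses import dataclass
from enum import IntEnum


class SideEffectTier(IntEnum):
    """Hypothetical tiers, ordered from least to most risky."""
    READ_ONLY = 0
    DRAFT = 1
    CONTROLLED_WRITE = 2
    REGULATED_ACTION = 3


@dataclass
class ToolContract:
    """Illustrative contract card; every field name here is an assumption."""
    name: str
    input_schema: dict
    output_schema: dict
    side_effect_tier: SideEffectTier
    permission: str
    rate_limit_per_minute: int
    audit_event: str
    idempotent: bool
    rollback: str  # free-text description of rollback or compensation


def requires_human_approval(contract: ToolContract) -> bool:
    """Assumed rule: anything riskier than a draft needs a human approver."""
    return contract.side_effect_tier > SideEffectTier.DRAFT


def first_release_scope(catalog: list) -> list:
    """Names of tools that fit a read-only / draft first release."""
    return [c.name for c in catalog if not requires_human_approval(c)]


def make_contract(name: str, tier: SideEffectTier) -> ToolContract:
    """Fill the remaining fields with placeholder values for the example."""
    low_risk = tier <= SideEffectTier.DRAFT
    return ToolContract(
        name=name,
        input_schema={"type": "object", "required": ["case_id"]},
        output_schema={"type": "object"},
        side_effect_tier=tier,
        permission=f"tool:{name}",
        rate_limit_per_minute=60,
        audit_event=f"{name}.called",
        idempotent=low_risk,
        rollback="none needed" if low_risk else "manual reversal via runbook",
    )


CATALOG = [
    make_contract("case_summary", SideEffectTier.READ_ONLY),
    make_contract("policy_search", SideEffectTier.READ_ONLY),
    make_contract("kyc_gap_analysis_draft", SideEffectTier.DRAFT),
    make_contract("refund_request", SideEffectTier.CONTROLLED_WRITE),
    make_contract("kyc_final_decision", SideEffectTier.REGULATED_ACTION),
]

print(first_release_scope(CATALOG))
```

Keeping the tier on the contract rather than in the prompt means the approval rule can be enforced by the gateway even when the model's output is manipulated.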

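10.3 Platform PM Deep Dive

From a Platform PM perspective, I would productize this as four assets:

Tool catalog: every tool, resource, prompt, and agent capability has an owner, risk tier, schema, permissions, audit, and status.
Integration backlog: prioritized by business value, risk, and reuse; read-only and draft first, then controlled writes.
Developer experience: provide MCP server intake, a tool contract template, an SDK/reference adapter, and a local test harness.
Governance dashboard: track adoption, tool reuse, denied calls, approval rate, latency, cost, incidents, and vendor health.

I would not define the platform goal as "let every business team plug in tools however it likes", but as "let high-value tools be reused in a way that is discoverable, authorizable, auditable, and evaluable".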
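A few of the governance dashboard signals can be derived directly from audit events. The sketch below is only one possible reading. The event keys (`tool`, `workflow`, `decision`, `approval`) and the metric definitions are assumptions made for this example, not definitions from this playbook.

```python
from collections import defaultdict


def governance_metrics(events: list) -> dict:
    """Compute illustrative dashboard metrics from audit events.

    Assumed event shape (hypothetical):
      tool:     tool name
      workflow: workflow that issued the call
      decision: "allowed" or "denied" by policy
      approval: None if no human review, else "approved" or "rejected"
    """
    total = len(events)
    denied = sum(1 for e in events if e["decision"] == "denied")
    reviewed = [e for e in events if e.get("approval") is not None]
    approved = sum(1 for e in reviewed if e["approval"] == "approved")

    # Tool reuse is approximated as the number of distinct workflows per tool.
    workflows_per_tool = defaultdict(set)
    for e in events:
        workflows_per_tool[e["tool"]].add(e["workflow"])

    return {
        "denied_call_rate": denied / total if total else 0.0,
        "approval_rate": approved / len(reviewed) if reviewed else 0.0,
        "tool_reuse": {t: len(w) for t, w in workflows_per_tool.items()},
    }


SAMPLE_EVENTS = [
    {"tool": "policy_search", "workflow": "dispute", "decision": "allowed", "approval": None},
    {"tool": "policy_search", "workflow": "kyc", "decision": "allowed", "approval": None},
    {"tool": "refund_request", "workflow": "dispute", "decision": "allowed", "approval": "approved"},
    {"tool": "refund_request", "workflow": "dispute", "decision": "allowed", "approval": "rejected"},
    {"tool": "close_case", "workflow": "dispute", "decision": "denied", "approval": None},
]

metrics = governance_metrics(SAMPLE_EVENTS)
print(metrics)
```

A tool used by several workflows is a candidate for further investment, while a high denied-call rate on one tool may point to a permission model that does not match how teams work.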

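10.4 Architect Deep Dive

From an Architect perspective, I would make several boundaries explicit:

Host / client / server boundary: the host controls the user experience, the policy entry point, and audit; the server exposes tools and resources; the client handles the protocol connection.
Tool gateway boundary: the model can only produce a tool call proposal; before anything executes, the call must pass schema validation, the policy engine, approval, DLP, and rate limiting.
Context boundary: resources and tool outputs are evidence / observations, not instructions.
Identity boundary: record the human user, agent identity, and service identity together, and do not use a shared all-powerful account.
Protocol boundary: MCP fits tool and resource connections; A2A fits cross-agent task delegation; vendors that support neither protocol are wrapped in an adapter.

Architecturally, I would require a correlation id that ties together model calls, context refs, tool calls, policy decisions, approvals, and the final output, so that incident replay and regulatory inquiries can be supported.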
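The tool gateway boundary and the correlation id requirement can be sketched together. This is a simplified illustration under stated assumptions: the configuration tables, the `Proposal` class, and the function names are hypothetical, the audit store is an in-memory list, and the DLP and rate limit stages are left out to keep the example short.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical static configuration; a real gateway would read this from
# the tool catalog and the policy engine.
REQUIRED_ARGS = {"policy_search": {"query"}, "refund_request": {"case_id", "amount"}}
ENTITLEMENTS = {"ops_user_1": {"policy_search", "refund_request"}}
NEEDS_APPROVAL = {"refund_request"}

AUDIT_LOG = []  # in-memory stand-in for an audit store


@dataclass
class Proposal:
    """A tool call proposed by the model; nothing has executed yet."""
    tool: str
    arguments: dict
    user: str   # human user on whose behalf the agent acts
    agent: str  # agent identity, recorded separately from the user
    correlation_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def audit(p: Proposal, stage: str, outcome: str) -> None:
    """Record one gateway stage under the proposal's correlation id."""
    AUDIT_LOG.append({"correlation_id": p.correlation_id, "user": p.user,
                      "agent": p.agent, "tool": p.tool,
                      "stage": stage, "outcome": outcome})


def schema_ok(p: Proposal) -> bool:
    return p.tool in REQUIRED_ARGS and REQUIRED_ARGS[p.tool].issubset(p.arguments)


def policy_ok(p: Proposal) -> bool:
    return p.tool in ENTITLEMENTS.get(p.user, set())


def approval_ok(p: Proposal, approver: Optional[str]) -> bool:
    return p.tool not in NEEDS_APPROVAL or approver is not None


def run_gateway(p: Proposal, approver: Optional[str] = None) -> str:
    """Run the checks in order; stop and audit at the first failure."""
    checks = [
        ("schema_validation", lambda: schema_ok(p)),
        ("policy", lambda: policy_ok(p)),
        ("approval", lambda: approval_ok(p, approver)),
    ]
    for stage, check in checks:
        if not check():
            audit(p, stage, "denied")
            return "denied"
        audit(p, stage, "passed")
    audit(p, "execution", "executed")  # the real tool call would happen here
    return "executed"


def replay(correlation_id: str) -> list:
    """Return every audit event for one proposal, in order."""
    return [e for e in AUDIT_LOG if e["correlation_id"] == correlation_id]


refund = Proposal("refund_request", {"case_id": "C-1", "amount": 40},
                  user="ops_user_1", agent="case_agent")
print(run_gateway(refund))  # no approver supplied
print([(e["stage"], e["outcome"]) for e in replay(refund.correlation_id)])
```

Because denied stages are audited under the same correlation id as passed ones, a later review can see where a proposal stopped and not only what executed.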

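10.5 Security Deep Dive

From a Security perspective, the core point is that the prompt is not a security boundary. Prompt injection can come from user input, and also from RAG documents, web pages, emails, case notes, or vendor responses. Controls must sit at the platform layer:

Least privilege: each agent gets only the tools required by the current workflow.
Tool-level authorization: combine user, tenant, purpose, resource entitlement, and risk tier.
Human approval: require strong approval or dual control for external-send, financial movement, and regulated decisions.
Data protection: apply DLP and redaction to PII, PCI, secrets, and outbound content.
Audit: record denied calls, approved calls, tool results, and outputs.
Kill switch: be able to turn off capabilities by tool, server, vendor, tenant, or workflow.
Red-team eval: turn prompt injection, excessive agency, data leakage, and approval bypass into regression tests.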
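The kill switch item can be made concrete with a small registry that is consulted before every tool call. This is a minimal sketch. The `KillSwitch` class, its method names, and the call fields are hypothetical; only the five scopes (tool, server, vendor, tenant, workflow) come from the list above.

```python
KILL_SWITCH_SCOPES = ("tool", "server", "vendor", "tenant", "workflow")


class KillSwitch:
    """In-memory registry of disabled scope values (illustrative only)."""

    def __init__(self):
        self._disabled = set()  # set of (scope, value) pairs

    def disable(self, scope: str, value: str) -> None:
        if scope not in KILL_SWITCH_SCOPES:
            raise ValueError(f"unknown scope: {scope}")
        self._disabled.add((scope, value))

    def enable(self, scope: str, value: str) -> None:
        self._disabled.discard((scope, value))

    def blocks(self, call: dict) -> bool:
        """True if any scope value on the call has been disabled."""
        return any((scope, call.get(scope)) in self._disabled
                   for scope in KILL_SWITCH_SCOPES)


switch = KillSwitch()
switch.disable("vendor", "acme_kyc")  # hypothetical vendor under incident

kyc_call = {"tool": "kyc_lookup", "server": "kyc-mcp", "vendor": "acme_kyc",
            "tenant": "retail_eu", "workflow": "onboarding"}
kb_call = {"tool": "policy_search", "server": "kb-mcp", "vendor": "internal",
           "tenant": "retail_eu", "workflow": "onboarding"}

print(switch.blocks(kyc_call), switch.blocks(kb_call))
```

Checking all five scopes on every call lets an operator disable one vendor during an incident without taking down unrelated tools in the same workflow.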

11. Turning This Into a Portfolio Piece

This playbook can be turned into a portfolio case:

Case Title:
Designing a Governed Agent Integration Platform for Financial Retail Operations

Problem:
AI assistants need to use CRM, case, KYC, payment dispute and policy systems,
but unmanaged tool access creates security, audit and vendor risk.

Deliverables:
Tool catalog, MCP server intake, tool contract cards, A2A handoff design,
policy matrix, audit schema, vendor risk checklist, 14-day platform blueprint.

Business Impact:
Reduce duplicate integration work, improve governance, enable safe reuse of tools,
and create evidence for security, risk and architecture review.

Four diagrams are worth showing:

Capability map: business systems, MCP server, tool gateway, agent, user.
Tool call sequence: proposal, policy, approval, execution, audit.
A2A handoff: case agent -> KYC specialist agent -> artifact.
Governance dashboard: adoption, approval, denied calls, latency, incident.

12. Self-Check List

Check item	Result
Connected to the AI Platform PM Playbook, Security Gateway Lab, and Context Engineering Playbook	Yes
Explained why enterprise AI moves from custom plugins toward protocol / tool contract / capability discovery	Yes
Covered the boundaries of MCP client / server / host, tools, resources, prompts, transport, authorization/security, roots, sampling, and elicitation	Yes
Covered A2A agent handoff, task envelope, identity, capability card, state transfer, and policy boundary	Yes
Defined the tool contract's input/output schema, side effect, idempotency, permission, rate limit, audit, rollback, and human approval	Yes
Covered CRM, case management, core banking read-only, KYC vendor, payment dispute, policy KB, and ticketing	Yes
Provided BA / PM / Architect output assets	Yes
Provided the 14-day lab and final blueprint	Yes
Provided templates for MCP Server Intake, Tool Contract Card, Capability Discovery Matrix, Protocol ADR, and Integration Risk Checklist	Yes
Provided 30-second, 2-minute, Platform PM, Architect, and Security interview talking points	Yes
Included the specified source anchors	Yes