AI Third-Party Vendor Contract / Exit Architecture Playbook
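Version: v1.0 Date: 2026-06-30 Audience: CBAP, AI BA, AI PM, Enterprise Architect, Solution Architect, third-party risk management, procurement, legal, compliance, privacy, information security, model risk management, internal audit, financial retail business owners
Positioning: This playbook unifies AI vendor architecture, third-party risk, contract controls, concentration risk and exit strategy into one actionable architecture handbook for financial retail. The point is not a polished vendor scorecard. The point is that an institution using third-party AI capabilities still controls its data, models, logs, evaluations, changes, incidents, audit evidence and exit path.
Important note: This document is learning, architecture and portfolio material. It is not legal advice, regulatory interpretation, procurement advice or formal contract text. Real projects must be confirmed by legal, compliance, procurement, third-party risk, privacy, security, model risk, the business owner and internal audit according to institution type, regulatory relationship, jurisdiction, data classification and business use.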
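1. Purpose / Audience / Core View
1.1 Purpose
This playbook addresses four advanced questions:
- How to elevate AI vendor risk from “procurement due diligence” to “end-to-end architecture control”.
- How to write control requirements for data, model, logging, eval, audit, change notice, incident and exit into the contract, and obtain evidence for them during operation.
- How to design vendor abstraction so that an AI system can be replaced, degraded or migrated when the vendor raises prices, quality degrades, an incident occurs, regulatory requirements change or strategy shifts.
- How to turn third-party concentration risk and exit strategy into a demonstration of advanced capability in a financial retail AI portfolio.
1.2 Audience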
| Role | Focus | Evidence to produce |
|---|---|---|
| CBAP / AI BA | Use-case boundaries, business rules, processes, data, exceptions, regulatory impact, acceptance criteria | Business capability map, BPMN, requirements-to-controls matrix, vendor architecture review |
| AI PM | Value, user journey, MVP, adoption, risk acceptance, vendor roadmap | business case, vendor scorecard, SLA/SLO, pilot-to-production gate |
| Solution Architect | Model gateway, RAG, agent, API, logging, permissions, observability, failure degradation, replacement points | C4, ADR, interface contract, data flow, exit architecture |
| Enterprise Architect | portfolio fit, target architecture, standardization, concentration, roadmap | capability heatmap, vendor concentration view, architecture standards |
| Procurement / TPRM | Due diligence, contracts, SLA, subprocessors, fees, renewal, exit | due diligence pack, contract-control matrix, evidence schedule |
| Legal / Compliance / Privacy | Data use, regulatory cooperation, consumer impact, privacy, cross-border transfer, liability | DPA, AI addendum, regulatory cooperation terms, retention and deletion terms |
| Security / Model Risk / Internal Audit | Security, model behavior, evaluation, logging, audit rights, incidents, continuous monitoring | control evidence, eval report, audit log export, incident record |
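1.3 Core view
The essence of AI vendor risk is not whether the vendor can pass a procurement score, but:
Whether the external AI capability sits inside an enterprise architecture that is governable, observable, evaluable, auditable, replaceable and exitable.
For a financial retail institution, the AI vendor contract and the architecture must jointly answer:
- Which prompts, embeddings, indexes, logs, telemetry, support tools, training pipelines and subprocessors the data will enter.
- Whether the model version, system prompt, retrieval source, tool permission and guardrail can be frozen, regression-tested, rolled back and reconstructed for audit.
- Whether vendor model updates, pricing changes, service outages, acquisitions, subprocessor changes, region changes and policy drift come with notice, approval and suspension rights.
- Whether the institution can export data, configurations, logs, evaluations, knowledge indexes and operational evidence upon contract termination, regulatory directive, vendor failure or strategic switch.
- Whether concentration risk is quantified: a single foundation model, a single cloud, a single RAG SaaS, a single eval vendor, a single SI or a single critical data source.
One-sentence interview statement: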
I do not treat AI vendor management as procurement scoring. I treat it as a control architecture: contract terms define rights, platform architecture creates enforcement points, and evidence operations prove that those rights and controls work over time.
2. Source Anchors
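The following anchors are used to organize control language and evidence structure. They do not automatically amount to a compliance conclusion for any particular institution.
| Anchor | Official source | Idea adopted in this playbook |
|---|---|---|
| 2023 Interagency Guidance on Third-Party Relationships: Risk Management | Federal Reserve SR 23-4: https://www.federalreserve.gov/supervisionreg/srletters/sr2304.htm | Third-party risk spans a life cycle of planning, due diligence and third-party selection, contract negotiation, ongoing monitoring and termination; control rigor should be commensurate with risk, complexity and the criticality of the activity. |
| SR 23-4 Attachment: Interagency Guidance PDF | https://www.federalreserve.gov/supervisionreg/srletters/sr2304a1.pdf | Used for specific control language on contracts, ongoing monitoring, termination, regulatory examination, record retention, access rights, and data return and destruction. |
| OCC Bulletin 2023-17 | https://www.occ.gov/news-issuances/bulletins/2023/bulletin-2023-17.html | Serves as the OCC's official issuance of the interagency guidance; emphasizes third-party risk management based on the bank's risk profile, complexity and the criticality of the third-party activity. |
| FDIC FIL-29-2023 | https://www.fdic.gov/news/financial-institution-letters/2023/fil23029.html | Serves as the FDIC's official issuance of the interagency guidance; a reminder that using third parties does not diminish a bank's own responsibility for safe and sound operation and compliance. |
| Federal Register final guidance | https://www.federalregister.gov/documents/2023/06/09/2023-12340/interagency-guidance-on-third-party-relationships-risk-management | Serves as the anchor for the official text of the interagency guidance, convenient for citing the life cycle, governance, contract, monitoring and termination framework. |
| NIST AI RMF 1.0 | https://www.nist.gov/itl/ai-risk-management-framework | Uses Govern, Map, Measure and Manage to organize AI risk, evaluation, continuous monitoring, governance evidence and vendor controls. |
| ISO/IEC 42001:2023 | https://www.iso.org/standard/42001 | Uses AI Management System language to review whether the vendor has mechanisms for the AI life cycle, accountability, risk, change, operational monitoring and continual improvement. |
| NIST Cybersecurity Framework 2.0 | https://www.nist.gov/cyberframework | Uses Govern, Identify, Protect, Detect, Respond and Recover to organize cybersecurity, supply chain, incident response, recovery and evidence communication. |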
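Usage boundaries:
- The Interagency Guidance is the primary anchor for bank third-party risk management; this playbook extends it to AI vendor architecture and exit engineering.
- NIST AI RMF provides AI risk governance language; it does not replace industry regulation, consumer protection, model risk management, privacy or cybersecurity requirements.
- ISO/IEC 42001 can be used to review a vendor's AI management system, but a certificate or statement cannot replace the institution's own assessment of the specific use case, data, model and process.
- NIST CSF is used for security and resilience controls; on its own it does not cover AI output quality, model drift, errors in RAG source documents, agent privilege overreach or eval failure.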
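3. AI Vendor Risk Is More Than Procurement Scoring
Traditional vendor management often compresses risk into questionnaires, scores, certificates and contract approval. In AI scenarios this misses the core risks, because an AI vendor typically enters the institution's decisions, knowledge, interactions, automation and evidence chain.
3.1 Blind spots of procurement scoring
| What procurement scoring usually checks | What AI vendor review really needs to check | Risk consequence |
|---|---|---|
| Company size, funding, customer references | Model version, data boundary, log retention, subprocessors, region, support access | A strong brand does not mean the use case is safe |
| SOC 2 / ISO 27001 | Whether prompt, completion, embedding, index, eval and tool trace enter the evidence chain | A security certificate cannot prove that AI behavior is controllable |
| Demo results | Real business data, negative samples, edge cases, policy conflicts, stale knowledge, human review | An accurate demo does not mean production reliability |
| SLA percentage | end-to-end workflow SLO, rate limit, fallback, degraded mode, manual queue | Vendor API availability does not equal business availability |
| Contract price | token, embedding, vector storage, eval, observability, support, egress, exit cost | A low unit price can turn into a high exit cost |
| Generic compliance statements | Fit to the institution's use case, data classification, customer impact, jurisdiction and regulatory relationship | “Vendor compliance” does not equal institutional compliance |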
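3.2 The seven layers of AI vendor risk
| Layer | Key question | Architecture control |
|---|---|---|
| Business process | Whether AI changes customer commitments, approvals, advice, complaints, credit decisions, fraud, AML or transaction processing | BPMN, decision authority matrix, human approval gate |
| Data | Which data enters the vendor, and whether it is used for training, retained, reused, accessed by support or processed across borders | data classification, DLP, redaction, data loss boundary, DPA |
| Model | Which model, version and deployment is used, whether it updates silently, and whether it can be rolled back | model registry, model gateway, version pinning, eval gate |
| Knowledge / RAG | Document sources, permissions, validity periods, citations, index export, embedding management | source registry, ACL sync, index versioning, citation audit |
| Agent / Tool | Whether AI can call internal systems, write transactions, modify cases, send notifications or trigger payments | tool permission broker, approval workflow, idempotency, kill switch |
| Logging / Evidence | Whether the inputs, source documents, prompt, model, tool calls and approvals behind a given output can be reconstructed | immutable trace, evidence export, retention schedule |
| Exit / Resilience | Whether the system can degrade, be replaced, migrate or shut down when the vendor is unavailable or terminated | abstraction layer, fallback provider, manual queue, exit runbook |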
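3.3 Senior-level judgment
AI vendor risk maturity can be judged with one question:
If a critical vendor could not serve tomorrow, could the institution continue serving customers, preserve evidence, respond to regulatory inquiries, protect data and execute its exit plan within an acceptable time?
A mature institution does not only ask whether the vendor passed due diligence. It asks:
- Which business capabilities depend on this vendor.
- Whether those dependencies are critical activity, high-risk activity, customer-facing activity or back-office productivity.
- Whether control sits in the contract, the platform, the process, the logs, or only in the vendor's sales material.
- Whether any prompt, workflow, index, embedding, memory, eval set, annotation or audit trail cannot be exported.
- Whether architecture evidence can prove replaceability, rather than a verbal claim that a switch is possible in theory.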
4. Contract Clauses For AI Vendor Controls
Contracts are not a legal-only task. AI contract controls need joint input from business, BA, architect, risk, security, privacy, model risk and procurement, because many contract clauses ultimately depend on architecture and operations to take effect.
4.1 Clause overview
| Clause area | Contract must cover | Architecture must support | Evidence must prove |
|---|---|---|---|
| Data | Inputs, outputs, embedding, index, logs, telemetry, support access, training use, retention, deletion | data boundary, redaction, DLP, encryption, tenant isolation, region control | data flow, DPA, retention config, deletion certificate, access log |
| Model | Model identity, version, deployment, updates, rollback, fine-tuning, distillation, model output ownership | model gateway, version pinning, eval gate, fallback route | release notes, model card, eval report, change approval |
| Logging | prompt, completion, retrieval, tool calls, latency, cost, error, user, approval, admin action | trace schema, log redaction, SIEM integration, evidence export | audit log sample, schema, retention proof, export test |
| Eval | offline eval, red-team, regression, business rubric, negative cases, release gate | eval pipeline, golden set, judge calibration, shadow mode | eval dataset lineage, test report, sign-off record |
| Audit | right to audit, regulator access, evidence retention, subprocessor evidence, control reports | evidence binder, control mapping, read-only export, audit workspace | SOC/ISO reports, bridge letter, control test results |
| Change notice | model update, feature change, subprocessor, region, security posture, pricing, policy terms | change detection, approval workflow, canary testing, rollback | notice archive, impact assessment, ADR, release gate |
| Incident | security incident, data exposure, unsafe output, model degradation, prompt injection, tool misuse, outage | incident taxonomy, kill switch, severity routing, postmortem | notification record, timeline, root cause, corrective actions |
| Exit | termination rights, transition support, data return/destruction, export format, fees, knowledge transfer | export pipeline, replacement route, degraded mode, manual operations | exit test result, export package, deletion proof, access revocation |
4.2 Data clauses
| Control objective | Contract language intent | Architecture evidence |
|---|---|---|
| Purpose limitation | Vendor may process institution data only to provide the contracted service, not for unrelated analytics, product training, benchmarking or resale. | Data flow diagram maps every vendor processing purpose to a business use case and system endpoint. |
| No training without explicit approval | Prompts, completions, documents, embeddings, feedback, annotations, customer records and logs are not used to train or improve vendor models unless explicitly approved in writing for a scoped dataset and purpose. | Vendor configuration screenshot or API setting; model provider data-use terms; periodic attestation. |
| Data minimization | Vendor receives the minimum fields needed for the approved workflow. Sensitive fields are tokenized, masked, omitted or routed through customer-controlled services when feasible. | Field-level data classification, redaction rules, sample payload, DLP test evidence. |
| Data residency and transfer | Processing, storage, backup, support access and subprocessor flows are constrained by approved geography and legal basis. | Region configuration, subprocessor map, cross-border transfer assessment. |
| Retention and deletion | Retention period is explicit for prompts, outputs, documents, embeddings, indexes, logs, telemetry, backups and support artifacts; deletion includes confirmation and exception handling. | Retention schedule, deletion API test, deletion certificate, backup purge policy. |
| Support access | Vendor personnel access is time-bound, least-privilege, approved, logged and revocable; privileged access to production data requires named approval and business justification. | Support access log, break-glass approval, access review, offboarding evidence. |
| Data return | On request or termination, vendor returns customer data, configurations, audit logs, eval results and operational records in documented formats. | Export manifest, sample export, checksum, schema documentation. |
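The data minimization row above can be illustrated with a small filter that runs before a payload crosses the vendor boundary. This is a minimal sketch, not a production DLP control: the field names, the allow-list and the card-number pattern are hypothetical assumptions chosen for illustration.

```python
import re
from typing import Any

# Hypothetical allow-list for one approved workflow (illustrative field names).
ALLOWED_FIELDS = {"case_id", "complaint_text", "product_type"}

# Assumption: card numbers appear as 13 to 19 consecutive digits in free text.
CARD_NUMBER = re.compile(r"\b\d{13,19}\b")


def minimize_payload(record: dict[str, Any]) -> dict[str, Any]:
    """Keep only allow-listed fields and mask card-like numbers in text values."""
    minimized: dict[str, Any] = {}
    for key, value in record.items():
        if key not in ALLOWED_FIELDS:
            continue  # omit fields the approved workflow does not need
        if isinstance(value, str):
            value = CARD_NUMBER.sub("[REDACTED_CARD]", value)
        minimized[key] = value
    return minimized


sample = {
    "case_id": "C-1001",
    "customer_name": "Jane Doe",
    "complaint_text": "Charge on card 4111111111111111 was not mine.",
    "product_type": "credit_card",
}
safe_payload = minimize_payload(sample)
```

A sample payload produced this way, stored next to the redaction rules, is one form the "sample payload" and "DLP test evidence" items could take.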
4.3 Model clauses
| Control objective | Contract language intent | Architecture evidence |
|---|---|---|
| Model identity | Vendor discloses model family, deployment mode, major version, hosted region and material third-party model dependencies used for the service. | Model inventory, provider chain, deployment architecture. |
| Version control | Material model updates require advance notice, release notes, impact summary and ability to test before production exposure for high-risk workflows. | Version pinning, staging test, regression report, release approval. |
| Rollback / freeze | Institution can freeze a model version or roll back to a prior approved version when quality, compliance, safety or operational criteria fail. | Model gateway routing rule, rollback exercise record. |
| Fine-tuning and adaptation | Any fine-tuning, RAG tuning, prompt optimization or customer-specific adaptation has documented data source, owner, risk review and output rights. | Model adaptation register, dataset lineage, approval record. |
| Output rights and restrictions | Institution can use, store, review, audit and retain outputs for service delivery, quality monitoring, dispute handling and regulatory evidence. | Output retention policy, case record, audit export. |
| Model substitution | Vendor cannot substitute material model components for high-risk use cases without notice, evaluation and approval. | Dependency inventory, change notice evidence. |
4.4 Logging clauses
AI logging must balance auditability and privacy. A weak contract says “logs are available.” A strong contract defines exactly what is logged, how sensitive content is protected, how long logs are retained, who can access them and how they are exported.
| Log element | Why it matters | Required evidence |
|---|---|---|
| User / role / channel | Reconstruct authority and context | User ID, role, application, session, case ID |
| Prompt / instruction / policy version | Prove what the model was asked to do | Prompt ID, system prompt version, policy pack version |
| Retrieved sources | Check grounding, permissions and stale-source risk | Document IDs, source timestamps, ACL state, citation scores |
| Model details | Reproduce model behavior as far as feasible | Provider, model, deployment, version, parameters |
| Tool calls | Detect excessive agency and unauthorized action | Tool name, payload summary, approval, result, idempotency key |
| Output and final action | Distinguish model suggestion from business action | Raw output, edited output, approver, submitted action |
| Errors and refusals | Monitor reliability and safety | Error code, refusal reason, retry, escalation |
| Cost and latency | Control economics and customer experience | Token count, workflow duration, vendor latency, cost allocation |
| Admin changes | Audit governance and configuration drift | Config changes, approver, effective time, rollback status |
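The log elements in the table can be captured in one typed trace record so that a contract schedule and the platform share the same schema. The sketch below is an assumption about how such a record might look; the class and field names are illustrative and do not come from any vendor API.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ToolCall:
    tool_name: str
    payload_summary: str
    approved_by: Optional[str]
    result: str
    idempotency_key: str


@dataclass(frozen=True)
class TraceRecord:
    # User / role / channel
    user_id: str
    role: str
    case_id: str
    # Prompt / instruction / policy version
    prompt_id: str
    system_prompt_version: str
    # Model details
    provider: str
    model: str
    model_version: str
    # Retrieved sources, output and final action
    retrieved_document_ids: tuple[str, ...]
    raw_output: str
    final_action: str
    # Cost and latency
    token_count: int
    vendor_latency_ms: int
    tool_calls: tuple[ToolCall, ...] = ()
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize with sorted keys so exports are stable and comparable."""
        return json.dumps(asdict(self), sort_keys=True)


record = TraceRecord(
    user_id="u-17", role="dispute_agent", case_id="C-1001",
    prompt_id="dispute-summary", system_prompt_version="3",
    provider="vendor_a", model="model-x", model_version="2025-01",
    retrieved_document_ids=("POL-12",), raw_output="Draft case note",
    final_action="note_saved_after_edit", token_count=812, vendor_latency_ms=640,
    tool_calls=(ToolCall("prefill_case", "2 fields", "sup-7", "ok", "key-1"),),
)
exported = json.loads(record.to_json())
```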
4.5 Eval clauses
| Control objective | Contract language intent | Architecture evidence |
|---|---|---|
| Business-specific eval | Vendor supports customer-specific test sets and rubrics for the approved workflow, not only generic benchmarks. | Golden dataset, scoring rubric, sampled cases, reviewer calibration. |
| Negative and adversarial cases | Eval includes policy conflicts, stale knowledge, prompt injection, sensitive data, complaint language, vulnerable customer scenarios and edge cases. | Red-team report, failure taxonomy, remediation evidence. |
| Release gate | Material model, prompt, retrieval, tool or policy changes cannot enter production until agreed eval thresholds are met or risk acceptance is documented. | CI/CD gate, release checklist, exception approval. |
| Continuous monitoring | Production sampling tracks quality, drift, hallucination, citation accuracy, override rate, complaint rate and incident triggers. | Monitoring dashboard, weekly or monthly review pack. |
| Eval data protection | Eval datasets, reviewer notes and business rubrics are protected as confidential institutional assets. | Access control, retention, export and deletion evidence. |
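The release gate row can be sketched as a function that blocks a change unless every agreed threshold is met or a risk acceptance is on record. The metric names, threshold values and the `risk_acceptance_id` parameter are hypothetical; real thresholds would come from the institution's eval design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GateDecision:
    allowed: bool
    failures: tuple[str, ...]
    risk_acceptance_id: Optional[str]


def evaluate_release_gate(
    scores: dict[str, float],
    thresholds: dict[str, float],
    risk_acceptance_id: Optional[str] = None,
) -> GateDecision:
    """Allow release only if all thresholds pass or a risk acceptance is documented."""
    failures = []
    for metric, minimum in thresholds.items():
        observed = scores.get(metric)
        if observed is None:
            failures.append(f"{metric}: no result")  # a missing eval counts as a failure
        elif observed < minimum:
            failures.append(f"{metric}: {observed} below {minimum}")
    allowed = not failures or risk_acceptance_id is not None
    return GateDecision(allowed, tuple(failures), risk_acceptance_id)


# Hypothetical thresholds and scores for illustration only.
thresholds = {"policy_accuracy": 0.95, "citation_accuracy": 0.90, "injection_block_rate": 0.99}
scores = {"policy_accuracy": 0.97, "citation_accuracy": 0.88}

blocked = evaluate_release_gate(scores, thresholds)
accepted = evaluate_release_gate(scores, thresholds, risk_acceptance_id="RA-0001")
```

Keeping the failure list even when a risk acceptance unblocks the release preserves the exception approval evidence the table asks for.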
4.6 Audit clauses
| Control objective | Contract language intent | Evidence |
|---|---|---|
| Audit right | Institution or its designee can review relevant controls, reports, test results and evidence, with reasonable procedures for confidentiality and security. | Audit clause, evidence request log, completed review. |
| Regulator access | Contract recognizes that regulated activities may be subject to supervisory review and requires cooperation with appropriate regulatory requests. | Regulatory cooperation clause, exam response process. |
| Control reports | Vendor provides SOC 2, ISO certificates, bridge letters, pen-test summaries, AI governance summaries and remediation status where applicable. | Current reports, gap assessment, residual risk decision. |
| Evidence retention | Vendor retains records needed to support audit, complaint, dispute, incident, regulatory and model risk review for agreed periods. | Retention schedule, export test, archive inventory. |
| Subprocessor evidence | Vendor discloses subprocessors and provides evidence of oversight for material subprocessors. | Subprocessor list, notification archive, risk assessment. |
4.7 Change notice clauses
Material AI change is broader than a software release. It includes model routing, prompt templates, retrieval algorithms, connectors, data retention, telemetry, safety filters, pricing, rate limits and subprocessors.
| Change type | Minimum notice / control expectation | Evidence |
|---|---|---|
| Model update | Advance notice for high-risk workflows; release notes; customer testing window; rollback path | Notice, eval result, approval |
| Prompt / policy change | Versioned change log; institution approval for customer-specific policy changes | Prompt registry, policy diff |
| RAG / search algorithm | Regression test for citation quality, permission filtering and stale-source behavior | RAG eval report |
| Tool / API behavior | Impact assessment for write actions, external communications and case changes | Tool contract diff, test result |
| Subprocessor | Prior notice, objection process, updated data flow | Subprocessor notice and assessment |
| Geography / support model | Approval before material change to data location or support access model | Region review, access review |
| Pricing / rate limit | Notice period and impact analysis for cost or throughput changes | Commercial impact memo |
| Security posture | Notice for material control degradation, major audit findings or unresolved high-risk vulnerabilities | Security review update |
4.8 Incident clauses
AI incident scope should include security incidents and AI-specific operational incidents.
| Incident type | Examples | Contract / runbook expectation |
|---|---|---|
| Data exposure | Prompt, output, document, embedding, log or support artifact exposed to unauthorized party | Notice, containment, affected data, deletion, customer/regulatory support |
| Prompt injection / data exfiltration | Malicious content causes tool misuse or source leakage | Detection, affected sessions, mitigation, rule update |
| Unsafe or prohibited output | AI gives prohibited advice, discriminatory response, misleading financial explanation or harmful instruction | Severity classification, customer impact review, remediation |
| Model degradation | Accuracy, citation, refusal, latency, cost or override metrics breach threshold | Degraded mode, rollback, eval re-run |
| Tool misuse | Agent sends unauthorized message, changes case, triggers payment, modifies customer record | Kill switch, idempotent reversal, approval review |
| Service outage | Vendor API, RAG SaaS, eval platform or observability unavailable | Fallback, manual queue, SLA credit, postmortem |
| Cost runaway | Token or workflow usage spikes due to loop, abuse, misconfiguration or vendor change | Budget cap, alert, root cause, billing adjustment |
4.9 Exit clauses
Exit terms must be negotiated before dependency is created.
| Exit control | Contract language intent | Architecture evidence |
|---|---|---|
| Termination rights | Terminate for breach, persistent SLA failure, regulatory direction, unacceptable risk, data misuse, material change or strategic need with defined notice and cure periods. | Termination scenario matrix. |
| Transition support | Vendor provides transition assistance, export support, knowledge transfer and parallel-run support for an agreed period and at agreed fees. | Transition runbook, support contact, fee schedule. |
| Export scope | Data, documents, embeddings where feasible, indexes, prompts, workflow configs, policies, logs, eval results, annotations, feedback and admin records are exportable. | Export manifest and sample package. |
| Format | Exports use documented, machine-readable formats with schema, timestamps, ownership, lineage and checksums. | Schema docs, checksum validation. |
| Deletion | Vendor returns or destroys institution data and provides written confirmation; backup deletion follows defined schedule. | Deletion certificate, backup exception record. |
| Access revocation | API keys, SSO, service accounts, support access, subprocessors and integrations are revoked. | Access revocation evidence. |
| Business continuity | Contract supports orderly transition without prohibitive restrictions or costs. | Exit rehearsal and residual risk memo. |
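The export scope and format rows mention an export manifest with checksums. A minimal sketch of building and re-verifying such a manifest with SHA-256 follows; the directory layout and file names are hypothetical, and a real export package would also carry schema, lineage and ownership metadata.

```python
import hashlib
import tempfile
from pathlib import Path


def build_export_manifest(export_dir: Path) -> dict:
    """List every exported file with its size and SHA-256 checksum."""
    files = []
    for path in sorted(export_dir.rglob("*")):
        if path.is_file():
            files.append({
                "path": path.relative_to(export_dir).as_posix(),
                "bytes": path.stat().st_size,
                "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
            })
    return {"file_count": len(files), "files": files}


def verify_manifest(export_dir: Path, manifest: dict) -> list[str]:
    """Return the paths that are missing or whose checksum no longer matches."""
    mismatches = []
    for entry in manifest["files"]:
        file_path = export_dir / entry["path"]
        if (not file_path.is_file()
                or hashlib.sha256(file_path.read_bytes()).hexdigest() != entry["sha256"]):
            mismatches.append(entry["path"])
    return mismatches


# Demonstration on a temporary directory with hypothetical export files.
with tempfile.TemporaryDirectory() as tmp:
    export_dir = Path(tmp)
    (export_dir / "prompts.json").write_text('{"prompt_id": "p1"}', encoding="utf-8")
    (export_dir / "logs").mkdir()
    (export_dir / "logs" / "trace.jsonl").write_text('{"case_id": "C-1"}\n', encoding="utf-8")

    manifest = build_export_manifest(export_dir)
    clean_check = verify_manifest(export_dir, manifest)

    (export_dir / "prompts.json").write_text("altered", encoding="utf-8")
    tampered_check = verify_manifest(export_dir, manifest)
```

Running the verification step during an exit rehearsal turns the checksum validation evidence item into something repeatable.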
5. Architecture Patterns For Vendor Abstraction
AI vendor abstraction is not only a wrapper around API calls. It is a control plane that separates business workflow from provider-specific implementation.
5.1 Reference architecture
Business workflow / channel
-> AI use-case service
-> policy and decision authority layer
-> model gateway / provider abstraction
-> retrieval abstraction / knowledge access layer
-> tool permission broker
-> observability and evidence layer
-> vendor adapters
-> model / RAG / data / eval / observability vendors
5.2 Pattern 1: Model gateway
| Design element | Purpose | Control gained |
|---|---|---|
| Provider-neutral API | Avoid direct coupling from business apps to vendor SDKs | Route, swap, throttle and observe providers centrally |
| Model registry | Track approved providers, model versions and use-case mapping | Prevent unapproved model use |
| Policy-aware routing | Route by data class, use case, jurisdiction, risk tier and cost | Keep sensitive workflows on approved deployments |
| Version pinning | Keep production workflows on tested model versions | Prevent silent behavior changes |
| Fallback routing | Use alternate model or degraded mode when vendor fails | Reduce outage and concentration impact |
| Prompt registry | Version system prompts, business prompts and guardrail prompts | Reconstruct decisions and support regression tests |
When to use:
- Multiple AI use cases share foundation model providers.
- Regulated workflows require version control, eval gates, fallback and evidence.
- Vendor SDKs would otherwise leak into many applications.
When not enough:
- If the vendor owns the full SaaS workflow, a gateway cannot control hidden prompts, hidden retrieval or hidden tool calls. Contract and export rights become more important.
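A model gateway that combines a registry, version pinning and fallback routing might look like the sketch below. The provider names, model identifiers and adapter functions are hypothetical stand-ins; real adapters would wrap each vendor's SDK behind the same call signature.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ApprovedModel:
    provider: str
    model: str
    version: str  # pinned version approved for production


Adapter = Callable[[ApprovedModel, str], str]


class ModelGateway:
    def __init__(self, registry: dict[str, list[ApprovedModel]],
                 adapters: dict[str, Adapter]) -> None:
        self._registry = registry  # use case -> approved models, in routing order
        self._adapters = adapters  # provider -> vendor adapter

    def complete(self, use_case: str, prompt: str) -> tuple[ApprovedModel, str]:
        candidates = self._registry.get(use_case)
        if not candidates:
            raise PermissionError(f"no approved model for use case {use_case!r}")
        last_error = None
        for model in candidates:
            try:
                return model, self._adapters[model.provider](model, prompt)
            except Exception as exc:  # provider failure: try the next approved route
                last_error = exc
        raise RuntimeError("all approved routes failed; switch to degraded mode") from last_error


def failing_adapter(model: ApprovedModel, prompt: str) -> str:
    raise ConnectionError("primary provider unavailable")


def echo_adapter(model: ApprovedModel, prompt: str) -> str:
    return f"[{model.model}@{model.version}] summary of: {prompt}"


registry = {"dispute_summary": [ApprovedModel("vendor_a", "model-x", "2025-01"),
                                ApprovedModel("vendor_b", "model-y", "v3")]}
gateway = ModelGateway(registry, {"vendor_a": failing_adapter, "vendor_b": echo_adapter})

used_model, output = gateway.complete("dispute_summary", "customer call transcript")

try:
    gateway.complete("marketing_copy", "hello")
    unapproved_blocked = False
except PermissionError:
    unapproved_blocked = True
```

Returning the model that actually served the request lets the caller record provider and version in the trace, which is what makes fallback events auditable.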
5.3 Pattern 2: RAG abstraction
| Design element | Purpose | Control gained |
|---|---|---|
| Source registry | Track authoritative documents, owners, effective dates and retention | Avoid stale or unauthorized knowledge |
| Connector boundary | Separate document ingestion from vendor-specific indexing | Switch indexing/search vendor more easily |
| ACL sync service | Apply source-system permissions into retrieval | Prevent cross-role disclosure |
| Index versioning | Version embeddings, chunking, metadata and index build | Reproduce outputs and roll back |
| Citation audit | Store retrieved sources and citation scores | Explain and challenge outputs |
| Exportable corpus map | Preserve source-to-index mapping | Support migration and termination |
High-risk RAG rule:
If the vendor cannot export index metadata, source mappings, permission state and retrieval logs, the institution does not truly control the knowledge layer.
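The ACL sync and citation audit rows can be illustrated with a retrieval wrapper that filters vendor results by role and records what was returned and what was withheld. The keyword retriever is a deliberately simple stand-in for a vendor search adapter, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    document_id: str
    document_version: str
    allowed_roles: frozenset[str]
    text: str


class KeywordRetriever:
    """Stand-in vendor adapter: scores chunks by shared lowercase words."""

    def __init__(self, chunks: list[Chunk]) -> None:
        self._chunks = chunks

    def search(self, query: str, top_k: int) -> list[tuple[Chunk, float]]:
        terms = set(query.lower().split())
        scored = [(c, float(len(terms & set(c.text.lower().split())))) for c in self._chunks]
        scored = [pair for pair in scored if pair[1] > 0]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


def retrieve_for_role(retriever: KeywordRetriever, query: str, role: str, top_k: int = 3):
    """Apply source-system permissions after vendor search and log both outcomes."""
    hits = retriever.search(query, top_k)
    permitted = [(c, s) for c, s in hits if role in c.allowed_roles]
    retrieval_log = {
        "query": query,
        "role": role,
        "returned": [(c.document_id, c.document_version, s) for c, s in permitted],
        "filtered_out": [c.document_id for c, _ in hits if role not in c.allowed_roles],
    }
    return permitted, retrieval_log


corpus = [
    Chunk("POL-12", "v4", frozenset({"dispute_agent", "supervisor"}),
          "dispute policy chargeback window is defined here"),
    Chunk("FRD-03", "v2", frozenset({"fraud_analyst"}),
          "internal fraud investigation chargeback playbook"),
]
results, retrieval_log = retrieve_for_role(KeywordRetriever(corpus), "chargeback window",
                                           "dispute_agent")
```

Because filtering happens after the vendor's top-k cut, fewer results than `top_k` may come back; pushing the permission filter into the vendor query is preferable when the vendor supports it.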
5.4 Pattern 3: Tool permission broker for agents
| Control | Design rule |
|---|---|
| Least privilege | Agent tools are scoped by workflow, role, customer segment, transaction type and environment. |
| Approval gate | High-impact actions require human approval before execution. |
| Idempotency | Every write action uses idempotency keys and compensating actions where possible. |
| Policy enforcement | Tool calls pass through deterministic policy checks before execution. |
| Runtime kill switch | Business owner, incident commander or platform owner can disable tools by use case. |
| Evidence | Tool request, approval, result and final business action are logged. |
Agent vendor selection principle:
Never let vendor autonomy exceed the institution's ability to authorize, monitor, reverse or explain the action.
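The broker controls in the table (least privilege, approval gate, idempotency, kill switch, evidence) can be combined in one small class. This is a sketch under simplifying assumptions: policies are keyed only by tool and role, and the tool, role and approver names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ToolPolicy:
    allowed_roles: frozenset[str]
    requires_approval: bool


class ToolPermissionBroker:
    def __init__(self, policies: dict[str, ToolPolicy],
                 tools: dict[str, Callable[[dict], str]]) -> None:
        self._policies = policies
        self._tools = tools
        self._disabled: set[str] = set()    # runtime kill switch
        self._results: dict[str, str] = {}  # idempotency key -> stored result
        self.audit_log: list[dict] = []

    def disable(self, tool_name: str) -> None:
        self._disabled.add(tool_name)

    def execute(self, tool_name: str, role: str, payload: dict,
                idempotency_key: str, approved_by: Optional[str] = None) -> str:
        policy = self._policies.get(tool_name)
        if tool_name in self._disabled:
            status = result = "blocked: kill switch"
        elif policy is None or role not in policy.allowed_roles:
            status = result = "blocked: not permitted"
        elif policy.requires_approval and approved_by is None:
            status = result = "blocked: approval required"
        elif idempotency_key in self._results:
            status, result = "replayed", self._results[idempotency_key]
        else:
            result = self._tools[tool_name](payload)
            self._results[idempotency_key] = result
            status = "executed"
        self.audit_log.append({"tool": tool_name, "role": role, "status": status,
                               "approved_by": approved_by, "key": idempotency_key})
        return result


prefill_calls: list[dict] = []


def prefill_case(payload: dict) -> str:
    prefill_calls.append(payload)
    return f"prefilled {payload['case_id']}"


broker = ToolPermissionBroker(
    {"prefill_case": ToolPolicy(frozenset({"dispute_agent"}), requires_approval=True)},
    {"prefill_case": prefill_case},
)
no_approval = broker.execute("prefill_case", "dispute_agent", {"case_id": "C-1"}, "key-1")
first = broker.execute("prefill_case", "dispute_agent", {"case_id": "C-1"}, "key-1", "sup-7")
replay = broker.execute("prefill_case", "dispute_agent", {"case_id": "C-1"}, "key-1", "sup-7")
broker.disable("prefill_case")
after_kill = broker.execute("prefill_case", "dispute_agent", {"case_id": "C-2"}, "key-2", "sup-7")
```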
5.5 Pattern 4: Evidence-by-design observability
| Evidence question | Required trace field |
|---|---|
| Who initiated the workflow? | user ID, role, channel, session, case ID |
| What data entered the vendor boundary? | payload class, redaction status, document IDs |
| Which model and prompt were used? | provider, model, version, deployment, prompt version |
| Which knowledge was retrieved? | source IDs, document version, ACL state, retrieval scores |
| Which tool was called? | tool name, parameters summary, approval, execution result |
| What was shown to the user? | raw output, edited output, final action |
| Why was it accepted or overridden? | reviewer decision, reason code, feedback |
| What did it cost and how long did it take? | token count, vendor latency, workflow latency, unit cost |
Evidence-by-design means the audit trail is produced during normal operation, not assembled manually after an incident.
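One possible way to make a trace tamper-evident during normal operation is a hash chain, where each entry commits to the previous one. The hash chain is an assumption introduced here for illustration, not a technique the playbook prescribes, and the event fields are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64


class EvidenceLog:
    """Append-only log in which each entry hash covers the previous entry hash."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    @staticmethod
    def _digest(prev_hash: str, event: dict) -> str:
        body = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

    def append(self, event: dict) -> str:
        prev_hash = self._entries[-1]["entry_hash"] if self._entries else GENESIS_HASH
        entry_hash = self._digest(prev_hash, event)
        self._entries.append({"event": event, "prev_hash": prev_hash,
                              "entry_hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = GENESIS_HASH
        for entry in self._entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["entry_hash"] != self._digest(prev_hash, entry["event"]):
                return False
            prev_hash = entry["entry_hash"]
        return True


log = EvidenceLog()
log.append({"case_id": "C-1001", "model_version": "2025-01", "final_action": "note_saved"})
log.append({"case_id": "C-1002", "model_version": "2025-01", "final_action": "escalated"})
intact_before = log.verify()

# Simulate an after-the-fact edit of a stored event.
log._entries[0]["event"]["final_action"] = "refund_approved"
intact_after = log.verify()
```

A chain like this detects edits but does not prevent them; write-once storage and access controls would still be needed around it.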
5.6 Pattern 5: Strangler exit pattern
For an incumbent AI SaaS with high lock-in:
Current direct integration
-> introduce evidence export
-> introduce API facade
-> move auth and policy decisions outside vendor
-> externalize knowledge sources and source registry
-> externalize eval and release gates
-> add secondary provider or manual queue
-> migrate workflow segment by segment
Use this when:
- Vendor owns the UI and workflow, but the institution wants an exit path.
- Data, prompts, knowledge and logs are trapped inside the vendor.
- A full rewrite would be too risky.
5.7 Pattern selection matrix
| Risk condition | Recommended pattern |
|---|---|
| Direct API calls from many apps to a model provider | Model gateway |
| Customer documents or policies indexed in vendor SaaS | RAG abstraction plus export clause |
| AI can perform write actions | Tool permission broker |
| Audit trail currently equals chat transcript | Evidence-by-design observability |
| Vendor owns full workflow and data | Strangler exit pattern |
| Multiple vendors across critical workflows, or one vendor shared by many of them | Portfolio concentration dashboard |
| High-risk use case with frequent model updates | Eval gate plus version pinning |
6. Concentration Risk
AI concentration risk appears when many business capabilities depend on the same model, cloud, data vendor, workflow SaaS, eval vendor, SI or specialized skill pool.
6.1 Concentration dimensions
| Dimension | Example | Risk |
|---|---|---|
| Provider concentration | One foundation model powers customer service, complaints, AML narrative and collections | Broad outage or model behavior shift affects many workflows |
| Cloud concentration | AI workloads, vector DB, logs and training all on one cloud region/provider | Cloud outage, pricing or regional control issue has amplified impact |
| Data concentration | One KYC or fraud data vendor feeds onboarding, AML and transaction risk | Data quality or license issue affects multiple decisions |
| RAG SaaS concentration | All policies, procedures and case knowledge live in one vendor index | Exit difficulty and knowledge lock-in |
| Eval concentration | Vendor evaluates its own model and own workflow | Independence issue and weak challenge function |
| SI concentration | One implementation partner owns integration, prompt design and run support | Knowledge dependency and weak internal capability |
| Skills concentration | Only one internal engineer understands model gateway and vendor routing | Operational fragility |
| Contract concentration | Multiple business units buy the same vendor under inconsistent terms | Uneven rights, audit gaps and renewal leverage loss |
6.2 Concentration metrics
| Metric | Formula / method | Interpretation |
|---|---|---|
| Workflow dependency count | Number of production workflows relying on vendor | High count needs board or senior risk visibility for critical activities |
| Customer impact scope | Customers, accounts, cases, transactions or employees affected by vendor outage | Measures operational and consumer impact |
| Critical activity coverage | Whether vendor supports critical or high-risk banking activities | Raises governance, monitoring and exit standards |
| Replacement time | Estimated time to move to alternate provider or manual process | Long replacement time indicates lock-in |
| Data portability score | Export completeness across data, logs, configs, evals, index and prompts | Low score means contract rights are weak or untested |
| Evidence independence | Whether eval and audit evidence are vendor-produced only | Low independence weakens assurance |
| Contract consistency | Percent of vendor contracts using approved AI addendum and exit terms | Low consistency creates hidden risk |
| Unit economics sensitivity | Cost impact under 2x usage, 2x token price, premium model fallback or egress | Shows hidden commercial concentration |
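Several of these metrics reduce to simple arithmetic over a vendor inventory. The sketch below computes workflow dependency count, critical activity coverage, a data portability score and unit economics sensitivity. All workflows, token volumes and prices are invented for illustration, and the equal weighting of export artifacts is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    name: str
    vendor: str
    critical: bool
    monthly_tokens: int


EXPORT_ARTIFACTS = ("data", "logs", "configs", "evals", "index", "prompts")


def workflow_dependency_count(workflows: list[Workflow], vendor: str) -> int:
    return sum(1 for w in workflows if w.vendor == vendor)


def critical_activity_coverage(workflows: list[Workflow], vendor: str) -> bool:
    return any(w.vendor == vendor and w.critical for w in workflows)


def data_portability_score(export_tests: dict[str, bool]) -> float:
    """Share of artifact types with a successful export test (equal weights assumed)."""
    passed = sum(1 for artifact in EXPORT_ARTIFACTS if export_tests.get(artifact, False))
    return passed / len(EXPORT_ARTIFACTS)


def monthly_cost(tokens: int, price_per_1k: float,
                 fallback_share: float = 0.0, premium_price_per_1k: float = 0.0) -> float:
    """Token cost with an optional share of traffic routed to a premium fallback model."""
    blended = (1 - fallback_share) * price_per_1k + fallback_share * premium_price_per_1k
    return tokens / 1000 * blended


# Hypothetical inventory and prices.
workflows = [
    Workflow("customer_service", "vendor_a", True, 1_000_000),
    Workflow("complaints", "vendor_a", True, 600_000),
    Workflow("aml_narrative", "vendor_a", True, 400_000),
    Workflow("marketing_drafts", "vendor_b", False, 500_000),
]
tokens_a = sum(w.monthly_tokens for w in workflows if w.vendor == "vendor_a")

dependency_count = workflow_dependency_count(workflows, "vendor_a")
covers_critical = critical_activity_coverage(workflows, "vendor_a")
portability = data_portability_score(
    {"data": True, "logs": True, "configs": True, "evals": False, "index": False, "prompts": True}
)
sensitivity = {
    "baseline": monthly_cost(tokens_a, 0.5),
    "2x_usage": monthly_cost(2 * tokens_a, 0.5),
    "2x_price": monthly_cost(tokens_a, 1.0),
    "premium_fallback": monthly_cost(tokens_a, 0.5, fallback_share=0.25,
                                     premium_price_per_1k=2.0),
}
```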
6.3 Concentration controls
| Control | Description |
|---|---|
| Portfolio inventory | Maintain inventory of AI vendors mapped to capabilities, data classes, systems, models, regions and business owners. |
| Criticality tiering | Classify vendor relationships by customer impact, regulatory impact, operational dependence and substitutability. |
| Common AI addendum | Standardize data, model, logging, eval, audit, change, incident and exit terms across contracts. |
| Architecture review board | Require architecture approval for direct vendor SDK use, high-risk SaaS, agent write actions and non-exportable knowledge stores. |
| Dual-path resilience | For critical workflows, define alternate provider, degraded mode, manual queue or business shutdown criteria. |
| Independent eval | Use institution-owned eval sets and independent review for high-impact models or vendor-produced claims. |
| Exit rehearsal | Test export, re-ingestion, access revocation and degraded operations at least annually for critical vendors. |
| Renewal gate | Do not renew critical AI vendor contracts without recent evidence of SLA, eval, audit, incident, data deletion and exit readiness. |
7. Financial Retail Case: AI Assistant For Disputes And Complaints
7.1 Scenario
A regional bank wants to deploy an AI assistant for card disputes and customer complaints. The assistant summarizes customer interactions, retrieves policy and procedure documents, drafts case notes, recommends next-step actions and prepares customer response language. It does not approve refunds automatically in phase 1, but it can pre-fill case management fields.
Vendor stack:
- Foundation model provider for summarization and drafting.
- RAG SaaS for policies, procedures and knowledge articles.
- Case management SaaS integration.
- AI observability vendor for traces, cost and quality monitoring.
- SI partner for integration and workflow rollout.
7.2 Risk framing
| Risk area | Why it matters |
|---|---|
| Customer harm | Incorrect dispute guidance can affect reimbursement, complaint handling and customer trust. |
| Data sensitivity | Customer identity, account, card transaction, complaint and call transcript data may enter prompts, logs and support tools. |
| Regulatory evidence | Complaint and dispute handling must be explainable and reconstructable. |
| Knowledge accuracy | Policies and procedures change; stale RAG content can create wrong outcomes. |
| Human over-reliance | Agents may accept AI drafts without checking policy or customer facts. |
| Vendor lock-in | RAG index, prompt logic and workflow configuration can become difficult to export. |
| Concentration | Same assistant may later expand to fraud, collections and servicing, increasing dependency. |
7.3 Target architecture
Contact center / case management
-> AI assistant service
-> policy gate: allowed use, data classification, customer impact
-> redaction service for transcripts and account data
-> model gateway with approved model versions
-> RAG abstraction over bank-owned policy source registry
-> tool permission broker for case pre-fill only
-> human review and approval
-> evidence store and SIEM
-> vendor adapters
7.4 Contract controls
| Area | Control decision |
|---|---|
| Data | Vendor cannot use prompts, transcripts, outputs, feedback, embeddings or logs for model training or unrelated product improvement without explicit written approval. |
| Model | Model version is pinned for production; material updates require notice, regression test and approval. |
| RAG | Bank retains ownership of source documents and metadata; vendor must support index export or rebuild from bank-owned source registry. |
| Logging | Trace must include source documents, prompt version, model version, draft output, edits, approver and final case note. |
| Eval | Pilot must pass policy accuracy, citation accuracy, harmful advice, stale-source, complaint language and data leakage tests. |
| Change notice | Subprocessor, region, model, retrieval algorithm, retention and support access changes require notice and assessment. |
| Incident | Unsafe customer response, data leakage, unauthorized tool action and quality degradation are explicitly in incident scope. |
| Exit | Export package includes case traces, prompts, policy mappings, eval results and workflow configuration; vendor issues a deletion certificate once export is confirmed. |
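The Logging control above only works if incomplete traces are detected before they reach the evidence store. A minimal sketch of such a completeness check follows; the field names are hypothetical snake_case renderings of the items in the Logging row, and the example trace is invented for illustration.

```python
from typing import Any, Dict, List

# Hypothetical field names derived from the Logging control row.
REQUIRED_TRACE_FIELDS = (
    "source_documents",
    "prompt_version",
    "model_version",
    "draft_output",
    "edits",
    "approver",
    "final_case_note",
)


def missing_trace_fields(trace: Dict[str, Any]) -> List[str]:
    """Return required fields that are absent or empty in a trace record."""
    missing = []
    for name in REQUIRED_TRACE_FIELDS:
        value = trace.get(name)
        if name == "edits":
            # An empty edit list is valid: the draft was accepted unchanged.
            if value is None:
                missing.append(name)
        elif value in (None, "", [], {}):
            missing.append(name)
    return missing


trace = {
    "source_documents": ["POL-114 v7"],
    "prompt_version": "dispute-draft-v3",
    "model_version": "2026-01",
    "draft_output": "Provisional credit applies while the dispute is reviewed.",
    "edits": [],
    "approver": "",  # no human approver was recorded
    "final_case_note": "Customer advised of provisional credit.",
}
gaps = missing_trace_fields(trace)
```

Here `gaps` contains only `approver`, which is the kind of finding that should block the case note from being treated as audit-ready evidence.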
7.5 Operating model
| Role | Responsibility |
|---|---|
| Business owner | Owns complaint/dispute workflow outcomes, risk acceptance and customer impact. |
| AI PM | Owns adoption, user experience, business value, release gates and feedback loop. |
| BA / CBAP | Owns process map, business rules, exception paths and requirements-to-controls traceability. |
| Solution architect | Owns vendor abstraction, integration, logging, fallback and exit architecture. |
| Legal / compliance | Owns contract control review, consumer impact, record retention and regulatory cooperation terms. |
| Security / privacy | Owns data classification, redaction, access, support controls and incident routing. |
| Model risk / eval owner | Owns eval design, performance monitoring, drift thresholds and challenge. |
| Vendor owner | Owns SLA, evidence collection, quarterly review, renewal gate and exit rehearsal. |
7.6 Exit scenario
Trigger: RAG SaaS vendor announces a material pricing increase and moves support operations to a geography not approved for customer complaint data.
Exit execution:
- Freeze further rollout to new teams or use cases and block vendor-initiated feature changes.
- Export source-to-index mapping, prompts, workflow config, eval reports and trace logs.
- Rebuild index using bank-owned policy source registry in alternate retrieval layer.
- Route model calls through existing model gateway to approved provider.
- Run regression eval on 200 representative complaint and dispute cases.
- Shift production traffic by workflow segment after business approval.
- Revoke vendor access, confirm data deletion and archive termination evidence.
- Conduct lessons learned and update AI vendor standards.
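The regression eval step in the exit sequence above can act as a hard gate before any traffic shift. The sketch below assumes per-case pass/fail results for three of the eval areas named in the contract controls; the area keys and the thresholds are hypothetical and would be set by the eval owner in practice.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical minimum pass rates per eval area.
THRESHOLDS = {
    "policy_accuracy": 0.95,
    "citation_accuracy": 0.95,
    "data_leakage": 1.0,  # zero tolerance
}


def gate(results: Iterable[Tuple[str, str, bool]]) -> Dict[str, object]:
    """results: (case_id, eval_area, passed). Returns pass rates and failing areas."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for _case_id, area, ok in results:
        total[area] += 1
        passed[area] += int(ok)
    rates = {area: passed[area] / total[area] for area in total}
    # An area with no results fails the gate: missing evidence is not a pass.
    failing = sorted(a for a, t in THRESHOLDS.items() if rates.get(a, 0.0) < t)
    return {"rates": rates, "failing": failing, "approved": not failing}


# 200 synthetic cases: 192 pass policy accuracy, 188 pass citation accuracy.
results = []
for i in range(200):
    case_id = f"case-{i}"
    results.append((case_id, "policy_accuracy", i >= 8))
    results.append((case_id, "citation_accuracy", i >= 12))
    results.append((case_id, "data_leakage", True))

decision = gate(results)
```

In this synthetic run citation accuracy lands at 0.94, below the assumed 0.95 threshold, so the gate withholds approval and the traffic shift does not proceed.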
7.7 Executive narrative
We did not buy an AI assistant as a black-box SaaS. We designed the assistant so that customer data boundaries, policy retrieval, model routing, human approval, audit evidence and exit rights are controlled by the bank. The vendor accelerates capability, but the bank retains accountability, evidence and the ability to transition.
8. Templates
8.1 Vendor Architecture Review Template
# Vendor Architecture Review: [Vendor / Product / Use Case]
## 1. Review metadata
- Business owner:
- Vendor owner:
- Architecture owner:
- Review date:
- Use case:
- Risk tier:
- Customer-facing or internal:
- Critical activity linkage:
- Data classification:
- Jurisdictions / regions:
## 2. Business capability and process fit
- Business capability supported:
- Current process pain:
- AI-supported steps:
- Human approval points:
- Customer impact:
- Exception paths:
- Baseline process if AI is unavailable:
## 3. Vendor dependency map
| Dependency | Provider | Function | Data processed | Region | Subprocessor | Replacement option |
|---|---|---|---|---|---|---|
| Foundation model | | | | | | |
| RAG / search | | | | | | |
| Data source | | | | | | |
| Logging / observability | | | | | | |
| Integration / SI | | | | | | |
## 4. Data architecture
| Data object | Source system | Fields | Sensitivity | Vendor exposure | Retention | Control |
|---|---|---|---|---|---|---|
| Customer profile | | | | | | |
| Transaction data | | | | | | |
| Policy documents | | | | | | |
| Prompts and outputs | | | | | | |
| Logs and telemetry | | | | | | |
## 5. Model and RAG controls
| Control | Decision | Evidence |
|---|---|---|
| Approved model versions | | |
| Version pinning | | |
| Model update notice | | |
| Prompt registry | | |
| Source registry | | |
| ACL sync | | |
| Citation audit | | |
| Index export or rebuild path | | |
## 6. Agent and tool controls
| Tool | Read / write | Approval needed | Idempotency | Logging | Kill switch |
|---|---|---|---|---|---|
| | | | | | |
## 7. Evaluation and release gate
| Eval area | Threshold | Owner | Evidence |
|---|---|---|---|
| Accuracy | | | |
| Citation correctness | | | |
| Sensitive data handling | | | |
| Policy conflict handling | | | |
| Unsafe output | | | |
| Latency and cost | | | |
## 8. Resilience and exit
| Scenario | Response | Maximum tolerable impact | Evidence |
|---|---|---|---|
| Vendor outage | | | |
| Model quality degradation | | | |
| Subprocessor change | | | |
| Contract termination | | | |
| Regulatory direction | | | |
## 9. Architecture decision
- Approved use:
- Conditions:
- Required contract controls:
- Required technical controls:
- Required monitoring:
- Renewal gate:
- Exit rehearsal date:
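The columns in the agent and tool controls table of this template (read / write, approval needed, idempotency, logging, kill switch) map naturally onto a tool permission broker. The following is a minimal sketch under stated assumptions: `ToolPolicy`, `ToolBroker`, the decision strings and the tool names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ToolPolicy:
    write: bool            # True if the tool changes state
    approval_needed: bool  # human approval required before execution


@dataclass
class ToolBroker:
    policies: Dict[str, ToolPolicy]
    kill_switch: bool = False
    seen_keys: Set[str] = field(default_factory=set)
    audit_log: List[dict] = field(default_factory=list)

    def request(self, tool: str, idempotency_key: str,
                approver: Optional[str] = None) -> str:
        decision = self._decide(tool, idempotency_key, approver)
        # Every request is logged, including denials.
        self.audit_log.append({"tool": tool, "key": idempotency_key,
                               "approver": approver, "decision": decision})
        if decision == "allow":
            self.seen_keys.add(idempotency_key)
        return decision

    def _decide(self, tool: str, key: str, approver: Optional[str]) -> str:
        if self.kill_switch:
            return "deny:kill_switch"
        policy = self.policies.get(tool)
        if policy is None:
            return "deny:unknown_tool"  # tools are deny-by-default
        if policy.write and key in self.seen_keys:
            return "skip:duplicate"     # same write already executed
        if policy.approval_needed and not approver:
            return "deny:approval_required"
        return "allow"


broker = ToolBroker({"case_prefill": ToolPolicy(write=True, approval_needed=True)})
first = broker.request("case_prefill", "case-1001-v1")
second = broker.request("case_prefill", "case-1001-v1", approver="agent-42")
third = broker.request("case_prefill", "case-1001-v1", approver="agent-42")
fourth = broker.request("refund_issue", "case-1001-r1", approver="agent-42")
broker.kill_switch = True
fifth = broker.request("case_prefill", "case-1002-v1", approver="agent-42")
```

The deny-by-default branch is the important design choice: a tool absent from the policy table is refused and logged, which keeps the agent limited to the actions recorded in the review.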
8.2 Contract-Control Matrix Template
| Control ID | Clause area | Control requirement | Contract artifact | Technical enforcement | Evidence | Owner | Review cadence |
|---|---|---|---|---|---|---|---|
| CC-01 | Data | Vendor cannot use institution prompts, outputs, embeddings, logs or feedback for model training without explicit written approval. | MSA / DPA / AI addendum | Provider setting, data gateway, redaction | Vendor attestation, configuration evidence | Privacy / vendor owner | Quarterly |
| CC-02 | Data | Retention periods are defined for prompts, outputs, embeddings, indexes, logs, telemetry and support artifacts. | DPA / retention schedule | Retention config, deletion API | Retention report, deletion test | Privacy / platform owner | Quarterly |
| CC-03 | Model | Material model updates require notice, release notes, test window and approval for high-risk workflows. | AI addendum / SLA | Model gateway, version pinning | Notice archive, eval report | Model risk / architect | Per release |
| CC-04 | Logging | Vendor provides exportable trace logs covering prompt, source, model, output, tool call, approval and final action. | Audit addendum | Evidence store, SIEM integration | Sample trace export | Security / audit owner | Monthly |
| CC-05 | Eval | Customer-specific eval and regression testing must pass agreed thresholds before production release. | SOW / AI addendum | Eval pipeline, release gate | Eval report, sign-off | Eval owner / PM | Per release |
| CC-06 | Audit | Institution and regulators receive access to relevant evidence, control reports and records for regulated activities. | MSA / audit clause | Evidence binder | Report inventory, request log | Legal / internal audit | Semiannual |
| CC-07 | Change | Subprocessor, geography, model, retrieval, retention, support access and pricing changes require notice and assessment. | Change notice clause | Vendor intake workflow | Impact memo, approval | Vendor owner | Per notice |
| CC-08 | Incident | Data exposure, unsafe output, prompt injection, tool misuse, outage and quality degradation are in incident scope. | Incident clause | Incident routing, kill switch | Incident record, postmortem | Security / business owner | Per incident |
| CC-09 | Exit | Vendor must support export of data, configuration, logs, eval results, prompts and policy mappings, and must provide deletion confirmation. | Exit clause | Export pipeline, migration runbook | Exit test, export manifest | Vendor owner / architect | Annual |
| CC-10 | Concentration | Critical workflows require concentration review and alternate path. | Governance standard | Portfolio inventory, fallback route | Concentration dashboard | Enterprise architect | Quarterly |
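The review cadence column only has value if missed reviews are surfaced. The sketch below flags controls whose latest evidence is older than their cadence allows. The day counts per cadence, the record layout and the evidence dates are assumptions for illustration; event-driven cadences such as "Per release" have no calendar due date and are skipped.

```python
from datetime import date, timedelta
from typing import Any, Dict, List

# Assumed day counts for calendar cadences.
CADENCE_DAYS = {"Monthly": 31, "Quarterly": 92, "Semiannual": 183, "Annual": 366}


def overdue_controls(controls: List[Dict[str, Any]], today: date) -> List[str]:
    """Return IDs of controls with missing or stale evidence."""
    overdue = []
    for control in controls:
        days = CADENCE_DAYS.get(control["cadence"])
        if days is None:
            continue  # event-driven: reviewed when the event occurs
        last = control.get("last_evidence")
        if last is None or today - last > timedelta(days=days):
            overdue.append(control["id"])
    return overdue


controls = [
    {"id": "CC-01", "cadence": "Quarterly", "last_evidence": date(2026, 1, 15)},
    {"id": "CC-04", "cadence": "Monthly", "last_evidence": date(2026, 6, 10)},
    {"id": "CC-03", "cadence": "Per release", "last_evidence": None},
    {"id": "CC-09", "cadence": "Annual", "last_evidence": None},
]
stale = overdue_controls(controls, today=date(2026, 6, 30))
```

With these invented dates, CC-01 is flagged because its evidence is 166 days old against a quarterly cadence, and CC-09 is flagged because no exit test evidence exists at all.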
8.3 Exit Plan Template
# AI Vendor Exit Plan: [Vendor / Product / Use Case]
## 1. Exit scope
- Vendor / product:
- Contract ID:
- Business capabilities affected:
- Systems affected:
- Data classes affected:
- Criticality tier:
- Exit trigger:
- Target end state:
## 2. Exit triggers
| Trigger | Severity | Decision owner | Required action |
|---|---|---|---|
| Persistent SLA breach | | | |
| Material model degradation | | | |
| Data misuse or breach | | | |
| Regulatory direction | | | |
| Unacceptable subprocessor / region change | | | |
| Commercial or renewal failure | | | |
| Strategic platform migration | | | |
## 3. Dependency inventory
| Asset | Vendor-held | Institution-held | Export format | Replacement / target | Validation method |
|---|---|---|---|---|---|
| Source documents | | | | | |
| Prompts | | | | | |
| Workflow configuration | | | | | |
| Embeddings / index metadata | | | | | |
| Audit logs | | | | | |
| Eval reports | | | | | |
| Feedback and annotations | | | | | |
| API credentials and integrations | | | | | |
## 4. Transition approach
- Replacement option:
- Degraded mode:
- Manual processing approach:
- Parallel run period:
- Data export date:
- Re-ingestion plan:
- Regression eval plan:
- Business acceptance criteria:
- Customer communication approach:
- Regulatory / audit notification approach:
## 5. Exit execution checklist
| Step | Owner | Evidence |
|---|---|---|
| Freeze vendor changes | | Change freeze record |
| Export data and configuration | | Export manifest and checksum |
| Export logs and evidence | | Audit archive |
| Rebuild or migrate knowledge layer | | Index validation report |
| Run regression eval | | Eval report and sign-off |
| Shift traffic or workflow | | Cutover record |
| Revoke access | | Access revocation report |
| Confirm deletion | | Deletion certificate |
| Close financial obligations | | Final invoice review |
| Archive exit evidence | | Evidence binder link |
| Conduct lessons learned | | Post-exit review |
## 6. Residual risk decision
- Residual risks:
- Accepted by:
- Compensating controls:
- Monitoring period:
- Evidence retention:
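The "Export manifest and checksum" evidence in the execution checklist above can be produced with standard hashing. A minimal sketch, assuming the vendor export arrives as a directory of files; the file names are invented and the manifest format (relative path mapped to SHA-256 hex digest) is one reasonable choice, not a prescribed standard.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large trace archives do not load into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(export_dir: Path) -> Dict[str, str]:
    """Map each exported file (relative path) to its SHA-256 checksum."""
    return {
        p.relative_to(export_dir).as_posix(): sha256_of(p)
        for p in sorted(export_dir.rglob("*"))
        if p.is_file()
    }


def verify_manifest(export_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """Return files that are missing or whose checksum no longer matches."""
    problems = []
    for rel, expected in manifest.items():
        target = export_dir / rel
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel)
    return problems


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "prompts.json").write_text('{"v": 3}', encoding="utf-8")
    (root / "traces").mkdir()
    (root / "traces" / "2026-06.jsonl").write_text("{}\n", encoding="utf-8")
    manifest = build_manifest(root)
    clean = verify_manifest(root, manifest)
    # Simulate a file that changed after the manifest was signed off.
    (root / "prompts.json").write_text('{"v": 4}', encoding="utf-8")
    tampered = verify_manifest(root, manifest)
```

Running the same verification after re-ingestion on the target platform gives the validation method column in the dependency inventory something concrete to point at.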
8.4 SLA / SLO / Evidence Schedule Template
| Area | SLA / SLO | Measurement | Breach trigger | Evidence source | Owner | Cadence |
|---|---|---|---|---|---|---|
| Availability | AI service available for approved business hours and critical processing windows | Vendor uptime plus workflow success rate | Workflow unavailable beyond agreed threshold | Vendor status, synthetic test, workflow logs | Vendor owner / platform | Monthly |
| Latency | Response time supports user workflow without increasing handling time | p50 / p95 workflow latency | p95 exceeds threshold for defined period | APM, trace logs | Platform owner | Weekly |
| Quality | Output meets business rubric and policy accuracy threshold | Sample review and automated eval | Score below threshold or negative trend | Eval dashboard, QA review | Eval owner | Weekly / monthly |
| Citation accuracy | RAG outputs cite current, authorized and relevant sources | Citation audit | Wrong, stale or unauthorized source above threshold | Retrieval logs, source registry | Knowledge owner | Monthly |
| Human override | Override rate stays within expected range and reasons are reviewed | Override rate by team and use case | Spike or persistent outlier | Case management, feedback logs | Business owner | Monthly |
| Data leakage | Sensitive data controls operate as designed | DLP and redaction tests | Any high-severity leakage | DLP logs, incident record | Security / privacy | Continuous |
| Model change | Material changes are tested before production | Change records and eval gates | Unapproved change or failed regression | Release log, notice archive | Model risk / architect | Per release |
| Incident notice | Vendor notifies within agreed timeline by severity | Incident timestamps | Late or incomplete notice | Incident record | Security / vendor owner | Per incident |
| Cost | Unit cost per workflow remains within approved range | Cost per case / token / document / user | Threshold breach | Usage export, finance report | PM / finance | Monthly |
| Exit readiness | Export and migration controls remain functional | Exit rehearsal | Failed export or missing evidence | Exit test record | Vendor owner / architect | Annual |
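The Latency row above uses the breach trigger "p95 exceeds threshold for defined period". One way to make that measurable is sketched below, assuming latency samples are grouped into fixed windows and that "defined period" means a number of consecutive windows. The nearest-rank percentile method, the threshold and the window count are assumptions.

```python
from typing import List, Sequence


def p95(samples: Sequence[float]) -> float:
    """Nearest-rank 95th percentile."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of 0.95 * n avoids floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_breach(windows: List[Sequence[float]], threshold_ms: float,
                   consecutive: int) -> bool:
    """True if p95 exceeds the threshold in `consecutive` windows in a row."""
    streak = 0
    for samples in windows:
        streak = streak + 1 if p95(samples) > threshold_ms else 0
        if streak >= consecutive:
            return True
    return False


fast = list(range(100, 200))      # p95 is 194 ms
slow = [v * 10 for v in fast]     # p95 is 1940 ms
isolated_spikes = latency_breach([fast, slow, fast, slow], 800, consecutive=2)
sustained = latency_breach([fast, slow, fast, slow, slow], 800, consecutive=2)
```

Requiring consecutive windows keeps a single slow interval from opening a vendor breach, while a sustained degradation still trips the trigger.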
8.5 Portfolio Evidence Template
| Evidence artifact | What it proves | Portfolio story |
|---|---|---|
| Vendor architecture review | You can translate vendor claims into system, data, model, logging and exit questions | “I evaluate AI vendors at architecture depth, not only procurement depth.” |
| Contract-control matrix | You understand how clauses map to technical enforcement and evidence | “I can bridge legal language, risk controls and implementation.” |
| Data flow and vendor boundary diagram | You can show where sensitive data crosses vendor boundaries | “I can make privacy and third-party risk visible to executives.” |
| Model gateway ADR | You can design for abstraction, version control and fallback | “I reduce lock-in with explicit architectural decision records.” |
| RAG source registry and index strategy | You can protect knowledge quality, permissions and exportability | “I know that RAG risk is often knowledge governance risk.” |
| Eval and release gate report | You can convert requirements into measurable acceptance criteria | “I do not accept benchmark demos as production evidence.” |
| SLA/SLO/evidence schedule | You can operate vendor risk after go-live | “I make risk management continuous, not a one-time approval.” |
| Concentration dashboard | You can see enterprise dependency across workflows and providers | “I can discuss resilience and systemic vendor risk with senior leaders.” |
| Exit plan and rehearsal | You can prove a vendor is replaceable in practice | “I design for graceful degradation and orderly termination before dependency grows.” |
| Incident postmortem template | You can manage AI-specific incidents across business, vendor and technical teams | “I treat unsafe output, data leakage and tool misuse as operational incidents.” |
9. Interview Expression
9.1 30-second version
AI third-party risk is not just vendor due diligence. In financial retail, I would manage it as a lifecycle and architecture problem. The contract gives us rights over data, model changes, logs, evals, audit access, incident response and exit. The architecture gives us enforcement points such as a model gateway, RAG abstraction, tool permission broker and evidence store. The operating model proves the controls through monitoring, release gates, incident records and exit rehearsals.
9.2 2-minute version
For an AI vendor, I start with the business process and customer impact, then map every vendor dependency across data, model, RAG, agent tools, logs and subprocessors. I use the lifecycle from the 2023 Interagency Guidance on Third-Party Relationships: planning, due diligence and third-party selection, contract negotiation, ongoing monitoring and termination. Then I add AI-specific controls from NIST AI RMF and ISO/IEC 42001: governance, risk mapping, measurement, management, lifecycle accountability and continuous improvement.
The key is to avoid treating the vendor as a black box. Contractually, I want clear terms for no training on our data, retention and deletion, model change notice, logging and audit export, eval requirements, incident notification, regulatory cooperation and exit support. Architecturally, I avoid direct coupling where possible. I prefer a model gateway for provider abstraction, a source registry and RAG abstraction for knowledge control, a tool permission broker for agent actions, and an evidence layer that can reconstruct decisions.
Finally, I look at concentration and exit. If a single model provider, RAG SaaS or SI supports multiple critical workflows, I want a concentration view, alternate path, degraded mode and tested export plan. The goal is not to eliminate vendors; it is to use vendors while retaining institutional accountability, resilience and evidence.
9.3 Senior interview bullets
- “I separate supplier selection from supplier dependency. Selection answers who can provide capability; dependency architecture answers how we remain in control after adoption.”
- “For AI SaaS, the most dangerous lock-in is often not the UI. It is trapped prompts, hidden retrieval logic, non-exportable logs, vendor-held evals and undocumented workflow configuration.”
- “A strong AI contract is only useful if the platform can enforce it. If the contract says model changes require notice but the application cannot detect model versions, the control is weak.”
- “I ask whether the institution can reconstruct a disputed AI-assisted decision: user, data, prompt, model, sources, tool calls, output, human approval and final action.”
- “Exit strategy is a design input, not a termination activity. I want data export, source registry, provider abstraction and manual fallback designed before production.”
- “For CBAP work, I translate business rules and exception paths into contract controls, eval cases and architecture review questions.”
- “For a financial retail AI assistant, I would not accept a generic vendor benchmark. I would test stale policy, vulnerable customer language, complaint escalation, PII leakage, citation accuracy and human override behavior.”
9.4 Questions to ask the interviewer
- “Which AI vendors are already embedded in your critical workflows, and do you have a single portfolio view of dependency and concentration?”
- “Do your AI contracts give you export rights for prompts, workflow configuration, logs, eval results and knowledge index metadata?”
- “Can your teams reconstruct an AI-assisted customer decision from production evidence without asking the vendor to manually assemble it?”
- “Do you have a tested exit path for your most important AI vendor, including degraded operations and data deletion evidence?”
- “Are model updates, RAG changes and tool permission changes governed through the same release and risk process as other production changes?”
10. Practical Review Checklist
| Question | Strong answer |
|---|---|
| What is the business capability? | Mapped to capability, workflow, owner, customer impact and risk tier. |
| What data crosses the vendor boundary? | Field-level data map including prompts, outputs, embeddings, logs, telemetry and support access. |
| What model is used? | Known provider, model, deployment, version, update process and fallback route. |
| What knowledge is used? | Source registry, owner, effective date, ACL, index version and citation audit. |
| What actions can AI take? | Tools are scoped, approved, logged, reversible where feasible and controlled by kill switch. |
| What is in the contract? | Data, model, logging, eval, audit, change, incident, subprocessor and exit controls. |
| What is monitored? | SLA, workflow SLO, quality, citation, override, drift, incidents, cost and exit readiness. |
| What can be exported? | Data, logs, prompts, configs, evals, feedback, source mappings and operational evidence. |
| What if vendor fails? | Fallback, degraded mode, manual queue, alternate provider and communication plan. |
| What if regulator asks? | Evidence binder can reconstruct decisions, controls, monitoring and vendor oversight. |
11. One-Page Executive Summary
AI vendor governance for financial retail should be built around four layers of control:
- Contract rights: data limits, model change notice, logging, eval, audit, incident, subprocessor and exit terms.
- Architecture controls: model gateway, RAG abstraction, tool permission broker, evidence layer and fallback path.
- Operating evidence: release gates, monitoring, incident records, access reviews, SLA reports and exit rehearsals.
- Portfolio governance: concentration view, criticality tiering, renewal gates and senior risk reporting.
The most advanced posture is not “we only build internally” or “we only buy certified products.” The advanced posture is selective external leverage with internal control of decision authority, data boundaries, evidence, resilience and exit.