AI Supply Chain / AI BOM: Provenance Architecture
An explainer on AI supply chain, AI BOM and provenance architecture.
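Audience: AI Platform PM / Enterprise Architect / Senior BA / Model Risk Lead / Security Architect / Vendor Risk Lead.

Core question: An AI supply chain is not an ordinary software dependency list. Financial retail AI systems also depend on models, vendors, datasets, RAG corpora, prompts, tools, MCP servers, eval sets, human review vendors, telemetry stores, open-source packages, licenses, versions, signatures and change evidence.

Learning objective: Build an AI Bill of Materials and Provenance Architecture that can answer "which components, data, tools, people and vendors jointly produced a given AI output, and are those components trustworthy, replaceable, auditable and exit-ready?"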
Source Anchors
| Source | Link | How this article uses it |
|---|---|---|
| CISA SBOM official page | https://www.cisa.gov/sbom | Apply SBOM thinking to component inventory, version, supplier, dependency and vulnerability response. |
| NIST SSDF SP 800-218 | https://csrc.nist.gov/pubs/sp/800/218/final | Extend secure software development practices to AI component build, verify, release and response. |
| OWASP Top 10 for LLM Applications | https://genai.owasp.org/llm-top-10/ | Bring LLM supply chain, data/model poisoning and tool/plugin/MCP risks into the AI BOM. |
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Use Govern / Map / Measure / Manage to organize AI supply chain risk governance and evidence. |
| Federal Reserve SR 26-2 | https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm | Serves as the current model risk management anchor; the AI BOM must be broader than a formal model inventory. |
In one sentence:
AI BOM is the component-level traceability layer for AI systems: model, data, prompt, tool, eval, human operation, telemetry and vendor components must be versioned, provenanced, signed, monitored and exit-ready.
1. Thesis
A traditional SBOM focuses on software packages, versions, suppliers and vulnerability response. An AI supply chain needs a broader BOM:
| Traditional SBOM object | AI BOM extension |
|---|---|
| application package | model endpoint, prompt pack, RAG index, tool manifest |
| library dependency | open-source model, tokenizer, MCP server, connector package |
| supplier | model provider, data vendor, human review vendor, eval SaaS |
| version | model version, prompt version, corpus snapshot, eval set release |
| license | model license, dataset license, document rights, generated-output terms |
| vulnerability | CVE plus prompt injection, data poisoning, unsafe tool chain, model deprecation |
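An AI BOM does not replace vendor risk management. Vendor risk answers "is this supplier usable, compliant and reliable?" An AI BOM answers "what is each component of this AI workflow, where did it come from, who approved it, which version is it, what rights restrictions apply, which outputs used it, and how do we scope the impact when a change or vulnerability appears?"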
2. Why It Matters In Financial Retail
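Financial retail AI systems typically reach into customer service, credit, collections, fraud, AML, complaints, wealth, marketing and employee Copilots. A supply chain failure produces five kinds of consequences: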
| Failure | Example |
|---|---|
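| Hidden model drift | The vendor silently upgrades the model; customer-service answer style and refusal rate change. |
| RAG provenance break | An expired fee policy remains in the vector index and affects customer commitments. |
| Tool chain compromise | An MCP connector description is poisoned and the agent exfiltrates customer data. |
| License / rights breach | An open-source model or dataset license does not permit commercial financial use. |
| Weak incident scoping | After a poisoned dataset is discovered, no one can determine which outputs, evals, prompts and model adaptations are affected. |
SR 26-2 is the current regulatory anchor for model risk management, but an AI BOM cannot be equated with a formal model inventory:
- A GenAI workflow may have only one formal model but dozens of components that can change its behavior.
- RAG corpora, prompts, tool permissions, eval sets, human review standards and telemetry pipelines can all change results.
- Without component-level provenance, model validation can only verify "how the system performed at one point in time"; it cannot trace "which component caused the change".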
3. Core Concepts
| Concept | Meaning | AI example |
|---|---|---|
| AI BOM | AI system component inventory with versions, provenance, ownership, rights and risk state | card-dispute-copilot-aibom-2026.06.30 |
| Component | Any artifact or service that can influence AI behavior, evidence, cost or compliance | model, prompt, corpus, MCP server, eval set, reviewer vendor |
| Provenance | Evidence of origin, derivation, approval, signature and usage | source doc -> chunk -> index -> prompt -> output |
| Attestation | Signed statement that a component was built, tested or approved under defined controls | signed prompt pack release |
| Vulnerability | Any defect or exposure requiring triage, not only CVE | poisoned corpus, expired license, model deprecation |
| Impact graph | Queryable map from component to affected workflows, outputs, customers and controls | model v2026-06 used by 12 workflows |
| Exit readiness | Ability to replace, rebuild, disable or downgrade a component | alternate model route and index rebuild runbook |
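The Attestation and Provenance rows above can be made concrete with a small sketch. It binds an artifact digest to an approval and signs the statement. The names `SIGNING_KEY`, `attest` and `verify`, the approver value and the prompt text are all hypothetical, and HMAC with a shared secret is used only to keep the example stdlib-only; a real deployment would more likely use asymmetric signatures with keys held in a KMS or HSM.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret (assumption). HMAC stands in for a real
# asymmetric signature so the sketch needs only the standard library.
SIGNING_KEY = b"demo-key-not-for-production"


def digest(content: bytes) -> str:
    """Return a sha256 digest string for a component artifact."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def attest(component_id: str, version: str, content: bytes, approver: str) -> dict:
    """Build a signed statement binding an artifact digest to an approval."""
    statement = {
        "component_id": component_id,
        "version": version,
        "digest": digest(content),
        "approver": approver,
    }
    # Canonical JSON (sorted keys) so signer and verifier hash the same bytes.
    payload = json.dumps(statement, sort_keys=True).encode("utf-8")
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"statement": statement, "signature": signature}


def verify(attestation: dict, content: bytes) -> bool:
    """Check the signature and that the artifact still matches the signed digest."""
    payload = json.dumps(attestation["statement"], sort_keys=True).encode("utf-8")
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    signature_ok = hmac.compare_digest(expected, attestation["signature"])
    digest_ok = attestation["statement"]["digest"] == digest(content)
    return signature_ok and digest_ok


# Illustrative prompt pack content (assumption, not a real release).
prompt_pack = b"You are a card dispute assistant. Cite the fee policy."
att = attest("prompt-dispute-sop", "v9", prompt_pack, approver="compliance-lead")
print(verify(att, prompt_pack))                 # unchanged artifact
print(verify(att, prompt_pack + b" (edited)"))  # tampered artifact
```

The point of the design is that the BOM entry stores the attestation, so any later edit to the prompt pack fails verification until a new release is approved and signed.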
4. AI BOM Component Taxonomy
| Component class | Required metadata |
|---|---|
| Model | provider, model name, version, deployment, region, license, data-use terms, eval status |
| Provider | legal entity, service, subprocessor, region, SLA, support access, concentration score |
| Dataset | source, owner, license, consent, sensitivity, purpose, snapshot, quality profile |
| RAG corpus | source docs, chunk rule, embedding model, index version, ACL policy, effective dates |
| Prompt | template ID, system/developer instructions, policy pack, owner, change approval |
| Tool / API | tool name, risk tier, permission scope, schema, approver, kill switch |
| MCP server / connector | manifest, allowed tools, transport, token scope, sandbox, signature |
| Eval dataset | source, synthetic/production flag, sensitivity, rubric, release gate, owner |
| Human operation | review vendor, reviewer training, rubric, QA sampling, confidentiality control |
| Telemetry store | trace schema, retention, access, masking, export, evidence owner |
| Open-source package | package, version, license, maintainer, checksum, vulnerability state |
| License / rights | permitted use, attribution, redistribution, training/fine-tune restrictions |
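One way to enforce the taxonomy above is a registration check that rejects components with incomplete metadata. The sketch below is illustrative only: `REQUIRED_METADATA`, `Component` and `missing_metadata` are hypothetical names, the field names are snake_case abbreviations of a few rows of the table, and only four component classes are covered.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Required metadata per component class, abbreviated from the taxonomy table.
# Field names are assumptions; a real registry would define its own schema.
REQUIRED_METADATA = {
    "model": {"provider", "model_name", "version", "region", "license", "eval_status"},
    "prompt": {"template_id", "policy_pack", "owner", "change_approval"},
    "rag_corpus": {"source_docs", "chunk_rule", "embedding_model", "index_version", "acl_policy"},
    "tool": {"tool_name", "risk_tier", "permission_scope", "approver", "kill_switch"},
}


@dataclass
class Component:
    component_id: str
    component_class: str
    metadata: dict = field(default_factory=dict)

    def missing_metadata(self) -> list[str]:
        """Return required fields that are absent, sorted for stable output."""
        required = REQUIRED_METADATA.get(self.component_class, set())
        return sorted(required - set(self.metadata))


prompt = Component(
    component_id="prompt-dispute-sop",
    component_class="prompt",
    metadata={"template_id": "dispute-sop", "owner": "policy-ops"},
)
print(prompt.missing_metadata())
```

A release gate could refuse to deploy any workflow whose BOM contains a component with a non-empty `missing_metadata()` result.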
5. Architecture Diagram
AI Product / Workflow
  -> AI BOM Registry
       -> model, prompt, RAG, tool, MCP, eval, human review, telemetry, package, license
  -> Provenance Event Log
       -> ingest, sign, approve, deploy, invoke, evaluate, incident, retire
  -> Impact Graph
       -> component -> workflow -> output -> customer/case -> control evidence
  -> Response Engine
       -> vulnerability/change notice -> impact query -> owner task -> regression -> release/rollback
Key design principle:
Every component that can change model behavior, data exposure, legal rights, audit evidence, cost or exit path belongs in the AI BOM.
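The Impact Graph layer can be read as a directed graph walked from a component to everything downstream of it. The following is a minimal sketch under that assumption: the edge data, node naming scheme (`component:`, `workflow:`, `output:`, `case:`) and the `impacted` function are hypothetical, and a production system would query a graph or relational store rather than an in-memory dict.

```python
from collections import deque

# Hypothetical edges: each key points to the nodes it directly influences,
# following component -> workflow -> output -> customer/case.
IMPACT_EDGES = {
    "component:rag-fee-policy@v17": ["workflow:card-dispute-copilot"],
    "component:model-primary@2026-06": [
        "workflow:card-dispute-copilot",
        "workflow:fee-faq-bot",
    ],
    "workflow:card-dispute-copilot": ["output:o-1001", "output:o-1002"],
    "workflow:fee-faq-bot": ["output:o-2001"],
    "output:o-1001": ["case:c-77"],
    "output:o-1002": ["case:c-78"],
    "output:o-2001": ["case:c-91"],
}


def impacted(start: str) -> list[str]:
    """Breadth-first walk returning every node downstream of `start`."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        for nxt in IMPACT_EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


print(impacted("component:rag-fee-policy@v17"))
```

Flagging the corpus reaches one workflow, two outputs and two cases, while flagging the model would also reach the second workflow. That difference is what the Response Engine needs in order to assign owner tasks.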
6. Provenance Architecture
Minimum provenance events:
| Event | Evidence |
|---|---|
| component_registered | component ID, class, owner, supplier, version, source URI |
| component_verified | checksum, signature, scan result, license review, data rights review |
| component_approved | approver, risk tier, approved use, prohibited use |
| component_deployed | workflow, environment, release ID, effective time |
| component_used | trace ID, model call ID, prompt ID, retrieval index, tool call |
| component_changed | old version, new version, reason, change owner, eval result |
| component_flagged | vulnerability, license issue, poisoning suspicion, deprecation |
| component_retired | replacement, deletion/export evidence, residual risk |
AI BOM should connect with model inventory, vendor inventory, data catalog, package registry, CMDB, prompt registry, RAG source registry, eval store and incident system. It should not live as a spreadsheet detached from runtime traces.
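A sketch of how the minimum provenance events might be recorded follows. The event names come from the table above; the `ProvenanceLog` class and the hash chaining are assumptions added for illustration (the text only requires that evidence be recorded, not any particular tamper-evidence scheme), and the evidence values are invented.

```python
import hashlib
import json

# Event types taken from the minimum provenance events table.
EVENT_TYPES = {
    "component_registered", "component_verified", "component_approved",
    "component_deployed", "component_used", "component_changed",
    "component_flagged", "component_retired",
}


def _hash(body: dict) -> str:
    """Hash a canonical JSON rendering of an event body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class ProvenanceLog:
    """Append-only log in which each event commits to the previous event's hash."""

    def __init__(self):
        self.events = []

    def append(self, event_type: str, component_id: str, evidence: dict) -> str:
        if event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {event_type}")
        prev_hash = self.events[-1]["hash"] if self.events else "genesis"
        body = {
            "event_type": event_type,
            "component_id": component_id,
            "evidence": evidence,
            "prev_hash": prev_hash,
        }
        event_hash = _hash(body)
        self.events.append({**body, "hash": event_hash})
        return event_hash

    def verify_chain(self) -> bool:
        """Recompute every hash; any edited or reordered event breaks the chain."""
        prev_hash = "genesis"
        for event in self.events:
            body = {k: v for k, v in event.items() if k != "hash"}
            if event["prev_hash"] != prev_hash or event["hash"] != _hash(body):
                return False
            prev_hash = event["hash"]
        return True


log = ProvenanceLog()
log.append("component_registered", "rag-fee-policy", {"owner": "policy-ops", "version": "2026q2-v17"})
log.append("component_approved", "rag-fee-policy", {"approver": "compliance-lead", "risk_tier": "high"})
print(log.verify_chain())
```

Chaining is one possible way to make the log useful as audit evidence: an auditor can check that no event was altered after the fact without trusting the team that operates the registry.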
7. Financial Retail Case: Card Dispute Copilot
Use case: employee Copilot helps explain credit card disputes, annual fees and provisional credits.
| Component | Risk if missing from BOM |
|---|---|
| foundation model route | silent model change affects tone, refusal and citation behavior |
| fee policy corpus snapshot | stale policy causes wrong customer promise |
| dispute SOP prompt pack | unapproved prompt weakens compliance language |
| transaction summary tool | tool permission drift exposes too much customer data |
| MCP case connector | connector update changes write behavior in CRM |
| eval set | regression no longer covers fee waiver edge cases |
| human QA vendor | reviewer rubric drift hides systematic errors |
| trace store | audit cannot reconstruct disputed AI-assisted answer |
Impact query example:
SELECT workflow_id, output_id, customer_case_id, trace_id
FROM ai_component_usage
WHERE component_id = 'rag-corpus-card-fee-policy-2026q2'
AND component_version = 'v17'
AND generated_at >= '2026-06-01';
This query turns a corpus issue into scoped response instead of broad panic.
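To try the impact query without a real trace store, it can be run against an in-memory SQLite table. This is a sketch under assumptions: the column types and the three sample rows are invented for illustration, and `generated_at` is stored as ISO-8601 text so that string comparison orders dates correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: only the columns referenced by the impact query.
conn.execute(
    """CREATE TABLE ai_component_usage (
           workflow_id TEXT, output_id TEXT, customer_case_id TEXT, trace_id TEXT,
           component_id TEXT, component_version TEXT, generated_at TEXT)"""
)

corpus = "rag-corpus-card-fee-policy-2026q2"
rows = [
    # In scope: flagged version, generated after the cutoff.
    ("card-dispute-copilot", "o-1001", "c-77", "t-1", corpus, "v17", "2026-06-03"),
    # Out of scope: flagged version, but generated before the cutoff.
    ("card-dispute-copilot", "o-0990", "c-70", "t-0", corpus, "v17", "2026-05-28"),
    # Out of scope: earlier corpus version.
    ("card-dispute-copilot", "o-1002", "c-78", "t-2", corpus, "v16", "2026-06-04"),
]
conn.executemany("INSERT INTO ai_component_usage VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

affected = conn.execute(
    """SELECT workflow_id, output_id, customer_case_id, trace_id
       FROM ai_component_usage
       WHERE component_id = ?
         AND component_version = ?
         AND generated_at >= ?""",
    (corpus, "v17", "2026-06-01"),
).fetchall()

print(affected)
```

Only one of the three sample outputs is in scope, which is the practical meaning of a scoped response: the remediation list names specific cases and traces.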
8. PM / BA / Architect Checklist
| Role | Checklist |
|---|---|
| PM | Define release-gated components, vendor-change exposure, customer-promise impact and concentration risk. |
| BA | Trace business rules to prompt, corpus, tool, eval and human-review components; write acceptance criteria for version evidence. |
| Architect | Design registry, signed manifests, provenance events, impact graph, gateway integration, rollback and exit patterns. |
| Security | Verify package, connector, MCP server, tool token, signature, sandbox, vulnerability and incident routing controls. |
| Model risk | Connect AI BOM to model inventory while keeping broader system-component traceability. |
| Procurement | Map suppliers and subprocessors to component classes, data exposure, rights, notice and exit obligations. |
9. Code-Lite Experiment
Create a small JSON AI BOM for one workflow, then run three response queries manually.
{"workflow_id":"card-dispute-copilot","release":"2026.06.30","components":[
{"id":"model-primary","class":"model","version":"2026-06","risk_tier":"high"},
{"id":"prompt-dispute-sop","class":"prompt","version":"v9","signature":"sha256:8f91"},
{"id":"rag-fee-policy","class":"rag_corpus","version":"2026q2-v17","owner":"policy-ops"},
{"id":"tool-transaction-summary","class":"tool","version":"v4","permission":"read_minimized_summary"},
{"id":"eval-dispute-golden","class":"eval_dataset","version":"2026q2-v3","sensitivity":"synthetic_plus_masked"}]}
Queries:
- Which components touch customer data?
- Which components have licenses or data rights restrictions?
- If `rag-fee-policy` is stale, which workflows, eval sets and outputs are affected?
Expected learning: AI BOM becomes useful only when it supports response queries, not just inventory display.
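The three queries can also be scripted once the manual pass is done. The sketch below loads the BOM from the experiment unchanged. Because that JSON carries no data-exposure, rights or dependency fields, the `ANNOTATIONS` and `DEPENDS_ON` tables are hypothetical additions with invented values, included only to make the queries answerable.

```python
import json

# The AI BOM from the experiment, unchanged.
AIBOM_JSON = """
{"workflow_id":"card-dispute-copilot","release":"2026.06.30","components":[
{"id":"model-primary","class":"model","version":"2026-06","risk_tier":"high"},
{"id":"prompt-dispute-sop","class":"prompt","version":"v9","signature":"sha256:8f91"},
{"id":"rag-fee-policy","class":"rag_corpus","version":"2026q2-v17","owner":"policy-ops"},
{"id":"tool-transaction-summary","class":"tool","version":"v4","permission":"read_minimized_summary"},
{"id":"eval-dispute-golden","class":"eval_dataset","version":"2026q2-v3","sensitivity":"synthetic_plus_masked"}]}
"""

# Hypothetical annotations (assumption): not part of the BOM above.
ANNOTATIONS = {
    "model-primary": {"touches_customer_data": True, "rights_restriction": "provider data-use terms"},
    "prompt-dispute-sop": {"touches_customer_data": False, "rights_restriction": None},
    "rag-fee-policy": {"touches_customer_data": False, "rights_restriction": "internal policy documents"},
    "tool-transaction-summary": {"touches_customer_data": True, "rights_restriction": None},
    "eval-dispute-golden": {"touches_customer_data": True, "rights_restriction": "masked production samples"},
}

# Hypothetical dependency edges (assumption): the eval set was built from the corpus.
DEPENDS_ON = {"eval-dispute-golden": ["rag-fee-policy"]}

bom = json.loads(AIBOM_JSON)
component_ids = [c["id"] for c in bom["components"]]

# Query 1: which components touch customer data?
customer_data = [cid for cid in component_ids if ANNOTATIONS[cid]["touches_customer_data"]]

# Query 2: which components have license or data rights restrictions?
restricted = [cid for cid in component_ids if ANNOTATIONS[cid]["rights_restriction"]]


def stale_impact(component_id: str) -> dict:
    """Query 3: workflows and dependent components affected by a stale component."""
    workflows = [bom["workflow_id"]] if component_id in component_ids else []
    dependents = [cid for cid in component_ids if component_id in DEPENDS_ON.get(cid, [])]
    return {"workflows": workflows, "dependent_components": dependents}


print(customer_data)
print(restricted)
print(stale_impact("rag-fee-policy"))
```

Affected outputs cannot be answered from the static BOM alone; that part of the third query needs runtime usage records such as the `ai_component_usage` table in the card dispute case.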
10. Interview Questions
Q1: How is AI BOM different from SBOM?
SBOM tracks software components and dependencies. AI BOM includes software but adds models, datasets, RAG corpora, prompts, tools, MCP servers, eval datasets, human review vendors, telemetry stores, licenses, provenance, signatures and runtime usage.
Q2: How is AI BOM different from vendor risk?
Vendor risk assesses supplier lifecycle, contract and oversight. AI BOM is component-level traceability: which component version influenced which workflow and what to do when that component changes.
Q3: Why is SR 26-2 not enough by itself?
SR 26-2 anchors current model risk expectations, but a formal model inventory alone can miss RAG corpora, prompts, tools, connectors, eval sets and human operations. AI BOM complements model inventory with broader system provenance.
11. Common Pitfalls
| Pitfall | Better pattern |
|---|---|
| Treating AI BOM as a spreadsheet | Integrate with runtime trace and release gates |
| Recording only model name | Record model, prompt, corpus, tool, eval and human operation versions |
| Ignoring licenses | Add model/data/package/output-rights review before deployment |
| No signed manifests | Require checksum/signature for prompt packs, tool manifests and corpus snapshots |
| Confusing vendor inventory with AI BOM | Link them, but keep component-level traceability |
| No response workflow | Define owner, severity, impact query, regression and rollback path |
| No exit view | Track exportability, replacement option and concentration score per component |