AI Traceability Requirements-Eval-Control Graph Playbook
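Audience: senior AI PM / AI BA / Product Architect / Solutions Architect / Enterprise Architect / EvalOps Lead / AI Governance / Model Risk / Internal Audit / leaders of AI transformation in retail financial services.
Purpose: upgrade the traditional BA practice of requirements traceability into an AI-system graph of requirement -> eval -> control -> architecture decision -> implementation/config -> telemetry -> incident -> evidence, for use in product design, architecture governance, release gating, audit evidence and portfolio presentation.
Core idea: AI traceability is not about linking PRD items to test cases. It is about showing that an AI capability can be measured, controlled, held accountable and continuously governed for a specific business outcome, risk boundary, system version and body of operating evidence.
How to use: for every high-impact AI use case, maintain at least one Traceability Graph Table, one Coverage Matrix, a set of Evidence Queries, one Release Decision Memo, a set of Audit Q&A and one Portfolio Evidence Pack.
Important note: this document is learning, portfolio and governance design material. It is not legal advice, an audit opinion, a model validation conclusion or a regulatory interpretation. Formal projects must be confirmed by Legal, Compliance, Risk, Model Risk, Internal Audit, Security, Privacy, the Data Owner and business management for the applicable jurisdiction.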
1. Source Anchors
The official sources below serve as method anchors for the traceability graph. This document turns them into requirements engineering, eval gates, control evidence, architecture governance and audit inquiry assets for retail financial services AI projects. Access date recorded as 2026-06-29.
| Anchor | Official source | How this document uses it |
|---|---|---|
| W3C PROV | https://www.w3.org/TR/prov-overview/ | Borrows the Entity / Activity / Agent provenance model to organize the lineage of AI requirements, eval runs, release decisions, telemetry, incidents and evidence. |
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Uses Govern / Map / Measure / Manage to organize traceability across AI risk context, evals, controls, decisions and continuous improvement. |
| ISO/IEC 42001 | https://www.iso.org/standard/81230.html | Takes the AI management system view to turn scope, roles, operation, performance evaluation, improvement and management review into evidence objects. |
| OpenTelemetry | https://opentelemetry.io/docs/ | Uses traces, metrics, logs and attributes to design production telemetry so that live behavior links back to requirements, evals, controls and incidents. |
Standards-to-artifacts:
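| Source lens | Traceability artifact | Senior-level framing |
|---|---|---|
| W3C PROV | lineage model, evidence provenance, change impact map | "I don't just store evidence; I show who generated it, in which activity, and from which entities." |
| NIST AI RMF | risk-to-eval-to-control traceability, release gate, monitoring gate | "I place each AI requirement inside the Govern / Map / Measure / Manage loop instead of closing it out as a functional requirement." |
| ISO/IEC 42001 | AIMS evidence map, ownership matrix, management review pack | "I use management-system language to show that an AI capability has scope, accountability, operational control, performance evaluation and continuous improvement." |
| OpenTelemetry | trace attribute spec, production evidence query, incident replay | "I define observability fields at the architecture stage rather than relying on after-the-fact screenshots to prove AI behavior." |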
2. One-Sentence Positioning
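AI Traceability Graph = a queryable, reviewable and governable evidence graph that connects business goals, AI requirements, eval contracts, control activities, architecture decisions, configuration versions, production telemetry, incident postmortems and audit evidence.
Minimal chain: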
Business outcome
-> stakeholder concern
-> AI requirement
-> eval question
-> eval case / metric / threshold
-> control objective / control activity
-> ADR
-> implementation / prompt / RAG / tool / policy config
-> telemetry signal
-> incident / exception / change
-> evidence artifact
-> release or assurance decision
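A senior BA's differentiation is not whether they can draw a traceability matrix, but whether they can answer:
- Why this AI requirement serves a specific business outcome and risk outcome.
- How the requirement is proven by an eval contract rather than by a demo.
- Which controls reduce the risk of AI misuse, hallucination, privilege overreach, data leakage, bias and over-reliance.
- Which ADR explains the key choices on model, RAG, tools, logging, fallback and human oversight.
- Which configuration versions are actually running in production.
- Which telemetry shows that AI behavior is still within bounds.
- Which incident or exception affects which requirements, evals, controls and evidence.
- How to trace from an audit or regulatory inquiry to the evidence path.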
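3. Why AI Requirements Cannot Stop at the User Story
A user story is good at expressing user goals and interaction intent, but the real risk of an AI system usually sits outside the story text: in probabilistic behavior, data sources, tool permissions, model versions, human controls, runtime drift and evidence gaps.
3.1 Where user stories break down
| Traditional wording | Apparent value | Gap for AI projects |
|---|---|---|
| As an analyst, I want AI to summarize AML cases, so that I can work faster. | Has role, function and value | Does not say which evidence the key facts must come from, which conclusions must not be generated, which outputs need human confirmation, or how missed red flags are evaluated. |
| As a customer service agent, I want AI to answer policy questions. | Has a workflow insertion point | Does not define policy version, citation, unauthorized commitment, complaint escalation, customer impact or production monitoring. |
| As a lender, I want AI to draft credit memos. | Has an assisted-writing scenario | Does not define the fair lending boundary, protected class exclusion, reason codes, human decision, audit log or override evidence. |
| As a product owner, I want a dashboard for AI quality. | Has a management view | Does not define how quality relates to requirements, evals, release gates, monitoring signals or incident severity. |
The user story can stay, but only as one node in the graph. The AI requirement must be further decomposed into: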
User story
-> decision boundary
-> data and knowledge boundary
-> behavior requirement
-> risk requirement
-> eval requirement
-> control requirement
-> telemetry requirement
-> evidence requirement
3.2 Weak vs strong requirements
| Weak requirement | Problem | Strong requirement |
|---|---|---|
| AI must provide accurate answers. | No sample, denominator, risk tier or failure definition | In the credit card fee policy scenario, agent-facing answers must cite an approved policy source id and effective date; an unauthorized fee waiver commitment is a critical failure with a target of 0. |
| AI should cite sources. | Unclear whether the source is valid or supports the conclusion | Every material factual claim must link to at least one active source id; the citation audit checks source existence, source freshness, claim support and entitlement. |
| AI should be safe. | The control objective is not testable | The AI must not execute customer-impacting actions; every tool call is controlled by an allowlist, role entitlement, dry-run validation and human approval. |
| AI should be monitored after release. | The monitoring target is unclear | Production telemetry must record requirement_id, eval_contract_id, model_version, prompt_version, kb_version, tool_name, decision_boundary, risk_tier and escalation_result. |
| AI output should be reviewed by humans. | Human review can become a formality | Before a high-risk output is saved to the system of record, the system must record reviewer_id, review_outcome, edit_diff_hash, approval_timestamp and escalation_reason. |
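The four citation audit checks named in the strong requirement above (source existence, freshness, claim support, entitlement) can be sketched as one function. This is a minimal sketch under stated assumptions: the `PolicySource` registry shape, the source ids and the role names are hypothetical, and `claim_supported` is assumed to come from a reviewer or grader rather than being computed here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class PolicySource:
    """One entry in a hypothetical approved-source registry."""
    source_id: str
    status: str                      # "active" or "retired"
    effective_from: date
    effective_to: Optional[date]     # None means no end date yet
    allowed_roles: FrozenSet[str]    # roles entitled to see this source


def audit_citation(cited_id: str, registry: Dict[str, PolicySource],
                   as_of: date, role: str, claim_supported: bool) -> str:
    """Return one citation status for a single material factual claim."""
    source = registry.get(cited_id)
    if source is None:
        return "missing"             # source existence check failed
    if role not in source.allowed_roles:
        return "unauthorized"        # entitlement check failed
    in_window = source.effective_from <= as_of and (
        source.effective_to is None or as_of <= source.effective_to)
    if source.status != "active" or not in_window:
        return "stale"               # freshness check failed
    if not claim_supported:
        return "non_supporting"      # source is valid but does not back the claim
    return "supported"


# Hypothetical registry entries for illustration only.
registry = {
    "FEE-POLICY-V8": PolicySource("FEE-POLICY-V8", "active", date(2026, 6, 1),
                                  None, frozenset({"agent", "supervisor"})),
    "FEE-POLICY-V7": PolicySource("FEE-POLICY-V7", "retired", date(2025, 1, 1),
                                  date(2026, 5, 31), frozenset({"agent"})),
}
today = date(2026, 6, 18)
print(audit_citation("FEE-POLICY-V8", registry, today, "agent", True))
print(audit_citation("FEE-POLICY-V7", registry, today, "agent", True))
```

The order of the checks is a design choice: entitlement is tested before freshness so that a caller without access learns nothing about a source's status.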
3.3 从 CBAP Traceability 到 AI Traceability
This document does not repeat the basics of the CBAP requirements life cycle. The upgrades are:
| CBAP traceability | AI traceability upgrade |
|---|---|
| Business requirement -> stakeholder requirement -> solution requirement | Business outcome -> AI behavior contract -> eval contract -> control objective -> production evidence |
| Requirement -> design -> test case | Requirement -> dataset slice -> metric -> threshold -> critical failure -> release gate |
| Change impact analysis | Prompt/model/RAG/tool/policy change -> impacted eval cases -> impacted controls -> impacted telemetry -> impacted evidence |
| Requirements status | Requirement state + eval state + control state + monitoring state + incident state |
| Acceptance criteria | Risk-tiered evaluation, hard blockers, residual risk acceptance, monitoring triggers |
| Traceability matrix | Queryable graph with version, owner, provenance, freshness and decision impact |
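One-line interview statement:
Traditional traceability proves that a requirement was implemented and tested. AI traceability must also prove that behavior is evaluated, controlled, monitored, reviewed, changed and audited within the risk boundary. That is how CBAP skills upgrade for AI systems.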
4. Traceability Graph Nodes
More nodes do not make a better traceability graph; every node must answer a governance question. The taxonomy below suits retail financial services AI projects.
| Node type | Core question | Minimum fields |
|---|---|---|
| Business Outcome | Which business or risk outcome improves | outcome_id, baseline, target, owner, measurement_method, risk_constraint |
| Stakeholder Concern | Who is worried about what | concern_id, stakeholder, concern, decision_needed, severity |
| AI Requirement | How the AI must behave | requirement_id, requirement_type, allowed_behavior, forbidden_behavior, risk_tier, owner |
| Assumption | What the requirement depends on | assumption_id, assumption, validation_method, expiry_condition |
| Risk | What impact a failure causes | risk_id, risk_event, impact, likelihood, severity, affected_party |
| Eval Question | What the eval must answer | eval_question_id, question, linked_requirement, decision_use |
| Eval Case | Which samples provide the proof | eval_case_id, scenario, dataset_slice, expected_behavior, severity |
| Metric | Which signal drives the judgment | metric_id, definition, denominator, threshold, hard_stop, slice |
| Eval Run | Which run produced the result | eval_run_id, version_set, dataset_version, result, failed_cases, reviewer |
| Control Objective | What state the risk is reduced to | control_objective_id, risk_id, objective, assurance_claim |
| Control Activity | How the control actually operates | control_activity_id, preventive_detective_corrective, system_rule, manual_step, owner |
| ADR | Why the architecture was chosen | adr_id, decision, alternatives, control_impact, tradeoff, approval |
| Implementation Component | Which component delivers the requirement | component_id, service, workflow_step, interface, repository_or_system |
| Configuration | Which version is running | config_id, model_version, prompt_version, kb_version, tool_schema_version, policy_version |
| Telemetry Signal | What to watch in production | signal_id, trace_attribute, metric_or_log, threshold, sampling_rule, retention |
| Incident / Exception | Which failure or exception changes state | incident_id, severity, linked_signal, affected_requirement, remediation, decision |
| Evidence Artifact | Which artifact provides the proof | evidence_id, artifact_type, source_system, version, generated_at, reviewer, retention |
| Decision Record | Who decided on what basis | decision_id, decision_type, go_limited_no_go, approver, conditions, expiry |
| Owner / Agent | Who is accountable or who generated it | agent_id, role, accountability, review_cadence |
4.1 Node metadata standard
Every high-value node carries at least:
| Metadata | Requirement |
|---|---|
| Unique ID | Referenceable across the PRD, ADR, eval report, control matrix, telemetry and audit binder. |
| Version | Managed separately from the system, model, prompt, knowledge base and policy versions. |
| Owner | Business owner, control owner, evidence owner and technical owner are kept distinct. |
| State | draft, approved, active, restricted, retired, superseded, failed, accepted_exception. |
| Effective window | Applicable dates, version window, release scope and review date. |
| Risk tier | low, medium, high or critical, with the rationale stated. |
| Evidence quality | freshness, coverage, independence, reproducibility, retention. |
| Change trigger | Which changes force re-review of the node or invalidate its evidence. |
4.2 W3C PROV mapping
Thinking in PROV terms keeps the evidence chain from getting tangled:
| PROV concept | AI traceability object | Example |
|---|---|---|
| Entity | requirement, dataset, prompt, model, policy, eval report, release memo, log extract | REQ-CS-014, KB-RETAIL-POLICY-2026-06, EV-EVAL-2026-06-18 |
| Activity | elicitation, risk assessment, eval run, release review, incident triage, control retest | EVAL-RUN-2026-06-18, REL-GATE-2026-06-20 |
| Agent | product owner, BA, architect, model risk reviewer, compliance approver, service account | AI Product Owner, Model Risk Reviewer, EvalOps Pipeline |
In one sentence:
Evidence is credible when the graph can show which activity generated it, which entity it used, which agent approved it, and which decision it supported.
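The credibility test in the sentence above can be sketched as a provenance gap check. This is an illustrative sketch, not a PROV implementation: `wasGeneratedBy`, `used` and `wasAssociatedWith` are PROV relation names, while the `approved_by` and `supports` fields, the dict layout and the `EV-SCREENSHOT-01` record are assumptions made for this example.

```python
# Activities keyed by id; relation names follow PROV.
activities = {
    "EVAL-RUN-2026-06-18": {
        "used": ["REQ-CS-014", "KB-RETAIL-POLICY-2026-06"],
        "wasAssociatedWith": "EvalOps Pipeline",
    },
}

# Evidence entities; approved_by and supports are hypothetical fields.
evidence = {
    "EV-EVAL-2026-06-18": {
        "wasGeneratedBy": "EVAL-RUN-2026-06-18",
        "approved_by": "Model Risk Reviewer",
        "supports": "REL-GATE-2026-06-20",
    },
    # A screenshot dropped into a folder with no lineage at all.
    "EV-SCREENSHOT-01": {
        "wasGeneratedBy": None,
        "approved_by": None,
        "supports": None,
    },
}


def provenance_gaps(evidence_id):
    """List which of the four credibility links are missing."""
    record = evidence[evidence_id]
    gaps = []
    activity = activities.get(record.get("wasGeneratedBy"))
    if activity is None:
        gaps.append("generating activity")
    elif not activity.get("used"):
        gaps.append("used entities")
    if not record.get("approved_by"):
        gaps.append("approving agent")
    if not record.get("supports"):
        gaps.append("supported decision")
    return gaps


for evidence_id in evidence:
    print(evidence_id, provenance_gaps(evidence_id))
```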
5. Traceability Graph Edges
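Edges carry the governance value of the graph. Nodes are only an inventory of objects; edges answer "why is this related".
| Edge type | Meaning | Typical query |
|---|---|---|
| derives_from | a requirement comes from an outcome or stakeholder concern | Which requirements support this business outcome? |
| refines | a high-level requirement is refined into behavior, data, eval and control requirements | Which AI-specific sub-requirements does this PRD requirement have? |
| constrains | a policy, risk appetite or decision boundary limits a requirement | Which constraints prevent this AI from deciding automatically? |
| verifies | an eval case, metric or test verifies a requirement or control | Which evals prove this requirement? |
| mitigates | a control objective reduces a risk | Which controls reduce this risk? |
| implements | a component or config implements a requirement or control | Which service, prompt or tool schema implements this control? |
| decided_by | an ADR or release memo explains a choice | Why RAG rather than fine-tuning? |
| emits | a component produces a telemetry signal | Which production signals show this requirement is operating? |
| triggers | a signal triggers an incident, review, rollback or eval refresh | Which metrics stop the release when they breach a threshold? |
| supports | evidence supports a claim, test, control or decision | Which release claim does this evidence support? |
| approved_by | a decision is approved by an agent | Who accepted the residual risk, and until when? |
| supersedes | a new version replaces an old one | Which old evidence is invalidated by the new prompt? |
| impacted_by | a change affects a requirement, eval, control, telemetry or evidence | Which gates does a model upgrade affect? |
| observed_in | an incident or production trace exhibits a failure mode | Which production failures are already in the regression dataset? |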
5.1 Edge cardinality rules
| Rule applies to | Quality standard |
|---|---|
| High-risk AI requirement | Connects to at least one risk, one eval question, one metric, one control objective and one evidence artifact. |
| Critical risk | Connects to at least one preventive control, one detective control, one incident response trigger and one release blocker. |
| Release decision | Must connect to an eval run, control evidence, residual risk, approver and expiry/review condition. |
| Production incident | Must link back to the affected requirement, affected config, telemetry signal, evidence update and regression eval case. |
| ADR that changes model, RAG, tool, logging, fallback or human oversight | Must connect to the impacted requirements and controls. |
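The first cardinality rule (what a high-risk AI requirement must connect to) lends itself to an automated check. The sketch below makes simplifying assumptions: edges are `(source, edge_type, target)` tuples, only direct neighbors count, direction is ignored, and `EQ-CS-001` and `REQ-CS-099` are hypothetical ids added for illustration.

```python
# Node types a high-risk requirement must be linked to under the rule.
REQUIRED_NEIGHBOR_TYPES = {
    "risk", "eval_question", "metric", "control_objective", "evidence",
}


def missing_links(requirement_id, node_types, edges):
    """Return the required node types this requirement is not connected to."""
    linked = set()
    for source, _edge_type, target in edges:
        if source == requirement_id:
            linked.add(node_types[target])
        elif target == requirement_id:
            linked.add(node_types[source])
    return sorted(REQUIRED_NEIGHBOR_TYPES - linked)


node_types = {
    "REQ-CS-014": "requirement",
    "REQ-CS-099": "requirement",       # hypothetical, poorly linked
    "RISK-CS-006": "risk",
    "EQ-CS-001": "eval_question",      # hypothetical id
    "MET-CS-005": "metric",
    "CTRL-RAG-003": "control_objective",
    "EV-CS-021": "evidence",
}
edges = [
    ("RISK-CS-006", "constrains", "REQ-CS-014"),
    ("EQ-CS-001", "verifies", "REQ-CS-014"),
    ("MET-CS-005", "verifies", "REQ-CS-014"),
    ("REQ-CS-014", "mitigated_by", "CTRL-RAG-003"),
    ("EV-CS-021", "supports", "REQ-CS-014"),
    ("RISK-CS-006", "constrains", "REQ-CS-099"),
]

print(missing_links("REQ-CS-014", node_types, edges))
print(missing_links("REQ-CS-099", node_types, edges))
```

A fuller version would also follow indirect paths, for example a metric reached through an eval case, which this sketch deliberately leaves out.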
5.2 Edge anti-patterns
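| Anti-pattern | Risk | Fix |
|---|---|---|
| Requirement links only to a Jira ticket | Risk and evidence cannot be demonstrated | Add eval, control, ADR, telemetry and evidence edges. |
| Eval report links only to the release memo | Cannot trace to specific requirements | Connect every key metric to its requirement, risk, slice and threshold. |
| Control matrix links only to policy clauses | Controls are disconnected from the system implementation | Connect control activities to the component, config, log field and operating evidence. |
| Incident lives only in the ops system | Failures do not improve requirements or evals | Connect the incident to the failed requirement, root cause, regression case and release condition. |
| Evidence is stored only by folder | Cannot be located during an inquiry | Connect evidence to its claim, control, version, owner, generation activity and retention. |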
6. End-to-End Chain: Outcome to Evidence
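The chain below is the core template of this document. It puts requirements engineering, evals, controls, architecture, production operation and evidence governance in one graph.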
Business outcome
-> Requirement
-> Eval
-> Control
-> ADR
-> Implementation / Config
-> Telemetry
-> Incident / Exception
-> Evidence
| Step | Artifact | Key question | Retail financial services example |
|---|---|---|---|
| Business outcome | Outcome card | What are the business outcome and risk constraint | Agent policy lookup time down 30%, unauthorized commitments at 0. |
| Requirement | AI requirement card | What the AI may and may not do | The AI may draft answers, must cite valid policy, and hands off to a human on complaint, fraud or credit-commitment intents. |
| Eval | Eval contract | How to prove the requirement is ready for release | Evaluate groundedness, escalation and forbidden commitment on fee, dispute, credit limit, complaint and fraud slices. |
| Control | Control matrix | Which controls reduce the risk | RAG source approval, forbidden action classifier, HITL escalation, QA sampling. |
| ADR | Architecture decision | Why it is designed this way | Use RAG + policy registry and do not fine-tune policy knowledge; tool actions are read-only. |
| Implementation / Config | Version registry | Which versions are in production | model gpt-4.1-prod, prompt cs-p12, kb retail-policy-2026-06, policy fee-v8. |
| Telemetry | Trace/log/metric spec | How to observe it in production | Every answer records requirement_id, citation_status, escalation_result, policy_version and risk_tier. |
| Incident / Exception | Incident record | How a failure is closed out | One wrong fee commitment found; freeze the affected intent, update the policy guardrail, add the case to the regression set. |
| Evidence | Evidence binder item | How to prove it to audit | eval report, release memo, source registry, trace sample, incident RCA, regression pass report. |
6.1 Release gate view
| Gate question | Graph query |
|---|---|
| Do all high-risk requirements have eval coverage? | Find requirements with risk_tier in high/critical and no verifies edge. |
| Are all critical failures closed or covered by a no-go decision? | Query failed eval cases where severity = critical and release_decision not in no-go / restricted / remediated. |
| Does production telemetry cover the release claims? | Query release claims without emitted telemetry signals or an evidence retention rule. |
| Does a given model upgrade affect control evidence? | From the model_version change, follow impacted_by edges to requirements, eval cases, controls and evidence artifacts. |
| Can it enter limited release? | Aggregate eval pass, control evidence quality, residual risk acceptance, monitoring readiness and stop rules. |
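The model-upgrade question in the table above is a graph walk. A minimal breadth-first sketch follows, assuming edges are `(source, edge_type, target)` tuples, that the changed config is the source of its `impacted_by` edge, and that every outgoing edge from an impacted node propagates impact. `CFG-MODEL-01` is a hypothetical id; a real graph would likely filter by edge type.

```python
from collections import deque


def impacted_nodes(changed_id, edges):
    """Breadth-first walk over outgoing edges, starting at the changed node."""
    outgoing = {}
    for source, _edge_type, target in edges:
        outgoing.setdefault(source, []).append(target)

    seen = {changed_id}
    queue = deque([changed_id])
    order = []                      # impacted nodes in discovery order
    while queue:
        current = queue.popleft()
        for target in outgoing.get(current, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


edges = [
    ("CFG-MODEL-01", "impacted_by", "REQ-CS-014"),    # hypothetical config node
    ("REQ-CS-014", "verifies", "EVAL-CS-021"),
    ("REQ-CS-014", "mitigated_by", "CTRL-RAG-003"),
    ("EVAL-CS-021", "supports", "EV-CS-021"),
    ("REQ-CS-022", "verifies", "EVAL-CS-033"),        # unrelated branch
]

print(impacted_nodes("CFG-MODEL-01", edges))
```

The `seen` set keeps the walk finite even if the graph contains cycles, which is easy to introduce once `supersedes` and `impacted_by` edges accumulate.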
6.2 Audit view
| Examiner question | Graph path |
|---|---|
| Does this AI make final customer-impacting decisions? | question -> claim -> decision boundary requirement -> tool control -> ADR -> permission test -> audit log sample -> release memo. |
| How do you prove that policy answers used valid sources? | question -> RAG requirement -> source registry control -> citation eval -> production trace sample -> evidence binder. |
| Who approved the residual risk? | question -> residual risk -> release decision -> approver agent -> decision record -> expiry condition. |
| Has remediation for a given incident been completed? | question -> incident -> root cause -> remediation control -> regression eval -> production monitoring -> closure evidence. |
7. Graph Architecture and Operating Model
A traceability graph does not have to start as a graph database. The mature approach is to establish ID, edge, owner, evidence and query discipline first, then decide on the tooling.
7.1 Maturity levels
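| Level | Form | Suitable stage | Risk |
|---|---|---|---|
| L1 Spreadsheet Matrix | Nodes and edges managed in Excel/Sheets/Markdown tables | Portfolio, PoC, single-use-case pilot | Prone to version drift; limited query capability. |
| L2 Linked Artifacts | PRD, ADR, eval report, control matrix and evidence index share unified IDs | Multi-team release | Requires strong documentation discipline and a reviewer mechanism. |
| L3 Metadata Registry | Use-case inventory, model registry, prompt registry, dataset registry and evidence registry are linked | Multi-use-case governance | Requires a governance owner and data quality rules. |
| L4 Queryable Graph | Graph database or GRC/SDLC tool integration supporting impact analysis and audit queries | High-impact AI portfolio | Requires schema governance, access control and change management. |
| L5 Runtime-Connected Graph | OpenTelemetry, incident, CI/CD and eval pipelines write back to traceability automatically | Production-grade AI platform | Requires platform engineering investment and strong security controls. |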
7.2 Minimum viable schema
| Table | Purpose | Key fields |
|---|---|---|
| trace_nodes | Manages all nodes | node_id, node_type, title, owner, risk_tier, state, version, effective_from, effective_to |
| trace_edges | Manages relationships | source_node_id, edge_type, target_node_id, rationale, created_by, created_at, confidence |
| trace_evidence | Manages evidence | evidence_id, artifact_type, source_system, version, generated_by, reviewed_by, retention, quality_score |
| trace_decisions | Manages gates and risk acceptance | decision_id, decision_type, scope, result, conditions, approver, expiry, evidence_refs |
| trace_changes | Manages impact analysis | change_id, change_type, affected_versions, impacted_nodes, required_regression, decision |
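To try the schema without any tooling decision, the first two tables can be stood up in an in-memory SQLite database. This is a sketch under assumptions: the column types and constraints are guesses (the table above lists field names only), and `REQ-CS-099` is a hypothetical unmapped requirement added so the coverage query has something to find.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trace_nodes (
    node_id        TEXT PRIMARY KEY,
    node_type      TEXT NOT NULL,
    title          TEXT NOT NULL,
    owner          TEXT,
    risk_tier      TEXT,
    state          TEXT,
    version        TEXT,
    effective_from TEXT,
    effective_to   TEXT
);
CREATE TABLE trace_edges (
    source_node_id TEXT NOT NULL REFERENCES trace_nodes(node_id),
    edge_type      TEXT NOT NULL,
    target_node_id TEXT NOT NULL REFERENCES trace_nodes(node_id),
    rationale      TEXT,
    created_by     TEXT,
    created_at     TEXT,
    confidence     REAL
);
""")

conn.executemany(
    "INSERT INTO trace_nodes (node_id, node_type, title, owner, risk_tier) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("REQ-CS-014", "requirement", "Cite active policy", "AI BA Lead", "high"),
        ("REQ-CS-099", "requirement", "Hypothetical unmapped requirement",
         "AI BA Lead", "critical"),
        ("EVAL-CS-021", "eval_case", "Fee waiver golden set", "EvalOps Lead", "high"),
    ],
)
conn.execute(
    "INSERT INTO trace_edges (source_node_id, edge_type, target_node_id) "
    "VALUES (?, ?, ?)",
    ("REQ-CS-014", "verifies", "EVAL-CS-021"),
)

# High-risk requirements that have no outgoing 'verifies' edge.
uncovered = conn.execute("""
    SELECT r.node_id
    FROM trace_nodes r
    LEFT JOIN trace_edges e
      ON e.source_node_id = r.node_id AND e.edge_type = 'verifies'
    WHERE r.node_type = 'requirement'
      AND r.risk_tier IN ('high', 'critical')
      AND e.target_node_id IS NULL
""").fetchall()
print(uncovered)
```

SQLite is used here only because it ships with Python; the same two tables map directly onto a spreadsheet at the L1 maturity level or onto a graph store later.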
7.3 OpenTelemetry instrumentation discipline
Production telemetry must link back to the graph; otherwise the operating evidence chain breaks.
Suggested core attributes:
| Attribute | Purpose |
|---|---|
| ai.use_case_id | Links to the AI inventory and business outcome. |
| ai.requirement_id | Links a production trace to its requirement. |
| ai.eval_contract_id | Shows whether the production behavior has a pre-release eval contract. |
| ai.risk_tier | Supports sampling, alerting and retention policy for high-risk scenarios. |
| ai.model_version | Links to the model registry and release evidence. |
| ai.prompt_version | Links prompt changes to regression evals. |
| ai.kb_version | Links to the RAG source registry and citation audit. |
| ai.policy_version | Links to the policy effective date and answer validity. |
| ai.tool_name | Links to the tool allowlist and permission control. |
| ai.tool_decision | Records allowed, blocked, human_approval_required, executed. |
| ai.citation_status | Records supported, missing, stale, non_supporting, unauthorized. |
| ai.escalation_result | Records no_escalation, escalated_to_human, supervisor_review, blocked. |
| ai.output_hash | Supports replay while reducing exposure of sensitive plaintext. |
| ai.evidence_trace_id | Links log samples to the evidence binder. |
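A log completeness check over these attributes can be sketched with the standard library alone. The choice of which attributes are mandatory on every trace is an assumption here (tool attributes are left out because not every answer calls a tool), and `UC-CS-POLICY-COPILOT` is a hypothetical use case id. With the OpenTelemetry Python API, a dict of this shape could be attached to a span via `span.set_attributes(...)`.

```python
import hashlib

# Assumed mandatory subset of the attribute table above.
REQUIRED_ATTRIBUTES = (
    "ai.use_case_id", "ai.requirement_id", "ai.eval_contract_id",
    "ai.risk_tier", "ai.model_version", "ai.prompt_version",
    "ai.kb_version", "ai.policy_version", "ai.citation_status",
    "ai.escalation_result", "ai.output_hash",
)


def output_hash(text):
    """Hash the output so traces can be matched without storing plaintext."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def missing_attributes(attributes):
    """Return required attribute names that are absent or empty."""
    return [name for name in REQUIRED_ATTRIBUTES if not attributes.get(name)]


def completeness_rate(traces):
    """Share of traces carrying every required attribute."""
    if not traces:
        return 0.0
    complete = sum(1 for attributes in traces if not missing_attributes(attributes))
    return complete / len(traces)


full_trace = {
    "ai.use_case_id": "UC-CS-POLICY-COPILOT",     # hypothetical id
    "ai.requirement_id": "REQ-CS-014",
    "ai.eval_contract_id": "EVAL-CS-021",
    "ai.risk_tier": "high",
    "ai.model_version": "gpt-4.1-prod",
    "ai.prompt_version": "cs-p12",
    "ai.kb_version": "retail-policy-2026-06",
    "ai.policy_version": "fee-v8",
    "ai.citation_status": "supported",
    "ai.escalation_result": "no_escalation",
    "ai.output_hash": output_hash("Draft answer text shown to the agent."),
}
partial_trace = {k: v for k, v in full_trace.items() if k != "ai.kb_version"}

print(missing_attributes(partial_trace))
print(completeness_rate([full_trace, partial_trace]))
```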
7.4 RACI
| Role | Traceability responsibility |
|---|---|
| AI PM | Defines outcome, release decision, adoption metrics and the residual value/risk narrative. |
| AI BA | Builds the requirements, stakeholder concerns, workflow, decision boundary and traceability graph. |
| Architect | Defines ADRs, trust boundary, logging, data flow, tool permission, fallback and config lineage. |
| EvalOps Lead | Maintains the eval contract, dataset slices, eval runs, metrics and failed-case regression. |
| Control Owner | Defines control objective, control activity, test method, frequency and failure condition. |
| Evidence Owner | Maintains evidence artifacts, quality score, retention, reviewer and freshness. |
| Model Risk / Compliance | Reviews risk tier, evaluation sufficiency, control coverage, exceptions and release conditions. |
| Internal Audit | Examines traceability completeness, evidence quality, operating effectiveness and the management review trail. |
8. Financial Retail Case: Customer Service AI Policy Copilot
8.1 Use case boundary
| Dimension | Scope |
|---|---|
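| Use case | AI Policy Copilot for retail bank customer service agents, assisting with questions on credit card fees, dispute handling, account services and complaint escalation. |
| AI role | retrieve, summarize, draft, cite, classify high-risk intent. |
| AI does not | Send replies directly to customers, commit to fee waivers, approve credit, give legal conclusions or bypass supervisor escalation. |
| Users | Customer service agents, QA supervisors, knowledge base owner, product owner, compliance reviewer. |
| Data / knowledge | approved policy repository, fee schedule, dispute SOP, complaint escalation policy, account context with entitlement filtering. |
| Risk tier | High-impact assistive system, because it can affect customers' understanding of financial products, fee disputes, complaint escalation and customer rights. |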
8.2 Traceability graph sample
| Source node | Edge | Target node | Rationale |
|---|---|---|---|
| OUT-CS-001: Reduce policy lookup time by 30% without unauthorized commitment | derives_from | REQ-CS-014: AI answers must cite active policy and avoid fee waiver commitment | The efficiency goal must be bounded by customer-rights constraints. |
| REQ-CS-014 | verifies | EVAL-CS-021: Fee waiver and dispute response golden set | High-risk fee and dispute samples verify policy citation and forbidden commitments. |
| EVAL-CS-021 | uses_metric | MET-CS-005: Critical unauthorized commitment count = 0 | Any unauthorized commitment is a release blocker. |
| REQ-CS-014 | mitigated_by | CTRL-RAG-003: Approved source and citation control | The source registry and citation audit reduce the risk of wrong policy advice. |
| CTRL-RAG-003 | decided_by | ADR-CS-007: RAG over fine-tuning for policy freshness | Policy changes frequently, so source-level freshness and citability are needed. |
| ADR-CS-007 | implemented_by | CMP-CS-RAG-02: Policy retriever with entitlement filter | The system implements approved-source, active-status and entitlement filtering. |
| CMP-CS-RAG-02 | configured_by | CFG-CS-2026-06: kb retail-policy-2026-06, prompt cs-p12 | Pins the production versions. |
| CMP-CS-RAG-02 | emits | SIG-CS-009: citation_status by policy_version and intent | Production monitoring of whether citations exist, are stale or fail to support the conclusion. |
| SIG-CS-009 | triggers | INC-CS-2026-0611: Stale fee policy citation incident | A stale fee policy citation was found in production. |
| INC-CS-2026-0611 | observed_in | EVAL-CS-REG-003: Regression case for stale fee source | The incident sample enters the regression set. |
| EVAL-CS-REG-003 | supports | EV-CS-037: Regression pass report after source registry fix | Shows that the fix is effective. |
| EV-CS-037 | supports | DEC-CS-2026-0620: Limited release to 10% seats | Supports the staged-rollout decision. |
8.3 Coverage matrix
| Requirement | Risk | Eval coverage | Control coverage | Telemetry coverage | Evidence | Gate impact |
|---|---|---|---|---|---|---|
| REQ-CS-014 active policy citation | Wrong or stale policy misleads customer | Fee/dispute/complaint golden set, stale-source red team | Approved source registry, citation audit | ai.citation_status, ai.policy_version, ai.kb_version | source registry, eval report, trace sample | critical wrong citation blocks release |
| REQ-CS-018 high-risk escalation | Complaint, fraud or legal-risk intent not escalated | High-risk intent classifier eval | Escalation SOP, supervisor queue | ai.escalation_result, intent risk label | escalation log, QA sample | under-escalation critical count = 0 |
| REQ-CS-022 no unauthorized commitment | AI promises fee waiver or credit outcome | Forbidden commitment red-team set | Response policy, forbidden action classifier | blocked commitment counter, QA tags | red-team report, blocked output logs | any confirmed occurrence = no-go or restricted release |
| REQ-CS-026 human review before customer send | AI draft sent without agent responsibility | Workflow walkthrough, UAT | UI requires agent final action and records edit diff | review action, edit diff hash, send actor | workflow log, review sample | missing review log blocks customer-facing rollout |
| REQ-CS-031 traceable production behavior | Cannot reconstruct output after complaint | Logging completeness test | OTel attribute standard, retention rule | trace id, output hash, version attributes | log completeness report | incomplete logs restrict release scope |
8.4 Release decision interpretation
| Signal | Result | Decision implication |
|---|---|---|
| Critical unauthorized commitment | 0 in release eval, 0 in pilot trace sample | Eligible for limited release if monitoring and stop rule active. |
| Citation support | 98.7% overall, 100% on high-risk fee/dispute slices | Acceptable with weekly citation audit and source owner attestation. |
| High-risk escalation | 99.2% overall, one medium severity miss in low-impact intent | Limited release with updated classifier rule and QA sampling. |
| Log completeness | 99.8% required attributes present | Supports audit replay and complaint investigation. |
| Residual risk | Users may over-trust fluent drafts | Mitigated by UI disclosure, mandatory agent final action, QA sampling and training. |
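The interpretation table above can be turned into an explicit decision rule. The sketch below is illustrative only: the threshold values, the signal names and the three outcome labels are assumptions, and a real gate would be set by the risk and compliance owners rather than hard-coded.

```python
def decide_release(signals, thresholds):
    """Map gate signals to a decision plus the reasons behind it."""
    blockers, restrictions = [], []

    # Any confirmed unauthorized commitment is a hard blocker.
    if signals["critical_unauthorized_commitments"] > 0:
        blockers.append("critical unauthorized commitment")
    if signals["high_risk_citation_support"] < thresholds["high_risk_citation_support"]:
        blockers.append("citation support below threshold on high-risk slices")

    # Softer gaps narrow the release scope instead of stopping it.
    if signals["log_completeness"] < thresholds["log_completeness"]:
        restrictions.append("incomplete logs")
    if not signals["stop_rule_active"]:
        restrictions.append("monitoring or stop rule not active")

    if blockers:
        return "no_go", blockers
    if restrictions:
        return "restricted", restrictions
    return "limited_go", []


# Hypothetical thresholds; the pilot values mirror the table above.
thresholds = {"high_risk_citation_support": 1.0, "log_completeness": 0.995}
pilot = {
    "critical_unauthorized_commitments": 0,
    "high_risk_citation_support": 1.0,
    "log_completeness": 0.998,
    "stop_rule_active": True,
}

print(decide_release(pilot, thresholds))
```

Returning the reasons alongside the label keeps the decision traceable: the list can be stored on the decision record as its conditions.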
9. Financial Retail Case: AML Investigation Agent
9.1 Case graph
| Graph layer | AML example |
|---|---|
| Business outcome | Reduce evidence collection and narrative drafting time by 25% while keeping critical red flag omission at 0. |
| AI requirement | Agent may retrieve, summarize and draft narrative; it must not close alerts, change risk rating or submit SAR. |
| Eval | Historical alert set, red flag omission set, source grounding eval, policy conflict cases. |
| Control | Tool allowlist read/draft only, SAR workflow no AI write permission, L2 review before case record save. |
| ADR | Use RAG with case evidence and AML policy source ids; no autonomous disposition tool. |
| Implementation/config | case retriever, policy retriever, narrative prompt, tool registry, role entitlement filter. |
| Telemetry | red_flag_checklist_status, citation_status, reviewer_approval, edit_diff_hash, disposition_actor. |
| Incident | AI narrative omitted structuring pattern in a QA sample; case enters regression dataset. |
| Evidence | grounding eval, red flag eval, permission test, review log sample, QA finding closure. |
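The control row above (read/draft-only allowlist, no AI write permission on the SAR workflow, L2 review before save) can be sketched as a tool-call gate. All tool names and role names below are hypothetical; the return values reuse the ai.tool_decision vocabulary (allowed, blocked, human_approval_required).

```python
# Hypothetical tool registry for the AML agent: read and draft tools only.
ALLOWLIST = {
    "retrieve_case_evidence",
    "retrieve_aml_policy",
    "draft_narrative",
    "save_narrative_to_case",
}
# Role entitlements; submit_sar and close_alert are deliberately absent.
ENTITLEMENTS = {
    "aml_agent": set(ALLOWLIST),
    "policy_bot": {"retrieve_aml_policy"},
}
# Tools that need a human (L2) approval before they may run.
APPROVAL_REQUIRED = {"save_narrative_to_case"}


def gate_tool_call(tool_name, role, approved=False):
    """Return the decision to record before any tool is executed."""
    if tool_name not in ALLOWLIST:
        return "blocked"                    # not a registered tool
    if tool_name not in ENTITLEMENTS.get(role, set()):
        return "blocked"                    # role is not entitled
    if tool_name in APPROVAL_REQUIRED and not approved:
        return "human_approval_required"
    return "allowed"


for tool in ("retrieve_case_evidence", "save_narrative_to_case", "submit_sar"):
    print(tool, gate_tool_call(tool, "aml_agent"))
```

Because `submit_sar` and `close_alert` never appear in the allowlist, a negative test that asserts "blocked" for them doubles as permission evidence for the audit path.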
9.2 Audit-ready Q&A path
| Question | Answer | Evidence path |
|---|---|---|
| Does the AI replace the analyst in AML disposition? | No. The agent can only retrieve, summarize and draft; disposition and SAR submission are still completed by authorized staff in the case system. | REQ-AML-010 -> CTRL-AGT-004 -> ADR-AML-003 -> permission matrix -> negative test -> case action log. |
| How do you prove the narrative contains no fabricated facts? | Material factual statements must carry case evidence or an AML policy source id; the release eval and QA sampling check for unsupported claims. | REQ-AML-014 -> EVAL-GRD-AML-02 -> MET-UNSUP-001 -> citation audit -> expert review sample. |
| What if the AI misses a red flag? | The QA finding triggers an incident, the failed trace enters the regression set, and the eval gate is rerun after the prompt/RAG fix. | SIG-AML-RED-004 -> INC-AML-2026-07A -> EVAL-REG-AML-009 -> EV-RET-AML-011. |
10. Financial Retail Case: Credit Memo Copilot
10.1 Boundary
| Dimension | Scope |
|---|---|
| Use case | Small business credit memo copilot that drafts document gaps, policy checklist, risk factors and memo structure for the underwriter. |
| AI does not | Approve or decline, generate the final adverse action reason, infer protected class, or bypass underwriter accountability. |
| Risk focus | fair lending, explainability, data minimization, reason consistency, human decision boundary. |
10.2 Traceability example
| Requirement | Eval | Control | Evidence |
|---|---|---|---|
| AI must not infer or use protected class attributes. | Protected-attribute leakage and proxy reasoning red-team cases. | Feature exclusion, prompt policy, reviewer checklist, logging of input field set. | data field inventory, red-team report, review sample, log extract. |
| AI may draft memo but final credit decision remains human. | Workflow UAT verifies final decision actor and approval path. | Core lending system decision buttons unavailable to AI service account. | RBAC test, service account permission matrix, decision audit log. |
| AI risk factor statements must cite application data or policy. | Groundedness eval across thin-file, missing-data, conflicting-document slices. | Evidence citation requirement and missing-evidence response policy. | eval report, failed case analysis, policy source registry. |
| AI output must support adverse action consistency but not auto-generate notice. | Reason-code consistency eval and human review calibration. | Underwriter review, compliance sample, notice generation remains rule-controlled. | calibration note, compliance review, workflow sample. |
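The feature exclusion and input-field logging controls in the first row can be sketched as a filter that runs before any record reaches the model. The field list below is illustrative only and is not a legal definition of protected classes; the applicable list would come from Compliance. Matching on field names does not catch proxy variables, which is why the table pairs this control with red-team cases.

```python
# Illustrative exclusion list; not a legal definition.
PROTECTED_FIELDS = frozenset({
    "race", "ethnicity", "religion", "sex",
    "marital_status", "age", "national_origin",
})


def split_input_fields(record):
    """Separate model-eligible fields from excluded ones.

    Returns the filtered record plus the sorted names of dropped fields,
    so the dropped set can be logged as control evidence.
    """
    allowed = {
        name: value for name, value in record.items()
        if name.lower() not in PROTECTED_FIELDS
    }
    dropped = sorted(name for name in record if name.lower() in PROTECTED_FIELDS)
    return allowed, dropped


# Hypothetical application record.
application = {
    "annual_revenue": 480000,
    "years_in_business": 5,
    "Age": 40,
    "marital_status": "single",
}
allowed, dropped = split_input_fields(application)
print(sorted(allowed))
print(dropped)
```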
10.3 Senior interview point
In credit, I would treat AI as a memo and control assistant, not a decision engine, unless the institution has explicitly approved that automation boundary. My traceability graph would make this visible: every credit-impacting requirement links to eval slices, fair-lending controls, decision-system permissions, reviewer evidence and production telemetry.
11. Templates
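These templates use a "field + acceptable example" format to avoid empty tables. Formal projects can replace the examples with the institution's real IDs and evidence numbers.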
11.1 Traceability Graph Table
| Field | What to enter | Acceptable example |
|---|---|---|
| Source Node ID | Start node, using a stable ID | REQ-CS-014 |
| Source Node Type | requirement, eval, control, ADR, component, signal, incident, evidence | requirement |
| Edge Type | derives_from, verifies, mitigates, implements, emits, triggers, supports | verifies |
| Target Node ID | End node | EVAL-CS-021 |
| Target Node Type | Type of the end node | eval_case |
| Rationale | Why the two nodes are connected | The fee policy answer requirement is verified by high-risk fee/dispute samples. |
| Owner | Person responsible for maintaining the relationship | AI BA Lead |
| Evidence Ref | Material supporting the relationship | EV-CS-021-EVAL-MAP |
| Freshness Rule | When to re-review | Re-review when the prompt, kb, fee policy or escalation policy changes. |
| Gate Impact | Effect on the release decision | critical slice failed -> no-go. |
11.2 Coverage Matrix
| Requirement ID | Business outcome | Risk | Eval cases | Metrics | Controls | ADR / Component | Telemetry | Evidence | Coverage status |
|---|---|---|---|---|---|---|---|---|---|
| REQ-CS-014 | OUT-CS-001 | RISK-CS-006 wrong policy advice | EVAL-CS-021, EVAL-CS-REG-003 | MET-CS-005, MET-CIT-002 | CTRL-RAG-003 | ADR-CS-007, CMP-CS-RAG-02 | SIG-CS-009 | EV-CS-021, EV-CS-037 | Covered for limited release |
| REQ-CS-022 | OUT-CS-001 | RISK-CS-009 unauthorized commitment | EVAL-CS-033 | MET-CRIT-001 | CTRL-SAFE-006 | ADR-CS-010, CMP-CS-GUARD-01 | SIG-CS-014 | EV-CS-033, EV-CS-041 | Covered with hard blocker |
| REQ-CS-031 | OUT-CS-002 | RISK-AUD-002 cannot reconstruct answer | EVAL-LOG-004 | MET-LOG-001 | CTRL-LOG-002 | ADR-OBS-002, CMP-OTEL-01 | SIG-TRACE-001 | EV-LOG-004 | Covered with retention rule |
11.3 Evidence Query Examples
Evidence queries can start out as SQL, graph queries, spreadsheet filters or GRC reports. What matters is that the query semantics are clear.
Query A: high-risk requirements without eval coverage

    SELECT r.node_id, r.title, r.owner
    FROM trace_nodes r
    LEFT JOIN trace_edges e
      ON e.source_node_id = r.node_id
     AND e.edge_type = 'verifies'
    WHERE r.node_type = 'requirement'
      AND r.risk_tier IN ('high', 'critical')
      AND e.target_node_id IS NULL;

Decision use: release readiness review. If the result is non-empty, the listed high-risk requirements must not enter production release.
Query B: evidence supporting a release claim

    -- Evidence is the source of a 'supports' edge; the control is its target.
    SELECT c.node_id AS claim_id,
           ctrl.node_id AS control_id,
           ev.evidence_id,
           ev.artifact_type,
           ev.version,
           ev.reviewed_by,
           ev.quality_score
    FROM trace_nodes c
    JOIN trace_edges e1 ON e1.source_node_id = c.node_id AND e1.edge_type = 'mitigated_by'
    JOIN trace_nodes ctrl ON ctrl.node_id = e1.target_node_id
    JOIN trace_edges e2 ON e2.target_node_id = ctrl.node_id AND e2.edge_type = 'supports'
    JOIN trace_evidence ev ON ev.evidence_id = e2.source_node_id
    WHERE c.node_id = 'CLAIM-CS-DECISION-BOUNDARY';

Decision use: audit response. The output is used to show that the AI does not make final customer-impacting decisions.
Query C: change impact from prompt version

    SELECT impacted.target_node_id,
           impacted.edge_type,
           n.node_type,
           n.title,
           n.owner
    FROM trace_edges changed
    JOIN trace_edges impacted
      ON impacted.source_node_id = changed.target_node_id
    JOIN trace_nodes n
      ON n.node_id = impacted.target_node_id
    WHERE changed.source_node_id = 'CFG-PROMPT-CS-P12'
      AND changed.edge_type = 'impacted_by';

Decision use: change advisory board. A prompt change must list the impacted requirements, evals, controls, telemetry and evidence.
Query D: production incidents not yet in regression set

    SELECT i.node_id, i.title, i.owner
    FROM trace_nodes i
    LEFT JOIN trace_edges e
      ON e.source_node_id = i.node_id
     AND e.edge_type = 'observed_in'
    WHERE i.node_type = 'incident'
      AND i.risk_tier IN ('high', 'critical')
      AND e.target_node_id IS NULL;

Decision use: monitoring gate. High-risk production failures must enter the regression dataset or have a formal risk acceptance record.
11.4 Release Decision Memo
# Release Decision Memo: Retail Service AI Policy Copilot r18
## Scope
- Use case: Customer service policy copilot for credit card fee, dispute and account-service questions.
- Release stage: Limited release to 10% trained agents.
- AI role: retrieve, summarize, draft, cite, classify high-risk intents.
- Excluded actions: direct customer send, fee waiver commitment, credit approval, legal conclusion.
- Version set: model `gpt-4.1-prod`, prompt `cs-p12`, knowledge base `retail-policy-2026-06`, tool schema `read-only-v3`.
## Traceability Summary
| Area | Result |
|---|---|
| Requirements | 14 active high-risk requirements, all mapped to eval and controls. |
| Eval | 6 eval suites passed; critical unauthorized commitment count = 0. |
| Controls | 9 required controls active; citation audit and escalation queue are release blockers. |
| Architecture | ADR-CS-007 and ADR-CS-010 approved by architecture and compliance reviewers. |
| Telemetry | Required OpenTelemetry attributes present in 99.8% of pilot traces. |
| Evidence | 18 evidence artifacts indexed with owner, version, reviewer and retention. |
## Decision
Limited go for 10% trained agents for 30 calendar days.
## Conditions
- Customer self-service channel remains disabled.
- Weekly citation audit covers fee, dispute and complaint slices.
- Any confirmed unauthorized commitment triggers immediate stop rule.
- Prompt, knowledge base, tool schema or escalation policy change triggers regression eval.
## Residual Risk
Agents may over-trust fluent drafts. Mitigation: UI boundary disclosure, mandatory agent final action, edit-diff logging, QA sampling and targeted training.
## Approvals
Business Owner, Compliance Reviewer, Model Risk Reviewer, AI Product Owner, Chief Architect delegate and Operations Owner approved this limited release decision on 2026-06-20.
11.5 Audit Q&A
| Examiner question | Factual answer | Graph path | Evidence |
|---|---|---|---|
| Does the AI make final decisions affecting customers? | No. It drafts agent-facing responses only. Customer send action remains with trained human agents. | CLAIM-CS-DECISION-BOUNDARY -> REQ-CS-026 -> CTRL-HITL-002 -> ADR-CS-010 -> CMP-WORKFLOW-01 | workflow UAT, permission matrix, send-action audit log |
| How do you know answers use current policy? | The RAG layer only indexes active sources from the approved policy registry and emits citation status by policy version. | REQ-CS-014 -> CTRL-RAG-003 -> CMP-CS-RAG-02 -> SIG-CS-009 -> EV-CIT-2026-06 | source registry, index build log, citation audit |
| What happens after a wrong answer incident? | The incident is triaged, the affected version is scoped, the failed trace enters the regression set, and the fix is retested before release expansion. | INC-CS-2026-0611 -> EVAL-CS-REG-003 -> EV-CS-037 -> DEC-CS-2026-0620 | incident RCA, regression pass report, release condition |
| Who accepted the residual risk? | Business and risk owners accepted limited-release residual risk for 30 days with explicit stop rules. | RISK-CS-OVERTRUST -> DEC-CS-2026-0620 -> AGENT-BUS-OWNER / AGENT-RISK-REVIEWER | release memo, approval record |
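A graph path in the table above is only useful if every hop resolves to a registered node and a recorded edge. The sketch below shows one way to check that, assuming the graph is held as a set of node ids and a set of `(source, target)` pairs; the function names and that storage layout are illustrative assumptions, not part of the playbook. A hop written as `A / B` is treated as a fan-out to both nodes.

```python
def parse_path(path: str) -> list[list[str]]:
    """Split 'A -> B -> C / D' into hops; '/' separates alternatives in one hop."""
    return [[node.strip() for node in hop.split("/")] for hop in path.split("->")]


def unresolved_links(path: str, nodes: set[str], edges: set[tuple[str, str]]) -> list[str]:
    """List every node or edge in the path that the graph does not contain."""
    hops = parse_path(path)
    problems = [
        f"missing node: {node}" for hop in hops for node in hop if node not in nodes
    ]
    # Every node in one hop must have an edge to every node in the next hop.
    for sources, targets in zip(hops, hops[1:]):
        for src in sources:
            for dst in targets:
                if (src, dst) not in edges:
                    problems.append(f"missing edge: {src} -> {dst}")
    return problems


# Sample graph content is illustrative; the ids come from the table above.
nodes = {"REQ-CS-014", "CTRL-RAG-003", "CMP-CS-RAG-02", "SIG-CS-009", "EV-CIT-2026-06"}
edges = {
    ("REQ-CS-014", "CTRL-RAG-003"),
    ("CTRL-RAG-003", "CMP-CS-RAG-02"),
    ("CMP-CS-RAG-02", "SIG-CS-009"),
    # The SIG-CS-009 -> EV-CIT-2026-06 edge is deliberately left out.
}
path = "REQ-CS-014 -> CTRL-RAG-003 -> CMP-CS-RAG-02 -> SIG-CS-009 -> EV-CIT-2026-06"
print(unresolved_links(path, nodes, edges))
```

Running the check on each Audit Q&A row before an inquiry would surface answers that cite a path the graph cannot actually reproduce.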
11.6 Portfolio Evidence Pack
| Portfolio asset | Contents | Capability demonstrated |
|---|---|---|
| One-page executive narrative | Uses one retail financial services AI use case to explain the outcome, risk boundary, release decision and evidence thesis. | Executive communication and product judgment. |
| Traceability graph table | 20 to 40 key nodes/edges covering outcome -> evidence. | Senior-level requirements engineering and systematic governance. |
| Coverage matrix | Coverage status from high-risk requirements to eval, controls, telemetry and evidence. | Release readiness judgment and gap management. |
| ADR pack | 3 to 5 key ADRs: RAG, tool permission, logging, fallback, HITL. | Architecture governance and tradeoff articulation. |
| Eval contract excerpt | Dataset slices, metrics, thresholds, critical failures, release blockers. | AI acceptance and EvalOps capability. |
| Control evidence map | Control objective, activity, test, evidence, owner, cadence. | Risk, compliance and audit language. |
| Telemetry spec | OpenTelemetry attributes, log retention, incident replay path. | Engineering delivery and production observability. |
| Audit Q&A | 8 to 12 regulator or internal-audit questions with evidence paths. | Audit evidence and inquiry response capability. |
| Interview answer pack | 30-second, 2-minute and CTO/CRO/Chief Architect versions. | Job-search conversion and cross-role communication. |
12. Review Checklist
12.1 Requirement traceability
- Does every high-risk AI requirement link to a business outcome and a stakeholder concern?
- Does every requirement state its allowed behavior, forbidden behavior, decision boundary and risk tier?
- Does every behavior requirement have an eval question, dataset slice, metric and threshold?
- Is every critical failure an independent release blocker that cannot be offset by an average score?
- Does every requirement link to a control objective and a control activity?
- Does every requirement have a telemetry signal that supports production monitoring?
12.2 Architecture traceability
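- Does every key ADR state which requirements, controls, telemetry and evidence it affects?
- Are the model, prompt, RAG index, tool schema, policy source and guardrail each versioned separately?
- Does every tool action have an allowlist, permission, approval and rollback path?
- Can logging reconstruct the version, source, user action and human review?
- Do fallback and stop rules have evidence from production drills or tests?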
12.3 Evidence traceability
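- Is every release claim supported by at least one piece of direct evidence?
- Does every piece of evidence have an owner, version, generated_at, reviewer, retention and quality score?
- Can evidence be traced to the activity that generated it and to the approving agent?
- Does the evidence cover the current production version rather than an old model or old prompt?
- Does incident closure include root cause, remediation, regression result and monitoring confirmation?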
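Two of the evidence checks above (required metadata, and coverage of the current production version) lend themselves to an automated pass over the evidence index. The following is a minimal sketch under stated assumptions: evidence records are plain dictionaries, and the `covers_versions` field, the `evidence_gaps` function and the sample values are hypothetical names introduced for illustration.

```python
REQUIRED_FIELDS = ("owner", "version", "generated_at", "reviewer", "retention", "quality_score")


def evidence_gaps(evidence: dict, production_versions: dict[str, str]) -> list[str]:
    """Return metadata gaps and stale-version findings for one evidence record."""
    gaps = [
        f"missing field: {field}"
        for field in REQUIRED_FIELDS
        if evidence.get(field) in (None, "")
    ]
    # 'covers_versions' is an assumed field: the component versions the evidence was produced against.
    covered = evidence.get("covers_versions", {})
    for component, current in production_versions.items():
        if covered.get(component) != current:
            gaps.append(f"stale for {component}: production is {current}")
    return gaps


production = {"prompt": "cs-p12", "knowledge_base": "retail-policy-2026-06"}
record = {
    "evidence_id": "EV-CIT-2026-06",
    "owner": "citation-audit-lead",  # illustrative value
    "version": "1",
    "generated_at": "2026-06-18",
    "reviewer": "",  # not yet reviewed
    "retention": "7y",
    "quality_score": 0.9,
    "covers_versions": {"prompt": "cs-p11", "knowledge_base": "retail-policy-2026-06"},
}
print(evidence_gaps(record, production))
```

An evidence record that returns any gap should not be cited in a release claim until it is reviewed or regenerated against the production version set.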
12.4 Operating model traceability
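- Does the business owner own the outcome and the residual risk narrative?
- Does the control owner own the control activities and their operating cadence?
- Does the evidence owner own evidence quality and refresh?
- Does EvalOps own the feedback of failed cases and the regression dataset?
- Can Model Risk, Compliance and Internal Audit query the same consistently defined material through the graph?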
13. Common Failure Modes
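| Failure mode | Symptom | Fix |
|---|---|---|
| Story-only requirements | The PRD reads like an ordinary SaaS feature spec. | Break AI behavior into decision boundary, eval, control, telemetry and evidence requirements. |
| Eval disconnected from requirements | Eval report scores are high, but no one knows which requirement they prove. | Map every metric to a requirement, scenario slice, risk severity and release decision. |
| Controls disconnected from architecture | The control matrix claims permission controls, but the system has no logs to prove it. | Link control activities to component, config, permission test and OTel attributes. |
| ADR without governance impact | The ADR covers only technology selection. | ADRs must state the impact on risk, controls, evidence, cost, monitoring and rollback. |
| Telemetry afterthought | Only after release does the team find it cannot explain an incident. | Define trace attributes, retention, sampling and incident replay at the requirements stage. |
| Evidence folder chaos | There is plenty of material, but no path to it during an inquiry. | Manage evidence by evidence_id, graph path, owner, version and quality score. |
| Change breaks evidence | The prompt or knowledge base changed, but the old eval is still used for release. | Changes trigger an impacted_by query and a regression eval. |
| Incident not feeding eval | Incident review stops at the RCA. | Failed traces must become regression cases linked to the fix and the release condition. |
| Human review theater | The documents say a human reviews; the logs cannot prove it. | Record reviewer, edit diff, approval action, escalation reason and QA sample. |
| Average metric hides harm | Average groundedness passes while a high-risk slice fails. | Set hard stops and slice-level thresholds for critical slices. |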
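The last failure mode in the table, an average that hides a failing slice, is easy to show with numbers. The sketch below is illustrative only: the slice names follow the fee, dispute and complaint slices mentioned in the release conditions, while the scores, thresholds and the `evaluate_gate` function are assumptions made up for the example.

```python
from statistics import mean


def evaluate_gate(
    slice_scores: dict[str, float],
    slice_thresholds: dict[str, float],
    critical_failures: int,
    average_threshold: float,
) -> dict:
    """Apply hard stops first, then slice-level thresholds, then the average."""
    blockers = []
    if critical_failures > 0:
        blockers.append(f"critical failures: {critical_failures}")
    for name, minimum in slice_thresholds.items():
        score = slice_scores.get(name)
        if score is None or score < minimum:
            blockers.append(f"slice below threshold: {name}")
    average_ok = mean(slice_scores.values()) >= average_threshold
    if not average_ok:
        blockers.append("average below threshold")
    return {
        "average_ok": average_ok,
        "blockers": blockers,
        "decision": "no-go" if blockers else "go",
    }


# Illustrative groundedness scores per dataset slice.
scores = {"general": 0.97, "fee": 0.95, "dispute": 0.78, "complaint": 0.93}
critical_slices = {"fee": 0.90, "dispute": 0.90, "complaint": 0.90}
result = evaluate_gate(scores, critical_slices, critical_failures=0, average_threshold=0.90)
print(result)
```

Here the average clears its threshold, yet the gate still blocks because the dispute slice is below its own minimum, which is the behavior the slice-level threshold is meant to enforce.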
14. Interview Expressions
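Q1: How do you upgrade traditional requirements traceability into AI traceability?
30-second answer:
I extend traceability from “requirement to test case” to “business outcome to evidence”. For an AI system, every high-risk requirement must trace to an eval contract, control objective, ADR, versioned configuration, production telemetry, incident loop and audit evidence. That proves not only that the feature was built, but that the AI's behavior is measured, controlled and continuously governed within its risk boundary.
2-minute answer:
A traditional traceability matrix usually answers whether a requirement was designed and tested. An AI system must also account for probabilistic behavior, data sources, model versions, tool permissions, human review and post-release drift. So I first define the business outcome and stakeholder concern, then write the AI requirement's allowed / forbidden behavior and decision boundary. Next I map the requirement to an eval question, dataset slice, metric, threshold and critical failure. Each risk then maps to a control objective, control activity and test evidence. On the architecture side, ADRs record the model, RAG, tool, logging and fallback decisions; in production, OpenTelemetry attributes connect each trace to the requirement, model version, prompt version, kb version and incident. Finally, an audit inquiry can be traced from the question to the claim, control and evidence path.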
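Q2: Why is a user story not enough to manage AI requirements?
30-second answer:
A user story can express user intent, but not the AI's eval samples, failure severity, control activities, version boundaries, telemetry or audit evidence. AI requirements must become behavior contracts and evidence chains.
2-minute answer:
For example, “As a customer service agent, I want the AI to answer policy questions” is a useful entry point, but it does not tell us whether the policy source is valid, whether fee waiver commitments are allowed, which complaints must be escalated, how wrong answers are detected, which version is in production, or how regression testing runs after an incident. My approach is to treat the story as one node in the graph, then extend it with the decision boundary, data/knowledge boundary, eval contract, control objective, ADR, telemetry and evidence. That way the requirement can ship and can also be audited.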
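Q3: How do you design traceability for an AI release gate?
30-second answer:
I base the release gate on graph queries, not on impressions from a meeting. It must prove that high-risk requirements have eval coverage, that critical failures are 0 or the release scope has been restricted, that control evidence is valid, that telemetry is ready, and that residual risk has an owner and a time limit.
2-minute answer:
Release gate inputs include the requirement coverage matrix, eval run report, failed case list, control evidence, ADR approval, telemetry readiness, incident response plan and residual risk acceptance. For example, if a customer service AI produces an unauthorized commitment, it is a no-go even when average accuracy is high. If the citation support rate meets its threshold on high-risk slices, log completeness meets its threshold, and human review is actually operating, a limited go is possible. The key is that every conclusion traces to a graph path, for example requirement -> eval case -> metric -> control -> ADR -> trace sample -> evidence -> decision.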
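Q4: How do you apply OpenTelemetry to AI governance?
30-second answer:
I treat OpenTelemetry as the production connection layer of the AI evidence graph. Every AI call carries use_case_id, requirement_id, eval_contract_id, model_version, prompt_version, kb_version, tool decision, citation status and escalation result.
2-minute answer:
After release, many AI projects cannot answer “which version, which policy, which tool call and which user action did the error come from?” So observability cannot look only at latency and cost. For retail financial services AI, trace attributes must support audit replay and control verification. For example, a RAG answer records kb_version, policy_version and citation_status; an agent tool call records tool_name, tool_decision and approval result; human review records the review outcome and edit diff hash. An incident can then be linked from the production trace back to the requirement, eval, control and evidence.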
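The attribute list in this answer can be turned into a completeness check of the kind behind a "required attributes present in N% of traces" figure. The sketch below works on exported span attributes as plain dictionaries rather than on the OpenTelemetry SDK, so it runs without any dependency. The key spellings `tool_decision`, `citation_status` and `escalation_result`, and the sample values such as `eval_contract_id`, are assumed for illustration; in an instrumented service these would typically be set on the span with the OpenTelemetry API's `span.set_attribute(key, value)`.

```python
# Attribute keys every AI call is expected to carry (spellings are assumptions).
REQUIRED_ATTRIBUTES = (
    "use_case_id",
    "requirement_id",
    "eval_contract_id",
    "model_version",
    "prompt_version",
    "kb_version",
    "tool_decision",
    "citation_status",
    "escalation_result",
)


def missing_attributes(span_attributes: dict) -> list[str]:
    """Return the required keys that are absent from one span's attributes."""
    return [key for key in REQUIRED_ATTRIBUTES if key not in span_attributes]


def completeness(spans: list[dict]) -> float:
    """Share of spans that carry every required attribute (0.0 for no spans)."""
    if not spans:
        return 0.0
    complete = sum(1 for attrs in spans if not missing_attributes(attrs))
    return complete / len(spans)


# Illustrative exported span attributes.
full_span = {
    "use_case_id": "cs-policy-copilot",
    "requirement_id": "REQ-CS-014",
    "eval_contract_id": "EVAL-CS-REG-003",
    "model_version": "gpt-4.1-prod",
    "prompt_version": "cs-p12",
    "kb_version": "retail-policy-2026-06",
    "tool_decision": "allowed",
    "citation_status": "supported",
    "escalation_result": "none",
}
partial_span = {k: v for k, v in full_span.items() if k != "kb_version"}
print(missing_attributes(partial_span), completeness([full_span, partial_span]))
```

Spans that fail the check are the ones an incident replay could not tie back to a requirement or version, so the list of missing keys is as useful as the percentage.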
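Q5: When facing regulators or internal audit, what is the value of a traceability graph?
30-second answer:
It lets the team stop hunting for materials at the last minute and instead trace a regulatory question directly to the claim, requirement, control, test, evidence, owner and decision record, and state which version and time window the evidence applies to.
2-minute answer:
Regulators and internal auditors rarely ask only “is there a test report?” They ask whether the AI makes final decisions, how you prove data permissions were not expanded, whether policy answers are currently valid, how incidents were remediated, and who accepted the residual risk. A traceability graph turns these questions into paths. For example, “the AI does not make final credit decisions” links to the decision boundary requirement, tool permission control, ADR, RBAC test, workflow log and release memo. This structure is more robust than a shared-drive folder because every piece of evidence has an owner, version, reviewer, retention and the claim it supports.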
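Q6: How do you turn this capability into a portfolio?
30-second answer:
I pick one retail financial services AI use case, such as a customer service policy copilot or an AML investigation agent, and build an outcome-to-evidence case pack: traceability graph, coverage matrix, eval contract excerpt, ADR, telemetry spec, audit Q&A and release decision memo.
2-minute answer:
A portfolio should not contain only a PRD. The value of a senior AI PM/BA/Architect lies in working across business, requirements, architecture, risk and audit. I show a complete chain: first a one-page executive narrative states the business goal and risk boundary; then the traceability graph shows how each high-risk requirement is covered by eval, control, ADR, telemetry and evidence; then the release decision memo explains why the decision is limited go rather than full release; finally the audit Q&A simulates a regulatory inquiry. The interviewer sees that I understand more than requirements documents and can take an AI system to a governable release.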
15. Final Memory Card
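| Concept | In one sentence |
|---|---|
| AI Traceability Graph | Connects outcome, requirement, eval, control, ADR, config, telemetry, incident and evidence into a queryable governance graph. |
| User story limit | A user story expresses intent but is not enough to prove AI behavior is measurable, controllable and auditable. |
| Eval linkage | Every high-risk AI requirement must have an eval question, dataset slice, metric, threshold and critical failure rule. |
| Control linkage | Every key risk must have preventive, detective or corrective controls that trace to tests and evidence. |
| Architecture linkage | ADRs must state how model, RAG, tool, logging, fallback and HITL decisions affect control evidence. |
| Runtime linkage | OpenTelemetry attributes connect production behavior back to requirements, versions, controls and incidents. |
| Evidence linkage | Evidence must have an owner, version, reviewer, freshness, retention and the claim it supports. |
| Portfolio thesis | A senior BA's competitive edge in AI is upgrading requirements traceability into an evidence graph of system governability. |
The single most important sentence:
The goal of AI traceability is not to prove the team wrote requirements, but to prove the organization knows why this AI exists, how it is evaluated, which controls constrain it, which version it runs, how problems are reviewed after the fact, and what evidence supports accepting release and ongoing operation.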