AI Change Impact: Release Governance
AI Change Impact / Release Governance Explained
Audience: AI Product Architect / Platform PM / Release Governance Lead / Senior BA / MLOps / Model Risk.
Core question: After an AI system goes live, change does not come only from the model. The prompt, RAG corpus, index, embedding, reranker, tool contract, policy rule, eval set, vendor, UI, human workflow, and monitoring threshold can all change system behavior.
Learning objectives: Build an AI change taxonomy, impact graph, risk-tiered release gate, regression eval, canary/shadow/ramp rollout, rollback, and evidence pack.
Source Anchors
| Source | Link | Purpose |
|---|---|---|
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Reference for continuous risk management across Govern / Map / Measure / Manage |
| NIST GenAI Profile | https://www.nist.gov/itl/ai-risk-management-framework/generative-artificial-intelligence-profile | Reference for change risks specific to generative AI |
| ISO/IEC 42001 | https://www.iso.org/standard/81230.html | Reference for AI management system, operational control, performance evaluation, and improvement |
| Federal Reserve SR 11-7 | https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm | Reference for model change, validation, monitoring, and governance concepts |
| OpenTelemetry | https://opentelemetry.io/docs/ | Reference for post-release monitoring, traces, metrics, and incident signals |
In one sentence:
AI change impact is multidimensional; model release is only one change type.
1. AI Change Taxonomy
| Change type | Example | Possible impact |
|---|---|---|
| Model | vendor model upgrade, fine-tuned model | answer style, safety, cost, latency |
| Prompt | system/developer prompt update | policy adherence, refusal, tone |
| RAG corpus | policy doc added/removed | factual grounding, stale advice |
| Index / embedding | chunking, embedding model, reranker | retrieval precision/recall |
| Tool/API contract | new CRM write field | action authority, side effects |
| Policy/rules | DMN / OPA / business rule | allow/deny/escalation behavior |
| Eval set/rubric | new golden set, judge change | release metric comparability |
| Vendor | model/API/SLA change | resilience, privacy, cost |
| Workflow | human approval step changed | oversight effectiveness |
| UI | disclosure/approval UX | user trust, automation bias |
| Threshold | confidence or escalation threshold | false positive/negative |
| Monitoring | alert threshold, dashboard change | incident detectability |
| Data retention | log/evidence retention | audit, privacy, deletion |
2. Impact Graph
Don't just write:
prompt change, low risk
Instead, draw:
change
-> affected use cases
-> affected requirements
-> affected evals
-> affected controls
-> affected customer segments
-> affected evidence
-> release gate
A single RAG index rebuild can affect:
- citation support
- policy freshness
- retrieval coverage
- adverse action explanation
- customer complaint rate
- auditor evidence query
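The impact chain above can be sketched as a small directed graph that is walked from the change outward. This is a minimal illustration, not a prescribed schema: the node names, the `kind:name` labelling convention, and the `affected` helper are all hypothetical.

```python
from collections import deque

# Hypothetical impact graph: each key affects the nodes in its list.
# Node labels use an assumed "kind:name" convention.
IMPACT_EDGES = {
    "change:rag-index-rebuild": [
        "usecase:customer-service-qa",
        "usecase:adverse-action-explanation",
    ],
    "usecase:customer-service-qa": [
        "eval:retrieval-coverage",
        "eval:citation-support",
        "kri:complaint-rate",
    ],
    "usecase:adverse-action-explanation": [
        "eval:citation-support",
        "control:policy-freshness",
        "evidence:auditor-query",
    ],
}

def affected(start, edges):
    """Breadth-first walk from `start`; group reachable nodes by kind."""
    seen = {start}
    queue = deque([start])
    grouped = {}
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt in seen:
                continue  # a node reached by two paths is reported once
            seen.add(nxt)
            queue.append(nxt)
            kind, name = nxt.split(":", 1)
            grouped.setdefault(kind, []).append(name)
    return {kind: sorted(names) for kind, names in grouped.items()}

impact = affected("change:rag-index-rebuild", IMPACT_EDGES)
for kind, names in impact.items():
    print(kind, names)
```

Grouping by kind makes it easy to read off which evals, controls, and evidence items a release gate would need to cover for that change.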
3. Release Governance Architecture
change intake
-> change classification
-> impact graph
-> risk tier
-> regression eval
-> security/privacy/model risk review
-> release decision
-> canary/shadow/ramp
-> monitoring window
-> rollback or scale
-> evidence archive
A high-risk AI release needs the technical release and the governance release merged into a single process.
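One way to keep the technical and governance steps in a single flow is to record them as one ordered sequence that cannot be skipped. The sketch below assumes shortened stage names (`risk_review` for the security/privacy/model risk review, `progressive_rollout` for canary/shadow/ramp) and a hypothetical `ReleaseRecord` class; it is an illustration, not a reference implementation.

```python
# Stage names are shortened, assumed labels for the flow described above.
STAGES = [
    "change_intake",
    "change_classification",
    "impact_graph",
    "risk_tier",
    "regression_eval",
    "risk_review",
    "release_decision",
    "progressive_rollout",
    "monitoring_window",
    "rollback_or_scale",
    "evidence_archive",
]

class ReleaseRecord:
    """Tracks one change through the stages, in order, with evidence."""

    def __init__(self, change_id):
        self.change_id = change_id
        self.completed = []  # list of (stage, evidence_link)

    def next_stage(self):
        """Return the next required stage, or None when all are done."""
        if len(self.completed) < len(STAGES):
            return STAGES[len(self.completed)]
        return None

    def complete(self, stage, evidence_link):
        """Mark a stage done; reject out-of-order stages or missing evidence."""
        expected = self.next_stage()
        if stage != expected:
            raise ValueError(f"expected stage {expected!r}, got {stage!r}")
        if not evidence_link:
            raise ValueError("every stage needs an evidence link")
        self.completed.append((stage, evidence_link))

record = ReleaseRecord("CHG-001")
record.complete("change_intake", "ticket://CHG-001")
print(record.next_stage())
```

Requiring an evidence link at every stage means the evidence archive is assembled as a by-product of the release rather than reconstructed afterwards.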
4. Regression Gate
| Gate | Check |
|---|---|
| Functional | Whether the core task is still completed |
| Grounding | citation / source support |
| Safety | harmful output / prohibited action |
| Policy | rules / refusal / escalation |
| Fairness | segment behavior |
| Explainability | reason-code consistency |
| Tool | contract compatibility / side effect |
| Cost/latency | unit economics and SLO |
| Evidence | trace/evidence completeness |
| Privacy | PII exposure / retention |
An improved average score alone is not enough to approve a release. A regression on a high-risk slice can still block it.
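The point about averages versus slices can be made concrete with a small gate function. This is a sketch under assumptions: scores are per-slice values where higher is better, both runs cover the same slices, and the slice names and numbers are invented for illustration.

```python
def gate_decision(baseline, candidate, high_risk_slices, tolerance=0.0):
    """Compare per-slice eval scores (higher is better).

    Returns (passed, reasons). The gate fails if the average drops or if
    any high-risk slice drops by more than `tolerance`.
    """
    reasons = []
    avg_base = sum(baseline.values()) / len(baseline)
    avg_cand = sum(candidate.values()) / len(candidate)
    if avg_cand < avg_base - tolerance:
        reasons.append("average score regressed")
    for slice_name in high_risk_slices:
        if candidate[slice_name] < baseline[slice_name] - tolerance:
            reasons.append(f"high-risk slice regressed: {slice_name}")
    return (not reasons, reasons)

# Illustrative numbers only: the average improves, one high-risk slice drops.
baseline = {"general_qa": 0.80, "fee_disputes": 0.85, "adverse_action": 0.95}
candidate = {"general_qa": 0.92, "fee_disputes": 0.88, "adverse_action": 0.90}

passed, reasons = gate_decision(baseline, candidate, ["adverse_action"])
print(passed, reasons)
```

In this example the candidate would be blocked even though its average is higher, because the `adverse_action` slice regressed.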
5. Retail Financial Services Cases
5.1 Credit explanation prompt change
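Risks:
- The tone becomes more natural, but the reason codes are no longer precise.
- Adding wording such as "we suggest you reapply" can mislead the customer.
Release gate:
- adverse action reason consistency
- prohibited wording check
- policy version eval
- legal/compliance signoff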
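The first two gate items lend themselves to automated checks. The sketch below assumes a hypothetical pattern list, hypothetical reason codes (`R01`, `R07`), and a simple phrase-matching rule for consistency; a real gate would use wording and codes agreed with legal/compliance.

```python
import re

# Hypothetical prohibited patterns; the real list would come from compliance.
PROHIBITED_PATTERNS = {
    "reapply_suggestion": r"\bre-?apply(?:ing)?\b",
}

def check_explanation(text, decision_codes, code_phrases):
    """Return findings for one generated adverse action explanation.

    decision_codes: reason codes attached to the underlying decision.
    code_phrases:   assumed mapping from reason code to required phrase.
    """
    findings = []
    lowered = text.lower()
    for name, pattern in PROHIBITED_PATTERNS.items():
        if re.search(pattern, lowered):
            findings.append(f"prohibited wording: {name}")
    for code in decision_codes:
        if code_phrases[code].lower() not in lowered:
            findings.append(f"missing reason: {code}")
    return findings

code_phrases = {
    "R01": "high credit utilization",
    "R07": "insufficient credit history",
}
text = (
    "Your application was declined because of high credit utilization. "
    "We suggest you reapply next month."
)
findings = check_explanation(text, ["R01", "R07"], code_phrases)
print(findings)
```

Phrase matching is deliberately strict here; it catches a friendlier rewrite that silently drops a reason, at the cost of flagging harmless paraphrases for human review.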
5.2 AML typology knowledge update
Risks:
- After a new typology is added, older case narratives trigger excessive alerts.
- RAG retrieval favors the new documents and overlooks older rules.
Release gate:
- old/new typology regression
- analyst review
- false positive KRI
5.3 Customer-service RAG index rebuild
Risks:
- A chunking change makes key exception clauses impossible to retrieve.
- Source citations become unstable.
Release gate:
- retrieval recall
- citation exactness
- high-traffic policy Q&A
- complaint-sensitive cases
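The retrieval recall item can be checked by replaying a golden query set against the old and new index. The sketch below assumes a golden set that maps each query id to the chunk ids it must retrieve; the ids, the `k=3` cutoff, and both helper functions are illustrative.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant chunk ids found in the top-k retrieved ids."""
    if not relevant:
        raise ValueError("golden set entry has no relevant chunks")
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)

def compare_indexes(golden, old_results, new_results, k=3):
    """List queries whose recall@k dropped after the index rebuild."""
    regressions = []
    for query_id, relevant in golden.items():
        old = recall_at_k(old_results[query_id], relevant, k)
        new = recall_at_k(new_results[query_id], relevant, k)
        if new < old:
            regressions.append((query_id, old, new))
    return regressions

# Illustrative data: the rebuild loses the fee-waiver exception clause.
golden = {
    "q-fee-waiver": ["fees#base", "fees#exception"],
    "q-card-lost": ["cards#lost"],
}
old_results = {
    "q-fee-waiver": ["fees#base", "fees#exception", "faq#1"],
    "q-card-lost": ["cards#lost", "faq#2", "faq#3"],
}
new_results = {
    "q-fee-waiver": ["fees#base", "faq#1", "faq#4"],
    "q-card-lost": ["faq#2", "cards#lost", "faq#3"],
}

regressions = compare_indexes(golden, old_results, new_results)
print(regressions)
```

Reporting per-query regressions rather than a single mean keeps a lost exception clause visible even when overall recall barely moves.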
5.4 CRM write tool contract change
Risks:
- The agent goes from writing drafts to writing official fields.
- Downstream systems trigger customer notifications.
Release gate:
- tool authority review
- approval-before-action
- sandbox test
- rollback/compensation plan
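A tool authority review can start from a mechanical diff of the old and new contract. The sketch below assumes each contract is reduced to a mapping from field name to one of three access levels (`none`, `draft`, `write`); the field names and the level scheme are hypothetical.

```python
# Assumed access levels, ordered from least to most authority.
ACCESS_RANK = {"none": 0, "draft": 1, "write": 2}

def authority_escalations(old_contract, new_contract):
    """Return (field, old_level, new_level) for every field whose access grew."""
    escalations = []
    for field_name, new_level in new_contract.items():
        old_level = old_contract.get(field_name, "none")
        if ACCESS_RANK[new_level] > ACCESS_RANK[old_level]:
            escalations.append((field_name, old_level, new_level))
    return sorted(escalations)

old_contract = {"case_note": "draft", "contact_preference": "none"}
new_contract = {
    "case_note": "write",
    "contact_preference": "none",
    "notification_flag": "write",
}

escalations = authority_escalations(old_contract, new_contract)
needs_approval_before_action = bool(escalations)
print(escalations, needs_approval_before_action)
```

Any escalation found this way would route the change to the approval-before-action and sandbox checks rather than decide the release by itself.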
6. Change Request Template
# AI Change Request
Change id:
Change type:
Use cases affected:
Risk tier:
Business owner:
Technical owner:
Reason for change:
Expected benefit:
Components changed:
Impact graph:
Regression eval required:
Security/privacy review:
Model risk review:
Customer communication impact:
Rollback plan:
Monitoring window:
Release decision:
Evidence links:
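If the template is captured as structured data, an intake check can reject incomplete requests before review starts. The sketch below models only a subset of the template fields, and the choice of which fields are mandatory is an assumption.

```python
from dataclasses import dataclass, field, fields

@dataclass
class AIChangeRequest:
    """Subset of the template above; every field here is treated as required."""
    change_id: str = ""
    change_type: str = ""
    use_cases_affected: list = field(default_factory=list)
    risk_tier: str = ""
    business_owner: str = ""
    technical_owner: str = ""
    rollback_plan: str = ""
    monitoring_window: str = ""
    evidence_links: list = field(default_factory=list)

def missing_fields(request):
    """Names of fields that are still empty, in template order."""
    return [f.name for f in fields(request) if not getattr(request, f.name)]

request = AIChangeRequest(
    change_id="CHG-001",
    change_type="Prompt",
    use_cases_affected=["credit explanation"],
    risk_tier="high",
    business_owner="lending-product",
    technical_owner="platform-team",
)
print(missing_fields(request))
```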
7. Metrics / KRIs
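| Metric | Meaning |
|---|---|
| Eval regression count | Quality regressions introduced by the change |
| High-risk slice failure | A critical scenario failed the gate |
| Override spike | Humans distrust the new version |
| Complaint spike | Customer impact |
| Escaped AI defect | A problem the gates missed |
| Rollback frequency | Release stability |
| Evidence completeness | Provable to auditors |
| Approval SLA | Whether governance is a bottleneck |
| Cost/latency drift | Economics or experience regressing |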
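Spike-type KRIs such as override and complaint spikes can be approximated by comparing post-release rates with a pre-release baseline. The sketch below uses a simple ratio rule; the 1.5x threshold, the metric names, and the rates are illustrative assumptions, not recommended values.

```python
def kri_alerts(baseline_rates, current_rates, max_ratio=1.5):
    """Return KRI names whose current rate exceeds max_ratio x baseline."""
    alerts = []
    for name, base in baseline_rates.items():
        current = current_rates[name]
        if base == 0:
            # No baseline events: any occurrence after release is flagged.
            if current > 0:
                alerts.append(name)
        elif current / base > max_ratio:
            alerts.append(name)
    return alerts

baseline_rates = {"override_rate": 0.04, "complaint_rate": 0.010}
current_rates = {"override_rate": 0.09, "complaint_rate": 0.011}

alerts = kri_alerts(baseline_rates, current_rates)
print(alerts)
```

A ratio rule is easy to explain to reviewers but ignores sample size, so low-volume segments may need a minimum event count before an alert is trusted.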
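8. Interview Talking Points
30-second version:
I would design AI change governance as a multidimensional impact assessment that covers more than model versions. Any change to the model, prompt, RAG, tool, policy, eval, vendor, workflow, UI, or monitoring goes through change intake, impact graph, risk tier, regression eval, signoff, canary/ramp, monitoring, and rollback. A high-risk AI release must be able to prove that it has not broken quality, safety, fairness, explainability, tool authority, or the evidence chain.
2-minute version:
An AI system has a wider change surface than traditional software. A RAG index rebuild, for example, looks like a plain data update, but it can change citations, policy interpretation, customer complaints, and adverse action explanations. A prompt change can improve the tone while breaking reason-code consistency.
My approach is to build a change taxonomy and an impact graph. Each change is first mapped to its affected use cases, requirements, evals, controls, customer segments, and evidence. The risk tier then determines the regression gates: functional, grounding, safety, policy, fairness, explainability, tool, cost/latency, privacy, and evidence.
The rollout method is also chosen by risk: shadow, canary, ramp, or full rollout, always with a rollback plan and a monitoring window. That way the AI system can keep evolving after launch without releases depending on luck.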
9. Portfolio Exercise
Pick an AI system that is already in production:
- List 10 types of possible change.
- Draw the impact graph.
- Design a risk-tiered release gate.
- Write a regression eval checklist.
- Write a rollback plan.
- Write a release decision memo.
Deliverables:
- An AI Change Governance Pack.
- One release gate matrix.
- One canary / rollback runbook.
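As a starting point for the canary / rollback runbook, the ramp logic can be written down as a small function. The traffic steps and the single `healthy` flag below are assumptions for illustration; a real runbook would define the health signal from the KRIs and the monitoring window chosen for the change.

```python
# Hypothetical ramp schedule, in percent of production traffic.
RAMP_STEPS = [1, 5, 25, 100]

def next_traffic_percent(current, healthy):
    """Advance one ramp step if healthy; otherwise roll back to 0 percent."""
    if not healthy:
        return 0  # rollback: route all traffic to the previous version
    for step in RAMP_STEPS:
        if step > current:
            return step
    return current  # already at full rollout

percent = 0
for healthy in [True, True, True]:
    percent = next_traffic_percent(percent, healthy)
print(percent)
```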