
AI Change Impact: Release Governance



AI Change Impact / Release Governance: An Explainer

Audience: AI Product Architect / Platform PM / Release Governance Lead / Senior BA / MLOps / Model Risk.

Core problem: After an AI system ships, change does not come only from the model. Prompts, RAG corpora, indexes, embeddings, rerankers, tool contracts, policy rules, eval sets, vendors, UI, human workflows, and monitoring thresholds all change system behavior.

Learning goals: Build an AI change taxonomy, impact graph, risk-tiered release gate, regression eval, canary/shadow/ramp rollout, rollback, and evidence pack.


Source Anchors

| Source | Link | Use |
| --- | --- | --- |
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Reference for continuous risk management across Govern / Map / Measure / Manage |
| NIST GenAI Profile | https://www.nist.gov/itl/ai-risk-management-framework/generative-artificial-intelligence-profile | Reference for change risks specific to generative AI |
| ISO/IEC 42001 | https://www.iso.org/standard/81230.html | Reference for AI management system, operational control, performance evaluation, and improvement |
| Federal Reserve SR 11-7 | https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm | Reference for thinking on model change, validation, monitoring, and governance |
| OpenTelemetry | https://opentelemetry.io/docs/ | Reference for post-release monitoring, traces, metrics, and incident signals |

In one sentence:

AI change impact is multidimensional; model release is only one change type.


1. AI Change Taxonomy

| Change type | Example | Possible impact |
| --- | --- | --- |
| Model | vendor model upgrade, fine-tuned model | answer style, safety, cost, latency |
| Prompt | system/developer prompt update | policy adherence, refusal, tone |
| RAG corpus | policy doc added/removed | factual grounding, stale advice |
| Index / embedding | chunking, embedding model, reranker | retrieval precision/recall |
| Tool/API contract | new CRM write field | action authority, side effects |
| Policy/rules | DMN / OPA / business rule | allow/deny/escalation behavior |
| Eval set/rubric | new golden set, judge change | release metric comparability |
| Vendor | model/API/SLA change | resilience, privacy, cost |
| Workflow | human approval step change | oversight effectiveness |
| UI | disclosure/approval UX | user trust, automation bias |
| Threshold | confidence or escalation threshold | false positive/negative |
| Monitoring | alert threshold, dashboard change | incident detectability |
| Data retention | log/evidence retention | audit, privacy, deletion |
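
The taxonomy above can be made machine-readable so that change intake always starts from the same vocabulary. The following is a minimal sketch, not a prescribed schema: the names `ChangeType`, `POSSIBLE_IMPACT`, and `possible_impact` are hypothetical, and the impact lists are copied from the table.

```python
from enum import Enum


class ChangeType(Enum):
    """The 13 change types from the taxonomy table."""
    MODEL = "model"
    PROMPT = "prompt"
    RAG_CORPUS = "rag_corpus"
    INDEX_EMBEDDING = "index_embedding"
    TOOL_CONTRACT = "tool_contract"
    POLICY_RULES = "policy_rules"
    EVAL_SET = "eval_set"
    VENDOR = "vendor"
    WORKFLOW = "workflow"
    UI = "ui"
    THRESHOLD = "threshold"
    MONITORING = "monitoring"
    DATA_RETENTION = "data_retention"


# Possible impact per change type, taken from the table's third column.
POSSIBLE_IMPACT = {
    ChangeType.MODEL: ["answer style", "safety", "cost", "latency"],
    ChangeType.PROMPT: ["policy adherence", "refusal", "tone"],
    ChangeType.RAG_CORPUS: ["factual grounding", "stale advice"],
    ChangeType.INDEX_EMBEDDING: ["retrieval precision/recall"],
    ChangeType.TOOL_CONTRACT: ["action authority", "side effects"],
    ChangeType.POLICY_RULES: ["allow/deny/escalation behavior"],
    ChangeType.EVAL_SET: ["release metric comparability"],
    ChangeType.VENDOR: ["resilience", "privacy", "cost"],
    ChangeType.WORKFLOW: ["oversight effectiveness"],
    ChangeType.UI: ["user trust", "automation bias"],
    ChangeType.THRESHOLD: ["false positive/negative"],
    ChangeType.MONITORING: ["incident detectability"],
    ChangeType.DATA_RETENTION: ["audit", "privacy", "deletion"],
}


def possible_impact(change_types):
    """Union of impact dimensions for a change that touches several types.

    Order follows first appearance, and duplicates are dropped.
    """
    seen = []
    for change_type in change_types:
        for dimension in POSSIBLE_IMPACT[change_type]:
            if dimension not in seen:
                seen.append(dimension)
    return seen
```

A vendor model upgrade, for example, is both a `MODEL` and a `VENDOR` change, so `possible_impact([ChangeType.MODEL, ChangeType.VENDOR])` lists the dimensions of both rows with `cost` counted once.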

2. Impact Graph

Don't just write:

prompt change, low risk

Map it out instead:

change
  -> affected use cases
  -> affected requirements
  -> affected evals
  -> affected controls
  -> affected customer segments
  -> affected evidence
  -> release gate

A single RAG index rebuild can affect:

  • citation support.
  • policy freshness.
  • retrieval coverage.
  • adverse action explanation.
  • customer complaint rate.
  • auditor evidence query.
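
The impact graph can be stored as a plain adjacency map and walked breadth-first from the change node. The sketch below is illustrative only: the node names, the `kind:name` naming convention, and the edges in `IMPACT_EDGES` are hypothetical examples, not a catalog from any real system.

```python
from collections import deque

# Hypothetical edges: each key points at the artifacts it directly affects.
IMPACT_EDGES = {
    "change:rag-index-rebuild": [
        "usecase:customer-service-qa",
        "usecase:credit-explanation",
    ],
    "usecase:customer-service-qa": [
        "req:citation-support",
        "req:policy-freshness",
    ],
    "usecase:credit-explanation": ["req:adverse-action-explanation"],
    "req:citation-support": ["eval:citation-exactness"],
    "req:policy-freshness": ["eval:retrieval-recall"],
    "req:adverse-action-explanation": [
        "eval:reason-code-consistency",
        "control:compliance-signoff",
    ],
}


def affected(start, edges):
    """Breadth-first walk: every node reachable from `start`, in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def by_kind(nodes):
    """Group 'kind:name' node ids by kind (usecase, req, eval, control)."""
    grouped = {}
    for node in nodes:
        kind, _, name = node.partition(":")
        grouped.setdefault(kind, []).append(name)
    return grouped


reached = by_kind(affected("change:rag-index-rebuild", IMPACT_EDGES))
```

Here `reached["eval"]` becomes the list of regression evals the release gate has to run, and `reached["control"]` the signoffs it has to collect.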

3. Release Governance Architecture

change intake
  -> change classification
  -> impact graph
  -> risk tier
  -> regression eval
  -> security/privacy/model risk review
  -> release decision
  -> canary/shadow/ramp
  -> monitoring window
  -> rollback or scale
  -> evidence archive

A high-risk AI release needs the technical release and the governance release merged into a single process.
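
One way to keep the technical and governance tracks in a single flow is to model the pipeline as an ordered list of stages that a release record must pass through without skipping. This is a rough sketch under two assumptions that go beyond the text: the `Release` class and stage identifiers are hypothetical, and requiring an evidence link at every stage is an illustrative policy choice.

```python
from dataclasses import dataclass, field

# Stage order follows the architecture flow above.
STAGES = [
    "change_intake",
    "change_classification",
    "impact_graph",
    "risk_tier",
    "regression_eval",
    "risk_review",          # security / privacy / model risk review
    "release_decision",
    "staged_rollout",       # canary / shadow / ramp
    "monitoring_window",
    "rollback_or_scale",
    "evidence_archive",
]


@dataclass
class Release:
    change_id: str
    completed: list = field(default_factory=list)
    evidence: dict = field(default_factory=dict)

    def next_stage(self):
        """Name of the next required stage, or None once all are done."""
        if len(self.completed) < len(STAGES):
            return STAGES[len(self.completed)]
        return None

    def complete(self, stage, evidence_link):
        """Mark a stage done; stages cannot be skipped or reordered."""
        if stage != self.next_stage():
            raise ValueError(f"expected {self.next_stage()!r}, got {stage!r}")
        if not evidence_link:
            # Assumption: every stage leaves something for the evidence archive.
            raise ValueError("each stage must leave an evidence link")
        self.completed.append(stage)
        self.evidence[stage] = evidence_link
```

Calling `complete("release_decision", ...)` on a fresh `Release` raises `ValueError`, which is the point: the decision cannot be recorded before classification, impact analysis, and regression evals exist.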


4. Regression Gate

| Gate | Check |
| --- | --- |
| Functional | whether core tasks still complete |
| Grounding | citation / source support |
| Safety | harmful output / prohibited action |
| Policy | rules / refusal / escalation |
| Fairness | segment behavior |
| Explainability | reason-code consistency |
| Tool | contract compatibility / side effect |
| Cost/latency | unit economics and SLO |
| Evidence | trace/evidence completeness |
| Privacy | PII exposure / retention |

A higher average score alone is not enough to ship. A regression on a high-risk slice can also block the release.
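
The slice rule can be expressed as a small gate function. This is a minimal sketch assuming per-slice scores where higher is better. The function name, the slice names, and the choice to always block on a high-risk regression are assumptions; the text only says such a regression can block.

```python
def mean(scores):
    """Unweighted mean over all slices."""
    return sum(scores.values()) / len(scores)


def gate_decision(baseline, candidate, high_risk_slices, tolerance=0.0):
    """Return ("pass" | "block", regressed_high_risk_slices).

    baseline / candidate: dict of slice name -> score (higher is better).
    A high-risk slice regresses when it drops by more than `tolerance`.
    """
    regressed = [
        name for name in sorted(high_risk_slices)
        if candidate[name] < baseline[name] - tolerance
    ]
    if regressed:
        # Blocks even when the overall average improved.
        return "block", regressed
    if mean(candidate) < mean(baseline):
        return "block", []
    return "pass", []


# Hypothetical scores: the average goes up, but adverse_action gets worse.
baseline = {"general": 0.80, "adverse_action": 0.95, "complaints": 0.90}
candidate = {"general": 0.92, "adverse_action": 0.90, "complaints": 0.91}
decision = gate_decision(baseline, candidate, {"adverse_action", "complaints"})
```

In this example the candidate's average is higher than the baseline's, yet `decision` is a block that names `adverse_action` as the regressed slice.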


5. Retail Finance Cases

5.1 Credit explanation prompt change

Risks:

  • The tone becomes more natural, but the reason codes are no longer precise.
  • Adding "we suggest you reapply" misleads the customer.

Release gate:

  • adverse action reason consistency.
  • prohibited wording check.
  • policy version eval.
  • legal/compliance signoff.
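
The first two gate items lend themselves to an automated check on each generated explanation. The sketch below makes several assumptions that do not come from the text: reason codes are embedded in the explanation as bracketed tags such as `[R03]`, the phrase list in `PROHIBITED_PHRASES` is a made-up example, and a real list would come from legal/compliance.

```python
import re

# Hypothetical phrase list; a real one would be owned by legal/compliance.
PROHIBITED_PHRASES = ["reapply", "apply again"]

# Assumed convention: explanations cite reason codes as [R03], [R07], ...
REASON_CODE = re.compile(r"\[(R\d{2})\]")


def check_explanation(text, decision_codes):
    """Return a list of (issue_type, detail) tuples; empty means clean."""
    issues = []
    for phrase in PROHIBITED_PHRASES:
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            issues.append(("prohibited_wording", phrase))
    cited = set(REASON_CODE.findall(text))
    expected = set(decision_codes)
    if cited != expected:
        # Codes that are missing from the text or cited without basis.
        issues.append(("reason_code_mismatch", sorted(cited ^ expected)))
    return issues
```

Running this over a golden set of decisions before and after the prompt change gives a direct measure of whether the friendlier tone cost any reason-code precision.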

5.2 AML typology knowledge update

Risks:

  • After a new typology is added, old case narratives start to over-alert.
  • RAG retrieval skews toward the new documents and ignores older rules.

Release gate:

  • old/new typology regression.
  • analyst review.
  • false positive KRI.
5.3 Customer-service RAG index rebuild

Risks:

  • A chunking change makes key exception clauses unretrievable.
  • source citations become unstable.

Release gate:

  • retrieval recall.
  • citation exactness.
  • high-traffic policy Q&A.
  • complaint-sensitive cases.
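
The retrieval recall gate can be run as a before/after comparison on a golden query set. A minimal sketch follows, assuming each golden query lists the chunk ids that must be retrievable. The query and chunk ids are hypothetical, and recall@k is used as one common way to measure retrieval recall.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant ids that appear in the top-k retrieved ids."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("golden entry needs at least one relevant id")
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def recall_regressions(golden, old_results, new_results, k=5):
    """Query ids whose recall@k dropped after the index rebuild."""
    return [
        query for query in golden
        if recall_at_k(new_results[query], golden[query], k)
        < recall_at_k(old_results[query], golden[query], k)
    ]


# Hypothetical example: the rebuilt index loses the exception clause chunk.
golden = {"q-fee-waiver": ["fees#3", "fees#exception"]}
old_results = {"q-fee-waiver": ["fees#3", "fees#exception", "faq#1"]}
new_results = {"q-fee-waiver": ["fees#3", "faq#1", "faq#2"]}
regressed = recall_regressions(golden, old_results, new_results, k=3)
```

Comparing per query rather than on the aggregate is what surfaces the single lost exception clause, which an average over high-traffic questions could hide.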

5.4 CRM write tool contract change

Risks:

  • The agent goes from writing drafts to writing production fields.
  • Downstream systems trigger customer notifications.

Release gate:

  • tool authority review.
  • approval-before-action.
  • sandbox test.
  • rollback/compensation plan.
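
A tool authority review can start from a diff of the old and new contract that flags any field whose access mode became more powerful. In this sketch the `read` / `draft` / `write` modes, the field names, and the rule that every new field counts as an escalation are all illustrative assumptions.

```python
# Assumed ordering of access modes, from least to most authority.
RANK = {"read": 0, "draft": 1, "write": 2}


def authority_escalations(old_contract, new_contract):
    """List (field, old_mode, new_mode) where authority increased.

    Contracts map field name -> access mode. A field absent from the old
    contract is reported with old_mode None, since it is new surface area.
    """
    escalations = []
    for field_name, new_mode in new_contract.items():
        old_mode = old_contract.get(field_name)
        old_rank = RANK[old_mode] if old_mode is not None else -1
        if RANK[new_mode] > old_rank:
            escalations.append((field_name, old_mode, new_mode))
    return escalations


# Hypothetical CRM tool contracts before and after the change.
old_contract = {"note_draft": "draft", "status": "read"}
new_contract = {"note_draft": "write", "status": "read", "notify_flag": "write"}
flagged = authority_escalations(old_contract, new_contract)
```

Each flagged field is a candidate for the approval-before-action control and needs a compensation step in the rollback plan, because a write that triggers a customer notification cannot simply be reverted.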

6. Change Request Template

# AI Change Request

Change id:
Change type:
Use cases affected:
Risk tier:
Business owner:
Technical owner:
Reason for change:
Expected benefit:
Components changed:
Impact graph:
Regression eval required:
Security/privacy review:
Model risk review:
Customer communication impact:
Rollback plan:
Monitoring window:
Release decision:
Evidence links:
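
A template only helps if incomplete requests are caught at intake. The following sketch validates a request represented as a dict keyed by snake_case versions of the template fields. Which fields are mandatory for every change versus only for high-risk changes is an assumption made for illustration, not something the template specifies.

```python
# Assumption: these fields are required for every change request.
REQUIRED_ALWAYS = [
    "change_id",
    "change_type",
    "use_cases_affected",
    "risk_tier",
    "business_owner",
    "technical_owner",
    "rollback_plan",
]

# Assumption: these are additionally required when risk_tier is "high".
REQUIRED_HIGH_RISK = [
    "impact_graph",
    "regression_eval",
    "security_privacy_review",
    "model_risk_review",
    "monitoring_window",
    "evidence_links",
]


def missing_fields(request):
    """Return the required fields that are absent or empty in `request`."""
    required = list(REQUIRED_ALWAYS)
    if request.get("risk_tier") == "high":
        required += REQUIRED_HIGH_RISK
    return [name for name in required if not request.get(name)]


# Hypothetical request: a high-risk prompt change filed with basics only.
request = {
    "change_id": "CHG-042",
    "change_type": "prompt",
    "use_cases_affected": ["credit-explanation"],
    "risk_tier": "high",
    "business_owner": "lending-product",
    "technical_owner": "ai-platform",
    "rollback_plan": "revert to previous prompt version",
}
gaps = missing_fields(request)
```

For this request `gaps` lists all six high-risk fields, so the change cannot move to a release decision until the reviews and evidence links exist.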

7. Metrics / KRIs

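| Metric | Meaning |
| --- | --- |
| Eval regression count | quality regressions introduced by a change |
| High-risk slice failure | a critical scenario failed |
| Override spike | humans do not trust the new version |
| Complaint spike | customer impact |
| Escaped AI defect | a problem the gates missed |
| Rollback frequency | release stability |
| Evidence completeness | whether the release can be proven to auditors |
| Approval SLA | whether governance is a bottleneck |
| Cost/latency drift | regression in unit economics or user experience |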
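
Several of these KRIs can be computed directly from release records. The sketch below covers three of them. The record fields (`rolled_back`, `evidence_present`, `evidence_required`, `escaped_defects`) and the sample values are hypothetical, and the formulas are one reasonable reading of the metric names rather than fixed definitions.

```python
def release_kris(releases):
    """Compute a few KRIs over a non-empty list of release records."""
    if not releases:
        raise ValueError("need at least one release record")
    total = len(releases)
    required = sum(r["evidence_required"] for r in releases)
    present = sum(r["evidence_present"] for r in releases)
    return {
        # Share of releases that were rolled back.
        "rollback_frequency": sum(1 for r in releases if r["rolled_back"]) / total,
        # Share of required evidence items that were actually archived.
        "evidence_completeness": present / required,
        # Defects found in production that no gate caught.
        "escaped_defects": sum(r["escaped_defects"] for r in releases),
    }


# Hypothetical records for four releases.
releases = [
    {"rolled_back": False, "evidence_present": 5, "evidence_required": 5, "escaped_defects": 0},
    {"rolled_back": True, "evidence_present": 4, "evidence_required": 5, "escaped_defects": 1},
    {"rolled_back": False, "evidence_present": 6, "evidence_required": 6, "escaped_defects": 0},
    {"rolled_back": False, "evidence_present": 4, "evidence_required": 4, "escaped_defects": 0},
]
kris = release_kris(releases)
```

Tracking these per risk tier, not only in aggregate, keeps a stable low-risk stream from masking problems in the few high-risk releases.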

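8. Interview Talking Points

30-second version:

I design AI change governance as a multidimensional impact assessment, not just model version control. Any change to the model, prompt, RAG, tool, policy, eval, vendor, workflow, UI, or monitoring goes through change intake, impact graph, risk tier, regression eval, signoff, canary/ramp, monitoring, and rollback. A high-risk AI release must be able to show that it has not broken quality, safety, fairness, explainability, tool authority, or the evidence chain.

2-minute version:

An AI system has a wider change surface than traditional software. A RAG index rebuild, for example, looks like a data refresh, but it can change citations, policy interpretation, customer complaints, and adverse action explanations. A prompt change can improve tone while breaking reason-code consistency.

My approach is to build a change taxonomy and an impact graph. Each change is first mapped to its affected use cases, requirements, evals, controls, customer segments, and evidence. The risk tier then determines the regression gates: functional, grounding, safety, policy, fairness, explainability, tool, cost/latency, privacy, and evidence.

The rollout method is also chosen by risk: shadow, canary, ramp, or full rollout, always with a rollback plan and a monitoring window. That way the AI system can keep evolving after launch without releases depending on luck.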


9. Portfolio Exercise

Pick an AI system that is already in production:

  1. List 10 types of possible change.
  2. Draw the impact graph.
  3. Design a risk-tiered release gate.
  4. Write a regression eval checklist.
  5. Write a rollback plan.
  6. Write a release decision memo.

Deliverables:

  • AI Change Governance Pack.
  • 1 release gate matrix.
  • 1 canary / rollback runbook.
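
As a starting point for the canary / rollback runbook, the rollout path can be written down as a per-tier sequence with a single rollback rule. This is only a sketch: the tier names, the sequences in `ROLLOUT_PLANS`, and the rule that any KRI breach triggers rollback are assumptions to adapt to the chosen system.

```python
# Assumed mapping from risk tier to rollout sequence.
ROLLOUT_PLANS = {
    "low": ["full"],
    "medium": ["canary", "full"],
    "high": ["shadow", "canary", "ramp", "full"],
}


def next_step(risk_tier, current, kri_breached):
    """Decide what follows the current rollout step.

    Returns "rollback" on any KRI breach during the monitoring window,
    the next step in the tier's plan otherwise, or "done" after the last.
    """
    if kri_breached:
        return "rollback"
    plan = ROLLOUT_PLANS[risk_tier]
    position = plan.index(current)  # ValueError if the step is not in the plan
    if position + 1 < len(plan):
        return plan[position + 1]
    return "done"
```

A runbook built on this would add, for each step, the traffic share, the monitoring window length, and who is authorized to call the rollback.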