AI Management System / ISO 42001 Operating Model Playbook
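Positioning: For senior AI PMs, AI BAs, AI Architects, Model Risk, and retail financial services product and architecture teams. This playbook combines the AI Management System (AIMS), ISO/IEC 42001:2023, NIST AI RMF, the GenAI risk profile, AI inventory, control library, release gates, evidence binder and management review into an AI governance operating system that is runnable, auditable and continually improvable.
Scope: This playbook covers customer-facing GenAI, credit / fraud / KYC / AML AI, internal copilots, third-party model integration, RAG, decision engines, agentic workflows and AI platform governance. It does not treat governance as a stack of documents; it connects policy, objective, process, control, evidence, KPI / KRI and architecture review to the AI lifecycle.
Important note: This playbook is material for learning, portfolio work and internal solution training. It does not constitute legal advice, a compliance conclusion, a certification conclusion, a model validation report or a regulatory interpretation. ISO/IEC 42001 certification, regulatory compliance and internal risk acceptance must be confirmed by Legal, Compliance, Model Risk, Privacy, Security, Internal Audit, the Business Owner, Data Owner, Technology Owner and management, based on institution type, jurisdiction, product scope, customer impact and internal policy.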
Source Anchors
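The sources below anchor this playbook's management system, AI risk governance and GenAI risk profile. The playbook translates them into products, architecture, governance forums, evidence packs and release gates; it does not treat any standard, framework or public material as a legal compliance conclusion.
| Anchor | Link | How this playbook uses it |
|---|---|---|
| ISO official ISO/IEC 42001 page | https://www.iso.org/standard/42001 | Anchor for the AIMS management system: establishing, implementing, maintaining and continually improving an AI management system, and turning organizational context, leadership, planning, support, operation, performance evaluation and improvement into an operable governance model. |
| NIST AI Risk Management Framework | https://www.nist.gov/itl/ai-risk-management-framework | Anchor for AI risk management structure: uses Govern / Map / Measure / Manage to organize AI risk identification, measurement, treatment, monitoring and accountability. |
| NIST AI RMF 1.0 publication | https://www.nist.gov/publications/artificial-intelligence-risk-management-framework-ai-rmf-10 | Anchor for the official AI RMF 1.0 publication: applies trustworthy AI characteristics, governance functions and risk management practices to the retail financial AI lifecycle. |
| NIST AI RMF Generative AI Profile | https://www.nist.gov/publications/artificial-intelligence-risk-management-framework-generative-artificial-intelligence | Anchor for the GenAI risk profile: brings hallucination, data leakage, harmful content, overreliance, prompt / retrieval risk, synthetic content, incident response and monitoring into the AIMS. |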
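1. One-Sentence Positioning
The core of an AI Management System is not writing a few more AI policies but building an organization-level control plane:
AIMS Operating Model =
Bring AI use cases, models, data, suppliers, people, processes, technical architecture and risk controls
into a unified inventory, risk tiering, lifecycle gates, governance forums,
control library, monitoring, incident management, evidence binder
and continual improvement cycle.
In retail financial services, a senior PM or architect should be able to answer six questions:
- Does the institution know where it uses AI, which models it uses, and which customers and processes are affected?
- Are the risk tier, owner, human oversight, supplier dependency and release status clear for every AI system?
- Are product, architecture, model, data, security, privacy, compliance, operations and supplier controls placed in a single lifecycle?
- Do pre-launch architecture review, risk assessment, impact assessment, eval, monitoring and evidence form a gate?
- Do post-production drift, incidents, customer harm, overrides, complaints, supplier changes and model updates feed a closed loop?
- Can management see, through KPI / KRI, management review and corrective action, whether the AIMS remains effective?
The goal of this capability is not to stop AI innovation but to give it rails:
| Traditional problem | How the AIMS operating system responds |
|---|---|
| Teams launch AI independently, and risk information is scattered across Jira, documents, supplier emails and model notebooks | Unified AI inventory, use case intake, risk tiering and evidence binder |
| The PM writes only the business PRD, the model team delivers only accuracy, and architecture looks only at interfaces | Lifecycle gates require business objectives, customer impact, model evaluation, data controls, security, privacy, supplier and operations to pass review together |
| GenAI copilots spread quickly and nobody knows who can access which data | RAG / tool permission, data boundary, prompt / response logging, human oversight and DLP controls are included in the control library |
| A third-party model's behavior changes after an upgrade, and the internal system is not re-evaluated | Supplier change notification, version pinning, regression eval, risk acceptance and release gate |
| Evidence is gathered ad hoc only after an incident | Production logs, monitoring, approvals, tests, model cards, supplier assessments, incident records and management reviews are archived routinely |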
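2. How AIMS and AI Risk Frameworks Combine
2.1 A Product-Oriented Reading of ISO/IEC 42001
Product and architecture teams can read ISO/IEC 42001:2023 as the skeleton of an AI management system. Its concern is not individual model metrics but how the organization manages AI systematically:
| Management system theme | Product / architecture interpretation | Retail financial implementation |
|---|---|---|
| Context | Why the organization uses AI, and what the external and internal requirements are | Credit, payments, anti-fraud, AML, customer service, wealth, operational efficiency, regulatory expectations |
| Leadership | How management sets AI policy, accountability and risk appetite | AI Steering Committee, AIMS Owner, risk acceptance authority |
| Planning | How to identify risks and opportunities and set AI objectives | Use case portfolio, automation benefit, customer impact, risk tier, KPI / KRI |
| Support | How resources, competence, awareness, communication and documentation support the AIMS | Training, role competency, evidence repository, communication protocol |
| Operation | How AI lifecycle processes run | intake, impact assessment, design review, validation, release, monitoring, retirement |
| Performance evaluation | How to monitor, measure, audit internally and conduct management review | dashboard, internal audit, management review, control effectiveness |
| Improvement | How to handle nonconformity, corrective action and continual improvement | incident postmortem, control gap remediation, policy refresh, model retraining |
Senior framing:
The AIMS is not “the AI compliance department's document system” but a management system spanning product, architecture, data, models, suppliers and operations. It fixes responsibility, evidence, risk acceptance and continuous monitoring for every AI use case from idea to retirement.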
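2.2 NIST AI RMF as the Operating Language
The Govern / Map / Measure / Manage functions of the NIST AI RMF work well as the day-to-day operating language of the AIMS:
| NIST AI RMF function | Operational meaning in the AIMS | Key artifacts |
|---|---|---|
| Govern | Establish policy, roles, accountability, culture, training, risk appetite and oversight mechanisms | AI policy, RACI, governance forums, control library, training record |
| Map | Understand the use case, context, stakeholders, data, processes, impact and risk | AI inventory, impact assessment, data map, business process map |
| Measure | Measure models, systems, data, user experience, security, privacy, fairness and GenAI risk | eval report, red team report, calibration, fairness, security test, RAG quality |
| Manage | Treat, accept, transfer, mitigate, monitor and communicate AI risk | release decision, risk acceptance, monitoring dashboard, incident plan, corrective action |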
2.3 What the GenAI Profile Adds
Governance of traditional predictive models focuses on data, features, performance, bias, stability and thresholds. GenAI / LLM / RAG / agentic AI also requires attention to:
| GenAI risk | AIMS control point |
|---|---|
| Hallucination / unsupported claim | groundedness eval, citation support, answerability gate, high-risk refusal |
| Sensitive data leakage | data classification, prompt DLP, retrieval permission, response filtering, log redaction |
| Prompt injection / tool abuse | instruction hierarchy, tool allowlist, sandbox, transaction limit, human confirmation |
| Overreliance | UX boundary, human review, confidence / evidence display, training and supervision |
| Model version volatility | model card, version pinning, regression eval, change notification, rollback plan |
| Synthetic content misuse | watermark / provenance where relevant, content policy, customer communication rules |
| Third-party dependency opacity | supplier due diligence, contractual controls, audit evidence, exit strategy |
3. Retail Financial AIMS Operating Model
3.1 Operating Model Overview
Enterprise AI Strategy
-> AI Policy and Risk Appetite
-> Use Case Intake and AI Inventory
-> Risk Tiering and Impact Assessment
-> Architecture and Control Design
-> Model / Data / GenAI Evaluation
-> Release Gates and Risk Acceptance
-> Production Monitoring and Human Oversight
-> Incident / Change / Supplier Management
-> Evidence Binder and Internal Audit
-> Management Review and Continual Improvement
This is not a linear document flow but a closed loop:
Plan:
AI objectives, policy, scope, risk appetite, annual AI roadmap
Do:
build, buy, integrate, test, release, operate
Check:
monitor KPIs / KRIs, control effectiveness, incidents, audit findings
Act:
corrective action, policy update, control improvement, portfolio reprioritization
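3.2 AIMS Scope Boundary
A mature AIMS must state explicitly what falls within its management scope. For retail financial services, a broad scope is recommended:
| Object | In AIMS scope? | Rationale |
|---|---|---|
| Customer-visible GenAI assistant | In scope | Affects customer understanding, service paths, complaints, entitlements and brand trust |
| Credit scoring, credit limit, pricing and collections models | In scope | High customer impact and regulatory sensitivity |
| Fraud detection, transaction blocking, step-up authentication | In scope | Affects funds availability, customer experience and losses |
| KYC / AML alert triage, entity resolution, case summarization | In scope | Affects compliance investigations, suspicious activity identification and operational judgment |
| Internal customer service / branch / operations copilot | In scope | May affect employee advice, customer communication and sensitive data access |
| Third-party LLM API, embedding model, OCR, IDV vendor | In scope | Supplier dependency, data sharing, model change and contract risk |
| ML modules inside automated rule engines | In scope | Even if the component looks like a rule from the outside, it may carry AI risk |
| Purely static reports and hard-coded manual rules | Decide per internal definition | Without learning, inference, generation or adaptive capability, these usually follow ordinary IT / data governance; if an AI system calls them, the dependency must still be recorded |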
3.3 Three-Layer Governance Structure
| Layer | Decision focus | Typical forums |
|---|---|---|
| Board / Executive | AI strategy, risk appetite, material risk acceptance, resources, major incidents, management review outputs | Board Risk Committee, Executive AI Steering Committee |
| Enterprise Control | AIMS policy, inventory, control library, cross-functional gates, audit readiness, supplier standards | AI Governance Council, Model Risk Committee, Data / Privacy / Security Council |
| Product Delivery | Requirements, architecture, eval, launch, monitoring, operations and improvement for an individual use case | Use Case Review, AI Design Authority, Release Gate, Incident Review |
3.4 Governance Forum Design
| Forum | Cadence | Chair | Main inputs | Main outputs |
|---|---|---|---|---|
| Executive AI Steering Committee | Quarterly, plus ad hoc for major incidents | Executive Sponsor | AI portfolio, KPI / KRI, high-risk launches, material incidents, audit findings | Risk appetite updates, resource decisions, material risk acceptance, management review actions |
| AIMS Governance Council | Monthly | AIMS Owner / Enterprise Risk | inventory completeness, control exceptions, policy changes, cross-business issues | control updates, remediation priority, forum escalation |
| AI Use Case Intake Forum | Weekly or biweekly | AI Product Portfolio Lead | New use cases, value hypothesis, customer impact, data needs, supplier options | risk tier, owner assignment, next review path |
| AI Design Authority / Architecture Review | Weekly | AI Architect / Enterprise Architect | solution architecture, data flow, model choice, RAG / agent design, security design | architecture decision record, required controls, technical conditions |
| Model / AI Risk Validation Forum | Aligned to release cadence | Model Risk / Responsible AI Lead | eval report, fairness, calibration, red team, human oversight plan | validation outcome, limitations, risk acceptance recommendation |
| Supplier AI Risk Forum | Monthly | Vendor Risk / Procurement | third-party model inventory, SLA, change notice, security assurance, contract controls | vendor approval, control gaps, exit plan action |
| AI Incident Review Board | Event-triggered, within 24-72 hours when necessary | Incident Commander / Risk Owner | incident log, customer impact, root cause, temporary controls | containment, communication plan, corrective action, lessons learned |
| Management Review | At least annually; a lightweight quarterly version is recommended for high-risk organizations | Senior Management | AIMS performance, objectives, audit results, incidents, stakeholder feedback, changes | AIMS improvement decisions, resource allocation, policy and objective refresh |
3.5 RACI
| Activity | Executive Sponsor | AIMS Owner | Product Owner | AI PM | AI Architect | Model Owner | Data Owner | Risk / Compliance | Security / Privacy | Vendor Owner | Operations |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AI policy and objectives | A | R | C | C | C | C | C | C | C | C | C |
| Use case intake | I | A | R | R | C | C | C | C | C | C | C |
| Risk tiering | I | A | R | R | C | C | C | R | C | C | C |
| Impact assessment | I | A | R | R | C | C | R | R | R | C | C |
| Architecture review | I | C | C | R | A | R | R | C | R | C | C |
| Model / GenAI evaluation | I | C | C | R | C | A | R | R | C | C | C |
| Supplier assessment | I | C | C | C | C | C | C | R | R | A | C |
| Release gate decision | A for material risk | A | R | R | R | R | C | R | R | C | C |
| Human oversight operation | I | C | A | R | C | C | C | C | C | C | R |
| Monitoring and incident response | I | A | R | R | R | R | R | R | R | R | R |
| Management review | A | R | C | C | C | C | C | C | C | C | C |
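RACI legend:
| Code | Meaning |
|---|---|
| R | Responsible: performs the work or produces the output |
| A | Accountable: answers for the outcome and holds final responsibility |
| C | Consulted: provides expert input |
| I | Informed: kept informed and receives the results |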
4. AI Policy, Objectives and Process Architecture
4.1 AI Policy Architecture
An AI policy is not a standalone declaration but a hierarchical set of policies and standards:
Enterprise AI Policy
-> AI Use Case Classification Standard
-> AI Lifecycle Standard
-> AI Risk and Impact Assessment Standard
-> Data and Privacy for AI Standard
-> GenAI and RAG Usage Standard
-> Human Oversight Standard
-> Third-Party AI Supplier Standard
-> AI Monitoring and Incident Standard
-> AI Evidence and Record Retention Standard
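4.2 Key Principles of the AI Policy
| Principle | Operational interpretation |
|---|---|
| Accountability | Every AI system must have a business owner, technical owner, risk owner and monitoring owner |
| Risk proportionality | Control strength scales with customer impact, degree of automation, regulatory touchpoints, irreversibility and supplier dependency |
| Human oversight | High-impact scenarios must define human review, override, appeal, QA and feedback loop |
| Transparency fit for audience | Communication to customers, employees, audit, management and regulators uses different levels of detail, but internal evidence must be traceable |
| Data boundary | Sensitive data, customer data, training data, retrieval data and logs must be governed by data classification and access control |
| Measurability | AI risk cannot be expressed through principles alone; it needs measurable metrics, thresholds and monitoring frequency |
| Change control | Changes to model, prompt, retriever, feature, policy, threshold or vendor version must trigger an impact assessment |
| Continual improvement | Incidents, audit findings, drift, customer complaints, human overrides and supplier changes must feed back into control improvement |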
4.3 AIMS Objectives
AIMS objectives should cover value, risk, control, capability and evidence together. Examples:
| Objective | KPI / KRI | Interpretation |
|---|---|---|
| Inventory completeness | 100% production AI systems recorded with owner, risk tier, model / vendor, release status | The organization knows where it uses AI |
| Gate adherence | High-risk AI systems pass documented release gate before production | Prevents high-risk AI from bypassing review |
| Evidence freshness | Evidence binder refreshed after material change and at defined review cadence | Audit evidence stays current and valid |
| Human oversight effectiveness | Override rate, QA defect rate, queue SLA, appeal outcome tracked by use case | Human oversight is not a formality |
| GenAI answer quality | Groundedness, citation support, refusal accuracy, harmful content rate | Customer-facing GenAI has a measurable quality gate |
| Model / system stability | Drift alerts, calibration movement, performance degradation, rollback frequency | Production behavior remains under control |
| Incident response maturity | AI incident triage time, containment time, customer impact assessment completion | Incidents can be identified, handled and communicated quickly |
| Supplier control | Critical AI suppliers assessed, change notifications tracked, exit plan tested | Third-party models do not become a black-box single point of risk |
| Training coverage | Role-based AI governance training completion by product, engineering, operations, risk | Employees understand their AI responsibilities |
4.4 Process Architecture
1. Portfolio and Intake
AI opportunity funnel, use case intake, value / risk hypothesis
2. Classification and Inventory
AI definition check, system boundary, owner assignment, risk tier
3. Risk and Impact Assessment
customer impact, rights / access / funds / credit / compliance / privacy / security impact
4. Design and Architecture
data flow, model / vendor choice, integration pattern, human oversight, fallback
5. Build / Buy / Configure
model development, prompt / RAG config, vendor integration, controls implementation
6. Evaluation and Validation
performance, robustness, fairness, calibration, security, privacy, GenAI eval, UAT
7. Release Gate
evidence review, residual risk, risk acceptance, launch scope and conditions
8. Operate and Monitor
production metrics, human oversight, issue management, incident response
9. Change and Supplier Management
model update, prompt change, data change, vendor release, threshold change
10. Retirement
decommission, data retention, customer communication, evidence archive
5. AI Inventory
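5.1 Positioning of the Inventory
The AI inventory is the AIMS's core system of record. Without an inventory, the subsequent risk assessment, release gate, monitoring, incident response, supplier control and management review have no object to act on.
The AI inventory should not record only “models”. It should record the AI system: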
AI system =
use case + business process + users + customer impact + data + model / prompt / retriever
+ tools + decision policy + human oversight + monitoring + supplier dependencies
5.2 Inventory Data Fields
| Field | Completion standard | Example |
|---|---|---|
| AI system ID | Unique identifier within the institution | AI-CARD-SERVICING-RAG-001 |
| Use case name | Name the business can understand | Credit card servicing GenAI assistant |
| Business process | Corresponding process and process owner | Dispute intake, fee inquiry, payment assistance |
| AI capability type | predictive, generative, retrieval, classification, optimization, agentic workflow | RAG + intent classification + summarization |
| User type | customer, employee, analyst, developer, operations | customer and contact center agent |
| Customer impact | Impact on funds, accounts, credit, complaints, compliance or service entitlements | medium to high depending on intent |
| Automation level | decision, recommendation, draft, triage, search, summarization | answer draft with restricted automatic response |
| Risk tier | low, medium, high, material with rationale | high for credit decision / complaint paths |
| Model / vendor | internal model, third-party API, open-source model, vendor product | hosted LLM API + internal retriever |
| Data sources | training, evaluation, retrieval, production input, logs | policy docs, account metadata, interaction logs |
| Sensitive data | PII, financial data, transaction data, identity data, protected-class proxy | PII and account servicing data |
| Human oversight | reviewer role, trigger, queue, SLA, override logging | specialist review for regulated intents |
| Controls | linked control IDs from control library | CTRL-GENAI-ANSWERABILITY, CTRL-HITL-ESCALATION |
| Release status | concept, design, validation, pilot, production, suspended, retired | limited pilot |
| Monitoring owner | accountable owner for production metrics | Servicing AI Operations Lead |
| Evidence binder | link to approved evidence location | AIMS evidence repository record |
| Review cadence | required periodic review frequency | monthly in pilot, quarterly after stabilization |
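
As a minimal sketch, the field table can be carried as a typed record with a few completeness checks. The class name `InventoryRecord`, the subset of fields and the rule that pilot and production systems need a monitoring owner and an evidence binder link are illustrative assumptions, not requirements stated by ISO/IEC 42001 or by this playbook.

```python
from dataclasses import dataclass, field
from typing import List

# Allowed values are copied from the field table; the validation rules are assumptions.
RISK_TIERS = {"low", "medium", "high", "material"}
RELEASE_STATUSES = {
    "concept", "design", "validation", "pilot",
    "production", "suspended", "retired",
}


@dataclass
class InventoryRecord:
    """Hypothetical subset of the inventory fields for one AI system."""
    system_id: str
    use_case_name: str
    risk_tier: str
    release_status: str
    monitoring_owner: str = ""
    evidence_binder: str = ""
    control_ids: List[str] = field(default_factory=list)

    def validation_errors(self) -> List[str]:
        """Return human-readable gaps; an empty list means the record looks complete."""
        errors = []
        if self.risk_tier not in RISK_TIERS:
            errors.append(f"unknown risk tier: {self.risk_tier}")
        if self.release_status not in RELEASE_STATUSES:
            errors.append(f"unknown release status: {self.release_status}")
        # Assumed rule: live or piloted systems need an owner and evidence link.
        if self.release_status in {"pilot", "production"}:
            if not self.monitoring_owner:
                errors.append("monitoring owner missing")
            if not self.evidence_binder:
                errors.append("evidence binder link missing")
        return errors


record = InventoryRecord(
    system_id="AI-CARD-SERVICING-RAG-001",
    use_case_name="Credit card servicing GenAI assistant",
    risk_tier="high",
    release_status="pilot",
    control_ids=["CTRL-GENAI-ANSWERABILITY", "CTRL-HITL-ESCALATION"],
)
print(record.validation_errors())
```

In practice the record would live in the institution's inventory tool; the point of the sketch is only that every field in the table can be checked mechanically.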
5.3 Inventory Risk Tiering
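| Tier | Typical scenarios | Control strength |
|---|---|---|
| Low | Internal low-risk document search, summaries not used for customer decisions, developer assistance without sensitive data | Basic inventory, data boundary, usage policy, monitoring |
| Medium | Employee advice, operations prioritization, customer service assistance, limited hard-to-reverse impact | impact assessment, architecture review, eval, human oversight, periodic QA |
| High | Customer-visible answers, credit / fraud / KYC / AML assistance, complaints, account restrictions, funds availability | full release gate, model / GenAI validation, fairness / security / privacy review, incident plan |
| Material | Automated or semi-automated impact on customer rights, credit, funds access or regulatory reporting, or material outsourcing dependency | executive oversight, formal risk acceptance, enhanced monitoring, internal audit review readiness |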
5.4 Inventory Health Indicators
| Indicator | Management meaning |
|---|---|
| Systems without owner | Accountability gap |
| Systems without risk tier | Control strength cannot be determined |
| Production systems without release evidence | Release gate failure |
| High-risk systems without monitoring owner | Production operating risk |
| Third-party AI systems without supplier record | Supplier control gap |
| Systems with stale review date | Continual improvement failure |
| Systems with material change but no reassessment | Change control breakdown |
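
The indicators above can be computed directly from inventory records. The sketch below is illustrative: the dictionary keys (`owner`, `release_evidence`, `last_review` and so on) and the 90-day staleness window are assumptions chosen for the example, and only five of the seven indicators are covered.

```python
from datetime import date


def inventory_health(records, today, max_review_age_days=90):
    """Count inventory records that hit each health indicator."""
    counts = {
        "without_owner": 0,
        "without_risk_tier": 0,
        "production_without_release_evidence": 0,
        "high_risk_without_monitoring_owner": 0,
        "stale_review": 0,
    }
    for r in records:
        if not r.get("owner"):
            counts["without_owner"] += 1
        if not r.get("risk_tier"):
            counts["without_risk_tier"] += 1
        if r.get("release_status") == "production" and not r.get("release_evidence"):
            counts["production_without_release_evidence"] += 1
        if r.get("risk_tier") in ("high", "material") and not r.get("monitoring_owner"):
            counts["high_risk_without_monitoring_owner"] += 1
        last_review = r.get("last_review")
        # A record that was never reviewed is treated as stale.
        if last_review is None or (today - last_review).days > max_review_age_days:
            counts["stale_review"] += 1
    return counts


sample = [
    {"owner": "Servicing PO", "risk_tier": "high", "release_status": "production",
     "release_evidence": "binder-link", "monitoring_owner": "",
     "last_review": date(2024, 1, 1)},
    {"owner": "", "risk_tier": None, "release_status": "production",
     "release_evidence": None, "monitoring_owner": "Ops Lead",
     "last_review": date(2024, 5, 20)},
]
health = inventory_health(sample, today=date(2024, 6, 1))
print(health)
```

A non-zero count on any indicator is a candidate agenda item for the AIMS Governance Council rather than an automatic finding.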
6. Risk Management and Impact Assessment
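6.1 Core Questions of the Impact Assessment
An AI impact assessment does not ask “is the model accurate?” but rather:
| Question | Senior judgment |
|---|---|
| Who is affected | Customers, employees, investigators, underwriters, compliance teams, third parties, vulnerable customers |
| What is affected | Funds, account access, credit opportunity, fees, complaints, privacy, identity, compliance investigations, work quality |
| Is it automated | Fully automated, semi-automated, human-review assisted, draft, ranking, summarization |
| Is it reversible | Whether wrong answers, false declines, false blocks, false positives, false negatives and erroneous SAR supporting material can be corrected |
| Is the evidence sufficient | Whether data, model, supplier, eval, human oversight and monitoring support the launch |
| Who accepts residual risk | Product Owner, Risk Owner, Executive Sponsor or a formal committee |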
6.2 Risk Taxonomy
| Risk category | Retail financial example | Typical controls |
|---|---|---|
| Customer harm | Incorrect fee explanation, incorrect chargeback advice, misleading credit eligibility | customer UX boundary, human escalation, QA sampling, complaint monitoring |
| Fairness / discrimination | Higher error rates for certain groups in credit, fraud or identity verification | segment testing, fair lending review, threshold governance |
| Model performance | drift, calibration loss, rising false positives / false negatives | eval suite, monitoring, rollback, retraining |
| GenAI hallucination | RAG answers whose citations do not support them, generated incorrect policy explanations | answerability, citation support, groundedness eval, refusal policy |
| Privacy / data protection | Prompts leaking PII, logs retaining sensitive information, suppliers training on customer data | data classification, DLP, retention control, contract restriction |
| Security | prompt injection, tool abuse, model supply chain, data exfiltration | threat model, red team, tool sandbox, least privilege |
| Operational resilience | An AI outage causes customer service, fraud or AML queue backlogs | fallback process, manual queue capacity, SLO / incident playbook |
| Supplier | Third-party model changes, SLA failures, insufficient audit rights, difficult exit | supplier due diligence, contract controls, exit strategy |
| Regulatory / legal | Insufficient supporting material for regulated communications, credit notices or AML decisions | legal / compliance review, record retention, human accountable decision |
| Reputation | Customer-visible AI errors spread externally | incident communication, public response protocol, customer remediation |
6.3 Risk Acceptance
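Risk acceptance must be specific; it should not be recorded as “the business accepts the risk”. Recommended fields:
| Dimension | Recording requirement |
|---|---|
| Accepted risk | Clear description of the residual risk and its triggering conditions |
| Scope | Customer segment, channel, product, transaction type, language, region |
| Time limit | Acceptance period and re-review date |
| Conditions | Launch conditions, monitoring thresholds, human review, pilot cap |
| Owner | Role or committee authorized to accept risk at this tier |
| Evidence | Assessments, tests and legal / compliance / risk input supporting the acceptance |
| Reopen triggers | drift, incident, complaint, supplier change, policy change, threshold breach |
Example:
| Item | Example content |
|---|---|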
|---|---|
| Accepted risk | Credit card servicing assistant may fail to answer complex dispute edge cases and route to specialist |
| Scope | Logged-in customers, English, fee inquiry and dispute intake only, no adverse credit decision explanation |
| Conditions | Answerability pass rate above gate threshold, all regulated intents routed to specialist, weekly QA |
| Owner | Servicing Product Owner and Operational Risk delegate, with AIMS Council notification |
| Reopen trigger | unsupported high-risk answer, complaint cluster, vendor model change, answerability degradation |
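
A risk acceptance record becomes enforceable once its time limit and reopen triggers are machine-checkable. The following is a minimal sketch under assumptions: the class `RiskAcceptance`, its field names, the review date and the event strings are hypothetical, and a real implementation would read observed events from monitoring and incident systems.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class RiskAcceptance:
    """Hypothetical record of one accepted residual risk."""
    accepted_risk: str
    owner: str
    review_by: date                      # end of the acceptance period
    reopen_triggers: FrozenSet[str]      # events that invalidate the acceptance

    def reasons_to_reopen(self, today: date, observed_events: Set[str]) -> List[str]:
        """Return why the acceptance must be re-reviewed; empty means still valid."""
        reasons = []
        if today > self.review_by:
            reasons.append("time limit expired")
        for event in sorted(observed_events & self.reopen_triggers):
            reasons.append(f"reopen trigger: {event}")
        return reasons


acceptance = RiskAcceptance(
    accepted_risk="Assistant may fail complex dispute edge cases and route to specialist",
    owner="Servicing Product Owner",
    review_by=date(2025, 3, 31),  # illustrative date, not from the source
    reopen_triggers=frozenset({
        "unsupported high-risk answer",
        "complaint cluster",
        "vendor model change",
        "answerability degradation",
    }),
)
print(acceptance.reasons_to_reopen(date(2025, 1, 15), {"vendor model change"}))
```

Events that are not listed as triggers are ignored by design, which keeps the acceptance tied to the conditions its owner actually signed.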
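6.4 Impact Assessment Outputs
| Artifact | Content |
|---|---|
| Use Case Context | Business objective, users, process, channel, customer impact |
| AI System Boundary | Model, data, tools, suppliers, human oversight, fallback |
| Risk Tier Rationale | Risk tier and rationale |
| Impact Matrix | Customer, employee, operations, compliance, privacy, security and supplier impact |
| Control Requirements | Control IDs that must be implemented |
| Release Gate Path | Reviews, approvals and evidence required to pass |
| Monitoring Plan | KPI / KRI, owner, frequency, thresholds |
| Residual Risk Decision | Accept, mitigate, restrict scope, delay launch or stop |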
7. AI Control Library
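7.1 Role of the Control Library
The control library turns policy principles into reusable controls. It lets PMs and architects know what is required at design time, rather than patching documentation together just before launch.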
Policy principle
-> Control objective
-> Control activity
-> Evidence
-> Owner
-> Frequency
-> Release gate mapping
7.2 Control Categories
| Category | Control objective | Evidence |
|---|---|---|
| Governance | AI use cases are owned, classified and governed | inventory record, RACI, forum decision |
| Lifecycle | AI systems pass defined gates before production | gate checklist, approval record, architecture decision |
| Data | AI data is appropriate, permissioned, traceable and protected | data lineage, DPIA / privacy review input, quality report |
| Model / System Evaluation | AI performance and limitations are measured | eval report, calibration, robustness, fairness, validation signoff |
| GenAI / RAG | Generated output is grounded, safe, permissioned and monitored | golden set, groundedness report, citation support, red team |
| Human Oversight | Humans can supervise, override, appeal and improve AI outcomes | SOP, queue metrics, override log, QA report |
| Security | AI system resists misuse and protects assets | threat model, prompt injection tests, access review, security approval |
| Privacy | Sensitive data use and logs follow approved boundaries | retention rule, DLP report, vendor data use restriction |
| Supplier | Third-party AI dependencies are assessed and controlled | vendor due diligence, contract clauses, SLA, change notice |
| Monitoring | Production behavior, risk and control effectiveness are tracked | dashboard, alert history, review minutes |
| Incident | AI incidents are triaged, communicated and remediated | incident record, RCA, customer impact assessment, corrective action |
| Evidence | Records are complete, current and audit-ready | evidence binder index, record retention mapping |
7.3 Sample Controls
| Control ID | Control | Applicability | Evidence |
|---|---|---|---|
| CTRL-INV-001 | Every production AI system has an inventory record with owner, risk tier, model / vendor, release status and monitoring owner | all AI systems | inventory export and owner attestation |
| CTRL-RISK-002 | High-risk AI systems complete impact assessment before design approval | high and material tier | signed impact assessment and risk tier rationale |
| CTRL-ARCH-003 | AI architecture review validates data flow, model boundary, human oversight, fallback, security and supplier dependency | medium, high, material | architecture decision record |
| CTRL-EVAL-004 | Model / system evaluation includes performance, robustness, segment analysis and limitation statement | medium, high, material | evaluation report |
| CTRL-GENAI-005 | Customer-facing GenAI requires answerability, groundedness, citation support and refusal evaluation | GenAI customer-facing | GenAI eval pack |
| CTRL-HITL-006 | High-impact actions define human review trigger, override reason, queue SLA and QA sampling | high and material | human oversight SOP and metrics |
| CTRL-SUP-007 | Critical AI suppliers require due diligence, data-use restriction, change notification and exit plan | third-party AI | supplier evidence pack |
| CTRL-INC-008 | AI incidents are triaged by severity, customer impact, containment, communication and corrective action | all production AI | incident log and postmortem |
| CTRL-CHG-009 | Material changes trigger reassessment, regression eval and release gate update | all AI systems | change record and test evidence |
| CTRL-MON-010 | Production monitoring covers KPI, KRI, drift, errors, overrides, complaints and supplier SLA | production AI | monitoring dashboard and review record |
7.4 Control Design Quality
The difference between weak and strong controls:
| Weak control | Strong control |
|---|---|
| “The model should be evaluated before launch” | “High-risk model must pass documented eval covering segment performance, calibration, robustness, limitations and approved residual risk before production release” |
| “Human oversight is required” | “Transactions above impact threshold or below confidence threshold route to named review queue with SLA, override reason and QA sampling” |
| “Suppliers need to be assessed” | “Critical AI supplier requires security review, data-use restriction, model version change notification, SLA, audit evidence and exit plan” |
| “Monitor model performance” | “Monitor precision, recall, false positive cost, drift, segment metrics, human override, customer complaints and incident triggers weekly during pilot” |
8. Architecture Review and Release Gates
8.1 Core of the AI Architecture Review
An AI architecture review does not just look at cloud resources and APIs; it examines whether the AI system is controllable:
Business flow
-> data boundary
-> model / vendor boundary
-> prompt / retriever / tool boundary
-> decision policy
-> human oversight
-> monitoring
-> fallback
-> evidence
8.2 Reference Architecture
Channel / Workflow
-> AI Policy Enforcement
- user identity and entitlement
- use case scope
- risk tier
- data classification
-> Context and Data Layer
- feature store / customer data / case data
- retrieval index / knowledge graph / policy docs
- lineage and permission filters
-> Model and Orchestration Layer
- predictive model
- LLM / embedding / reranker
- prompt template and guardrails
- tools and workflow state
-> Decision and Control Layer
- confidence / evidence scoring
- business rules
- human escalation policy
- output filter
- risk acceptance condition
-> Experience and Operations
- customer response
- employee copilot
- analyst queue
- appeal / complaint path
-> Monitoring and Evidence
- logs and metrics
- drift and quality
- incident triggers
- evidence binder
8.3 Lifecycle Gates
| Gate | Objective | Pass criteria |
|---|---|---|
| G0 Idea Intake | Decide whether the idea qualifies as AI and merits entry into the portfolio | use case, value hypothesis, owner, initial risk tier |
| G1 Risk and Impact Assessment | Determine the risk tier and control requirements | impact assessment approved, AI inventory created |
| G2 Architecture and Control Design | Confirm the design is controllable, monitorable and supports rollback | architecture review completed, control library mapped |
| G3 Evaluation and Validation | Demonstrate that system performance and limitations have been measured | model / GenAI eval, security / privacy review, human oversight test |
| G4 Release Decision | Decide on pilot, limited launch, full launch or no release | evidence binder complete, residual risk accepted, monitoring ready |
| G5 Production Review | Check real-world performance and control effectiveness | KPI / KRI reviewed, incidents and overrides assessed |
| G6 Material Change Review | Handle changes to model, data, prompt, supplier, threshold or policy | change impact assessed, regression eval passed, release record updated |
| G7 Retirement | Decommission safely and retain records | decommission plan, data retention, customer / operations transition |
8.4 Release Gate Evidence
| Evidence | Low | Medium | High | Material |
|---|---|---|---|---|
| Inventory | required | required | required | required |
| Impact assessment | lightweight | standard | enhanced | enhanced with executive visibility |
| Architecture review | simplified | required | required | required |
| Model / system eval | basic | required | enhanced | independent validation where applicable |
| GenAI eval | if GenAI | if GenAI | required for GenAI | enhanced GenAI eval and red team |
| Fairness / segment review | risk-based | risk-based | required where customer impact exists | enhanced |
| Security / privacy review | risk-based | required | required | enhanced |
| Supplier review | if third-party | if third-party | required | enhanced |
| Human oversight SOP | if human involved | required for assisted decisions | required | required with QA |
| Monitoring plan | required | required | required | required with KRI thresholds |
| Risk acceptance | owner | owner | risk forum | executive or delegated authority |
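
The evidence matrix can be encoded so that a release gate reports exactly which items are missing. This sketch simplifies the table under stated assumptions: depth labels such as "lightweight" or "enhanced" are ignored, "risk-based" cells are treated as optional, fairness review is treated as required at High and Material, and the evidence keys are hypothetical names.

```python
TIERS = ("low", "medium", "high", "material")

# Evidence the table marks as present at every tier (depth varies by tier).
ALWAYS_REQUIRED = {
    "inventory", "impact_assessment", "architecture_review",
    "model_system_eval", "monitoring_plan", "risk_acceptance",
}


def required_evidence(tier, is_genai=False, is_third_party=False):
    """Return the set of evidence items a system needs at the given tier."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier}")
    required = set(ALWAYS_REQUIRED)
    if is_genai:
        required.add("genai_eval")
    if tier != "low":
        required.add("security_privacy_review")
    if is_third_party or tier in ("high", "material"):
        required.add("supplier_review")
    if tier in ("high", "material"):
        required.add("human_oversight_sop")
        required.add("fairness_segment_review")
    return required


def gate_decision(tier, submitted, is_genai=False, is_third_party=False):
    """Return ('pass', []) or ('blocked', sorted list of missing evidence)."""
    missing = sorted(required_evidence(tier, is_genai, is_third_party) - set(submitted))
    return ("pass" if not missing else "blocked", missing)


decision = gate_decision("high", ALWAYS_REQUIRED, is_genai=True)
print(decision)
```

Returning the missing items, rather than a bare pass or fail, gives the Release Gate forum a concrete remediation list.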
8.5 Architecture Review Questions
| Area | Review question |
|---|---|
| Scope | What exact action can the AI system influence, and what is excluded from scope? |
| Data | What data enters the system, under what permission, and where is it logged? |
| Model | What model or vendor is used, what version, and what limitations are known? |
| RAG | Which sources are authoritative, how is permission enforced, and how are stale sources blocked? |
| Agentic tools | What tools can the AI call, what limits exist, and where is human confirmation required? |
| Decision policy | What thresholds route to auto, assist, abstain, escalate or block? |
| Human oversight | Who reviews, what information do they see, how are overrides captured, and how does QA work? |
| Customer experience | Could the customer mistake the output for a formal decision, legal advice or guaranteed bank commitment? |
| Monitoring | Which KPI / KRI detect degradation, harm, drift, misuse and supplier failure? |
| Fallback | What happens when model, retriever, vendor or policy service fails? |
| Evidence | Which records prove the control operated as designed? |
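
The decision policy question (which thresholds route to auto, assist, abstain, escalate or block) can be made concrete with a small routing function. Everything numeric here is an assumption: the thresholds 0.9 and 0.6, the two-level `impact` value and the precedence order are placeholders that a real institution would set through its own risk appetite and validation evidence.

```python
def route(confidence, impact, regulated_intent=False, policy_violation=False,
          auto_threshold=0.9, assist_threshold=0.6):
    """Map one request to a handling path.

    confidence: score in [0, 1] from the model or evidence scorer (assumed input)
    impact: "low" or "high" customer impact for this intent (assumed input)
    """
    if policy_violation:
        return "block"        # output filter or scope rule already failed
    if regulated_intent:
        return "escalate"     # regulated intents always go to a specialist
    if impact == "high":
        # High-impact paths never run fully automatic in this sketch.
        return "assist" if confidence >= assist_threshold else "escalate"
    if confidence >= auto_threshold:
        return "auto"
    if confidence >= assist_threshold:
        return "assist"
    return "abstain"


for case in [(0.95, "low"), (0.95, "high"), (0.70, "low"), (0.30, "low")]:
    print(case, route(*case))
```

Keeping the policy in one pure function makes each threshold change a reviewable diff, which fits the change control principle in section 4.2.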
9. Financial Retail Use Cases
9.1 Customer-Facing GenAI
Use case:
Customer asks questions about fees, disputes, card benefits, payment options and account servicing.
Risk:
Customer may treat fluent AI output as official bank commitment, credit decision,
complaint response, legal interpretation or regulated notice.
AIMS controls:
- risk-tiered intent classification
- authoritative source registry
- retrieval permission filtering
- answerability and citation support gate
- regulated-intent escalation
- customer UX boundary language
- complaint and unsupported-claim monitoring
- human QA sampling and incident trigger
Release gates:
| Gate | Specific evidence |
|---|---|
| G1 | customer impact by intent, excluded intents, complaint / credit / legal boundary |
| G2 | RAG architecture, source governance, permission model, escalation policy |
| G3 | groundedness, citation support, refusal accuracy, harmful content, prompt injection test |
| G4 | limited pilot scope, QA plan, monitoring dashboard, risk acceptance |
| G5 | unsupported claim rate, customer complaint trend, escalation quality, source freshness |
9.2 Credit AI
Use case:
Underwriting, credit line assignment, pricing support, collections strategy,
early warning or affordability assistance.
Risk:
Model may influence credit opportunity, pricing, adverse action, fair lending,
collections treatment and customer financial outcomes.
AIMS controls:
- formal model governance path
- data lineage and feature justification
- fair lending / segment performance review
- calibration and threshold governance
- adverse-action boundary and explainability
- human review for near-threshold or high-impact cases
- outcome monitoring with label lag handling
- periodic validation and change control
Key architecture decision:
| Decision | Senior PM / Architect stance |
|---|---|
| Is AI making the credit decision or assisting a human decision? | Define accountable decision maker and evidence trail |
| Are GenAI explanations generated? | Do not let LLM invent adverse action reasons; bind explanations to approved reason code logic |
| Are third-party scores used? | Record supplier controls, permissible purpose, model limitations and change notifications |
| Is customer recourse available? | Connect appeal, manual review and complaint process |
9.3 Fraud AI
Use case:
Real-time fraud scoring, transaction blocking, step-up authentication,
scam detection, mule account detection and case prioritization.
Risk:
False positives block legitimate customers; false negatives create financial loss;
attackers adapt after model release.
AIMS controls:
- real-time decision policy with customer impact tiers
- false positive cost and customer friction metrics
- adversarial pattern monitoring
- step-up and manual review design
- incident trigger for sudden drift or attack pattern
- model rollback and emergency threshold change process
Monitoring package:
| Metric | Why it matters |
|---|---|
| Confirmed fraud rate | Detects model effectiveness |
| False decline / false positive proxy | Measures customer harm |
| Step-up completion and failure | Measures friction and fraud separation |
| Segment false positive rate | Detects unfair or unstable impact |
| Fraud loss and prevented loss | Connects risk control to business value |
| Attack pattern drift | Shows adversarial adaptation |
| Manual override and appeal outcome | Tests human oversight quality |
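As a rough illustration of the segment false positive rate metric, the sketch below computes, per segment, the share of legitimate transactions that were blocked. The tuple layout and the function name `segment_false_positive_rate` are assumptions made for this example; in practice the fraud label arrives with a lag and the false positive figure is usually a proxy.

```python
from collections import defaultdict


def segment_false_positive_rate(decisions):
    """Compute the false positive rate per segment.

    decisions: iterable of (segment, was_blocked, was_confirmed_fraud) tuples.
    Returns {segment: blocked legitimate transactions / all legitimate transactions}.
    """
    blocked_legit = defaultdict(int)
    legit = defaultdict(int)
    for segment, was_blocked, was_fraud in decisions:
        if was_fraud:
            continue  # the rate is measured over legitimate transactions only
        legit[segment] += 1
        if was_blocked:
            blocked_legit[segment] += 1
    return {segment: blocked_legit[segment] / legit[segment] for segment in legit}
```

A large gap between segments is a prompt for review, not a conclusion; segment definitions and acceptable gaps remain policy decisions.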
9.4 KYC / AML AI
Use case:
Identity verification, document classification, name screening, entity resolution,
transaction monitoring alert triage, narrative generation and case summarization.
Risk:
AI may miss suspicious activity, create false positives, bias analyst attention,
or produce unsupported investigation narratives.
AIMS controls:
- analyst-in-the-loop design
- evidence traceability for alerts and summaries
- no autonomous suspicious activity conclusion without accountable human process
- QA sampling and analyst disagreement monitoring
- data retention and case record controls
- model limitations documented by typology and segment
GenAI use in AML should be treated as an evidence assistant, not a final authority:
| GenAI capability | Allowed design |
|---|---|
| Case summarization | Summarize source-linked facts with citations to transaction and case data |
| Narrative draft | Draft for analyst review with required source references |
| Typology suggestion | Suggest candidate typologies with evidence confidence and analyst confirmation |
| SAR decision | Remains in formal human-controlled compliance process |
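One way to enforce the source-linked requirement for summaries and narrative drafts is a simple pre-review check that flags sentences without a citation. The sketch below assumes a hypothetical citation format such as `[TXN-1001]` or `[CASE-77]`; the format, the regular expressions and the function name are illustrative assumptions, and sentence splitting this simple will miss edge cases such as abbreviations.

```python
import re

# Hypothetical citation marker format, e.g. [TXN-1001] or [CASE-77].
CITATION = re.compile(r"\[(?:TXN|CASE)-\d+\]")
# Naive sentence boundary: whitespace that follows ., ! or ?
SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")


def uncited_sentences(summary):
    """Return the sentences in a draft summary that carry no source citation."""
    sentences = [s.strip() for s in SENTENCE_BREAK.split(summary) if s.strip()]
    return [s for s in sentences if not CITATION.search(s)]
```

Flagged sentences would go back to the analyst for a source or for removal; the check says nothing about whether a cited source actually supports the claim.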
9.5 Internal Copilot
Use case:
Employee copilot for policy search, call summarization, operations SOP,
developer productivity, architecture review preparation and QA assistance.
Risk:
Employees may over-trust answers, expose sensitive data, or use outputs outside approved context.
AIMS controls:
- role-based access and source permissions
- employee training and usage policy
- answer boundary and source citation
- sensitive data logging controls
- QA sampling for high-impact teams
- prohibited use list for formal customer decisions
9.6 Third-Party Model Integration
Use case:
Hosted LLM API, fraud consortium score, ID verification vendor,
OCR model, embedding model, customer-service AI platform or decisioning SaaS.
Risk:
The institution may lack transparency into model behavior, training data,
version changes, outage patterns, security posture and data use.
AIMS controls:
- supplier classification and criticality assessment
- data-use restriction and retention terms
- model version and change notification
- service level and incident notification
- independent testing on institution data
- exit strategy and fallback process
10. Supplier and Vendor Controls
10.1 Supplier Risk Questions
| Area | Question |
|---|---|
| Data use | Can supplier use prompts, outputs, customer data or logs for training or service improvement? |
| Location and retention | Where is data processed and stored, and how long is it retained? |
| Model change | How are model version changes, deprecations and behavior changes communicated? |
| Assurance | What security, privacy, model governance or AI assurance evidence is available? |
| Subprocessors | Which subprocessors touch data or model operations? |
| Incident | What is the notification timeline and content for AI-related incidents? |
| Testing | Can the institution run its own eval, red team and regression tests? |
| Exit | Can the institution migrate, export data and maintain service continuity? |
| Audit | What review rights or evidence-sharing mechanisms exist? |
| Limitations | What use cases are prohibited or unsupported by supplier terms? |
10.2 Contract and Operating Controls
| Control | Product / architecture impact |
|---|---|
| Data-use restriction | Determines whether sensitive customer prompts can be sent |
| Version pinning or notification | Determines regression testing and release gate cadence |
| SLA and uptime | Determines fallback process and operational resilience |
| Incident notification | Determines customer impact assessment and regulatory communication readiness |
| Security assurance | Determines integration approval and ongoing monitoring |
| Audit evidence access | Determines internal audit and certification readiness |
| Deletion and retention | Determines logging, privacy and records architecture |
| Exit plan | Determines whether model dependency becomes operational lock-in |
10.3 Supplier Change Control
Material supplier changes include:
| Change | AIMS response |
|---|---|
| Model version upgrade | Regression eval, high-risk scenario test, release gate update |
| Terms of service change | Legal / procurement / risk review, data-use impact assessment |
| Data location change | Privacy / security review, jurisdiction impact |
| Subprocessor change | Supplier risk review |
| Degraded SLA or outage | Resilience review, fallback activation |
| New safety filter behavior | UX and refusal accuracy test |
| Deprecated API | Migration plan and risk acceptance if timeline is constrained |
11. Monitoring, KPI / KRI and Management Review
11.1 Monitoring Stack
System telemetry:
latency, uptime, API errors, cost, token usage, queue depth
AI quality:
accuracy, calibration, groundedness, citation support, refusal quality,
false positive / false negative, analyst agreement
Risk and customer impact:
complaints, appeals, false declines, remediation, customer abandonment,
harmful answer, privacy event
Control effectiveness:
release gate adherence, evidence freshness, override reason quality,
supplier notification timeliness, training completion
Portfolio health:
inventory completeness, high-risk exceptions, incidents, audit findings,
corrective action aging
11.2 KPI / KRI Library
| Metric | Type | Use |
|---|---|---|
| Inventory completeness | KPI | Confirms AIMS coverage |
| High-risk systems without current review | KRI | Signals governance gap |
| Release gate exception count | KRI | Detects process bypass |
| Unsupported GenAI answer rate | KRI | Detects customer-facing answer risk |
| Human override rate | KPI / KRI | Measures decision quality and workflow stress |
| Override disagreement rate | KRI | Detects poor model or unclear policy |
| Customer complaint AI-tagged rate | KRI | Detects harm and trust issue |
| Supplier AI incident count | KRI | Tracks third-party dependency risk |
| Model / prompt changes without reassessment | KRI | Detects change control failure |
| Evidence binder freshness | KPI | Supports audit readiness |
| Corrective action aging | KRI | Shows improvement backlog risk |
| Training completion by role | KPI | Shows capability and awareness |
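A KRI library becomes operational once each metric has a limit and a routine check. The following is a minimal sketch under assumed names (`KriThreshold`, `evaluate_kris`); it treats every KRI as "higher is worse" and leaves the actual limits to the institution's risk appetite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KriThreshold:
    name: str
    limit: float  # hypothetical limit; real values come from risk appetite


def evaluate_kris(thresholds, observed):
    """Split KRIs into breached and missing, given observed values keyed by name."""
    result = {"breached": [], "missing": []}
    for threshold in thresholds:
        value = observed.get(threshold.name)
        if value is None:
            # No measurement is itself a monitoring gap, so report it rather than skip it.
            result["missing"].append(threshold.name)
        elif value > threshold.limit:
            result["breached"].append(threshold.name)
    return result
```

Reporting missing metrics separately keeps a silent data gap from being read as a healthy KRI.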
11.3 Management Review
Management review should not be a ceremonial presentation. It should decide whether AIMS remains suitable, adequate and effective.
Inputs:
| Input | Example |
|---|---|
| AIMS objectives status | inventory completeness, gate adherence, incident response performance |
| AI portfolio changes | new high-risk systems, retired systems, third-party expansion |
| KPI / KRI trends | unsupported answer rate, false positive rate, override rate, evidence freshness |
| Internal audit findings | control design gaps, operating effectiveness issues |
| Incidents and complaints | customer harm, model failure, GenAI misuse, supplier event |
| Supplier performance | SLA, change notification, assurance evidence |
| Regulatory and external change | new expectations, industry incidents, technology changes |
| Resource and competence | staffing, training, tooling, platform gaps |
| Corrective actions | overdue actions, repeated root causes |
Outputs:
| Output | Example |
|---|---|
| Policy updates | GenAI customer-facing answer standard revised |
| Objective refresh | increase coverage of supplier AI monitoring |
| Resource decisions | fund central AI inventory workflow and eval platform |
| Risk appetite adjustment | restrict autonomous actions for high-impact customer decisions |
| Corrective action priority | remediate evidence gaps for material systems |
| Portfolio decision | pause expansion of a use case until monitoring stabilizes |
11.4 Continual Improvement Loop
Trigger:
incident, audit finding, drift, complaint, supplier change, KPI / KRI breach,
failed release gate, management review action
Root cause:
policy gap, control design weakness, operating failure, data issue,
model limitation, supplier failure, training gap, architecture flaw
Corrective action:
owner, scope, due date, evidence, validation method, management visibility
Effectiveness check:
metric improved, control operated, recurrence reduced, evidence updated
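The loop above can be captured as a small record type. This is a sketch with hypothetical field names, not a prescribed schema; it shows two ideas from the loop: overdue days feed the corrective action aging KRI, and an action is only complete once its effectiveness check has passed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CorrectiveAction:
    action_id: str
    owner: str
    due_date: date
    closed_on: Optional[date] = None
    effectiveness_verified: bool = False

    def days_overdue(self, today: date) -> int:
        """Days past the due date for an open action; 0 if closed or not yet due."""
        if self.closed_on is not None:
            return 0
        return max((today - self.due_date).days, 0)

    def is_complete(self) -> bool:
        # Closure alone does not end the loop; the effectiveness check must pass too.
        return self.closed_on is not None and self.effectiveness_verified
```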
12. Incident, Change Control and Communication
12.1 AI Incident Taxonomy
| Severity | Example | Response |
|---|---|---|
| Sev 1 | AI causes material customer harm, regulatory reporting issue, sensitive data exposure, major discriminatory impact | immediate containment, executive escalation, legal / compliance / security involvement |
| Sev 2 | High-risk AI produces repeated unsupported or harmful outputs, false declines spike, vendor model change breaks controls | incident review board, customer impact assessment, temporary restrictions |
| Sev 3 | Monitoring threshold breach, QA defect cluster, human override spike | issue management, root cause, corrective action |
| Sev 4 | Low-impact quality issue, documentation gap, isolated internal copilot error | backlog remediation and trend monitoring |
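For first-pass triage, the taxonomy can be reduced to a few ordered checks. The sketch below is a deliberate simplification with assumed signal names (`IncidentSignal`, `triage_severity`); real severity assignment involves judgment by the incident owner and control functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncidentSignal:
    material_customer_harm: bool = False
    sensitive_data_exposure: bool = False
    high_risk_system_repeated_failure: bool = False
    monitoring_threshold_breach: bool = False


def triage_severity(signal):
    """Return an initial severity label; the most severe matching condition wins."""
    if signal.material_customer_harm or signal.sensitive_data_exposure:
        return "Sev 1"
    if signal.high_risk_system_repeated_failure:
        return "Sev 2"
    if signal.monitoring_threshold_breach:
        return "Sev 3"
    return "Sev 4"
```

Checking the most severe conditions first means an incident with several signals is never under-classified by this first pass.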
12.2 Incident Communication
| Audience | Communication need |
|---|---|
| Customer | Clear impact, correction path, support channel and remediation where appropriate |
| Frontline / operations | What to stop, what fallback to use, what to tell customers |
| Risk / compliance / legal | Facts, scope, timeline, impacted customers, evidence, decision log |
| Security / privacy | Data exposure, access, containment, forensic evidence |
| Supplier | Incident data, SLA, root cause request, remediation commitment |
| Management | Severity, customer impact, business impact, containment, residual risk |
| Internal audit | Evidence trail, control failure analysis, corrective action |
12.3 Change Control
AI change control must cover more than code deployment:
| Change type | Example | Required review |
|---|---|---|
| Model | LLM version, fraud model retraining, embedding model replacement | regression eval, risk review, monitoring update |
| Data | new feature, new retrieval source, new customer segment | data quality, privacy, impact assessment update |
| Prompt | system prompt, refusal instruction, tool instruction | GenAI eval, prompt injection test, release record |
| Retriever / index | source refresh, chunking, ranking, permission filter | groundedness, source freshness, access control test |
| Threshold / policy | fraud block threshold, escalation rule, confidence cutoff | business impact, segment analysis, risk acceptance |
| Vendor | API change, SLA change, subprocessor change | supplier risk review |
| UX | disclosure language, customer answer format, human escalation path | customer harm and compliance review |
| Operations | queue owner, SLA, QA sample rate | human oversight review |
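The table can be expressed as a lookup that a release pipeline or change ticket workflow consults. The sketch below copies the required reviews from the table; the dictionary keys and the function name `missing_reviews` are assumptions for illustration.

```python
# Required reviews per change type, taken from the change control table.
REQUIRED_REVIEWS = {
    "model": ["regression eval", "risk review", "monitoring update"],
    "data": ["data quality", "privacy", "impact assessment update"],
    "prompt": ["GenAI eval", "prompt injection test", "release record"],
    "retriever": ["groundedness", "source freshness", "access control test"],
    "threshold": ["business impact", "segment analysis", "risk acceptance"],
    "vendor": ["supplier risk review"],
    "ux": ["customer harm and compliance review"],
    "operations": ["human oversight review"],
}


def missing_reviews(change_type, completed):
    """Return required reviews not yet completed for a change.

    An unknown change type raises KeyError, so unclassified changes fail closed.
    """
    required = REQUIRED_REVIEWS[change_type]
    return [review for review in required if review not in completed]
```

A change would only proceed when `missing_reviews` returns an empty list, which makes prompt and retriever changes as visible as code deployments.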
12.4 Emergency Change
Emergency changes are sometimes necessary during fraud attacks, vendor outages or harmful GenAI output events. The operating model should allow rapid containment while preserving evidence:
| Step | Requirement |
|---|---|
| Declare emergency reason | specific incident, attack pattern, outage or customer harm |
| Assign accountable owner | business, risk and technology owners named |
| Apply temporary control | disable intent, raise threshold, force human review, switch fallback |
| Record decision | time, scope, rationale, expected duration |
| Perform after-action review | full validation, risk acceptance update, corrective action |
13. Evidence Binder and Audit Readiness
13.1 Purpose of the Evidence Binder
The purpose of the evidence binder is not to pile up documents but to let an outsider with no knowledge of the project trace:
What was built?
Why was it built?
Who owns it?
What risks were identified?
What controls were required?
How were controls implemented and tested?
Who accepted residual risk?
How is production monitored?
How are incidents, changes and suppliers handled?
How does management know AIMS is improving?
13.2 Binder Structure
| Section | Contents |
|---|---|
| 1. System Record | inventory record, use case description, owner, risk tier, scope |
| 2. Policy Mapping | applicable AI policy, standards, control library IDs |
| 3. Impact Assessment | customer / operational / privacy / security / supplier impact |
| 4. Architecture | architecture diagram, data flow, system boundary, fallback |
| 5. Data Evidence | lineage, quality, permissions, retention, privacy review input |
| 6. Model / GenAI Evidence | eval report, validation, limitations, red team, RAG metrics |
| 7. Human Oversight | SOP, queue design, override rules, QA plan |
| 8. Supplier Evidence | due diligence, contract controls, SLA, model change notification |
| 9. Release Decision | gate checklist, approvals, risk acceptance, launch conditions |
| 10. Monitoring | dashboard, thresholds, review cadence, owner |
| 11. Incidents and Changes | incident log, RCA, corrective action, material change records |
| 12. Management Review | review minutes, KPI / KRI, decisions, improvement actions |
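A binder index is easy to check mechanically for empty sections. The sketch below uses the twelve section names from the table; representing the binder as a dictionary of evidence references and the function name `binder_gaps` are assumptions for illustration.

```python
# Section names follow the binder structure table.
BINDER_SECTIONS = [
    "System Record", "Policy Mapping", "Impact Assessment", "Architecture",
    "Data Evidence", "Model / GenAI Evidence", "Human Oversight", "Supplier Evidence",
    "Release Decision", "Monitoring", "Incidents and Changes", "Management Review",
]


def binder_gaps(binder):
    """Return the sections that have no evidence references.

    binder: dict mapping section name to a list of evidence references.
    """
    return [section for section in BINDER_SECTIONS if not binder.get(section)]
```

A check like this only shows that something is filed under each section; whether the evidence is current and sufficient still needs review.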
13.3 Certification Readiness Without Overclaiming
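Suitable expressions:
| Safe expression | Why it works |
|---|---|
| “AIMS-aligned operating model” | Expresses alignment with management system thinking without claiming certification |
| “audit-ready evidence binder” | Expresses the state of evidence preparation without claiming an external audit conclusion |
| “ISO/IEC 42001 certification readiness support” | Expresses preparation work without claiming conformity or a passed audit |
| “controls mapped to AI policy and risk framework” | Expresses internal control mapping |
| “formal compliance conclusion requires qualified review” | States the boundary explicitly |
Unsuitable expressions:
| Risky expression | Problem |
|---|---|
| “This system is ISO 42001 compliant” | May be read as a compliance conclusion that has not been audited |
| “NIST AI RMF certified” | NIST AI RMF is a risk management framework and should not be used as a certification label |
| “Legally compliant AI” | Legal compliance requires jurisdiction, business facts and professional legal judgment |
| “No AI risk remains” | AIMS manages residual risk; it does not eliminate all risk |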
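A lightweight text check can catch the risky phrasings before a deck or memo is circulated. The sketch below is an illustration only: the pattern list is an assumption derived from the risky expressions listed above, and it will not catch paraphrases.

```python
import re

# Patterns derived from the risky expressions listed above; not exhaustive.
RISKY_PATTERNS = [
    r"\bISO(?:/IEC)?\s*42001\s+compliant\b",
    r"\bNIST AI RMF\s+certified\b",
    r"\blegally compliant\b",
    r"\bno AI risk remains\b",
]


def find_overclaims(text):
    """Return the first match of each risky pattern found in the text."""
    hits = []
    for pattern in RISKY_PATTERNS:
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            hits.append(match.group(0))
    return hits
```

A hit is a prompt for a human to rephrase or to confirm that a qualified review actually supports the claim.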
13.4 Internal Audit Questions
| Question | Evidence that answers it |
|---|---|
| How do you know all production AI systems are in scope? | inventory completeness control, attestations, discovery process |
| Who can approve high-risk AI release? | RACI, release gate policy, committee decision record |
| How is customer-facing GenAI quality measured? | groundedness, citation support, refusal eval, QA sampling |
| How are AI incidents identified and escalated? | monitoring thresholds, incident taxonomy, incident records |
| How are third-party AI changes controlled? | supplier clauses, change notices, regression eval |
| How does management review AIMS effectiveness? | management review inputs, KPI / KRI, decisions and actions |
| How are exceptions handled? | risk acceptance record, scope, expiry, owner, reopen triggers |
14. Templates
14.1 AI Use Case Intake Template
| Field | Writing standard | Strong example |
|---|---|---|
| Use case name | Business-readable, not model-centric | Credit card dispute intake assistant |
| Business objective | Measurable value and user outcome | Reduce repeat calls by answering low-risk dispute status questions and routing complex cases |
| AI capability | Classification, GenAI, RAG, predictive, agentic workflow | RAG answer + intent classifier + case summarization |
| Users | Customer, employee, analyst, developer | Logged-in mobile customers and contact center agents |
| Affected process | Named process and owner | Card servicing dispute intake, owned by Servicing Operations |
| Customer impact | Funds, account, credit, compliance, complaint, service quality | Medium; high when complaint or provisional credit language appears |
| Automation level | Decision, recommendation, draft, triage, search | Draft answer and routing recommendation, no autonomous credit decision |
| Data | Source systems and sensitivity | Authenticated profile, dispute case metadata, policy knowledge base |
| Supplier | Internal or third-party dependency | Hosted LLM API, internal retriever |
| Initial risk tier | Low / medium / high / material with reason | High due to customer-facing regulated servicing paths |
| Required next gate | Specific gate path | G1 impact assessment and G2 architecture review |
14.2 AI Impact Assessment Template
| Section | Required content | Strong example |
|---|---|---|
| Context | Business process, AI role, users, channels | Mobile and contact center dispute servicing assistant |
| Stakeholders | Customers, employees, operators, control functions | Customers, agents, disputes ops, compliance, complaints |
| Potential harms | Concrete adverse outcomes | Wrong fee answer, failure to escalate complaint, unsupported dispute timeline |
| Data and privacy | Data categories, permission, retention, logging | PII and account metadata; prompts logged with redaction and retention limit |
| Automation and reversibility | Whether AI can trigger action and how errors are corrected | AI cannot issue final dispute decision; agent or specialist confirms |
| Fairness / segment | Customer groups or channels requiring monitoring | language, channel, vulnerable customer indicator where policy allows |
| Supplier | Third-party model / platform dependency | vendor LLM with no customer data training permitted |
| Controls | Mapped control IDs | CTRL-GENAI-005, CTRL-HITL-006, CTRL-SUP-007 |
| Residual risk | Remaining risk and acceptance condition | Complex disputes may require specialist routing; accepted only in limited pilot |
14.3 AI Inventory Record Template
| Field | Strong example |
|---|---|
| AI system ID | AI-FRAUD-REALTIME-002 |
| Use case | Real-time card fraud step-up model |
| Risk tier | High because output can block or challenge customer transactions |
| Owner | Fraud Product Owner; Fraud Platform Tech Owner; Operational Risk Owner |
| Model / vendor | Internal gradient boosting model with third-party device intelligence |
| Data | transaction history, device signal, merchant data, customer profile |
| Human oversight | high-value uncertain cases route to fraud analyst; customer appeal path available |
| Key controls | risk tiering, segment performance, threshold governance, monitoring, incident response |
| Release status | production with quarterly review |
| Monitoring | false positive proxy, confirmed fraud, step-up completion, customer complaint, drift |
| Evidence binder | linked to controlled evidence repository record |
14.4 Release Gate Memo Template
| Section | Strong content |
|---|---|
| Decision | Approve limited pilot for logged-in mobile customers in English for low-risk fee and dispute status intents |
| Scope | Excludes adverse action explanations, legal advice, complaint final response and credit decisioning |
| Evidence reviewed | impact assessment, architecture review, GenAI eval, prompt injection test, supplier review, human oversight SOP |
| Conditions | regulated intents escalate, unsupported answers refused, weekly QA, monitoring threshold breach pauses expansion |
| Residual risk | AI may fail to identify edge-case complaint language; mitigated by intent escalation and QA sampling |
| Accepted by | Servicing Product Owner and delegated Operational Risk authority; AIMS Council informed |
| Review cadence | pilot review weekly for first month, then monthly if stable |
14.5 Control Library Entry Template
| Field | Strong example |
|---|---|
| Control ID | CTRL-GENAI-ANSWERABILITY-011 |
| Control objective | Customer-facing GenAI only answers when authoritative sources support the response |
| Control activity | The system checks answerability, citation support and source freshness before final response; failing outputs route to refusal or human escalation |
| Applicability | Customer-facing RAG and employee copilot used for customer communication |
| Owner | AI Platform Owner and Business Product Owner |
| Frequency | Every response at runtime; eval suite before release and after material change |
| Evidence | runtime logs, eval report, source registry, QA sampling |
| Failure trigger | unsupported high-risk answer or citation support threshold breach |
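The control activity in this entry can be sketched as a runtime gate. The field names, the `gate_response` function and the 0.9 support threshold are hypothetical; how citation support is scored is out of scope here, and regulated-intent escalation would be a separate control applied before this gate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DraftAnswer:
    text: str
    citation_support: float  # share of claims backed by a cited source, 0.0 to 1.0
    sources_fresh: bool      # all cited sources are within their freshness window
    high_risk_intent: bool   # intent is classified as high impact for the customer


def gate_response(draft, min_support=0.9):
    """Decide whether a draft is answered, refused or escalated to a human."""
    supported = draft.sources_fresh and draft.citation_support >= min_support
    if supported:
        return "answer"
    # Failing outputs never reach the customer: escalate high-risk intents, refuse the rest.
    if draft.high_risk_intent:
        return "escalate_to_human"
    return "refuse"
```

Logging the inputs and the decision for every response would supply the runtime logs named as evidence in the entry.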
14.6 Management Review Pack Template
| Section | Strong content |
|---|---|
| Portfolio summary | number of AI systems by risk tier, new launches, retired systems |
| Objective performance | inventory completeness, release gate adherence, evidence freshness |
| Risk metrics | incidents, KRI breaches, complaints, supplier issues, overdue actions |
| Control effectiveness | audit findings, release exceptions, monitoring coverage, human oversight results |
| External changes | regulatory expectations, major industry incidents, new technology risks |
| Resource needs | tooling, staffing, training, supplier remediation |
| Decisions requested | approve policy update, risk appetite change, funding, portfolio pause or expansion |
15. 30-Day Training Plan
Goal: within 30 days, turn AIMS from a standards concept into a demonstrable financial retail AI operating model, governance architecture and set of portfolio assets. The plan assumes the reader already has a foundation in CBAP, product, process, architecture and stakeholder management.
| Day | Topic | Output |
|---|---|---|
| 1 | Read the ISO official ISO/IEC 42001 page and summarize how AIMS is positioned as a management system | 1-page AIMS executive briefing |
| 2 | Read the NIST AI RMF overview and summarize Govern / Map / Measure / Manage | AI RMF to AIMS mapping table |
| 3 | Read the NIST AI RMF 1.0 publication and extract the trustworthy AI characteristics | financial retail AI risk taxonomy |
| 4 | Read the NIST Generative AI Profile and summarize the incremental GenAI risks | GenAI control addendum |
| 5 | Design a financial retail AI inventory schema | inventory data dictionary |
| 6 | Complete a use case intake for customer-facing GenAI | intake record |
| 7 | Complete an impact assessment for credit AI | impact assessment memo |
| 8 | Produce risk tiering and a monitoring plan for fraud AI | risk tier and KRI dashboard |
| 9 | Design human oversight for KYC / AML AI | analyst-in-the-loop SOP |
| 10 | Design the data boundary and access model for the internal copilot | role-based access and source control map |
| 11 | Design AIMS governance forums | forum charter pack |
| 12 | Design the RACI | enterprise AI RACI |
| 13 | Design control library categories | control library v1 |
| 14 | Write 10 sample controls | control library entries |
| 15 | Design the AI architecture review checklist | architecture review gate |
| 16 | Design lifecycle gates G0-G7 | lifecycle gate model |
| 17 | Design the release gate memo | release decision template with example |
| 18 | Design the GenAI evidence pack | groundedness and citation support evidence map |
| 19 | Design the supplier AI risk assessment | vendor control questionnaire |
| 20 | Design the supplier change control process | supplier change response matrix |
| 21 | Design the production monitoring stack | KPI / KRI dashboard spec |
| 22 | Design the AI incident taxonomy | incident severity matrix |
| 23 | Design the incident communication protocol | audience-specific communication matrix |
| 24 | Design the management review pack | quarterly AIMS management review deck outline |
| 25 | Design the evidence binder | binder index and record retention map |
| 26 | Run an audit readiness dry run | internal audit Q&A evidence table |
| 27 | Write a certification readiness boundary statement | safe language and overclaiming guardrail |
| 28 | Write an executive memo | AIMS operating model business case |
| 29 | Write an interview story | STAR-T answer set |
| 30 | Assemble the portfolio package | architecture diagram, governance model, sample controls, release gate, evidence binder |
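16. Interview Answers
16.1 What is an AI Management System, and how does it differ from ordinary AI governance?
30-second answer:
An AI Management System is an organization-level management system used to establish, implement, maintain and continually improve the policies, objectives, processes, roles, controls, evidence and management review for AI. Ordinary AI governance often stops at principles and approvals; AIMS has to turn those principles into inventory, risk tiering, lifecycle gates, monitoring, incident management and continual improvement.
2-minute expansion:
In financial retail, AI risk does not sit only with the model team. A customer-facing GenAI assistant, a fraud model or a third-party IDV vendor each involves product, data, architecture, security, privacy, supplier, operations and compliance at the same time. The value of AIMS is bringing these roles into one operating system: first establish AI policy and risk appetite, then use use case intake and inventory to define boundaries, then control the lifecycle with risk assessment, architecture review, eval, release gate and monitoring. Finally, management uses KPI / KRI, incidents, audit findings and corrective actions to see whether the system is effective.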
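16.2 How do ISO/IEC 42001 and NIST AI RMF work together?
30-second answer:
I treat ISO/IEC 42001 as the organizational skeleton of the AI management system and the NIST AI RMF functions Govern / Map / Measure / Manage as the day-to-day operating language. The former helps establish the management system; the latter helps organize AI risk identification, measurement, treatment and monitoring.
2-minute expansion:
ISO/IEC 42001 emphasizes establishing, implementing, maintaining and continually improving an AIMS, which suits defining policies, objectives, processes, roles, performance evaluation and improvement mechanisms. NIST AI RMF is better suited to driving risk work on specific use cases: Govern establishes accountability and policy, Map understands context and impact, Measure measures model and system risk, and Manage handles risk treatment and monitoring. A financial institution can use AIMS as the operating model, NIST AI RMF as the classification of risk activities, and the internal control library and release gates to turn them into executable evidence.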
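16.3 Why is the AI inventory the core of AIMS?
30-second answer:
Without an inventory, the organization does not know where AI is used, who owns it, what the risk tier is, which suppliers it depends on, what its release status is or how it is monitored. Without a list of objects, there can be no risk management, incident response, supplier control or management review.
2-minute expansion:
A mature AI inventory is a list of AI systems, not a list of models. It should include use case, business process, owner, risk tier, customer impact, model or supplier, data sources, human oversight, controls, release status, monitoring owner and evidence binder. For example, a credit card servicing RAG assistant is not just an LLM; it also includes the knowledge base, retriever, permission filter, intent classifier, customer channel, human escalation and monitoring. Only when these are recorded can AIMS support release gates, incident response and audit readiness.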
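16.4 How would you design a financial retail AI release gate?
30-second answer:
I would design gates by risk tier: intake, impact assessment, architecture review, evaluation, release decision, production review, material change review and retirement. High-risk systems must have a complete evidence binder and explicit residual risk acceptance.
2-minute expansion:
A release gate is not a single approval but a set of lifecycle gates. G0 decides whether something counts as AI and whether it enters the portfolio; G1 performs risk and impact assessment; G2 reviews architecture, data flow, suppliers, human oversight and fallback; G3 evaluates model, GenAI, security, privacy and fairness; G4 decides between pilot, limited launch and full launch; after launch, G5 reviews production performance; G6 handles model, prompt, retriever, threshold and vendor changes; G7 covers retirement. Every gate needs evidence and an owner, especially in customer-facing GenAI, credit, fraud and KYC / AML scenarios.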
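16.5 What are the most critical controls for customer-facing GenAI in AIMS?
30-second answer:
The key controls are scope boundary, authoritative source registry, retrieval permission, answerability, groundedness, citation support, regulated-intent escalation, customer UX boundary, QA sampling and incident monitoring.
2-minute expansion:
The risk of customer-facing GenAI is not that it is occasionally wrong, but that it presents uncertain or unsupported content very fluently, as if it were the bank's formal position. I would first define which intents can be answered automatically and which must be escalated to a human, such as complaints, credit, legal interpretation, account restrictions and formal notices. Architecturally, there must be permission filtering, authoritative sources, citation support and a refusal mechanism. On the governance side, there must be GenAI eval, prompt injection testing, QA sampling, complaint monitoring and incident triggers. That is how GenAI moves from a chat feature to a controlled service channel.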
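16.6 How do you handle third-party AI model risk?
30-second answer:
Third-party AI cannot go through ordinary procurement alone. It needs supplier criticality, data-use restrictions, model version change notification, SLA, incident notification, assurance evidence, the institution's right to run its own tests and an exit strategy.
2-minute expansion:
When a financial institution uses a hosted LLM, ID verification, a fraud consortium score or AI SaaS, the risk comes from limited transparency, data leakage, model changes, service outages and contractual restrictions. I would put third-party AI into the AI inventory and require a supplier risk assessment. Contract and operating controls should cover whether data is used for training, log retention, subprocessors, model update notification, audit evidence, SLA and the exit path. Every supplier model upgrade or change of terms must trigger change control and regression eval; supplier changes must not bypass the internal release gate.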
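16.7 What does management review look at in AIMS?
30-second answer:
Management review looks at whether the AIMS is suitable, adequate and effective. It should review the AI portfolio, objectives, KPI / KRI, incidents, audit findings, supplier issues, corrective actions, resources and external changes, and produce concrete improvement decisions.
2-minute expansion:
If management review only reports “how many AI systems went live”, it has no governance value. The senior approach is to show management inventory completeness, high-risk exceptions, release gate adherence, unsupported GenAI answers, customer complaints, human override, supplier incidents, evidence freshness and overdue corrective actions. The outputs are not just meeting minutes but policy updates, resource decisions, risk appetite adjustments, portfolio pauses, control remediation and training requirements. This is the management-level closed loop for AIMS continual improvement.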
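16.8 How do you avoid presenting audit readiness as a compliance claim?
30-second answer:
I would use expressions such as “AIMS-aligned operating model”, “audit-ready evidence binder” and “certification readiness support” rather than claiming that a system is already ISO 42001 compliant or legally compliant. Formal compliance and certification conclusions require an authorized audit and legal / compliance judgment.
2-minute expansion:
PMs and architects can design controls, evidence and operating mechanisms, but they cannot replace legal opinion or a certification audit. In a portfolio or internal proposal you can say: we organized AI inventory, risk assessment, control library, release gate, monitoring, incident, supplier controls and management review into an audit-ready evidence binder. That expresses readiness and operating capability, not an external certification conclusion. This boundary matters, especially in financial retail and regulated AI scenarios.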
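16.9 How does AIMS support fast innovation rather than slow down AI products?
30-second answer:
AIMS accelerates innovation through risk-based governance: low-risk scenarios take a lightweight path and high-risk scenarios go through the full gates. Reusing the inventory, control library, templates, eval platform and supplier controls means each team does not have to design governance from scratch.
2-minute expansion:
An organization without AIMS looks fast, but every team ends up repeating the same arguments: can this data be used, can this supplier be onboarded, how is the model evaluated, who approves, and what happens when something goes wrong. AIMS productizes these questions: use case intake determines the path, the control library tells teams which controls they need, release gates specify the evidence, the AI platform provides eval and monitoring, and supplier standards settle contractual boundaries in advance. Low-risk copilots can then pilot quickly while high-risk credit or customer-facing GenAI gets stronger controls, and overall speed becomes more sustainable.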
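16.10 As an AI PM / Architect, how would you implement AIMS?
30-second answer:
I would first establish the AI inventory and risk tiering, then design governance forums, RACI, the control library and lifecycle gates. Next I would pick 3-4 financial retail use cases as reference implementations and run impact assessment, architecture review, release gate, monitoring and evidence binder end to end, and finally move into management review and continual improvement.
2-minute expansion:
Implementation cannot start with writing one big policy. I would begin with discovery, identify production AI, shadow AI and third-party AI, and build the inventory. Then I would define risk tiers and the control library so teams know what evidence each risk tier requires. Next I would set up forums for intake, architecture review, validation, release gate, supplier review and incident review, with a clear RACI. For reference use cases I would choose customer-facing GenAI, a fraud model, KYC / AML AI and an internal copilot, because together they cover customer impact, model risk, GenAI, suppliers and operational oversight. Finally, KPI / KRI and management review drive continual improvement.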
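17. Portfolio Presentation
If this document is turned into a portfolio piece, it can be packaged as a financial retail AI management system case: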
Case:
Building an AIMS operating model for a financial retail institution
deploying customer-facing GenAI, fraud AI, KYC / AML AI, internal copilots
and third-party model integrations.
Problem:
AI use cases were growing faster than governance. Teams had fragmented
model records, inconsistent release gates, unclear supplier controls,
weak GenAI evidence and limited management visibility.
Design:
- enterprise AI inventory with risk tiering
- AI policy hierarchy and objectives
- governance forums and RACI
- control library mapped to lifecycle gates
- architecture review for data, model, prompt, retriever, tools and fallback
- release gate evidence binder
- supplier AI risk controls
- incident, change and communication process
- management review with KPI / KRI and continual improvement
Financial retail coverage:
- customer-facing GenAI assistant
- credit AI and adverse-action boundary
- fraud detection and transaction step-up
- KYC / AML analyst-in-the-loop
- internal copilot with data boundary
- hosted LLM and third-party model dependency
Evidence:
- inventory schema and sample records
- impact assessment template
- lifecycle gate model
- control library entries
- architecture review checklist
- release gate memo
- supplier risk matrix
- monitoring KPI / KRI dashboard
- management review pack
Outcome:
AI delivery moved from ad hoc approval to risk-based operating model.
Low-risk AI could move faster through lightweight gates, while high-risk
customer, credit, fraud, KYC / AML and supplier-dependent systems produced
audit-ready evidence and management-visible residual risk decisions.
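Senior-level framing for interviews:
I treat AIMS as the operating system for AI products and architecture, not a compliance document library. The real question is not “do we have an AI policy” but whether the organization can continuously know where AI is, whom it affects, who owns it, what controls exist, who accepts residual risk, whether it is still effective in production, and how management continually improves based on evidence.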