AI Regulator Exam Simulation Pack

472AI_REGULATOR_EXAM_SIMULATION_PACK.md

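AI Regulatory Examination / Audit Inquiry Simulation Pack

Positioning: exam-room training material for AI PMs, AI BAs, Solution Architects, and Enterprise Architects. The goal is not to memorize regulations. It is to train you, on retail financial services AI projects, to face probing questions from regulators, internal audit, model risk, legal, compliance, third-party risk, and the audit committee, and to use an evidence chain to show that system boundaries are clear, controls are effective, accountability is assigned, and risk can be managed operationally.

Important note: this document is learning and portfolio training material. It is not legal advice, a compliance conclusion, a regulatory interpretation, or an audit opinion. On a real project, Legal, Compliance, Risk, Model Risk, Internal Audit, Data Protection, Security, and business management must confirm the position for each applicable jurisdiction.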


1. Positioning and Connections

This simulation pack connects three existing learning assets:

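| Upstream document | How this pack builds on it | Training focus |
| --- | --- | --- |
| docs/AI_REGULATORY_RESPONSE_PLAYBOOK.md | Turns regulatory radar, applicability judgment, risk tiering, AI inventory, and evidence pack into exam-room Q&A | Regulatory response is not ad hoc explanation; it is a closed loop from use case intake to evidence archiving |
| docs/AI_BOARD_AUDIT_COMMITTEE_GOVERNANCE_PACK.md | Turns board / audit committee oversight language into senior-level inquiry and control effectiveness narrative | Explain materiality, residual risk, stop rule, and attestation in terms a board can follow |
| docs/AI_GOVERNANCE_EVALOPS_RISK_90_PLAN.md | Turns governance, EvalOps, and RiskOps concepts into submittable materials, eval reports, incident records, and remediation plans | Show that the AI system can be launched, audited, monitored, rolled back, and continuously improved |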

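Roles this pack addresses:

  • AI PM: explains the AI use case's business goals, boundaries, customer impact, launch gates, monitoring metrics, and stop conditions.
  • AI BA: explains business processes, policy mapping, human review points, exception paths, evidence fields, and agreed inquiry responses.
  • Architect: explains data flows, RAG / model / vendor boundaries, logs, permissions, model gateway, security controls, change, and rollback.
  • Risk / Compliance partner: explains applicability judgment, control matrix, regulatory inquiry response, remediation, and management reporting.

The core exam-room skill is not "knowing every answer" but walking each question through this chain: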

Question
-> factual answer
-> control owner
-> evidence artifact
-> test result
-> residual risk
-> remediation if gap exists
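
The chain above can be treated as a record with one field per link. The following is a minimal sketch, assuming a hypothetical `ExamAnswer` structure (the class, field, and function names are illustrative and do not come from the upstream playbooks), that reports which links are still empty before an answer is used in the room:

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ExamAnswer:
    """One exam-room answer with one field per chain link (hypothetical names)."""
    question: str
    factual_answer: Optional[str] = None
    control_owner: Optional[str] = None
    evidence_artifact: Optional[str] = None
    test_result: Optional[str] = None
    residual_risk: Optional[str] = None
    remediation: Optional[str] = None  # only required when a gap exists
    has_gap: bool = False


def missing_links(answer: ExamAnswer) -> List[str]:
    """Return the names of chain links that are still empty."""
    missing = []
    for f in fields(answer):
        if f.name in ("question", "has_gap"):
            continue
        if f.name == "remediation" and not answer.has_gap:
            continue  # remediation is only needed when a gap was disclosed
        if not getattr(answer, f.name):
            missing.append(f.name)
    return missing


draft = ExamAnswer(
    question="Does AML Copilot make final AML decisions?",
    factual_answer="No. AI drafts, analysts approve.",
    control_owner="Head of Financial Crime Operations",
    evidence_artifact="E-006",
)
print(missing_links(draft))  # ['test_result', 'residual_risk']
```

Treating remediation as conditional mirrors the last link of the chain: it only applies when a gap exists.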

2. Exam Room Scenario

2.1 Background

A financial institution is preparing to launch, or has already launched, three AI-assisted systems:

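| Use case | Business scenario | AI role boundary | Default risk assessment |
| --- | --- | --- | --- |
| AML Investigation Copilot | Helps AML analysts read alerts, transaction details, KYC files, and historical case notes, and generates an investigation summary and SAR draft | Draft / summarize / cite evidence; does not auto-close alerts or auto-file SARs | High impact: involves AML compliance obligations, suspicious-activity judgment, and regulatory examination evidence |
| KYC Policy Assistant | Helps KYC / onboarding / periodic review teams look up policy, identify documentation gaps, and explain EDD triggers | Retrieve / explain / checklist; does not independently decide customer risk rating or skip human approval | Medium-high impact: involves identity, customer due diligence, EDD, sanctions, and policy interpretation |
| Credit Policy RAG | Helps credit operations staff retrieve credit policy, generate policy citations, and support adverse action reason review | Retrieve / draft rationale / map reason code; does not directly approve or deny credit | High impact: involves credit, fairness, explanation obligations, and customer rights |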

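Examination types:

  • Regulatory examination: focuses on whether the institution understands AI system risk, has compliance governance, and can show that customer protection and regulatory obligations are not diluted by AI.
  • Internal audit review: focuses on whether governance design matches actual execution, whether controls are testable, whether evidence is complete, and whether management attestation has a basis.
  • Model risk review: focuses on whether model / RAG / prompt / eval / monitoring are controlled, whether outputs can be reproduced, and whether drift, bias, hallucination, and explanation risk are addressed.
  • Third-party risk review: focuses on due diligence, contracts, SLA, subprocessors, and exit plans for the LLM provider, KYC vendor, cloud, embedding model, and case management integration.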

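2.2 Exam Room Roles

| Role | Likely follow-up questions | Answer style to prepare |
| --- | --- | --- |
| Lead examiner / Internal audit lead | Does this system affect customer rights or regulatory obligations? Are controls actually performed? | Start with scope and boundaries, then give the evidence index, and finish with residual risk and remediation |
| Compliance officer | Does it change how AML / KYC / credit policy is executed? Is notification, explanation, or approval required? | Answer with policy clauses, process maps, approval records, and sampling results |
| Model risk reviewer | How is the model selected, validated, monitored, and changed? Does RAG count as a model dependency? | Answer with model / vendor record, eval report, version records, and monitoring metrics |
| Security reviewer | Is there prompt injection, PII leakage, unauthorized access, or log leakage? | Answer with threat model, access control, DLP, red-team, and incident runbook |
| Legal / privacy reviewer | Are data use, retention, cross-border transfer, customer explanation, and contractual liability clear? | Answer with data lineage, privacy assessment, contract terms, and retention policy |
| Audit committee observer | How does management know controls are effective? Which risks exceed appetite? | Answer with dashboard, control effectiveness narrative, stop rule, and remediation plan |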

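2.3 Examination Objectives

Examiners usually do not stop at "what is the AI accuracy?" They look at four evidence chains:

  1. Governance chain: whether AI inventory, risk tier, approval, owner, RACI, and policy mapping are consistent.
  2. Data and model chain: whether data sources, permissions, model / vendor, prompt, RAG index, versions, and logs are traceable.
  3. Control chain: whether human oversight, eval gate, access control, monitoring, incident response, and change control have been tested.
  4. Outcome chain: whether customer impact, error correction, complaints, adverse action, AML / KYC quality, and operational metrics are monitored and feed remediation.

2.4 Basic Answer Discipline in the Exam Room

  • Confirm scope first: "This answer covers AML Copilot limited production; it does not cover the credit RAG pilot, which has not launched."
  • State facts first: "AI drafts, analysts approve. It cannot close alerts or file SARs."
  • Do not overstate controls: do not say "AI will not make mistakes"; say "errors are managed through eval, human review, sampled monitoring, and stop rules."
  • Do not use the vendor as a shield: "Vendor provides model service, but business owner and technology owner retain accountability."
  • Do not substitute verbal promises for evidence: every answer lands on an artifact, owner, date, sample, and test result.
  • Proactively disclose gaps: where there is a gap, state the interim control, risk acceptance, target date, and owner.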

3. Evidence Pack Checklist

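The exam-room evidence pack should let you locate any material within 30 minutes, complete a walkthrough within 2 hours, and deliver follow-up sample details within 5 business days. Maintain a separate evidence folder for each material AI use case and track versions in an evidence index.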

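| Evidence item | Required content | What the examiner will check | Owner |
| --- | --- | --- | --- |
| AI inventory | System ID, business owner, technology owner, AI role, stage, risk tier, model / vendor, data categories, go-live date | Whether all production and pilot AI is registered; whether shadow AI is missed; whether materiality is reasonable | AI governance lead |
| Use case risk tier | Initial risk, inherent risk, residual risk, customer impact, degree of automation, regulatory triggers, approval record | Whether high-risk use cases have stronger controls; whether the risk tier matches actual functionality | Risk owner / PM |
| Data lineage | Data sources, authorization basis, PII tagging, RAG source, freshness SLA, retention, cross-border, data owner | Whether output traces to authorized sources; whether outdated policy can be identified; whether data minimization is enforced | Data owner / Architect |
| Model / vendor record | Model name, version, vendor, deployment mode, subprocessor, SLA, model update notification, exit plan | Whether the institution knows which model it uses; whether a vendor change could bypass the internal change process | Vendor owner / Technology owner |
| Eval report | Golden set, scenario coverage, quality / safety / fairness / grounding / security metrics, threshold, failed samples, release decision | Whether launch has a gate; whether tests cover real business and edge scenarios; whether failures are handled | EvalOps owner |
| Human oversight | Which steps require human review, role qualifications, approval queue, override log, training record, sample review | Whether humans can genuinely intervene; whether AI is de facto replacing the final decision | Business operations owner |
| Incident log | Severity, detection channel, scope of impact, containment, customer impact, root cause, remediation, post-incident review, regulatory notification determination | Whether AI incidents can be detected and escalated; whether near misses feed control improvement | RiskOps owner |
| Change log | Changes to prompt, model, RAG index, policy source, threshold, tool integration, access role, deployment | Whether changes are approved; whether regression tests run after changes; whether vendor updates are recorded | Technology owner |
| Access control | RBAC, least privilege, entitlement review, segregation of duties, break-glass, audit log sample | Who can access customer data and AI output; whether over-privileged access or shared accounts exist | Security owner |
| Third-party due diligence | Vendor risk rating, criticality, SOC / security review, privacy terms, model use terms, subprocessor, BCP, termination | Whether third parties are managed by criticality; whether contracts cover data, audit, incidents, and exit | TPRM / Vendor owner |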

Evidence pack quality standards:

  • Every artifact has version / owner / approval date / evidence location / retention rule.
  • Every control can answer control objective / design / operation / test method / result / exception.
  • Every high-risk output can be reconstructed as input / retrieved source / model version / prompt / output / reviewer action / final decision.
  • Every gap has a risk acceptance or remediation plan; writing only "known issue" is not enough.
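
The first quality standard lends itself to an automated check over the evidence index. This is a minimal sketch, assuming each evidence item is held as a dictionary; the key names are illustrative assumptions rather than a prescribed schema, and the sample values are borrowed from the AML Copilot example used later in this pack, with the listed date treated as the approval date purely for illustration:

```python
from typing import Dict, List

# Metadata every evidence item should carry, per the first quality standard.
# Key names are illustrative assumptions, not a prescribed schema.
REQUIRED_METADATA = (
    "version",
    "owner",
    "approval_date",
    "evidence_location",
    "retention_rule",
)


def metadata_gaps(item: Dict[str, str]) -> List[str]:
    """Return required metadata keys that are absent or blank for one evidence item."""
    return [key for key in REQUIRED_METADATA if not str(item.get(key, "")).strip()]


eval_report = {
    "version": "v2.0",
    "owner": "EvalOps Owner",
    "approval_date": "2026-06-20",
    "evidence_location": "evidence/aml-copilot/05_eval-report.pdf",
    # retention_rule intentionally omitted to show a gap
}
print(metadata_gaps(eval_report))  # ['retention_rule']
```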

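4. 50 Examiner Questions

How to use:

  • For each question, first give the factual answer in 30 seconds.
  • Then submit the corresponding material from the evidence index.
  • If evidence is incomplete, immediately give the interim control and remediation plan.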

4.1 Strategy Questions

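| # | Examiner question | Key answer points | Evidence to submit |
| --- | --- | --- | --- |
| 1 | Why are these AI use cases worth launching rather than continuing with manual processes? | Tie business goals to specific metrics: AML case review time, KYC policy consistency, credit policy citation quality; stress that AI assists and does not replace final accountability; explain that risk and benefit went through governance approval | Business case, risk acceptance memo, baseline metrics, release gate approval |
| 2 | How does management define a material AI system? | Define materiality by customer impact, regulatory obligations, degree of automation, data sensitivity, third-party dependency, reversibility, and financial / compliance impact | AI materiality policy, AI inventory, risk tier worksheet |
| 3 | Which AI scenarios are explicitly prohibited or deferred? | Prohibited: deceptive output, automatic SAR filing, automatic credit denial, bypassing KYC / EDD, use of customer PII without logging; deferred: high-impact advice delivered directly to customers | AI acceptable use policy, prohibited use list, governance committee minutes |
| 4 | How are AI portfolio investment priorities decided? | Start with use cases where the evidence chain is clear, the manual process is mature, data authorization is explicit, and controls are testable; high-value use cases that cannot be audited are not prioritized | Portfolio prioritization matrix, funding memo, stop rule dashboard |
| 5 | If control cost exceeds business benefit, how does management decide? | Explain unit economics, control cost, risk level, and alternatives; high-risk scenarios cannot lower minimum control requirements for the sake of ROI | ROI model, control cost estimate, risk committee decision record |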

4.2 Governance Questions

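| # | Examiner question | Key answer points | Evidence to submit |
| --- | --- | --- | --- |
| 6 | Who holds final accountability for these AI systems? | Business owner owns business outcomes and process risk; technology owner owns system controls; risk / compliance provide challenge; the AI governance committee acts as the release gate | RACI, system owner record, committee charter |
| 7 | What did the AI governance committee actually approve? | It approved use case scope, risk tier, launch stage, required controls, exceptions, monitoring cadence, and stop rule, not a blanket approval to "use AI" | Committee minutes, release gate checklist, approval matrix |
| 8 | How are the first, second, and third lines of defense divided? | First line designs and operates controls; second line sets policy, challenges, and monitors; third line independently tests control design and operation | Three lines of defense map, audit plan, second-line review notes |
| 9 | Is there any shadow AI or unregistered model? | All production AI must go through the model gateway / procurement / inventory; unauthorized use is detected through DLP, SaaS discovery, expense review, and attestation | AI inventory reconciliation, DLP report, business attestation |
| 10 | How are exceptions for high-risk use cases approved and reviewed? | Each exception needs a business justification, compensating controls, expiry date, owner, and review frequency; permanent exceptions must be escalated | Exception register, risk acceptance memo, monthly review evidence |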

4.3 Data Questions

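| # | Examiner question | Key answer points | Evidence to submit |
| --- | --- | --- | --- |
| 11 | What data does AML Copilot use, and does it exceed the analyst's existing access? | Data comes from case management, transaction monitoring, KYC profile, and approved policy docs; AI access inherits analyst RBAC and does not widen data visibility | Data lineage diagram, RBAC mapping, sample access log |
| 12 | How are KYC Policy Assistant's policy sources kept current? | Only the approved policy repository is used; source refresh has an SLA; policy version is stored in RAG metadata; a stale source triggers a block or a warning | Source register, freshness monitoring, change log |
| 13 | How does Credit Policy RAG prevent use of outdated or unapproved policy? | Retrieval sources carry version, effective date, owner, and approval status; unapproved drafts do not enter the production index; output must carry a citation | RAG source control, index build record, citation audit sample |
| 14 | Is customer PII used to train or fine-tune models? | By default customer PII is not used for vendor training; any fine-tuning requires data authorization, de-identification, minimization, retention rules, and approval | Data use assessment, vendor data terms, DPIA / privacy review |
| 15 | How do you reconstruct the data context used for a given AI output? | Logs retain input reference, retrieved document IDs, document version, model version, prompt version, output, and reviewer action | Audit log sample, retrieval trace, retention policy |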
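
Question 15 names the fields a log must retain for an output to be reconstructable. One way to picture that is a record type plus a gap check, sketched below. The `AuditRecord` class, its field names, and the sample case and policy IDs are all hypothetical illustrations, not the schema of any particular case management or logging product:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AuditRecord:
    """Minimal trace for one AI output; field names are illustrative assumptions."""
    input_reference: str
    retrieved_documents: Dict[str, str]  # document ID -> document version
    model_version: str
    prompt_version: str
    output: str
    reviewer_action: str


def reconstruction_gaps(record: AuditRecord) -> List[str]:
    """List what is missing to rebuild the data context behind this output."""
    gaps = [
        name
        for name in ("input_reference", "model_version", "prompt_version",
                     "output", "reviewer_action")
        if not getattr(record, name)
    ]
    if not record.retrieved_documents:
        gaps.append("retrieved_documents")
    # A document ID without a version cannot be tied to the policy text in force.
    gaps.extend(
        f"document_version:{doc_id}"
        for doc_id, version in record.retrieved_documents.items()
        if not version
    )
    return gaps


sample = AuditRecord(
    input_reference="CASE-0001",  # hypothetical case ID
    retrieved_documents={"AML-POL-12": "v3", "AML-POL-19": ""},  # hypothetical IDs
    model_version="model-2026-05",
    prompt_version="prompt-v7",
    output="Draft investigation summary ...",
    reviewer_action="edited",
)
print(reconstruction_gaps(sample))  # ['document_version:AML-POL-19']
```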

4.4 Model Questions

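| # | Examiner question | Key answer points | Evidence to submit |
| --- | --- | --- | --- |
| 16 | Which model do you use, and why did you choose it? | State model name, version, deployment mode, evaluation results, security capabilities, data terms, cost, and alternatives; the selection rationale must match the risk tier | Model selection memo, vendor assessment, benchmark report |
| 17 | In a RAG system, which of model, prompt, retriever, and index count as controlled components? | All of them; any component that affects output goes under version control, change approval, and regression evaluation | Architecture diagram, component registry, change log |
| 18 | Who triggers and approves model updates? | Vendor model update, internal prompt change, index refresh, and tool change each have a change type; high-risk changes require regression and release approval | Change policy, model update notice, approval ticket |
| 19 | How do you handle hallucination or unsupported answers? | Controlled through citation requirement, answer abstention, confidence threshold, human review, quality sampling, and incident trigger | Eval report, unsupported claim metric, sample review |
| 20 | Do you use post-hoc explanation, and how is its accuracy validated? | Credit scenarios cannot rely only on approximate explanations; reason codes must map to factors actually used or to the policy basis, and be reviewed by a human | Explanation validation report, reason code mapping, adverse action review sample |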

4.5 Eval Questions

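| # | Examiner question | Key answer points | Evidence to submit |
| --- | --- | --- | --- |
| 21 | How does the eval dataset represent real business risk? | Covers normal cases, edge cases, historical incidents, policy exceptions, high-risk customers, refusal scenarios, prompt injection, and incomplete documentation | Eval dataset card, sampling rationale, scenario coverage matrix |
| 22 | How are release thresholds determined? | Thresholds derive from business risk, regulatory obligations, historical human baseline, and risk appetite; different use cases use different thresholds | Eval policy, threshold approval, baseline comparison |
| 23 | How is AML Copilot's false negative risk tested? | Use labelled alert / case samples to check red-flag recall, evidence omission, incorrect summaries, and SAR draft quality; final human judgment is retained | AML eval report, case sample review, analyst calibration record |
| 24 | Does Credit Policy RAG test fairness or customer impact? | Tests policy citation accuracy, reason code consistency, protected-class proxy risk, adverse action completeness, and human review outcome | Fair lending review, reason code eval, adverse action sample |
| 25 | How do you detect model performance degradation after launch? | Production monitoring covers unsupported claim, citation completeness, override rate, complaint, incident, drift, latency, and cost | Monitoring dashboard, monthly control test, alert history |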
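
A release gate of the kind questions 22 and 25 describe can be expressed as metrics compared against approved thresholds. The sketch below is one possible shape, not a prescribed method: the threshold values and metric keys are hypothetical, and in practice each use case would carry its own approved thresholds.

```python
from typing import Dict, List, Tuple

# (direction, threshold): "min" means the metric must be at least the threshold,
# "max" means it must not exceed it. All threshold values here are hypothetical.
THRESHOLDS: Dict[str, Tuple[str, float]] = {
    "citation_completeness": ("min", 0.98),
    "unsupported_claim_rate": ("max", 0.02),
    "red_flag_recall": ("min", 0.95),
}


def release_gate(metrics: Dict[str, float]) -> List[str]:
    """Return failed or missing metrics; an empty list means the gate passes."""
    failures = []
    for name, (direction, threshold) in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None:
            # A metric that was never measured is treated as a failure, not a pass.
            failures.append(f"{name}: missing")
        elif direction == "min" and value < threshold:
            failures.append(f"{name}: {value} < {threshold}")
        elif direction == "max" and value > threshold:
            failures.append(f"{name}: {value} > {threshold}")
    return failures


print(release_gate({"citation_completeness": 0.988, "unsupported_claim_rate": 0.014}))
# ['red_flag_recall: missing']
```

Failing on a missing metric is a deliberate choice in this sketch: it keeps an unmeasured risk from passing the gate silently.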

4.6 Security Questions

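| # | Examiner question | Key answer points | Evidence to submit |
| --- | --- | --- | --- |
| 26 | How do you prevent prompt injection from affecting RAG output? | Classify sources by trust boundary; apply security filtering to user input and retrieved content; prohibit tool calls outside policy; red-team tests feed the release gate | Threat model, red-team report, prompt injection test results |
| 27 | Which tools can the AI system call, and which actions can it execute? | AML / KYC / credit scenarios by default do not allow automatic execution of customer-impacting actions; tool calls have allowlist, scope, approval, and audit log | Tool registry, permission matrix, execution log sample |
| 28 | Do logs contain sensitive information, and how is it protected? | Log minimization, encryption, access control, retention period, masked views, and audited access; debug logs cannot bypass data policy | Logging policy, log access review, data masking evidence |
| 29 | How do you prevent AI output from leaking other customers' information? | RBAC, row-level security, context isolation, prompt boundary, retrieval filter, and test cases control cross-customer leakage | Access test report, retrieval filter config, DLP findings |
| 30 | How do you respond to a vendor or model API security incident? | Activate the third-party incident runbook; assess scope of impact; as appropriate suspend calls, switch models, notify stakeholders, and preserve evidence | Incident runbook, vendor notification terms, BCP / exit plan |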

4.7 Human Oversight Questions

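| # | Examiner question | Key answer points | Evidence to submit |
| --- | --- | --- | --- |
| 31 | Is human review a formality, or does it involve substantive judgment? | Reviewers have training, authority, time, evidence in the interface, and override capability; the system records differences between the AI suggestion and the final decision | Training record, override log, workflow screenshots |
| 32 | Can an AML analyst ignore the AI suggestion? | Yes; final case disposition is decided by analyst / supervisor under policy; AI output is only a draft and evidence organizer | SOP, case disposition sample, analyst attestation |
| 33 | What do employees do when KYC Policy Assistant gives a wrong answer? | Employees must check the citation; high-risk or uncertain answers are escalated to the policy team; feedback goes into knowledge base correction and eval samples | Escalation procedure, feedback log, policy correction record |
| 34 | Does Credit Policy RAG affect the final credit decision? | It assists policy retrieval and reason drafting; the final credit decision and adverse action notice are approved by authorized personnel / system rules under policy | Credit workflow map, approval records, adverse action review |
| 35 | How do you prevent automation bias? | Train employees to recognize AI limitations; UI shows confidence / source / warning; sample for over-reliance; track override and agreement rates | Training material, UI control spec, reviewer behavior analytics |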

4.8 Vendor Questions

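| # | Examiner question | Key answer points | Evidence to submit |
| --- | --- | --- | --- |
| 36 | Is the third-party AI provider treated as a critical third party? | Assessed by business criticality, data sensitivity, difficulty of substitution, customer impact, and regulatory obligations; critical providers go into enhanced due diligence | Third-party risk assessment, criticality rating, TPRM approval |
| 37 | Can the vendor use customer data to train its models? | The contract by default prohibits use of the institution's customer data to train the vendor's general models; data retention, subprocessor, region, and deletion are specified | Contract terms, data processing addendum, vendor attestation |
| 38 | Could a vendor model change break your controls? | Require model update notice; material changes trigger regression eval, risk review, and staged rollout; high-risk systems cannot be silently upgraded | Vendor change notification, regression report, deployment approval |
| 39 | If the vendor service is interrupted or restricted, how does the business continue? | BCP in place: fallback manual workflow, alternative model / provider, queue management, priority rules, and customer communication | BCP plan, exit strategy, manual fallback test |
| 40 | How do you verify the vendor's claimed security and model capabilities? | Combine SOC / security review, penetration evidence, contract warranties, internal red-team, benchmark, and production monitoring | Vendor due diligence file, security review, internal validation report |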

4.9 Incident Questions

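| # | Examiner question | Key answer points | Evidence to submit |
| --- | --- | --- | --- |
| 41 | What counts as an AI incident, and what counts as a near miss? | The definition covers customer harm, compliance errors, data leakage, unauthorized output, incorrect policy citation, model drift, unauthorized change, and vendor events | AI incident taxonomy, severity matrix, runbook |
| 42 | What was the most recent AI near miss, and how was it handled? | State facts, impact, containment, root cause, and control improvement; do not hide a near miss as an ordinary bug | Incident log, postmortem, control update record |
| 43 | Might an event require regulatory notification? | Legal / compliance decide based on applicable regulation, contracts, customer impact, and severity; the runbook contains notification decision points and evidence preservation | Notification decision log, legal review note, timeline |
| 44 | How do you show that similar events will not recur? | Convert the root cause into a new eval case, monitoring alert, policy update, training, access control, or vendor requirement | Remediation evidence, updated test set, control retest result |
| 45 | Who has authority to suspend the AI system? | Stop authority is defined in advance: business owner, risk owner, CISO, incident commander, or AI governance lead can trigger a suspension | Stop rule policy, kill switch evidence, incident command RACI |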

4.10 Customer Impact Questions

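| # | Examiner question | Key answer points | Evidence to submit |
| --- | --- | --- | --- |
| 46 | Do customers know AI is involved in the process? | Disclosure is determined by scenario and legal requirements; internal assistive tools must also ensure that customer rights, explanation, and appeal mechanisms are not weakened | Disclosure assessment, customer communication policy, legal review |
| 47 | When a credit customer receives an adverse action, does AI complexity affect the statement of specific reasons? | A black box is not accepted as a reason for being unable to state reasons; reasons must be specific, accurate, and consistent with actual factors or policy basis | Adverse action reason mapping, sample notices, CFPB alignment memo |
| 48 | If an AML Copilot summary error harms a customer, how is it corrected? | AML output does not directly trigger customer actions; if it affects account restrictions or investigation conclusions, initiate review, correction, complaint handling, and impact assessment | Case review record, customer impact assessment, complaint log |
| 49 | Could KYC Assistant cause inconsistent customer treatment? | Reduce inconsistency through approved policy source, citation, employee training, quality sampling, and exception governance | Policy QA results, EDD decision sample, quality review |
| 50 | How do you measure AI's net impact on customers? | Look at efficiency, accuracy, complaints, corrections, appeal overturn, adverse impact, accessibility, and service quality together, not just cost savings | Customer impact dashboard, complaint analysis, fairness / quality metrics |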

5. Red Flag List

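The answers below invite deeper examination. The goal of the training is not to avoid admitting problems, but to avoid exposing a governance failure of "no owner, no evidence, no control, no remediation".

| Red flag answer | Why it is dangerous | Safer way to answer |
| --- | --- | --- |
| "It's the vendor's model, so the vendor is responsible for the risk." | Regulators usually still expect the institution to manage third-party risk and business outcomes | "The vendor is responsible for its service commitments; we retain use case accountability and manage risk through due diligence, contract, monitoring, and exit plan." |
| "AI only makes suggestions, so it needs no governance." | Suggestions influence employee judgment and customer outcomes; automation bias can still cause harm | "The AI is assistive, but because it affects AML / KYC / credit workflow, controls are still set by risk tier." |
| "The model is too complex; we cannot explain it." | In credit and similar scenarios this can directly trigger explanation and customer protection risk | "We restrict the model's use and ensure explainable output through reason code mapping, policy citation, and human review." |
| "We will add evaluation after launch." | Launching a high-risk scenario without a pre-launch gate will be seen as a control failure | "A minimum eval gate was in place before launch; in production we monitor continuously and expand the test set." |
| "We trust employees to check." | Trust is not a control; workflow, training, override, sampling, and logs are needed | "Human review has defined roles, steps, records, sampling, and quality metrics." |
| "There have been no incidents." | Without an incident taxonomy and monitoring, "no incidents" may only mean none were detected | "As of this period there has been no severe incident; near misses and low-severity events are recorded in the incident log and have been retested." |
| "The RAG knowledge base is maintained by the business; technology does not know the sources." | Unclear data lineage and source approval undermine the credibility of evidence | "The business is the source owner, technology maintains index metadata, and every build records source version and approval status." |
| "A prompt change is not a system change." | A prompt can change output significantly and must be controlled in high-risk scenarios | "Prompt, index, model, tool, and threshold are all controlled components." |
| "We do not store inputs and outputs because of privacy risk." | Without logs, decisions cannot be reproduced; privacy risk should be managed through minimization and access control | "We keep the minimum necessary audit log, mask sensitive fields, and restrict access." |
| "Customers never see the AI output, so there is no customer impact." | Decisions employees make using AI output can still affect customers | "Customer impact is judged by the process outcome, not only by whether the output is shown directly to the customer." |
| "This use case is not in model risk because it is not a traditional model." | GenAI / RAG may still need model risk or AI governance coverage | "It is not necessarily treated as a traditional scoring model, but it is in AI inventory, EvalOps, change control, and the evidence pack." |
| "We usually do not know when the vendor updates the model." | Silent changes break validation and reproducibility | "The contract requires change notification; an unnotified change triggers a vendor issue and an internal risk assessment." |
| "Control effectiveness is judged by the project manager." | Control testing needs independent challenge or repeatable tests | "The first line operates controls, the second line challenges by sampling, and internal audit can test independently." |
| "We have no stop rule because we are still in pilot." | A pilot can also affect real data and employee behavior | "The pilot has limits on users, data, use of output, and stop conditions." |
| "The adverse action reason is generated by the model." | Credit explanations must be specific and accurate; final responsibility cannot be handed to the model | "The model only assists with drafting; authorized personnel review and approve against the approved reason mapping." |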

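6. 10-Day Drill Plan

Cadence: each day, prepare one artifact, run one 20-30 minute Q&A simulation, and write down one improvement action. On Day 10, complete a full exam-room dry run.

| Day | Material to prepare | Q&A simulation for the day | Improvement action for the day |
| --- | --- | --- | --- |
| 1 | AI Inventory for AML / KYC / Credit use cases | Take 5 minutes to explain each system's AI role, owner, stage, and risk tier | Fill in missing owner, model version, and data categories in the inventory |
| 2 | Use Case Risk Tier Worksheet | Answer "Why are AML and Credit high impact?" | Write the risk rationale as auditable sentences and add customer / regulatory impact |
| 3 | Data Lineage and RAG Source Register | Walk through how one Credit Policy RAG output traces back to a policy version | Add source owner, effective date, refresh SLA, and retention fields |
| 4 | Model / Vendor Record | Answer "What happens when the vendor updates the model?" | Add model update notification, regression trigger, and exit plan |
| 5 | Eval Report | Take 10 minutes to explain the eval dataset, metrics, threshold, and handling of failed samples | Add historical near misses, edge scenarios, and red-team samples to eval coverage |
| 6 | Human Oversight Evidence | Demonstrate how an analyst / officer reviews, overrides, and approves | Add override reason code, reviewer training evidence, and sampling frequency |
| 7 | Security and Access Control Pack | Answer questions on prompt injection, PII, unauthorized access, and log protection | Add threat model, RBAC sample, log masking, and access review |
| 8 | Incident Log and Runbook | Simulate handling of one "incorrect policy citation" near miss | Complete the severity, containment, root cause, retest, and owner fields |
| 9 | Third-Party Due Diligence Pack | Answer on vendor criticality, data terms, BCP, and subprocessor | Fill in contract controls, SLA, vendor incident notification, and termination support |
| 10 | Full Exam Briefing Pack | 45-minute simulated regulatory / internal audit walkthrough, from strategy to customer impact | Produce a remediation plan with 30-day, 60-day, and 90-day actions marked |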

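Drill scoring rubric:

| Score | Criteria |
| --- | --- |
| 1 | Verbal explanation only; no materials, or materials are scattered |
| 2 | Materials exist, but answer, control, evidence, and owner do not line up |
| 3 | Can answer the main questions, but lacks sampling, test results, or a residual risk statement |
| 4 | Evidence chain is complete and can explain control design and operation |
| 5 | Proactively exposes gaps, explains compensating controls, and provides a remediation plan and retest evidence |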

7. Templates

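The templates below are designed to be used directly in drills. The examples all use AML Investigation Copilot. On a real project, replace the use case, owner, dates, metrics, and evidence paths.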

7.1 Exam Briefing Memo

# Exam Briefing Memo - AML Investigation Copilot

## Purpose
AML Investigation Copilot assists analysts by summarizing alerts, organizing transaction evidence, retrieving approved AML policy references, and drafting investigation narratives. It does not close alerts, approve case disposition, freeze accounts, or file SARs.

## Scope
- Business process: AML alert investigation and case narrative drafting
- Users: trained AML analysts and AML supervisors
- Stage: limited production
- Data: transaction alerts, KYC profile references, historical case notes, approved AML policy documents
- AI components: model gateway, prompt template, RAG index, evidence citation service, case management integration

## Risk Classification
- Inherent risk: High
- Residual risk: Medium with controls
- Rationale: impacts AML regulatory workflow, uses sensitive customer and transaction data, and may influence analyst judgment

## Key Controls
- Human approval required for final case disposition and SAR decision
- Approved source-only RAG with document version and citation logging
- Monthly eval for red-flag recall, unsupported claim, evidence omission, and narrative quality
- RBAC inherited from AML case management entitlement
- Prompt, model, index, and tool changes require change approval and regression test
- Incident runbook defines containment, escalation, customer / regulatory impact assessment, and retest

## Current Metrics
- Citation completeness: 98.8%
- Unsupported claim rate: 1.4%
- Analyst override rate: 17%
- Severe incidents this quarter: 0
- Low / near miss events this quarter: 2, both remediated and retested

## Known Gaps and Remediation
- Gap: reviewer calibration sample size is below target for complex trade finance alerts
- Interim control: supervisor review required for trade finance cases
- Owner: Head of Financial Crime Operations
- Target date: 2026-07-31
- Retest: 200-case calibration sample after new examples are added to eval set

7.2 Evidence Index

| Evidence ID | Artifact | Use case | Owner | Version / date | Control supported | Location pattern |
| --- | --- | --- | --- | --- | --- | --- |
| E-001 | AI Inventory Record | AML Copilot | AI Governance Lead | v1.4 / 2026-06-15 | Governance / accountability | evidence/aml-copilot/01_inventory.pdf |
| E-002 | Risk Tier Worksheet | AML Copilot | Risk Owner | v1.2 / 2026-06-16 | Risk classification | evidence/aml-copilot/02_risk-tier.xlsx |
| E-003 | Data Lineage Diagram | AML Copilot | Data Owner | v1.1 / 2026-06-12 | Data traceability | evidence/aml-copilot/03_data-lineage.png |
| E-004 | Model and Vendor Record | AML Copilot | Technology Owner | v1.3 / 2026-06-18 | Model / vendor governance | evidence/aml-copilot/04_model-vendor.md |
| E-005 | Eval Report | AML Copilot | EvalOps Owner | v2.0 / 2026-06-20 | Release gate / quality | evidence/aml-copilot/05_eval-report.pdf |
| E-006 | Human Oversight Sample | AML Copilot | Ops Owner | 2026-Q2 sample | Human review | evidence/aml-copilot/06_oversight-sample.xlsx |
| E-007 | Incident Log | AML Copilot | RiskOps Owner | 2026-Q2 | Incident management | evidence/aml-copilot/07_incident-log.xlsx |
| E-008 | Change Log | AML Copilot | Technology Owner | 2026-Q2 | Change control | evidence/aml-copilot/08_change-log.csv |
| E-009 | Access Review | AML Copilot | Security Owner | 2026-Q2 | Access control | evidence/aml-copilot/09_access-review.pdf |
| E-010 | Third-Party Due Diligence | AML Copilot | TPRM Owner | 2026 annual review | Vendor risk | evidence/aml-copilot/10_tprm.pdf |

7.3 Control Effectiveness Narrative

# Control Effectiveness Narrative - AML Copilot Human Oversight

## Control Objective
Ensure AI-generated investigation summaries and SAR draft language do not replace required analyst and supervisor judgment.

## Control Design
The AML Copilot interface presents AI output as draft-only. The case disposition workflow requires a trained analyst to review evidence citations, edit or reject the AI draft, select disposition reason codes, and submit the case for supervisor approval when required by AML policy.

## Control Operation
For every assisted case, the system logs the AI draft, retrieved evidence references, analyst edits, final disposition, reviewer ID, timestamp, and override reason where the final decision differs from AI recommendation. Supervisors review a weekly sample of assisted cases and all high-risk typology cases.

## Test Method
Internal control testing sampled 150 assisted cases from May 2026. The tester verified that each case had analyst review, citation check, final disposition, timestamp, and reviewer identity. The tester separately reviewed 30 high-risk typology cases for supervisor approval.

## Test Result
147 of 150 cases passed all required fields. Three cases had incomplete override reason text but still had analyst review and final disposition evidence. All 30 high-risk typology cases had supervisor approval.

## Exceptions and Remediation
The incomplete override reason issue was remediated by making override reason a required structured field. Retest on 2026-06-21 found 50 of 50 cases complete.

## Residual Risk
Residual risk remains medium because automation bias can still influence analyst judgment. The control is supported by monthly reviewer calibration, override trend monitoring, and targeted training for low-override analyst groups.
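
The sampled counts in the narrative above translate directly into pass rates. A small worked example of the arithmetic (the `pass_rate` helper and the sample labels are illustrative):

```python
def pass_rate(passed: int, sampled: int) -> float:
    """Share of sampled cases that passed all required checks."""
    if sampled <= 0:
        raise ValueError("sample size must be positive")
    return passed / sampled


# Counts taken from the narrative above; the labels are illustrative.
samples = {
    "May 2026 assisted cases": (147, 150),
    "High-risk typology cases": (30, 30),
    "Retest on 2026-06-21": (50, 50),
}
for label, (passed, sampled) in samples.items():
    print(f"{label}: {passed}/{sampled} = {pass_rate(passed, sampled):.1%}")
# May 2026 assisted cases: 147/150 = 98.0%
# High-risk typology cases: 30/30 = 100.0%
# Retest on 2026-06-21: 50/50 = 100.0%
```

Stating the numerator and denominator alongside the percentage keeps the result traceable to the sample.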

7.4 Regulator Q&A Sheet

| Question | Approved answer | Evidence IDs | Speaker | Follow-up owner |
| --- | --- | --- | --- | --- |
| Does AML Copilot make final AML decisions? | No. It drafts summaries and organizes evidence. Analysts retain final case disposition authority, and SAR decisions follow the existing AML approval process. | E-001, E-006 | Head of Financial Crime Operations | Ops Owner |
| Which data sources are used? | Approved case management, transaction monitoring, KYC profile references, historical case notes, and approved AML policies. Access is filtered by the analyst's existing entitlement. | E-003, E-009 | Data Owner | Architect |
| What happens if the model hallucinates? | Unsupported claims are controlled through citation requirement, human review, monthly eval, quality sampling, and incident escalation if customer or regulatory impact is possible. | E-005, E-006, E-007 | EvalOps Owner | RiskOps Owner |
| How are vendor model changes handled? | Vendor updates trigger change assessment. Material changes require regression eval and approval before production use. | E-004, E-008, E-010 | Technology Owner | Vendor Owner |
| What are the known gaps? | Reviewer calibration for complex trade finance alerts needs a larger sample. Interim supervisor review is active until the July retest is complete. | E-005, E-006 | Risk Owner | Business Owner |
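
Because every approved answer cites evidence IDs, the Q&A sheet can be cross-checked against the evidence index before a walkthrough. The sketch below is one possible check, assuming both are available as simple Python structures; the short question keys and the `unresolved_references` helper are hypothetical.

```python
from typing import Dict, List, Set

# Evidence IDs E-001 .. E-010, as in the sample evidence index.
INDEX_IDS: Set[str] = {f"E-{n:03d}" for n in range(1, 11)}

# Evidence IDs cited per approved answer (short keys are illustrative).
QA_EVIDENCE: Dict[str, List[str]] = {
    "final AML decisions": ["E-001", "E-006"],
    "data sources": ["E-003", "E-009"],
    "hallucination": ["E-005", "E-006", "E-007"],
    "vendor model changes": ["E-004", "E-008", "E-010"],
    "known gaps": ["E-005", "E-006"],
}


def unresolved_references(
    qa: Dict[str, List[str]], index_ids: Set[str]
) -> Dict[str, List[str]]:
    """Map each question to the cited evidence IDs that are not in the index."""
    result = {}
    for question, cited in qa.items():
        missing = [evidence_id for evidence_id in cited if evidence_id not in index_ids]
        if missing:
            result[question] = missing
    return result


print(unresolved_references(QA_EVIDENCE, INDEX_IDS))  # {}
```

An empty result means every cited ID resolves to an index entry; it says nothing about whether the artifact itself is current or complete.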

7.5 Remediation Plan

| Finding | Risk | Root cause | Interim control | Permanent action | Owner | Target date | Evidence of closure |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Override reason text incomplete in 3 sampled cases | Weak audit trail for human oversight | Free-text field was optional in workflow | Supervisor review of cases with blank override reason | Make structured override reason mandatory and update analyst training | Ops Owner | 2026-06-21 | Retest sample showing 50 of 50 complete |
| Trade finance alert eval sample below target | Eval may under-cover complex AML typology | Historical labelled sample not sufficient | Supervisor review for all trade finance cases | Add 200 labelled cases and recalibrate red-flag recall threshold | EvalOps Owner | 2026-07-31 | Updated eval report and threshold approval |
| Vendor model update notice not mapped to internal ticket type | Model change could bypass regression testing | Contract notice workflow not integrated with change management | Manual monthly vendor update review | Add vendor model update as required change category in ITSM | Technology Owner | 2026-07-15 | ITSM configuration record and first completed ticket |

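8. Interview Talking Points

8.1 30-Second Version

I prepare for a financial AI regulatory examination as an evidence chain, not a verbal explanation. Taking AML Copilot, KYC Policy Assistant, and Credit Policy RAG as examples, I first establish that the AI only drafts / retrieves / recommends and makes no final customer or compliance decision. I then use the AI inventory, risk tier, data lineage, model/vendor record, eval report, human oversight, incident log, change log, access control, and third-party due diligence to show that controls are designed and operating effectively. When questioned, every answer lands on an owner, an artifact, a test result, residual risk, and a remediation plan.

8.2 2-Minute Version

On retail financial services AI projects, regulators and internal audit really care about three things. First, does the institution know which AI systems it has launched, and whether they affect customer rights, compliance obligations, or critical operations? Second, are controls just documents, or are they embedded in processes, systems, logs, evaluation, human review, and change management? Third, when errors occur, can the institution detect them, contain the damage, run a post-incident review, and remediate?

My approach is to break each use case into four evidence chains. The governance chain includes AI inventory, risk tier, RACI, and approval records. The data and model chain includes data lineage, RAG source, model version, vendor record, and logs. The control chain includes eval gate, human oversight, access control, change control, security test, and incident runbook. The outcome chain includes customer impact, complaints, override, adverse action reason, AML / KYC quality, and remediation records.

For Credit Policy RAG, I stress that model complexity is no excuse for failing to provide a specific and accurate adverse action reason. For AML Copilot, I stress that the AI does not auto-close alerts or file SARs, and I use human review and the case evidence trail to show that responsibility has not shifted to the model. For a vendor LLM, I prepare due diligence, contract, SLA, model update notification, and exit plan along the third-party risk lifecycle. Answering this way demonstrates product and architecture capability as well as compliance and audit awareness.

8.3 Chief Risk Officer Deep Dive

A CRO may ask: "How do you prove residual risk is within appetite?"

How to answer:

  • Start with inherent risk: AML / credit are high impact because they affect regulatory obligations, customer rights, or important operations.
  • Then describe the control set: risk tier approval, human oversight, eval thresholds, access control, change management, incident response, vendor due diligence.
  • Then give control test results: sample size, pass rate, exceptions, retests, and trends.
  • Then describe residual risk: which risks remain, why they are acceptable, and what the review date is.
  • Finish with the stop rule: which metrics trigger suspension, degradation, rollback, or escalation to the risk committee.

Strong answer example: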

The residual risk is medium and within the approved appetite for limited production, not for full-scale rollout. The basis is not a subjective assessment. We have release-gate eval results, human review samples, RBAC testing, incident trend, and vendor controls. The main remaining risk is automation bias in complex cases, so we added supervisor review for high-risk typologies and a stop rule if red-flag recall or override anomalies breach threshold for two review cycles.
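
The stop rule in the example answer ("breach threshold for two review cycles") can be made testable. The sketch below reads it as the two most recent consecutive cycles falling below a recall threshold, which is an interpretation; the `should_stop` helper, the 0.95 threshold, and the recall values are hypothetical.

```python
from typing import Sequence


def should_stop(recall_by_cycle: Sequence[float], threshold: float,
                consecutive: int = 2) -> bool:
    """True when the most recent `consecutive` review cycles all fall below the threshold."""
    if len(recall_by_cycle) < consecutive:
        return False  # not enough history to evaluate the rule
    return all(value < threshold for value in recall_by_cycle[-consecutive:])


# Hypothetical red-flag recall per review cycle, oldest first.
print(should_stop([0.96, 0.93, 0.92], threshold=0.95))  # True
print(should_stop([0.96, 0.93, 0.96], threshold=0.95))  # False
```

Requiring consecutive breaches avoids pausing on a single noisy cycle; whether that tolerance is acceptable is a risk appetite decision rather than a coding one.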

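8.4 Audit Committee Deep Dive

An Audit Committee member may ask: "How does management know controls are operating, not just designed?"

How to answer:

  • Every key control has an owner, frequency, evidence, and test method.
  • The management dashboard shows not only AI adoption but also control pass rate, incident, override, complaint, and log completeness.
  • Second-line risk / compliance provide challenge; third-line internal audit can verify by sampling.
  • Do not dodge exceptions; show the finding, root cause, remediation, and retest.

Strong answer example: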

Management receives monthly control evidence, not just project status. For AML Copilot, the evidence includes case-level review logs, citation completeness, unsupported claim sampling, analyst override trends, access review, and change tickets. Internal control testing found three override reason exceptions, the workflow was remediated, and retesting showed 50 of 50 sampled cases complete. That gives audit committee a basis to challenge both design and operating effectiveness.

9. Source Anchors

The official sources below serve as learning anchors and pointers for lines of inquiry. All regulatory applicability, scope of obligations, and regulatory interpretation must be reviewed by legal / compliance on a real project. Access date: 2026-06-29.

| Source | Official link | How this pack uses it |
| --- | --- | --- |
| EU AI Act official page | https://digital-strategy.ec.europa.eu/en/policies/regulatory-framework-ai | Used to train awareness of the risk-based approach, high-risk AI, transparency, human oversight, logging, documentation, robustness, cybersecurity, GPAI, and the implementation timeline |
| NIST AI RMF Generative AI Profile | https://www.nist.gov/publications/artificial-intelligence-risk-management-framework-generative-artificial-intelligence | Used to train the language of GenAI risk identification, trustworthiness, AI lifecycle, evaluation, AI actors, and the governance loop |
| OCC / Fed / FDIC third-party risk guidance via OCC Bulletin 2023-17 | https://www.occ.gov/news-issuances/bulletins/2023/bulletin-2023-17.html | Used to train the third-party risk lifecycle, criticality, due diligence, monitoring, and exit plan for AI vendors, model providers, cloud, and data providers |
| CFPB Circular 2022-03 on adverse action and complex algorithms | https://www.consumerfinance.gov/compliance/circulars/circular-2022-03-adverse-action-notification-requirements-in-connection-with-credit-decisions-based-on-complex-algorithms/ | Used to train adverse action reason and specific and accurate explanation for Credit Policy RAG / AI credit assistant, and the point that a black box is not a reason for not explaining |

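When studying, turn the source anchors into four types of examination question:

  1. Risk classification: does this AI use case fall into a high-impact or regulatory-sensitive scenario?
  2. Control expectation: are human oversight, logging, documentation, robustness, security, eval, and monitoring in place?
  3. Third-party accountability: is the vendor controlled, and does the institution retain accountability?
  4. Customer explanation: when AI affects customer rights such as credit, can specific, accurate, and auditable reasons be provided?

10. Self-Check Checklist

Use this to self-test after completing the pack:

  • Covers the three examination scenarios: AML Investigation Copilot, KYC Policy Assistant, and Credit Policy RAG.
  • Covers AI inventory, risk tier, data lineage, model/vendor record, eval report, human oversight, incident log, change log, access control, and third-party due diligence.
  • Provides 50 examiner questions, grouped into strategy, governance, data, model, eval, security, human oversight, vendor, incident, and customer impact.
  • Every question has key answer points and evidence to submit.
  • Provides a red flag list explaining which answers invite deeper examination.
  • Provides a 10-day drill plan in which each day includes material preparation, Q&A simulation, and an improvement action.
  • Provides templates for Exam Briefing Memo, Evidence Index, Control Effectiveness Narrative, Regulator Q&A Sheet, and Remediation Plan.
  • Provides 30-second, 2-minute, CRO deep dive, and Audit Committee deep dive interview talking points.
  • Includes four official source anchors: EU AI Act, NIST GenAI Profile, OCC third-party risk, and CFPB adverse action.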