
AI Continuous Control Monitoring / Assurance Playbook

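This document is neither a basic BA tutorial nor generic audit material. It is written for advanced learners who already understand AI control libraries, audit evidence binders, model risk, business processes, requirements traceability and financial retail controls. The focus is not "which controls exist" but "whether controls keep operating, whether they remain effective, whether they are weakening, and who is responsible".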


AI Continuous Control Monitoring / Assurance Architecture Playbook

Audience: CBAP+ Business Analyst, AI Product Manager, Product Architect, Enterprise Architect, Control Owner, EvalOps Lead, Model Risk, Operational Risk, Compliance, Internal Audit, and financial retail business owners.

Core question: after a financial retail AI system goes live, how do we turn controls, evals, telemetry, incidents, exceptions, audit evidence and control owners into a continuously running assurance system that proves the controls were not effective only on launch day?

Goal: build a practical AI continuous control monitoring architecture covering the capability model, operating model, control test taxonomy, event schema, sampling, KRI dashboard, RACI, lifecycle gates, financial retail examples, templates, 30-day lab and interview answers.

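Important note: this document is learning, architecture design and portfolio material. It does not constitute legal, regulatory, audit, model validation or compliance advice. Real financial institution projects must be confirmed by business owner, technology, security, privacy, legal, compliance, model risk, operational risk, third-party risk and internal audit in line with the institution's policies.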

1. Executive Framing

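AI governance typically shows three maturity levels:

Maturity	Characteristics	Risk
Static control list	Policies, a control library and a go-live checklist exist	No visibility into whether controls operate
Evidence binder	Go-live evidence, approval records and test reports exist	Evidence is mostly static; production degradation is invisible
Continuous assurance	Controls are expressed as events, tests, metrics, ownership and reviews	Requires architecture, process and organization to run together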

AI continuous control monitoring (CCM for short):

AI CCM is the continuous process of testing whether AI controls operate as designed, detecting exceptions and KRIs, assigning owners, driving management action, and preserving evidence of control effectiveness over time.

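It answers 10 management questions:

Which AI controls are treated as material controls?
Who owns each control?
How is each control tested: automatically, by sampling or by manual review?
How is severity defined for a control failure?
Which exceptions are open, overdue or recurring?
Which KRIs indicate that customer harm, compliance gaps or operational risk are rising?
Which incidents indicate that control design is inadequate?
Which management actions have been completed and verified as effective?
Which evidence can demonstrate that controls operate effectively?
Which GenAI / agentic AI risks cannot be adequately covered by legacy model risk templates?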

Boundaries with adjacent assets:

Asset	Primary artifacts	Focus of this playbook
AI Control Library	control objective, activity, risk-control mapping	recurring tests and operating effectiveness
AI Audit Evidence Binder	evidence index, traceability	evidence freshness and exception closure
AI Release Governance	gate, canary, rollback	post-release control monitoring
AI Observability	traces, metrics, logs	telemetry mapped to controls and KRIs
Model Risk Management	inventory, validation, monitoring	expanded prompt/RAG/tool/agent/workflow assurance

Mature phrasing:

We monitor control effectiveness by linking AI runtime events to control objectives,
control owners, test methods, KRIs, exceptions, incidents, action plans and evidence.

2. Source Anchors

Anchor	Official link	How this playbook uses it
NIST AI RMF	https://www.nist.gov/itl/ai-risk-management-framework	Uses the structured language of AI risk management to organize govern, map, measure and manage, with emphasis on continuous governance
NIST AIRC AI RMF Functions	https://airc.nist.gov/airmf-resources/airmf/	Uses Govern / Map / Measure / Manage to map owner, context, metric/test and treatment
ISO/IEC 42001	https://www.iso.org/standard/42001	Places CCM inside an AI management system: operational control, performance evaluation, management review, improvement
Federal Reserve SR 26-2	https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm	US model risk governance anchor as of 2026-06-30: SR 26-2 superseded SR 11-7 and SR 21-8 on 2026-04-17

SR 26-2 nuance:

SR 26-2 superseded SR 11-7 and SR 21-8 on 2026-04-17.
The risk-based approach, inventory, validation, ongoing monitoring, effective challenge and control ownership from traditional model risk management remain valuable.
GenAI, RAG, agentic AI, tool-using AI and multi-agent workflows should use a broader AI governance and control assurance framework.
NIST AI RMF, ISO/IEC 42001, operational risk, privacy, security, third-party risk, customer harm, incident management and business controls belong in the same assurance architecture.
In interviews and portfolio work, do not present SR 11-7 as the current primary reference; it can be cited as a historical concept, but the current anchor should be SR 26-2.

NIST AI RMF function mapping:

Function	CCM interpretation	Example
Govern	control owners, risk appetite, RACI, management review	AI control registry
Map	use case, business process, customer outcome, data flow	lending explanation mapped to adverse action
Measure	control tests, evals, sampling, KRIs, dashboards	citation support test
Manage	exceptions, incidents, actions, rollback, risk acceptance	disable write tool after control failure

3. Capability Model

AI CCM is not a single tool but a set of capabilities:

Policy and Control Layer
  -> Control Registry
  -> Control Test Catalog
  -> Telemetry and Evidence Contracts
  -> Automated Control Checks
  -> Sampling and Independent Review
  -> Exception and Action Management
  -> KRI Dashboards
  -> Management Review
  -> Assurance Evidence Archive

Capability	Purpose	Minimum viable implementation	Mature implementation
Control registry	Records controls, owners, cadence and risk mapping	GRC table	Versioned registry linked to risks, requirements and telemetry
Control test catalog	Turns each control objective into a test method	Manual checklist	Automated and sampled tests with pass/fail logic
Evidence contract	Defines the fields the runtime must capture	Trace ID and model version	Control ID, release ID, policy decision, tool version, human action
Event collection	Collects AI runtime and workflow events	Logs	Event stream with schema validation and lineage
Automated checks	Detects control failures automatically	Scheduled SQL checks	Policy-as-code, eval pipelines, drift checks and alerting
Sampling	Spot-checks high-risk outputs	Weekly manual review	Risk-stratified independent sampling
Exception workflow	Records control failures and their handling	Ticket queue	Severity, owner, due date, root cause, retest, residual risk
KRI dashboard	Gives management a view of control health	Basic dashboard	Appetite thresholds, trend, segment, aging and action status
Management review	Evaluates control effectiveness on a regular cadence	Monthly meeting	Evidence-based review with decisions and escalations
Assurance archive	Retains control test evidence	Shared folder	Access-controlled evidence store with retention and retrieval

Architecture:

AI Runtime
  -> event pipeline with schema validation and PII redaction
  -> control test engine with rules, evals, drift checks and evidence checks
  -> assurance operations with exceptions, incidents and actions
  -> dashboards, monthly memos, audit packs and regulator response

Design principles: treat controls as operational products; test operating effectiveness, not only model performance; link every KRI to risk appetite; create exceptions automatically where possible; use risk-based sampling; preserve traceability from customer outcome to AI artifact versions; close incidents only after control improvements are retested; review GenAI and agentic AI under broader AI assurance, not old MRM alone.

4. Operating Model and RACI

Role	Responsibility
Business Control Owner	Owns business risk, approves objective, accepts residual risk
AI Product Manager	Owns product behavior, customer impact and risk-value tradeoffs
CBAP+ BA	Maps process, requirements, decisions, exceptions, evidence and owners
Product Architect	Designs control plane, telemetry, policy gateway, rollback and evidence architecture
EvalOps Lead	Designs evals, regression tests, sampling and performance monitoring
Model Risk / AI Risk	Provides independent challenge and monitoring expectations
Compliance / Legal	Confirms regulatory interpretation, disclosures and complaint concerns
Security / Privacy	Reviews data handling, access, redaction, retention and third-party exposure
Operations Lead	Runs queues, human review, exception triage, remediation and training
Internal Audit	Reviews control design, operating effectiveness and evidence quality

Forum	Frequency	Inputs	Decisions
Daily control triage	Daily for Tier 2/3	critical exceptions, KRI breaches, incidents	pause, escalate, assign action
Weekly assurance review	Weekly	test results, sampling failures, exception aging	remediate, retest, adjust threshold
Monthly effectiveness review	Monthly	trend dashboards, repeat failures, action closure	control rating, management actions
Quarterly AI management review	Quarterly	portfolio KRIs, audit findings, risk appetite	investment, policy update, risk acceptance
Incident-triggered review	Event-driven	incident report, affected traces, root cause	rollback, customer remediation, redesign

RACI:

Activity	Business	AI PM	BA	Architect	EvalOps	Risk/Compliance	Security/Privacy	Operations	Audit
Define control objective	A	R	R	C	C	C	C	C	C
Map process and decision impact	C	R	A/R	C	C	C	C	R	C
Define evidence contract	C	C	R	A/R	R	C	A/R	C	C
Build automated test	C	C	C	R	A/R	C	C	C	C
Define sampling plan	C	C	R	C	A/R	A/R	C	R	C
Review failed test	A	R	R	C	R	C	C	A/R	C
Approve residual risk	A	C	C	C	C	A/R	C	C	C
Close management action	A	R	C	C	R	C	C	R	C
Assess effectiveness	C	C	C	C	C	A/R	C	C	A/R

Legend: A = accountable, R = responsible, C = consulted.

Control states: Designed -> Implemented -> Testing -> Effective -> Monitoring -> Exception -> Remediation -> Retest -> Effective. Exceptions can also move to Accepted Risk or Escalated; effective controls can be retired when the risk, process or system no longer applies.
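
The control states above can be enforced as a small transition table so that a registry rejects undocumented moves. The sketch below is illustrative only: it encodes just the transitions named in the text, and the state name `Retired` and the function names are assumptions, not part of any specific GRC product.

```python
# Transition table encoding only the moves named in the control-state text.
# "Retired" as a state name is an assumption; the text only says controls can be retired.
ALLOWED_TRANSITIONS = {
    "Designed": {"Implemented"},
    "Implemented": {"Testing"},
    "Testing": {"Effective"},
    "Effective": {"Monitoring", "Retired"},
    "Monitoring": {"Exception"},
    "Exception": {"Remediation", "Accepted Risk", "Escalated"},
    "Remediation": {"Retest"},
    "Retest": {"Effective"},
}


def can_transition(current: str, target: str) -> bool:
    """Return True when the move is one of the documented transitions."""
    return target in ALLOWED_TRANSITIONS.get(current, set())


def advance(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not documented."""
    if not can_transition(current, target):
        raise ValueError(f"Illegal control state transition: {current} -> {target}")
    return target


# Walk an exception through remediation back to an effective control.
state = "Exception"
for step in ("Remediation", "Retest", "Effective"):
    state = advance(state, step)
print(state)  # Effective
```

A real registry would likely add transitions the text leaves open, such as a failed retest returning to remediation.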

5. Control Test Taxonomy

Test family	Purpose	AI example
Completeness	Confirm every required control event exists	Every Tier 3 tool write has approval token
Accuracy	Confirm control decision is correct	Policy classifier blocks prohibited advice
Authorization	Confirm actor/system has authority	Only approved reviewer releases adverse action explanation
Timeliness	Confirm control runs within required time	AML escalation review completed within SLA
Threshold	Confirm metric stays within appetite	Unsupported answer rate below approved limit
Drift	Detect distribution or behavior shift	Retrieval source mix changed after corpus update
Reconciliation	Compare two systems or records	AI action log reconciles to CRM task creation
Sampling	Human or independent review of selected population	Weekly review of high-risk transcripts
Reperformance	Independent retest of control	Second-line reruns eval on locked dataset
Evidence quality	Confirm evidence is complete and usable	Trace has release_id, control_id and versions
Exception aging	Confirm open issues are managed	High severity exceptions not open beyond 14 days
Action effectiveness	Confirm remediation solved issue	Post-fix retest below threshold
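
For the Sampling test family, one possible shape of risk-stratified selection is sketched below. The sampling rates, the `risk_tier` stratum key and the "at least one record per sampled stratum" rule are assumptions for illustration; a real plan would set them with the second line.

```python
import random
from collections import defaultdict


def stratified_sample(records, rates, seed=0, stratum_key="risk_tier"):
    """Sample each stratum at its own rate.

    A stratum with a positive rate always contributes at least one record,
    so small high-risk groups are never skipped. Strata without a rate are ignored.
    """
    rng = random.Random(seed)  # fixed seed so a reviewer can reproduce the draw
    by_stratum = defaultdict(list)
    for record in records:
        by_stratum[record[stratum_key]].append(record)

    sample = []
    for stratum, items in by_stratum.items():
        rate = rates.get(stratum, 0.0)
        if rate <= 0:
            continue
        size = min(len(items), max(1, round(len(items) * rate)))
        sample.extend(rng.sample(items, size))
    return sample


# Hypothetical population: 100 Tier3 and 100 Tier1 transcripts.
population = [{"trace_id": f"t3-{i}", "risk_tier": "Tier3"} for i in range(100)]
population += [{"trace_id": f"t1-{i}", "risk_tier": "Tier1"} for i in range(100)]
weekly_sample = stratified_sample(population, {"Tier3": 0.20, "Tier1": 0.05})
```

Recording the seed and the rates alongside the sample makes the selection itself reperformable evidence.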

By control object:

Control object	Example test	Cadence
Prompt	Prompt version belongs to approved registry	Every deploy and daily sample
Model route	Route uses approved model and region	Real time
RAG corpus	Citation source appears in approved manifest	Hourly and release-triggered
Retrieval	Critical policy document recall remains above threshold	Daily
Reranker	Compliance documents not demoted below rank 3	Daily
Tool call	Write action has policy token and human approval	Real time
Policy gateway	Prohibited action is blocked	Daily
Human review	Required reviewer completed review before output	Real time
Telemetry	Required evidence fields present and redacted	Hourly
Incident	Incident created for critical KRI breach	Real time
Vendor	Vendor notice reviewed before material route change	Event-driven
Training	Staff completed procedure update	Monthly or release-triggered

6. Event and Schema Examples

Control test event:

{
  "event_type": "ai.control_test_result",
  "event_time": "2026-06-30T14:15:03Z",
  "control_id": "AI-CS-CRM-003",
  "control_name": "Human confirmation required before CRM write",
  "use_case_id": "customer_service_agent_assist",
  "risk_tier": "Tier3",
  "test_id": "TST-CS-CRM-003-REALTIME",
  "test_method": "automated_completeness_check",
  "population_count": 1,
  "sample_count": 1,
  "pass_count": 1,
  "fail_count": 0,
  "result": "pass",
  "release_id": "rel-2026-06-30-004",
  "trace_id": "trc-9d71f4",
  "evidence_uri": "evidence://ai-control-tests/2026/06/30/trc-9d71f4",
  "owner_team": "Customer Service Operations"
}

AI runtime event enriched for control testing:

{
  "event_type": "ai.tool_call",
  "event_time": "2026-06-30T14:14:59Z",
  "trace_id": "trc-9d71f4",
  "customer_segment": "retail_checking",
  "channel": "contact_center",
  "model_version": "approved-route-cs-2026-06",
  "prompt_version": "cs_agent_prompt_v18",
  "rag_index_version": "retail_policy_index_2026_06_28",
  "tool_name": "crm.createFollowUpTask",
  "tool_contract_version": "crm-followup-v3",
  "tool_action_type": "write",
  "policy_decision": "allow_with_human_confirmation",
  "approval_token_id": "appr-6b38",
  "approver_role": "licensed_contact_center_rep",
  "pii_redaction_status": "redacted",
  "control_ids": ["AI-CS-CRM-003", "AI-PRIV-LOG-002", "AI-OPS-HITL-005"]
}
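
To connect the two events above, a real-time check can read an `ai.tool_call` event and emit an `ai.control_test_result`. This is a minimal sketch under stated assumptions: the permitted approver roles are illustrative, `failure_reasons` is a hypothetical extra field, and only a subset of the result fields is populated.

```python
# Illustrative set of roles allowed to approve a write; an assumption for this sketch.
APPROVED_ROLES = {"licensed_contact_center_rep", "supervisor"}


def check_write_approval(event: dict) -> dict:
    """Test one tool-call event against the human-confirmation control."""
    failures = []
    if event.get("tool_action_type") == "write":
        if not event.get("approval_token_id"):
            failures.append("missing_approval_token")
        if event.get("approver_role") not in APPROVED_ROLES:
            failures.append("invalid_approver_role")

    passed = not failures
    return {
        "event_type": "ai.control_test_result",
        "control_id": "AI-CS-CRM-003",
        "test_method": "automated_completeness_check",
        "trace_id": event.get("trace_id"),
        "population_count": 1,
        "sample_count": 1,
        "pass_count": 1 if passed else 0,
        "fail_count": 0 if passed else 1,
        "result": "pass" if passed else "fail",
        "failure_reasons": failures,  # hypothetical field, not in the sample schema
    }


tool_call = {
    "event_type": "ai.tool_call",
    "trace_id": "trc-9d71f4",
    "tool_action_type": "write",
    "approval_token_id": "appr-6b38",
    "approver_role": "licensed_contact_center_rep",
}
result = check_write_approval(tool_call)
print(result["result"])  # pass
```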

Exception event:

{
  "event_type": "ai.control_exception_opened",
  "event_time": "2026-06-30T15:00:00Z",
  "exception_id": "EX-AI-2026-0630-017",
  "control_id": "AI-RAG-CITE-004",
  "severity": "high",
  "reason": "Unsupported citation rate breached approved threshold for complaint policy questions",
  "detected_by": "daily_retrieval_support_test",
  "affected_use_case": "customer_service_agent_assist",
  "affected_release_id": "rel-2026-06-30-004",
  "owner_team": "Knowledge Operations",
  "due_date": "2026-07-05",
  "required_action": "Roll back complaint SOP index snapshot and retest critical complaint sample"
}

KRI specification:

kri_id: KRI-AI-CS-007
name: Unsupported citation rate for customer-visible answers
risk_theme: Customer harm and misleading information
control_objective: Customer-visible answers must be grounded in approved knowledge sources
population: all customer_service_agent_assist responses with citations
metric_formula: unsupported_citation_count / cited_response_count
green_threshold: <= 0.25%
amber_threshold: > 0.25% and <= 0.75%
red_threshold: > 0.75%
segmentation: [channel, product, customer_segment, policy_domain]
owner: Knowledge Operations
review_forum: Weekly AI Assurance Review
management_action_red: pause index ramp, open high severity exception, sample affected answers, retest corpus manifest
evidence_source: ai_response_events and citation_validation_results
retention: 7 years for regulated customer-impacting cases
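
The thresholds in this KRI specification translate directly into a status function. The sketch below assumes rates are expressed as fractions (0.25% is 0.0025); the function names and the `no_data` status for an empty population are illustrative choices.

```python
def kri_rate(unsupported_citation_count: int, cited_response_count: int):
    """Apply the metric formula; return None when there is no population."""
    if cited_response_count == 0:
        return None
    return unsupported_citation_count / cited_response_count


def classify_kri(rate, green_max=0.0025, amber_max=0.0075) -> str:
    """Map a rate to green / amber / red using the thresholds in the specification."""
    if rate is None:
        return "no_data"  # assumption: an empty population is reported, not scored
    if rate <= green_max:
        return "green"
    if rate <= amber_max:
        return "amber"
    return "red"


# 31 unsupported citations in 10,000 cited responses is 0.31%.
status = classify_kri(kri_rate(31, 10_000))
print(status)  # amber
```

Treating an empty population as `no_data` rather than green avoids reporting a healthy KRI when telemetry has simply stopped flowing.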

SQL-like automated test:

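-- Daily completeness and authorization check for Tier 3 write tool calls.
-- Only days with at least one failure are returned, so any row is a control exception.
select
  date_trunc('day', event_time) as test_day,
  count(*) as write_events,
  -- Completeness: a write without an approval token
  sum(case when approval_token_id is null then 1 else 0 end) as missing_approval,
  -- Authorization: an approver outside the permitted roles (a null role also counts)
  sum(case when approver_role is null
            or approver_role not in ('licensed_contact_center_rep', 'supervisor')
      then 1 else 0 end) as invalid_approver
from ai_tool_call_events
where tool_action_type = 'write'
  and risk_tier = 'Tier3'
group by 1
-- Aggregates are repeated because some engines (PostgreSQL among them) reject select aliases in HAVING
having sum(case when approval_token_id is null then 1 else 0 end) > 0
    or sum(case when approver_role is null
                 or approver_role not in ('licensed_contact_center_rep', 'supervisor')
           then 1 else 0 end) > 0;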

7. Dashboards and KRIs

Executive dashboard:

Widget	Question answered	Red signal
Control health by use case	Which AI systems have failing material controls?	Any Tier 3 critical control failed
Exception aging	Which exceptions are overdue?	High severity open over 14 days
Repeat failures	Which controls fail repeatedly?	Same control fails 3 times in 60 days
Customer harm signals	Are complaints, reversals or adverse outcomes rising?	Above baseline plus appetite
Evidence completeness	Can we prove control operation?	Required evidence coverage below 99% for Tier 3
Action closure	Are actions completed and retested?	Closed without retest evidence
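
Two of the red signals in this table, exception aging and repeat failures, reduce to simple date arithmetic. The sketch below is illustrative: the 14-day and 3-in-60-day values come from the table, while applying the aging rule to critical as well as high severity is an assumption.

```python
from datetime import date, timedelta


def is_overdue(severity: str, opened_on: date, today: date, max_age_days: int = 14) -> bool:
    """Red when a high (or, by assumption, critical) exception is open beyond the limit."""
    return severity in {"high", "critical"} and (today - opened_on).days > max_age_days


def has_repeat_failure(failure_dates, today: date, window_days: int = 60, min_failures: int = 3) -> bool:
    """Red when the same control failed at least min_failures times inside the window."""
    cutoff = today - timedelta(days=window_days)
    recent = [d for d in failure_dates if cutoff <= d <= today]
    return len(recent) >= min_failures


today = date(2026, 6, 30)
print(is_overdue("high", date(2026, 6, 10), today))  # True (open 20 days)
print(has_repeat_failure([date(2026, 5, 5), date(2026, 6, 1), date(2026, 6, 25)], today))  # True
```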

Product and operations KRIs:

KRI	Why it matters	Segments
Human override rate	High override means AI may be low quality or mistrusted	product, queue, agent role
Manual review backlog	HITL control can fail when queue is overloaded	queue, risk tier, region
Tool write reversal rate	Indicates harmful system-of-record actions	tool, field, customer segment
Complaint escalation miss rate	Direct customer harm and regulatory risk	complaint type, product, channel
Fallback route rate	Fallback may bypass controls or degrade quality	model route, vendor, channel
Unsupported answer rate	Measures grounding control effectiveness	policy domain, language, channel

Risk and compliance KRIs:

KRI	Interpretation	Action when red
Policy violation count	Output or action breached guardrail	Freeze affected capability
Adverse action reason mismatch	Lending explanation not aligned to decision reason	Stop customer-visible path
AML typology miss rate	Copilot misses known scenario	Escalate to AML governance
PII telemetry failure	Sensitive data logged outside approved fields	Trigger privacy incident assessment
Vendor route drift	Requests routed to unapproved model, region or endpoint	Disable route and review vendor controls
Evidence missing rate	Audit trail incomplete for material control	Hold release or issue exception

Metric design rules: use both lagging indicators and leading KRIs; display trend and baseline; segment by use case, channel, product, customer segment, geography and risk tier; separate control failure from business performance decline; tie each red threshold to a named management action.

8. Lifecycle Gates

Gate	Required decisions	Exit criteria
Use case intake	Is AI use case in CCM scope? Which controls are material? Who owns them?	Control registry entries, owners, evidence contract, KRI appetite
Pilot	Are automated tests running? Does traffic carry metadata? Are exceptions created?	No unresolved critical design gap, rollback tested, evidence retrievable
Release	Did evals and control tests pass? Are red KRIs absent or accepted?	Release memo includes test results, dashboard live, evidence archived
Scale	Did effectiveness remain stable during ramp? Are segments behaving differently?	14/30-day review complete, open exceptions have action plans
Periodic review	Should controls be enhanced, retired or reclassified? Are KRIs still meaningful?	Monthly memo archived, registry updated, repeat failures analyzed
Incident review	Did incident expose control design gap? Are customers affected?	containment, root cause, control redesign, retest and customer remediation

9. Financial Retail Examples

9.1 AML investigation copilot

Use case:

AI helps analysts summarize alerts, retrieve typology guidance, draft suspicious activity narratives and suggest next investigation steps. Human analysts remain accountable for final decision.

Control	Continuous test	KRI
Approved typology grounding	Draft citations reference approved AML corpus	unsupported typology citation rate
No autonomous SAR filing	AI cannot submit SAR or close case	autonomous restricted action count
Analyst review completeness	Every AI narrative has analyst review event	missing review rate
False negative challenge sample	Closed no-SAR cases sampled against typology rubric	challenge failure rate
Alert prioritization drift	Monitor distribution by typology and customer risk	typology drift index
Evidence retention	Trace links alert, retrieval, prompt, output, analyst edit	evidence completeness rate

Actions: pause narrative auto-draft for high-failure typology, add missed scenario to eval and training, require supervisor review for affected segment until retest passes.

9.2 Lending decision explanation

Use case:

AI generates applicant-facing or banker-facing explanation text based on approved decision reason codes and policy language.

Control	Continuous test	KRI
Reason-code consistency	Output reason matches system decision reason	mismatch rate
Prohibited wording block	Output cannot imply guaranteed approval after reapplication	prohibited phrase count
Fair lending segment review	Sample explanations by segment and product	segment disparity signal
Source policy freshness	Explanation uses current policy version	stale policy citation rate
Human escalation	Ambiguous or complaint-sensitive cases route to reviewer	missed escalation rate
Adverse action evidence	Trace links decision, reason, prompt, policy and output	audit trace completeness

Actions: disable customer-visible explanation prompt after mismatch breach, require legal/compliance review, increase sample size for product and segment with complaint spike.

9.3 Payments fraud and dispute support

Use case:

AI supports fraud operations by ranking suspicious payment events, drafting dispute summaries and recommending next-best investigation action.

Control	Continuous test	KRI
No unapproved payment block	AI recommendation cannot directly block payment	AI-only block count
High-value transaction review	Transactions above threshold require human review	missing high-value review rate
Tool permission boundary	AI can create case note but cannot reverse payment	restricted tool call attempts
Latency fallback control	Fallback route preserves fraud policy gateway	fallback without policy decision count
Customer notification accuracy	Generated dispute message matches case status	message-status mismatch rate
Reversal monitoring	Track reversed or corrected AI-assisted actions	reversal rate

Actions: turn off recommendation display for payment type with high reversal rate, update policy gateway, reconcile AI action log to case management system daily.
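
The daily reconciliation named in these actions can be sketched as a set comparison between the AI action log and the case management system. The identifiers are hypothetical, and the sketch assumes the case-system extract is already filtered to AI-originated records.

```python
def reconcile(ai_action_ids, case_system_ids) -> dict:
    """Compare AI action log entries with AI-originated case records."""
    ai_log = set(ai_action_ids)
    case_system = set(case_system_ids)
    return {
        # Logged by the AI runtime but never landed in the system of record
        "missing_in_case_system": sorted(ai_log - case_system),
        # Present in the system of record with no matching AI log entry
        "unexplained_in_case_system": sorted(case_system - ai_log),
        "matched_count": len(ai_log & case_system),
    }


# Hypothetical identifiers for one day's activity.
report = reconcile(
    ai_action_ids=["act-001", "act-002", "act-003"],
    case_system_ids=["act-001", "act-003", "act-999"],
)
print(report["missing_in_case_system"])      # ['act-002']
print(report["unexplained_in_case_system"])  # ['act-999']
```

Either non-empty list would normally open an exception, since both directions indicate that the evidence trail and the system of record disagree.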

9.4 Customer service agent assistant

Use case:

AI supports contact center employees with policy answers, call summaries, complaint detection and CRM follow-up task creation.

Control	Continuous test	KRI
Approved knowledge source	Customer-facing answer cites approved source	unsupported answer rate
Complaint escalation	Complaint indicators route to complaint workflow	missed complaint escalation rate
CRM write approval	Human approval token required for CRM task	write without approval count
Summary accuracy sample	Summaries reviewed against transcript	material omission rate
PII redaction	Telemetry redacts sensitive fields	PII leakage count
Staff overreliance	Monitor low-edit acceptance in high-risk scenarios	blind acceptance rate

Actions: limit assistant to read-only mode during complaint SOP index issue, retrain agents when blind acceptance increases, add exact complaint language scenarios to eval and sampling.

10. Exception, Action and Evidence Design

Severity:

Severity	Definition	Required response
Critical	Customer harm, unauthorized action, regulatory breach or material evidence failure	Immediate containment, executive notification, incident assessment
High	Material control failed but impact appears bounded	Owner action within SLA, weekly review until closure
Medium	Control weakness or threshold breach with limited impact	Action plan and trend monitoring
Low	Documentation, evidence freshness or minor process gap	Batch remediation and monthly review

Lifecycle:

Detect
  -> classify severity
  -> assign owner
  -> contain if needed
  -> analyze root cause
  -> define action
  -> implement fix
  -> retest control
  -> close with evidence
  -> feed learning into eval, control or training

Closure requires:

Root cause documented.
Affected population identified.
Customer or operational impact assessed.
Corrective action implemented.
Control retest passes.
Evidence archived.
Repeat-failure prevention defined.
Residual risk accepted by the right owner if not fully remediated.
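
The closure requirements above can be applied as a gate that refuses to close an exception while any item is missing. This is a sketch only: the field names are hypothetical, and it assumes residual risk acceptance is required only when the record is flagged as not fully remediated.

```python
# Hypothetical field names, one per closure requirement in the list above.
CLOSURE_CRITERIA = [
    "root_cause_documented",
    "affected_population_identified",
    "impact_assessed",
    "corrective_action_implemented",
    "retest_passed",
    "evidence_archived",
    "repeat_failure_prevention_defined",
]


def closure_gaps(record: dict) -> list:
    """Return the closure requirements that are not yet satisfied."""
    gaps = [name for name in CLOSURE_CRITERIA if not record.get(name)]
    # Residual risk acceptance applies only when remediation is incomplete.
    if not record.get("fully_remediated", True) and not record.get("residual_risk_accepted_by"):
        gaps.append("residual_risk_accepted_by")
    return gaps


def can_close(record: dict) -> bool:
    return not closure_gaps(record)


record = {name: True for name in CLOSURE_CRITERIA}
record["evidence_archived"] = False
print(closure_gaps(record))  # ['evidence_archived']
```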

Evidence categories:

Category	Examples
Design evidence	control objective, risk mapping, process map, policy mapping
Implementation evidence	configuration, policy-as-code, tool permission, prompt registry
Operating evidence	trace events, control test results, sampling records, dashboard snapshots
Exception evidence	exception record, root cause, action plan, approval, retest result
Management evidence	review minutes, risk acceptance, investment decision, control rating
Audit evidence	independent review, reperformance result, evidence quality score

Evidence contract fields:

trace_id
event_time
use_case_id
risk_tier
control_id
control_test_id
release_id
model_version
prompt_version
rag_index_version
tool_contract_version
policy_ruleset_version
human_review_status
policy_decision
customer_impact_category
evidence_uri
retention_class
redaction_status
control_owner
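
An evidence contract is only useful if it is checked. The sketch below tests events against the field list above and computes a coverage ratio that could feed an evidence completeness KRI. Treating every field as mandatory for every event is an assumption; in practice some fields apply only to certain event types.

```python
REQUIRED_FIELDS = [
    "trace_id", "event_time", "use_case_id", "risk_tier", "control_id",
    "control_test_id", "release_id", "model_version", "prompt_version",
    "rag_index_version", "tool_contract_version", "policy_ruleset_version",
    "human_review_status", "policy_decision", "customer_impact_category",
    "evidence_uri", "retention_class", "redaction_status", "control_owner",
]


def missing_evidence_fields(event: dict) -> list:
    """Return contract fields that are absent or empty in the event."""
    return [name for name in REQUIRED_FIELDS if event.get(name) in (None, "")]


def evidence_coverage(events):
    """Share of events that satisfy the full contract; None for an empty population."""
    events = list(events)
    if not events:
        return None
    complete = sum(1 for event in events if not missing_evidence_fields(event))
    return complete / len(events)


# Hypothetical events: one complete, one from a route that dropped the index version.
complete_event = {name: "present" for name in REQUIRED_FIELDS}
fallback_event = dict(complete_event, rag_index_version=None)
print(missing_evidence_fields(fallback_event))  # ['rag_index_version']
```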

Good CCM does not mean storing every raw prompt forever. Store structured evidence where possible, control raw prompt retention by risk and privacy policy, hash identifiers for analytics, use stricter access for lending/AML/fraud/complaint records, and make sampling artifacts retrievable without exposing unnecessary PII.

11. Templates with Filled Examples

11.1 Control test card

control_id: AI-LEND-EXP-002
control_name: Lending explanation must match approved decision reason codes
business_process: unsecured_personal_loan_adverse_action
risk_statement: Customer receives inaccurate or misleading explanation for credit decision
control_owner: Head of Credit Operations Controls
control_activity: Explanation generator uses only approved reason codes and approved policy text
test_method: automated_reason_code_reconciliation plus weekly stratified human sample
population: all AI-generated lending explanations
cadence: real_time automated check and weekly human review
pass_rule: output_reason_codes are subset of decision_engine_reason_codes and policy citation is current
failure_severity: high for mismatch, critical for customer-visible prohibited explanation
kri_link: KRI-AI-LEND-003 reason-code mismatch rate
evidence_source: ai_explanation_events, decision_engine_events, citation_validation_results
management_action: stop customer-visible explanation path when mismatch rate exceeds red threshold
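
The pass_rule in this card is a subset test plus a freshness flag, which can be sketched as follows. The reason codes shown are made up for illustration, and how "policy citation is current" gets determined is left to the caller.

```python
def reason_code_mismatches(output_reason_codes, decision_engine_reason_codes) -> list:
    """Reason codes in the explanation that the decision engine did not record."""
    return sorted(set(output_reason_codes) - set(decision_engine_reason_codes))


def passes_reason_code_rule(output_reason_codes, decision_engine_reason_codes,
                            policy_citation_current: bool) -> bool:
    """Pass only when output codes are a subset of decision codes and the citation is current."""
    no_mismatch = not reason_code_mismatches(output_reason_codes, decision_engine_reason_codes)
    return no_mismatch and policy_citation_current


# Hypothetical codes: the explanation mentions R07, which the decision engine never produced.
decision_codes = ["R01", "R04"]
explanation_codes = ["R01", "R07"]
print(reason_code_mismatches(explanation_codes, decision_codes))  # ['R07']
print(passes_reason_code_rule(explanation_codes, decision_codes, True))  # False
```

Returning the offending codes, not just a boolean, gives the exception record a concrete finding to cite.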

11.2 Exception record

exception_id: EX-AI-LEND-2026-0630-004
opened_on: 2026-06-30
severity: high
control_id: AI-LEND-EXP-002
detected_by: automated_reason_code_reconciliation
finding: 17 explanations contained one reason not present in the decision engine record
affected_population: unsecured personal loan applications in online channel
immediate_containment: customer-visible AI explanation disabled for affected product
owner: Head of Credit Operations Controls
root_cause: prompt v22 allowed summarization of bureau factors outside approved reason-code list
corrective_action: revert to prompt v21, add reason-code schema validation, expand eval set with 80 cases
retest_result: zero mismatches in 1,200 replayed explanations
closure_evidence: evidence://exceptions/EX-AI-LEND-2026-0630-004/closure-pack
closed_on: 2026-07-03

11.3 Monthly control effectiveness memo

# Monthly AI Control Effectiveness Review: Customer Service Agent Assistant

Review period: 2026-06-01 to 2026-06-30
System: customer_service_agent_assist
Risk tier: Tier3 for CRM write-enabled workflow
Overall rating: Effective with monitored exceptions

Key results:
- CRM write approval control passed for 100% of 18,442 write events.
- Unsupported citation rate was 0.31%, within amber threshold.
- Complaint escalation miss rate improved from 0.09% to 0.04%.
- Evidence completeness was 99.72%, with missing fields concentrated in fallback route events.

Open exception:
- EX-AI-CS-2026-0628-011: fallback route missing rag_index_version in 47 traces, due 2026-07-05.

Management actions:
- Knowledge Operations will retest complaint SOP retrieval after index patch.
- Architecture will add schema enforcement for fallback route telemetry.
- Operations will increase complaint-sensitive call sampling in July.

Conclusion:
Controls operated within appetite for the review period, with one telemetry completeness exception under active remediation.

11.4 Quarterly owner attestation

# Quarterly AI Control Owner Attestation

Control owner: Head of Customer Service Operations Controls
Quarter: 2026 Q2
Controls covered: AI-CS-CRM-003, AI-CS-RAG-004, AI-CS-COMP-006, AI-PRIV-LOG-002

Attestation:
I reviewed control test results, exceptions, management actions, incident linkages and evidence completeness. Based on the evidence reviewed, the controls operated within approved risk appetite for 2026 Q2 except for the documented exception below.

Exception requiring continued monitoring:
- EX-AI-CS-2026-0628-011: fallback route missing rag_index_version, due 2026-07-05.

Required improvements:
- Enforce telemetry schema for fallback routes.
- Increase complaint-sensitive sample size from 100 to 200 per week for July.

Residual risk:
Accepted through 2026-07-15 for telemetry completeness gap in fallback route, limited to read-only answer flow and excluding CRM write events.

12. 30-Day Lab

Goal:

Build a portfolio-ready AI Continuous Control Monitoring pack for one financial retail AI use case.

Recommended scenario:

Customer service agent assistant with RAG answers, call summaries, complaint detection and human-approved CRM follow-up task creation.

Day	Work	Output
1	Choose use case and process boundary	use case scope memo
2	Identify customer outcomes and operational decisions	decision and outcome map
3	List 12 material AI risks	risk register excerpt
4	Define 10 control objectives	control objective table
5	Assign control owners and RACI	owner matrix
6	Map controls to NIST AI RMF functions	Govern / Map / Measure / Manage mapping
7	Write one-page executive framing	CCM executive summary
8	Convert controls to test methods	control test catalog
9	Define runtime, test and exception schemas	schema examples
10	Define evidence fields and retention classes	evidence contract
11	Design automated checks for tool write, citation and telemetry completeness	test rule spec
12	Design stratified sampling plan	sampling methodology
13	Define pass/fail thresholds and severity	severity matrix
14	Run tabletop test using 20 synthetic events	test result walkthrough
15	Define 15 KRIs with owners and actions	KRI catalog
16	Design executive dashboard	dashboard wireframe in markdown
17	Design operations dashboard	operational metrics spec
18	Design risk/compliance dashboard	risk view spec
19	Write exception lifecycle	exception workflow
20	Write management action rules	action matrix
21	Draft monthly effectiveness memo	memo sample
22	Map lifecycle gates from intake to periodic review	gate checklist
23	Add AML, lending, payments and customer service variations	scenario appendix
24	Write quarterly owner attestation	attestation sample
25	Create evidence archive index	evidence map
26	Add SR 26-2 nuance and GenAI/agentic governance position	regulatory nuance note
27	Review distinction from control library and evidence binder	positioning note
28	Write interview 30-second and 2-minute answers	interview pack
29	Assemble final portfolio pack	PDF-ready markdown package
30	Conduct self-review against rubric	final quality checklist

Rubric:

Dimension	Strong answer
Control thinking	Converts risks into control objectives, not generic metrics
Testability	Every material control has population, cadence, pass rule and evidence
Financial realism	Handles AML, lending, payments, complaints, privacy and operations
Architecture	Shows telemetry, schema, test engine, exception workflow and evidence store
Management action	Red thresholds trigger named actions and owners
Regulatory nuance	Uses SR 26-2 as current anchor and broader AI governance for GenAI/agents
Interview readiness	Explains distinction from control library and audit binder clearly

13. Interview Answers

13.1 What is AI continuous control monitoring?

30-second answer:

AI continuous control monitoring is the operating system for proving AI controls remain effective after launch. It links control objectives to runtime events, automated tests, sampling, KRIs, exceptions, owners, management actions and evidence. It is different from a static control library because it tests whether controls actually operate over time.

2-minute answer:

In financial retail, AI risk does not stop at release. A lending explanation prompt can drift, a RAG index can lose a critical policy, a customer service agent can gain write access, or a human review queue can become overloaded. Continuous control monitoring turns these risks into recurring control tests. For each material control, I define the owner, population, cadence, evidence fields, pass/fail rule, threshold, exception severity and management action. Then I instrument runtime events so traces carry model, prompt, index, tool, policy, human review and release metadata. Automated checks catch completeness, authorization, threshold and evidence gaps; risk-based sampling catches nuanced customer harm. Dashboards show KRI trends, exception aging, repeat failures and action closure. The goal is operating effectiveness over time, not a one-time audit screenshot.

13.2 How is this different from an audit evidence binder?

An audit evidence binder organizes proof for review. Continuous control monitoring generates and tests that proof continuously. The binder answers “what evidence do we have”; CCM answers “did the control operate, did it pass, what failed, who owns the action and is effectiveness improving or degrading.”

13.3 How would you design control tests for an agentic AI system?

I would test the full behavior chain: prompt, model route, retrieval, policy decision, tool permission, human confirmation, system-of-record write, telemetry and exception handling. For every write-enabled tool, I would require authorization tests, idempotency or compensating control evidence, approval-token completeness, reconciliation to the target system and reversal-rate KRIs.
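
Three of the write-tool checks named above (authorization, approval-token completeness and reconciliation to the target system) can be sketched as one pass over tool-call events. The event fields (`call_id`, `tool`, `agent_role`, `is_write`, `approval_token`), the `allowed_tools` mapping and the finding labels are assumptions made for illustration; a real system would take them from its own event schema.

```python
def check_write_tool_calls(
    tool_events: list[dict],
    allowed_tools: dict[str, set[str]],
    sor_write_ids: set[str],
) -> list[tuple[str, str]]:
    """Return (call_id, finding) pairs for write-enabled tool calls.

    tool_events   -- agent telemetry, one dict per tool call (hypothetical schema)
    allowed_tools -- agent role -> tools that role may call
    sor_write_ids -- call ids recorded by the system of record
    """
    findings = []
    write_events = [e for e in tool_events if e.get("is_write")]
    for e in write_events:
        call_id = e["call_id"]
        # Authorization: the role must be permitted to use this tool.
        if e["tool"] not in allowed_tools.get(e["agent_role"], set()):
            findings.append((call_id, "unauthorized_tool"))
        # Approval-token completeness: every write needs human confirmation.
        if not e.get("approval_token"):
            findings.append((call_id, "missing_approval_token"))
        # Reconciliation: the write must appear in the system of record.
        if call_id not in sor_write_ids:
            findings.append((call_id, "not_reconciled_to_system_of_record"))
    # Reverse reconciliation: writes in the target system with no agent trace.
    traced = {e["call_id"] for e in write_events}
    for write_id in sorted(sor_write_ids - traced):
        findings.append((write_id, "untraced_write_in_system_of_record"))
    return findings
```

Reconciling in both directions matters: a traced call that never reached the target system and a target-system write with no trace are different failures with different owners.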

13.4 What KRIs matter for customer-facing financial AI?

I would track unsupported answer rate, complaint spikes, adverse action reason mismatches, tool write reversals, missing human reviews, PII telemetry failures, fallback routes without a policy decision, override spikes, exception aging, evidence completeness and repeat control failures. The point is not to collect many metrics, but to connect each KRI to a risk appetite threshold and a named action.
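
Connecting a KRI to risk appetite and action can be made concrete by storing the thresholds, owner and action next to the metric. The sketch below is illustrative only: the `Kri` class, the amber/red values and the action strings are hypothetical, and real thresholds would come from the institution's approved risk appetite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kri:
    """Hypothetical KRI definition bound to an owner and an action."""
    name: str
    owner: str
    amber: float
    red: float
    action: str                   # management action triggered at red
    higher_is_worse: bool = True  # False for metrics like evidence completeness


def rate_kri(kri: Kri, value: float) -> dict:
    """Classify a KRI reading as green, amber or red."""
    def breached(limit: float) -> bool:
        return value >= limit if kri.higher_is_worse else value <= limit

    if breached(kri.red):
        status = "red"
    elif breached(kri.amber):
        status = "amber"
    else:
        status = "green"
    return {
        "kri": kri.name,
        "value": value,
        "status": status,
        "owner": kri.owner,
        # Only a red reading triggers the named management action.
        "action": kri.action if status == "red" else None,
    }
```

The `higher_is_worse` flag keeps rate-style KRIs (unsupported answers, reversals) and coverage-style KRIs (evidence completeness) in the same structure without inverting the metric.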

13.5 How should SR 26-2 affect GenAI governance?

SR 26-2 superseded SR 11-7 and SR 21-8 on 2026-04-17, so I would not cite SR 11-7 as the current anchor. I would keep the useful principles of risk-based governance, validation, ongoing monitoring and effective challenge, but govern GenAI and agentic AI through a broader AI assurance architecture that draws on NIST AI RMF and ISO/IEC 42001 and also covers operational controls, privacy, security, third-party risk, customer harm and incident response.

13.6 What would you show in a portfolio?

I would show a CCM pack for one financial retail AI use case: control registry, test catalog, event schema, KRI dashboard, exception workflow, RACI, lifecycle gates, sample monthly effectiveness memo and one incident-to-control-improvement example. That demonstrates PM, BA and architecture depth because it connects customer outcomes, controls, telemetry and management action.

14. Anti-Patterns and Self-Check

Anti-pattern	Why it fails	Better practice
Monitoring only latency and cost	Misses customer harm and control failure	Add control KRIs and evidence completeness
Using control library as proof	Design does not prove operation	Run recurring tests and preserve results
Dashboard without owners	Red indicators do not create action	Bind each KRI to owner, threshold and action
All sampling is random	Rare high-risk failures disappear	Use risk-stratified sampling
Closing incident without control update	Same failure returns	Feed root cause into eval, control and training
Treating HITL as magic	Reviewers can be overloaded or can over-trust the AI	Monitor queue, edits, overrides and blind acceptance
Ignoring telemetry schema	Evidence cannot support audit or replay	Define event fields before release
Forcing agents into old SR 11-7 framing	Tool use, autonomy and workflow risk are under-modeled	Use SR 26-2 plus broader AI control assurance
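
The risk-stratified sampling recommended in the table can be sketched as below. The tier names and sampling rates are assumptions chosen for illustration, not recommended values; the point is that high-risk cases are reviewed in full while low-risk cases are sampled lightly, so rare high-impact failures are not diluted by volume.

```python
import math
import random
from collections import defaultdict

# Assumed rates per risk tier; real values belong in the sampling policy.
SAMPLING_RATES = {"high": 1.0, "medium": 0.25, "low": 0.05}


def stratified_sample(cases: list[dict], rates: dict = SAMPLING_RATES,
                      seed: int = 0) -> list[dict]:
    """Draw a review sample per risk tier instead of one flat random sample."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible for audit
    strata = defaultdict(list)
    for case in cases:
        strata[case["risk_tier"]].append(case)

    sample = []
    for tier, members in strata.items():
        # Unknown tiers default to full review, the conservative choice.
        rate = rates.get(tier, 1.0)
        # Round before ceil to avoid float noise; take at least one case per tier.
        k = max(1, math.ceil(round(rate * len(members), 9)))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample
```

Recording the seed with the test result lets a reviewer re-draw the same sample later, which supports the replay and evidence requirements discussed elsewhere in this playbook.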

Before claiming maturity, confirm:

Every material AI risk has a control objective.
Every material control has an owner and test method.
Each test has population, cadence, pass rule and severity.
Runtime events carry the fields needed for testing.
Evidence is redacted, retained and retrievable.
KRIs have thresholds and named management actions.
Exceptions have due dates, owners, root cause and retest evidence.
Sampling is risk-stratified for high-impact use cases.
Incidents feed back into controls, evals and training.
Management review uses trends, not only point-in-time status.
SR 26-2 is cited as the current MRM anchor, with GenAI and agentic AI governed through broader AI assurance.
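
The exception item in this checklist (due dates, owners, root cause and retest evidence) lends itself to an automated hygiene check. The following is a rough sketch under assumed field names and status values (`open` / `closed`); it is not tied to any particular GRC or ticketing tool.

```python
from datetime import date

# Assumed minimum fields per lifecycle state.
REQUIRED_WHILE_OPEN = ("owner", "due_date")
REQUIRED_TO_CLOSE = ("owner", "due_date", "root_cause", "retest_evidence")


def review_exception(exc: dict, today: date) -> list[str]:
    """List hygiene issues for one exception record (hypothetical schema)."""
    required = REQUIRED_TO_CLOSE if exc["status"] == "closed" else REQUIRED_WHILE_OPEN
    issues = [f"missing_{field}" for field in required if not exc.get(field)]
    due = exc.get("due_date")
    # Only open exceptions can be overdue.
    if exc["status"] == "open" and due is not None and due < today:
        issues.append("overdue")
    return issues
```

Run across the exception register, a check like this flags exceptions closed without root cause or retest evidence, which is the pattern behind repeat control failures.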

15. One-Page Summary

AI Continuous Control Monitoring is where AI governance becomes operational. The control library says what should be controlled; the evidence binder organizes proof; the release gate decides whether a version can go live. CCM asks every day whether the control ran, passed, failed, triggered an exception, received management action, was retested, and can be proven. For a CBAP-level financial retail PM/BA/architect, the differentiator is translating AI behavior into operational controls, tests, KRIs, exceptions, evidence and management decisions that hold up under product pressure, incident pressure and audit pressure.