AI Continuous Control Monitoring / Assurance Playbook

728AI_CONTINUOUS_CONTROL_MONITORING_ASSURANCE_PLAYBOOK.md

AI Continuous Control Monitoring / Assurance Architecture Playbook

Audience: CBAP+ Business Analyst, AI Product Manager, Product Architect, Enterprise Architect, Control Owner, EvalOps Lead, Model Risk, Operational Risk, Compliance, Internal Audit, and financial retail business owners.

Core question: After a financial retail AI system goes live, how do you turn controls, evals, telemetry, incidents, exceptions, audit evidence and control owners into a continuously running assurance system that proves the controls were not effective only on launch day?

Goal: Establish a practical AI continuous control monitoring architecture covering the capability model, operating model, control test taxonomy, event schema, sampling, KRI dashboard, RACI, lifecycle gates, financial retail examples, templates, a 30-day lab and interview answers.

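This playbook is neither a basic BA tutorial nor generic audit material. It is written for advanced learners who already understand the AI control library, audit evidence binder, model risk, business process, requirements traceability and financial retail controls. The focus is not "which controls exist" but "whether controls keep running, whether they are effective, whether they are weakening, who is responsible, when to escalate, and how management learns that risk is changing."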

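Important note: This playbook is learning, architecture design and portfolio material. It does not constitute legal, regulatory, audit, model validation or compliance advice. Real financial institution projects must be confirmed by business owner, technology, security, privacy, legal, compliance, model risk, operational risk, third-party risk and internal audit in line with institutional policy.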


1. Executive Framing

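AI governance commonly shows three levels of maturity:

| Maturity | What it looks like | Risk |
| --- | --- | --- |
| Static control list | Policies, a control library and a launch checklist exist | No one knows whether controls are operating |
| Evidence binder | Launch evidence, approval records and test reports exist | Evidence is mostly static; production degradation is invisible |
| Continuous assurance | Controls are turned into events, tests, metrics, ownership and reviews | Requires architecture, process and organization to run together |

AI continuous control monitoring, abbreviated CCM: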

AI CCM is the continuous process of testing whether AI controls operate as designed, detecting exceptions and KRIs, assigning owners, driving management action, and preserving evidence of control effectiveness over time.

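It answers 10 management questions:

  1. Which AI controls are treated as material controls?
  2. Who owns each control?
  3. How is each control tested: automated testing, sample testing or manual review?
  4. How is severity defined for a control failure?
  5. Which exceptions are open, overdue or recurring?
  6. Which KRIs indicate that customer harm, compliance gaps or operational risk are rising?
  7. Which incidents indicate insufficient control design?
  8. Which management actions have been completed and verified as effective?
  9. Which evidence can prove that controls operate effectively?
  10. Which GenAI / agentic AI risks cannot be adequately covered by legacy model risk templates?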

Boundaries with adjacent assets:

| Asset | Main artifacts | Focus of this playbook |
| --- | --- | --- |
| AI Control Library | control objective, activity, risk-control mapping | recurring tests and operating effectiveness |
| AI Audit Evidence Binder | evidence index, traceability | evidence freshness and exception closure |
| AI Release Governance | gate, canary, rollback | post-release control monitoring |
| AI Observability | traces, metrics, logs | telemetry mapped to controls and KRIs |
| Model Risk Management | inventory, validation, monitoring | expanded prompt/RAG/tool/agent/workflow assurance |

Mature phrasing:

We monitor control effectiveness by linking AI runtime events to control objectives,
control owners, test methods, KRIs, exceptions, incidents, action plans and evidence.

2. Source Anchors

| Anchor | Official link | How this playbook uses it |
| --- | --- | --- |
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Uses the structured language of AI risk management to organize govern, map, measure and manage, with emphasis on continuous governance |
| NIST AIRC AI RMF Functions | https://airc.nist.gov/airmf-resources/airmf/ | Uses Govern / Map / Measure / Manage to map owner, context, metric/test and treatment |
| ISO/IEC 42001 | https://www.iso.org/standard/42001 | Places CCM inside an AI management system: operational control, performance evaluation, management review, improvement |
| Federal Reserve SR 26-2 | https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm | US model risk governance anchor as of 2026-06-30: SR 26-2 superseded SR 11-7 and SR 21-8 on 2026-04-17 |

SR 26-2 nuance:

  • SR 26-2 superseded SR 11-7 and SR 21-8 on 2026-04-17.
  • The risk-based approach, inventory, validation, ongoing monitoring, effective challenge and control ownership from traditional model risk remain valuable.
  • GenAI, RAG, agentic AI, tool-using AI and multi-agent workflows should use a broader AI governance and control assurance framework.
  • NIST AI RMF, ISO/IEC 42001, operational risk, privacy, security, third-party risk, customer harm, incident management and business controls should sit in the same assurance architecture.
  • In interviews and portfolios, do not treat SR 11-7 as the current primary reference; it can be cited as a historical concept, but the current anchor should be SR 26-2.

NIST AI RMF function mapping:

| Function | CCM interpretation | Example |
| --- | --- | --- |
| Govern | control owners, risk appetite, RACI, management review | AI control registry |
| Map | use case, business process, customer outcome, data flow | lending explanation mapped to adverse action |
| Measure | control tests, evals, sampling, KRIs, dashboards | citation support test |
| Manage | exceptions, incidents, actions, rollback, risk acceptance | disable write tool after control failure |

3. Capability Model

AI CCM is not a single tool but a set of capabilities:

Policy and Control Layer
  -> Control Registry
  -> Control Test Catalog
  -> Telemetry and Evidence Contracts
  -> Automated Control Checks
  -> Sampling and Independent Review
  -> Exception and Action Management
  -> KRI Dashboards
  -> Management Review
  -> Assurance Evidence Archive

| Capability | Purpose | Minimum viable implementation | Mature implementation |
| --- | --- | --- | --- |
| Control registry | Record controls, owners, cadence and risk mapping | GRC table | Versioned registry linked to risks, requirements and telemetry |
| Control test catalog | Turn each objective into a test method | Manual checklist | Automated and sampled tests with pass/fail logic |
| Evidence contract | Define the fields that runtime must capture | Trace ID and model version | Control ID, release ID, policy decision, tool version, human action |
| Event collection | Collect AI runtime and workflow events | Logs | Event stream with schema validation and lineage |
| Automated checks | Detect control failures automatically | Scheduled SQL checks | Policy-as-code, eval pipelines, drift checks and alerting |
| Sampling | Spot-check high-risk outputs | Weekly manual review | Risk-stratified independent sampling |
| Exception workflow | Record control failures and their disposition | Ticket queue | Severity, owner, due date, root cause, retest, residual risk |
| KRI dashboard | Give management a view of control health | Basic dashboard | Appetite thresholds, trend, segment, aging and action status |
| Management review | Periodically evaluate control effectiveness | Monthly meeting | Evidence-based review with decisions and escalations |
| Assurance archive | Preserve control test evidence | Shared folder | Access-controlled evidence store with retention and retrieval |

Architecture:

AI Runtime
  -> event pipeline with schema validation and PII redaction
  -> control test engine with rules, evals, drift checks and evidence checks
  -> assurance operations with exceptions, incidents and actions
  -> dashboards, monthly memos, audit packs and regulator response

Design principles: treat controls as operational products; test operating effectiveness, not only model performance; link every KRI to risk appetite; create exceptions automatically where possible; use risk-based sampling; preserve traceability from customer outcome to AI artifact versions; close incidents only after control improvements are retested; review GenAI and agentic AI under broader AI assurance, not old MRM alone.


4. Operating Model and RACI

| Role | Responsibility |
| --- | --- |
| Business Control Owner | Owns business risk, approves objective, accepts residual risk |
| AI Product Manager | Owns product behavior, customer impact and risk-value tradeoffs |
| CBAP+ BA | Maps process, requirements, decisions, exceptions, evidence and owners |
| Product Architect | Designs control plane, telemetry, policy gateway, rollback and evidence architecture |
| EvalOps Lead | Designs evals, regression tests, sampling and performance monitoring |
| Model Risk / AI Risk | Provides independent challenge and monitoring expectations |
| Compliance / Legal | Confirms regulatory interpretation, disclosures and complaint concerns |
| Security / Privacy | Reviews data handling, access, redaction, retention and third-party exposure |
| Operations Lead | Runs queues, human review, exception triage, remediation and training |
| Internal Audit | Reviews control design, operating effectiveness and evidence quality |

| Forum | Frequency | Inputs | Decisions |
| --- | --- | --- | --- |
| Daily control triage | Daily for Tier 2/3 | critical exceptions, KRI breaches, incidents | pause, escalate, assign action |
| Weekly assurance review | Weekly | test results, sampling failures, exception aging | remediate, retest, adjust threshold |
| Monthly effectiveness review | Monthly | trend dashboards, repeat failures, action closure | control rating, management actions |
| Quarterly AI management review | Quarterly | portfolio KRIs, audit findings, risk appetite | investment, policy update, risk acceptance |
| Incident-triggered review | Event-driven | incident report, affected traces, root cause | rollback, customer remediation, redesign |

RACI:

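| Activity | Business | AI PM | BA | Architect | EvalOps | Risk/Compliance | Security/Privacy | Operations | Audit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Define control objective | A | R | R | C | C | C | C | C | C |
| Map process and decision impact | C | R | A/R | C | C | C | C | R | C |
| Define evidence contract | C | C | R | A/R | R | C | A/R | C | C |
| Build automated test | C | C | C | R | A/R | C | C | C | C |
| Define sampling plan | C | C | R | C | A/R | A/R | C | R | C |
| Review failed test | A | R | R | C | R | C | C | A/R | C |
| Approve residual risk | A | C | C | C | C | A/R | C | C | C |
| Close management action | A | R | C | C | R | C | C | R | C |
| Assess effectiveness | C | C | C | C | C | A/R | C | C | A/R |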

Legend: A = accountable, R = responsible, C = consulted.

Control states: Designed -> Implemented -> Testing -> Effective -> Monitoring -> Exception -> Remediation -> Retest -> Effective. Exceptions can also move to Accepted Risk or Escalated; effective controls can be retired when the risk, process or system no longer applies.
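
The state flow above can be sketched as a small transition table. This is a minimal illustration in Python rather than a prescribed implementation: `ControlState`, `ALLOWED_TRANSITIONS` and `transition` are hypothetical names, and only the transitions named in the text are encoded (the path after a failed retest, for example, is not specified there and is left out).

```python
from enum import Enum


class ControlState(Enum):
    DESIGNED = "Designed"
    IMPLEMENTED = "Implemented"
    TESTING = "Testing"
    EFFECTIVE = "Effective"
    MONITORING = "Monitoring"
    EXCEPTION = "Exception"
    REMEDIATION = "Remediation"
    RETEST = "Retest"
    ACCEPTED_RISK = "Accepted Risk"
    ESCALATED = "Escalated"
    RETIRED = "Retired"


# Only the transitions named in the text; anything else is rejected.
ALLOWED_TRANSITIONS = {
    ControlState.DESIGNED: {ControlState.IMPLEMENTED},
    ControlState.IMPLEMENTED: {ControlState.TESTING},
    ControlState.TESTING: {ControlState.EFFECTIVE},
    ControlState.EFFECTIVE: {ControlState.MONITORING, ControlState.RETIRED},
    ControlState.MONITORING: {ControlState.EXCEPTION},
    ControlState.EXCEPTION: {
        ControlState.REMEDIATION,
        ControlState.ACCEPTED_RISK,
        ControlState.ESCALATED,
    },
    ControlState.REMEDIATION: {ControlState.RETEST},
    ControlState.RETEST: {ControlState.EFFECTIVE},
}


def transition(current, target):
    """Return the target state, or raise if the move is not in the table."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"{current.value} -> {target.value} is not an allowed transition")
    return target
```

Rejecting unlisted moves means a control cannot be marked Effective without passing through Testing or Retest, which is the point of the lifecycle.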


5. Control Test Taxonomy

| Test family | Purpose | AI example |
| --- | --- | --- |
| Completeness | Confirm every required control event exists | Every Tier 3 tool write has approval token |
| Accuracy | Confirm control decision is correct | Policy classifier blocks prohibited advice |
| Authorization | Confirm actor/system has authority | Only approved reviewer releases adverse action explanation |
| Timeliness | Confirm control runs within required time | AML escalation review completed within SLA |
| Threshold | Confirm metric stays within appetite | Unsupported answer rate below approved limit |
| Drift | Detect distribution or behavior shift | Retrieval source mix changed after corpus update |
| Reconciliation | Compare two systems or records | AI action log reconciles to CRM task creation |
| Sampling | Human or independent review of selected population | Weekly review of high-risk transcripts |
| Reperformance | Independent retest of control | Second-line reruns eval on locked dataset |
| Evidence quality | Confirm evidence is complete and usable | Trace has release_id, control_id and versions |
| Exception aging | Confirm open issues are managed | High severity exceptions not open beyond 14 days |
| Action effectiveness | Confirm remediation solved issue | Post-fix retest below threshold |
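
The Sampling family in the table above is usually risk-based, meaning higher-risk tiers are reviewed at a higher rate. The sketch below shows one way to draw such a sample in Python. The percentages in `SAMPLE_PERCENT`, the `min_per_tier` floor and the function name are assumptions for illustration; real rates belong in the approved sampling plan.

```python
import random
from collections import defaultdict

# Hypothetical sampling rates in percent per risk tier (illustrative only).
SAMPLE_PERCENT = {"Tier3": 20, "Tier2": 5, "Tier1": 1}


def stratified_sample(events, percent_by_tier=SAMPLE_PERCENT, min_per_tier=1, seed=0):
    """Draw a reproducible sample, stratified by each event's risk_tier."""
    rng = random.Random(seed)  # fixed seed so a reviewer can reproduce the draw
    by_tier = defaultdict(list)
    for event in events:
        by_tier[event["risk_tier"]].append(event)

    sample = []
    for tier in sorted(by_tier):
        items = by_tier[tier]
        pct = percent_by_tier.get(tier, 0)
        target = -(-len(items) * pct // 100)  # ceiling of len * pct / 100
        size = min(len(items), max(min_per_tier, target))
        sample.extend(rng.sample(items, size))
    return sample
```

The floor of one item per tier keeps low-volume or unlisted tiers from dropping out of review entirely, and the seed lets an independent reviewer reperform the same draw.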

By control object:

| Control object | Example test | Cadence |
| --- | --- | --- |
| Prompt | Prompt version belongs to approved registry | Every deploy and daily sample |
| Model route | Route uses approved model and region | Real time |
| RAG corpus | Citation source appears in approved manifest | Hourly and release-triggered |
| Retrieval | Critical policy document recall remains above threshold | Daily |
| Reranker | Compliance documents not demoted below rank 3 | Daily |
| Tool call | Write action has policy token and human approval | Real time |
| Policy gateway | Prohibited action is blocked | Daily |
| Human review | Required reviewer completed review before output | Real time |
| Telemetry | Required evidence fields present and redacted | Hourly |
| Incident | Incident created for critical KRI breach | Real time |
| Vendor | Vendor notice reviewed before material route change | Event-driven |
| Training | Staff completed procedure update | Monthly or release-triggered |

6. Event and Schema Examples

Control test event:

{
  "event_type": "ai.control_test_result",
  "event_time": "2026-06-30T14:15:03Z",
  "control_id": "AI-CS-CRM-003",
  "control_name": "Human confirmation required before CRM write",
  "use_case_id": "customer_service_agent_assist",
  "risk_tier": "Tier3",
  "test_id": "TST-CS-CRM-003-REALTIME",
  "test_method": "automated_completeness_check",
  "population_count": 1,
  "sample_count": 1,
  "pass_count": 1,
  "fail_count": 0,
  "result": "pass",
  "release_id": "rel-2026-06-30-004",
  "trace_id": "trc-9d71f4",
  "evidence_uri": "evidence://ai-control-tests/2026/06/30/trc-9d71f4",
  "owner_team": "Customer Service Operations"
}

AI runtime event enriched for control testing:

{
  "event_type": "ai.tool_call",
  "event_time": "2026-06-30T14:14:59Z",
  "trace_id": "trc-9d71f4",
  "customer_segment": "retail_checking",
  "channel": "contact_center",
  "model_version": "approved-route-cs-2026-06",
  "prompt_version": "cs_agent_prompt_v18",
  "rag_index_version": "retail_policy_index_2026_06_28",
  "tool_name": "crm.createFollowUpTask",
  "tool_contract_version": "crm-followup-v3",
  "tool_action_type": "write",
  "policy_decision": "allow_with_human_confirmation",
  "approval_token_id": "appr-6b38",
  "approver_role": "licensed_contact_center_rep",
  "pii_redaction_status": "redacted",
  "control_ids": ["AI-CS-CRM-003", "AI-PRIV-LOG-002", "AI-OPS-HITL-005"]
}

Exception event:

{
  "event_type": "ai.control_exception_opened",
  "event_time": "2026-06-30T15:00:00Z",
  "exception_id": "EX-AI-2026-0630-017",
  "control_id": "AI-RAG-CITE-004",
  "severity": "high",
  "reason": "Unsupported citation rate breached approved threshold for complaint policy questions",
  "detected_by": "daily_retrieval_support_test",
  "affected_use_case": "customer_service_agent_assist",
  "affected_release_id": "rel-2026-06-30-004",
  "owner_team": "Knowledge Operations",
  "due_date": "2026-07-05",
  "required_action": "Roll back complaint SOP index snapshot and retest critical complaint sample"
}

KRI specification:

kri_id: KRI-AI-CS-007
name: Unsupported citation rate for customer-visible answers
risk_theme: Customer harm and misleading information
control_objective: Customer-visible answers must be grounded in approved knowledge sources
population: all customer_service_agent_assist responses with citations
metric_formula: unsupported_citation_count / cited_response_count
green_threshold: <= 0.25%
amber_threshold: > 0.25% and <= 0.75%
red_threshold: > 0.75%
segmentation: [channel, product, customer_segment, policy_domain]
owner: Knowledge Operations
review_forum: Weekly AI Assurance Review
management_action_red: pause index ramp, open high severity exception, sample affected answers, retest corpus manifest
evidence_source: ai_response_events and citation_validation_results
retention: 7 years for regulated customer-impacting cases
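
The green, amber and red thresholds in the specification above translate directly into a classification rule. The sketch below is a minimal Python version. The function name and the `no_data` status for an empty population are assumptions not present in the specification; the threshold values are the ones listed.

```python
from fractions import Fraction

# Thresholds from the KRI specification, held as exact fractions.
GREEN_MAX = Fraction("0.0025")  # 0.25%
AMBER_MAX = Fraction("0.0075")  # 0.75%


def classify_kri(unsupported_citation_count, cited_response_count):
    """Apply metric_formula and return 'green', 'amber', 'red' or 'no_data'."""
    if cited_response_count == 0:
        return "no_data"  # assumption: an empty population is reported, not scored
    rate = Fraction(unsupported_citation_count, cited_response_count)
    if rate <= GREEN_MAX:
        return "green"
    if rate <= AMBER_MAX:
        return "amber"
    return "red"
```

Exact fractions avoid floating-point surprises at the boundary, so a rate of exactly 0.25% is always green. For example, 31 unsupported citations in 10,000 cited responses (0.31%) classifies as amber.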

SQL-like automated test:

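-- Daily completeness and authorization check for Tier 3 write tool calls.
-- Returns one row per day on which at least one write lacked valid approval.
select
  date_trunc('day', event_time) as test_day,
  count(*) as write_events,
  -- writes with no approval token at all
  sum(case when approval_token_id is null then 1 else 0 end) as missing_approval,
  -- writes approved by a missing or unauthorized role (null counts as invalid)
  sum(case when approver_role is null
             or approver_role not in ('licensed_contact_center_rep', 'supervisor')
           then 1 else 0 end) as invalid_approver
from ai_tool_call_events
where tool_action_type = 'write'
  and risk_tier = 'Tier3'
group by 1
-- aggregates are repeated because not every engine accepts select aliases in HAVING
having sum(case when approval_token_id is null then 1 else 0 end) > 0
    or sum(case when approver_role is null
                  or approver_role not in ('licensed_contact_center_rep', 'supervisor')
                then 1 else 0 end) > 0;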

7. Dashboards and KRIs

Executive dashboard:

| Widget | Question answered | Red signal |
| --- | --- | --- |
| Control health by use case | Which AI systems have failing material controls? | Any Tier 3 critical control failed |
| Exception aging | Which exceptions are overdue? | High severity open over 14 days |
| Repeat failures | Which controls fail repeatedly? | Same control fails 3 times in 60 days |
| Customer harm signals | Are complaints, reversals or adverse outcomes rising? | Above baseline plus appetite |
| Evidence completeness | Can we prove control operation? | Required evidence coverage below 99% for Tier 3 |
| Action closure | Are actions completed and retested? | Closed without retest evidence |
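
Two of the red signals above, exception aging and repeat failures, are simple date calculations. The sketch below shows them in Python under stated assumptions: the function names and the `failed_on` field are hypothetical, exceptions are dicts with `exception_id`, `severity`, `opened_on` and `closed_on`, and the 14-day and 3-in-60-day limits are taken from the table.

```python
from collections import Counter
from datetime import date, timedelta


def overdue_high_severity(exceptions, today, max_open_days=14):
    """IDs of high severity exceptions still open after max_open_days."""
    return [
        e["exception_id"]
        for e in exceptions
        if e["severity"] == "high"
        and e.get("closed_on") is None
        and (today - e["opened_on"]).days > max_open_days
    ]


def repeat_failing_controls(failures, today, window_days=60, min_failures=3):
    """Control IDs that failed at least min_failures times in the window."""
    window_start = today - timedelta(days=window_days)
    counts = Counter(
        f["control_id"] for f in failures if window_start <= f["failed_on"] <= today
    )
    return sorted(cid for cid, n in counts.items() if n >= min_failures)


today = date(2026, 7, 20)
open_exceptions = [
    {"exception_id": "EX-1", "severity": "high", "opened_on": date(2026, 6, 30), "closed_on": None},
    {"exception_id": "EX-2", "severity": "high", "opened_on": date(2026, 7, 10), "closed_on": None},
]
print(overdue_high_severity(open_exceptions, today))  # ['EX-1']
```

Keeping the limits as parameters lets the same check serve different risk tiers if the appetite statement sets different thresholds for them.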

Product and operations KRIs:

| KRI | Why it matters | Segments |
| --- | --- | --- |
| Human override rate | High override means AI may be low quality or mistrusted | product, queue, agent role |
| Manual review backlog | HITL control can fail when queue is overloaded | queue, risk tier, region |
| Tool write reversal rate | Indicates harmful system-of-record actions | tool, field, customer segment |
| Complaint escalation miss rate | Direct customer harm and regulatory risk | complaint type, product, channel |
| Fallback route rate | Fallback may bypass controls or degrade quality | model route, vendor, channel |
| Unsupported answer rate | Measures grounding control effectiveness | policy domain, language, channel |

Risk and compliance KRIs:

| KRI | Interpretation | Action when red |
| --- | --- | --- |
| Policy violation count | Output or action breached guardrail | Freeze affected capability |
| Adverse action reason mismatch | Lending explanation not aligned to decision reason | Stop customer-visible path |
| AML typology miss rate | Copilot misses known scenario | Escalate to AML governance |
| PII telemetry failure | Sensitive data logged outside approved fields | Trigger privacy incident assessment |
| Vendor route drift | Requests routed to unapproved model, region or endpoint | Disable route and review vendor controls |
| Evidence missing rate | Audit trail incomplete for material control | Hold release or issue exception |

Metric design rules: use both lagging indicators and leading KRIs; display trend and baseline; segment by use case, channel, product, customer segment, geography and risk tier; separate control failure from business performance decline; tie each red threshold to a named management action.


8. Lifecycle Gates

| Gate | Required decisions | Exit criteria |
| --- | --- | --- |
| Use case intake | Is AI use case in CCM scope? Which controls are material? Who owns them? | Control registry entries, owners, evidence contract, KRI appetite |
| Pilot | Are automated tests running? Does traffic carry metadata? Are exceptions created? | No unresolved critical design gap, rollback tested, evidence retrievable |
| Release | Did evals and control tests pass? Are red KRIs absent or accepted? | Release memo includes test results, dashboard live, evidence archived |
| Scale | Did effectiveness remain stable during ramp? Are segments behaving differently? | 14/30-day review complete, open exceptions have action plans |
| Periodic review | Should controls be enhanced, retired or reclassified? Are KRIs still meaningful? | Monthly memo archived, registry updated, repeat failures analyzed |
| Incident review | Did incident expose control design gap? Are customers affected? | Containment, root cause, control redesign, retest and customer remediation |

9. Financial Retail Examples

9.1 AML investigation copilot

Use case:

AI helps analysts summarize alerts, retrieve typology guidance, draft suspicious activity narratives and suggest next investigation steps. Human analysts remain accountable for final decision.

| Control | Continuous test | KRI |
| --- | --- | --- |
| Approved typology grounding | Draft citations reference approved AML corpus | unsupported typology citation rate |
| No autonomous SAR filing | AI cannot submit SAR or close case | autonomous restricted action count |
| Analyst review completeness | Every AI narrative has analyst review event | missing review rate |
| False negative challenge sample | Closed no-SAR cases sampled against typology rubric | challenge failure rate |
| Alert prioritization drift | Monitor distribution by typology and customer risk | typology drift index |
| Evidence retention | Trace links alert, retrieval, prompt, output, analyst edit | evidence completeness rate |

Actions: pause narrative auto-draft for high-failure typology, add missed scenario to eval and training, require supervisor review for affected segment until retest passes.

9.2 Lending decision explanation

Use case:

AI generates applicant-facing or banker-facing explanation text based on approved decision reason codes and policy language.

| Control | Continuous test | KRI |
| --- | --- | --- |
| Reason-code consistency | Output reason matches system decision reason | mismatch rate |
| Prohibited wording block | Output cannot imply guaranteed approval after reapplication | prohibited phrase count |
| Fair lending segment review | Sample explanations by segment and product | segment disparity signal |
| Source policy freshness | Explanation uses current policy version | stale policy citation rate |
| Human escalation | Ambiguous or complaint-sensitive cases route to reviewer | missed escalation rate |
| Adverse action evidence | Trace links decision, reason, prompt, policy and output | audit trace completeness |

Actions: disable customer-visible explanation prompt after mismatch breach, require legal/compliance review, increase sample size for product and segment with complaint spike.

9.3 Payments fraud and dispute support

Use case:

AI supports fraud operations by ranking suspicious payment events, drafting dispute summaries and recommending next-best investigation action.

| Control | Continuous test | KRI |
| --- | --- | --- |
| No unapproved payment block | AI recommendation cannot directly block payment | AI-only block count |
| High-value transaction review | Transactions above threshold require human review | missing high-value review rate |
| Tool permission boundary | AI can create case note but cannot reverse payment | restricted tool call attempts |
| Latency fallback control | Fallback route preserves fraud policy gateway | fallback without policy decision count |
| Customer notification accuracy | Generated dispute message matches case status | message-status mismatch rate |
| Reversal monitoring | Track reversed or corrected AI-assisted actions | reversal rate |

Actions: turn off recommendation display for payment type with high reversal rate, update policy gateway, reconcile AI action log to case management system daily.
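
The daily reconciliation named in the actions above compares two record sets by a shared key and reports what appears on only one side. The sketch below is a minimal Python version. The `case_note_id` key and the function name are hypothetical; a real job would read both sides from the AI action log and the case management system.

```python
def reconcile(ai_action_log, case_system_records, key="case_note_id"):
    """Compare two lists of dicts by key and report one-sided records."""
    ai_keys = {row[key] for row in ai_action_log}
    case_keys = {row[key] for row in case_system_records}
    return {
        # AI logged an action that never reached the system of record
        "missing_in_case_system": sorted(ai_keys - case_keys),
        # system of record holds an entry with no AI log line behind it
        "missing_in_ai_log": sorted(case_keys - ai_keys),
        "matched": len(ai_keys & case_keys),
    }


ai_log = [{"case_note_id": "n1"}, {"case_note_id": "n2"}, {"case_note_id": "n3"}]
case_system = [{"case_note_id": "n2"}, {"case_note_id": "n3"}, {"case_note_id": "n4"}]
result = reconcile(ai_log, case_system)
```

Either one-sided list being non-empty is a candidate control exception: the first suggests a failed or dropped write, the second suggests an action that bypassed AI telemetry.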

9.4 Customer service agent assistant

Use case:

AI supports contact center employees with policy answers, call summaries, complaint detection and CRM follow-up task creation.

| Control | Continuous test | KRI |
| --- | --- | --- |
| Approved knowledge source | Customer-facing answer cites approved source | unsupported answer rate |
| Complaint escalation | Complaint indicators route to complaint workflow | missed complaint escalation rate |
| CRM write approval | Human approval token required for CRM task | write without approval count |
| Summary accuracy sample | Summaries reviewed against transcript | material omission rate |
| PII redaction | Telemetry redacts sensitive fields | PII leakage count |
| Staff overreliance | Monitor low-edit acceptance in high-risk scenarios | blind acceptance rate |

Actions: limit assistant to read-only mode during complaint SOP index issue, retrain agents when blind acceptance increases, add exact complaint language scenarios to eval and sampling.


10. Exception, Action and Evidence Design

Severity:

| Severity | Definition | Required response |
| --- | --- | --- |
| Critical | Customer harm, unauthorized action, regulatory breach or material evidence failure | Immediate containment, executive notification, incident assessment |
| High | Material control failed but impact appears bounded | Owner action within SLA, weekly review until closure |
| Medium | Control weakness or threshold breach with limited impact | Action plan and trend monitoring |
| Low | Documentation, evidence freshness or minor process gap | Batch remediation and monthly review |

Lifecycle:

Detect
  -> classify severity
  -> assign owner
  -> contain if needed
  -> analyze root cause
  -> define action
  -> implement fix
  -> retest control
  -> close with evidence
  -> feed learning into eval, control or training

Closure requires:

  • Root cause documented.
  • Affected population identified.
  • Customer or operational impact assessed.
  • Corrective action implemented.
  • Control retest passes.
  • Evidence archived.
  • Repeat-failure prevention defined.
  • Residual risk accepted by the right owner if not fully remediated.
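
The closure requirements above can be enforced as a gate that lists what is still missing before an exception may be closed. The sketch below is one possible Python shape. The field names `impact_assessment`, `repeat_failure_prevention`, `retest_passed`, `fully_remediated` and `residual_risk_accepted_by` are assumptions introduced for illustration; the other names follow the exception record template in this playbook.

```python
# Fields that must be non-empty before closure (names partly hypothetical).
REQUIRED_FIELDS = (
    "root_cause",
    "affected_population",
    "impact_assessment",
    "corrective_action",
    "closure_evidence",
    "repeat_failure_prevention",
)


def closure_gaps(record):
    """Return the closure requirements an exception record does not yet meet."""
    gaps = [name for name in REQUIRED_FIELDS if not record.get(name)]
    if record.get("retest_passed") is not True:
        gaps.append("retest_passed")
    # Residual risk sign-off is only needed when remediation is partial.
    if not record.get("fully_remediated", False) and not record.get("residual_risk_accepted_by"):
        gaps.append("residual_risk_accepted_by")
    return gaps
```

An empty list means the record is ready to close; anything else can be shown to the owner as the remaining checklist.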

Evidence categories:

| Category | Examples |
| --- | --- |
| Design evidence | control objective, risk mapping, process map, policy mapping |
| Implementation evidence | configuration, policy-as-code, tool permission, prompt registry |
| Operating evidence | trace events, control test results, sampling records, dashboard snapshots |
| Exception evidence | exception record, root cause, action plan, approval, retest result |
| Management evidence | review minutes, risk acceptance, investment decision, control rating |
| Audit evidence | independent review, reperformance result, evidence quality score |

Evidence contract fields:

trace_id
event_time
use_case_id
risk_tier
control_id
control_test_id
release_id
model_version
prompt_version
rag_index_version
tool_contract_version
policy_ruleset_version
human_review_status
policy_decision
customer_impact_category
evidence_uri
retention_class
redaction_status
control_owner

Good CCM does not mean storing every raw prompt forever. Store structured evidence where possible, control raw prompt retention by risk and privacy policy, hash identifiers for analytics, use stricter access for lending/AML/fraud/complaint records, and make sampling artifacts retrievable without exposing unnecessary PII.


11. Templates with Filled Examples

11.1 Control test card

control_id: AI-LEND-EXP-002
control_name: Lending explanation must match approved decision reason codes
business_process: unsecured_personal_loan_adverse_action
risk_statement: Customer receives inaccurate or misleading explanation for credit decision
control_owner: Head of Credit Operations Controls
control_activity: Explanation generator uses only approved reason codes and approved policy text
test_method: automated_reason_code_reconciliation plus weekly stratified human sample
population: all AI-generated lending explanations
cadence: real_time automated check and weekly human review
pass_rule: output_reason_codes are subset of decision_engine_reason_codes and policy citation is current
failure_severity: high for mismatch, critical for customer-visible prohibited explanation
kri_link: KRI-AI-LEND-003 reason-code mismatch rate
evidence_source: ai_explanation_events, decision_engine_events, citation_validation_results
management_action: stop customer-visible explanation path when mismatch rate exceeds red threshold
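
The `pass_rule` in the card above has two parts: the output reason codes must be a subset of the decision engine reason codes, and the policy citation must be current. The sketch below expresses that rule in Python. The function name, the parameter names and the idea of comparing policy version strings are assumptions for illustration.

```python
def check_lending_explanation(
    output_reason_codes,
    decision_engine_reason_codes,
    cited_policy_version,
    current_policy_version,
):
    """Evaluate the pass rule for one AI-generated lending explanation."""
    # Codes in the explanation that the decision engine never produced.
    unexpected = sorted(set(output_reason_codes) - set(decision_engine_reason_codes))
    stale = cited_policy_version != current_policy_version
    return {
        "passed": not unexpected and not stale,
        "unexpected_reason_codes": unexpected,
        "stale_policy_citation": stale,
    }


outcome = check_lending_explanation(
    output_reason_codes=["R01", "R07"],
    decision_engine_reason_codes=["R01", "R03"],
    cited_policy_version="2026-06",
    current_policy_version="2026-06",
)
```

Returning the offending codes rather than a bare pass or fail gives the exception record its finding text without a second lookup.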

11.2 Exception record

exception_id: EX-AI-LEND-2026-0630-004
opened_on: 2026-06-30
severity: high
control_id: AI-LEND-EXP-002
detected_by: automated_reason_code_reconciliation
finding: 17 explanations contained one reason not present in the decision engine record
affected_population: unsecured personal loan applications in online channel
immediate_containment: customer-visible AI explanation disabled for affected product
owner: Head of Credit Operations Controls
root_cause: prompt v22 allowed summarization of bureau factors outside approved reason-code list
corrective_action: revert to prompt v21, add reason-code schema validation, expand eval set with 80 cases
retest_result: zero mismatches in 1,200 replayed explanations
closure_evidence: evidence://exceptions/EX-AI-LEND-2026-0630-004/closure-pack
closed_on: 2026-07-03

11.3 Monthly control effectiveness memo

# Monthly AI Control Effectiveness Review: Customer Service Agent Assistant

Review period: 2026-06-01 to 2026-06-30
System: customer_service_agent_assist
Risk tier: Tier3 for CRM write-enabled workflow
Overall rating: Effective with monitored exceptions

Key results:
- CRM write approval control passed for 100% of 18,442 write events.
- Unsupported citation rate was 0.31%, within amber threshold.
- Complaint escalation miss rate improved from 0.09% to 0.04%.
- Evidence completeness was 99.72%, with missing fields concentrated in fallback route events.

Open exception:
- EX-AI-CS-2026-0628-011: fallback route missing rag_index_version in 47 traces, due 2026-07-05.

Management actions:
- Knowledge Operations will retest complaint SOP retrieval after index patch.
- Architecture will add schema enforcement for fallback route telemetry.
- Operations will increase complaint-sensitive call sampling in July.

Conclusion:
Controls operated within appetite for the review period, with one telemetry completeness exception under active remediation.

11.4 Quarterly owner attestation

# Quarterly AI Control Owner Attestation

Control owner: Head of Customer Service Operations Controls
Quarter: 2026 Q2
Controls covered: AI-CS-CRM-003, AI-CS-RAG-004, AI-CS-COMP-006, AI-PRIV-LOG-002

Attestation:
I reviewed control test results, exceptions, management actions, incident linkages and evidence completeness. Based on the evidence reviewed, the controls operated within approved risk appetite for 2026 Q2 except for the documented exception below.

Exception requiring continued monitoring:
- EX-AI-CS-2026-0628-011: fallback route missing rag_index_version, due 2026-07-05.

Required improvements:
- Enforce telemetry schema for fallback routes.
- Increase complaint-sensitive sample size from 100 to 200 per week for July.

Residual risk:
Accepted through 2026-07-15 for telemetry completeness gap in fallback route, limited to read-only answer flow and excluding CRM write events.

12. 30-Day Lab

Goal:

Build a portfolio-ready AI Continuous Control Monitoring pack for one financial retail AI use case.

Recommended scenario:

Customer service agent assistant with RAG answers, call summaries, complaint detection and human-approved CRM follow-up task creation.

| Day | Work | Output |
| --- | --- | --- |
| 1 | Choose use case and process boundary | use case scope memo |
| 2 | Identify customer outcomes and operational decisions | decision and outcome map |
| 3 | List 12 material AI risks | risk register excerpt |
| 4 | Define 10 control objectives | control objective table |
| 5 | Assign control owners and RACI | owner matrix |
| 6 | Map controls to NIST AI RMF functions | Govern / Map / Measure / Manage mapping |
| 7 | Write one-page executive framing | CCM executive summary |
| 8 | Convert controls to test methods | control test catalog |
| 9 | Define runtime, test and exception schemas | schema examples |
| 10 | Define evidence fields and retention classes | evidence contract |
| 11 | Design automated checks for tool write, citation and telemetry completeness | test rule spec |
| 12 | Design stratified sampling plan | sampling methodology |
| 13 | Define pass/fail thresholds and severity | severity matrix |
| 14 | Run tabletop test using 20 synthetic events | test result walkthrough |
| 15 | Define 15 KRIs with owners and actions | KRI catalog |
| 16 | Design executive dashboard | dashboard wireframe in markdown |
| 17 | Design operations dashboard | operational metrics spec |
| 18 | Design risk/compliance dashboard | risk view spec |
| 19 | Write exception lifecycle | exception workflow |
| 20 | Write management action rules | action matrix |
| 21 | Draft monthly effectiveness memo | memo sample |
| 22 | Map lifecycle gates from intake to periodic review | gate checklist |
| 23 | Add AML, lending, payments and customer service variations | scenario appendix |
| 24 | Write quarterly owner attestation | attestation sample |
| 25 | Create evidence archive index | evidence map |
| 26 | Add SR 26-2 nuance and GenAI/agentic governance position | regulatory nuance note |
| 27 | Review distinction from control library and evidence binder | positioning note |
| 28 | Write interview 30-second and 2-minute answers | interview pack |
| 29 | Assemble final portfolio pack | PDF-ready markdown package |
| 30 | Conduct self-review against rubric | final quality checklist |

Rubric:

| Dimension | Strong answer |
| --- | --- |
| Control thinking | Converts risks into control objectives, not generic metrics |
| Testability | Every material control has population, cadence, pass rule and evidence |
| Financial realism | Handles AML, lending, payments, complaints, privacy and operations |
| Architecture | Shows telemetry, schema, test engine, exception workflow and evidence store |
| Management action | Red thresholds trigger named actions and owners |
| Regulatory nuance | Uses SR 26-2 as current anchor and broader AI governance for GenAI/agents |
| Interview readiness | Explains distinction from control library and audit binder clearly |

13. Interview Answers

13.1 What is AI continuous control monitoring?

30-second answer:

AI continuous control monitoring is the operating system for proving AI controls remain effective after launch. It links control objectives to runtime events, automated tests, sampling, KRIs, exceptions, owners, management actions and evidence. It is different from a static control library because it tests whether controls actually operate over time.

2-minute answer:

In financial retail, AI risk does not stop at release. A lending explanation prompt can drift, a RAG index can lose a critical policy, a customer service agent can gain write access, or a human review queue can become overloaded. Continuous control monitoring turns these risks into recurring control tests. For each material control, I define the owner, population, cadence, evidence fields, pass/fail rule, threshold, exception severity and management action. Then I instrument runtime events so traces carry model, prompt, index, tool, policy, human review and release metadata. Automated checks catch completeness, authorization, threshold and evidence gaps; risk-based sampling catches nuanced customer harm. Dashboards show KRI trends, exception aging, repeat failures and action closure. The goal is operating effectiveness over time, not a one-time audit screenshot.

13.2 How is this different from an audit evidence binder?

An audit evidence binder organizes proof for review. Continuous control monitoring generates and tests that proof continuously. The binder answers “what evidence do we have”; CCM answers “did the control operate, did it pass, what failed, who owns the action and is effectiveness improving or degrading.”

13.3 How would you design control tests for an agentic AI system?

I would test the full behavior chain: prompt, model route, retrieval, policy decision, tool permission, human confirmation, system-of-record write, telemetry and exception handling. For every write-enabled tool, I would require authorization tests, idempotency or compensating control evidence, approval-token completeness, reconciliation to the target system and reversal-rate KRIs.
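As a rough sketch of the write-enabled tool checks listed above, the Python below tests one tool-call event for authorization, approval-token completeness, idempotency or compensating control evidence, and reconciliation to the system of record, and computes a reversal-rate KRI. All field names (`agent_role`, `approval_token`, `payload_hash` and so on) and the shape of the permissions and system-of-record lookups are assumptions for illustration.

```python
def check_tool_write(event, permissions, system_of_record):
    """Return the list of control failures for one write-enabled tool call.

    permissions: dict mapping agent role -> set of allowed tool names.
    system_of_record: dict mapping request_id -> committed record.
    Both structures are hypothetical stand-ins for real services.
    """
    failures = []

    # Authorization: the agent role must be entitled to this tool.
    allowed = permissions.get(event.get("agent_role"), set())
    if event.get("tool_name") not in allowed:
        failures.append("unauthorized_tool")

    # Approval token must be complete and bound to this request.
    token = event.get("approval_token") or {}
    required = ("approver_id", "approved_at", "request_id")
    if not all(token.get(k) for k in required):
        failures.append("approval_token_incomplete")
    elif token["request_id"] != event.get("request_id"):
        failures.append("approval_token_mismatch")

    # Either an idempotency key or a documented compensating control.
    if not (event.get("idempotency_key") or event.get("compensating_control_ref")):
        failures.append("no_idempotency_or_compensating_control")

    # Reconciliation: the target system must hold the same payload.
    record = system_of_record.get(event.get("request_id"))
    if record is None or record.get("payload_hash") != event.get("payload_hash"):
        failures.append("reconciliation_break")

    return failures


def reversal_rate(events):
    """Share of committed writes that were later reversed."""
    writes = [e for e in events if e.get("write_committed")]
    if not writes:
        return 0.0
    return sum(1 for e in writes if e.get("reversed")) / len(writes)
```

Returning every failure rather than stopping at the first keeps the evidence complete for severity scoring and root-cause analysis.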

13.4 What KRIs matter for customer-facing financial AI?

I would track unsupported answer rate, complaint spikes, adverse action reason mismatches, tool write reversals, missing human reviews, PII telemetry failures, fallback routes taken without a policy decision, override spikes, exception aging, evidence completeness and repeat control failures. The point is not to collect many metrics but to tie each KRI to a risk appetite threshold, an owner and a named action.
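One way to make the link between a KRI, its threshold, its owner and its action concrete is a small rule object. The sketch below is illustrative only: the `KriRule` structure, the amber/red levels and the action strings are hypothetical, and real thresholds would come from the institution's risk appetite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KriRule:
    name: str
    amber: float
    red: float
    owner: str
    red_action: str
    # Most KRIs worsen as they rise; evidence completeness worsens as it falls.
    higher_is_worse: bool = True


def evaluate_kri(rule, value):
    """Map a KRI value to green/amber/red plus the owner and action."""
    def breached(threshold):
        if rule.higher_is_worse:
            return value >= threshold
        return value <= threshold

    if breached(rule.red):
        status, action = "red", rule.red_action
    elif breached(rule.amber):
        status, action = "amber", "owner review at next control meeting"
    else:
        status, action = "green", None
    return {
        "kri": rule.name,
        "value": value,
        "status": status,
        "owner": rule.owner,
        "action": action,
    }
```

For example, `evaluate_kri(KriRule("unsupported_answer_rate", 0.02, 0.05, "EvalOps Lead", "pause release and open exception"), 0.06)` yields a red status with the named owner and action attached, which is what distinguishes a KRI from a plain metric.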

13.5 How should SR 26-2 affect GenAI governance?

SR 26-2 superseded SR 11-7 and SR 21-8 on 2026-04-17, so I would not cite SR 11-7 as the current anchor. I would keep the useful principles of risk-based governance, validation, ongoing monitoring and effective challenge, but govern GenAI and agentic AI through a broader AI assurance architecture that also covers NIST AI RMF, ISO 42001, operational controls, privacy, security, third-party risk, customer harm and incident response.

13.6 What would you show in a portfolio?

I would show a CCM pack for one financial retail AI use case: control registry, test catalog, event schema, KRI dashboard, exception workflow, RACI, lifecycle gates, sample monthly effectiveness memo and one incident-to-control-improvement example. That demonstrates PM, BA and architecture depth because it connects customer outcomes, controls, telemetry and management action.


14. Anti-Patterns and Self-Check

| Anti-pattern | Why it fails | Better practice |
| --- | --- | --- |
| Monitoring only latency and cost | Misses customer harm and control failure | Add control KRIs and evidence completeness |
| Using control library as proof | Design does not prove operation | Run recurring tests and preserve results |
| Dashboard without owners | Red indicators do not create action | Bind each KRI to owner, threshold and action |
| All sampling is random | Rare high-risk failures disappear | Use risk-stratified sampling |
| Closing incident without control update | Same failure returns | Feed root cause into eval, control and training |
| Treating HITL as magic | Humans can be overloaded or overtrust AI | Monitor queue, edits, overrides and blind acceptance |
| Ignoring telemetry schema | Evidence cannot support audit or replay | Define event fields before release |
| Forcing agents into old SR 11-7 framing | Tool use, autonomy and workflow risk are under-modeled | Use SR 26-2 plus broader AI control assurance |
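
The risk-stratified sampling recommended in the table can be sketched as follows. This is a minimal illustration under stated assumptions: events carry a hypothetical `risk_tier` field, the per-tier sampling rates are invented for the example, and tiers without a configured rate are reviewed in full as a conservative default.

```python
import math
import random
from collections import defaultdict


def stratified_sample(events, rates, seed=0, min_per_stratum=1):
    """Sample events per risk tier instead of uniformly at random.

    events: list of dicts with a "risk_tier" key (hypothetical field).
    rates: dict mapping tier -> sampling rate between 0 and 1.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be replayed for audit
    strata = defaultdict(list)
    for event in events:
        strata[event["risk_tier"]].append(event)

    sample = {}
    for tier, items in strata.items():
        rate = rates.get(tier, 1.0)  # unknown tier: review everything
        wanted = max(min_per_stratum, math.ceil(rate * len(items)))
        sample[tier] = rng.sample(items, min(len(items), wanted))
    return sample
```

With rates such as `{"high": 1.0, "medium": 0.5, "low": 0.25}`, every high-risk event is reviewed while low-risk volume is thinned, so rare high-impact failures are not diluted by routine traffic.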

Before claiming maturity, confirm:

  • Every material AI risk has a control objective.
  • Every material control has an owner and test method.
  • Each test has population, cadence, pass rule and severity.
  • Runtime events carry the fields needed for testing.
  • Evidence is redacted, retained and retrievable.
  • KRIs have thresholds and named management actions.
  • Exceptions have due dates, owners, root cause and retest evidence.
  • Sampling is risk-stratified for high-impact use cases.
  • Incidents feed back into controls, evals and training.
  • Management review uses trends, not only point-in-time status.
  • SR 26-2 is cited as the current MRM anchor, with GenAI and agentic AI governed through broader AI assurance.
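
Parts of this self-check can be automated. As one hedged example, the sketch below checks the exception item: every exception needs an owner and due date, closed exceptions need root cause and retest evidence, and open exceptions past their due date are flagged. The record fields (`status`, `due_date`, `retest_evidence_ref` and so on) are hypothetical names, not a prescribed schema.

```python
from datetime import date


def exception_gaps(exc, today):
    """Return the self-check gaps for one exception record (a dict)."""
    # Every exception, open or closed, needs an owner and a due date.
    gaps = [f for f in ("owner", "due_date") if not exc.get(f)]

    if exc.get("status") == "closed":
        # Closure without root cause or retest evidence is not a real closure.
        gaps += [f for f in ("root_cause", "retest_evidence_ref") if not exc.get(f)]
    else:
        due = exc.get("due_date")
        if due is not None and due < today:
            gaps.append("overdue")
    return gaps


def exceptions_with_gaps(exceptions, today):
    """Map exception id -> gaps, keeping only records that have gaps."""
    report = {}
    for exc in exceptions:
        gaps = exception_gaps(exc, today)
        if gaps:
            report[exc.get("exception_id", "<no id>")] = gaps
    return report
```

Running `exceptions_with_gaps` on the exception log before a management review gives a concrete list of records that would contradict a maturity claim.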

15. One-Page Summary

AI Continuous Control Monitoring is where AI governance becomes operational. The control library says what should be controlled; the evidence binder organizes proof; the release gate decides whether a version can go live. CCM asks every day whether the control ran, passed, failed, triggered an exception, received management action, was retested, and can be proven. For a CBAP-level financial retail PM/BA/architect, the differentiator is translating AI behavior into operational controls, tests, KRIs, exceptions, evidence and management decisions that hold up under product pressure, incident pressure and audit pressure.