AI Management Information / Board Reporting Playbook

AI Management Information / Board Reporting Architecture Playbook

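Audience: CBAP-level Financial Retail PM / Senior BA / AI Product Architect / Enterprise Architect / Risk Product Lead / Model Risk Partner / Board Reporting Lead.

Goal: turn AI telemetry, business value, risk appetite, control effectiveness, incidents, customer harm, adoption, cost, vendor concentration and management actions into traceable, verifiable, decision-ready Management Information.

Core view: the board pack is the presentation layer; the MI architecture is the operating system of metric contracts, lineage, thresholds, data quality, cadence, RACI, validation and action log.

Scope note: this document is learning, portfolio and architecture design material. It does not constitute legal, regulatory, audit, model validation, financial confirmation or board governance advice. In a real project, the applicable requirements must be confirmed jointly by business owner, risk, model risk, legal, compliance, privacy, security, finance, data, architecture, operations and internal audit.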


1. Executive Framing

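A weak AI board update typically reports the number of use cases, model call volume, satisfaction scores, estimated hours saved and "no major incidents". What the board and the audit committee actually need to know is: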

Which AI systems are material?
What value is proven, not estimated?
Which risks are outside appetite?
Which controls are working or failing?
What customer harm occurred?
Where are model/vendor/data dependencies concentrated?
What management actions are overdue?
What decision is required this quarter?

The goal of the AI MI architecture:

AI fact capture
  -> metric contract
  -> lineage and quality control
  -> risk appetite threshold
  -> management view
  -> board / audit committee decision
  -> action log and evidence closure

Governance defines who should challenge and decide. MI architecture defines what facts they can rely on, where those facts came from, and what action follows.

| Principle | Practical meaning |
| --- | --- |
| Decision-first | Every metric must support approve, scale, hold, stop, fund, remediate, accept risk, escalate or retire. |
| Lineage-first | Every board number must trace back to system-of-record facts, not slide calculations. |
| Risk-appetite-aware | Red/amber/green thresholds must map to appetite, tolerance, stop rule or escalation trigger. |
| Balanced scorecard | Value, adoption, risk, control, incident, harm, cost, resilience and concentration are reported together. |
| Action-linked | An amber/red signal without owner, due date and closure evidence is incomplete MI. |
| Cadence-fit | Severe incidents and appetite breaches do not wait for quarterly board packs. |
| Audit-ready | Audit can reconstruct selected numbers, actions and attestations without relying on oral explanation. |

Executive narrative example:

This quarter, management reports 22 registered AI systems, of which 7 are material.
Portfolio residual risk is Medium with one Red appetite breach in vendor concentration.
No severe AI incident occurred; two medium customer-service harm events were contained and remediated.
Qualified AI value events increased 18%, but only three systems have finance-recognized benefits.
Decision requested: approve targeted platform funding for evidence automation and require remediation before expanding customer-facing GenAI scope.

2. Source Anchors

Access dates are recorded as of 2026-06-30.

| Anchor | Official link | Usage in this playbook |
| --- | --- | --- |
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Uses trustworthy-AI and AI risk lifecycle language to define the risk, impact, governance, measurement and management loop of AI MI. |
| NIST AIRC AI RMF functions | https://airc.nist.gov/airmf-resources/airmf/ | Uses Govern / Map / Measure / Manage to design the board MI taxonomy, metric owners and action loop. |
| ISO/IEC 42001 | https://www.iso.org/standard/42001 | Uses AI management system, performance evaluation, management review and continual improvement to connect MI to the enterprise AIMS. |
| Federal Reserve SR 26-2 | https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm | Uses the risk-based, materiality, inventory, monitoring, governance and documentation language of the 2026 revised model-risk guidance to calibrate financial-institution reporting. |
| OCC Model Risk Management Handbook legacy link | https://www.occ.gov/publications-and-resources/publications/comptrollers-handbook/files/model-risk-management/index-model-risk-management.html | Legacy context: this path now redirects and should not be treated as the sole current basis; the current model-risk anchor should shift to SR 26-2 / OCC 2026 guidance. |

SR 26-2 nuance:

| Topic | 2026 implication for MI |
| --- | --- |
| Supersession | SR 26-2 supersedes SR 11-7 and SR 21-8, so board reports should not cite those as current primary anchors without context. |
| Risk-based approach | MI should show materiality, exposure, use, risk tier and proportional oversight, not uniform annual-review theater. |
| Institution relevance | It is most relevant to banking organizations over $30B in total assets, while smaller firms may still adopt it for complex model-risk exposure. |
| Model definition | Traditional model-risk MI should distinguish complex quantitative models from deterministic rules and software without statistical/economic/financial theory. |
| GenAI / agentic AI | The guidance explicitly excludes generative and agentic AI from formal scope because they are novel and rapidly evolving. |
| Not a free pass | Excluded systems still need enterprise governance, controls, telemetry, incidents, evidence and management reporting. |

Do not force every GenAI board metric into SR 26-2 terminology. Do create a parallel AI MI discipline that mirrors rigor: inventory, materiality, controls, monitoring, incidents, concentration, action tracking and attestation.

| Source lens | MI artifact | Board-useful output |
| --- | --- | --- |
| NIST Govern | owners, policy, accountability, oversight cadence | who owns AI risk and what forum acts |
| NIST Map | use case inventory, impact map, dependency map | what systems and affected stakeholders are in scope |
| NIST Measure | evals, controls, telemetry, incidents, metrics | how management knows risk and value facts |
| NIST Manage | actions, remediation, exceptions, stop rules | what management is doing about amber/red signals |
| ISO 42001 | management review, performance evaluation, continual improvement | recurring AIMS report and improvement loop |
| SR 26-2 | model inventory, materiality, monitoring, validation, governance | model-risk-aligned evidence for quantitative models and reference discipline for broader AI |

3. MI Architecture

AI products / workflows
  -> event and telemetry capture
       - model / agent gateway
       - RAG retrieval and citation logs
       - tool action logs
       - workflow and human approval events
       - eval, QA, red-team and control test results
       - incidents, complaints, appeals and remediation
       - cost, adoption and value events
       - vendor, model and dependency registry
  -> MI data product layer
       - canonical AI event model
       - metric contracts
       - data quality checks
       - lineage graph
       - threshold and appetite rules
       - entitlement, retention and privacy controls
  -> MI marts / dashboards
       - executive AI portfolio
       - board and audit committee pack
       - risk appetite dashboard
       - control effectiveness dashboard
       - value realization dashboard
       - incident and customer harm dashboard
       - concentration and resilience dashboard
       - management action log

Canonical AI MI event model:

| Entity | Minimum fields |
| --- | --- |
| AI system | system_id, name, owner, risk_tier, materiality, stage, approved_boundary, business_capability |
| AI exposure event | event_id, system_id, workflow_id, user_or_case_id, timestamp, AI_role, channel, eligible_flag |
| Model call | call_id, event_id, provider, model_id, model_version, route, latency, cost, fallback_flag |
| Retrieval event | retrieval_id, event_id, source_doc_id, source_version, effective_date, citation_status, ACL_decision |
| Tool action | action_id, event_id, tool_name, permission_scope, side_effect, approval_required, approval_status |
| Human review | review_id, event_id, reviewer_role, decision, override_reason, review_time, escalation_flag |
| Eval / QA result | result_id, system_id, metric_name, sample_id, pass_fail, severity, reviewer, evidence_link |
| Incident / harm | incident_id, system_id, severity, customer_impact, AI_attribution, containment_time, remediation_status |
| Value event | value_event_id, event_id, quality_pass, risk_pass, adoption_signal, business_outcome, finance_status |
| Dependency | dependency_id, type, provider, criticality, substitutability, concentration_group, owner |
| Management action | action_id, metric_id, issue, owner, due_date, status, evidence_link, closure_approver |

Data product responsibilities:

| Layer | Responsibility |
| --- | --- |
| Capture | instrument gateways, workflow systems, QA tools, incident systems and finance baselines |
| Contract | define canonical schemas, event grain, metric contracts and threshold metadata |
| Quality | validate completeness, freshness, uniqueness, referential integrity and threshold logic |
| Lineage | connect metric to source events, query version, transformation and report tile |
| Semantic | expose approved metric names, definitions, dimensions and aggregations |
| Reporting | generate management dashboards, board packs, audit extracts and action reports |
| Assurance | sample test, reconcile, attest and archive MI evidence |

Metric contract fields:

| Field | Required content |
| --- | --- |
| Metric ID and name | stable identifier and name used across dashboards and board packs |
| Decision purpose | scale, hold, stop, fund, remediate, accept risk, escalate |
| Definition | plain language definition |
| Numerator / denominator | exact inclusion rules and population |
| Grain and dimensions | response, case, customer, system, period; business line, risk tier, model, vendor |
| Source systems | authoritative systems and fallback sources |
| Refresh cadence | real time, daily, weekly, monthly, quarterly |
| Quality rules | completeness, freshness, duplication, reconciliation, sample validation |
| Thresholds | green, amber, red, breach and stop rule |
| Owner and evidence | business owner, risk owner, data owner, technology owner; query, sample, sign-off |
| Versioning | definition version, effective date, change approver |

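
A metric contract is easier to enforce when it exists as a typed record instead of a slide footnote. The following is a minimal sketch, assuming a "lower is better" metric with two cut-offs; the class name `MetricContract`, its field names and the effective date are hypothetical illustrations, while the metric ID, version, thresholds and owner reuse the MI-AI-QUAL-003 example from this playbook.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MetricContract:
    """Hypothetical container for a subset of the contract fields."""
    metric_id: str
    name: str
    decision_purpose: str
    numerator_rule: str
    denominator_rule: str
    source_systems: tuple[str, ...]
    refresh_cadence: str
    green_max: float   # value <= green_max is Green
    amber_max: float   # green_max < value <= amber_max is Amber
    owner: str
    definition_version: str
    effective_date: date

    def status(self, value: float) -> str:
        """Map a 'lower is better' metric value to a RAG status."""
        if value <= self.green_max:
            return "Green"
        if value <= self.amber_max:
            return "Amber"
        return "Red"


contract = MetricContract(
    metric_id="MI-AI-QUAL-003",
    name="Unsupported claim rate",
    decision_purpose="scale/hold",
    numerator_rule="sampled responses judged unsupported by approved source",
    denominator_rule="regulated-topic AI-assisted responses sampled by QA",
    source_systems=("model_gateway_trace", "rag_retrieval_log", "qa_review_tool"),
    refresh_cadence="weekly",          # assumption, not stated in the playbook
    green_max=0.02,
    amber_max=0.03,
    owner="Head of Customer Operations QA",
    definition_version="v1.2",
    effective_date=date(2026, 4, 1),   # illustrative placeholder date
)

print(contract.status(0.016))
```

Freezing the dataclass means a threshold change requires a new contract instance, which lines up with the versioning field: a changed definition becomes a new version with its own effective date and approver.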

4. Metric Taxonomy

| Category | Core question | Example metrics |
| --- | --- | --- |
| Portfolio scope | What AI systems are in scope? | registered AI systems, material systems, production systems, shadow AI detections |
| Value realization | Are we getting proven value? | qualified value events, finance-recognized benefit, cost per outcome, backlog reduction |
| Adoption | Is AI used in the right workflow? | eligible workflow adoption, repeat adoption, accepted output rate, override reason mix |
| Quality | Is AI behavior good enough? | groundedness, unsupported claim rate, citation correctness, task success, eval regression pass |
| Customer harm | Are customers harmed or remediated? | AI-attributable complaints, appeal overturns, remediation count, regulated-topic error |
| Control effectiveness | Are controls operating? | control pass rate, HITL bypass, source freshness, logging completeness, exception aging |
| Incidents | What went wrong? | severity count, containment time, repeat root cause, incident-to-action closure |
| Risk appetite | Are we within approved appetite? | red/amber/green by risk category, appetite breaches, accepted residual risk |
| Concentration | Are dependencies over-concentrated? | risk-weighted exposure by model/vendor/RAG/evidence stack, fallback coverage |
| Cost and resilience | Can we operate and recover? | cost per value event, p95 latency, SLO breach, kill switch time, recovery drill pass |
| Governance health | Is management acting? | overdue actions, expired exceptions, attestation coverage, audit finding aging |

Value metrics should replace vanity metrics:

| Weak metric | Strong MI metric |
| --- | --- |
| model calls | qualified AI-assisted workflow completions |
| generated summaries | accepted summaries with no critical QA defect |
| users activated | eligible users reaching repeat workflow adoption |
| estimated hours saved | finance-recognized capacity release or SLA improvement |
| features shipped | use cases passing release, adoption and benefits gate |

AI exposure
  -> accepted AI output
  -> quality and risk pass
  -> workflow outcome improvement
  -> cost within boundary
  -> finance / business owner recognition
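
The funnel above can be read as an ordered set of gates: an exposure only counts as a qualified, recognized value event if it passes every gate in sequence. The sketch below is one possible encoding, assuming boolean gate flags per event; the `ValueEvent` fields, the stage names and the cost cap of 0.50 are hypothetical illustrations, not definitions from this playbook.

```python
from dataclasses import dataclass


@dataclass
class ValueEvent:
    value_event_id: str
    accepted: bool           # AI output accepted by the user
    quality_pass: bool
    risk_pass: bool
    outcome_improved: bool   # workflow outcome improvement observed
    cost: float              # cost attributed to this event
    finance_status: str      # e.g. "recognized" or "pending"


def funnel_stage(event: ValueEvent, cost_cap: float) -> str:
    """Return the last funnel stage the event reached, in funnel order."""
    stages = [
        ("accepted", event.accepted),
        ("quality_and_risk_pass", event.quality_pass and event.risk_pass),
        ("outcome_improved", event.outcome_improved),
        ("cost_within_boundary", event.cost <= cost_cap),
        ("recognized", event.finance_status == "recognized"),
    ]
    reached = "exposure"
    for name, passed in stages:
        if not passed:
            break            # later gates do not count once one gate fails
        reached = name
    return reached


COST_CAP = 0.50  # hypothetical cost boundary per event

full = ValueEvent("v1", True, True, True, True, 0.40, "recognized")
risk_fail = ValueEvent("v2", True, True, False, True, 0.40, "recognized")
too_costly = ValueEvent("v3", True, True, True, True, 0.90, "recognized")

for ev in (full, risk_fail, too_costly):
    print(ev.value_event_id, funnel_stage(ev, COST_CAP))
```

Reporting the count of events at each stage, not only the final one, shows where value leaks: for example, many accepted outputs that never reach finance recognition.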

Risk, control, harm and concentration metrics:

| Area | Metric | Decision use |
| --- | --- | --- |
| Hallucination | unsupported claim rate | scale/hold for RAG and assistant systems |
| Wrong decision support | unsupported recommendation rate | restrict scope or add review |
| Fairness / conduct | segment error or appeal overturn disparity | risk committee challenge |
| Privacy | restricted data exposure count | immediate containment |
| Security | successful prompt-injection exploit rate | release block or control remediation |
| Policy drift | superseded document citation rate | knowledge governance remediation |
| Control operation | HITL bypass, source freshness, trace completeness | auditability and scale readiness |
| Customer harm | AI-attributable complaints, remediation count, appeal overturn | board harm and remediation view |
| Concentration | top model/vendor exposure, shared evidence dependency, fallback coverage | portfolio risk appetite and diversification |

Adoption metrics must distinguish active usage from workflow adoption:

| Metric | Better definition |
| --- | --- |
| active users | eligible users who used AI in the target workflow at least N times |
| repeat adoption | users who use AI in target workflow across multiple periods |
| accepted output rate | outputs accepted or lightly edited after quality review |
| override reason mix | why humans reject or change AI output |
| review burden | added review time and queue load caused by AI |
| abandonment | eligible cases where AI was available but intentionally bypassed |

5. Data Lineage

Business event
  -> AI exposure event
  -> model / RAG / tool trace
  -> human or system decision
  -> QA / eval / incident / outcome record
  -> metric calculation
  -> dashboard tile
  -> board statement
  -> management action
  -> closure evidence

| Question | Expected answer |
| --- | --- |
| What is the event grain? | response, case, customer, transaction, model call, value event |
| What source systems feed it? | named systems, tables, APIs, owners |
| What is the calculation version? | query version, semantic-layer metric version |
| What population is excluded? | exclusions by risk tier, channel, geography, test traffic, missing QA |
| How is data quality checked? | completeness, freshness, uniqueness, reconciliation, sample review |
| Who signed off? | business, risk, data and technology owners |
| What changed since last period? | metric definition, threshold, source, product scope, model version |
| Can audit replay a sample? | yes, with trace, evidence and action links |

Example lineage:
Board tile: Unsupported claim rate for Customer Service RAG = 1.6%, Green.
metric_id = MI-AI-QUAL-003
definition_version = v1.2
source systems = model_gateway_trace, rag_retrieval_log, qa_review_tool
denominator = regulated-topic AI-assisted responses sampled by QA
numerator = sampled responses judged unsupported by approved source
quality checks = 99.2% trace completeness, 100% QA reviewer assigned, 0 duplicate response_id
threshold = Green <= 2%, Amber >2% <=3%, Red >3%
owner = Head of Customer Operations QA

| Data quality rule | Example threshold |
| --- | --- |
| Trace completeness | >= 99% material AI events contain system_id, model_id, timestamp, user/case id, risk tier |
| Source freshness | >= 99% regulated policy documents indexed within SLA |
| Unique event ID | 100% uniqueness for event_id in reporting period |
| Metric denominator reconciliation | dashboard denominator reconciles to workflow system within 1% |
| Incident severity completeness | 100% incidents have severity, attribution, containment and owner |
| Action log completeness | 100% amber/red metrics have action owner or documented risk acceptance |
| Sample evidence availability | >= 95% sampled board metrics have replayable evidence |

| Lineage failure | Symptom |
| --- | --- |
| slide-only calculation | number exists only in spreadsheet |
| denominator drift | scope changes but trend line continues |
| late incident capture | quarter-end report misses events |
| inconsistent system IDs | same AI use case has multiple names |
| no action lineage | red metric closed informally |

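
The MI-AI-QUAL-003 lineage example can be reproduced as a small calculation that carries its own lineage metadata. This is a sketch under stated assumptions: the `SampledResponse` record and the synthetic sample (1,000 QA-reviewed regulated-topic responses, 16 of them unsupported) are invented for illustration; the metric ID, definition version, numerator and denominator rules, duplicate check and thresholds follow the example in this section.

```python
from dataclasses import dataclass


@dataclass
class SampledResponse:
    response_id: str
    regulated_topic: bool
    qa_reviewed: bool
    unsupported: bool   # QA judged the claim unsupported by an approved source


def unsupported_claim_rate(samples: list[SampledResponse]) -> dict:
    """Compute the metric together with the facts audit needs to replay it."""
    ids = [s.response_id for s in samples]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate response_id: uniqueness rule failed")

    # Denominator: regulated-topic AI-assisted responses sampled by QA.
    population = [s for s in samples if s.regulated_topic and s.qa_reviewed]
    if not population:
        raise ValueError("empty denominator: metric cannot be reported")

    numerator = sum(1 for s in population if s.unsupported)
    rate = numerator / len(population)
    status = "Green" if rate <= 0.02 else "Amber" if rate <= 0.03 else "Red"
    return {
        "metric_id": "MI-AI-QUAL-003",
        "definition_version": "v1.2",
        "numerator": numerator,
        "denominator": len(population),
        "excluded": len(samples) - len(population),
        "rate": rate,
        "status": status,
    }


# Synthetic data: 1,000 in-scope samples (16 unsupported) plus 50 out-of-scope.
samples = [SampledResponse(f"r{i}", True, True, i < 16) for i in range(1000)]
samples += [SampledResponse(f"x{i}", False, True, False) for i in range(50)]

result = unsupported_claim_rate(samples)
print(result)
```

Returning the numerator, denominator and exclusion count alongside the rate makes denominator drift visible between periods, which is one of the lineage failures listed above.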

6. Risk Appetite Dashboard

| Appetite category | Board question | Status example |
| --- | --- | --- |
| Customer harm | Are AI systems causing unacceptable customer harm? | Amber: two medium contained events |
| Regulated decision boundary | Is AI making or implying unapproved final decisions? | Green: 0 boundary bypasses |
| Control effectiveness | Are key controls operating within threshold? | Amber: source freshness breach |
| Incident severity | Any severe or repeated incidents? | Green: 0 severe, 2 medium |
| Vendor/model concentration | Is dependency concentration acceptable? | Red: one model supports 72% material GenAI exposure |
| Auditability | Can management reconstruct AI-assisted decisions? | Amber: 94% sample reconstructability |
| Value realization | Are benefits proven enough to scale? | Amber: 3 of 7 material systems have recognized benefit |
| Cost/resilience | Is cost and performance inside boundary? | Green: cost per case under cap, p95 latency stable |

| Metric | Green | Amber |
| --- | --- | --- |
| Severe AI incidents | 0 | not applicable |
| Unsupported claim rate | <= 2% | > 2% and <= 3% |
| HITL bypass count | 0 | 1 contained low-risk bypass |
| Audit trace completeness | >= 99% | 95% to 98.9% |
| Vendor concentration | within approved exposure | exceeds soft limit with fallback |
| Finance-recognized benefit | >= 80% target | 50% to 79% target |

Risk appetite statement:
The institution has no appetite for unapproved fully automated final decisions in credit, AML, fraud or wealth workflows; severe customer harm without immediate containment; restricted data exposure to unapproved AI tools; expansion of material AI where audit trace completeness is below 95%; or expired high-risk exceptions without risk committee review.
The institution has limited appetite for contained medium incidents, controlled pilots under stop rules, and temporary concentration risk where fallback and exit roadmap are funded.

Dashboard view:

| Category | Metric | Q1 | Q2 | Status | Action |
| --- | --- | --- | --- | --- | --- |
| Customer harm | AI-attributable medium incidents | 1 | 2 | Amber | Close source freshness remediation |
| Decision boundary | final decision bypass | 0 | 0 | Green | Continue monitoring |
| Control | source freshness SLA | 99.1% | 97.8% | Amber | Daily regulated-policy refresh |
| Auditability | trace completeness | 98.2% | 99.1% | Green | Maintain threshold |
| Concentration | top model material exposure | 61% | 72% | Red | Approve diversification roadmap |
| Value | systems with finance-recognized benefit | 2/6 | 3/7 | Amber | Benefits contract for remaining 4 |

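
The concentration row depends on how "top model material exposure" is calculated. One way to sketch it, assuming each material system carries a risk-weighted exposure figure and a single primary model: sum exposure per model, take the largest share, and compare it with soft and hard limits. The system IDs, model IDs, exposure weights and both limits below are hypothetical; the weights are chosen only so that the top share matches the 72% shown in the dashboard.

```python
from collections import defaultdict


def top_model_exposure(systems):
    """systems: iterable of (system_id, model_id, risk_weighted_exposure)."""
    by_model = defaultdict(float)
    for _system_id, model_id, exposure in systems:
        by_model[model_id] += exposure
    total = sum(by_model.values())
    if total <= 0:
        raise ValueError("no material exposure to report")
    model_id, exposure = max(by_model.items(), key=lambda kv: kv[1])
    return model_id, exposure / total


def concentration_status(share, soft_limit, hard_limit):
    """Green within the soft limit, Amber between limits, Red above the hard limit."""
    if share > hard_limit:
        return "Red"
    if share > soft_limit:
        return "Amber"
    return "Green"


# Hypothetical material systems and risk-weighted exposure units.
material_systems = [
    ("sys-1", "model-a", 40.0),
    ("sys-2", "model-a", 32.0),
    ("sys-3", "model-b", 18.0),
    ("sys-4", "model-c", 10.0),
]

top_model, share = top_model_exposure(material_systems)
status = concentration_status(share, soft_limit=0.50, hard_limit=0.65)
print(top_model, round(share, 2), status)
```

The weighting scheme is the real design decision here: call volume, customer reach and risk tier give different answers, so the metric contract should state which one the board number uses.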

7. Board Pack Templates

Quarterly AI MI cover page example:

# Quarterly AI Management Information Report
Reporting period: Q2 2026
Prepared for: Board Risk Committee / Audit Committee
Decision requested: approve funding for evidence automation and model diversification.
Executive conclusion: Portfolio residual AI risk is Medium, with one Red concentration breach. Material AI systems: 7. Severe incidents: 0. Medium AI-attributable customer harm events: 2, contained and under remediation. Finance-recognized benefits: 3 systems. Critical open actions: 1 overdue high action with approved extension.
Decisions required: approve platform funding; require remediation before expanding customer-facing GenAI to regulated fee and dispute categories; endorse updated concentration limit.

Portfolio summary:

| System | Stage | Risk tier | AI role | Value | Risk | Control | Decision |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Customer Service RAG | Production | High | draft answer | Green | Amber | Amber | Scale low-risk intents only |
| Credit Memo Assistant | Pilot | High | draft memo | Amber | High | Amber | Hold before scale |
| AML Copilot | Limited release | High | draft case summary | Green | Medium | Green | Expand to second low-risk queue |
| Fraud Triage | Pilot | High | prioritize | Amber | High | Amber | Continue pilot with appeal monitoring |
| Branch Knowledge | Production | Medium | answer staff | Green | Medium | Green | Continue |
| AI Gateway | Production | Enabling control | route/log/control | Green | Medium | Green | Fund evidence export |

Board decision memo example:
Decision requested: approve limited expansion of Customer Service RAG to two additional low-risk product lines.
Evidence: qualified value events 74%; AHT reduction 18%; unsupported claim rate 1.6%; AI-attributable complaints flat to baseline; source freshness SLA 97.8% amber; top model exposure 72% red.
Conditions: no expansion to regulated fee, dispute or credit explanation categories until source freshness and cross-use-case regression pass; model diversification roadmap funded; weekly MI for 8 weeks.
Stop triggers: unsupported claim rate above 3%; any severe AI-attributable customer harm; HITL bypass in regulated topics; trace completeness below 95%.

Audit committee evidence requests:

| Evidence request | Required MI artifact |
| --- | --- |
| Show material AI inventory | AI system register with materiality, owner, risk tier and stage |
| Reconstruct sampled board metric | metric contract, lineage, source data, query version, dashboard tile |
| Prove control operation | control test result, sample evidence, owner sign-off, exception log |
| Prove incidents were reported completely | incident taxonomy, case list, severity assignment, late adjustment rule |
| Prove customer harm remediation | complaint link, remediation amount/action, customer communication, closure approval |
| Prove management action closure | action log, evidence link, closure approver, residual risk update |
| Prove vendor concentration | dependency graph, risk-weighted exposure, fallback evidence |

Board question bank:

| Board question | Strong MI answer pattern |
| --- | --- |
| Are we within AI risk appetite? | "Three of eight appetite categories are green, four amber, one red. The red is model concentration, with a funded remediation decision requested today." |
| How do we know the numbers are reliable? | "Each board metric has a contract, owner, source lineage, quality checks and audit sample pack. Definition changes are versioned." |
| What harm did AI cause customers? | "Two medium AI-attributable harm events occurred, both due to stale policy citations, affecting 31 customers, with no financial loss and remediation completed." |
| Are controls working? | "Key controls are tested through QA, logs and workflow samples. Citation correctness and HITL approval are green; source freshness is amber." |
| Are we getting value? | "Three systems have finance-recognized benefits. Others remain in pilot or lack sufficient adoption evidence, so no scale decision is requested for them." |
| Where are we over-concentrated? | "One model supports 72% of material GenAI exposure and the evidence path for three systems. This is above hard appetite without additional fallback." |

8. Management Action Log

Management reporting is incomplete if it only shows status.

signal -> issue classification -> management action -> owner and due date
  -> evidence of completion -> validation -> residual risk update -> closure or escalation

| Field | Description |
| --- | --- |
| action_id | stable ID linked to metric, incident, audit finding or risk acceptance |
| source_signal | red/amber metric, incident, audit finding, exception expiry, board challenge |
| issue and severity | clear issue statement with critical, high, medium or low severity |
| owner and due_date | accountable executive or delegate and risk-aligned date |
| status | open, blocked, pending validation, closed, escalated |
| evidence_required | concrete closure evidence |
| validation_owner | risk, audit, data, technology or business validator |
| residual_risk | updated after action |
| escalation_path | forum and date if overdue |

Sample action log:

| ID | Source |
| --- | --- |
| ACT-001 | Source freshness amber |
| ACT-002 | Concentration red |
| ACT-003 | Audit finding |
| ACT-004 | Customer harm incident |

Closure standard: evidence exists in system of record; metric/control retest passes; owner attests completion; validator confirms adequacy; residual risk is updated; next-cycle MI reflects closure.
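
The closure standard can be enforced as a checklist that returns every unmet condition, so an action cannot be marked closed informally. The sketch below assumes one boolean per condition; the `ActionClosure` record, its field names and the evidence link format are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionClosure:
    action_id: str
    evidence_link: Optional[str]   # location of evidence in the system of record
    retest_passed: bool
    owner_attested: bool
    validator_confirmed: bool
    residual_risk_updated: bool
    reflected_in_next_mi: bool


def closure_gaps(action: ActionClosure) -> list[str]:
    """Return the closure-standard conditions that are not yet met."""
    checks = {
        "evidence exists in system of record": bool(action.evidence_link),
        "metric/control retest passes": action.retest_passed,
        "owner attests completion": action.owner_attested,
        "validator confirms adequacy": action.validator_confirmed,
        "residual risk is updated": action.residual_risk_updated,
        "next-cycle MI reflects closure": action.reflected_in_next_mi,
    }
    return [name for name, ok in checks.items() if not ok]


ready = ActionClosure("ACT-001", "evidence://ACT-001/retest", True, True, True, True, True)
not_ready = ActionClosure("ACT-002", "evidence://ACT-002/plan", True, True, False, True, False)

print(closure_gaps(ready))
print(closure_gaps(not_ready))
```

An empty list means the action may move to closed; anything else should keep it at pending validation or trigger the escalation path.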

9. Report Validation

| Objective | Validation question |
| --- | --- |
| completeness | Are all material systems and reportable incidents included? |
| accuracy | Do calculations match metric contracts? |
| lineage | Can each board number trace to source data and query version? |
| consistency | Are definitions stable across periods and business lines? |
| threshold correctness | Are red/amber/green statuses computed from approved thresholds? |
| action linkage | Does every breach have action, owner or risk acceptance? |
| evidence integrity | Can audit replay sampled metrics and actions? |

Pre-issue validation:

| Check | Pass condition |
| --- | --- |
| Material AI register reconciliation | all production, pilot and retired-in-period material systems included |
| Metric contract coverage | 100% board KPIs/KRIs have approved contracts |
| Source freshness | source systems refreshed within reporting cut-off |
| Data quality result | no critical quality rule failure unresolved |
| Trend comparability | definition changes annotated or trend restated |
| Threshold approval | all thresholds approved and effective for period |
| Incident reconciliation | incident, complaint and harm sources reconciled |
| Action log reconciliation | all amber/red metrics map to action or risk acceptance |
| Evidence sample | sample pack generated for high-risk metrics |
| Management sign-off | business, risk, tech, data owners sign relevant sections |

Sample-based validation:

| Sample | Validation |
| --- | --- |
| 5 material AI systems | inventory fields, owner, stage, risk tier, approved boundary |
| 10 AI exposure events | trace completeness, model ID, source docs, workflow link |
| 5 QA failures | severity, reviewer, source evidence, metric inclusion |
| 5 incidents | severity, AI attribution, customer harm, action log link |
| 5 value events | quality pass, risk pass, business outcome, cost inclusion |
| 3 board metrics | metric contract, query version, denominator, threshold |
| 3 closed actions | closure evidence, validation, residual risk update |

Restatement rule:
Restate a board metric when source-system error changes a board-level status, late incident capture changes severity or appetite status, metric definition was applied incorrectly, excluded population materially changes the decision conclusion, or data quality failure invalidates the period result. Record restatement reason, approver, impacted reports and management conclusion.
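
Among the pre-issue checks, action log reconciliation is simple to automate: every amber or red metric must map to a live management action or an approved risk acceptance. The following sketch assumes metric statuses and the action log are available as plain dictionaries; the status values per metric and the action-to-metric mapping are hypothetical sample data, while the metric IDs and action statuses reuse names from this playbook.

```python
LIVE_STATUSES = {"open", "blocked", "pending validation", "escalated"}


def unlinked_breaches(metric_status, action_log, risk_acceptances):
    """Return amber/red metric IDs with no live action and no risk acceptance."""
    covered = {a["metric_id"] for a in action_log if a["status"] in LIVE_STATUSES}
    covered |= set(risk_acceptances)
    return sorted(
        metric_id
        for metric_id, status in metric_status.items()
        if status in {"Amber", "Red"} and metric_id not in covered
    )


# Hypothetical period snapshot.
metric_status = {
    "MI-AI-HARM-001": "Amber",
    "MI-AI-CTRL-001": "Green",
    "MI-AI-AUD-001": "Green",
    "MI-AI-CONC-001": "Red",
    "MI-AI-VALUE-001": "Amber",
}
action_log = [
    {"action_id": "ACT-002", "metric_id": "MI-AI-CONC-001", "status": "open"},
    {"action_id": "ACT-004", "metric_id": "MI-AI-HARM-001", "status": "pending validation"},
]
risk_acceptances = set()   # metric IDs with approved risk acceptance

gaps = unlinked_breaches(metric_status, action_log, risk_acceptances)
print(gaps)
```

A closed action deliberately does not count as coverage here: if the metric is still amber or red after closure, the signal needs a new action or an explicit risk acceptance.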

10. Cadence

| Cadence | Audience | Content |
| --- | --- | --- |
| Real time / same day | incident commander, CRO/CIO, business owner | severe incident, boundary breach, privacy/security event, kill switch |
| Weekly | AI operations, platform, product, risk ops | telemetry, incidents, quality, source freshness, SLO, actions |
| Monthly | AI governance committee, management steering | portfolio dashboard, risk appetite, value, control, concentration, actions |
| Quarterly | board, audit committee, risk committee | decision-focused AI MI pack, attestation, residual risk, investment |
| Semiannual | internal audit, model risk, enterprise architecture | evidence quality, dependency stress tests, report validation review |
| Annual | board strategy and risk appetite review | AI strategy, appetite calibration, operating model maturity, investment roadmap |

Trigger-based reporting:

| Trigger | Report |
| --- | --- |
| severe customer harm | immediate executive and board committee notification |
| unapproved final decision by AI | immediate risk/legal escalation and incident report |
| restricted-data exposure | security/privacy incident report |
| material vendor/model change | model/vendor impact MI and regression status |
| concentration threshold breach | risk committee exception or remediation decision |
| audit evidence integrity failure | audit committee notification if material |
| repeated medium incidents | management and board trend update |

Quarterly cycle:

| Day | Activity |
| ---: | --- |
| -15 | freeze metric contracts and reporting scope |
| -12 | reconcile material AI inventory and dependency register |
| -10 | refresh MI data marts and run quality checks |
| -8 | incident, complaint and customer harm reconciliation |
| -7 | generate risk appetite dashboard and action log |
| -6 | run sample validation and lineage tests |
| -5 | owner review: business, risk, tech, data, finance |
| -4 | management challenge session |
| -3 | draft board pack and audit evidence appendix |
| -2 | executive sign-off and attestation |
| -1 | distribute pre-read |
| 0 | committee discussion and decisions |
| +3 | record board challenges, decisions and new actions |
| +10 | action owners confirm remediation plans |

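
The day offsets in the quarterly cycle can be turned into a dated plan once the committee date is fixed. This is a minimal sketch that assumes the offsets are calendar days (the playbook does not say whether they are calendar or business days) and uses a hypothetical committee date of 2026-07-30.

```python
from datetime import date, timedelta

# Offsets relative to the committee meeting (day 0), taken from the cycle above.
CYCLE = [
    (-15, "freeze metric contracts and reporting scope"),
    (-12, "reconcile material AI inventory and dependency register"),
    (-10, "refresh MI data marts and run quality checks"),
    (-8, "incident, complaint and customer harm reconciliation"),
    (-7, "generate risk appetite dashboard and action log"),
    (-6, "run sample validation and lineage tests"),
    (-5, "owner review: business, risk, tech, data, finance"),
    (-4, "management challenge session"),
    (-3, "draft board pack and audit evidence appendix"),
    (-2, "executive sign-off and attestation"),
    (-1, "distribute pre-read"),
    (0, "committee discussion and decisions"),
    (3, "record board challenges, decisions and new actions"),
    (10, "action owners confirm remediation plans"),
]


def schedule(committee_date: date, cycle=CYCLE):
    """Return (calendar date, activity) pairs; offsets treated as calendar days."""
    return [(committee_date + timedelta(days=offset), activity)
            for offset, activity in cycle]


plan = schedule(date(2026, 7, 30))   # hypothetical committee date
for due, activity in plan:
    print(due.isoformat(), activity)
```

If the institution counts business days, the `timedelta` arithmetic would need to be replaced with a working-day calendar, which is out of scope for this sketch.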

11. RACI

| Activity | Accountable | Responsible | Consulted | Informed |
| --- | --- | --- | --- | --- |
| AI MI strategy | CRO / CIO jointly | AI Governance Lead | Board reporting, Legal, Audit | Board committee |
| Metric taxonomy | AI Governance Lead | PM / BA / Risk Analytics | Data, Finance, Model Risk | Business owners |
| Metric contracts | Product / Risk owner | Senior BA + Data Product Owner | Architecture, Audit, Compliance | AI committee |
| Telemetry architecture | CTO / Data Platform Owner | Product Architect + Platform Engineering | Security, Privacy, Risk | Product teams |
| Risk appetite thresholds | CRO | Risk, Compliance, Model Risk | Business, Legal, Audit | Board risk committee |
| Value metrics | Business Executive | PM + Finance Partner | Operations, Data | Management steering |
| Control effectiveness metrics | Risk owner | Control owner + QA | Internal Audit, Compliance | AI governance |
| Incident and harm taxonomy | Operational Risk Executive | Incident Management + Customer Ops | Legal, Privacy, Compliance | Board committee as needed |
| Concentration reporting | Enterprise Architect | AI Platform + Third-Party Risk | Procurement, Model Risk | Risk committee |
| Report validation | Data Governance Lead | MI Reporting Team | Internal Audit, Risk, Product | Audit committee |
| Action log | AI Governance Lead | Action owners | Risk and Audit validators | Management and board |
| Attestation | Business / Risk / Tech / Data executives | Control and metric owners | Internal Audit observer | Board / Audit Committee |

| Line | MI responsibility |
| --- | --- |
| First line | operate AI systems, define workflow outcomes, capture telemetry, own value and control operation |
| Second line | define risk appetite, challenge metrics, approve thresholds, review exceptions and residual risk |
| Third line | test MI integrity, lineage, control evidence, report validation and management action closure |

| Role | What excellent looks like |
| --- | --- |
| PM | can explain why each board metric changes investment or scale decision |
| BA | can define the metric contract, event grain, business rule, threshold and exception flow |
| Architect | can show telemetry path, lineage, data product, control integration and evidence export |
| Data owner | can certify source quality, retention, privacy and semantic consistency |
| Risk owner | can map metric status to appetite, residual risk and escalation |
| Audit partner | can sample and reconstruct the report without relying on oral explanation |

12. Templates

AI MI metric catalog:

| Metric ID | Name | Category | Owner | Source | Threshold | Cadence |
| --- | --- | --- | --- | --- | --- | --- |
| MI-AI-SCOPE-001 | Material AI system count | Portfolio | AI Governance Lead | AI register | report-only | monthly |
| MI-AI-VALUE-001 | Qualified value events | Value | Product Owner | workflow + gateway | target by system | weekly/monthly |
| MI-AI-HARM-001 | AI-attributable customer harm incident rate | Harm | Customer Ops + Risk | incident + trace | appetite matrix | weekly/quarterly |
| MI-AI-CTRL-001 | HITL bypass count | Control | Operations Control Owner | workflow logs | zero for high-risk | weekly |
| MI-AI-AUD-001 | Trace completeness | Auditability | Platform Owner | gateway logs | green >= 99% | weekly/monthly |
| MI-AI-CONC-001 | Top model exposure | Concentration | Enterprise Architect | dependency graph | appetite limit | monthly |
| MI-AI-ACT-001 | Overdue high actions | Governance | AI Governance Lead | action log | zero | weekly/monthly |

Concrete metric contract example:

| Field | Example |
| --- | --- |
| Metric ID | MI-AI-HARM-001 |
| Name | AI-attributable customer harm incident rate |
| Decision purpose | determine if customer-facing AI expansion remains within appetite |
| Definition | customer-impacting incidents where AI exposure materially contributed to harm |
| Numerator | confirmed AI-attributable customer harm incidents in reporting period |
| Denominator | eligible AI-assisted customer interactions in reporting period |
| Grain | incident_id and exposure_event_id |
| Source systems | incident platform, complaint platform, model gateway, workflow logs |
| Thresholds | green = no severe and rate within baseline; amber = contained medium incident; red = uncontained severe or repeat trend |
| Owner | Customer operations executive and AI risk owner |
| Evidence | incident record, trace sample, postmortem, remediation record |
| Version | v1.0 effective 2026-07-01, approved by AI Governance Committee |

Concrete attestation example:
For Q2 2026, management attests that the AI Management Information report covers all seven material AI systems in the approved register, applies approved metric contracts and thresholds, discloses the source freshness amber issue, reconciles reportable AI incidents and customer harm events, and links all amber/red appetite signals to management actions or approved risk acceptance. Signers are business owner, risk owner, technology owner, data owner, security/privacy owner and finance owner for benefit claims.

Concrete incident board update example:

Incident ID: AI-INC-2026-014.
Severity: Medium.
Systems affected: Customer Service RAG and Branch Knowledge Assistant.
Customer impact: 31 customer drafts contained stale fee-policy citation; no financial loss; remediation completed.
Root cause: regulated fee document missed daily source refresh.
Controls that worked: agent review prevented direct customer auto-send.
Controls that failed: superseded-document block did not cover one fee schedule.
Management actions: daily regulated-policy ingestion, block retired source IDs, QA retest by 2026-07-10.
Decision requested: no customer-facing direct response until source freshness returns to green for two cycles.
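
As a sketch of how this incident could flow through the MI-AI-HARM-001 thresholds, the code below computes the contract's rate and a green/amber/red status. It assumes the incident is confirmed as AI-attributable customer harm. The contract's thresholds do not say how to treat a contained severe incident or an above-baseline rate, so the code treats both as amber, which is an assumption. The denominator and baseline values are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarmIncident:
    incident_id: str
    severity: str    # "low", "medium" or "severe"
    contained: bool


def harm_rate(incidents, eligible_interactions):
    """Numerator / denominator from the contract; None when nothing was eligible."""
    if eligible_interactions <= 0:
        return None
    return len(incidents) / eligible_interactions


def harm_status(incidents, rate, baseline_rate, repeat_trend=False):
    """Map incidents to green/amber/red following the MI-AI-HARM-001 thresholds."""
    severe = [i for i in incidents if i.severity == "severe"]
    # Red: uncontained severe incident or a repeat trend.
    if repeat_trend or any(not i.contained for i in severe):
        return "red"
    medium = any(i.severity == "medium" for i in incidents)
    above_baseline = rate is not None and rate > baseline_rate
    # Assumption: contained severe incidents and above-baseline rates are amber.
    if severe or medium or above_baseline:
        return "amber"
    return "green"


incidents = [HarmIncident("AI-INC-2026-014", "medium", contained=True)]
rate = harm_rate(incidents, eligible_interactions=50_000)   # invented denominator
status = harm_status(incidents, rate, baseline_rate=0.0001)  # invented baseline
```

With these inputs the status is amber because of the contained medium incident, even though the rate stays below the assumed baseline.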

13. 30-Day Lab

Goal: produce a portfolio-ready AI MI and board reporting architecture artifact for a financial retail institution.

| Day | Theme | Output |
| --- | --- | --- |
| 1 | Select case portfolio | 5 AI systems: customer service, credit, AML, fraud, platform |
| 2 | Define board decisions | scale, hold, stop, fund, remediate, accept risk decision list |
| 3 | Build AI system register | system_id, owner, risk tier, stage, AI role, materiality |
| 4 | Define source systems | telemetry, workflow, QA, incident, complaint, finance, vendor |
| 5 | Design canonical event model | exposure, model call, retrieval, tool, review, incident, value |
| 6 | Draft metric taxonomy | value, adoption, quality, harm, control, incident, concentration |
| 7 | Review week 1 | one-page executive MI architecture narrative |
| 8 | Write 5 metric contracts | unsupported claim, harm incident, trace completeness, value event, concentration |
| 9 | Define thresholds | green/amber/red and stop rules |
| 10 | Define risk appetite statements | no appetite, limited appetite, appetite zones |
| 11 | Draw lineage map | source-to-metric-to-board-to-action |
| 12 | Define data quality rules | completeness, freshness, reconciliation, sample validation |
| 13 | Build action log schema | action fields, severity, escalation, closure |
| 14 | Review week 2 | metric contract and lineage review |
| 15 | Create portfolio dashboard | systems, risk, value, controls, actions |
| 16 | Create risk appetite dashboard | appetite category status and thresholds |
| 17 | Create control effectiveness dashboard | control pass rate, exceptions, evidence |
| 18 | Create harm and incident dashboard | severity, AI attribution, remediation |
| 19 | Create concentration dashboard | model/vendor/RAG/evidence exposure |
| 20 | Draft board cover page | decision requested, executive conclusion, key risks |
| 21 | Review week 3 | board pack coherence test |
| 22 | Build audit evidence checklist | report validation and sample pack |
| 23 | Create quarterly cadence | timeline, cut-off, sign-off, restatement |
| 24 | Build RACI | first/second/third line and PM/BA/architect split |
| 25 | Run synthetic metric experiment | calculate unsupported claim rate with lineage |
| 26 | Write incident board update | stale policy or vendor model-change case |
| 27 | Write management attestation | qualified Q2 2026 version |
| 28 | Write interview answers | 6 advanced Q&A |
| 29 | Self-review | remove vanity metrics, add lineage and action links |
| 30 | Assemble portfolio artifact | final MI architecture pack and 5-minute storyline |

Completion standard: all board-level metrics have contracts; every metric has source lineage and threshold; every amber/red signal has action or risk acceptance; board pack asks for a decision; audit can reconstruct at least three sampled metrics.
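
The Day 25 experiment (unsupported claim rate with lineage) can be sketched in a few lines. This is an illustrative sketch only: the `QaSample` shape, the metric ID `MI-AI-QUAL-001`, the system ID and the sample values are invented. The point is that the result carries the source record IDs, so an auditor can replay the number from the same records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QaSample:
    sample_id: str           # record ID in the QA system (hypothetical)
    system_id: str
    claims_checked: int
    claims_unsupported: int


def unsupported_claim_rate(samples):
    """Return the rate together with the source record IDs needed to replay it."""
    checked = sum(s.claims_checked for s in samples)
    unsupported = sum(s.claims_unsupported for s in samples)
    return {
        "metric_id": "MI-AI-QUAL-001",  # hypothetical metric ID
        "numerator": unsupported,
        "denominator": checked,
        "rate": unsupported / checked if checked else None,
        "lineage": sorted(s.sample_id for s in samples),
    }


# Synthetic QA records, for illustration only.
samples = [
    QaSample("QA-0001", "CS-RAG", claims_checked=40, claims_unsupported=2),
    QaSample("QA-0002", "CS-RAG", claims_checked=60, claims_unsupported=1),
]
result = unsupported_claim_rate(samples)
```

Keeping numerator, denominator and lineage next to the rate supports the "audit can reconstruct at least three sampled metrics" part of the completion standard.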

14. Financial Retail Case

| AI system | Business process | Risk |
| --- | --- | --- |
| Customer Service RAG | servicing policy questions | wrong commitment, complaint, privacy |
| Credit Memo Assistant | underwriter documentation | fair lending, explanation, unsupported summary |
| AML Copilot | alert investigation | missed suspicious activity, weak evidence |
| Fraud Triage | card fraud queue | false positive, customer friction, loss |
| Branch Knowledge Assistant | staff operational support | inconsistent policy, stale source |
| AI Gateway | shared platform control | shadow AI, trace, cost, vendor concentration |

| Board concern | Metric | Source |
| --- | --- | --- |
| Value | qualified value events, finance-recognized benefit | workflow, cost ledger, finance baseline |
| Adoption | eligible workflow repeat adoption, output acceptance | user telemetry, workflow eligibility |
| Quality | unsupported claim, citation correctness, summary defect | QA, eval, RAG logs |
| Harm | AI-attributable complaint, appeal overturn, remediation | complaints, appeals, incident system |
| Control | HITL bypass, source freshness, trace completeness | workflow logs, source registry, gateway |
| Concentration | top model/vendor exposure, fallback coverage | dependency graph |
| Action | overdue high actions, exception aging | GRC/action log |

Board narrative:
The AI portfolio is producing early value but is not ready for broad customer-facing autonomy.
Customer service and AML show stable value in human-supervised workflows.
Credit memo remains held due to explanation and fairness evidence gaps.
The largest portfolio risk is concentration: the same model and evidence export path support multiple material systems.
Management requests funding for evidence automation, source freshness controls and model fallback.

| System | Decision | Reason |
| --- | --- | --- |
| Customer Service RAG | limited scale | value green, quality green, source freshness amber |
| Credit Memo Assistant | hold | fairness/explanation metrics not stable |
| AML Copilot | limited expansion | value green, risk controls green, review capacity amber |
| Fraud Triage | continue pilot | false positive and appeal metrics need longer baseline |
| Branch Knowledge | continue | medium risk, source freshness improving |
| AI Gateway | fund | improves auditability and concentration controls |

15. Interview Answers

Q1: How do you design AI board reporting that is actually useful?

30-second version:

I start with board decisions, not charts. Each metric must support scale, hold, stop, fund, remediate or accept risk. Then I define metric contracts, source lineage, thresholds, owners, cadence and action logs. A board pack should be the presentation layer over a governed MI data product.

2-minute version:

I first identify the decisions the board or audit committee needs to make: risk appetite, expansion, funding, remediation, residual risk acceptance or incident oversight. For each decision, I define metrics across value, adoption, quality, customer harm, control effectiveness, incidents, concentration, cost and auditability. Each metric gets a contract: numerator, denominator, grain, source systems, threshold, owner, quality rules and evidence. Then I design lineage from AI telemetry and workflow events to dashboard tiles and board statements. Finally, every amber or red metric links to a management action log.

Q2: What is the difference between AI governance and AI MI architecture?

30-second version:

Governance defines who oversees AI and what decisions they own. MI architecture defines the facts, definitions, lineage, thresholds and actions used in that oversight. Governance without MI becomes opinion; MI without governance becomes unused reporting.

Q3: How do you report AI value without falling into vanity metrics?

30-second version:

I avoid using model calls, feature count or generated content as primary value evidence. I use qualified value events: the workflow was eligible, AI was used, output was accepted, quality and risk passed, unit economics were within boundary, and the business or finance owner recognized the outcome.
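
The qualified value event in this answer is a conjunction of six conditions, which can be sketched as a simple predicate. This is a minimal illustration: the `ValueEvent` field names are hypothetical labels for the conditions listed above, not an established schema.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ValueEvent:
    workflow_eligible: bool
    ai_used: bool
    output_accepted: bool
    quality_and_risk_passed: bool
    unit_cost_within_boundary: bool
    outcome_recognized_by_owner: bool


def is_qualified(event):
    """A value event qualifies only when every condition holds."""
    return all(getattr(event, f.name) for f in fields(event))


def failed_conditions(event):
    """List the conditions that disqualified the event, for the action log."""
    return [f.name for f in fields(event) if not getattr(event, f.name)]


# Invented events: the second one was used and accepted but failed QA.
events = [
    ValueEvent(True, True, True, True, True, True),
    ValueEvent(True, True, True, False, True, True),
]
qualified_count = sum(is_qualified(e) for e in events)
```

Counting only qualified events keeps accepted-but-defective outputs out of the value figure, and the failed-condition list shows where value leaks.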

Q4: How should SR 26-2 affect AI board reporting in 2026?

30-second version:

SR 26-2 supersedes SR 11-7 and SR 21-8 and moves toward risk-based, materiality-driven model risk management. It explicitly excludes generative and agentic AI from formal scope, so I would not misclassify every GenAI system as SR 26-2 governed. But I would use its discipline: inventory, materiality, monitoring, governance, documentation and action tracking, alongside broader AI risk frameworks.

Q5: How do you make AI control effectiveness visible to an audit committee?

30-second version:

I turn each control into an operating metric and evidence artifact. For example, "human-in-the-loop" becomes HITL bypass count, reviewer coverage, override reason, review queue age and sample evidence. Audit should be able to reconstruct whether the control operated, not just see that it exists in a policy.
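
To make the human-in-the-loop example concrete, here is a hedged sketch that derives bypass count, reviewer coverage and a queue-age figure from output records. The `Output` shape is hypothetical, and "queue age" is interpreted here as hours waited before review, which is one possible definition among several.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Output:
    output_id: str
    review_required: bool
    reviewed: bool
    sent: bool
    queue_hours: Optional[float] = None  # hours waited for review, if reviewed


def hitl_metrics(outputs):
    """Turn the HITL control into operating metrics plus an evidence sample."""
    required = [o for o in outputs if o.review_required]
    # Bypass: review was required, but the output was sent without one.
    bypassed = [o for o in required if o.sent and not o.reviewed]
    reviewed = [o for o in required if o.reviewed]
    ages = [o.queue_hours for o in reviewed if o.queue_hours is not None]
    return {
        "hitl_bypass_count": len(bypassed),
        "reviewer_coverage": len(reviewed) / len(required) if required else None,
        "max_review_queue_hours": max(ages) if ages else None,
        "evidence_sample": [o.output_id for o in bypassed],
    }


# Invented records, for illustration only.
outputs = [
    Output("O1", review_required=True, reviewed=True, sent=True, queue_hours=2.0),
    Output("O2", review_required=True, reviewed=True, sent=True, queue_hours=5.5),
    Output("O3", review_required=True, reviewed=False, sent=True),
    Output("O4", review_required=False, reviewed=False, sent=True),
]
metrics = hitl_metrics(outputs)
```

The `evidence_sample` list gives audit the specific records to pull, so the bypass count does not have to be taken on trust.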

Q6: What would you do if a board metric turns red?

30-second version:

A red metric triggers the pre-agreed action path: classify the issue, identify affected systems, assign owner, decide contain/hold/stop/escalate, record due date and evidence, then validate closure. If residual risk remains outside appetite, management needs risk acceptance or scope restriction.

Q7: How do you handle vendor concentration in board MI?

30-second version:

I report concentration as risk-weighted exposure by model, vendor, RAG platform, evidence stack and human review dependency. The board should see whether one dependency supports multiple material systems, whether fallback is tested, and whether exposure exceeds appetite.
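
One way to sketch the risk-weighted exposure in this answer is to weight each system by risk tier and sum the weights per shared dependency. Everything below is an assumption made for illustration: the tier weights, the 50% appetite share, the dependency IDs, and the rule that a breach requires both an exposure above appetite and no tested fallback.

```python
from collections import defaultdict

# Hypothetical tier weights; a real institution would calibrate these.
TIER_WEIGHT = {"high": 3, "medium": 2, "low": 1}


def concentration(systems, tested_fallbacks, appetite_share=0.5):
    """systems maps system_id -> (risk_tier, [dependency_id, ...])."""
    if not systems:
        return {}
    total = sum(TIER_WEIGHT[tier] for tier, _ in systems.values())
    exposure = defaultdict(int)
    for tier, deps in systems.values():
        for dep in set(deps):  # count each dependency once per system
            exposure[dep] += TIER_WEIGHT[tier]
    report = {}
    for dep, weight in exposure.items():
        share = weight / total
        report[dep] = {
            "share": share,
            "fallback_tested": dep in tested_fallbacks,
            # Assumed rule: breach only when over appetite with no tested fallback.
            "breach": share > appetite_share and dep not in tested_fallbacks,
        }
    return report


# Invented portfolio, for illustration only.
systems = {
    "CS-RAG": ("high", ["model-a", "rag-1"]),
    "AML-COPILOT": ("high", ["model-a", "rag-2"]),
    "BRANCH-KA": ("medium", ["model-b", "rag-1"]),
}
report = concentration(systems, tested_fallbacks={"rag-1"})
```

In this example `model-a` carries 75% of weighted exposure with no tested fallback and is flagged, while `rag-1` is above the assumed appetite share but covered by a tested fallback.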

Q8: How do you prove the board report is reliable?

30-second version:

I use report validation: reconcile the AI register, verify metric contracts, run data quality checks, test lineage, sample source records, validate threshold logic, reconcile incidents and actions, and obtain owner sign-offs. Audit should be able to replay sampled numbers end to end.


16. Common Pitfalls

| Pitfall | Why it fails | Better practice |
| --- | --- | --- |
| Reporting AI projects instead of decisions | board sees activity, not choices | lead with decisions requested |
| Using calls and users as value | usage may be rework or forced adoption | qualified value events and recognized outcomes |
| No denominator clarity | rates and trends are misleading | metric contracts with inclusion/exclusion |
| No MI lineage | audit cannot trust the report | source-to-report lineage and sample packs |
| Controls reported as policies | no evidence of operation | control effectiveness metrics and samples |
| Incidents without harm taxonomy | customer impact hidden | severity, attribution, remediation and reversibility |
| Thresholds copied from another firm | appetite not institution-specific | risk appetite calibration and review |
| GenAI scope confusion under SR 26-2 | overclaim or under-governance | separate formal scope but align governance discipline |
| No restatement rule | errors silently corrected | formal restatement and version history |
| Actions closed by assertion | remediation not proven | evidence, validation and residual risk update |

17. Final Checklist

  • AI systems have stable IDs, owners, stages, risk tiers and materiality.
  • Board metrics have metric contracts.
  • Metric contracts include numerator, denominator, grain, source, threshold, owner and version.
  • Telemetry covers model, RAG, tool, human review, workflow, incident, harm, value and dependency events.
  • MI data product has quality checks, lineage and semantic layer.
  • Dashboards reuse approved metrics instead of slide-local calculations.
  • Thresholds map to appetite, stop rules or escalation.
  • Every amber/red signal links to action or approved risk acceptance.
  • Report validation samples metrics, incidents, actions and trace reconstructability.
  • Board pack asks for decisions, not passive awareness.

CBAP+ mastery standard: you can define board-useful AI metrics without vanity measures, write metric contracts at business-rule precision, draw source-to-board MI lineage, map telemetry to value/risk/control/harm, connect thresholds to risk appetite and actions, explain SR 26-2 nuance without overclaiming GenAI scope, design an audit-ready evidence pack, and use MI to recommend scale, hold, stop, fund or remediate.

Final memory card:
AI MI architecture is the control plane behind board reporting.
Board report = decision.
Metric contract = definition.
Lineage = trust.
Threshold = risk appetite.
Action log = management accountability.
Validation = audit readiness.