
AI Exception / Risk Acceptance / Waiver Playbook


AI Exception / Risk Acceptance / Waiver Architecture Playbook

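Audience: Senior AI Product Manager / AI BA / Product Architect / Enterprise Architect / Risk Partner / Model Risk Lead / Operational Risk Lead / Compliance / Privacy / Security / Third-Party Risk / Internal Audit / Board reporting owner.

Purpose: Train financial retail AI teams to manage policy exceptions, temporary waivers, residual risk acceptance, compensating controls, expiry and renewal, evidence retention, escalation paths and hard stop conditions. The focus is not basic BA work. It is turning AI governance into an exception control system that can be run, monitored, audited and explained to executives and the audit committee.

Core view: Risk appetite defines the organization's AI risk boundary. The exception / waiver architecture manages a time-limited, scope-limited, evidence-retaining deviation of a specific use case from standard controls, after risk appetite and standard controls have already been defined. Exceptions must not become a permanent shadow policy.

Important note: This document is learning, portfolio and governance design material. It is not a legal, audit, regulatory, model validation or compliance opinion. Formal projects must be confirmed by business owner, legal, compliance, model risk, operational risk, security, privacy, third-party risk, technology, internal audit and management, taking into account the institution, jurisdiction, regulatory relationships and internal policy.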


1. Executive Framing

1.1 One-sentence positioning

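AI Exception / Risk Acceptance / Waiver Architecture is a governance mechanism that turns "we temporarily cannot meet a standard AI control" into a decision with a business rationale, a risk owner, compensating controls, an expiry, evidence, escalation and hard stop conditions.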

An AI waiver is not permission to ignore controls.
It is a time-boxed, scope-bound, evidence-backed acceptance of residual risk.

1.2 Distinction from risk appetite

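| Layer | Problem addressed | Output |
| --- | --- | --- |
| Risk appetite | Which AI risks the organization is willing to take, which uses are prohibited and which are conditionally allowed | risk appetite statement, risk tier, standard control baseline |
| Standard control catalog | Which controls each risk tier must have by default | eval, model validation, HITL, DLP, tool gateway, monitoring, evidence |
| Exception / waiver | What to do when a use case temporarily deviates from standard controls | exception memo, risk acceptance, compensating controls, expiry, hard stop |
| Issue / incident | What to do when a control has failed or harm has already occurred | incident response, RCA, remediation, customer correction |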

Senior framing: Risk appetite is the baseline. Waiver management is the controlled deviation from that baseline. If the same waiver keeps renewing, the organization has unresolved control debt or an outdated policy baseline.

1.3 Why AI exception handling is special

AI behavior is jointly determined by model route, prompt, RAG corpus, source registry, eval rubric, tool permissions, agent autonomy, human review capacity, privacy logging, security gateway, vendor terms, customer disclosure and remediation process.

Exceptions for GenAI / agentic AI cannot be filed under model risk alone. They must combine model risk, operational risk, consumer compliance, privacy, security, third-party, technology resilience, customer harm and audit evidence at the same time.

1.4 Executive question set

| Executive question | Good answer requires |
| --- | --- |
| Which AI controls are being waived? | control id, policy baseline, use case, risk tier |
| Why is the waiver necessary? | business value, timing pressure, control debt, alternatives considered |
| Who accepts the residual risk? | named role, delegated authority, approval record |
| How narrow is the exception? | user, channel, geography, data, tool, model, traffic, duration |
| What compensating controls operate now? | tested workflow controls, monitoring, review, fallback |
| What would force an immediate stop? | hard stop triggers, kill switch, owner, runbook |
| When does it expire? | fixed expiry date, renewal criteria, exit path |
| Is this becoming shadow policy? | aging, renewals, repeat reason, remediation backlog |
| Can audit reconstruct the decision? | evidence binder with versions, signoffs, logs and KRI history |

2. Source Anchors

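The sources below serve as anchors for governance language and evidence structure. This document translates them into AI product, architecture, BA and financial retail governance practice. The access date is recorded as 2026-06-30.

| Anchor | Official link | How this playbook uses it |
| --- | --- | --- |
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Uses Govern / Map / Measure / Manage to organize the context, risk measurement, control compensation, continuous monitoring, governance escalation and evidence closure of an AI exception. |
| ISO/IEC 42001 | https://www.iso.org/standard/42001 | Uses the AI management system view to bring exceptions into scope, role, operation, performance evaluation, management review and continual improvement. |
| Federal Reserve SR 26-2 | https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm | Serves as the new model risk management anchor for 2026. SR 26-2 replaced SR 11-7 and SR 21-8 on 2026-04-17; it is a practical reference for intended use, risk tier, effective challenge, monitoring and governance of AI / ML / GenAI / agentic AI. |
| FFIEC IT Examination Handbook Management booklet | https://ithandbook.ffiec.gov/it-booklets/management.aspx | Uses the board oversight, IT governance, risk management, third-party, change management, audit and management reporting views to organize exception management. |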

2.1 Current nuance

  • SR 26-2 replaces SR 11-7 and SR 21-8, so model risk discussions after 2026 cannot stop at the older letters.
  • SR 26-2 pushes model risk management toward more explicit risk-based tailoring, intended use, model inventory, monitoring, validation, effective challenge and governance.
  • For GenAI / agentic AI, exception handling must also address operational risk, consumer compliance, privacy, security, third-party, tool autonomy and evidence gaps.
  • Any exception that keeps being renewed must be escalated to a policy review, a control investment or a formal residual risk decision.

3. Exception Taxonomy

3.1 Baseline structure

Risk appetite
-> risk tier
-> standard control baseline
-> control gap
-> exception request

Vague statement: We need an exception for the AI assistant.

Acceptable statement:

We request a 45-day exception from CTRL-EVAL-HIGH-003,
which requires full regression coverage for complaint and fee-dispute slices,
for a 5% employee-only pilot of the retail service copilot.
The exception excludes customer auto-send and all write-enabled tools.

3.2 Taxonomy by control domain

| Exception type | Control gap | Example | Typical compensating control |
| --- | --- | --- | --- |
| Eval coverage | eval suite incomplete or too few samples for new scenarios | too few samples in complaint slice | smaller pilot, daily QA, failed-case capture |
| Model validation | independent validation / challenge not completed | validation report due in 30 days | shadow mode, traffic cap, validation milestone |
| Model inventory | model / prompt / RAG / tool not fully registered | prompt variants not in registry | freeze variants, manual register, block expansion |
| Data boundary | insufficient evidence for data classification, redaction or retention | prompt log redaction proof missing | no sensitive data, DLP monitoring |
| Privacy | consent, purpose, retention or vendor terms not fully evidenced | transcript retention policy pending approval | restricted dataset, no export |
| Security | control gap in gateway, RBAC, SIEM, prompt injection or tool permission | missing deny reason in tool log | read-only mode, extra SIEM alert |
| Third-party | gap in vendor evidence, SLA, DPA, exit path or region routing | vendor SOC report renewal in progress | fallback route, lower traffic |
| Consumer compliance | gap related to disclosure, adverse action, UDAAP, complaint or recourse | customer disclosure copy not fully approved | employee-only release |
| Operational readiness | human review queue, training, SOP or capacity below standard | review backlog risk | lower volume, queue KRI |
| Evidence | evidence binder automation or trace coverage incomplete | trace tags cover 92%, standard is 99% | manual evidence pack |
| Change governance | gap in artifact versioning, rollback or release approval | RAG rollback drill not completed | no auto-refresh, manual snapshot |
| Board visibility | material exception not included in MI pack | high-tier exception approved locally | immediate escalation |

3.3 Taxonomy by AI autonomy

| Autonomy level | Exception sensitivity | Typical rule |
| --- | --- | --- |
| Retrieve / summarize | Lower but depends on data and citation | Short waivers possible if employee-facing and low impact |
| Recommend | Medium to high | Waiver must preserve human decision and explainability |
| Draft customer communication | High | Disclosure, approved language and human send controls are difficult to waive |
| Decide | High to critical | Exceptions rare; prohibited uses remain no-go |
| Execute tool action | High to critical | Write-enabled waiver requires tool gateway, approval token and rollback |
| Multi-agent orchestration | Critical when autonomous | Waiver must cover delegation, identity, tool chain, monitoring and emergency stop |

3.4 Non-waivable conditions

Do not approve when the request involves prohibited use, no accountable residual risk owner, no enforceable scope, no stop capability, no evidence path, known active harm without containment, unapproved sensitive-data exposure, or hidden final-decision automation.
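
The non-waivable conditions above can be applied as a pre-screen before a request reaches approval routing. The sketch below is illustrative only: the flag names and the dictionary-shaped request are assumptions made for this example, not fields defined by the playbook.

```python
# Hypothetical flag names; each mirrors one non-waivable condition from section 3.4.
NON_WAIVABLE_FLAGS = (
    "prohibited_use",
    "no_accountable_owner",
    "unenforceable_scope",
    "no_stop_capability",
    "no_evidence_path",
    "uncontained_active_harm",
    "unapproved_sensitive_data_exposure",
    "hidden_final_decision_automation",
)


def non_waivable_reasons(request: dict) -> list:
    """Return every non-waivable condition that the request has flagged as true."""
    return [flag for flag in NON_WAIVABLE_FLAGS if request.get(flag, False)]


def can_enter_approval(request: dict) -> bool:
    """A request may proceed to approval routing only if no condition is flagged."""
    return not non_waivable_reasons(request)
```

Returning the list of reasons rather than a bare boolean keeps the rejection explainable in the evidence binder.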


4. Waiver Lifecycle

4.1 Lifecycle map

Exception trigger
-> intake
-> control gap classification
-> residual risk assessment
-> compensating control design
-> approval routing
-> restricted operation
-> KRI monitoring
-> evidence capture
-> expiry review
-> close / renew / remediate / policy update / stop

4.2 Trigger examples

| Trigger | Example |
| --- | --- |
| Product release pressure | Pilot date arrives before full eval coverage |
| Control maturity gap | policy engine supports deny but not reason logging |
| Vendor dependency | model provider updates data-processing evidence late |
| New use case | policy does not yet classify agentic workflow pattern |
| Operational constraint | human review capacity below high-tier target |
| Incident remediation | system can restart only with temporary traffic cap |
| Regulatory or audit finding | control gap must be tracked until remediation completes |

4.3 Intake fields

| Field | Strong content |
| --- | --- |
| Exception ID | stable id used in release, telemetry, evidence and dashboard |
| Use case | product, business domain, channel, customer/employee impact |
| Risk tier | low, medium, high, critical with rationale |
| Standard control | control id, policy name, required baseline |
| Requested deviation | exact control gap, not vague description |
| Business reason | why the deviation is needed now |
| Alternatives considered | wait, reduce scope, manual process, different vendor, redesign |
| Scope | users, channels, geography, data, model, tools, traffic, time |
| Residual risk | what risk remains after compensating controls |
| Compensating controls | preventive, detective, corrective controls |
| Hard stop | measurable conditions that end the waiver immediately |
| Evidence plan | what proves controls operate |
| Expiry | specific date and review forum |
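
One way to make the intake fields enforceable is to model them as a typed record and reject incomplete requests at intake. This is a minimal sketch under assumptions: the class name `ExceptionIntake`, the attribute names and the chosen types are hypothetical and simply mirror the field list above.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import Dict, List, Optional


@dataclass
class ExceptionIntake:
    """Hypothetical intake record; one attribute per intake field in section 4.3."""
    exception_id: str
    use_case: str
    risk_tier: str
    standard_control: str
    requested_deviation: str
    business_reason: str
    alternatives_considered: List[str]
    scope: Dict[str, str]
    residual_risk: str
    compensating_controls: List[str]
    hard_stops: List[str]
    evidence_plan: List[str]
    expiry: Optional[date]

    def missing_fields(self) -> List[str]:
        """Names of fields left empty (empty string, list or dict, or None)."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]
```

An intake with a non-empty `missing_fields()` result would be returned to the requester instead of being routed for approval, which matches the rule that a waiver without hard stops or an expiry is not approvable.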

4.4 Classification and approval

| Dimension | Low signal | High signal |
| --- | --- | --- |
| Customer impact | employee-only internal draft | customer-facing or customer-impacting output |
| Automation | read-only assistant | decision or write-enabled tool |
| Data sensitivity | public or approved internal knowledge | PII, account, credit, wealth, AML, complaint data |
| Regulatory relevance | internal productivity | credit, AML, privacy, complaint, advice, unfair practice |
| Reversibility | easy rollback, no external record | irreversible record, notice, funds, account status |
| Control gap severity | evidence formatting gap | missing validation, HITL or data approval |

| Risk tier | Approval pattern |
| --- | --- |
| Low | Product owner and control owner; evidence owner informed |
| Medium | Product, business owner, control owner, risk partner, security/privacy if affected |
| High | Business executive delegate, product, model risk, compliance, operational risk, security/privacy, architecture, operations |
| Critical | Executive risk owner, formal governance forum, legal/compliance/model risk, CISO/privacy/third-party as relevant, board/audit visibility if material |

Authority principle: The person who wants the value should not be the only person accepting the residual risk.
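
The approval pattern and the authority principle can be checked mechanically before a waiver is activated. The sketch below is an assumption-laden illustration: the role keys are hypothetical identifiers loosely derived from the approval table, conditional approvers ("if affected", "as relevant") are left out, and `signoffs` is assumed to map a role to the person who signed.

```python
# Hypothetical role keys derived from the approval pattern table; conditional roles omitted.
REQUIRED_APPROVERS = {
    "low": {"product_owner", "control_owner"},
    "medium": {"product", "business_owner", "control_owner", "risk_partner"},
    "high": {
        "business_executive_delegate", "product", "model_risk", "compliance",
        "operational_risk", "security_privacy", "architecture", "operations",
    },
    "critical": {
        "executive_risk_owner", "governance_forum", "legal", "compliance", "model_risk",
    },
}


def approval_gaps(risk_tier: str, signoffs: dict, requester: str) -> list:
    """Return missing approver roles, plus a marker if only the requester signed."""
    gaps = sorted(REQUIRED_APPROVERS[risk_tier] - set(signoffs))
    # Authority principle: the requester must not be the only person accepting the risk.
    if signoffs and set(signoffs.values()) <= {requester}:
        gaps.append("independent_approver")
    return gaps
```

An empty result means the recorded signoffs satisfy the tier's pattern and at least one signer is independent of the requester.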

4.5 Restricted operation

| Restriction | Architecture implementation |
| --- | --- |
| Traffic cap | feature flag, release orchestrator, route control |
| User scope | IAM, group entitlement, role policy |
| Channel scope | API gateway, UI route, deployment config |
| Data scope | data gateway, retriever filter, DLP, purpose tag |
| Tool scope | tool gateway, allowlist, deny-by-default |
| Model scope | model route policy, region route, vendor allowlist |
| Time scope | scheduled expiry flag, release calendar lock |
| Evidence scope | trace attribute exception_id, evidence binder rule |

4.6 Expiry outcomes

| Outcome | Meaning |
| --- | --- |
| Close | standard control is satisfied; waiver ends |
| Remediate then close | control gap fixed before continued operation |
| Renew with evidence | short renewal approved with new evidence and stronger conditions |
| Convert to policy change | repeated waiver reveals baseline policy/control must be updated formally |
| Stop | value no longer justifies residual risk or control gap persists |

No silent extension.


5. Risk Acceptance Memo

5.1 Memo purpose

The memo proves what control is missing or weakened, why the business still wants limited operation, what risk remains, who accepts it, what controls compensate, what evidence will be reviewed, when the acceptance ends and what triggers immediate stop.

5.2 Memo template

| Section | Example content |
| --- | --- |
| Decision summary | AI-WVR-2026-0042; Retail Service Low-Risk FAQ Pilot; temporary exception from CTRL-CITATION-PDF-004; high risk; limited approval for 30 days; expiry 2026-07-30 |
| Business rationale | pilot reduces call-center FAQ volume, tests self-service demand, stays limited to low-risk FAQ, excludes complaint / fee waiver / credit commitment / account closure |
| Alternatives | wait for full checker, extend employee-only pilot, use manual FAQ update |
| Control baseline and gap | customer-facing high-tier answers require automated citation support check; PDF table citation checker does not cover every fee table structure |
| Non-waivable boundaries | no write tools, no customer-specific advice, no complaint handling, no credit decision |
| Residual risk | customer may receive incomplete source support in low-risk FAQ; reduced by source allowlist, no-answer fallback and daily QA |
| Residual risk owner | Head of Retail Service, with Risk Partner concurrence |
| Compensating controls | 5% traffic cap, intent exclusions, daily 100-case QA, trace tags, route-to-human, feature flag disable |
| Hard stops | wrong fee commitment > 0; complaint miss > 0; unsupported citation rate > 2%; trace completeness < 98%; sensitive data exposure |
| Evidence | approval record, source registry, daily QA, KRI dashboard, trace report, incident log, stop-rule test |
| Expiry and exit | review on 2026-07-25; close after checker regression passes; stop if remediation evidence is insufficient |
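
The hard stops in the example memo are measurable, so they can be evaluated against daily metrics. The following is a sketch that uses the thresholds from the memo example; the metric key names and the dictionary input are assumptions for illustration, and a real deployment would read these values from its own monitoring stack.

```python
def triggered_hard_stops(metrics: dict) -> list:
    """Return the names of hard stops breached by the given daily metrics.

    Thresholds follow the example memo: any wrong fee commitment, any complaint
    miss, unsupported citation rate above 2%, trace completeness below 98%,
    or any sensitive data exposure.
    """
    stops = []
    if metrics["wrong_fee_commitments"] > 0:
        stops.append("wrong_fee_commitment")
    if metrics["complaint_misses"] > 0:
        stops.append("complaint_miss")
    if metrics["unsupported_citation_rate"] > 0.02:
        stops.append("unsupported_citation_rate")
    if metrics["trace_completeness"] < 0.98:
        stops.append("trace_completeness")
    if metrics["sensitive_data_exposures"] > 0:
        stops.append("sensitive_data_exposure")
    return stops
```

Any non-empty result would hand control to the stop rule runbook rather than being logged for later review.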

5.3 Memo quality bar

Weak memo: We need a temporary exception because the business needs to launch quickly.

Strong memo: We request a 30-day exception from a specific citation control for a 5% low-risk FAQ pilot. The waiver excludes high-risk intents and write tools, uses daily sample review and hard stop triggers, and expires before the next risk committee review.


6. Compensating Controls

6.1 Control design principles

Compensating controls must be tied to the specific gap, stronger where uncertainty is higher, feasible for operations, instrumented with evidence, time-boxed, reviewed at expiry and mapped to an owner.

6.2 Control catalog

| Gap | Preventive control | Detective control | Corrective control |
| --- | --- | --- | --- |
| Eval coverage incomplete | narrow scope, excluded intents, traffic cap | daily QA sample, failed-case review | expand eval set, rerun gate, pause ramp |
| Model validation pending | shadow mode, no customer impact | validation checkpoint, challenger review | hold expansion, rollback candidate |
| Citation checker incomplete | source allowlist, no-answer fallback | citation QA, unsupported claim metric | fix parser, remove source format |
| HITL capacity below standard | lower traffic, queue limit | queue aging KRI, reviewer load dashboard | route to manual backlog, stop release |
| Tool logging gap | read-only mode, deny write tools | tool deny audit, SIEM rule | disable tool, add log fields |
| Privacy evidence gap | data minimization, redaction | DLP sample, privacy review | purge logs, tighten route |
| Vendor evidence gap | fallback vendor, reduced data | SLA and incident monitoring | switch route, stop vendor use |
| Evidence automation gap | manual evidence pack owner | daily completeness check | block renewal until automated |

6.3 Compensating controls for agentic AI

| Agent risk | Compensating control |
| --- | --- |
| Tool chain expands beyond scope | tool gateway with exception-specific allowlist |
| Agent delegates to another agent | identity propagation, delegation policy, trace parent id |
| Multi-step plan hides risk | plan approval before action, step-level policy checks |
| Write action causes irreversible change | dry-run, human approval token, idempotency and reversal path |
| Prompt injection changes tool use | instruction hierarchy, content isolation, policy post-check |
| Agent loops or escalates cost | budget cap, step cap, timeout, emergency stop |
| Third-party tool changes behavior | contract pinning, vendor change notice, fallback |

6.4 Control evidence

| Control | Evidence |
| --- | --- |
| Traffic cap | release config, exposure report |
| Excluded intents | policy test, production deny logs |
| Human sample review | sample list, reviewer, outcome, defect reason |
| Source allowlist | registry snapshot, retrieval trace |
| No write tools | tool gateway config, deny logs |
| DLP route | redaction report, blocked request log |
| Hard stop test | kill switch drill or feature flag proof |
| Expiry enforcement | scheduled review, automated disable config |

7. Expiry and Renewal

7.1 Expiry rule

Every waiver needs an expiry date, review owner, forum, renewal criteria, closure criteria, stop criteria and evidence bundle. No expiry means no waiver approval.

| Risk tier | Normal maximum duration | Renewal posture |
| --- | --- | --- |
| Low | 60-90 days | one renewal with owner approval |
| Medium | 30-60 days | renewal requires evidence and control owner |
| High | 14-45 days | renewal requires cross-functional approval |
| Critical | shortest practical window | renewal discouraged; executive review required |
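
The duration limits can be checked when an expiry date is proposed. This sketch takes the upper bound of each range in the table as the cap; treating the critical tier as having no numeric limit (and therefore raising an error that forces case-by-case review) is an assumption of this example, as are the function and constant names.

```python
from datetime import date

# Upper bound of each "normal maximum duration" range, in days.
# Critical has no numeric limit in the table, so it is deliberately absent.
MAX_WAIVER_DAYS = {"low": 90, "medium": 60, "high": 45}


def duration_within_limit(risk_tier: str, approved_on: date, expires_on: date) -> bool:
    """True if the waiver window is positive and within the tier's normal maximum."""
    if expires_on <= approved_on:
        return False
    limit = MAX_WAIVER_DAYS.get(risk_tier)
    if limit is None:
        raise ValueError(f"no numeric limit for tier {risk_tier!r}; review case by case")
    return (expires_on - approved_on).days <= limit
```

For the example waiver approved on 2026-07-01 with expiry 2026-07-30, the window is 29 days, which sits inside the high-tier cap.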

7.3 Renewal criteria

Renewal should require: business rationale still valid; no hard stop triggered; KRI stable; compensating controls operated as designed; remediation progress; scope not expanded; residual risk owner re-accepts risk; new expiry is shorter or tied to a concrete delivery date.

7.4 Repeat renewal escalation

| Signal | Required action |
| --- | --- |
| second renewal | risk forum review and remediation funding decision |
| third renewal | executive review, policy/control baseline reassessment |
| over 90 days active for high-tier | board/audit committee visibility if material |
| same reason across multiple teams | platform control investment or policy update |
| expired but active | incident or management issue, not administrative delay |
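
The per-waiver escalation signals can be turned into a simple rule function. The sketch below covers only the signals that can be evaluated from a single waiver record; the "same reason across multiple teams" signal needs portfolio data and is left out. Parameter names and action strings are assumptions for illustration.

```python
def renewal_escalations(
    renewal_count: int,
    days_active: int,
    risk_tier: str,
    expired_but_active: bool,
    material: bool = False,
) -> list:
    """Return the escalation actions required for one waiver."""
    actions = []
    if expired_but_active:
        # Treated as a governance breach, not an administrative delay.
        actions.append("raise incident or management issue")
    if renewal_count >= 3:
        actions.append("executive review and policy/control baseline reassessment")
    elif renewal_count == 2:
        actions.append("risk forum review and remediation funding decision")
    if risk_tier == "high" and days_active > 90 and material:
        actions.append("board/audit committee visibility")
    return actions
```

Running this on every registry record at each review cycle would surface repeat renewals before they harden into shadow policy.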

7.5 Shadow policy test

Ask whether the same exception is active for more than one review cycle, teams design around the waiver as normal practice, the reason recurs across products, management stops discussing exit, controls remain unfunded or audit would see a gap between written policy and operations.


8. No-Go Criteria and Hard Stops

8.1 No-go criteria

| No-go condition | Reason |
| --- | --- |
| prohibited use is implicated | exceptions cannot override prohibited uses |
| residual risk owner lacks authority | accountability mismatch |
| scope cannot be enforced | waiver may spread beyond approval |
| hard stop cannot be executed | risk cannot be contained |
| customer harm is ongoing | issue/incident path required |
| sensitive data route unapproved | privacy/security/third-party risk unacceptable |
| control gap hides final decision automation | automation boundary not transparent |
| evidence cannot be retained | audit and model risk cannot reconstruct decision |
| renewal history shows chronic non-remediation | waiver is becoming policy |

8.2 Hard stop examples

| Use case | Hard stop |
| --- | --- |
| Customer service FAQ | confirmed wrong fee commitment > 0 |
| Complaint triage | complaint escalation miss > 0 |
| Credit memo copilot | adverse action reason mismatch in any customer-impacting path |
| Wealth assistant | personalized recommendation breach > 0 |
| AML investigation copilot | AI-generated SAR / no-SAR conclusion observed |
| Fraud agent | tool action changes account state without approval token |
| RAG policy assistant | unsupported citation in high-risk slice > threshold |
| Agent workflow | tool loop, cost spike, or delegated action outside allowlist |
| Privacy-sensitive assistant | PII sent to unapproved model route |
| Third-party model route | vendor SLA or data-processing evidence becomes invalid |

8.3 Stop rule runbook

Trigger detected
-> classify severity
-> pause or cap feature by exception_id
-> disable affected tool/model/channel if needed
-> notify product, risk, operations, compliance, security/privacy as relevant
-> preserve traces and evidence
-> review affected customers/cases
-> decide rollback, remediation, customer correction or incident escalation
-> update waiver status

8.4 Stop authority

| Role | Stop authority |
| --- | --- |
| Release owner | pause ramp or traffic cap |
| Operations lead | route to manual queue |
| Security lead | disable unsafe tool or route |
| Privacy lead | stop data flow |
| Risk/compliance lead | stop customer-facing use |
| Executive owner | terminate material waiver |

9. Dashboards and KRIs

9.1 Exception dashboard

| Metric | Meaning |
| --- | --- |
| Active exceptions by risk tier | overall residual risk exposure |
| Exceptions by domain | credit, wealth, AML, fraud, service, operations |
| Exceptions by control domain | eval, model risk, privacy, security, third-party, operations |
| Aging | days active and days to expiry |
| Renewal count | shadow policy signal |
| Expired but active | governance breach |
| Hard stops triggered | containment effectiveness |
| KRI breaches | risk is outside waiver conditions |
| Remediation progress | whether control debt is closing |
| Evidence completeness | audit readiness |
| Exceptions linked to incidents | whether waivers contributed to harm |
| Board-reportable exceptions | material residual risk visibility |

9.2 KRI catalog

| KRI | Definition | Management use |
| --- | --- | --- |
| Exception aging | active days since approval | spot stalled remediation |
| Repeat renewal rate | renewals / active exceptions | identify shadow policy |
| Expired active exceptions | expired exceptions still in production | immediate escalation |
| Control gap concentration | same control waived across teams | platform investment need |
| High-tier exception count | high/critical open waivers | risk appetite pressure |
| Hard stop trigger count | triggered stops by use case | system risk and control quality |
| Evidence completeness | required evidence present and current | audit readiness |
| Remediation slippage | missed control fix dates | governance effectiveness |
| Exception incident linkage | incidents linked to active waivers | risk acceptance quality |
| Third-party exception exposure | vendor-related open exceptions | supplier risk |
| Consumer harm signal | complaints, appeals, upheld cases linked to waiver scope | customer protection |
| Human review overload | review SLA breach under waiver | compensating control failure |
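
Several of these KRIs can be derived directly from registry records. The sketch below computes three of them (exception aging, repeat renewal rate, expired active exceptions) under stated assumptions: the `Waiver` record shape and its attribute names are hypothetical, and "active" is taken to mean the waiver is still in production.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Waiver:
    """Hypothetical minimal registry record used only for this example."""
    exception_id: str
    approved_on: date
    expires_on: date
    renewal_count: int
    in_production: bool


def portfolio_kris(waivers: List[Waiver], today: date) -> dict:
    """Compute aging, repeat renewal rate and expired-but-active waivers."""
    active = [w for w in waivers if w.in_production]
    total_renewals = sum(w.renewal_count for w in active)
    return {
        # Exception aging: longest number of active days since approval.
        "max_aging_days": max(((today - w.approved_on).days for w in active), default=0),
        # Repeat renewal rate: renewals divided by active exceptions.
        "repeat_renewal_rate": total_renewals / len(active) if active else 0.0,
        # Expired active exceptions: past expiry but still in production.
        "expired_active": [w.exception_id for w in active if w.expires_on < today],
    }
```

Any entry in `expired_active` maps to the "immediate escalation" management use in the catalog.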

9.3 Board and audit committee view

| Board question | Dashboard answer |
| --- | --- |
| Are we operating outside approved AI controls? | count and severity of active high/critical waivers |
| Are exceptions temporary? | aging, expiry and renewal pattern |
| Are customers exposed? | customer-facing scope, complaint and harm signals |
| Who accepted residual risk? | accountable executive roles |
| Are controls compensating effectively? | KRI status and evidence completeness |
| Are repeat exceptions creating shadow policy? | repeat reasons and policy update decisions |
| Are GenAI/agentic risks covered cross-functionally? | model, operational, privacy, security, third-party and compliance view |

9.4 Example dashboard row

| Field | Example |
| --- | --- |
| Exception ID | AI-WVR-2026-0042 |
| Use case | Retail Service Low-Risk FAQ Pilot |
| Risk tier | High |
| Control gap | PDF table citation checker incomplete |
| Scope | 5% customer FAQ traffic, low-risk intents only |
| Residual risk owner | Head of Retail Service |
| Compensating controls | daily QA, source allowlist, no write tools, complaint hard-route |
| Expiry | 2026-07-30 |
| KRI status | green, no hard stops |
| Evidence status | 98.9% trace completeness, daily QA complete |
| Renewal count | 0 |
| Board visibility | included in monthly AI risk MI if extended |

10. RACI

10.1 Core roles

| Role | Accountability |
| --- | --- |
| Business owner | owns business value and accepts business residual risk within authority |
| AI Product Manager | defines scope, value, user impact, release conditions and product guardrails |
| AI BA | maps policy/control gap to requirements, workflow, evidence and stakeholder decisions |
| Product Architect | maps waiver conditions to runtime architecture and controls |
| Model Risk | evaluates model intended use, validation gap, monitoring and effective challenge |
| Operational Risk | evaluates process, human control, capacity, incident and control operation |
| Compliance | evaluates consumer, regulatory, disclosure, complaint and record implications |
| Privacy | evaluates data purpose, minimization, consent, retention and rights |
| Security | evaluates access, gateway, tool abuse, logging, SIEM and incident path |
| Third-Party Risk | evaluates vendor evidence, SLA, data terms, resilience and exit |
| Operations | runs human review, QA sampling, queues and fallback processes |
| Release Governance | ensures approval routing, evidence, expiry and dashboard |
| Internal Audit | assesses design and evidence quality without owning management risk |
| Board / Audit Committee | receives material exception visibility and challenges chronic exposure |

10.2 RACI shorthand

| Activity | Accountable | Responsible | Consulted |
| --- | --- | --- | --- |
| Intake and residual risk memo | Business owner | PM / BA / Release Governance | Architect, Risk, Compliance, Privacy, Security, TPRM, Ops |
| Control gap and compensating controls | Architect / Operational Risk | BA / PM / Ops | Model Risk, Compliance, Privacy, Security |
| Approval routing and expiry review | Release Governance | PM / BA | all required control owners by risk tier |
| Runtime enforcement and KRI dashboard | Architect / Operations | Platform / Ops / Release Governance | Product, Risk, Security, Privacy |
| Board reporting | Business owner / Release Governance | Risk reporting owner | Product, Architecture, Control owners |

10.3 Forum design

| Forum | Scope | Cadence |
| --- | --- | --- |
| Daily exception triage | low/medium intake, expiring exceptions, evidence gaps | daily or twice weekly |
| Weekly AI risk review | high-tier waivers, KRI trends, renewal requests | weekly |
| Material AI governance forum | critical waivers, customer-impacting exceptions, cross-domain disputes | as needed |
| Monthly management information review | portfolio exposure, aging, repeat exceptions, remediation funding | monthly |
| Quarterly board/audit package | material residual risk, policy drift, chronic exceptions, audit findings | quarterly |

11. Financial Retail Examples

11.1 Credit policy copilot

Scenario: AI drafts credit policy memos for underwriters; standard control requires full fair-lending regression before expanded pilot; a small-business segment is incomplete.

| Area | Decision |
| --- | --- |
| Scope | employee-only memo draft, no final credit decision, no adverse action notice |
| Duration | 30 days |
| Residual risk | underwriter may over-trust draft in new segment |
| Compensating controls | mandatory underwriter attestation, second-line sample review, no customer communication |
| Hard stop | any reason-code mismatch in customer-impacting path |
| Evidence | underwriter review logs, sample review, eval expansion plan |

No-go: AI automatically generates final adverse action reasons under incomplete validation.

11.2 Wealth guidance assistant

Scenario: AI supports advisor preparation and client education; advice boundary classifier has not completed edge-case testing for retirement product prompts.

| Area | Decision |
| --- | --- |
| Scope | advisor-facing only, meeting prep and educational summaries |
| Exclusion | no direct client personalized recommendation |
| Compensating controls | licensed advisor final review, approved educational content, advice breach monitoring |
| Hard stop | personalized buy/sell recommendation generated without advisor mediation |
| Expiry | classifier edge-case test completion date |

No-go: customer-facing robo-advice with incomplete advice boundary.

11.3 Customer service AI FAQ

Scenario: Customer-facing low-risk FAQ pilot has incomplete PDF table citation automation.

| Area | Decision |
| --- | --- |
| Scope | 5% low-risk FAQ sessions |
| Exclusion | complaints, fee waivers, credit commitments, account closure |
| Compensating controls | source allowlist, no-answer fallback, daily QA, route to human |
| Hard stop | wrong fee commitment > 0 or complaint miss > 0 |
| Evidence | citation QA, trace samples, customer escalation log |

11.4 AML investigation copilot

Scenario: AI summarizes transaction patterns and drafts narrative; tool audit log lacks one required attribute for analyst override reason.

| Area | Decision |
| --- | --- |
| Scope | internal draft only, no SAR/no-SAR decision |
| Compensating controls | analyst-in-control, manual override reason field, weekly QA |
| Hard stop | AI-generated final SAR conclusion or missing case trace |
| Evidence | case review records, manual override extract, tool trace |

No-go: AI submits SAR or decides no-SAR with incomplete evidence controls.

11.5 Fraud operations agent

Scenario: Agent can recommend fraud queue actions; write-enabled account restriction tool is technically available but reversal process is not tested.

| Area | Decision |
| --- | --- |
| Scope | read-only and recommendation mode |
| Not approved | account restriction write action |
| Compensating controls | fraud analyst approval, dry-run tool output, queue monitoring |
| Hard stop | any account state change without approval token |
| Exit | complete reversal test and dual-control design |

11.6 Third-party model route

Scenario: Vendor model performs better for Spanish-language support; updated vendor evidence for data retention is due but not yet received.

| Condition | Required control |
| --- | --- |
| Data minimized | no sensitive account details in prompt |
| Region controlled | approved endpoint only |
| Duration limited | short expiry aligned to vendor evidence date |
| Fallback available | route to existing approved model |
| Monitoring active | data route and vendor SLA dashboard |

No-go: PII or restricted data sent to unapproved or unverified route.


12. Templates

12.1 Exception intake form

| Field | Example |
| --- | --- |
| Identification | AI-WVR-2026-0042; Retail Service FAQ Pilot; Customer Service; request owner AI PM; business owner Head of Retail Service |
| Baseline | high risk because output is customer-facing; CTRL-CITATION-PDF-004 requires automated citation support check |
| Deviation | PDF fee table citation support incomplete; 30-day waiver requested; 5% traffic cap; complaints, fee waivers, credit commitments and account closure excluded |
| Residual risk | customer may receive incomplete source support in low-risk FAQ |
| Controls | source allowlist, no-answer fallback, daily 100-case QA, route-to-human, no write tools |
| Hard stops | wrong fee commitment > 0; complaint miss > 0; unsupported citation sample > 2%; trace completeness < 98% |
| Evidence and exit | release config, source registry, QA, trace dashboard, incident log; close after checker regression or stop pilot |

12.2 Residual risk acceptance table

| Field | Example |
| --- | --- |
| Risk event | Low-risk FAQ answer contains unsupported citation |
| Potential impact | customer confusion, trust impact, complaint |
| Inherent severity | medium |
| Compensating controls | source allowlist, no-answer fallback, daily QA |
| Residual severity | low to medium within limited scope |
| Residual risk owner | Head of Retail Service |
| Acceptance period | 2026-07-01 to 2026-07-30 |
| Review cadence | weekly KRI, daily QA |
| Hard stop | unsupported citation sample rate > 2% |
| Evidence | QA log, trace report, KRI dashboard |

12.3 Compensating control matrix

| Control gap | Risk | Compensating control | Owner | Evidence | Frequency |
| --- | --- | --- | --- | --- | --- |
| PDF citation parser incomplete | unsupported source claim | daily sample review and no-answer fallback | Ops QA Lead | sample log and fallback metric | daily |
| trace completeness below standard | audit reconstruction gap | daily trace completeness check | Observability Owner | dashboard export | daily |
| review queue capacity uncertain | delayed human escalation | traffic cap and queue aging alert | Operations Lead | queue report | hourly |
| vendor evidence pending | data-processing uncertainty | data minimization and fallback route | TPRM Owner | route log and vendor tracker | weekly |

12.4 Waiver approval record

| Field | Example |
| --- | --- |
| Decision | limited approval for AI-WVR-2026-0042 |
| Approved / not approved | approved 5% low-risk FAQ traffic; not approved high-risk intents, customer-specific advice or write-enabled tools |
| Dates | approval 2026-07-01; expiry 2026-07-30 |
| Approvers | Business owner, Risk partner, Compliance, Privacy, Security, Product architect, Release Governance |
| Conditions | daily QA before next-day ramp, no hard stop, weekly expiry review, daily evidence binder update |

12.5 Board/audit committee summary

| Field | Example |
| --- | --- |
| Portfolio exposure | active high/critical exceptions 4; customer-facing 2; critical 0; expired active 0; renewal count >= 2 is 1 |
| Material exception | AI-WVR-2026-0042, Retail Service Low-Risk FAQ Pilot, Head of Retail Service owns residual risk |
| Control gap and scope | PDF table citation automation incomplete; 5% low-risk FAQ traffic; expiry 2026-07-30 |
| Status | KRI green; no hard stops; pilot stops if citation checker regression is not completed |
| Management attention | recurring citation-control waiver across service products indicates platform investment need |

12.6 Expiry review decision

| Field | Example |
| --- | --- |
| Review | AI-WVR-2026-0042; review 2026-07-25; expiry 2026-07-30 |
| Evidence | KRI dashboard, QA sample logs, trace completeness, complaint linkage, citation checker remediation |
| Options | close, renew, stop, convert to formal policy/control baseline review |
| Decision | close after checker regression passes and production trace confirms coverage |
| Conditions | no traffic expansion until standard gate; regression failures enter remediation; evidence retained |

13. Architecture Pattern

13.1 Exception control plane

| Component | Purpose |
| --- | --- |
| Exception registry | source of truth for waiver id, scope, owner, expiry, approvals |
| Control catalog | maps risk tier to required controls and waivable/non-waivable status |
| Policy engine | enforces scope, deny rules, hard exclusions |
| Release orchestrator | controls traffic cap, channel, model route, feature flag |
| Tool gateway | enforces read/write permission and approval token |
| Model gateway | enforces vendor/model/data route and logging requirements |
| EvalOps pipeline | runs exception-specific regression and sample review |
| Observability layer | emits exception_id and control-gap telemetry |
| Evidence binder | stores memo, approval, KRI, logs, review and expiry decisions |
| Incident integration | links hard stops to incident and remediation workflow |
| Management dashboard | reports aging, renewal, KRI, board/audit visibility |

13.2 Runtime tagging

ai.exception_id, ai.exception_scope, ai.risk_tier, ai.control_gap_id, ai.residual_risk_owner, ai.expiry_date, ai.model_version, ai.prompt_version, ai.rag.source_registry_version, ai.tool.policy_version, ai.human_review_required, ai.hard_stop_profile
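One way to keep these tags from silently going missing is to build them through a single function that fails fast. The sketch below assumes a hypothetical helper named `build_trace_tags`; only the tag names come from the list above, and the string conversion is an illustrative choice.

```python
# Tag names taken from the runtime tagging list; everything else is assumed.
REQUIRED_TAGS = (
    "ai.exception_id",
    "ai.exception_scope",
    "ai.risk_tier",
    "ai.control_gap_id",
    "ai.residual_risk_owner",
    "ai.expiry_date",
    "ai.model_version",
    "ai.prompt_version",
    "ai.rag.source_registry_version",
    "ai.tool.policy_version",
    "ai.human_review_required",
    "ai.hard_stop_profile",
)


def build_trace_tags(values: dict[str, object]) -> dict[str, str]:
    """Return flat trace attributes, raising if any required tag is absent or empty."""
    missing = [
        name for name in REQUIRED_TAGS
        if name not in values or values[name] in (None, "")
    ]
    if missing:
        raise ValueError("missing exception tags: " + ", ".join(missing))
    # Convert every value to a string so the tags fit a flat attribute map.
    return {name: str(values[name]) for name in REQUIRED_TAGS}
```

Rejecting an incomplete tag set at emit time means trace completeness is enforced when the request runs, not discovered later during evidence review.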

13.3 Enforcement flow

request enters AI gateway
-> check exception_id active
-> check current date before expiry
-> check user/channel/data/tool/model in approved scope
-> apply exception-specific policy bundle
-> emit trace tags
-> route allowed request
-> deny or route to human if outside scope
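The flow above can be sketched as a single gateway check. This is a minimal illustration under stated assumptions: the `WaiverRecord`, `ApprovedScope` and `Request` types are hypothetical, the expiry date itself is treated as still valid, an inactive or expired waiver is denied, and an out-of-scope request is routed to a human. A real policy bundle would add more checks than are shown here.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ApprovedScope:
    # Hypothetical allowlists for each scope dimension named in the flow.
    user_groups: frozenset[str]
    channels: frozenset[str]
    data_classes: frozenset[str]
    tools: frozenset[str]
    models: frozenset[str]


@dataclass(frozen=True)
class WaiverRecord:
    exception_id: str
    active: bool
    expiry: date
    scope: ApprovedScope


@dataclass(frozen=True)
class Request:
    exception_id: str
    user_group: str
    channel: str
    data_class: str
    tool: str
    model: str


def enforce(req: Request, registry: dict[str, WaiverRecord],
            today: date) -> tuple[str, dict[str, str]]:
    """Return (decision, trace_tags); decision is allow, deny or route_to_human."""
    waiver = registry.get(req.exception_id)
    if waiver is None or not waiver.active:
        return "deny", {"reason": "exception_not_active"}
    if today > waiver.expiry:  # assumption: the expiry day itself is still valid
        return "deny", {"reason": "exception_expired"}
    scope = waiver.scope
    in_scope = (
        req.user_group in scope.user_groups
        and req.channel in scope.channels
        and req.data_class in scope.data_classes
        and req.tool in scope.tools
        and req.model in scope.models
    )
    tags = {
        "ai.exception_id": waiver.exception_id,
        "ai.expiry_date": waiver.expiry.isoformat(),
    }
    # Assumption: out-of-scope traffic goes to a human rather than being dropped.
    return ("allow" if in_scope else "route_to_human"), tags
```

Keeping the expiry check inside the request path, rather than in a scheduled job alone, is what prevents an expired waiver from continuing to serve traffic.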

14. 30-Day Lab

| Days | Focus | Deliverables |
| --- | --- | --- |
| 1-3 | Select use case and baseline | use case one-pager, risk appetite statement, risk tier, standard control baseline |
| 4-6 | Identify exception scenario | control gap memo, non-waivable boundary list, alternatives considered |
| 7-9 | Draft risk acceptance memo | business rationale, residual risk, scope, controls, approval roles, hard stops, expiry |
| 10-12 | Design compensating controls | preventive/detective/corrective matrix, owner, evidence, workflow diagram |
| 13-15 | Design architecture enforcement | exception registry fields, runtime tags, policy checks, tool restrictions, kill switch |
| 16-18 | Build dashboard specification | active exceptions, aging, renewals, expired active, KRI status, evidence completeness |
| 19-21 | Write templates | intake form, approval record, expiry review memo, board summary, evidence binder |
| 22-24 | Simulate hard stop | incident timeline, stop action, customer/case review, remediation update |
| 25-27 | Conduct expiry review | close, renew, remediate, convert to policy update or stop decision |
| 28-30 | Interview and portfolio pack | 30-second answer, 2-minute answer, CRO version, architect version, board/audit explanation |

Success criteria: no open-ended waiver; no unowned residual risk; no missing hard stop; no unenforceable scope; no evidence gap; clear explanation of why the exception is not shadow policy.
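Most of these criteria can be checked mechanically before a waiver draft reaches approvers. The sketch below assumes a hypothetical `WaiverDraft` record with illustrative field names; it covers the first five criteria only, because the shadow-policy explanation is a qualitative judgment that a field check cannot make.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class WaiverDraft:
    # Hypothetical intake record; field names are assumptions.
    waiver_id: str
    expiry: Optional[date] = None
    residual_risk_owner: str = ""
    hard_stops: list[str] = field(default_factory=list)
    # Runtime controls that enforce scope, e.g. feature flag or policy engine.
    enforcement_points: list[str] = field(default_factory=list)
    evidence_items: list[str] = field(default_factory=list)


def success_criteria_gaps(draft: WaiverDraft) -> list[str]:
    """Return the success criteria this draft fails; an empty list means none."""
    gaps = []
    if draft.expiry is None:
        gaps.append("open-ended waiver")
    if not draft.residual_risk_owner.strip():
        gaps.append("unowned residual risk")
    if not draft.hard_stops:
        gaps.append("missing hard stop")
    if not draft.enforcement_points:
        gaps.append("unenforceable scope")
    if not draft.evidence_items:
        gaps.append("evidence gap")
    return gaps
```

A draft that returns any gap would go back to the requesting team rather than forward to approval.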


15. Interview Answers

15.1 30-second answer

AI risk appetite defines the baseline; waiver management governs controlled deviations from that baseline. For every AI exception, I require a specific control gap, narrow scope, residual risk owner, compensating controls, expiry date, hard stop conditions, evidence plan and renewal criteria. I also track aging and repeat renewals, because recurring waivers can become shadow policy. For GenAI and agentic AI, I do not treat this as only model risk; I combine model risk, operational risk, consumer compliance, privacy, security and third-party controls.

15.2 2-minute answer

I would manage AI exceptions as a formal risk acceptance lifecycle. First, I start from the approved risk appetite and control catalog. If a use case cannot satisfy a standard control, the team must identify the exact policy or control being waived, not just say “we need an exception.” Then I classify the exception by customer impact, automation level, data sensitivity, regulatory relevance, reversibility and control gap severity.

Second, I write a residual risk memo. It explains the business reason, alternatives considered, limited scope, what harm could occur, who accepts the residual risk and for how long. The waiver must include compensating controls, such as traffic caps, employee-only scope, source allowlists, human review, QA sampling, no write tools, extra monitoring or fallback routing.

Third, I make the waiver operational. The exception registry connects to feature flags, the policy engine, model gateway, tool gateway, telemetry and the evidence binder. Every request under the waiver carries an exception id, scope, risk tier, control gap and expiry. Hard stops are pre-approved: for example, a wrong fee commitment, a missed complaint escalation, PII routed to an unapproved model, or a tool write without approval each pauses the feature immediately.

Finally, I manage expiry. A waiver can close, renew with new evidence, remediate, convert to a formal policy update or stop. It cannot silently continue. Repeat renewal, expired active exceptions and recurring control gaps are escalated to management and, when material, board or audit committee reporting. That prevents exceptions from becoming permanent shadow policy.

15.3 CRO version

I would focus on residual risk accountability and aggregate exposure. The CRO should see which AI controls are being waived, which high/critical use cases are affected, who accepted residual risk, what compensating controls are operating, which KRIs are near breach, which exceptions are aging and whether repeat waivers indicate policy drift or underfunded controls. I would also distinguish waivable control gaps from prohibited uses. A waiver cannot be used to approve an unauthorized final decision, unapproved sensitive-data route or uncontrollable agent execution.

15.4 Chief Product Officer version

I would frame waivers as a way to learn safely, not as a way to bypass governance. Product teams can run limited pilots when the control gap is specific, the scope is narrow and the residual risk is accepted. But the waiver must shape the roadmap: if multiple teams keep requesting the same exception, that becomes a platform investment or policy decision. The product leader should track exception debt the same way they track technical debt, because unmanaged waivers slow down future releases and create audit risk.

15.5 Chief Architect version

I would implement exception management as a control plane. The exception registry should feed policy engine, model gateway, tool gateway, release orchestrator, observability and evidence binder. Runtime checks should enforce expiry, user/channel/data/tool/model scope and hard exclusions. Every trace should carry exception id, risk tier, control gap, model/prompt/source/tool versions and review status. The architecture must support artifact-level rollback, not just code rollback, because AI behavior can change through prompt, RAG, model route, tool schema or vendor configuration.

15.6 Internal audit version

I would ask whether management can reconstruct the decision and prove the waiver operated within approved boundaries. Evidence should include the policy baseline, control gap, approval roles, residual risk memo, compensating control tests, release configuration, trace samples, KRI history, hard stop drill and expiry decision. I would pay special attention to expired active exceptions, repeat renewals and cases where the same control is waived across multiple products, because those indicate shadow policy risk.

15.7 Board/audit committee version

At board or audit committee level, I would not show every low-risk waiver. I would show material exposure: active high/critical exceptions, customer-facing exceptions, exceptions linked to incidents, expired active items, repeat renewals, control-gap concentration and remediation progress. The key message is whether AI operations remain within approved risk appetite or whether exceptions are becoming the real operating model.


16. Common Failure Modes

| Failure mode | Symptom | Better practice |
| --- | --- | --- |
| Waiver without expiry | “temporary” approval remains active for months | fixed expiry and automatic escalation |
| Waiver without control id | nobody knows what is being waived | map to policy/control catalog |
| Business-only approval | value owner approves own residual risk | tiered cross-functional approval |
| No runtime enforcement | scope exists only in memo | feature flag, policy engine, gateway control |
| No hard stop | team debates during breach | pre-approved stop conditions |
| Weak compensating controls | “manual monitoring” with no sample plan | defined owner, frequency and evidence |
| Evidence afterthought | audit pack assembled manually months later | evidence generated as workflow operates |
| Repeated renewal | same gap extended repeatedly | platform investment or policy review |
| Model-risk-only lens | privacy, security, operations and third party omitted | cross-domain review |
| Agent tool gap | waiver ignores delegated tool action | tool gateway and identity propagation |
| Expired active waiver | production still runs after expiry | incident escalation and disable path |
| Shadow policy | exceptions become standard practice | management review and formal baseline decision |

17. Final Memory Card

| Concept | One line |
| --- | --- |
| Risk appetite | baseline boundary for AI risk-taking |
| Waiver | time-boxed deviation from a standard control |
| Risk acceptance | accountable acceptance of residual risk |
| Compensating control | substitute control that reduces risk while gap exists |
| Expiry | date when waiver must close, renew, remediate, convert or stop |
| Hard stop | pre-approved condition that immediately pauses or rolls back use |
| Shadow policy | repeated or indefinite exception that becomes the real operating model |
| Board visibility | material exposure, aging, renewal, breach and remediation view |
| Agentic AI nuance | exception must cover model, tool, identity, delegation, security, privacy, operations and vendor risk |

Most important sentence:

A mature AI organization does not pretend exceptions will disappear; it designs them as controlled, temporary, evidenced residual-risk decisions and escalates them before they become shadow policy.