AI Segregation of Duties / Dual Control Playbook
AI Segregation of Duties / Dual Control / Maker-Checker Architecture Playbook
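Positioning: For senior AI PMs, AI BAs, AI Product Architects, Enterprise Architects, Risk Technology Leads, Operations Leads and Internal Audit Partners. The aim is to raise AI segregation of duties from "permission configuration" to a production control system that can be designed, operated, measured and audited.
Scope: agentic workflow, AI copilot, AML assistant, payment dispute assistant, credit underwriting copilot, complaint remediation agent, KYC review assistant, customer service RAG, vendor AI reviewer, AI platform tool gateway.
Important note: This document is learning, portfolio and internal governance training material. It is not legal advice, a compliance conclusion, an audit opinion, a model validation report or a regulatory interpretation. Formal projects must be confirmed by Legal, Compliance, Risk, Model Risk, Internal Audit, Security, Privacy, the Business Owner, the Operations Owner, the IAM Owner and management, taking into account institution type, jurisdiction, business purpose, customer impact and internal policy.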
1. Executive Framing
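Segregation of duties, dual control and maker-checker are not back-office procedures from an earlier era. In AI workflows they matter more, for a simple reason: an AI agent can read data, interpret evidence, recommend actions, call tools, draft customer communications and generate audit summaries, all within one run. Designed poorly, an "efficient agent" merges duties that used to be spread across analyst, reviewer, supervisor, operator, auditor and system admin into a single point of power. The core judgment of this playbook: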
AI risk is not only unauthorized access.
It is incompatible duties collapsing into one automated chain.
In other words:
The exposure is not just access beyond entitlement; automation can also compress incompatible duties into one chain.
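1.1 How This Differs From Generic Authorization
| Generic authorization | SoD / dual control architecture |
|---|---|
| Asks whether a given subject may perform an action | Asks which duties must not be performed by the same subject |
| Centered on user, role and scope | Centered on duty, workflow state, conflict and evidence |
| Typical output is allow / deny | Typical output is allow / deny / require_checker / dual_approval / independent_challenge |
| Mainly prevents unauthorized access | Also prevents self-approval, rubber-stamping, conflicts of interest and distorted audit records |
| Usually lives in the IAM or API layer | Must span workflow, tool gateway, review workbench, operations and audit |
| Can look only at the current action | Must look at what the actor has already done, what it will do next, and how it relates to the case |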
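1.2 Why AI Makes SoD Harder
- An AI maker's output can look so complete that the checker relaxes independent judgment.
- A copilot may pre-fill the approval rationale, lowering the quality of human challenge.
- A supervisor agent may share the same model, knowledge source and owner as the worker agent.
- A tool gateway may validate only scope, not whether the maker is also the approver.
- Operational KPIs may reward speed, turning the reviewer into a formality button.
- A vendor may supply the model, the monitoring and the incident report, with no independent acceptance verification.
- The audit trail may keep only the final summary, without the original evidence, approval chain and parameter hash.
1.3 Goals Of Mature SoD
A mature design must be able to answer:
| Question | Mature answer |
|---|---|
| Who prepared the recommendation | maker identity, role, agent version, workflow run |
| Who reviewed the recommendation | checker identity, independence rule, evidence viewed |
| Who approved the action | approver authority, approval id, action hash |
| Who executed the action | tool gateway, service principal, execution result |
| Who overrode the conclusion | override owner, reason code, second review trigger |
| Who tested the control | independent QA / model risk / audit sampling |
| Where the evidence is | evidence ledger with trace, versions, decisions and retention class |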
2. Source Anchors
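The official sources below serve as anchors for governance and control design. This document translates them into design language for AI workflow role boundaries, maker-checker, dual authorization, independent challenge, entitlement separation and audit evidence.
| Anchor | Official link | How this document uses it |
|---|---|---|
| FFIEC Management booklet | https://ithandbook.ffiec.gov/it-booklets/management.aspx | Uses its language on management responsibility, risk management, resourcing, control oversight and audit coordination to define the AI SoD operating model |
| NIST AI Risk Management Framework | https://www.nist.gov/itl/ai-risk-management-framework | Uses Govern / Map / Measure / Manage to organize scenario identification, control design, metrics and ongoing treatment for AI SoD |
| ISO/IEC 42001 | https://www.iso.org/standard/42001 | Uses the AI management system approach to connect policy, accountability, operational control, performance evaluation and improvement |
| Federal Reserve SR 26-2 | https://www.federalreserve.gov/supervisionreg/srletters/SR2602.htm | Uses its language on model risk management, independent validation, challenge, governance and evidence to reinforce AI control design |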
2.1 SR 26-2 Nuance
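SR 26-2 was issued on 2026-06-24 and superseded SR 11-7 and SR 21-8. Its attachment states that generative AI and agentic AI are not directly in scope of the guidance. SR 26-2 therefore cannot be treated as a complete operating manual for GenAI / agentic AI. AI SoD in retail financial services should still absorb its model risk governance ideas:
- State the purpose of each model and system clearly.
- Separate development, use, validation and oversight duties.
- Retain evidence that can be re-reviewed.
- Apply independent challenge to key controls.
- Reassess risk on changes, issues and material use cases.
For GenAI / agentic AI, SoD must additionally combine:
- operational risk: process interruption, erroneous execution, queue congestion, failed human review.
- identity / authorization: agent identity, delegated scope, tool gateway, approval token.
- consumer compliance: customer-visible communication, complaints, credit, payments, privacy, fairness and vulnerable customers.
- audit evidence: maker, checker, approver, executor, override, policy version, action hash, result.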
2.2 Mapping To NIST AI RMF
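| NIST function | SoD design question | Evidence |
|---|---|---|
| Govern | Who owns duty boundaries, exception approval, conflict rules and control testing | RACI, SoD policy, management report |
| Map | Which AI use cases merge incompatible duties | duty inventory, workflow map, risk tier |
| Measure | Whether SoD controls actually prevent self-approval and conflicts of interest | violation rate, override concentration, QA results |
| Manage | How to respond when duties conflict, approvals fail or evidence is missing | escalation runbook, kill switch, CAPA |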
3. SoD Taxonomy For AI Workflows
3.1 Duty Families
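| Duty family | Description | AI examples |
|---|---|---|
| Request | Initiate a task, request data or request a tool action | CSR asks the complaint agent to draft a reply |
| Retrieve | Read customer, transaction, policy, document and case evidence | RAG service retrieves lending policy |
| Extract | Extract facts from unstructured material | KYC AI extracts address and expiry date |
| Analyze | Identify patterns, risks, gaps or anomalies | AML assistant identifies red flags |
| Recommend | Produce a business recommendation or next step | dispute agent recommends provisional credit |
| Draft | Draft a memo, case note or customer message | complaint response draft |
| Approve | Approve a business conclusion or action | supervisor approves a refund |
| Execute | Call a tool that changes system state | payment API issues a provisional credit |
| Notify | Send a message to a customer, regulator or third party | sending an adverse action notice |
| Override | Override an AI, rule, human or system default result | manager overrides a dispute denial recommendation |
| Close | Close a case, alert, complaint or exception | AML alert closure |
| Validate | Validate the quality of a model, rule, control or output | model risk validation |
| Audit | Independently review controls and evidence | internal audit replay |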
3.2 Actor Types
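| Actor | SoD concern |
|---|---|
| Human user | May be maker, approver and override owner at once |
| AI agent | May generate, recommend, execute and self-assess automatically |
| Supervisor agent | May lack independence and be only a second layer of the same model chain |
| Workflow engine | May advance state by default; needs an external gate |
| Tool gateway | The execution control point; must know the duty context |
| Service account | Must not mask the real maker and approver |
| Vendor AI | Must not self-assess its own errors and control effectiveness |
| Business owner | May have a performance conflict; needs risk or compliance challenge |
| Model owner | Should not be the sole validator of its own model risk controls |
| Internal audit | Should have independent replay and evidence access |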
3.3 SoD Control Strength Levels
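| Level | Name | Description | Suitable use |
|---|---|---|---|
| S0 | No separation | The same subject can prepare, approve and execute | Low-risk drafts and non-production experiments only |
| S1 | Logged self-check | The same subject can execute, with logging and sampling retained | Low-impact internal classification |
| S2 | Maker-checker | Maker and checker are separate | Customer-visible drafts, policy explanations, case notes |
| S3 | Independent challenge | Checker must be independent and able to view original evidence | Borderline judgments in credit, AML, complaints and disputes |
| S4 | Dual authorization | Two authorized subjects approve jointly | Funds, accounts, regulatory matters, adverse customer decisions |
| S5 | Three-lines evidence | Operations, risk / model risk and audit each hold an evidence view | High-impact AI production control |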
3.4 AI-Specific Independence Dimensions
Independence means more than two different names.
| Dimension | Weak independence | Stronger independence |
|---|---|---|
| Person | The same employee reviews a case they initiated | A different employee with a different queue assignment |
| Role | The same frontline role approves its own recommendation | supervisor, specialist, risk or compliance role |
| Team | The same sales team approves a credit exception | independent credit authority |
| Model | The same model checks itself | independent rules, different model class, human challenge |
| Prompt | The same system prompt is copied to run the review | separate review rubric and evidence-first UI |
| Data | Checker sees only the maker's summary | Checker can see original source, version and missing evidence |
| Vendor | Vendor self-assesses an incident | internal owner accepts evidence and audit samples |
| Incentive | Reviewer KPI tracks speed only | balanced quality, risk, override, evidence metrics |
4. Incompatible Duty Matrix
4.1 Core Matrix
| Maker duty | Incompatible checker / approver duty | Control |
|---|---|---|
| Create customer-impacting recommendation | Approve same recommendation | maker-checker separation |
| Extract evidence | Certify evidence sufficiency | independent evidence review |
| Draft customer denial | Send final denial | pre-send review and approval |
| Recommend refund | Execute refund | dual authorization above threshold |
| Recommend account restriction | Approve account restriction | senior risk approval |
| Classify AML alert as low risk | Close alert | AML specialist review |
| Generate adverse action reason | Approve adverse action notice | approved reason code service + human review |
| Request new entitlement | Approve entitlement | IAM owner or manager approval |
| Build model / prompt | Validate model / prompt | independent model risk or QA |
| Operate AI control | Test control effectiveness | independent QA / audit sample |
| Cause incident | Approve restoration | incident commander + risk owner |
| Vendor produces control report | Accept control effectiveness | internal owner acceptance and audit rights |
4.2 Financial Retail Matrix
| Domain | Incompatible combination | Example control |
|---|---|---|
| Payments | dispute recommendation + provisional credit release | payment specialist approval, amount threshold dual control |
| AML | AI red-flag analysis + alert closure | analyst review, senior approval for closure |
| Lending | credit memo draft + final approve / decline | underwriter decision owner, fair lending review |
| Complaint | response draft + complaint closure | complaint lead review, regulatory SLA evidence |
| KYC | document classification + KYC status approval | EDD review for high-risk customer |
| Collections | hardship recommendation + repayment plan change | supervisor approval, customer confirmation |
| Wealth | product explanation + personal recommendation | licensed advisor handoff |
| IAM | agent requests scope + auto-grants scope | entitlement workflow and access recertification |
4.3 AI Platform Matrix
| Platform duty | Should be separated from | Why |
|---|---|---|
| Tool registry approval | Tool owner submitting tool | avoid self-approval of side effects |
| Prompt release | Prompt author only approval | prevent unchallenged behavior change |
| RAG source ingestion | Source quality approval | prevent stale or unauthorized source use |
| Eval dataset creation | Sole performance reporting | avoid test leakage and benchmark gaming |
| Model routing config | Production incident closure | avoid restoring unsafe route |
| Policy-as-code authoring | Production policy approval | enforce review of rule intent and tests |
| Audit log administration | Evidence review | prevent log tampering or selective evidence |
4.4 Conflict-Of-Interest Signals
| Signal | Example | Action |
|---|---|---|
| Same actor | CSR drafts and approves own fee waiver | block or route to checker |
| Same case owner | analyst closes case they triaged with AI | require supervisor review |
| Same team incentive | sales approves credit exception | route to independent credit |
| Same model owner | model team validates own release | model risk challenge |
| Same vendor | vendor investigates its own outage | internal acceptance and independent sample |
| Same queue pressure | reviewers measured only on throughput | add quality and challenge metrics |
| Same customer relationship | relationship manager approves customer complaint remedy | independent complaint owner |
| Same entitlement admin | platform engineer grants own prod scope | IAM dual control |
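A few of the signals above (same actor, same case owner, same team, same vendor) can be computed mechanically at routing time. The following is a minimal sketch under stated assumptions: the `Actor` fields, the flag names and the `conflict_flags` helper are hypothetical and not part of any specific product. Incentive and relationship conflicts usually need richer data than this.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Actor:
    actor_id: str
    team: str
    vendor: Optional[str] = None  # None for internal actors (assumption)


def conflict_flags(maker: Actor, checker: Actor, case_owner_id: str) -> List[str]:
    """Return the conflict signals that apply to a proposed maker/checker pairing."""
    flags: List[str] = []
    if maker.actor_id == checker.actor_id:
        flags.append("same_actor")
    if checker.actor_id == case_owner_id:
        flags.append("same_case_owner")
    if maker.team == checker.team:
        flags.append("same_team")
    if maker.vendor is not None and maker.vendor == checker.vendor:
        flags.append("same_vendor")
    return flags
```

A non-empty result would typically feed the policy engine as conflict flags, which then blocks the pairing or routes the case to a different checker.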
5. Dual-Control Patterns
5.1 Pattern A: Classic Maker-Checker
Use when one role prepares and another role verifies before workflow advances.
AI or human maker
-> evidence packet
-> independent checker workbench
-> accept / edit / reject / escalate
-> evidence ledger
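Suitable for:
- Customer-visible reply drafts.
- Credit memo drafts.
- AML narrative drafts.
- KYC evidence summaries.
Design requirements:
- The checker sees the original evidence, not only the AI summary.
- The checker action does not default to approve.
- The checker reason code is written to the log.
- The maker cannot modify the checker decision.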
5.2 Pattern B: Four-Eyes Approval
Use when one human approval is insufficient but full committee is too slow.
maker proposal
-> first approver
-> second approver with independent authority
-> single-use approval token
-> tool execution
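Suitable for:
- High-value provisional credit.
- Account freeze or release.
- High-impact complaint compensation.
- High-risk KYC status changes.
Design requirements:
- The second approver is not part of the same conflict chain.
- Both approvals bind to the same action hash.
- If either approval expires, approval restarts.
- Disagreement goes to adjudication.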
5.3 Pattern C: Dual Authorization For Tools
Use when tool action has direct side effect.
AI proposes tool call
-> policy engine marks dual_control_required
-> approver A validates business reason
-> approver B validates risk / compliance boundary
-> tool gateway verifies both approval IDs
-> execute exact request hash
Suitable for:
- funds movement.
- adverse customer notice.
- SAR-related submission package.
- customer data export.
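The final gateway step of Pattern C, verifying both approvals against the exact request, might look like the sketch below. The record shape (dicts with `approver_id` and `action_hash`) and the function name `gateway_allows` are assumptions for illustration. A production gateway would also check expiry, single use and approver authority.

```python
from typing import Dict, List


def gateway_allows(request_hash: str, maker_id: str,
                   approvals: List[Dict[str, str]]) -> bool:
    """Allow execution only with two distinct approvers bound to this exact request."""
    if len(approvals) != 2:
        return False
    approver_ids = {a["approver_id"] for a in approvals}
    # Two accountable actors are required, and neither may be the maker.
    if len(approver_ids) != 2 or maker_id in approver_ids:
        return False
    # Both approvals must reference the hash of the request being executed.
    return all(a["action_hash"] == request_hash for a in approvals)
```

Comparing against the request hash rather than an approval flag means that changing the amount or target after approval invalidates both approvals.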
5.4 Pattern D: Independent Challenge
Use when the issue is judgment, not only approval.
AI recommendation
-> checker reviews evidence without relying on AI conclusion
-> challenge questions answered
-> approve, reject, modify or escalate
Challenge questions:
- Which evidence supports the conclusion.
- Which evidence conflicts with it.
- Which policy or rule applies.
- Whether there is customer harm.
- Whether evidence is missing.
- Whether there is a conflict of interest.
5.5 Pattern E: Post-Action Independent QA
Use when bounded automation is allowed but still needs assurance.
bounded auto action
-> sampling strategy
-> independent QA
-> defect classification
-> control tuning
-> management report
Suitable for:
- Low-risk automated classification.
- Standard internal case notes.
- Low-impact routing.
- RAG answer sample QA.
5.6 Pattern F: Break-Glass Dual Control
Use when emergency access or override could cause large harm.
incident request
-> incident commander approval
-> security / risk approval
-> time-boxed emergency entitlement
-> enhanced logging
-> post-incident review
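Principles:
- By default, an AI agent cannot trigger break-glass on its own.
- Break-glass must not become a long-term admin backdoor.
- An evidence review must be completed before normal entitlements are restored.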
6. Workflow Controls
6.1 State Machine
Production workflows should model duty states explicitly.
draft_created
-> maker_submitted
-> checker_assigned
-> checker_decided
-> approval_pending
-> dual_approval_complete
-> execution_ready
-> executed
-> qa_sampled
-> closed
Forbidden state jumps:
| From | Forbidden direct jump | Reason |
|---|---|---|
| draft_created | executed | no checker |
| maker_submitted | dual_approval_complete | no checker decision |
| checker_decided | executed | missing approval for side effect |
| approval_pending | executed | approval incomplete |
| executed | evidence_deleted | audit preservation |
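One way to enforce the forbidden jumps is an explicit allow-list of transitions, so any jump not listed is rejected. The sketch below assumes a strictly linear path through the states listed in 6.1. Real workflows would add rejection, escalation and rework branches, and the `advance` helper is hypothetical.

```python
# Allowed next states per current state; anything absent is a forbidden jump.
ALLOWED_TRANSITIONS = {
    "draft_created": {"maker_submitted"},
    "maker_submitted": {"checker_assigned"},
    "checker_assigned": {"checker_decided"},
    "checker_decided": {"approval_pending"},
    "approval_pending": {"dual_approval_complete"},
    "dual_approval_complete": {"execution_ready"},
    "execution_ready": {"executed"},
    "executed": {"qa_sampled"},
    "qa_sampled": {"closed"},
    "closed": set(),
}


def advance(current: str, target: str) -> str:
    """Return the new state, or raise if the transition skips a required gate."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"forbidden transition: {current} -> {target}")
    return target
```

Because the table is an allow-list, a jump such as `draft_created -> executed` fails without needing its own deny rule.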
6.2 Policy Inputs
The SoD policy engine uses at least:
- actor id.
- actor role.
- actor team.
- agent id and version.
- workflow run id.
- case owner.
- prior maker id.
- prior checker id.
- requested action.
- risk tier.
- amount.
- customer impact.
- regulatory sensitivity.
- approval state.
- entitlement scope.
- conflict flags.
6.3 Policy Outputs
| Decision | Meaning |
|---|---|
| allow | action may proceed |
| deny | action violates SoD or entitlement |
| require_checker | independent checker must review |
| require_dual_approval | two approvals required |
| require_specialist | domain specialist required |
| require_compliance | compliance / legal / risk review required |
| require_blind_review | checker initially cannot see AI recommendation |
| route_to_audit_sample | action can proceed but enters QA / audit sample |
| quarantine | workflow or agent is stopped pending investigation |
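To show how a subset of the policy inputs can map to these decisions, here is a minimal sketch. The `SodRequest` fields, the rule order and the `DUAL_APPROVAL_AMOUNT` threshold are assumptions chosen for illustration. Only four of the nine decisions are covered.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

DUAL_APPROVAL_AMOUNT = 100.00  # hypothetical threshold, not taken from any policy


@dataclass(frozen=True)
class SodRequest:
    actor_id: str
    prior_maker_id: str
    requested_action: str            # e.g. "check", "approve", "execute"
    risk_tier: str                   # "low" or "high" in this sketch
    amount: float = 0.0
    prior_checker_id: Optional[str] = None
    approver_ids: Tuple[str, ...] = ()


def decide(req: SodRequest) -> str:
    # A maker may not check or approve their own work.
    if req.requested_action in ("check", "approve") and req.actor_id == req.prior_maker_id:
        return "deny"
    if req.requested_action == "execute":
        # Side effects need a completed checker decision first.
        if req.prior_checker_id is None:
            return "require_checker"
        needs_dual = req.risk_tier == "high" or req.amount > DUAL_APPROVAL_AMOUNT
        if needs_dual and len(set(req.approver_ids)) < 2:
            return "require_dual_approval"
    return "allow"
```

The decision depends on who acted earlier in the run (`prior_maker_id`, `prior_checker_id`), which is the part a scope-only authorization check cannot express.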
6.4 Approval Binding
Approval must bind exact action, not just intent.
| Field | Example |
|---|---|
| approval_id | appr_20260630_8841 |
| action_type | payments.provisional_credit.execute |
| business_object | dispute_case_7788 |
| amount | USD 250.00 |
| customer_ref | hashed customer reference |
| maker_id | dispute_ai_v4 |
| checker_id | user_451 |
| approver_id | supervisor_092 |
| policy_version | sod-payments-2026.06.3 |
| action_hash | hash of operation, amount, case, customer, tool params |
| expiry | 15 minutes or one execution |
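The binding above can be sketched with a SHA-256 hash over a canonical JSON encoding of the action, plus a single-use, expiring approval record. This is an illustration under assumptions: the field layout, the `Approval` class and the `consume` helper are hypothetical, and the amount is passed as a string to avoid float formatting differences.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Any, Dict


def action_hash(action_type: str, business_object: str, amount: str,
                customer_ref: str, tool_params: Dict[str, Any]) -> str:
    """Hash the exact action; any change to a parameter changes the hash."""
    payload = {
        "action_type": action_type,
        "business_object": business_object,
        "amount": amount,
        "customer_ref": customer_ref,
        "tool_params": tool_params,
    }
    # Sorted keys and fixed separators keep the encoding stable across callers.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class Approval:
    approval_id: str
    bound_hash: str
    expires_at: float  # epoch seconds
    used: bool = False


def consume(approval: Approval, request_hash: str, now: float) -> bool:
    """Accept the approval once, before expiry, and only for the bound action."""
    if approval.used or now >= approval.expires_at:
        return False
    if not hmac.compare_digest(approval.bound_hash, request_hash):
        return False
    approval.used = True
    return True
```

Marking the approval as used on first success is what prevents replay; a real system would persist that flag atomically rather than in memory.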
6.5 Evidence-First Checker UI
Recommended order:
Case context
-> Proposed duty and action
-> Customer / financial / regulatory impact
-> Original evidence and citations
-> AI recommendation and uncertainty
-> Policy / SoD decision
-> Conflict flags
-> Available actions
-> Required reason code
Anti-patterns:
- approve button appears before evidence.
- AI confidence is shown without source support.
- checker cannot see who made the recommendation.
- checker cannot reject or escalate.
- free-text reason is the only evidence.
- UI hides downstream tool impact.
6.6 Override Controls
Override is not failure by default. It is a controlled business action.
| Override type | Required control |
|---|---|
| AI recommendation override | reason code and evidence reference |
| policy exception | supervisor and risk owner approval |
| customer-impact override | customer impact note and QA sample |
| model-risk override | model owner plus independent challenge |
| emergency override | break-glass dual control and post-review |
| repeated override by same user | concentration alert |
6.7 Rubber-Stamp Detection
Rubber-stamp review is a SoD failure. Signals:
- checker approval time is consistently below realistic evidence review time.
- override rate collapses after productivity target changes.
- reason code is always generic.
- same checker approves same maker repeatedly.
- second approval occurs within seconds for complex cases.
- QA finds missed evidence while checker logs show complete review.
Controls:
- minimum evidence panel interaction for high-risk cases.
- blind review for calibration cases.
- gold case injection.
- reviewer quality dashboard.
- supervisor coaching and certification removal.
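The first signal, approvals faster than a realistic review time, can be turned into a simple rate. The sketch below assumes review records carry `decision`, `risk_tier` and `review_seconds`, and that minimum review times per tier are supplied by the caller. Both the record shape and any thresholds are hypothetical.

```python
from typing import Dict, List


def rubber_stamp_rate(reviews: List[Dict[str, object]],
                      min_seconds_by_tier: Dict[str, float]) -> float:
    """Share of approvals completed faster than the minimum time for their risk tier."""
    approvals = [r for r in reviews if r["decision"] == "approve"]
    if not approvals:
        return 0.0
    fast = [r for r in approvals
            if r["review_seconds"] < min_seconds_by_tier[r["risk_tier"]]]
    return len(fast) / len(approvals)
```

A rate like this is only a screening signal; it should prompt sampling and coaching rather than stand as proof of a failed review.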
7. Entitlement Model
7.1 Duty-Based Entitlements
Do not grant broad workflow permissions. Use duty-specific scopes.
| Scope | Meaning |
|---|---|
| case.evidence.read.assigned | read evidence for assigned case |
| case.summary.draft.create | create draft summary |
| case.recommendation.submit | submit recommendation for review |
| case.check.perform | perform checker review |
| case.approval.first | provide first approval |
| case.approval.second | provide second approval |
| tool.payment.propose | propose payment action |
| tool.payment.execute.approval_bound | execute only with valid approval |
| customer.message.draft | draft message |
| customer.message.send.approval_bound | send approved message |
| override.material.perform | perform material override |
| audit.replay.read | read evidence for audit replay |
7.2 Separation Rules
| Rule | Example |
|---|---|
| submitter cannot check same work | maker id != checker id |
| checker cannot be execution-only service account | checker must be human or approved control service |
| approver cannot be requester for entitlement | access request needs independent manager or IAM approval |
| second approver cannot equal first approver | dual approval requires two accountable actors |
| model validator cannot be sole model builder | independent model risk or QA challenge |
| auditor cannot administer evidence store | protect evidence integrity |
7.3 Agent Identity Requirements
Every agent must have:
- agent_id.
- agent_version.
- business owner.
- technical owner.
- risk owner.
- allowed duty roles.
- prohibited duty roles.
- tool allowlist.
- max autonomy level.
- checker requirements.
- evidence logging profile.
Example:
| Field | Value |
|---|---|
| Agent | Payment Dispute Assistant |
| agent_id | payment-dispute-agent-prod |
| Allowed duty | retrieve, extract, summarize, draft, recommend |
| Prohibited duty | final approve, funds execute, complaint close, audit certify |
| Checker requirement | payment specialist for customer-impacting action |
| Dual control | amount above threshold or fraud / complaint flag |
| Evidence profile | enhanced trace for all recommendations |
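The example profile can be expressed as data that a gateway checks before any duty is performed. This is a minimal sketch: the `AgentProfile` class and the snake_case duty names are assumptions derived from the example values, and the deny-by-default rule is a design choice rather than something the example states.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class AgentProfile:
    agent_id: str
    allowed_duties: FrozenSet[str]
    prohibited_duties: FrozenSet[str]

    def may_perform(self, duty: str) -> bool:
        # Prohibition wins over allowance; anything unlisted is denied by default.
        return duty in self.allowed_duties and duty not in self.prohibited_duties


PAYMENT_DISPUTE_AGENT = AgentProfile(
    agent_id="payment-dispute-agent-prod",
    allowed_duties=frozenset({"retrieve", "extract", "summarize", "draft", "recommend"}),
    prohibited_duties=frozenset(
        {"final_approve", "funds_execute", "complaint_close", "audit_certify"}
    ),
)
```

Keeping an explicit prohibited list alongside the allowlist makes access reviews easier, since a later change that adds a prohibited duty to the allowlist still has no effect.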
7.4 Service Account Boundary
Service accounts execute technical calls but do not own business approval. Tool logs must include:
- service principal.
- human initiator.
- agent id.
- workflow run.
- maker id.
- checker id.
- approval id.
- action hash.
- policy decision.
- execution result.
7.5 Access Review
Monthly or event-driven review should cover:
- stale checker roles.
- agents with prohibited duty scopes.
- service accounts with execute permission but no approval-bound constraint.
- repeated emergency overrides.
- tool scopes unused for 90 days.
- vendors with broad evidence access.
- audit users with write access to evidence store.
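Two of these checks lend themselves to automation: tool scopes unused for 90 days, and execute scopes that lack an approval-bound constraint. The sketch below assumes scope names follow the dotted convention from 7.1 and that a last-used date is available per scope. The helper names are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, Iterable, List

STALE_AFTER = timedelta(days=90)


def stale_scopes(last_used: Dict[str, date], today: date) -> List[str]:
    """Scopes whose last recorded use is 90 or more days before today."""
    return sorted(s for s, used in last_used.items() if today - used >= STALE_AFTER)


def unbound_execute_scopes(scopes: Iterable[str]) -> List[str]:
    """Execute scopes that do not end in the approval_bound constraint."""
    flagged = []
    for scope in scopes:
        parts = scope.split(".")
        if "execute" in parts and parts[-1] != "approval_bound":
            flagged.append(scope)
    return sorted(flagged)
```

Both functions return sorted lists so that review output is stable from one run to the next.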
8. Maker-Checker Templates
8.1 Maker-Checker Requirement Pattern
For payment dispute provisional credit above the frontline threshold,
when AI or a frontline analyst recommends a credit,
the system must route the recommendation and evidence packet to a certified payment dispute checker
who is not the maker, not the case creator, and not in a conflicted incentive role,
before any payment tool can execute,
capturing maker id, checker id, evidence viewed, decision, reason code, approval id, action hash and timestamp.
8.2 Filled Design Brief
| Field | Filled example |
|---|---|
| Use case | Card dispute provisional credit |
| Maker | AI dispute assistant or frontline analyst |
| Checker | Certified payment dispute specialist |
| Incompatible duty | maker cannot approve or execute own proposal |
| Dual control trigger | amount above threshold, fraud signal, complaint signal |
| Evidence | transaction, merchant, customer claim, rule deadline, prior disputes |
| Allowed checker actions | approve, reject, edit amount, request evidence, escalate |
| Override owner | payment supervisor |
| Tool gate | payment execution requires approval-bound token |
| Audit fields | maker, checker, approver, action hash, policy version, result |
8.3 Checker Workbench Acceptance Criteria
- Checker can see maker identity and AI version.
- Checker can see original sources and policy effective date.
- Checker can see customer and financial impact.
- Checker can reject without manager workaround.
- Checker can request evidence without approving.
- Checker can escalate to specialist queue.
- Checker decision requires structured reason for high-risk cases.
- Tool execution is blocked until checker state is complete.
8.4 Dual Authorization Card
| Field | Example |
|---|---|
| Action | execute provisional credit |
| First approver | payment specialist |
| Second approver | payment operations supervisor |
| Independence rule | different user and different approval level |
| Expiry | 15 minutes |
| Replay prevention | single-use approval token |
| Parameter binding | action hash includes amount, case, customer, tool |
| Disagreement path | senior adjudication |
| Evidence retention | governed payment evidence ledger |
8.5 Independent Challenge Checklist
- Reviewer identified at least one supporting evidence item.
- Reviewer identified conflicting or missing evidence.
- Reviewer confirmed source version and effective date.
- Reviewer checked whether AI used prohibited factor or stale source.
- Reviewer confirmed customer impact and reversibility.
- Reviewer confirmed no role conflict or self-approval.
- Reviewer selected reason code tied to evidence.
- Reviewer documented escalation or override rationale.
8.6 Audit Evidence Schema
| Field group | Fields |
|---|---|
| Identity | trace_id, run_id, case_id, customer_hash, tenant_id |
| AI config | agent_id, agent_version, model_id, prompt_version, tool_manifest |
| Duty chain | maker_id, checker_id, approver_1, approver_2, executor |
| Policy | sod_policy_version, decision, reason, conflict_flags |
| Evidence | source_ids, source_versions, citation_refs, missing_evidence_flags |
| Approval | approval_id, action_hash, approval_expiry, approval_result |
| Execution | tool_name, operation, params_hash, outcome, rollback_ref |
| Override | override_flag, override_owner, override_reason, second_review |
| Timing | created_at, reviewed_at, approved_at, executed_at |
| Governance | retention_class, audit_sample_flag, incident_id |
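A replay check over one evidence record can confirm that key duty-chain fields are present and that the timing fields are in order. The sketch below is an assumption-laden illustration: it treats the record as a flat dict, uses a small subset of the schema as required fields, and assumes timestamps are ISO 8601 strings in one consistent format so that string order matches time order.

```python
from typing import Dict, List

REQUIRED_FIELDS = ("maker_id", "checker_id", "approval_id", "action_hash",
                   "sod_policy_version")
TIMING_FIELDS = ("created_at", "reviewed_at", "approved_at", "executed_at")


def replay_gaps(record: Dict[str, str]) -> List[str]:
    """List what prevents this record from being replayed end to end."""
    gaps = [f"missing:{name}" for name in REQUIRED_FIELDS if not record.get(name)]
    stamps = [record.get(name) for name in TIMING_FIELDS]
    if any(s is None for s in stamps):
        gaps.append("missing:timing")
    elif stamps != sorted(stamps):
        # Review, approval and execution must follow creation in that order.
        gaps.append("timing_out_of_order")
    return gaps
```

An empty result means the record passes these basic checks; it does not show that the underlying evidence is correct.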
9. Dashboards And KRIs
9.1 SoD Control Dashboard
| Metric | Purpose |
|---|---|
| SoD violation attempts | detect blocked self-approval and role conflict |
| same-maker-checker attempts | identify workflow design weakness |
| dual approval completion rate | measure high-impact action control |
| approval expiry rate | detect queue friction or stale approvals |
| action hash mismatch | detect approval replay or parameter mutation |
| approval-bound execution coverage | ensure tools cannot bypass approval |
| checker SLA by risk tier | monitor control capacity |
| evidence completeness | prove checker had enough basis |
9.2 Quality And Independence Dashboard
| Metric | Purpose |
|---|---|
| rubber-stamp rate | detect formal review without challenge |
| median review time by complexity | detect unrealistic review behavior |
| override concentration by user | detect conflict or training issue |
| maker-checker pair concentration | detect collusion or routing weakness |
| adjudication overturn rate | detect checker quality gap |
| gold case pass rate | certify independent challenge |
| QA defect rate by duty | locate weak control points |
| audit replay success | prove evidence chain completeness |
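Maker-checker pair concentration can be approximated as the share of reviews handled by the single most frequent pair. The sketch below assumes each review is reduced to a `(maker_id, checker_id)` tuple. The function name and the choice of "top pair share" as the measure are assumptions.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

Pair = Tuple[str, str]


def top_pair_share(reviews: Iterable[Pair]) -> Tuple[Optional[Pair], float]:
    """Return the most frequent (maker, checker) pair and its share of all reviews."""
    counts = Counter(reviews)
    total = sum(counts.values())
    if total == 0:
        return None, 0.0
    pair, n = counts.most_common(1)[0]
    return pair, n / total
```

A high share may reflect routing design or a small team rather than collusion, so the metric is best read against queue size and assignment rules.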
9.3 Risk KRIs
| KRI | Yellow | Red |
|---|---|---|
| same actor attempted approval | isolated attempts | repeated attempts or production bypass |
| action hash mismatch | one blocked mismatch | any executed mismatch |
| high-risk action without dual approval | blocked by gateway | executed or evidence missing |
| override concentration | above baseline | single actor dominates material overrides |
| checker SLA breach | forecast breach | active breach for P1/P0 |
| evidence completeness | minor missing metadata | cannot replay case |
| vendor self-review reliance | vendor report used with internal sample | no internal acceptance evidence |
| approval time too short | below expected band | high-risk approvals within seconds |
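Some rows of the KRI table reduce to a rule on two counts: events blocked by the gateway and events that actually executed. A minimal sketch for the action hash mismatch row follows. The "green" status and the treatment of more than one blocked mismatch as yellow are assumptions, since the table defines only yellow and red.

```python
def hash_mismatch_kri(blocked: int, executed: int) -> str:
    """Classify the action hash mismatch KRI from blocked and executed counts."""
    if executed > 0:
        return "red"      # any executed mismatch
    if blocked > 0:
        return "yellow"   # mismatch occurred but the gateway stopped it
    return "green"        # assumption: no mismatches observed
```

The same shape fits the "high-risk action without dual approval" row, where blocked attempts are yellow and any execution is red.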
9.4 Executive View
Executives should see:
- high-risk actions attempted.
- high-risk actions executed.
- percentage with complete maker-checker evidence.
- dual-control exception count.
- material overrides and owners.
- control breaches and customer impact.
- reviewer capacity and SLA.
- audit replay success.
- open CAPA from SoD failures.
10. RACI
| Activity | Business Owner | AI PM | AI BA | Architect | IAM / Security | Risk / Compliance | Model Risk | Operations | Internal Audit |
|---|---|---|---|---|---|---|---|---|---|
| Define duty inventory | A | R | R | C | C | C | C | C | I |
| Define incompatible duties | A | R | R | C | C | A/R | C | C | I |
| Design workflow gates | C | R | R | A/R | C | C | C | C | I |
| Design entitlement scopes | C | R | C | R | A/R | C | C | C | I |
| Approve dual-control policy | A | C | C | C | C | A/R | C | R | I |
| Operate checker queues | C | C | C | C | I | C | I | A/R | I |
| Validate model independence | C | C | C | C | C | C | A/R | C | I |
| Monitor SoD dashboard | C | R | C | R | R | R | C | A/R | I |
| Investigate SoD breach | A/R | R | C | R | A/R | A/R | C | R | C |
| Audit replay | I | C | C | C | C | C | C | C | A/R |
R = Responsible, A = Accountable, C = Consulted, I = Informed.
10.1 Three Lines View
| Line | SoD responsibility |
|---|---|
| First line | Operate workflow, maintain maker-checker routing, own business outcomes |
| Second line | Define risk/control standards, challenge exceptions, review metrics |
| Third line | Independently test evidence, control design and operating effectiveness |
10.2 Governance Cadence
| Review | Frequency | Output |
|---|---|---|
| Operations review | weekly | queue SLA, violation attempts, evidence gaps |
| Risk/control review | monthly | KRI trend, override concentration, exception acceptance |
| Access recertification | monthly or on change | duty scopes, stale roles, service account constraints |
| Model/control challenge | per release and periodic | independence assessment, eval and validation evidence |
| Audit replay | quarterly or risk-based | replay results, findings, CAPA |
| Management review | quarterly | residual risk, investment, control maturity |
11. Financial Retail Examples
11.1 Payments: Card Dispute Provisional Credit
Business goal: Speed up dispute handling while preventing unauthorized credits, wrong denials and weak customer evidence.
Duty design:
| Step | AI role | Human / control role | SoD |
|---|---|---|---|
| Intake summary | extract and summarize | frontline reviews for obvious gaps | AI is maker only |
| Rule deadline check | propose based on policy | rules service returns deterministic result | AI not final authority |
| Credit recommendation | recommend amount | payment specialist checker | maker-checker |
| High amount approval | prepare packet | supervisor second approval | dual authorization |
| Execute credit | no direct execute | payment gateway executes approval-bound action | tool PEP |
| Customer update | draft message | complaint-aware reviewer sends | communication control |
KRIs:
- provisional credit wrong amount.
- action executed without dual approval.
- same-maker-checker block.
- complaint escalation after denial.
- evidence missing transaction or rule deadline.
11.2 AML: Alert Triage And Case Closure
Business goal: Improve investigation quality without letting AI close suspicious activity prematurely.
Duty design:
| Step | AI role | Human / control role | SoD |
|---|---|---|---|
| Transaction pattern summary | summarize | AML analyst validates evidence | maker-checker |
| Red flag recommendation | recommend | analyst challenges against typology | independent challenge |
| SAR narrative draft | draft | senior AML reviewer | pre-submission review |
| Alert closure | prohibited | senior authority approves closure | AI cannot close |
| QA | none | independent QA samples closures | operator != tester |
Controls:
- AI cannot be sole basis for alert closure.
- High-risk typology requires senior review.
- SAR-sensitive content uses restricted evidence handling.
- QA samples include cases where AI recommended no escalation.
11.3 Lending: Credit Underwriting Copilot
Business goal: Improve memo quality and policy citation without collapsing underwriting, approval and fair lending review.
Duty design:
| Step | AI role | Human / control role | SoD |
|---|---|---|---|
| Application completeness | extract missing items | underwriter verifies | evidence sufficiency separate |
| Policy explanation | cite policy | decision service applies rules | LLM not final decision |
| Credit memo | draft | underwriter owns decision | AI maker only |
| Exception approval | prepare rationale | independent credit authority | conflict control |
| Adverse action reason | draft wording from approved code | compliance / approved reason service | reason source controlled |
| Model monitoring | support analytics | model risk validates | builder != validator |
Controls:
- Sales cannot approve its own exception.
- AI cannot generate unsupported adverse action reasons.
- Protected-class and proxy-factor checks require evidence.
- Human decision owner sees AI limitations and source versions.
11.4 Complaint Remediation Agent
Business goal: Draft consistent remediation responses and route complaints without auto-closing regulated complaints.
Duty design:
| Step | AI role | Human / control role | SoD |
|---|---|---|---|
| Complaint classification | suggest class and urgency | complaint specialist checks high risk | risk-based checker |
| Root-cause summary | summarize evidence | operations owner validates | maker-checker |
| Remediation proposal | recommend fee reversal or apology | complaint lead approves | approval-before-action |
| Customer letter | draft | compliance-aware reviewer sends | customer-visible control |
| Closure | no autonomous closure for regulated complaints | complaint owner closes with evidence | closure authority |
| Audit response | prepare binder | audit owner reviews | audit independent |
Controls:
- Legal threat, regulator mention, vulnerability and discrimination signals force escalation.
- Customer remediation amount uses dual approval above threshold.
- Closure requires evidence checklist and reviewer signoff.
- AI-generated root cause does not replace management accountability.
12. 30-Day Lab
Goal: Complete a presentable AI Segregation of Duties / Dual Control architecture portfolio pack within 30 days. Recommended subjects: Payment Dispute Assistant, AML Copilot, Credit Underwriting Copilot or Complaint Remediation Agent.
Week 1: Discovery And Duty Inventory
| Day | Artifact | Task |
|---|---|---|
| 1 | use-case-boundary-card.md | Define customer impact, systems, actors and AI role |
| 2 | duty-inventory.md | List at least 20 duties across retrieve, extract, recommend, approve, execute, audit |
| 3 | actor-taxonomy.md | Identify human, agent, supervisor agent, service account, vendor and audit actors |
| 4 | workflow-state-map.md | Draw states from draft to execution to closure |
| 5 | incompatible-duty-matrix.md | Mark duties that cannot be combined |
| 6 | conflict-signal-table.md | Define same actor, same team, vendor, incentive and model-owner conflicts |
| 7 | control-strength-decision.md | Assign S0-S5 control strength per duty |
Week 2: Control Design
| Day | Artifact | Task |
|---|---|---|
| 8 | maker-checker-flow.md | Design maker-checker route and evidence packet |
| 9 | dual-authorization-card.md | Define high-impact action requiring two approvals |
| 10 | entitlement-scope-catalog.md | Create duty-specific read, draft, submit, approve, execute, audit scopes |
| 11 | sod-policy-rules.md | Write allow, deny, require_checker and require_dual_approval rules |
| 12 | approval-binding-spec.md | Define approval id, action hash, expiry and replay prevention |
| 13 | checker-workbench-spec.md | Specify evidence-first UI and checker actions |
| 14 | override-governance.md | Define override authority, reasons, second review and concentration alert |
Week 3: Evidence, Monitoring And Operations
| Day | Artifact | Task |
|---|---|---|
| 15 | evidence-ledger-schema.md | Define maker, checker, approver, executor, policy, tool and result fields |
| 16 | dashboard-kri-spec.md | Define SoD, quality, independence and executive metrics |
| 17 | rubber-stamp-detection.md | Create signals and controls for weak review |
| 18 | access-review-plan.md | Define monthly access recertification and stale scope checks |
| 19 | incident-runbook.md | Define response to executed action without proper approval |
| 20 | audit-replay-plan.md | Define how audit reconstructs one case end to end |
| 21 | governance-raci.md | Build RACI across business, PM, BA, architecture, IAM, risk, model risk, ops, audit |
Week 4: Case Study And Interview Pack
| Day | Artifact | Task |
|---|---|---|
| 22 | payments-case-study.md | Apply the design to one payment dispute scenario |
| 23 | aml-case-study.md | Apply the design to one AML alert scenario |
| 24 | lending-case-study.md | Apply the design to one credit exception scenario |
| 25 | complaint-case-study.md | Apply the design to one remediation scenario |
| 26 | tabletop-self-approval.md | Run exercise: AI attempts to approve own tool action |
| 27 | tabletop-rubber-stamp.md | Run exercise: checker approvals become too fast |
| 28 | executive-memo.md | Summarize value, risk, controls and residual risk |
| 29 | audit-qa.md | Write audit / regulator questions and evidence answers |
| 30 | interview-story.md | Prepare 30-second, 2-minute and architecture deep-dive answers |
Completion Standard
| Capability | Self-check |
|---|---|
| Duty clarity | Can name maker, checker, approver, executor, override owner and auditor |
| Incompatibility design | Can explain which duties cannot be combined and why |
| Runtime enforcement | SoD rules are enforced by workflow or tool gateway |
| Entitlement separation | Scopes separate read, draft, submit, approve, execute, override and audit |
| Evidence | Audit can replay a case without relying on final narrative only |
| Monitoring | Dashboard detects self-approval, rubber-stamp and override concentration |
| Interview readiness | Can distinguish SoD from generic IAM in two minutes |
13. Interview Answers
Q1: What is the difference between authorization and segregation of duties in AI?
30 seconds:
Authorization asks whether an actor can perform an action. Segregation of duties asks whether the same actor should be allowed to prepare, recommend, approve, execute, override and audit the same outcome. In AI workflows, this matters because an agent can collapse many roles into one automated chain.
2 minutes:
I would start with a duty inventory rather than a permission list. For example, a payment dispute assistant may retrieve evidence, summarize the case, recommend provisional credit, draft a customer message and request a payment tool call. Generic authorization might say the agent has a payment scope. SoD asks which of those duties are incompatible. The agent can be maker for evidence and recommendation, but a certified payment specialist must check the recommendation, a supervisor may need to give second approval above a threshold, and the tool gateway must execute only an approval-bound action hash. The evidence ledger records maker, checker, approver, executor, policy version and result.
Q2: How would you design maker-checker for an AI copilot?
30 seconds:
I separate the AI maker role from the checker role. The AI can produce a recommendation and evidence packet. The checker must see original evidence, policy version, customer impact and AI uncertainty, then approve, edit, reject or escalate with a reason code.
2 minutes:
I would define the review unit first: claim, recommendation, draft, tool action or case closure. Then I would route it based on risk tier, customer impact and conflict rules. The checker cannot be the maker, cannot be in a conflicted role and must have authority to challenge. The UI is evidence-first, not approve-first. For side-effect actions, approval is bound to an action hash and the tool gateway blocks execution unless the approval state, scope and workflow state match.
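The action-hash binding described in this answer can be illustrated with a short sketch. It assumes a SHA-256 hash over a canonical JSON encoding of the tool name and parameters, and an in-memory store; `ApprovalStore`, `action_hash`, the decision strings and the 15-minute default expiry are hypothetical choices for illustration, not a prescribed design.

```python
import hashlib
import json
import time


def action_hash(tool: str, params: dict) -> str:
    """Hash a canonical encoding so any parameter change yields a new hash."""
    canonical = json.dumps({"tool": tool, "params": params},
                           sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ApprovalStore:
    """In-memory stand-in for an approval service (illustrative only)."""

    def __init__(self):
        self._approvals = {}

    def approve(self, approval_id, maker_id, checker_id, tool, params,
                ttl_seconds=900, now=None):
        if maker_id == checker_id:
            raise PermissionError("maker cannot approve own work")
        now = time.time() if now is None else now
        self._approvals[approval_id] = {
            "hash": action_hash(tool, params),
            "expires_at": now + ttl_seconds,
            "used": False,
        }

    def authorize_execution(self, approval_id, tool, params, now=None) -> str:
        """Gateway-side check: approval must exist, be fresh, unused and match."""
        now = time.time() if now is None else now
        rec = self._approvals.get(approval_id)
        if rec is None:
            return "deny:unknown_approval"
        if rec["used"]:
            return "deny:replay"
        if now > rec["expires_at"]:
            return "deny:expired"
        if rec["hash"] != action_hash(tool, params):
            return "deny:action_hash_mismatch"
        rec["used"] = True  # single use: a second call is treated as replay
        return "allow"
```

In this sketch the approval is consumed only on a successful match, so a mismatched attempt does not burn a legitimate approval; whether that is the right behavior is a policy choice.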
Q3: Can a supervisor agent act as checker?
30 seconds:
It can support low-risk consistency checks, but for high-impact financial, compliance or customer-facing actions I would not treat another LLM call as independent control by itself.
2 minutes:
A supervisor agent may catch formatting errors, missing fields, policy conflicts or tool schema violations. But independence depends on role, model, data, owner and authority. If the supervisor agent uses the same model, same prompt family, same evidence summary and same team owner, its independence is weak. For AML, lending, payments and complaints, I would combine supervisor agent checks with external policy gates, human specialist review, dual authorization and audit sampling.
Q4: What are common incompatible duties in AI workflows?
The common ones are: recommendation and approval, evidence extraction and evidence sufficiency certification, draft and final customer send, refund proposal and funds release, AML risk summary and alert closure, credit memo draft and final credit decision, model build and model validation, control operation and control testing, vendor incident reporting and internal acceptance.
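A few of the pairs listed in this answer can be expressed as data and checked against a per-case duty log. This is a sketch under assumptions: the duty identifiers, the `INCOMPATIBLE` set and the `find_conflicts` function are hypothetical names, and only four of the pairs are encoded.

```python
# Hypothetical duty identifiers; each frozenset is one incompatible pair.
INCOMPATIBLE = {
    frozenset({"recommend", "approve"}),
    frozenset({"draft_customer_message", "send_customer_message"}),
    frozenset({"propose_refund", "release_funds"}),
    frozenset({"build_model", "validate_model"}),
}


def find_conflicts(duty_log):
    """duty_log: list of (actor_id, duty) tuples recorded for one case.

    Returns a sorted list of (actor_id, (duty_a, duty_b)) for every actor
    who performed both duties of an incompatible pair on that case.
    """
    duties_by_actor = {}
    for actor, duty in duty_log:
        duties_by_actor.setdefault(actor, set()).add(duty)

    conflicts = []
    for actor, duties in duties_by_actor.items():
        for pair in INCOMPATIBLE:
            if pair <= duties:  # actor holds both sides of the pair
                conflicts.append((actor, tuple(sorted(pair))))
    return sorted(conflicts)
```

The check looks at what each actor actually did on the case, which is the workflow history that a static role assignment does not capture.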
Q5: How do you prevent rubber-stamp human review?
I would monitor review time, override rate, reason quality, gold case accuracy, maker-checker pair concentration and QA defect rate. The checker workspace must show original evidence, missing evidence, policy version, customer impact and conflict flags. For high-risk cases I may use blind review, delayed reveal, double review and calibration. A near-zero override rate is not automatically good; it may indicate automation bias.
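Three of the signals named in this answer (review time, override rate and gold case accuracy) can be turned into simple per-checker flags. The sketch below is illustrative: the record fields, the thresholds, and the definition of override rate as the share of non-approve decisions are assumptions, and real thresholds would need calibration per queue and risk tier.

```python
from collections import defaultdict
from statistics import median


def rubber_stamp_flags(reviews, min_median_seconds=30.0,
                       min_override_rate=0.02, min_gold_accuracy=0.9):
    """reviews: list of dicts with checker_id, seconds, decision, gold_correct.

    decision is one of approve / edit / reject / escalate (assumed values).
    gold_correct is True/False for seeded gold cases and None otherwise.
    Thresholds are hypothetical defaults, not recommended values.
    """
    by_checker = defaultdict(list)
    for r in reviews:
        by_checker[r["checker_id"]].append(r)

    flags = {}
    for checker, rows in by_checker.items():
        found = []
        if median([r["seconds"] for r in rows]) < min_median_seconds:
            found.append("fast_review")
        # Override rate here means the share of decisions that were not a plain approve.
        override_rate = sum(r["decision"] != "approve" for r in rows) / len(rows)
        if override_rate < min_override_rate:
            found.append("low_override_rate")
        gold = [r["gold_correct"] for r in rows if r.get("gold_correct") is not None]
        if gold and sum(gold) / len(gold) < min_gold_accuracy:
            found.append("low_gold_accuracy")
        flags[checker] = found
    return flags
```

A flag is a prompt for QA sampling or calibration, not proof of weak review; a checker with few reviews can trip these thresholds by chance.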
Q6: What evidence proves SoD worked?
Evidence should include the duty chain, not just the output. I need maker id, checker id, approver ids, agent version, workflow run, source ids, policy version, SoD decision, approval id, action hash, tool execution result, override reason and audit sample flag. Audit should be able to replay the case and confirm that the actor who made the recommendation did not approve or execute it improperly.
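As a rough illustration of replayable evidence, the fields named in this answer can be checked on a single ledger event. The field names, `REQUIRED_FIELDS` and `verify_duty_chain` are hypothetical; override reason and audit sample flag are treated as optional here because they do not apply to every case.

```python
# Hypothetical field names for one evidence ledger event.
REQUIRED_FIELDS = (
    "maker_id", "checker_id", "approver_ids", "agent_version",
    "workflow_run_id", "source_ids", "policy_version", "sod_decision",
    "approval_id", "action_hash", "tool_result",
)


def verify_duty_chain(event: dict) -> list:
    """Return a list of findings; an empty list means no issue was detected."""
    findings = [f"missing:{name}" for name in REQUIRED_FIELDS if not event.get(name)]

    maker = event.get("maker_id")
    approvers = event.get("approver_ids") or []
    if maker and maker == event.get("checker_id"):
        findings.append("maker_is_checker")
    if maker and maker in approvers:
        findings.append("maker_is_approver")
    if len(set(approvers)) != len(approvers):
        findings.append("duplicate_approver")
    return findings
```

Running a check like this over sampled cases gives audit a structured result to compare against the final narrative.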
Q7: How does SoD apply to model risk management?
Model teams can build, tune and operate models, but independent validation or challenge should test model use, assumptions, limitations, monitoring and control effectiveness. For GenAI and agentic AI, that independence must be combined with operational controls: tool gateways, approval workflows, identity claims, evidence logging and consumer compliance checks. SR 26-2 superseded SR 11-7 and SR 21-8, but GenAI and agentic AI need additional workflow and authorization controls beyond model risk alone.
Q8: How would you explain dual control to a product executive?
Dual control is intentional friction for high-impact actions. It protects the business from one person, one agent or one workflow chain making and executing a consequential decision alone. We still use AI to prepare evidence and reduce handling time, but funds movement, account restriction, adverse customer action and regulatory-sensitive closure require independent approval and evidence. The goal is controlled automation, not unmanaged speed.
Q9: What is the architecture pattern?
The pattern is workflow orchestrator plus SoD policy decision point plus checker workbench plus approval service plus tool gateway plus evidence ledger. The orchestrator tracks duty state. The policy engine decides whether a checker, dual approval or specialist is required. The approval service binds decision to action hash. The tool gateway enforces approval-bound execution. The evidence ledger keeps the duty chain for audit replay.
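The policy decision point in this pattern can be sketched as a pure function over a request that carries the duty chain so far. The request fields, the risk tiers and the rule that high-risk execution needs two distinct approvers are assumptions for illustration; `sod_decision` is a hypothetical name.

```python
def sod_decision(request: dict) -> str:
    """Return allow / deny / require_checker / require_dual_approval.

    request fields (all hypothetical): actor_id, maker_id, duty,
    risk_tier ("high" or other), approver_ids (list, may be absent).
    """
    actor = request["actor_id"]
    maker = request["maker_id"]
    duty = request["duty"]

    # A maker may not approve or execute their own work.
    if duty in {"approve", "execute"} and actor == maker:
        return "deny"

    if duty == "execute":
        # Count distinct approvers, ignoring the maker and repeated ids.
        approvers = set(request.get("approver_ids", [])) - {maker}
        needed = 2 if request.get("risk_tier") == "high" else 1
        if len(approvers) < needed:
            return "require_dual_approval" if needed == 2 else "require_checker"

    return "allow"
```

Because the function depends only on its input, the same request recorded in the evidence ledger can be replayed later to confirm the decision that was made at run time.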
Q10: What would you put in acceptance criteria?
I would require that no high-risk action executes without complete duty chain evidence, no maker can approve own work, no dual-control action can use the same approver twice, no approval can execute with changed parameters, and audit can replay the maker-checker-approver-executor chain. I would also require dashboards for SoD violations, rubber-stamp review, override concentration, action hash mismatch and evidence completeness.
14. Common Pitfalls
| Pitfall | Why it fails | Better design |
|---|---|---|
| Treating SoD as RBAC only | RBAC cannot see workflow history or maker-checker conflict | duty-aware policy with workflow state |
| Letting AI self-check | same model chain may share blind spots | independent evidence, rules, human or QA challenge |
| Broad agent scope | read, draft, approve and execute become one bundle | duty-specific scopes and tool gateway |
| Approval not bound to parameters | agent can change action after approval | action hash and single-use approval token |
| Human review without authority | reviewer cannot reject or stop route | checker actions include reject, request evidence, escalate, stop |
| Throughput-only KPI | reviewer becomes rubber stamp | balance speed with quality, override and gold case metrics |
| Vendor self-certification | supplier evidence may be incomplete | internal acceptance, contract evidence, audit sampling |
| Audit narrative only | cannot prove control operation | structured event schema and replayable evidence |
| Ignoring service accounts | downstream only sees technical principal | actor chain claims and approval reference |
| No exception governance | overrides become hidden policy | override owner, reason code, second review, concentration alert |
15. Final Operating Principle
AI SoD is not bureaucracy for its own sake. It is the control system that lets an organization safely use AI speed without concentrating incompatible authority. For senior AI PM / BA / Architect roles, the practical skill is to turn this principle into workflow states, duty-specific entitlements, maker-checker queues, dual-control gates, action-bound approvals, KRI dashboards and replayable audit evidence.