
AI AML Alert Triage / Investigation Workbench Playbook

This document is material for learning, portfolio work, architecture training and internal governance discussion.

935AI_AML_ALERT_TRIAGE_INVESTIGATION_WORKBENCH_PLAYBOOK.md

AI AML Alert Triage / Investigation Workbench Architecture Playbook

Positioning: For senior AI PMs, Senior BAs, Product Architects, AML Technology Architects, Financial Crime Transformation Leads and Model Risk Partners. The aim is to upgrade AML alert triage and case investigation from an "alert handling tool" into an evidence-first, human-owned, model-risk-controlled, audit-ready AI workbench system.

Scope: transaction monitoring alert queue, manual referral, fraud-to-AML referral, sanctions context handoff, CDD/EDD context refresh, entity resolution, graph investigation, case assembly, analyst copilot, SAR consideration packet, QA sampling, scenario tuning feedback, model risk validation, audit replay.

Core outputs: reference architecture, capability map, queue prioritization policy, evidence workspace schema, copilot contract, disposition guardrail, SAR draft control, QA rubric, feedback taxonomy, model risk checklist, audit event schema, SoD matrix, metrics dashboard and a 30/60/90 implementation roadmap.


0. Disclaimer

This document is material for learning, portfolio work, architecture training and internal governance discussion.

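This document is not legal advice, a compliance conclusion, a SAR filing decision, a suspicious activity determination, a sanctions disposition, a model validation report, an audit report or a regulatory interpretation.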

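A formal project must be confirmed by Legal, BSA/AML Compliance, Sanctions, Fraud, Risk, Model Risk, Privacy, Security, Operations, Internal Audit, the Business Owner and management, taking into account institution type, regulatory relationships, jurisdiction, products, customers, channels, data, regulatory commitments and internal policy.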

Key boundaries:

  • AI may assist with alert prioritization, evidence assembly, case summarization, graph explanation, gap detection, draft case notes, QA pre-check and tuning insight.
  • AI should not replace the AML/BSA owner's suspicious activity determination or SAR filing decision.
  • AI should not automatically file a SAR, close a high-risk case, clear a sanctions hit, notify a customer, contact law enforcement or change account restrictions.
  • AI should not turn a red flag into a conclusion of criminal conduct.
  • AI should not leak SAR-sensitive information, NSL-sensitive information, restricted investigation content or customer data beyond need-to-know.

1. Executive Framing

Low-maturity goals for AML AI are usually:

reduce false positives
summarize cases faster
draft SAR narratives
automate analyst work

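Each of these goals has value on its own, but they are not enough to support production-grade financial crime controls. The mature goal should be: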

risk-adjusted investigation capacity
  + evidence completeness
  + human decision ownership
  + queue timeliness
  + typology/scenario coverage awareness
  + SAR quality support
  + QA and tuning feedback
  + model risk governance
  + audit replay

In one sentence:

AML investigation AI is an evidence and workflow control system before it is a language model productivity feature.

1.1 Product Principles

  1. Queue priority must be explainable, route-aware and SLA-aware.
  2. Entity graph must carry confidence, source and access class.
  3. Evidence workspace is the product core; copilot is an assistant attached to evidence.
  4. Disposition recommendation must preserve human ownership.
  5. SAR draft assistance must be evidence-cited and filing-gated.
  6. QA findings must feed training, rules, data, prompt, retrieval and staffing decisions through controlled routes.
  7. Tuning must be coverage-safe, not merely alert-volume-reducing.
  8. Every high-impact output must be replayable across evidence, model version, prompt version, user action and approval chain.

1.2 Strong Questions

| Question | Strong answer |
| --- | --- |
| Why is an alert first in queue? | priority band, top drivers, SLA, route, evidence and model/scenario version |
| What entity does this alert belong to? | entity-resolved subject graph with confidence and source lineage |
| What evidence supports the case summary? | cited evidence cards, source record ids, timestamps and freshness |
| What can AI recommend? | next investigation step, evidence gap, disposition options and draft language, not final SAR decision |
| Who owns the decision? | named human role and case state transition |
| How does QA improve the system? | defect taxonomy, feedback routing, change gate and regression eval |
| How can audit replay the case? | append-only event log with evidence ids, user ids, versions, outputs and approvals |

2. Source Anchors

| Anchor | Official link | How this playbook uses it |
| --- | --- | --- |
| FFIEC BSA/AML Suspicious Activity Reporting | https://bsaaml.ffiec.gov/manual/AssessingComplianceWithBSARegulatoryRequirements/04 | Process anchor for alert identification, managing alerts, SAR decision making, SAR completion and continuing activity monitoring. |
| FFIEC SAR Examination Procedures | https://bsaaml.ffiec.gov/manual/AssessingComplianceWithBSARegulatoryRequirements/04_ep | Evidence anchor for monitoring system review, manual/automated monitoring, independent validation, alert research, CDD/EDD consideration, documented no-file decision, SAR quality and transaction testing. |
| FFIEC Customer Due Diligence | https://bsaaml.ffiec.gov/manual/AssessingComplianceWithBSARegulatoryRequirements/02 | Supports investigation context through customer risk profile, expected activity, ongoing monitoring and beneficial ownership context. |
| FFIEC CDD Examination Procedures | https://bsaaml.ffiec.gov/manual/AssessingComplianceWithBSARegulatoryRequirements/02_ep | Used for customer risk profiling, insufficient information, CDD documentation, OFAC sanctioned parties context and customer information testing. |
| FFIEC BSA/AML Independent Testing | https://bsaaml.ffiec.gov/manual/AssessingTheBSAAMLComplianceProgram/03 | Used for independent testing, IT source completeness/accuracy, SAR process review, training evidence, deficiency tracking and board/senior management reporting. |
| FFIEC Appendix F Red Flags | https://bsaaml.ffiec.gov/manual/Appendices/07 | Language anchor for red flags and additional scrutiny; specific typology coverage uses a separate playbook in the repo. |
| FFIEC Appendix L SAR Quality Guidance | https://bsaaml.ffiec.gov/manual/Appendices/13 | Used for SAR draft quality pre-check and narrative evidence discipline. |
| FinCEN SAR Resources | https://www.fincen.gov/suspicious-activity-reports-sars | Used for SAR resources, BSA E-Filing handoff and official SAR resource navigation. |
| FinCEN BSA Filing Information | https://www.fincen.gov/resources/filing-information | Used for E-Filing boundary, filing operations, test system separation and recordkeeping handoff. |
| FinCEN SAR FAQ | https://www.fincen.gov/resources/frequently-asked-questions-regarding-fincen-suspicious-activity-report-sar | Product boundary for filing instructions location, amended SAR process, BSA ID acknowledgement and save/recordkeeping considerations. |
| OFAC Sanctions List Service | https://ofac.treasury.gov/sanctions-list-service | Used for sanctions list data, list updates, and search and screening evidence reference. |
| OFAC Sanctions List Search Tool | https://ofac.treasury.gov/sanctions-list-search-tool | Analyst-facing list search reference; production automated screening should use the institution's approved sanctions screening infrastructure. |
| OFAC Compliance Framework | https://ofac.treasury.gov/media/16331/download | Boundary supplement for sanctions compliance program context: management commitment, risk assessment, internal controls, testing/auditing and training. |
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Organizes AI risk management, eval, monitoring and remediation around Govern / Map / Measure / Manage. |
| NIST AI RMF Core | https://airc.nist.gov/airmf-resources/airmf/5-sec-core/ | Used to design AI controls as a continuous lifecycle function rather than a pre-launch checklist. |
| NIST GenAI Profile | https://www.nist.gov/publications/artificial-intelligence-risk-management-framework-generative-artificial-intelligence | Used for hallucination, data leakage, prompt injection, overreliance, third-party dependency, information integrity and GenAI eval. |
| ISO/IEC 42001 | https://www.iso.org/standard/81230.html | Uses the AI management system approach to design accountability, operational control, performance evaluation, management review and continual improvement. |

Architecture mapping principle:

official source anchor
  -> control objective
  -> product requirement
  -> workflow state
  -> evidence record
  -> owner and review cadence

3. End-to-End Operating Model

3.1 Target Flow

1. Alert intake
   rules, models, manual referrals, fraud referrals, sanctions context, law-enforcement request flags

2. Context assembly
   entity resolution, CDD/EDD, expected activity, prior cases, transaction timeline, graph context

3. Queue prioritization
   risk band, SLA, route, top drivers, missing evidence, capacity-aware assignment

4. Investigation workspace
   evidence cards, timeline, graph, source documents, copilot, analyst tasks

5. Human disposition
   close, continue monitoring, request more evidence, escalate, refer, SAR consideration packet

6. SAR draft support
   cited draft packet, quality pre-check, human approval path, filing handoff outside AI autonomy

7. QA and oversight
   sample selection, defect scoring, reviewer calibration, issue logging, training feedback

8. Tuning and governance
   scenario tuning, model/prompt/RAG change, data remediation, coverage regression, validation

9. Audit replay
   event stream, version history, evidence ledger, approval trail, management reporting

3.2 State Machine

| State | Entry condition | Allowed AI support | Human/control gate |
| --- | --- | --- | --- |
| alert_received | alert from approved source | normalize, classify source, attach trigger facts | source system contract |
| context_ready | minimum subject and transaction context assembled | entity resolution, CDD snapshot, graph retrieval | entity confidence warning |
| queued | priority and route assigned | priority explanation, SLA assignment | routing policy owner |
| in_review | analyst opens case | evidence summary, timeline, gap checklist | analyst remains reviewer |
| needs_info | required evidence missing | task suggestions, source lookup | task owner and SLA |
| escalated | analyst/supervisor escalates | escalation packet | senior/AML owner assignment |
| sar_consideration | human escalates to SAR consideration | draft packet and QA pre-check | BSA/AML owner decision |
| closed_no_sar | human selects closure | closure rationale draft | closure reason and evidence |
| qa_sampled | risk-based or random QA selection | QA pre-check, defect hints | independent QA reviewer |
| feedback_logged | QA/analyst/model issue captured | taxonomy suggestion | feedback owner and change gate |
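
The state table can be enforced in code as a guarded transition function. The sketch below is illustrative only: the state names come from the table, but the `ALLOWED` edges, the `HUMAN_ONLY` set and the `transition` helper are assumptions, since the table lists states and gates rather than explicit transitions.

```python
from enum import Enum


class CaseState(str, Enum):
    ALERT_RECEIVED = "alert_received"
    CONTEXT_READY = "context_ready"
    QUEUED = "queued"
    IN_REVIEW = "in_review"
    NEEDS_INFO = "needs_info"
    ESCALATED = "escalated"
    SAR_CONSIDERATION = "sar_consideration"
    CLOSED_NO_SAR = "closed_no_sar"
    QA_SAMPLED = "qa_sampled"
    FEEDBACK_LOGGED = "feedback_logged"


S = CaseState

# Assumed edges: the table defines states and gates, not transitions.
ALLOWED = {
    S.ALERT_RECEIVED: {S.CONTEXT_READY},
    S.CONTEXT_READY: {S.QUEUED},
    S.QUEUED: {S.IN_REVIEW},
    S.IN_REVIEW: {S.NEEDS_INFO, S.ESCALATED, S.SAR_CONSIDERATION, S.CLOSED_NO_SAR},
    S.NEEDS_INFO: {S.IN_REVIEW},
    S.ESCALATED: {S.SAR_CONSIDERATION, S.CLOSED_NO_SAR},
    S.SAR_CONSIDERATION: {S.QA_SAMPLED},
    S.CLOSED_NO_SAR: {S.QA_SAMPLED},
    S.QA_SAMPLED: {S.FEEDBACK_LOGGED},
}

# States whose entry condition in the table names a human action.
HUMAN_ONLY = {S.IN_REVIEW, S.ESCALATED, S.SAR_CONSIDERATION, S.CLOSED_NO_SAR}


def transition(current, target, actor_is_human):
    """Return the new state, or raise if the move is not permitted."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"{current.value} -> {target.value} is not an allowed transition")
    if target in HUMAN_ONLY and not actor_is_human:
        raise PermissionError(f"{target.value} requires a human actor")
    return target
```

Keeping the human-only check inside the workflow service, rather than in the copilot, means an AI component cannot move a case into a decision state even if its output is malformed.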

3.3 Non-Goals

| Non-goal | Reason |
| --- | --- |
| Replace transaction monitoring system | Workbench consumes and contextualizes alerts; it should not hide source scenario governance. |
| Replace case management of record | Workbench may integrate or extend CMS, but authoritative case status must be explicit. |
| Auto-file SAR | SAR filing decision and submission are controlled human/compliance workflows. |
| Replace sanctions screening | OFAC/sanctions systems have separate screening, disposition and blocking controls. |
| Replace Legal/Compliance interpretation | Applicability and policy decisions remain owned by control functions. |

4. Capability Map

| Capability | Product function | Architecture services | Control owner |
| --- | --- | --- | --- |
| Alert intake | normalize alert and trigger facts | event ingestion, schema registry, source adapter | AML Technology |
| Queue prioritization | route and rank work | priority engine, SLA policy, assignment service | AML Operations |
| Entity resolution | connect subjects/accounts/counterparties | identity graph, match service, confidence scoring | Data/AML Analytics |
| Context graph | reveal relationship and funds flow | graph store, path search, temporal aggregation | AML Analytics |
| Evidence assembly | prepare cited evidence packet | evidence ledger, source connector, provenance service | AML Operations |
| Analyst copilot | summarize, ask, compare, check gaps | RAG, LLM gateway, prompt registry, tool gateway | AI Product/Platform |
| Disposition support | show options and rationale | decision support service, policy/rubric engine | BSA/AML Compliance |
| SAR draft support | draft with citations and guardrails | controlled generation, citation validator, redaction | BSA/AML Compliance |
| QA workbench | sample, score, calibrate, remediate | QA workflow, defect taxonomy, issue tracker | QA / Compliance Testing |
| Feedback loop | route defects to tuning/data/training | feedback registry, change workflow, eval set builder | Model Risk / Scenario Owner |
| Audit replay | reconstruct who saw/did/approved what | append-only event stream, evidence hash, version registry | Internal Audit / Risk |
| Management reporting | track capacity, quality, risk, controls | metrics store, dashboard, KRI service | AML Leadership |

5. Reference Architecture Blueprint

5.1 Logical Layers

Experience
  Queue console | Investigation workbench | Copilot panel | QA console | Management dashboard

Workflow
  Case state machine | Assignment | SLA | Escalation | Maker-checker | SAR handoff | Feedback routing

AI decision support
  Priority model | Entity resolution | Graph analytics | Retrieval | LLM summarization
  Disposition options | SAR draft assist | QA judge | Eval harness

Evidence and context
  Evidence ledger | Timeline builder | CDD snapshot | Counterparty graph | Prior case index
  Typology/scenario link | Sanctions/fraud referral context

Data and integration
  Core banking | Transactions | KYC/CDD/EDD | Case management | Sanctions screening
  Fraud | CRM | Document store | External advisories/lists | User/IAM

Controls and platform
  RBAC/ABAC | SAR confidentiality | Audit log | Version registry | Data lineage
  Model inventory | Monitoring | Incident management | Retention | Encryption

5.2 Component Responsibilities

| Component | Owns | Does not own |
| --- | --- | --- |
| Alert ingestion service | source normalization, schema validation, duplicate detection | suspicious activity conclusion |
| Priority engine | route, SLA, priority band, driver explanation | final case disposition |
| Entity graph service | match, confidence, relationship edges | legal identity conclusion when data conflicts |
| Evidence ledger | immutable evidence references and provenance | free-text narrative truth |
| Copilot orchestrator | RAG, prompt, LLM, tool routing, structured output | direct filing or irreversible account action |
| Citation validator | checks claims against evidence ids | semantic legal sufficiency |
| Policy guardrail engine | prohibited output/action blocks, role permissions | policy interpretation without owner approval |
| Case workflow service | state transitions, assignments, approvals | bypassing CMS of record without integration agreement |
| QA service | sample, rubric, defects, calibration | tuning deployment approval |
| Feedback registry | classify feedback and route changes | automatic retraining from raw outcomes |
| Model risk registry | inventory, validation evidence, revalidation triggers | operational queue management |
| Audit event service | append-only trace and replay | manual reconstruction after missing logs |

6. Queue Prioritization Playbook

6.1 Priority Policy

Queue priority is a policy-backed decision support output:

priority_result:
  alert_id: alert_2026_18491
  priority_band: P1
  route_to_queue: aml_senior_investigation
  SLA_due_at: 2026-07-01T17:00:00Z
  top_drivers:
    - driver: repeated below-threshold cash deposits
      evidence_ids: [ev_tx_001, ev_tx_002, ev_tx_003]
    - driver: new outbound wires to unrelated counterparties
      evidence_ids: [ev_tx_008, ev_counterparty_011]
    - driver: customer high-risk CDD profile
      evidence_ids: [ev_cdd_003]
  uncertainty:
    - entity link to counterparty cluster is medium confidence
  missing_context:
    - latest stated business purpose for new wire recipient
  prohibited_inference: "priority is not SAR filing decision"
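
One way to keep drivers explainable is to reject any `priority_result` whose drivers cite no evidence. The following is a minimal sketch, assuming the record has already been parsed into a Python dict; `unsupported_drivers` is a hypothetical helper, not part of any existing service.

```python
def unsupported_drivers(priority_result):
    """Return the text of every top driver that cites no evidence ids."""
    return [
        d.get("driver", "<unnamed>")
        for d in priority_result.get("top_drivers", [])
        if not d.get("evidence_ids")
    ]


example = {
    "alert_id": "alert_2026_18491",
    "priority_band": "P1",
    "top_drivers": [
        {"driver": "repeated below-threshold cash deposits",
         "evidence_ids": ["ev_tx_001", "ev_tx_002", "ev_tx_003"]},
        {"driver": "customer high-risk CDD profile", "evidence_ids": []},
    ],
}

# The second driver has an empty evidence list, so it is reported.
print(unsupported_drivers(example))
```

A non-empty result would feed the "unsupported driver rate" style of control metric rather than silently dropping the driver.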

6.2 Decision Table

| Condition | Priority action | Routing action | Control note |
| --- | --- | --- | --- |
| Repeat alert on same subject with prior escalation | Increase priority | Senior investigator or continuing activity queue | Show prior case metadata under access controls |
| High-risk customer with activity inconsistent with expected profile | Increase priority | EDD-aware queue | CDD freshness and source must be visible |
| Alert triggered by weak entity link only | Do not over-prioritize | Entity review or general queue | Graph confidence must be explicit |
| Sanctions-related context present | Hard escalation flag | Sanctions referral path | AML workbench does not clear sanctions hit |
| Missing critical source data | Route to data exception or needs-info | Operations/data owner | Analyst should not guess missing evidence |
| Aged lower-priority alert near SLA breach | Raise operational priority | Queue manager review | Timeliness is a control dimension |
| Low score but new typology/advisory coverage | Maintain sample or elevated review | Coverage monitoring | Avoid starving emerging risks |
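
The decision table can be expressed as an ordered rule list where the first matching row wins. This is a sketch under stated assumptions: the boolean field names on `AlertContext`, the rule ordering (sanctions and missing data first) and the default outcome are all illustrative choices, not something the table prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertContext:
    # Hypothetical flags, one per table row.
    sanctions_context: bool = False
    missing_critical_data: bool = False
    prior_escalation: bool = False
    high_risk_inconsistent: bool = False
    near_sla_breach: bool = False
    new_typology_coverage: bool = False
    weak_link_only: bool = False


def decide(ctx):
    """Return (priority_action, routing_action) for the first matching row."""
    rules = [
        (ctx.sanctions_context, ("hard_escalation_flag", "sanctions_referral_path")),
        (ctx.missing_critical_data, ("route_to_data_exception_or_needs_info", "operations_data_owner")),
        (ctx.prior_escalation, ("increase_priority", "senior_or_continuing_activity_queue")),
        (ctx.high_risk_inconsistent, ("increase_priority", "edd_aware_queue")),
        (ctx.near_sla_breach, ("raise_operational_priority", "queue_manager_review")),
        (ctx.new_typology_coverage, ("maintain_sample_or_elevated_review", "coverage_monitoring")),
        (ctx.weak_link_only, ("do_not_over_prioritize", "entity_review_or_general_queue")),
    ]
    for matched, outcome in rules:
        if matched:
            return outcome
    # Assumed default when no table row applies.
    return ("no_change", "general_queue")
```

An explicit ordered list keeps the policy reviewable by the routing policy owner; a real deployment would likely need to combine rows rather than stop at the first match.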

6.3 Priority Eval

| Eval | Metric |
| --- | --- |
| Ranking usefulness | percentage of high-risk QA/SAR-consideration cases surfaced in top bands |
| Timeliness | P1/P2 SLA adherence, aged alert count |
| Coverage | typology/scenario distribution across priority bands |
| Fairness/segment stability | priority distribution by product, channel, customer type, geography |
| Analyst trust | override rate with reason, agreement by scenario |
| Control safety | unsupported driver rate, missing evidence route accuracy |
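
The ranking usefulness metric can be computed as the share of high-risk outcome cases that were assigned a top band. A minimal sketch, assuming each case record carries a `priority_band` and a boolean `high_risk_outcome` label derived from QA or SAR-consideration results; both field names are hypothetical.

```python
def top_band_capture(cases, top_bands=("P1", "P2")):
    """Fraction of high-risk outcome cases that were ranked into the top bands."""
    high_risk = [c for c in cases if c["high_risk_outcome"]]
    if not high_risk:
        return None  # metric is undefined without any high-risk outcomes
    hits = sum(1 for c in high_risk if c["priority_band"] in top_bands)
    return hits / len(high_risk)


cases = [
    {"priority_band": "P1", "high_risk_outcome": True},
    {"priority_band": "P2", "high_risk_outcome": True},
    {"priority_band": "P4", "high_risk_outcome": True},
    {"priority_band": "P3", "high_risk_outcome": False},
]
print(top_band_capture(cases))
```

The same function can be run per product, channel or customer segment to support the segment stability row.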

7. Entity Resolution and Graph Context Playbook

7.1 Entity Types

| Entity | Examples | Investigation use |
| --- | --- | --- |
| Person | customer, beneficial owner, authorized signer | subject identification and relationship context |
| Organization | business customer, merchant, employer, shell company lead | ownership and activity purpose |
| Account | deposit, loan, card, wallet, brokerage | funds flow and product behavior |
| Counterparty | ACH originator, wire beneficiary, P2P recipient, check payee | relationship and network risk |
| Device/IP/address | digital footprint, branch address, mailing address | mule/synthetic identity/context |
| Case/SAR-sensitive record | prior case, prior SAR metadata, QA finding | repeat activity and escalation context |
| External list item | sanctions list item, advisory entity, high-risk geography | referral and control context |

7.2 Match Governance

| Match level | Example | UI treatment | Downstream use |
| --- | --- | --- | --- |
| Deterministic authoritative | same customer id or account id | merged by default | can support fact statement |
| Deterministic corroborated | tax id plus legal name plus date | merged with source badge | can support escalation fact |
| Probabilistic high | strong name/address/device combination | probable link | can support investigation lead with caveat |
| Probabilistic medium | shared address plus transaction pattern | review link | cannot be stated as confirmed |
| Weak lead | one shared phone/address/IP | lead only | cannot drive final disposition alone |
| Conflict | inconsistent identity sources | warning | requires data review or analyst note |
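
The match levels can gate how a link is worded downstream. The sketch below assumes that only the two deterministic levels may be stated as fact, which is one reading of the "Downstream use" column; the `MATCH_POLICY` keys and the `describe_link` helper are hypothetical.

```python
# (UI treatment, may be stated as fact) per match level; the boolean is an assumption.
MATCH_POLICY = {
    "deterministic_authoritative": ("merged by default", True),
    "deterministic_corroborated": ("merged with source badge", True),
    "probabilistic_high": ("probable link", False),
    "probabilistic_medium": ("review link", False),
    "weak_lead": ("lead only", False),
    "conflict": ("warning", False),
}


def describe_link(match_level, statement):
    """Prefix a caveat unless the match level supports a fact statement."""
    ui_treatment, as_fact = MATCH_POLICY[match_level]
    if as_fact:
        return statement
    return f"Unconfirmed ({ui_treatment}): {statement}"


print(describe_link("probabilistic_medium", "Subject shares an address with counterparty 774."))
```

Routing every generated sentence about a relationship through a function like this is one way to prevent the "probable link stated as confirmed" defect.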

7.3 Graph Guardrails

| Risk | Guardrail |
| --- | --- |
| False merge | confidence bands, explainable match features, analyst unlink path |
| Overexposure | edge-level access class and SAR-sensitive filtering |
| Stale relationship | effective date, last seen date, source freshness |
| Graph clutter | time-windowed paths, amount-weighted edges, typology-focused filters |
| Unsupported inference | distinguish "observed connection" from "suspicious relationship" |
| Feedback contamination | graph correction goes through data quality workflow |

Graph card minimum:

edge_id: edge_9812
from_entity: ent_customer_031
to_entity: ent_counterparty_774
relationship_type: repeated_outbound_wire
first_seen: 2026-05-03
last_seen: 2026-06-18
amount_total: 84200.00
source_evidence_ids: [ev_tx_121, ev_tx_219, ev_tx_311]
confidence: high
access_class: aml_restricted
analyst_confirmed: false

8. Case Assembly and Evidence Workspace

8.1 Workspace Layout

| Region | Content | Design principle |
| --- | --- | --- |
| Header | alert id, subject, priority, SLA, route, status, owner | operational clarity before AI content |
| Trigger panel | scenario/model trigger facts and source version | source signal is always visible |
| Timeline | transactions, referrals, profile changes, prior alerts | chronology beats narrative |
| Graph | subject, accounts, counterparties, beneficial owners, devices | confidence and source shown on edges |
| CDD/EDD context | expected activity, risk profile, occupation/business, beneficial ownership | context for unusual activity, not static KYC display |
| Evidence cards | citations, raw facts, source record ids, freshness | every AI claim must trace here |
| Copilot panel | summary, questions, gap checklist, draft notes | assistant beside evidence |
| Disposition panel | options, rationale, reason codes, required evidence | human-owned decision |
| QA/control panel | warnings, missing fields, unsupported claims, SoD state | control state visible before close/escalate |

8.2 Evidence Checklist

| Evidence class | Minimum checks | Defect examples |
| --- | --- | --- |
| Transaction sequence | amount, date/time, channel, originator, beneficiary, direction | missing counterparty, timezone mismatch |
| Customer profile | CDD risk, expected activity, account purpose, occupation/business | stale profile, missing beneficial owner |
| Counterparty context | relationship history, novelty, geography, recurrence | unresolved entity, unknown recipient |
| Prior activity | prior alerts, cases, SAR-sensitive metadata under access control | inaccessible prior case, missing repeat activity |
| Documents | KYC docs, invoices, account notes, analyst notes | unverified document, OCR extraction error |
| External context | advisories, sanctions referral, law-enforcement request indicator | restricted content exposed to wrong role |
| Analyst rationale | reason code, free-text explanation, evidence references | conclusion without facts |

8.3 Evidence Quality Score

Evidence quality score should guide review readiness, not replace analyst judgment.

| Dimension | Score question |
| --- | --- |
| Completeness | Are required evidence classes present for this scenario and customer segment? |
| Freshness | Is CDD/EDD and transaction data current enough under internal policy? |
| Consistency | Do sources agree on identity, ownership, account status and activity purpose? |
| Citation readiness | Can summary and draft claims cite source evidence ids? |
| Access validity | Is evidence visible only to authorized roles? |
| Replayability | Can the same evidence packet be reconstructed later? |
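
The six score questions can be turned into a simple readiness check. This is a sketch only: treating each dimension as a yes/no answer and weighting them equally are assumptions made for illustration, and `review_readiness` is a hypothetical helper.

```python
DIMENSIONS = (
    "completeness",
    "freshness",
    "consistency",
    "citation_readiness",
    "access_validity",
    "replayability",
)


def review_readiness(answers):
    """Summarize yes/no answers per dimension; a missing answer counts as a failure."""
    failed = [d for d in DIMENSIONS if not answers.get(d, False)]
    return {
        "score": (len(DIMENSIONS) - len(failed)) / len(DIMENSIONS),
        "failed": failed,
        "ready_for_review": not failed,
    }


answers = {d: True for d in DIMENSIONS}
answers["freshness"] = False  # e.g. CDD snapshot older than internal policy allows
print(review_readiness(answers))
```

Returning the failed dimensions alongside the score keeps the output a prompt for the analyst rather than a verdict.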

9. Analyst Copilot Design

9.1 Copilot Jobs

| Job | Prompt posture | Output contract |
| --- | --- | --- |
| Explain alert | "Summarize trigger facts only from evidence" | trigger summary with evidence ids |
| Build chronology | "Order observed facts by event time" | timeline entries, not motive |
| Compare to expected activity | "Contrast activity with CDD expected profile" | differences, source freshness, missing context |
| Summarize graph | "Describe confirmed and probable relationships separately" | graph facts with confidence |
| Identify evidence gaps | "List missing sources for review" | task list by source owner |
| Suggest next steps | "Offer investigation actions, not decisions" | analyst-selectable checklist |
| Draft case note | "Draft concise note with citations and uncertainty" | editable cited note |
| Pre-check disposition | "Find unsupported claims and missing rationale" | QA warnings |
| Draft SAR packet | "Prepare evidence-cited draft packet for human review" | draft marked non-final |

9.2 Prohibited Copilot Behaviors

| Behavior | Block |
| --- | --- |
| "This customer committed money laundering" | Replace with observed facts and suspected indicators |
| "File SAR" as final instruction | Use "escalate for SAR consideration" when evidence supports |
| "No SAR needed" as final conclusion | Present closure rationale option for human decision |
| Uncited factual assertion | Require evidence id or remove |
| Revealing SAR existence to unauthorized role | Permission filter and output block |
| Using sanctions hit to clear customer | Route to sanctions workflow |
| Creating customer communication | Block unless separate approved workflow exists |
| Training from unreviewed analyst notes | Route through feedback quality gate |
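
A first-line wording guard for the first three rows can be sketched with regular expressions. This is deliberately naive and assumes English output; the pattern set and the `prohibited_hits` helper are illustrative, and pattern matching alone would not be a sufficient control in production.

```python
import re

# Hypothetical patterns for three of the prohibited behaviors.
PROHIBITED = {
    "criminal_conclusion": re.compile(r"\bcommitted (money laundering|fraud|a crime)\b", re.I),
    "final_filing_instruction": re.compile(r"\bfile (a |the )?SAR\b", re.I),
    "final_no_file_conclusion": re.compile(r"\bno SAR (is )?(needed|required)\b", re.I),
}


def prohibited_hits(text):
    """Return the sorted names of prohibited patterns found in copilot output."""
    return sorted(name for name, pattern in PROHIBITED.items() if pattern.search(text))


print(prohibited_hits("This customer committed money laundering. File SAR."))
print(prohibited_hits("Observed facts support escalation for SAR consideration [ev_tx_01]."))
```

Any hit would block the output and route it back for rewording as observed facts and indicators.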

9.3 Output Rubric

| Dimension | Pass criteria |
| --- | --- |
| Groundedness | Every factual claim maps to source evidence |
| Boundary | Output states it is decision support |
| Uncertainty | Low/medium confidence links are labeled |
| Completeness | Missing evidence is named |
| Neutrality | No criminal conclusion or accusatory wording |
| Actionability | Suggested next step is role-appropriate |
| Confidentiality | SAR-sensitive and restricted data are handled by role |

10. Disposition Recommendation Guardrails

10.1 Human-Owned Disposition Model

AI suggests options
  -> analyst reviews evidence
  -> analyst selects disposition
  -> required reason/evidence captured
  -> senior/compliance gate if escalation threshold reached
  -> QA sampling
  -> feedback registry

10.2 Disposition Options

| Option | When suggested | Required evidence | Approval |
| --- | --- | --- | --- |
| Close as no unusual activity | activity consistent with CDD/EDD and benign evidence exists | trigger facts, benign rationale, CDD consistency | analyst or policy-defined reviewer |
| Continue monitoring | incomplete pattern or emerging activity without enough basis | repeat trigger, monitoring rationale, future review date | analyst/supervisor |
| Request more information | material evidence missing | missing source and owner | operations owner |
| Escalate investigation | risk drivers or unresolved gaps require deeper review | cited risk drivers, gap list | senior investigator route |
| Refer to fraud/sanctions/EDD | domain-specific control signal present | referral evidence and scope | receiving control function |
| SAR consideration packet | human sees sufficient suspicious activity concern for review | chronology, subject data, evidence packet | BSA/AML owner or internal policy role |

10.3 Recommendation Explanation

Every recommendation should include:

  • option label.
  • evidence for.
  • evidence against.
  • missing evidence.
  • confidence boundary.
  • required human role.
  • downstream control gate.
  • why the option is not a final filing decision.

11. Suspicious Activity Escalation and SAR Draft Guardrails

11.1 Escalation Packet

escalation_packet:
  case_id: case_2026_771
  escalation_type: sar_consideration
  created_by: analyst_42
  ai_assisted: true
  basis:
    - observed rapid movement of funds
    - activity inconsistent with stated business purpose
    - repeated new counterparties
  evidence_ids:
    - ev_tx_01
    - ev_tx_02
    - ev_cdd_01
    - ev_graph_04
  unresolved_questions:
    - relationship to counterparty group
    - legitimate business explanation for wire corridor
  copilot_output_hash: hash_...
  human_owner: aml_supervisor_7

11.2 SAR Draft Workflow

| Step | AI role | Human/control role |
| --- | --- | --- |
| Evidence selection | suggest relevant facts and transactions | analyst selects and confirms |
| Draft chronology | order cited facts | analyst edits |
| Narrative assist | propose neutral wording with citations | BSA/AML reviewer approves wording |
| Quality pre-check | flag missing fields, unsupported facts, risky wording | reviewer resolves or documents exception |
| Filing handoff | prepare packet for approved filing process | authorized filer uses approved workflow |
| Recordkeeping | link draft, final packet id, evidence and approvals | compliance-owned retention process |

11.3 SAR Draft Controls

| Control | Implementation |
| --- | --- |
| No auto-file | No model/tool permission can submit SAR; filing API absent or human-token gated |
| Citation required | Draft sentences referencing facts require evidence ids |
| Unsupported claim blocker | Unsupported claims are removed or marked for human evidence entry |
| SAR-sensitive retrieval | RAG filters prior SAR content by role, case need and policy |
| Edit diff | Human edits to AI draft are versioned |
| Disclosure warning | UI warns against inappropriate SAR disclosure and customer notification |
| Final decision label | Draft states "AI-assisted draft for human review" |
| Filing record link | Workbench stores handoff id or acknowledgement metadata according to policy, not hidden copy leakage |
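
The "Citation required" and "Unsupported claim blocker" controls can be approximated with a per-sentence check. The sketch assumes an inline citation format of `[ev_...]` and a naive sentence splitter, neither of which is specified in this playbook; `uncited_sentences` is a hypothetical helper.

```python
import re

CITATION = re.compile(r"\[(ev_[a-z0-9_]+)\]")  # assumed inline citation format


def uncited_sentences(draft, known_evidence_ids):
    """Return sentences that cite nothing, or that cite ids missing from the ledger."""
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", draft.strip()):
        if not sentence:
            continue
        ids = CITATION.findall(sentence)
        if not ids or any(i not in known_evidence_ids for i in ids):
            flagged.append(sentence)
    return flagged


draft = (
    "Three cash deposits were observed [ev_tx_001]. "
    "The funds moved offshore. "
    "A wire followed [ev_tx_999]."
)
print(uncited_sentences(draft, {"ev_tx_001"}))
```

Flagged sentences would be marked for human evidence entry rather than silently deleted, so the reviewer sees what the draft attempted to claim. A check like this confirms that a citation exists, not that the cited evidence actually supports the sentence.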

12. QA and Quality Control

12.1 QA Sampling Strategy

| Sample type | Purpose |
| --- | --- |
| Random baseline | measure overall quality and drift |
| Risk-based P1/P2 | check high-risk investigation quality |
| Closure sample | detect weak no-file / no-unusual-activity rationale |
| AI-heavy sample | inspect cases with high copilot reliance |
| Low-confidence entity sample | detect graph/identity errors |
| SAR draft sample | check evidence, wording, completeness and confidentiality |
| Analyst outlier sample | detect training or incentive issues |
| Scenario tuning sample | measure before/after control impact |
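
A reproducible sampler for a few of these strata might look like the sketch below. The case field names, the stratum keys and the idea of a per-stratum sampling rate are assumptions for illustration; actual rates and strata would come from the QA program.

```python
import random


def reasons_for(case):
    """List the sampling strata a case is eligible for (subset of the table)."""
    reasons = ["random_baseline"]
    if case.get("priority_band") in ("P1", "P2"):
        reasons.append("risk_based_p1_p2")
    if case.get("disposition") == "closed_no_sar":
        reasons.append("closure_sample")
    if case.get("entity_confidence") in ("low", "medium"):
        reasons.append("low_confidence_entity")
    return reasons


def select_qa_sample(cases, rates, seed=0):
    """Map case_id to the strata that selected it, using a seeded RNG for replay."""
    rng = random.Random(seed)
    selected = {}
    for case in cases:
        for reason in reasons_for(case):
            if rng.random() < rates.get(reason, 0.0):
                selected.setdefault(case["case_id"], []).append(reason)
    return selected


cases = [
    {"case_id": "c1", "priority_band": "P1", "disposition": "escalated"},
    {"case_id": "c2", "priority_band": "P3", "disposition": "closed_no_sar"},
]
print(select_qa_sample(cases, {"closure_sample": 1.0}))
```

Recording the seed and rates with each run lets audit reconstruct why a given case was or was not sampled.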

12.2 QA Rubric

| Rubric area | Defect |
| --- | --- |
| Evidence completeness | required source missing without explanation |
| Citation quality | case note or draft claim lacks evidence id |
| CDD usage | expected activity ignored or stale CDD not flagged |
| Entity reasoning | probable link stated as confirmed |
| Disposition rationale | closure/escalation reason not tied to facts |
| SAR boundary | AI output treated as filing decision |
| Confidentiality | restricted content visible to unauthorized role |
| Timeliness | SLA breach without documented reason |
| Copilot safety | hallucination, unsupported inference, unsafe wording |
| Workflow control | maker-checker or approval bypass |

12.3 QA Finding Record

qa_finding:
  finding_id: qaf_2026_1031
  case_id: case_2026_771
  sampled_reason: closure_sample
  defect_type: citation_quality
  severity: medium
  observation: closure rationale references counterparty relationship not supported by evidence
  required_action: reopen_note_for_correction
  owner: analyst_supervisor_3
  system_feedback:
    route: copilot_eval_set
    eligible_for_training: false
    reason: requires corrected human rationale first

13. Tuning Feedback and Continuous Improvement

13.1 Feedback Routes

| Feedback source | Route | Not allowed |
| --- | --- | --- |
| Analyst disagreement | eval enrichment, UI improvement, prompt review | direct model update without review |
| QA defect | training, control remediation, eval cases | hiding defect inside productivity metric |
| False-positive driver | scenario tuning, CDD feature improvement | suppressing segment without coverage review |
| False-negative proxy | scenario gap review, typology coverage update | ignoring because no alert was generated |
| Entity resolution correction | data quality and graph model tuning | silently overwriting historical graph without trace |
| SAR draft defect | prompt/RAG/citation validator fix | letting draft model self-certify |
| Data gap | source owner remediation | asking analyst to work around permanent missing source |
| Adoption friction | workflow redesign | forcing usage through KPI alone |

13.2 Tuning Change Gate

| Gate | Evidence required |
| --- | --- |
| Business rationale | problem statement, impacted scenario, expected benefit |
| Coverage check | typology/scenario link and blind-spot assessment |
| Data impact | source fields, lineage, DQ score, missingness by segment |
| Backtest | before/after alert volume, precision proxy, sample review |
| QA review | sample defects and closure quality |
| Model risk review | materiality, validation scope, independent challenge |
| Rollout plan | pilot cohort, monitoring, rollback criteria |
| Approval | scenario owner, BSA/AML owner, model risk or governance role as applicable |

13.3 Anti-Gaming Guardrail

Any tuning proposal that only shows alert-volume reduction is incomplete. It must also show:

  • high-risk typology coverage impact.
  • sampled closure quality.
  • false-negative proxy or lookback check.
  • product/channel/customer segment slices.
  • data quality sensitivity.
  • operational capacity effect.
  • QA and audit replay readiness.
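
The list above can be enforced as a completeness check on a tuning proposal before it enters the change gate. A minimal sketch, assuming the proposal is a dict keyed by hypothetical field names that mirror the seven items:

```python
# Hypothetical keys, one per required item in the anti-gaming list.
REQUIRED_EVIDENCE = (
    "typology_coverage_impact",
    "sampled_closure_quality",
    "false_negative_lookback",
    "segment_slices",
    "data_quality_sensitivity",
    "operational_capacity_effect",
    "qa_audit_replay_readiness",
)


def missing_evidence(proposal):
    """Return required items that are absent or empty in the proposal."""
    return [key for key in REQUIRED_EVIDENCE if not proposal.get(key)]


proposal = {"alert_volume_reduction": "-35%", "segment_slices": "by product and channel"}
print(missing_evidence(proposal))
```

A proposal that shows only alert-volume reduction fails this check by construction, because volume reduction is not one of the accepted evidence keys.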

14. Model Risk, Eval and Validation

14.1 AI System Inventory

| Inventory item | Required fields |
| --- | --- |
| Priority engine | model/rule version, features, route policy, owner, validation evidence |
| Entity resolution | algorithm, sources, confidence calibration, false merge/missed link metrics |
| Graph analytics | graph schema, edge confidence, temporal logic, access class |
| RAG retriever | source registry, index version, permission filter, freshness |
| LLM summarizer | model, prompt, output schema, prohibited behaviors, eval report |
| Disposition assist | rubric, options, boundary statement, human gate |
| SAR draft assistant | citation validator, wording guardrails, no-file/no-submit control |
| QA judge | rubric, human alignment, drift monitoring, false pass rate |
| Workflow engine | state transitions, SoD rules, audit events, fallback |

14.2 Eval Suite

| Eval category | Test examples | Release gate |
| --- | --- | --- |
| Groundedness | all factual claims cite correct evidence | critical failures block |
| Missing evidence | system detects absent CDD/EDD/counterparty context | scenario-specific threshold |
| Entity resolution | false merge and missed link by segment | high-risk false merge hard stop |
| Queue ranking | P1/P2 quality, timeliness and coverage | no material coverage regression |
| SAR boundary | no auto-file, no final filing decision, no prohibited disclosure | hard stop |
| Prompt injection | malicious note/document attempts to override policy | hard stop for tool or data leakage |
| Confidentiality | SAR-sensitive and restricted content filtered by role | hard stop |
| Human oversight | reviewer can understand, challenge, override and escalate | UAT plus QA sample |
| QA judge | automated QA aligns with human QA | judge cannot be sole control |
| Drift | production distribution, source freshness, output quality | monitoring alert and review |
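
The release gates can be combined into a single decision function. The sketch below simplifies the table: it treats only the three categories labeled plainly as "hard stop" as blocking and reduces every eval to pass/fail; the category keys, outcome labels and `release_decision` helper are assumptions.

```python
# Categories the table marks as hard stops (simplified; thresholds are not modeled).
HARD_STOP = {"sar_boundary", "prompt_injection", "confidentiality"}


def release_decision(results):
    """Map eval name -> passed (bool) to a (decision, failing_evals) pair."""
    failed = {name for name, passed in results.items() if not passed}
    if failed & HARD_STOP:
        return "blocked", sorted(failed & HARD_STOP)
    if failed:
        return "needs_review", sorted(failed)
    return "release_candidate", []


results = {
    "groundedness": True,
    "queue_ranking": False,
    "sar_boundary": True,
    "prompt_injection": False,
    "confidentiality": True,
}
print(release_decision(results))
```

The function reports only the hard-stop failures when blocked, so the release owner sees the blocking issue first; other failures still need review once the block is cleared.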

14.3 Validation Questions

| Area | Challenge question |
| --- | --- |
| Use case boundary | Is the system framed as decision support, or does workflow pressure make it de facto decisioning? |
| Conceptual soundness | Why are graph/RAG/LLM appropriate for this AML workflow? |
| Data adequacy | Are CDD, transactions, counterparty, sanctions/fraud referrals and case history complete enough? |
| Process verification | Are prompt/model/source/rule changes controlled and replayable? |
| Outcome analysis | Does quality improve without hidden coverage loss? |
| Human oversight | Can analysts and reviewers challenge AI with enough evidence and time? |
| Independent challenge | Are model risk/QA/internal audit able to test without builder conflict? |

15. Audit Trail, Access and Segregation of Duties

15.1 Audit Event Schema

audit_event:
  event_id: aud_2026_88192
  event_type: disposition_selected
  case_id: case_2026_771
  alert_id: alert_2026_18491
  actor_id: analyst_42
  actor_role: aml_analyst
  timestamp: 2026-06-30T15:22:17Z
  action:
    disposition: escalate_for_sar_consideration
    reason_code: pattern_inconsistent_with_cdd
  evidence_ids:
    - ev_tx_01
    - ev_cdd_01
    - ev_graph_04
  system_versions:
    priority_model: prio_v2.1
    prompt: aml_summary_v1.8
    rag_index: aml_evidence_2026_06_29
    entity_resolution: er_v3.4
  output_hash: hash_...
  previous_state: in_review
  new_state: sar_consideration
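
One way to make an event stream like this tamper-evident is to chain each event's hash to the previous one. The Python sketch below is an assumption for illustration, not the schema's defined behavior: the chain hash is separate from the `output_hash` field above, `AuditLog` is a hypothetical in-memory stand-in for a real append-only store, and events are assumed to be JSON-serializable dicts.

```python
import copy
import hashlib
import json

GENESIS_HASH = "0" * 64  # hypothetical marker for the start of the chain


def chain_hash(event: dict, prev_hash: str) -> str:
    """SHA-256 over the previous hash plus a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class AuditLog:
    """In-memory stand-in for an append-only store; illustrative only."""

    def __init__(self):
        self._entries = []  # list of (event, chain_hash) pairs

    def append(self, event: dict) -> str:
        prev = self._entries[-1][1] if self._entries else GENESIS_HASH
        digest = chain_hash(event, prev)
        # Store a deep copy so later changes to the caller's dict do not leak in.
        self._entries.append((copy.deepcopy(event), digest))
        return digest

    def verify(self) -> bool:
        """Recompute the chain; an edited or reordered event breaks it."""
        prev = GENESIS_HASH
        for event, digest in self._entries:
            if chain_hash(event, prev) != digest:
                return False
            prev = digest
        return True
```

A replay job could call `verify()` before reconstructing a case, so that a broken chain is surfaced as an audit finding rather than silently replayed.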

15.2 Access Model

| Data class | Access principle |
| --- | --- |
| Standard customer data | least privilege, business purpose, logged access |
| AML restricted evidence | role-based AML need-to-know |
| SAR-sensitive content | strict role and case need; no broad RAG exposure |
| NSL-sensitive indicator | special handling; avoid exposing content beyond policy |
| Sanctions hit detail | sanctions role visibility and referral controls |
| Model/prompt logs | sensitive operational access; redact customer data where possible |
| QA findings | QA/compliance/model risk access with management reporting aggregation |
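
The "strict role and case need; no broad RAG exposure" principle can be sketched as a filter applied to retrieved evidence before it reaches the model or the analyst. This is a minimal Python sketch under stated assumptions: the access class labels, the `sar_reviewer` role, the role-to-class mapping and the function name are all hypothetical, and a real deployment would take them from the institution's entitlement system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceChunk:
    evidence_id: str
    case_id: str
    access_class: str  # e.g. "standard", "aml_restricted", "sar_sensitive"


# Hypothetical mapping of roles to the access classes they may see.
ROLE_CLASSES = {
    "aml_analyst": {"standard", "aml_restricted"},
    "sar_reviewer": {"standard", "aml_restricted", "sar_sensitive"},
}

# Classes that additionally require the user to be assigned to the case.
CASE_SCOPED_CLASSES = {"sar_sensitive"}


def filter_retrieved(chunks, role, assigned_case_ids):
    """Drop chunks the role may not see; unknown roles see nothing."""
    allowed = ROLE_CLASSES.get(role, set())
    visible = []
    for chunk in chunks:
        if chunk.access_class not in allowed:
            continue
        case_scoped = chunk.access_class in CASE_SCOPED_CLASSES
        if case_scoped and chunk.case_id not in assigned_case_ids:
            continue
        visible.append(chunk)
    return visible
```

Filtering after retrieval is the simplest form to show; applying the same predicate inside the index query would avoid restricted content ever leaving the store.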

15.3 SoD Matrix

| Duty combination | Control |
| --- | --- |
| Analyst triages and independently QA-reviews same case | Block; assign independent QA |
| Analyst creates SAR draft and approves filing | Require BSA/AML owner or filing role per policy |
| Model owner changes priority model and approves validation | Independent model risk challenge |
| Scenario owner suppresses alert type without compliance approval | Tuning change gate and coverage review |
| Admin can alter evidence ledger and audit event stream | Immutable logs, dual control, privileged access monitoring |
| Vendor supplies model and certifies production quality | Internal validation, QA sample and audit rights |
| Queue manager lowers priority due to staffing alone | Document risk acceptance or reassign capacity |
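
The first three rows of the matrix lend themselves to an automated check in the workflow engine. The Python sketch below assumes the engine can list which actor has already performed which duty on the same case or change; the duty labels and the `sod_violation` function are hypothetical names chosen for illustration.

```python
# Hypothetical duty pairs that one actor must not hold on the same case or change.
INCOMPATIBLE_DUTIES = {
    frozenset({"triage", "qa_review"}),
    frozenset({"sar_draft", "sar_filing_approval"}),
    frozenset({"model_change", "validation_approval"}),
}


def sod_violation(history, actor_id, new_duty):
    """Return the conflicting (prior_duty, new_duty) pair, or None if allowed.

    history: iterable of (actor_id, duty) tuples already recorded for the
    same case or change.
    """
    prior_duties = {duty for actor, duty in history if actor == actor_id}
    for duty in prior_duties:
        if frozenset({duty, new_duty}) in INCOMPATIBLE_DUTIES:
            return (duty, new_duty)
    return None
```

The engine could call this before each state transition and refuse the assignment when a pair is returned, logging the refusal as an audit event.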

16. Controls and Evidence Checklist

| Control objective | Evidence | Owner |
| --- | --- | --- |
| Alert sources are complete and approved | source inventory, data contract, ingestion reconciliation | AML Technology |
| Monitoring outputs are timely | alert generation logs, SLA dashboard | AML Operations |
| Queue routing is explainable | priority record, driver evidence, route policy | AML Operations |
| CDD/EDD context is used | CDD snapshot, freshness indicator, expected activity comparison | AML Compliance/Data Owner |
| Entity graph is controlled | match report, confidence calibration, correction log | Data/Analytics |
| Copilot outputs are grounded | citation validation logs, eval report, sampled outputs | AI Product/QA |
| SAR filing is human-owned | workflow state, approval record, no-submit technical control | BSA/AML Compliance |
| SAR-sensitive data is protected | access logs, role matrix, retrieval filter tests | Security/Compliance |
| QA is independent | sample plan, QA findings, reviewer independence evidence | QA/Compliance Testing |
| Tuning is governed | change request, coverage regression, approval | Scenario Owner/Model Risk |
| Model risk is managed | inventory, validation plan, monitoring, revalidation record | Model Risk |
| Audit replay is possible | append-only event log, version registry, evidence ids | Internal Audit/Platform |
| Deficiencies are remediated | issue log, CAPA, closure evidence, management reporting | Control Owner |

17. Metrics Dashboard

17.1 Executive View

| Metric | Healthy signal | Risk signal |
| --- | --- | --- |
| Risk-adjusted investigation capacity | more high-risk cases reviewed with stable quality | throughput up, QA quality down |
| Queue aging | fewer aged P1/P2 alerts | old high-risk alerts accumulating |
| Evidence completeness | required evidence present before disposition | closures with missing CDD/counterparty context |
| SAR consideration quality | fewer unsupported draft defects | more narrative defects or late escalations |
| Coverage stability | no material typology/channel blind spot | alert suppression after tuning |
| Human oversight | meaningful overrides and reasoned disagreements | near-100 percent acceptance of AI suggestions |
| Audit completeness | trace events complete | missing model/prompt/evidence versions |

17.2 Analyst Adoption View

| Metric | Interpretation |
| --- | --- |
| Evidence card usage | whether analysts rely on structured evidence |
| Copilot summary edit distance | whether drafts are useful without being rubber-stamped |
| Gap checklist completion | whether AI improves investigation completeness |
| Recommendation override reasons | trust calibration and model/product issues |
| Time in source systems | reduction indicates workbench value |
| Escalation packet rework | high rework indicates evidence or UI issues |
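
The copilot summary edit distance can be approximated in several ways. The Python sketch below uses a word-level similarity ratio from the standard library's `difflib.SequenceMatcher` as one possible proxy; the choice of word-level comparison and the `edit_ratio` name are assumptions for illustration, not a prescribed metric definition.

```python
from difflib import SequenceMatcher


def edit_ratio(draft: str, final: str) -> float:
    """Share of the text that changed between AI draft and analyst final.

    0.0 means the draft was saved unchanged; 1.0 means no words in common.
    """
    draft_words, final_words = draft.split(), final.split()
    if not draft_words and not final_words:
        return 0.0
    matcher = SequenceMatcher(None, draft_words, final_words, autojunk=False)
    return 1.0 - matcher.ratio()
```

Read across many cases, a ratio that sits near 0.0 may point to rubber-stamping, while a ratio near 1.0 suggests the drafts are not useful; both ends deserve a look alongside QA findings.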

17.3 Model/Control View

| Metric | Interpretation |
| --- | --- |
| Unsupported claim rate | hallucination or citation failure |
| Retrieval miss rate | RAG/source/index issue |
| Entity false merge rate | graph risk |
| Priority override concentration | routing/model bias or policy mismatch |
| QA judge false pass rate | automated QA cannot be trusted alone |
| SAR-sensitive access exceptions | confidentiality control risk |
| Tuning rollback count | change quality and coverage impact |

18. Implementation Roadmap

18.1 First 30 Days - Evidence-First MVP

| Workstream | Deliverable |
| --- | --- |
| Scope | choose 2-3 alert scenarios and 1-2 queues |
| Data | source inventory, CDD/transaction/case evidence schema, lineage |
| UX | queue console, evidence cards, timeline, basic disposition panel |
| AI | retrieve-and-summarize with citation validator; no SAR draft yet |
| Controls | role model, audit event schema, no-auto-SAR technical boundary |
| Eval | groundedness, citation accuracy, missing evidence detection |
| Ops | analyst pilot, feedback taxonomy, daily defect review |

18.2 Days 31-60 - Graph and Controlled Copilot

| Workstream | Deliverable |
| --- | --- |
| Entity resolution | confidence-banded entity graph and correction workflow |
| Queue | driver-based prioritization and SLA routing |
| Copilot | gap checklist, investigation plan, case note draft |
| QA | sample plan, rubric, QA console, defect taxonomy |
| Model risk | system inventory, validation scope, revalidation triggers |
| Metrics | productivity, quality, adoption and control dashboard |

18.3 Days 61-90 - SAR Packet and Governance Loop

| Workstream | Deliverable |
| --- | --- |
| SAR assist | evidence-cited draft packet, unsupported-claim blocker, human filing handoff |
| Tuning | scenario feedback workflow, coverage regression, change approval |
| Independent challenge | model risk/QA validation report and issue log |
| Audit | case replay pack, event completeness test, privileged access review |
| Training | analyst AI literacy, automation-bias drill, SAR confidentiality handling |
| Management | monthly KRI/KPI pack and control remediation review |

19. Implementation Guardrails

| Area | Guardrail |
| --- | --- |
| Product scope | Start with evidence assembly and cited summaries before SAR narrative generation |
| Data | Do not ingest unrestricted SAR-sensitive content into broad RAG index |
| UX | Do not make AI recommendation the default selected disposition |
| Workflow | Require human reason code and evidence references for close/escalate decisions |
| Model | Version prompts, models, retrievers, rules, thresholds and graph algorithms |
| Feedback | Separate raw analyst action from QA-reviewed training label |
| Operations | Do not tune alert volume to staffing capacity without risk acceptance |
| Security | Enforce row/field/evidence-level access, not just page-level roles |
| Audit | Design trace schema before pilot; screenshots are not enough |
| Vendor | Contract for logs, version notices, data handling, audit rights and exit |

20. Anti-Patterns and Fixes

| Anti-pattern | Symptom | Fix |
| --- | --- | --- |
| LLM-first AML | impressive summaries but weak evidence | evidence ledger first, copilot second |
| SAR draft as MVP | narrative looks good, facts not controlled | build case assembly and citation validator first |
| Auto-close low score | lower backlog but higher blind-spot risk | human closure, QA sample, coverage metrics |
| Graph magic | dense network visual with no confidence | confidence-banded edges and source cards |
| Hidden decisioning | AI recommendation becomes default operational outcome | explicit human gate and non-default UI |
| Productivity tunnel vision | faster cases but more QA defects | balanced scorecard |
| Tuning by complaints | threshold changes react to queue pressure | governed change gate and regression testing |
| Self-validating AI | model/judge grades its own work | independent QA and human calibration |
| Audit after launch | missing versions and evidence ids | event schema and version registry from day one |

21. PM / Architect Implications

21.1 PM Implications

| PM concern | Practical stance |
| --- | --- |
| Value proposition | Sell capacity plus quality plus auditability, not "AI replaces analysts" |
| MVP | Evidence workspace and cited copilot before SAR draft |
| User adoption | Measure edit distance, evidence card usage, override reasons and rework |
| Risk appetite | Define which dispositions need senior review, QA and model risk gates |
| Training | Include automation bias, citation checking, confidentiality and escalation drills |
| Stakeholder alignment | Operations wants speed; Compliance wants quality; Model Risk wants validation; Audit wants replay |

21.2 Architect Implications

| Architecture concern | Practical stance |
| --- | --- |
| Source of truth | Source systems plus evidence ledger, not model text |
| Context engine | Entity graph and timeline should be independently testable |
| AI orchestration | Use structured outputs, citation validation and tool permissions |
| Security | SAR-sensitive and investigation data need fine-grained retrieval filters |
| Observability | Log user action, evidence ids, model/prompt/index versions and output hashes |
| Resilience | Provide manual fallback when AI/RAG/graph services degrade |
| Governance | Connect system inventory, release gates, monitoring and revalidation triggers |

22. Interview Pack

Q1: How would you design an AML alert triage AI workbench?

30-second version:

I would design it as an evidence-first workbench, not a chatbot. The core chain is alert intake, entity resolution, CDD/EDD context, graph/timeline evidence assembly, risk-based queue prioritization, analyst copilot, human-owned disposition, SAR draft guardrails, QA, tuning feedback and audit replay. AI assists with ranking, summarizing, gap-finding and drafting, but it does not make the SAR filing decision.

2-minute version:

Architecturally, build the evidence ledger and entity graph first. Each alert retains its source scenario/model version, trigger facts, SLA and route. Entity resolution outputs a confidence-banded subject graph, and every graph edge carries a source, timestamp and access class. In the workbench, the analyst sees the timeline, CDD expected activity, counterparty graph, prior case context and evidence cards. The copilot may only produce evidence-based structured output (summary, gap checklist, next step and disposition options), and every fact must carry an evidence id. The SAR draft is only a human-reviewed packet; there is no auto-filing tool permission. Go-live controls include QA sampling, feedback taxonomy, scenario tuning change gate, model risk validation, SAR-sensitive access control and an append-only audit trail. Value metrics look at investigation time, QA defects, queue aging, coverage, overrides and audit completeness together.

Q2: How do you prevent the AI from misleading analysts?

| Control | Explanation |
| --- | --- |
| Evidence-first UI | Analysts see evidence and sources first, then the AI summary |
| Citation validator | Unsupported factual claim cannot pass |
| Non-default recommendation | AI option cannot be one-click default decision |
| Uncertainty label | Low/medium-confidence graph links are clearly labeled |
| QA sample | AI-heavy cases sampled more aggressively |
| Training | Automation bias and confidentiality drills |
| Override analytics | Monitor over-acceptance and disagreement quality |

Q3: How do you handle SAR drafts?

| Principle | Answer |
| --- | --- |
| Boundary | AI drafts evidence-cited packet, not filing decision |
| Evidence | Every factual statement maps to transaction/CDD/case evidence |
| Language | Neutral, observed facts, no criminal conclusion |
| Confidentiality | SAR-sensitive access and retrieval controls |
| Approval | Human BSA/AML owner reviews and filing process remains controlled |
| Audit | Store model/prompt/evidence/diff/approval chain |

Q4: How do you demonstrate that the system has not reduced AML coverage?

| Evidence | Use |
| --- | --- |
| Typology/scenario coverage matrix | shows what remains covered after tuning |
| Priority distribution by scenario | detects starvation of low-volume risks |
| QA closure sample | catches weak closures |
| False-negative proxies | repeat alerts, lookbacks, fraud referrals, law-enforcement feedback where permitted |
| Segment slices | product/channel/geography/customer type stability |
| Regression eval | before/after model/rule/prompt threshold comparison |
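
A before/after coverage comparison can start as a per-scenario count check. The Python sketch below assumes alert (or reviewed-case) counts per scenario are available for comparable periods before and after a change; the `coverage_regressions` name, the scenario labels in the usage note and the 50 percent default threshold are hypothetical and would need to be set by the scenario owner and compliance.

```python
def coverage_regressions(before, after, max_drop=0.5):
    """Flag scenarios whose volume fell by more than max_drop after a change.

    before, after: dicts mapping scenario name to a count for comparable
    periods. Returns {scenario: fractional_drop} for flagged scenarios.
    """
    flagged = {}
    for scenario, baseline in before.items():
        if baseline == 0:
            continue  # no baseline volume to compare against
        current = after.get(scenario, 0)  # a missing scenario counts as zero
        drop = (baseline - current) / baseline
        if drop > max_drop:
            flagged[scenario] = round(drop, 3)
    return flagged
```

For example, calling it with `{"structuring": 100, "trade_based": 10}` before and `{"structuring": 90}` after flags only `trade_based`, the low-volume scenario that disappeared. A flag is a prompt for human review of the change, not proof of a coverage gap.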

23. Relationship to Existing Assets

| Repo asset | Relationship |
| --- | --- |
| docs/ai-foundations/papers/144-ai-aml-alert-triage-investigation-workbench-architecture.md | Architecture-interpretation companion to this playbook. |
| docs/AI_FINANCIAL_CRIME_TYPOLOGY_SCENARIO_COVERAGE_PLAYBOOK.md | This playbook links to its typology coverage and does not repeat its red flag/SAR narrative coverage. |
| docs/AML_COPILOT_PRD.md | Can serve as prototype-first MVP framing. |
| docs/AML_GOVERNANCE_MAP.md | Can serve as an AML copilot governance snapshot. |
| docs/AI_HUMAN_OVERSIGHT_HITL_PLAYBOOK.md | Goes deeper on human oversight, override, handoff and stop path. |
| docs/AI_MODEL_RISK_MANAGEMENT_PLAYBOOK.md | Goes deeper on AI system inventory, validation, monitoring and the model risk lifecycle. |
| docs/AI_SEGREGATION_OF_DUTIES_DUAL_CONTROL_PLAYBOOK.md | Goes deeper on maker-checker, dual control and incompatible duties. |
| docs/AI_AUDIT_EVIDENCE_BINDER_PLAYBOOK.md | Goes deeper on control evidence, the audit binder and the regulator-ready evidence map. |