AI AML Alert Triage / Investigation Workbench Architecture Playbook
Positioning: For senior AI PMs, Senior BAs, Product Architects, AML Technology Architects, Financial Crime Transformation Leads and Model Risk Partners. The playbook upgrades AML alert triage and case investigation from an "alert handling tool" into an evidence-first, human-owned, model-risk-controlled, audit-ready AI workbench.
Scope: transaction monitoring alert queue, manual referral, fraud-to-AML referral, sanctions context handoff, CDD/EDD context refresh, entity resolution, graph investigation, case assembly, analyst copilot, SAR consideration packet, QA sampling, scenario tuning feedback, model risk validation, audit replay.
Core deliverables: reference architecture, capability map, queue prioritization policy, evidence workspace schema, copilot contract, disposition guardrail, SAR draft control, QA rubric, feedback taxonomy, model risk checklist, audit event schema, SoD matrix, metrics dashboard and 30/60/90 implementation roadmap.
0. Disclaimer
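This document is material for learning, portfolio work, architecture practice and internal governance discussion.
It is not legal advice, a compliance conclusion, a SAR filing decision, a suspicious activity determination, a sanctions disposition, a model validation report, an audit report or a regulatory interpretation.
A formal project must be confirmed by Legal, BSA/AML Compliance, Sanctions, Fraud, Risk, Model Risk, Privacy, Security, Operations, Internal Audit, the Business Owner and management, taking into account institution type, regulatory relationships, jurisdiction, products, customers, channels, data, regulatory commitments and internal policy.
Key boundaries:
- AI may assist with alert prioritization, evidence assembly, case summarization, graph explanation, gap detection, draft case notes, QA pre-check and tuning insight.
- AI should not replace the AML/BSA owner's suspicious activity determination or SAR filing decision.
- AI should not automatically file a SAR, close a high-risk case, clear a sanctions hit, notify the customer, contact law enforcement or change account restrictions.
- AI should not turn a red flag into a criminal conclusion.
- AI should not disclose SAR-sensitive information, NSL-sensitive information, restricted investigation content or customer data beyond need-to-know.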
1. Executive Framing
Low-maturity goals for AML AI are typically:
reduce false positives
summarize cases faster
draft SAR narratives
automate analyst work
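Each of these goals has value on its own, but none is sufficient to support production-grade financial crime controls. A mature goal set should be: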
risk-adjusted investigation capacity
+ evidence completeness
+ human decision ownership
+ queue timeliness
+ typology/scenario coverage awareness
+ SAR quality support
+ QA and tuning feedback
+ model risk governance
+ audit replay
In one sentence:
AML investigation AI is an evidence and workflow control system before it is a language model productivity feature.
1.1 Product Principles
- Queue priority must be explainable, route-aware and SLA-aware.
- Entity graph must carry confidence, source and access class.
- Evidence workspace is the product core; copilot is an assistant attached to evidence.
- Disposition recommendation must preserve human ownership.
- SAR draft assistance must be evidence-cited and filing-gated.
- QA findings must feed training, rules, data, prompt, retrieval and staffing decisions through controlled routes.
- Tuning must be coverage-safe, not merely alert-volume-reducing.
- Every high-impact output must be replayable across evidence, model version, prompt version, user action and approval chain.
1.2 Strong Questions
| Question | Strong answer |
|---|---|
| Why is an alert first in queue? | priority band, top drivers, SLA, route, evidence and model/scenario version |
| What entity does this alert belong to? | entity-resolved subject graph with confidence and source lineage |
| What evidence supports the case summary? | cited evidence cards, source record ids, timestamps and freshness |
| What can AI recommend? | next investigation step, evidence gap, disposition options and draft language, not final SAR decision |
| Who owns the decision? | named human role and case state transition |
| How does QA improve the system? | defect taxonomy, feedback routing, change gate and regression eval |
| How can audit replay the case? | append-only event log with evidence ids, user ids, versions, outputs and approvals |
2. Source Anchors
Architecture mapping principle:
official source anchor
-> control objective
-> product requirement
-> workflow state
-> evidence record
-> owner and review cadence
3. End-to-End Operating Model
3.1 Target Flow
1. Alert intake
   rules, models, manual referrals, fraud referrals, sanctions context, law-enforcement request flags
2. Context assembly
   entity resolution, CDD/EDD, expected activity, prior cases, transaction timeline, graph context
3. Queue prioritization
   risk band, SLA, route, top drivers, missing evidence, capacity-aware assignment
4. Investigation workspace
   evidence cards, timeline, graph, source documents, copilot, analyst tasks
5. Human disposition
   close, continue monitoring, request more evidence, escalate, refer, SAR consideration packet
6. SAR draft support
   cited draft packet, quality pre-check, human approval path, filing handoff outside AI autonomy
7. QA and oversight
   sample selection, defect scoring, reviewer calibration, issue logging, training feedback
8. Tuning and governance
   scenario tuning, model/prompt/RAG change, data remediation, coverage regression, validation
9. Audit replay
   event stream, version history, evidence ledger, approval trail, management reporting
3.2 State Machine
| State | Entry condition | Allowed AI support | Human/control gate |
|---|---|---|---|
| alert_received | alert from approved source | normalize, classify source, attach trigger facts | source system contract |
| context_ready | minimum subject and transaction context assembled | entity resolution, CDD snapshot, graph retrieval | entity confidence warning |
| queued | priority and route assigned | priority explanation, SLA assignment | routing policy owner |
| in_review | analyst opens case | evidence summary, timeline, gap checklist | analyst remains reviewer |
| needs_info | required evidence missing | task suggestions, source lookup | task owner and SLA |
| escalated | analyst/supervisor escalates | escalation packet | senior/AML owner assignment |
| sar_consideration | human escalates to SAR consideration | draft packet and QA pre-check | BSA/AML owner decision |
| closed_no_sar | human selects closure | closure rationale draft | closure reason and evidence |
| qa_sampled | risk-based or random QA selection | QA pre-check, defect hints | independent QA reviewer |
| feedback_logged | QA/analyst/model issue captured | taxonomy suggestion | feedback owner and change gate |
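The state table can be enforced as an explicit transition map. The sketch below is a minimal Python illustration, not the workbench's actual workflow engine: the edge set in `ALLOWED`, the `HUMAN_ONLY` set and the `transition` function are all assumptions made for illustration, since the table lists states and gates but not the full set of permitted edges.

```python
from enum import Enum


class CaseState(str, Enum):
    ALERT_RECEIVED = "alert_received"
    CONTEXT_READY = "context_ready"
    QUEUED = "queued"
    IN_REVIEW = "in_review"
    NEEDS_INFO = "needs_info"
    ESCALATED = "escalated"
    SAR_CONSIDERATION = "sar_consideration"
    CLOSED_NO_SAR = "closed_no_sar"
    QA_SAMPLED = "qa_sampled"
    FEEDBACK_LOGGED = "feedback_logged"


S = CaseState

# Hypothetical edge set; the real one belongs to the workflow policy owner.
ALLOWED = {
    S.ALERT_RECEIVED: {S.CONTEXT_READY},
    S.CONTEXT_READY: {S.QUEUED},
    S.QUEUED: {S.IN_REVIEW},
    S.IN_REVIEW: {S.NEEDS_INFO, S.ESCALATED, S.SAR_CONSIDERATION, S.CLOSED_NO_SAR},
    S.NEEDS_INFO: {S.IN_REVIEW},
    S.ESCALATED: {S.NEEDS_INFO, S.SAR_CONSIDERATION, S.CLOSED_NO_SAR},
    S.SAR_CONSIDERATION: {S.QA_SAMPLED},
    S.CLOSED_NO_SAR: {S.QA_SAMPLED},
    S.QA_SAMPLED: {S.FEEDBACK_LOGGED},
    S.FEEDBACK_LOGGED: set(),
}

# States that only a human actor may enter: AI may draft, never transition.
HUMAN_ONLY = {S.ESCALATED, S.SAR_CONSIDERATION, S.CLOSED_NO_SAR}


def transition(current, target, actor_is_human):
    """Return the new state, or raise if the edge or the actor is not permitted."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    if target in HUMAN_ONLY and not actor_is_human:
        raise PermissionError(f"{target.value} requires a human actor")
    return target
```

Keeping the human-only check inside the transition function, rather than in the UI, means a copilot or batch job cannot reach a disposition state even if it calls the workflow service directly.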
3.3 Non-Goals
| Non-goal | Reason |
|---|---|
| Replace transaction monitoring system | Workbench consumes and contextualizes alerts; it should not hide source scenario governance. |
| Replace case management of record | Workbench may integrate or extend CMS, but authoritative case status must be explicit. |
| Auto-file SAR | SAR filing decision and submission are controlled human/compliance workflows. |
| Replace sanctions screening | OFAC/sanctions systems have separate screening, disposition and blocking controls. |
| Replace Legal/Compliance interpretation | Applicability and policy decisions remain owned by control functions. |
4. Capability Map
| Capability | Product function | Architecture services | Control owner |
|---|---|---|---|
| Alert intake | normalize alert and trigger facts | event ingestion, schema registry, source adapter | AML Technology |
| Queue prioritization | route and rank work | priority engine, SLA policy, assignment service | AML Operations |
| Entity resolution | connect subjects/accounts/counterparties | identity graph, match service, confidence scoring | Data/AML Analytics |
| Context graph | reveal relationship and funds flow | graph store, path search, temporal aggregation | AML Analytics |
| Evidence assembly | prepare cited evidence packet | evidence ledger, source connector, provenance service | AML Operations |
| Analyst copilot | summarize, ask, compare, check gaps | RAG, LLM gateway, prompt registry, tool gateway | AI Product/Platform |
| Disposition support | show options and rationale | decision support service, policy/rubric engine | BSA/AML Compliance |
| SAR draft support | draft with citations and guardrails | controlled generation, citation validator, redaction | BSA/AML Compliance |
| QA workbench | sample, score, calibrate, remediate | QA workflow, defect taxonomy, issue tracker | QA / Compliance Testing |
| Feedback loop | route defects to tuning/data/training | feedback registry, change workflow, eval set builder | Model Risk / Scenario Owner |
| Audit replay | reconstruct who saw/did/approved what | append-only event stream, evidence hash, version registry | Internal Audit / Risk |
| Management reporting | track capacity, quality, risk, controls | metrics store, dashboard, KRI service | AML Leadership |
5. Reference Architecture Blueprint
5.1 Logical Layers
Experience
  Queue console | Investigation workbench | Copilot panel | QA console | Management dashboard
Workflow
  Case state machine | Assignment | SLA | Escalation | Maker-checker | SAR handoff | Feedback routing
AI decision support
  Priority model | Entity resolution | Graph analytics | Retrieval | LLM summarization
  Disposition options | SAR draft assist | QA judge | Eval harness
Evidence and context
  Evidence ledger | Timeline builder | CDD snapshot | Counterparty graph | Prior case index
  Typology/scenario link | Sanctions/fraud referral context
Data and integration
  Core banking | Transactions | KYC/CDD/EDD | Case management | Sanctions screening
  Fraud | CRM | Document store | External advisories/lists | User/IAM
Controls and platform
  RBAC/ABAC | SAR confidentiality | Audit log | Version registry | Data lineage
  Model inventory | Monitoring | Incident management | Retention | Encryption
5.2 Component Responsibilities
| Component | Owns | Does not own |
|---|---|---|
| Alert ingestion service | source normalization, schema validation, duplicate detection | suspicious activity conclusion |
| Priority engine | route, SLA, priority band, driver explanation | final case disposition |
| Entity graph service | match, confidence, relationship edges | legal identity conclusion when data conflicts |
| Evidence ledger | immutable evidence references and provenance | free-text narrative truth |
| Copilot orchestrator | RAG, prompt, LLM, tool routing, structured output | direct filing or irreversible account action |
| Citation validator | checks claims against evidence ids | semantic legal sufficiency |
| Policy guardrail engine | prohibited output/action blocks, role permissions | policy interpretation without owner approval |
| Case workflow service | state transitions, assignments, approvals | bypassing CMS of record without integration agreement |
| QA service | sample, rubric, defects, calibration | tuning deployment approval |
| Feedback registry | classify feedback and route changes | automatic retraining from raw outcomes |
| Model risk registry | inventory, validation evidence, revalidation triggers | operational queue management |
| Audit event service | append-only trace and replay | manual reconstruction after missing logs |
6. Queue Prioritization Playbook
6.1 Priority Policy
Queue priority is a policy-backed decision support output:
priority_result:
  alert_id: alert_2026_18491
  priority_band: P1
  route_to_queue: aml_senior_investigation
  SLA_due_at: 2026-07-01T17:00:00Z
  top_drivers:
    - driver: repeated below-threshold cash deposits
      evidence_ids: [ev_tx_001, ev_tx_002, ev_tx_003]
    - driver: new outbound wires to unrelated counterparties
      evidence_ids: [ev_tx_008, ev_counterparty_011]
    - driver: customer high-risk CDD profile
      evidence_ids: [ev_cdd_003]
  uncertainty:
    - entity link to counterparty cluster is medium confidence
  missing_context:
    - latest stated business purpose for new wire recipient
  prohibited_inference: "priority is not SAR filing decision"
6.2 Decision Table
| Condition | Priority action | Routing action | Control note |
|---|---|---|---|
| Repeat alert on same subject with prior escalation | Increase priority | Senior investigator or continuing activity queue | Show prior case metadata under access controls |
| High-risk customer with activity inconsistent with expected profile | Increase priority | EDD-aware queue | CDD freshness and source must be visible |
| Alert triggered by weak entity link only | Do not over-prioritize | Entity review or general queue | Graph confidence must be explicit |
| Sanctions-related context present | Hard escalation flag | Sanctions referral path | AML workbench does not clear sanctions hit |
| Missing critical source data | Route to data exception or needs-info | Operations/data owner | Analyst should not guess missing evidence |
| Aged lower-priority alert near SLA breach | Raise operational priority | Queue manager review | Timeliness is a control dimension |
| Low score but new typology/advisory coverage | Maintain sample or elevated review | Coverage monitoring | Avoid starving emerging risks |
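Most rows of the decision table can be expressed as ordered rules over an alert context. The following Python sketch is illustrative only: the `AlertContext` fields, the queue names, the `P3`/`P4` bands and the rule ordering are assumptions, and the emerging-typology row is omitted because it concerns sampling rather than per-alert routing.

```python
from dataclasses import dataclass

# Band names below P2 are assumptions; the playbook only names P1 and P2.
BANDS = ["P4", "P3", "P2", "P1"]  # ordered lowest to highest


@dataclass
class AlertContext:
    base_band: str = "P3"
    prior_escalation: bool = False
    high_risk_cdd_mismatch: bool = False
    weak_entity_link_only: bool = False
    sanctions_context: bool = False
    missing_critical_data: bool = False
    near_sla_breach: bool = False


def raise_band(band):
    """Move one band up, capped at the highest band."""
    return BANDS[min(BANDS.index(band) + 1, len(BANDS) - 1)]


def prioritize(ctx):
    """Apply the decision-table rows in a fixed order; later rows override the route."""
    band, route, notes = ctx.base_band, "general_queue", []
    if ctx.weak_entity_link_only:
        route = "entity_review"  # no band increase for a weak link alone
        notes.append("graph confidence must be explicit")
    if ctx.prior_escalation:
        band, route = raise_band(band), "senior_investigation"
    if ctx.high_risk_cdd_mismatch:
        band, route = raise_band(band), "edd_aware_queue"
    if ctx.near_sla_breach:
        band = raise_band(band)
        notes.append("queue manager review")
    if ctx.missing_critical_data:
        route = "data_exception"
        notes.append("analyst should not guess missing evidence")
    if ctx.sanctions_context:
        route = "sanctions_referral"
        notes.append("workbench does not clear sanctions hit")
    return {"band": band, "route": route,
            "hard_escalation": ctx.sanctions_context, "notes": notes}
```

For example, `prioritize(AlertContext(weak_entity_link_only=True))` keeps the base band and routes to entity review, which mirrors the "do not over-prioritize" row.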
6.3 Priority Eval
| Eval | Metric |
|---|---|
| Ranking usefulness | percentage of high-risk QA/SAR-consideration cases surfaced in top bands |
| Timeliness | P1/P2 SLA adherence, aged alert count |
| Coverage | typology/scenario distribution across priority bands |
| Fairness/segment stability | priority distribution by product, channel, customer type, geography |
| Analyst trust | override rate with reason, agreement by scenario |
| Control safety | unsupported driver rate, missing evidence route accuracy |
7. Entity Resolution and Graph Context Playbook
7.1 Entity Types
| Entity | Examples | Investigation use |
|---|---|---|
| Person | customer, beneficial owner, authorized signer | subject identification and relationship context |
| Organization | business customer, merchant, employer, shell company lead | ownership and activity purpose |
| Account | deposit, loan, card, wallet, brokerage | funds flow and product behavior |
| Counterparty | ACH originator, wire beneficiary, P2P recipient, check payee | relationship and network risk |
| Device/IP/address | digital footprint, branch address, mailing address | mule/synthetic identity/context |
| Case/SAR-sensitive record | prior case, prior SAR metadata, QA finding | repeat activity and escalation context |
| External list item | sanctions list item, advisory entity, high-risk geography | referral and control context |
7.2 Match Governance
| Match level | Example | UI treatment | Downstream use |
|---|---|---|---|
| Deterministic authoritative | same customer id or account id | merged by default | can support fact statement |
| Deterministic corroborated | tax id plus legal name plus date | merged with source badge | can support escalation fact |
| Probabilistic high | strong name/address/device combination | probable link | can support investigation lead with caveat |
| Probabilistic medium | shared address plus transaction pattern | review link | cannot be stated as confirmed |
| Weak lead | one shared phone/address/IP | lead only | cannot drive final disposition alone |
| Conflict | inconsistent identity sources | warning | requires data review or analyst note |
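The match levels above can drive how a link is worded in copilot output and case notes. This is a minimal sketch under assumptions: the `MATCH_POLICY` keys, the boolean "may be stated as fact" flag and the `phrase_link` helper are hypothetical names, and the wording templates are illustrative rather than approved language.

```python
# level -> (UI treatment, may the link be stated as a fact?)
MATCH_POLICY = {
    "deterministic_authoritative": ("merged by default", True),
    "deterministic_corroborated": ("merged with source badge", True),
    "probabilistic_high": ("probable link", False),
    "probabilistic_medium": ("review link", False),
    "weak_lead": ("lead only", False),
    "conflict": ("warning", False),
}

# Levels that must never be the sole basis for a final disposition.
CANNOT_DRIVE_DISPOSITION_ALONE = {"weak_lead", "conflict"}


def phrase_link(level, subject, other):
    """Word a relationship so that unconfirmed links are never stated as fact."""
    ui_treatment, as_fact = MATCH_POLICY[level]
    if as_fact:
        return f"{subject} is linked to {other} [{ui_treatment}]"
    return f"{subject} may be linked to {other} [{ui_treatment}; not confirmed]"


def can_support_disposition(levels):
    """True only if at least one link is stronger than a weak lead or conflict."""
    return any(level not in CANNOT_DRIVE_DISPOSITION_ALONE for level in levels)
```

Centralizing the wording rule in one function makes the "probable link stated as confirmed" QA defect testable instead of relying on prompt instructions alone.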
7.3 Graph Guardrails
| Risk | Guardrail |
|---|---|
| False merge | confidence bands, explainable match features, analyst unlink path |
| Overexposure | edge-level access class and SAR-sensitive filtering |
| Stale relationship | effective date, last seen date, source freshness |
| Graph clutter | time-windowed paths, amount-weighted edges, typology-focused filters |
| Unsupported inference | distinguish "observed connection" from "suspicious relationship" |
| Feedback contamination | graph correction goes through data quality workflow |
Graph card minimum:
edge_id: edge_9812
from_entity: ent_customer_031
to_entity: ent_counterparty_774
relationship_type: repeated_outbound_wire
first_seen: 2026-05-03
last_seen: 2026-06-18
amount_total: 84200.00
source_evidence_ids: [ev_tx_121, ev_tx_219, ev_tx_311]
confidence: high
access_class: aml_restricted
analyst_confirmed: false
8. Case Assembly and Evidence Workspace
8.1 Workspace Layout
| Region | Content | Design principle |
|---|---|---|
| Header | alert id, subject, priority, SLA, route, status, owner | operational clarity before AI content |
| Trigger panel | scenario/model trigger facts and source version | source signal is always visible |
| Timeline | transactions, referrals, profile changes, prior alerts | chronology beats narrative |
| Graph | subject, accounts, counterparties, beneficial owners, devices | confidence and source shown on edges |
| CDD/EDD context | expected activity, risk profile, occupation/business, beneficial ownership | context for unusual activity, not static KYC display |
| Evidence cards | citations, raw facts, source record ids, freshness | every AI claim must trace here |
| Copilot panel | summary, questions, gap checklist, draft notes | assistant beside evidence |
| Disposition panel | options, rationale, reason codes, required evidence | human-owned decision |
| QA/control panel | warnings, missing fields, unsupported claims, SoD state | control state visible before close/escalate |
8.2 Evidence Checklist
| Evidence class | Minimum checks | Defect examples |
|---|---|---|
| Transaction sequence | amount, date/time, channel, originator, beneficiary, direction | missing counterparty, timezone mismatch |
| Customer profile | CDD risk, expected activity, account purpose, occupation/business | stale profile, missing beneficial owner |
| Counterparty context | relationship history, novelty, geography, recurrence | unresolved entity, unknown recipient |
| Prior activity | prior alerts, cases, SAR-sensitive metadata under access control | inaccessible prior case, missing repeat activity |
| Documents | KYC docs, invoices, account notes, analyst notes | unverified document, OCR extraction error |
| External context | advisories, sanctions referral, law-enforcement request indicator | restricted content exposed to wrong role |
| Analyst rationale | reason code, free-text explanation, evidence references | conclusion without facts |
8.3 Evidence Quality Score
Evidence quality score should guide review readiness, not replace analyst judgment.
| Dimension | Score question |
|---|---|
| Completeness | Are required evidence classes present for this scenario and customer segment? |
| Freshness | Is CDD/EDD and transaction data current enough under internal policy? |
| Consistency | Do sources agree on identity, ownership, account status and activity purpose? |
| Citation readiness | Can summary and draft claims cite source evidence ids? |
| Access validity | Is evidence visible only to authorized roles? |
| Replayability | Can the same evidence packet be reconstructed later? |
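One way to turn the six dimensions into a review-readiness signal is sketched below. The 0 to 1 scale, the equal weighting, the 0.8 threshold and the `review_readiness` function are assumptions for illustration; the playbook does not prescribe a formula, and the output is meant to guide the analyst, not to gate judgment.

```python
DIMENSIONS = (
    "completeness",
    "freshness",
    "consistency",
    "citation_readiness",
    "access_validity",
    "replayability",
)


def review_readiness(scores, threshold=0.8):
    """Summarize per-dimension scores (each 0.0 to 1.0) into a readiness record.

    A case is flagged ready only when every dimension meets the threshold,
    so a high average cannot hide one weak dimension.
    """
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    weak = [d for d in DIMENSIONS if scores[d] < threshold]
    overall = sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)
    return {
        "overall": round(overall, 2),
        "weak_dimensions": weak,
        "ready_for_review": not weak,
    }
```

Reporting the weak dimensions by name, rather than a single number, keeps the score aligned with the evidence checklist and tells the analyst which source to chase.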
9. Analyst Copilot Design
9.1 Copilot Jobs
| Job | Prompt posture | Output contract |
|---|---|---|
| Explain alert | "Summarize trigger facts only from evidence" | trigger summary with evidence ids |
| Build chronology | "Order observed facts by event time" | timeline entries, not motive |
| Compare to expected activity | "Contrast activity with CDD expected profile" | differences, source freshness, missing context |
| Summarize graph | "Describe confirmed and probable relationships separately" | graph facts with confidence |
| Identify evidence gaps | "List missing sources for review" | task list by source owner |
| Suggest next steps | "Offer investigation actions, not decisions" | analyst-selectable checklist |
| Draft case note | "Draft concise note with citations and uncertainty" | editable cited note |
| Pre-check disposition | "Find unsupported claims and missing rationale" | QA warnings |
| Draft SAR packet | "Prepare evidence-cited draft packet for human review" | draft marked non-final |
9.2 Prohibited Copilot Behaviors
| Behavior | Block |
|---|---|
| "This customer committed money laundering" | Replace with observed facts and suspected indicators |
| "File SAR" as final instruction | Use "escalate for SAR consideration" when evidence supports |
| "No SAR needed" as final conclusion | Present closure rationale option for human decision |
| Uncited factual assertion | Require evidence id or remove |
| Revealing SAR existence to unauthorized role | Permission filter and output block |
| Using sanctions hit to clear customer | Route to sanctions workflow |
| Creating customer communication | Block unless separate approved workflow exists |
| Training from unreviewed analyst notes | Route through feedback quality gate |
9.3 Output Rubric
| Dimension | Pass criteria |
|---|---|
| Groundedness | Every factual claim maps to source evidence |
| Boundary | Output states it is decision support |
| Uncertainty | Low/medium confidence links are labeled |
| Completeness | Missing evidence is named |
| Neutrality | No criminal conclusion or accusatory wording |
| Actionability | Suggested next step is role-appropriate |
| Confidentiality | SAR-sensitive and restricted data are handled by role |
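The groundedness and neutrality rows lend themselves to a mechanical pre-check before human review. The sketch below assumes an inline citation format such as `[ev_tx_001]` and a small blocked-phrase list; both are illustrative assumptions, as is the `check_note` function name. It checks that citations exist and resolve, not that the cited evidence actually supports the sentence.

```python
import re

# Assumed inline citation format: [ev_<id>]
CITATION = re.compile(r"\[(ev_[a-z0-9_]+)\]")

# Illustrative phrases drawn from the prohibited-behavior table.
BLOCKED_PHRASES = ("committed money laundering", "file sar", "no sar needed")


def check_note(note, known_evidence_ids):
    """Return (sentence_number, finding) pairs for a drafted note."""
    findings = []
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]
    for number, sentence in enumerate(sentences, start=1):
        cited = CITATION.findall(sentence)
        if not cited:
            findings.append((number, "uncited"))
        for evidence_id in cited:
            if evidence_id not in known_evidence_ids:
                findings.append((number, f"unknown_evidence:{evidence_id}"))
        lowered = sentence.lower()
        for phrase in BLOCKED_PHRASES:
            if phrase in lowered:
                findings.append((number, f"blocked_phrase:{phrase}"))
    return findings
```

A usage example:

```python
note = (
    "Three cash deposits were observed [ev_tx_001]. "
    "Wires went to new counterparties [ev_tx_999]. "
    "The customer committed money laundering."
)
print(check_note(note, {"ev_tx_001"}))
```

The second sentence is flagged for an unknown evidence id, and the third for being both uncited and accusatory.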
10. Disposition Recommendation Guardrails
10.1 Human-Owned Disposition Model
AI suggests options
-> analyst reviews evidence
-> analyst selects disposition
-> required reason/evidence captured
-> senior/compliance gate if escalation threshold reached
-> QA sampling
-> feedback registry
10.2 Disposition Options
| Option | When suggested | Required evidence | Approval |
|---|---|---|---|
| Close as no unusual activity | activity consistent with CDD/EDD and benign evidence exists | trigger facts, benign rationale, CDD consistency | analyst or policy-defined reviewer |
| Continue monitoring | incomplete pattern or emerging activity without enough basis | repeat trigger, monitoring rationale, future review date | analyst/supervisor |
| Request more information | material evidence missing | missing source and owner | operations owner |
| Escalate investigation | risk drivers or unresolved gaps require deeper review | cited risk drivers, gap list | senior investigator route |
| Refer to fraud/sanctions/EDD | domain-specific control signal present | referral evidence and scope | receiving control function |
| SAR consideration packet | human sees sufficient suspicious activity concern for review | chronology, subject data, evidence packet | BSA/AML owner or internal policy role |
10.3 Recommendation Explanation
Every recommendation should include:
- option label.
- evidence for.
- evidence against.
- missing evidence.
- confidence boundary.
- required human role.
- downstream control gate.
- why the option is not a final filing decision.
11. Suspicious Activity Escalation and SAR Draft Guardrails
11.1 Escalation Packet
escalation_packet:
  case_id: case_2026_771
  escalation_type: sar_consideration
  created_by: analyst_42
  ai_assisted: true
  basis:
    - observed rapid movement of funds
    - activity inconsistent with stated business purpose
    - repeated new counterparties
  evidence_ids:
    - ev_tx_01
    - ev_tx_02
    - ev_cdd_01
    - ev_graph_04
  unresolved_questions:
    - relationship to counterparty group
    - legitimate business explanation for wire corridor
  copilot_output_hash: hash_...
  human_owner: aml_supervisor_7
11.2 SAR Draft Workflow
| Step | AI role | Human/control role |
|---|---|---|
| Evidence selection | suggest relevant facts and transactions | analyst selects and confirms |
| Draft chronology | order cited facts | analyst edits |
| Narrative assist | propose neutral wording with citations | BSA/AML reviewer approves wording |
| Quality pre-check | flag missing fields, unsupported facts, risky wording | reviewer resolves or documents exception |
| Filing handoff | prepare packet for approved filing process | authorized filer uses approved workflow |
| Recordkeeping | link draft, final packet id, evidence and approvals | compliance-owned retention process |
11.3 SAR Draft Controls
| Control | Implementation |
|---|---|
| No auto-file | No model/tool permission can submit SAR; filing API absent or human-token gated |
| Citation required | Draft sentences referencing facts require evidence ids |
| Unsupported claim blocker | Unsupported claims are removed or marked for human evidence entry |
| SAR-sensitive retrieval | RAG filters prior SAR content by role, case need and policy |
| Edit diff | Human edits to AI draft are versioned |
| Disclosure warning | UI warns against inappropriate SAR disclosure and customer notification |
| Final decision label | Draft states "AI-assisted draft for human review" |
| Filing record link | Workbench stores handoff id or acknowledgement metadata according to policy, not hidden copy leakage |
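The "no auto-file" control is easiest to reason about when the copilot can only reach tools on an explicit allowlist and no filing tool is registered at all. The sketch below is a hypothetical illustration: `COPILOT_TOOLS`, the two tool names and `call_copilot_tool` are assumed names, and the tool bodies are stubs.

```python
# Allowlist of tools the copilot may call. No filing or submission tool exists here,
# so the control is the absence of the capability rather than a prompt instruction.
COPILOT_TOOLS = {
    "fetch_evidence": lambda evidence_id: {"evidence_id": evidence_id},
    "draft_narrative": lambda facts: (
        "AI-assisted draft for human review: " + "; ".join(facts)
    ),
}


def call_copilot_tool(name, *args):
    """Dispatch a copilot tool call; anything off the allowlist is refused."""
    if name not in COPILOT_TOOLS:
        raise PermissionError(f"tool '{name}' is not available to the copilot")
    return COPILOT_TOOLS[name](*args)
```

With this shape, a model output that requests something like `submit_sar` fails at the gateway regardless of what the prompt or a retrieved document says, and the refusal can be written to the audit event stream.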
12. QA and Quality Control
12.1 QA Sampling Strategy
| Sample type | Purpose |
|---|---|
| Random baseline | measure overall quality and drift |
| Risk-based P1/P2 | check high-risk investigation quality |
| Closure sample | detect weak no-file / no-unusual-activity rationale |
| AI-heavy sample | inspect cases with high copilot reliance |
| Low-confidence entity sample | detect graph/identity errors |
| SAR draft sample | check evidence, wording, completeness and confidentiality |
| Analyst outlier sample | detect training or incentive issues |
| Scenario tuning sample | measure before/after control impact |
12.2 QA Rubric
| Rubric area | Defect |
|---|---|
| Evidence completeness | required source missing without explanation |
| Citation quality | case note or draft claim lacks evidence id |
| CDD usage | expected activity ignored or stale CDD not flagged |
| Entity reasoning | probable link stated as confirmed |
| Disposition rationale | closure/escalation reason not tied to facts |
| SAR boundary | AI output treated as filing decision |
| Confidentiality | restricted content visible to unauthorized role |
| Timeliness | SLA breach without documented reason |
| Copilot safety | hallucination, unsupported inference, unsafe wording |
| Workflow control | maker-checker or approval bypass |
12.3 QA Finding Record
qa_finding:
  finding_id: qaf_2026_1031
  case_id: case_2026_771
  sampled_reason: closure_sample
  defect_type: citation_quality
  severity: medium
  observation: closure rationale references counterparty relationship not supported by evidence
  required_action: reopen_note_for_correction
  owner: analyst_supervisor_3
  system_feedback:
    route: copilot_eval_set
    eligible_for_training: false
    reason: requires corrected human rationale first
13. Tuning Feedback and Continuous Improvement
13.1 Feedback Routes
| Feedback source | Route | Not allowed |
|---|---|---|
| Analyst disagreement | eval enrichment, UI improvement, prompt review | direct model update without review |
| QA defect | training, control remediation, eval cases | hiding defect inside productivity metric |
| False-positive driver | scenario tuning, CDD feature improvement | suppressing segment without coverage review |
| False-negative proxy | scenario gap review, typology coverage update | ignoring because no alert was generated |
| Entity resolution correction | data quality and graph model tuning | silently overwriting historical graph without trace |
| SAR draft defect | prompt/RAG/citation validator fix | letting draft model self-certify |
| Data gap | source owner remediation | asking analyst to work around permanent missing source |
| Adoption friction | workflow redesign | forcing usage through KPI alone |
13.2 Tuning Change Gate
| Gate | Evidence required |
|---|---|
| Business rationale | problem statement, impacted scenario, expected benefit |
| Coverage check | typology/scenario link and blind-spot assessment |
| Data impact | source fields, lineage, DQ score, missingness by segment |
| Backtest | before/after alert volume, precision proxy, sample review |
| QA review | sample defects and closure quality |
| Model risk review | materiality, validation scope, independent challenge |
| Rollout plan | pilot cohort, monitoring, rollback criteria |
| Approval | scenario owner, BSA/AML owner, model risk or governance role as applicable |
13.3 Anti-Gaming Guardrail
Any tuning proposal that only shows alert-volume reduction is incomplete. It must also show:
- high-risk typology coverage impact.
- sampled closure quality.
- false-negative proxy or lookback check.
- product/channel/customer segment slices.
- data quality sensitivity.
- operational capacity effect.
- QA and audit replay readiness.
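The completeness rule above can be checked mechanically when a tuning proposal is submitted. This is a minimal sketch: the field names in `REQUIRED_EVIDENCE` and the `missing_evidence` function are hypothetical, and a non-empty value only shows that evidence was attached, not that it is adequate.

```python
# One key per item in the anti-gaming list; names are illustrative.
REQUIRED_EVIDENCE = (
    "typology_coverage_impact",
    "sampled_closure_quality",
    "false_negative_lookback",
    "segment_slices",
    "data_quality_sensitivity",
    "operational_capacity_effect",
    "qa_audit_replay_readiness",
)


def missing_evidence(proposal):
    """List required evidence items that are absent or empty in a proposal."""
    return [key for key in REQUIRED_EVIDENCE if not proposal.get(key)]


def is_complete(proposal):
    """A proposal showing only alert-volume reduction is never complete."""
    return not missing_evidence(proposal)
```

A proposal such as `{"alert_volume_reduction": "-30%"}` returns all seven items as missing, so it cannot enter the change gate on volume alone.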
14. Model Risk, Eval and Validation
14.1 AI System Inventory
| Inventory item | Required fields |
|---|---|
| Priority engine | model/rule version, features, route policy, owner, validation evidence |
| Entity resolution | algorithm, sources, confidence calibration, false merge/missed link metrics |
| Graph analytics | graph schema, edge confidence, temporal logic, access class |
| RAG retriever | source registry, index version, permission filter, freshness |
| LLM summarizer | model, prompt, output schema, prohibited behaviors, eval report |
| Disposition assist | rubric, options, boundary statement, human gate |
| SAR draft assistant | citation validator, wording guardrails, no-file/no-submit control |
| QA judge | rubric, human alignment, drift monitoring, false pass rate |
| Workflow engine | state transitions, SoD rules, audit events, fallback |
14.2 Eval Suite
| Eval category | Test examples | Release gate |
|---|---|---|
| Groundedness | all factual claims cite correct evidence | critical failures block |
| Missing evidence | system detects absent CDD/EDD/counterparty context | scenario-specific threshold |
| Entity resolution | false merge and missed link by segment | high-risk false merge hard stop |
| Queue ranking | P1/P2 quality, timeliness and coverage | no material coverage regression |
| SAR boundary | no auto-file, no final filing decision, no prohibited disclosure | hard stop |
| Prompt injection | malicious note/document attempts to override policy | hard stop for tool or data leakage |
| Confidentiality | SAR-sensitive and restricted content filtered by role | hard stop |
| Human oversight | reviewer can understand, challenge, override and escalate | UAT plus QA sample |
| QA judge | automated QA aligns with human QA | judge cannot be sole control |
| Drift | production distribution, source freshness, output quality | monitoring alert and review |
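The release-gate column distinguishes hard stops from findings that need review. A hedged sketch of that logic follows; the category keys, the failure-count input and the `release_decision` function are assumptions, and the best outcome is only "eligible for approval" because the approval itself stays with the named human roles.

```python
# Categories the eval suite marks as hard stops; keys are illustrative.
HARD_STOP = {"sar_boundary", "prompt_injection", "confidentiality"}


def release_decision(failures_by_category):
    """Map eval failure counts to a gate outcome and the categories behind it."""
    failing = {c for c, count in failures_by_category.items() if count > 0}
    blockers = sorted(failing & HARD_STOP)
    if blockers:
        return ("blocked", blockers)
    needs_review = sorted(failing - HARD_STOP)
    if needs_review:
        return ("needs_review", needs_review)
    return ("eligible_for_approval", [])
```

For instance, one confidentiality failure blocks the release even when every other category is clean, while a drift finding alone routes to review.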
14.3 Validation Questions
| Area | Challenge question |
|---|---|
| Use case boundary | Is the system framed as decision support, or does workflow pressure make it de facto decisioning? |
| Conceptual soundness | Why are graph/RAG/LLM appropriate for this AML workflow? |
| Data adequacy | Are CDD, transactions, counterparty, sanctions/fraud referrals and case history complete enough? |
| Process verification | Are prompt/model/source/rule changes controlled and replayable? |
| Outcome analysis | Does quality improve without hidden coverage loss? |
| Human oversight | Can analysts and reviewers challenge AI with enough evidence and time? |
| Independent challenge | Are model risk/QA/internal audit able to test without builder conflict? |
15. Audit Trail, Access and Segregation of Duties
15.1 Audit Event Schema
audit_event:
  event_id: aud_2026_88192
  event_type: disposition_selected
  case_id: case_2026_771
  alert_id: alert_2026_18491
  actor_id: analyst_42
  actor_role: aml_analyst
  timestamp: 2026-06-30T15:22:17Z
  action:
    disposition: escalate_for_sar_consideration
    reason_code: pattern_inconsistent_with_cdd
  evidence_ids:
    - ev_tx_01
    - ev_cdd_01
    - ev_graph_04
  system_versions:
    priority_model: prio_v2.1
    prompt: aml_summary_v1.8
    rag_index: aml_evidence_2026_06_29
    entity_resolution: er_v3.4
  output_hash: hash_...
  previous_state: in_review
  new_state: sar_consideration
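One common way to make an event stream tamper-evident is to chain each record to the hash of the previous one. The sketch below uses that technique as an assumption; the playbook asks for an append-only log and output hashes but does not specify hash chaining, and the `AuditLog` class is a hypothetical in-memory illustration rather than a storage design.

```python
import hashlib
import json


class AuditLog:
    """In-memory, append-only event log with a SHA-256 hash chain."""

    GENESIS = "genesis"

    def __init__(self):
        self._records = []

    @staticmethod
    def _digest(prev_hash, event):
        # Canonical JSON so the same event always hashes the same way.
        payload = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, event):
        prev_hash = self._records[-1]["chain_hash"] if self._records else self.GENESIS
        stored = json.loads(json.dumps(event))  # defensive copy of the event
        chain_hash = self._digest(prev_hash, stored)
        self._records.append(
            {"event": stored, "prev_hash": prev_hash, "chain_hash": chain_hash}
        )
        return chain_hash

    def verify(self):
        """Recompute the chain; any edited or reordered record breaks it."""
        prev_hash = self.GENESIS
        for record in self._records:
            expected = self._digest(prev_hash, record["event"])
            if record["prev_hash"] != prev_hash or record["chain_hash"] != expected:
                return False
            prev_hash = record["chain_hash"]
        return True
```

Replay then becomes a matter of reading the records in order and calling `verify()` first; a failed verification is itself an audit finding.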
15.2 Access Model
| Data class | Access principle |
|---|---|
| Standard customer data | least privilege, business purpose, logged access |
| AML restricted evidence | role-based AML need-to-know |
| SAR-sensitive content | strict role and case need; no broad RAG exposure |
| NSL-sensitive indicator | special handling; avoid exposing content beyond policy |
| Sanctions hit detail | sanctions role visibility and referral controls |
| Model/prompt logs | sensitive operational access; redact customer data where possible |
| QA findings | QA/compliance/model risk access with management reporting aggregation |
15.3 SoD Matrix
| Duty combination | Control |
|---|---|
| Analyst triages and independently QA-reviews same case | Block; assign independent QA |
| Analyst creates SAR draft and approves filing | Require BSA/AML owner or filing role per policy |
| Model owner changes priority model and approves validation | Independent model risk challenge |
| Scenario owner suppresses alert type without compliance approval | Tuning change gate and coverage review |
| Admin can alter evidence ledger and audit event stream | Immutable logs, dual control, privileged access monitoring |
| Vendor supplies model and certifies production quality | Internal validation, QA sample and audit rights |
| Queue manager lowers priority due to staffing alone | Document risk acceptance or reassign capacity |
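
Several rows in the matrix above reduce to "the same actor must not hold two incompatible duties on the same object". A minimal sketch of that check follows, assuming duty names such as `triage` and `qa_review` (hypothetical labels) and a flat list of recorded actions.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Assumption: duty names and incompatible pairs, loosely following the matrix above.
INCOMPATIBLE_DUTIES = [
    ("triage", "qa_review"),
    ("sar_draft", "sar_filing_approval"),
    ("model_change", "validation_approval"),
]


def find_sod_violations(
    actions: List[Tuple[str, str, str]]
) -> List[Tuple[str, str, str, str]]:
    """actions are (object_id, actor_id, duty) tuples.

    Returns (object_id, actor_id, duty_a, duty_b) for every incompatible pair
    held by one actor on one object.
    """
    duties: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for object_id, actor_id, duty in actions:
        duties[(object_id, actor_id)].add(duty)
    violations = []
    for (object_id, actor_id), held in sorted(duties.items()):
        for duty_a, duty_b in INCOMPATIBLE_DUTIES:
            if duty_a in held and duty_b in held:
                violations.append((object_id, actor_id, duty_a, duty_b))
    return violations


actions = [
    ("case_2026_771", "analyst_42", "triage"),
    ("case_2026_771", "analyst_42", "qa_review"),   # same analyst reviews own case
    ("case_2026_772", "analyst_42", "triage"),
    ("case_2026_772", "qa_07", "qa_review"),        # independent reviewer
]
violations = find_sod_violations(actions)
```

The same check can run as a preventive control at assignment time or as a detective control over the audit event stream.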
16. Controls and Evidence Checklist
| Control objective | Evidence | Owner |
|---|---|---|
| Alert sources are complete and approved | source inventory, data contract, ingestion reconciliation | AML Technology |
| Monitoring outputs are timely | alert generation logs, SLA dashboard | AML Operations |
| Queue routing is explainable | priority record, driver evidence, route policy | AML Operations |
| CDD/EDD context is used | CDD snapshot, freshness indicator, expected activity comparison | AML Compliance/Data Owner |
| Entity graph is controlled | match report, confidence calibration, correction log | Data/Analytics |
| Copilot outputs are grounded | citation validation logs, eval report, sampled outputs | AI Product/QA |
| SAR filing is human-owned | workflow state, approval record, no-submit technical control | BSA/AML Compliance |
| SAR-sensitive data is protected | access logs, role matrix, retrieval filter tests | Security/Compliance |
| QA is independent | sample plan, QA findings, reviewer independence evidence | QA/Compliance Testing |
| Tuning is governed | change request, coverage regression, approval | Scenario Owner/Model Risk |
| Model risk is managed | inventory, validation plan, monitoring, revalidation record | Model Risk |
| Audit replay is possible | append-only event log, version registry, evidence ids | Internal Audit/Platform |
| Deficiencies are remediated | issue log, CAPA, closure evidence, management reporting | Control Owner |
17. Metrics Dashboard
17.1 Executive View
| Metric | Healthy signal | Risk signal |
|---|---|---|
| Risk-adjusted investigation capacity | more high-risk cases reviewed with stable quality | throughput up, QA quality down |
| Queue aging | fewer aged P1/P2 alerts | old high-risk alerts accumulating |
| Evidence completeness | required evidence present before disposition | closures with missing CDD/counterparty context |
| SAR consideration quality | fewer unsupported draft defects | more narrative defects or late escalations |
| Coverage stability | no material typology/channel blind spot | alert suppression after tuning |
| Human oversight | meaningful overrides and reasoned disagreements | near-100 percent acceptance of AI suggestions |
| Audit completeness | trace events complete | missing model/prompt/evidence versions |
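
Two of the signals above, queue aging and near-total acceptance of AI suggestions, can be computed directly from alert and decision records. This is a minimal sketch: the `SLA_HOURS` values, field names and sample records are illustrative assumptions, not recommended thresholds.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

SLA_HOURS = {"P1": 24, "P2": 72}  # hypothetical SLA per priority band


def aged_alert_ids(alerts: List[Dict], now: datetime) -> List[str]:
    """Return ids of open P1/P2 alerts that are older than their SLA."""
    aged = []
    for alert in alerts:
        limit = SLA_HOURS.get(alert["priority"])
        if limit is None or alert["status"] != "open":
            continue
        if now - alert["created_at"] > timedelta(hours=limit):
            aged.append(alert["alert_id"])
    return aged


def ai_acceptance_rate(decisions: List[Dict]) -> Optional[float]:
    """Share of decisions where the human disposition equals the AI suggestion."""
    compared = [d for d in decisions if d.get("ai_suggestion") is not None]
    if not compared:
        return None
    accepted = sum(1 for d in compared if d["ai_suggestion"] == d["human_disposition"])
    return accepted / len(compared)


now = datetime(2026, 6, 30, 15, 0, tzinfo=timezone.utc)
alerts = [
    {"alert_id": "a1", "priority": "P1", "status": "open",
     "created_at": datetime(2026, 6, 29, 10, 0, tzinfo=timezone.utc)},
    {"alert_id": "a2", "priority": "P2", "status": "open",
     "created_at": datetime(2026, 6, 29, 10, 0, tzinfo=timezone.utc)},
    {"alert_id": "a3", "priority": "P1", "status": "closed",
     "created_at": datetime(2026, 6, 25, 10, 0, tzinfo=timezone.utc)},
]
decisions = [
    {"ai_suggestion": "close", "human_disposition": "close"},
    {"ai_suggestion": "escalate", "human_disposition": "escalate"},
    {"ai_suggestion": "close", "human_disposition": "escalate"},
    {"ai_suggestion": "close", "human_disposition": "close"},
    {"ai_suggestion": None, "human_disposition": "close"},  # no AI suggestion shown
]
aged = aged_alert_ids(alerts, now)
rate = ai_acceptance_rate(decisions)
```

A very high acceptance rate is a prompt to inspect override quality and review time, not proof of automation bias on its own.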
17.2 Analyst Adoption View
| Metric | Interpretation |
|---|---|
| Evidence card usage | whether analysts rely on structured evidence |
| Copilot summary edit distance | whether drafts are useful without being rubber-stamped |
| Gap checklist completion | whether AI improves investigation completeness |
| Recommendation override reasons | trust calibration and model/product issues |
| Time in source systems | reduction indicates workbench value |
| Escalation packet rework | high rework indicates evidence or UI issues |
17.3 Model/Control View
| Metric | Interpretation |
|---|---|
| Unsupported claim rate | hallucination or citation failure |
| Retrieval miss rate | RAG/source/index issue |
| Entity false merge rate | graph risk |
| Priority override concentration | routing/model bias or policy mismatch |
| QA judge false pass rate | automated QA cannot be trusted alone |
| SAR-sensitive access exceptions | confidentiality control risk |
| Tuning rollback count | change quality and coverage impact |
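
Priority override concentration can be approximated as the share of overrides attributable to each value of a grouping key, such as scenario or segment. The sketch below is one simple reading of that metric; the function name, field names and sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def override_concentration(overrides: List[Dict], key: str) -> Dict[str, float]:
    """Share of priority overrides per value of `key`, largest share first."""
    counts = Counter(o[key] for o in overrides)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: n / total for value, n in counts.most_common()}


overrides = [
    {"alert_id": "a1", "scenario": "structuring"},
    {"alert_id": "a2", "scenario": "structuring"},
    {"alert_id": "a3", "scenario": "rapid_movement"},
    {"alert_id": "a4", "scenario": "structuring"},
]
shares = override_concentration(overrides, "scenario")
```

If one scenario or segment carries most of the overrides, that points to a routing or policy mismatch worth reviewing for that slice rather than a general model issue.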
18. Implementation Roadmap
18.1 First 30 Days - Evidence-First MVP
| Workstream | Deliverable |
|---|---|
| Scope | choose 2-3 alert scenarios and 1-2 queues |
| Data | source inventory, CDD/transaction/case evidence schema, lineage |
| UX | queue console, evidence cards, timeline, basic disposition panel |
| AI | retrieve-and-summarize with citation validator; no SAR draft yet |
| Controls | role model, audit event schema, no-auto-SAR technical boundary |
| Eval | groundedness, citation accuracy, missing evidence detection |
| Ops | analyst pilot, feedback taxonomy, daily defect review |
18.2 Days 31-60 - Graph and Controlled Copilot
| Workstream | Deliverable |
|---|---|
| Entity resolution | confidence-banded entity graph and correction workflow |
| Queue | driver-based prioritization and SLA routing |
| Copilot | gap checklist, investigation plan, case note draft |
| QA | sample plan, rubric, QA console, defect taxonomy |
| Model risk | system inventory, validation scope, revalidation triggers |
| Metrics | productivity, quality, adoption and control dashboard |
18.3 Days 61-90 - SAR Packet and Governance Loop
| Workstream | Deliverable |
|---|---|
| SAR assist | evidence-cited draft packet, unsupported-claim blocker, human filing handoff |
| Tuning | scenario feedback workflow, coverage regression, change approval |
| Independent challenge | model risk/QA validation report and issue log |
| Audit | case replay pack, event completeness test, privileged access review |
| Training | analyst AI literacy, automation-bias drill, SAR confidentiality handling |
| Management | monthly KRI/KPI pack and control remediation review |
19. Implementation Guardrails
| Area | Guardrail |
|---|---|
| Product scope | Start with evidence assembly and cited summaries before SAR narrative generation |
| Data | Do not ingest unrestricted SAR-sensitive content into broad RAG index |
| UX | Do not make AI recommendation the default selected disposition |
| Workflow | Require human reason code and evidence references for close/escalate decisions |
| Model | Version prompts, models, retrievers, rules, thresholds and graph algorithms |
| Feedback | Separate raw analyst action from QA-reviewed training label |
| Operations | Do not tune alert volume to staffing capacity without risk acceptance |
| Security | Enforce row/field/evidence-level access, not just page-level roles |
| Audit | Design trace schema before pilot; screenshots are not enough |
| Vendor | Contract for logs, version notices, data handling, audit rights and exit |
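
The UX and workflow guardrails above can be expressed as a server-side check that runs before any disposition is recorded. This is a minimal sketch: `DispositionRequest`, `validate_disposition` and every disposition name other than `escalate_for_sar_consideration` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: the set of dispositions an analyst may select.
ALLOWED_DISPOSITIONS = {
    "close_no_action",
    "request_more_information",
    "escalate_for_sar_consideration",
}


@dataclass
class DispositionRequest:
    case_id: str
    actor_id: str
    actor_is_human: bool
    disposition: Optional[str]  # stays None until the analyst actively selects one
    reason_code: Optional[str]
    evidence_ids: List[str] = field(default_factory=list)


def validate_disposition(req: DispositionRequest) -> List[str]:
    """Return a list of blocking errors; an empty list means the request may proceed."""
    errors = []
    if not req.actor_is_human:
        errors.append("disposition must be submitted by a human actor")
    if req.disposition not in ALLOWED_DISPOSITIONS:
        errors.append("disposition must be explicitly selected; no AI default")
    if not req.reason_code:
        errors.append("human reason code is required")
    if not req.evidence_ids:
        errors.append("at least one evidence reference is required")
    return errors


ok_request = DispositionRequest(
    case_id="case_2026_771",
    actor_id="analyst_42",
    actor_is_human=True,
    disposition="escalate_for_sar_consideration",
    reason_code="pattern_inconsistent_with_cdd",
    evidence_ids=["ev_tx_01", "ev_cdd_01"],
)
bad_request = DispositionRequest(
    case_id="case_2026_771",
    actor_id="copilot",
    actor_is_human=False,
    disposition=None,
    reason_code="",
)
```

Keeping the check on the server, not only in the UI, means a client that preselects the AI suggestion still cannot record a decision without a human reason code and evidence references.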
20. Anti-Patterns and Fixes
| Anti-pattern | Symptom | Fix |
|---|---|---|
| LLM-first AML | impressive summaries but weak evidence | evidence ledger first, copilot second |
| SAR draft as MVP | narrative looks good, facts not controlled | build case assembly and citation validator first |
| Auto-close low score | lower backlog but higher blind-spot risk | human closure, QA sample, coverage metrics |
| Graph magic | dense network visual with no confidence | confidence-banded edges and source cards |
| Hidden decisioning | AI recommendation becomes default operational outcome | explicit human gate and non-default UI |
| Productivity tunnel vision | faster cases but more QA defects | balanced scorecard |
| Tuning by complaints | threshold changes react to queue pressure | governed change gate and regression testing |
| Self-validating AI | model/judge grades its own work | independent QA and human calibration |
| Audit after launch | missing versions and evidence ids | event schema and version registry from day one |
21. PM / Architect Implications
21.1 PM Implications
| PM concern | Practical stance |
|---|---|
| Value proposition | Sell capacity plus quality plus auditability, not "AI replaces analysts" |
| MVP | Evidence workspace and cited copilot before SAR draft |
| User adoption | Measure edit distance, evidence card usage, override reasons and rework |
| Risk appetite | Define which dispositions need senior review, QA and model risk gates |
| Training | Include automation bias, citation checking, confidentiality and escalation drills |
| Stakeholder alignment | Operations wants speed; Compliance wants quality; Model Risk wants validation; Audit wants replay |
21.2 Architect Implications
| Architecture concern | Practical stance |
|---|---|
| Source of truth | Source systems plus evidence ledger, not model text |
| Context engine | Entity graph and timeline should be independently testable |
| AI orchestration | Use structured outputs, citation validation and tool permissions |
| Security | SAR-sensitive and investigation data need fine-grained retrieval filters |
| Observability | Log user action, evidence ids, model/prompt/index versions and output hashes |
| Resilience | Provide manual fallback when AI/RAG/graph services degrade |
| Governance | Connect system inventory, release gates, monitoring and revalidation triggers |
22. Interview Pack
Q1: How would you design an AML alert triage AI workbench?
30-second version:
I would design it as an evidence-first workbench, not a chatbot. The core chain is alert intake, entity resolution, CDD/EDD context, graph/timeline evidence assembly, risk-based queue prioritization, analyst copilot, human-owned disposition, SAR draft guardrails, QA, tuning feedback and audit replay. AI assists with prioritization, summarization, gap detection and drafting, but does not make the SAR filing decision.
2-minute version:
Architecturally, build the evidence ledger and entity graph first. Each alert retains its source scenario/model version, trigger facts, SLA and route. Entity resolution outputs a confidence-banded subject graph, and every graph edge carries a source, timestamp and access class. In the workbench, the analyst sees the timeline, CDD expected activity, counterparty graph, prior case context and evidence cards. The copilot may only produce evidence-based structured output (summary, gap checklist, next step and disposition options), and every fact must carry an evidence id. The SAR draft is only a human-reviewed packet, with no auto-filing tool permission. Go-live controls include QA sampling, feedback taxonomy, scenario tuning change gate, model risk validation, SAR-sensitive access control and an append-only audit trail. Value metrics look at investigation time, QA defects, queue aging, coverage, overrides and audit completeness together.
Q2: How do you prevent AI from misleading analysts?
| Control | Explanation |
|---|---|
| Evidence-first UI | Analysts see the evidence and its sources before the AI summary |
| Citation validator | Unsupported factual claim cannot pass |
| Non-default recommendation | AI option cannot be one-click default decision |
| Uncertainty label | Low/medium confidence graph links are clearly labeled |
| QA sample | AI-heavy cases sampled more aggressively |
| Training | Automation bias and confidentiality drills |
| Override analytics | Monitor over-acceptance and disagreement quality |
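
The citation validator control can start as a structural check: every sentence in a copilot summary must cite at least one evidence id that exists in the case's evidence ledger. The sketch below assumes a bracketed citation format such as `[ev_tx_01]`, which is a hypothetical convention; the sample sentences are made up for illustration.

```python
import re
from typing import Dict, List

# Assumption: citations appear inline as [ev_<id>].
CITATION_PATTERN = re.compile(r"\[(ev_[a-z0-9_]+)\]")


def unsupported_claims(sentences: List[str], evidence_ledger: Dict[str, str]) -> List[str]:
    """Return sentences with no citation or with a citation missing from the ledger."""
    flagged = []
    for sentence in sentences:
        cited = CITATION_PATTERN.findall(sentence)
        if not cited or any(eid not in evidence_ledger for eid in cited):
            flagged.append(sentence)
    return flagged


evidence_ledger = {
    "ev_tx_01": "transaction extract",
    "ev_cdd_01": "CDD snapshot",
}
summary_sentences = [
    "Cash deposits were made on six consecutive days [ev_tx_01].",
    "Observed activity exceeds the expected activity in the CDD profile [ev_cdd_01].",
    "The customer is laundering proceeds.",
    "Funds moved to a linked counterparty [ev_graph_09].",
]
flagged = unsupported_claims(summary_sentences, evidence_ledger)
```

This only proves that a citation exists and resolves. It does not prove that the cited evidence supports the sentence, so it should sit alongside sampled human review and groundedness evaluation.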
Q3: How do you handle SAR drafts?
| Principle | Answer |
|---|---|
| Boundary | AI drafts evidence-cited packet, not filing decision |
| Evidence | Every factual statement maps to transaction/CDD/case evidence |
| Language | Neutral, observed facts, no criminal conclusion |
| Confidentiality | SAR-sensitive access and retrieval controls |
| Approval | Human BSA/AML owner reviews and filing process remains controlled |
| Audit | Store model/prompt/evidence/diff/approval chain |
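
The boundary principle above can be backed by a technical control: the copilot's tool dispatcher only knows an allowlist, and filing is not on it. This is a minimal sketch; the tool names, `call_copilot_tool` and `ToolNotPermitted` are hypothetical and do not refer to any specific agent framework.

```python
from typing import Any, Callable, Dict

# Assumption: the only tools the copilot may invoke. Filing is deliberately absent.
COPILOT_TOOL_ALLOWLIST = frozenset({
    "search_evidence",
    "get_cdd_snapshot",
    "draft_sar_packet",
})


class ToolNotPermitted(Exception):
    """Raised when the copilot requests a tool outside its allowlist."""


def call_copilot_tool(name: str, registry: Dict[str, Callable[..., Any]], **kwargs: Any) -> Any:
    # Check the allowlist first, so a tool that exists in the registry
    # is still unreachable from the copilot unless explicitly permitted.
    if name not in COPILOT_TOOL_ALLOWLIST:
        raise ToolNotPermitted(f"copilot may not call '{name}'")
    return registry[name](**kwargs)


registry = {
    "draft_sar_packet": lambda case_id: {
        "case_id": case_id, "status": "draft_pending_human_review"},
    # Present in the platform, but reserved for the human filing workflow.
    "file_sar": lambda case_id: {"case_id": case_id, "status": "filed"},
}
draft = call_copilot_tool("draft_sar_packet", registry, case_id="case_2026_771")
```

Denied calls are worth logging as audit events, since repeated attempts to reach a filing tool may indicate a prompt or orchestration defect.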
Q4: How do you demonstrate that the system has not reduced AML coverage?
| Evidence | Use |
|---|---|
| Typology/scenario coverage matrix | shows what remains covered after tuning |
| Priority distribution by scenario | detects starvation of low-volume risks |
| QA closure sample | catches weak closures |
| False-negative proxies | repeat alerts, lookbacks, fraud referrals, law-enforcement feedback where permitted |
| Segment slices | product/channel/geography/customer type stability |
| Regression eval | before/after comparison for each model, rule, prompt or threshold change |
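
A regression eval for coverage can begin with a before/after comparison of alert counts per scenario over comparable periods. The sketch below flags scenarios that disappear or drop sharply; the `max_relative_drop` threshold, scenario names and counts are illustrative assumptions, not recommended values.

```python
from typing import Dict


def coverage_regressions(
    before: Dict[str, int],
    after: Dict[str, int],
    max_relative_drop: float = 0.5,  # hypothetical tolerance
) -> Dict[str, float]:
    """Return scenario -> relative drop for scenarios that vanished or fell too far."""
    flagged = {}
    for scenario, old_count in before.items():
        if old_count == 0:
            continue  # nothing to compare against
        new_count = after.get(scenario, 0)
        drop = (old_count - new_count) / old_count
        if new_count == 0 or drop > max_relative_drop:
            flagged[scenario] = round(drop, 2)
    return flagged


alerts_before = {"structuring": 200, "trade_based": 10, "rapid_movement": 80}
alerts_after = {"structuring": 150, "trade_based": 0, "rapid_movement": 20}
regressions = coverage_regressions(alerts_before, alerts_after)
```

A flagged scenario is a question for the change gate, not a verdict: the drop may be an intended tuning effect, but it should be explained and approved rather than discovered later.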
23. Relationship to Existing Assets
| Repo asset | Relationship |
|---|---|
| docs/ai-foundations/papers/144-ai-aml-alert-triage-investigation-workbench-architecture.md | Architecture-interpretation companion to this playbook. |
| docs/AI_FINANCIAL_CRIME_TYPOLOGY_SCENARIO_COVERAGE_PLAYBOOK.md | This document links to typology coverage and does not repeat red flag/SAR narrative coverage. |
| docs/AML_COPILOT_PRD.md | Can serve as prototype-first MVP framing. |
| docs/AML_GOVERNANCE_MAP.md | Can serve as an AML copilot governance snapshot. |
| docs/AI_HUMAN_OVERSIGHT_HITL_PLAYBOOK.md | Goes deeper on human oversight, override, handoff and stop path. |
| docs/AI_MODEL_RISK_MANAGEMENT_PLAYBOOK.md | Goes deeper on AI system inventory, validation, monitoring and model risk lifecycle. |
| docs/AI_SEGREGATION_OF_DUTIES_DUAL_CONTROL_PLAYBOOK.md | Goes deeper on maker-checker, dual control and incompatible duties. |
| docs/AI_AUDIT_EVIDENCE_BINDER_PLAYBOOK.md | Goes deeper on control evidence, audit binder and regulator-ready evidence map. |
docs/AI_AUDIT_EVIDENCE_BINDER_PLAYBOOK.md | 深化 control evidence、audit binder 和 regulator-ready evidence map。 |