AI Financial Crime Typology / Scenario Coverage Playbook

799AI_FINANCIAL_CRIME_TYPOLOGY_SCENARIO_COVERAGE_PLAYBOOK.md

AI Financial Crime Typology / Scenario Coverage / SAR Evidence Architecture Playbook

Positioning: For senior AI PMs, Senior BAs, Product Architects, AML Technology Architects and Financial Crime Transformation Leads. The goal is to upgrade AML / fraud / sanctions / scam / SAR-assist AI from single-point detection models into a typology-driven, coverage-aware, evidence-backed, human-owned production control system.

Scope: transaction monitoring, AML alert triage, fraud-to-AML referral, KYC/CDD refresh, case investigation copilot, SAR narrative assistant, FinCEN advisory response, scenario coverage review, and model/rule/LLM triage governance.

Core deliverables: typology object model, scenario library governance, red flag evidence mapping, coverage measurement, synthetic and real-case eval, SAR narrative evidence bundle, alert-to-case-to-SAR traceability, operating model, KRI dashboard and a 30-day portfolio lab.


0. Disclaimer

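This document is material for learning, portfolio building, architecture training and internal governance discussion.

It is not legal advice, a compliance conclusion, a SAR filing decision, a suspicious activity determination, a regulatory interpretation, a model validation report or an audit report.

Any formal project must be confirmed by Legal, BSA/AML Compliance, Fraud, Sanctions, Risk, Model Risk, Privacy, Security, Operations, Internal Audit, the Business Owner and senior management, taking into account institution type, jurisdiction, products, customers, channels, data, regulatory commitments and internal policy.

Key boundaries:

  • AI may assist in identifying signals, organizing evidence, checking for gaps, generating analyst summaries, drafting narratives and running QA pre-checks.
  • AI should not replace the human AML / BSA / Compliance owner's SAR decision.
  • AI should not infer a criminal conclusion from a red flag.
  • AI should not expose SAR-sensitive information to unauthorized users or to customers.
  • AI should not decide whether to inform a customer, close an account, contact law enforcement or file a SAR.
  • SAR confidentiality, supporting documentation, retention, access and disclosure must be handled according to applicable rules and institutional policy.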

1. Executive Framing

A common mistake in financial crime AI is to shrink the goal to a single metric:

reduce false positives
increase alert precision
draft SAR narratives faster
automate analyst work

These goals are valuable but not sufficient. The senior-level architecture goal should be:

typology coverage
  + scenario effectiveness
  + evidence completeness
  + human decision ownership
  + SAR quality
  + audit replay
  + continuous threat update

In one sentence:

Financial crime AI is a coverage and evidence architecture before it is a model optimization problem.

1.1 Product Principles

  1. The typology library is a product control asset.
  2. The scenario inventory is an architecture asset.
  3. The SAR evidence bundle is a compliance decision asset.
  4. LLM triage is an assistant capability, not a decision authority.
  5. The coverage dashboard should feed management review.
  6. QA findings must flow back into typology, scenario, data, prompt, model, training and staffing.

1.2 Strong Questions

| Question | Strong answer |
| --- | --- |
| Which typologies are in scope? | typology registry with source anchors and owners |
| Which products/channels/customers are covered? | coverage matrix by segment and data dependency |
| Which red flags are observable? | red flag evidence map |
| Which scenarios are active / partial / manual / gap? | scenario inventory |
| Which data quality issues create blind spots? | feature contract and DQ score |
| Which cases can be replayed? | evidence ledger and trace id |
| Which SAR narratives have evidence defects? | SAR QA rubric and sample findings |

2. Source Anchors

| Anchor | Official link | How this document uses it |
| --- | --- | --- |
| FFIEC BSA/AML Suspicious Activity Reporting Overview | https://bsaaml.ffiec.gov/manual/SuspiciousActivityReporting/01; current manual path: https://bsaaml.ffiec.gov/manual/AssessingComplianceWithBSARegulatoryRequirements/04 | Organizes the evidence chain for suspicious activity identification, alert management, SAR decision making, SAR completion, supporting documentation, confidentiality, continuing activity and board/management notification |
| FFIEC BSA/AML Appendix F Red Flags | https://bsaaml.ffiec.gov/manual/Appendices/08; current Appendix F path: https://bsaaml.ffiec.gov/manual/Appendices/07 | Organizes red flag taxonomy, additional scrutiny, key terms, management focus and typology-to-scenario mapping. /Appendices/08 currently displays Appendix G Structuring, so verify the official path before citing it in practice |
| FinCEN Advisories / Bulletins / Fact Sheets | https://www.fincen.gov/resources/advisoriesbulletinsfact-sheets | Serves as a source feed for emerging typologies, red flags, key terms, sector threats and scenario refresh |
| FinCEN BSA Filing Information | https://www.fincen.gov/resources/filing-information | Constrains BSA E-Filing, electronic filing instructions, SAR/CTR filing resources and the filing operations handoff |
| FATF Recommendations | https://www.fatf-gafi.org/en/publications/Fatfrecommendations/Fatf-recommendations.html | Reference for the international AML/CFT/CPF risk-based framework, CDD, recordkeeping, suspicious transaction reporting and effectiveness orientation |
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Uses Govern / Map / Measure / Manage to organize AI risk ownership, scenario mapping, eval, monitoring, KRIs and continuous improvement |

Architecture design should not blend these sources into a single rule. A better path is:

source anchor -> control objective -> scenario requirement -> evidence requirement
  -> human owner -> review cadence -> jurisdiction / entity applicability check

3. Typology Object Model

A typology is a business and risk object describing a financial crime pattern. It is not an alert rule, and it is not a SAR category.

3.1 Typology Entity

typology_id: typ_mule_account_digital_onboarding
name: Digital mule account with rapid movement
risk_family: money_laundering_fraud
source_anchors: [FinCEN_advisory, FFIEC_red_flags, internal_risk_assessment]
customer_segments: [consumer, student, gig_worker]
products_channels: [digital_account_opening, ACH, debit_card, P2P, crypto_on_ramp]
red_flags: [rapid_movement, unrelated_counterparties, shared_device_cluster, thin_profile]
evidence_requirements: [account_open_date, transaction_sequence, counterparty_graph, device_network, KYC_profile, analyst_notes]
SAR_relevance: possible_key_terms_and_narrative_prompts
owner: AML Typology Owner
review_cadence: monthly_high_risk_or_quarterly_standard
status: active

3.2 Required Fields

| Field | Purpose |
| --- | --- |
| typology_id | Stable identity for coverage and evidence |
| risk_family | ML, TF, sanctions, fraud, scam, elder exploitation, cyber, TBML |
| source_anchor | FFIEC, FinCEN, FATF, internal risk assessment, audit issue, law enforcement feedback |
| business_definition | Plain-language description for PM / BA / analyst |
| customer_scope | Customer segments where it applies |
| product_channel_scope | Products and channels where it can manifest |
| red_flag_ids | Observable indicators |
| scenario_ids | Detection and review scenarios |
| evidence_requirements | Minimum evidence for investigation and SAR consideration |
| false_positive_drivers | Legitimate explanations or benign patterns |
| human_review_questions | Questions analyst must answer |
| SAR_relevance | Potential SAR categories, key terms, narrative prompts |
| control_owner | Accountable owner |
| coverage_status | active, partial, manual, gap, retired |
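
One way to make the required-field list enforceable is a small registry validator. The following is a minimal sketch, assuming typology records are loaded as plain dictionaries keyed by the field names in the table; `validate_typology` and the two constants are hypothetical names introduced here for illustration, not part of any existing tool.

```python
# Field names are taken from the Required Fields table; the function name
# and constants are illustrative assumptions.
REQUIRED_FIELDS = {
    "typology_id", "risk_family", "source_anchor", "business_definition",
    "customer_scope", "product_channel_scope", "red_flag_ids", "scenario_ids",
    "evidence_requirements", "false_positive_drivers", "human_review_questions",
    "SAR_relevance", "control_owner", "coverage_status",
}
COVERAGE_STATUSES = {"active", "partial", "manual", "gap", "retired"}


def validate_typology(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = [
        f"missing field: {name}"
        for name in sorted(REQUIRED_FIELDS - record.keys())
    ]
    status = record.get("coverage_status")
    if status is not None and status not in COVERAGE_STATUSES:
        problems.append(f"unknown coverage_status: {status}")
    return problems
```

A check like this could run whenever a typology card is added or edited, so that a record without an owner or coverage status never reaches the registry. It only checks presence and the status vocabulary, not whether the content is adequate.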

3.3 Typology Families

| Family | Examples | Design focus |
| --- | --- | --- |
| Structuring | cash below threshold, monetary instruments, multi-branch pattern | transaction sequence and reporting avoidance context |
| Mule accounts | rapid movement, pass-through, network cluster | graph, velocity, onboarding, counterparty |
| Scams | romance scam, investment scam, impersonation | customer narrative, payment destination, intervention |
| Elder exploitation | pressure, confusion, new beneficiary, unusual transfer | vulnerability protection and escalation |
| Check fraud | altered item, stolen mail, duplicate deposit | image, return code, payee and fraud referral |
| Cyber-enabled crime | account takeover, ransomware payment, BEC | device, IP, beneficiary, urgent payment narrative |
| TBML | invoice mismatch, high-risk corridors, shell trade entities | trade documents and specialist review |
| Sanctions evasion | front companies, intermediary layering | sanctions screening and network intelligence |
| Terrorist financing | small-value patterns, high-risk geography, NPO misuse | FATF-informed risk and enhanced review |

3.4 Relationship Design

Typologies overlap. Elder exploitation may involve a romance scam, mule accounts and a crypto exit. The object model should allow primary typology, secondary typology, related typology, advisory-driven update and superseded typology.


4. Scenario Library Governance

A scenario is the executable way a typology gets covered. It can be a rule, threshold, anomaly model, graph detector, supervised model, LLM triage rubric, manual review protocol or hybrid orchestration.

4.1 Scenario Record

scenario_id: scn_structuring_cash_multi_branch_001
typology_id: typ_structuring_cash
name: Multi-branch below-threshold cash deposit pattern
detection_method: rule_plus_profile_contrast
business_logic: Detect repeated cash deposits below reporting threshold across branches within rolling window, contrasted with expected customer activity.
data_dependencies: [cash_amount, branch_id, teller_channel, customer_id, account_id, transaction_date, expected_cash_activity]
evidence_required: [transaction_timeline, branch_pattern, customer_profile, prior_cash_activity, business_nature]
human_review_questions: [expected_for_customer, legitimate_purpose, reporting_avoidance_pattern]
false_positive_drivers: [seasonal_cash_business, event_driven_cash_receipts]
coverage_status: active
owner: AML Scenario Owner

4.2 Lifecycle

candidate -> design -> data feasibility -> calibration -> UAT with historical cases
  -> pilot -> production -> QA monitoring -> tuning -> retired or replaced
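
The lifecycle above can be sketched as a transition table, so that a scenario cannot jump from candidate straight to production. This is a minimal sketch that reads the flow as strictly linear, which is an assumption; a real programme may allow loops such as tuning back to production. The state identifiers and `can_transition` are hypothetical names.

```python
# States follow the lifecycle flow; snake_case identifiers are illustrative.
LIFECYCLE = [
    "candidate", "design", "data_feasibility", "calibration",
    "uat_historical_cases", "pilot", "production", "qa_monitoring", "tuning",
]
TERMINAL = {"retired", "replaced"}

# Each state may only advance to the next one; tuning ends in a terminal state.
ALLOWED = {current: {nxt} for current, nxt in zip(LIFECYCLE, LIFECYCLE[1:])}
ALLOWED["tuning"] = set(TERMINAL)


def can_transition(current: str, target: str) -> bool:
    """True if the lifecycle permits moving from current to target."""
    return target in ALLOWED.get(current, set())
```

For example, `can_transition("pilot", "production")` is allowed, while `can_transition("candidate", "production")` is not.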

4.3 Governance Questions

| Stage | Questions |
| --- | --- |
| Candidate | Which typology and red flags does it cover? |
| Design | What data and evidence are required? |
| Calibration | What is expected false positive and false negative risk? |
| Pilot | Did analysts find the evidence useful? |
| Production | Are alerts routed to the right queue? |
| QA | Are dispositions consistent and documented? |
| Tuning | Does threshold change reduce coverage? |
| Retirement | What compensating control replaces it? |

4.4 Coverage States

| State | Meaning | Example |
| --- | --- | --- |
| Active automated | production rule/model detects and routes alerts | cash structuring rule |
| Active hybrid | model plus analyst review | mule graph cluster |
| Manual-only | policy requires manual referral | elder exploitation branch referral |
| Partial | only some products/channels covered | wire covered, RTP not covered |
| Gap | known typology without adequate scenario | new scam advisory not implemented |
| Retired | no longer active with replacement documented | old threshold replaced by model |
| Suppressed | disabled with risk acceptance | temporary data quality incident |

4.5 Review Triggers

| Trigger | Action |
| --- | --- |
| New FinCEN advisory | typology and red flag impact review |
| Product launch | product/channel coverage assessment |
| New payment rail | scenario feasibility review |
| Model/rule tuning | coverage regression check |
| SAR QA defect spike | scenario evidence review |
| Law enforcement feedback | typology update |
| Audit finding | control remediation |
| Data source change | data dependency retest |

5. Red Flag / Evidence Mapping

A red flag is a potential indicator, not a conclusion. The design goal is to turn each red flag into observable evidence and a review question.

5.1 Red Flag Object

red_flag_id: rf_rapid_movement_of_funds
name: Rapid movement of funds through account
observable_data: [inbound_amount, outbound_amount, time_between_flows, account_age, counterparty_count]
evidence_questions:
  - Is this expected for the customer's profile and purpose?
  - Are counterparties related, known, or high risk?
  - Is there a legitimate documented explanation?
typologies: [mule_account, scam_proceeds, layering]
false_positive_drivers: [payroll_processor, marketplace_seller, escrow_like_activity]
severity: medium_high

5.2 Evidence Mapping Table

| Red flag | Observable data | Investigation question | Evidence artifact |
| --- | --- | --- | --- |
| Activity inconsistent with profile | expected vs actual behavior | What changed and why? | KYC/CDD profile, transaction history |
| Multiple small cash deposits | amount, branch, frequency | Is pattern consistent with reporting avoidance? | cash timeline, branch map |
| Rapid movement | inbound/outbound timing | Is account pass-through? | transaction sequence |
| New beneficiary high-value wire | beneficiary, amount, relationship | Is beneficiary expected or suspicious? | beneficiary profile, customer notes |
| Shared device/address cluster | device, IP, address, phone | Are accounts connected? | graph cluster |
| High-risk geography | origin/destination, corridor | Is there plausible business reason? | wire details, customer business |
| Scam narrative | customer notes, payment purpose | Is customer being coached or deceived? | call transcript, chat extract |
| Check alteration | image, MICR, return code | Is item altered or stolen? | item image, return notice |

5.3 Evidence Quality Levels

| Level | Description |
| --- | --- |
| E0 | AI statement only, no source evidence |
| E1 | Single transaction or note, weak context |
| E2 | Transaction timeline plus customer profile contrast |
| E3 | Multi-source evidence: transactions, KYC, counterparty, notes |
| E4 | Multi-source evidence plus analyst research and alternative explanations |
| E5 | Complete bundle with supporting documents, source versions, human decision and QA readiness |

Minimum target: alert triage can start with E1/E2; case escalation should reach E3; SAR consideration should target E4/E5 where available.
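
The minimum targets can be expressed as a simple stage gate. The sketch below assumes the lower bound of each stated range (E1 for triage, E3 for escalation, E4 for SAR consideration) and ignores the "where available" qualifier, which a real workflow would need to handle with an exception path. The stage keys and `meets_minimum` are hypothetical names.

```python
from enum import IntEnum


class EvidenceLevel(IntEnum):
    """Evidence quality levels E0 to E5, ordered so they can be compared."""
    E0 = 0
    E1 = 1
    E2 = 2
    E3 = 3
    E4 = 4
    E5 = 5


# Assumed lower bounds taken from the minimum-target sentence.
MINIMUM_LEVEL = {
    "alert_triage": EvidenceLevel.E1,
    "case_escalation": EvidenceLevel.E3,
    "sar_consideration": EvidenceLevel.E4,
}


def meets_minimum(stage: str, level: EvidenceLevel) -> bool:
    """True if the evidence level reaches the minimum for the workflow stage."""
    return level >= MINIMUM_LEVEL[stage]
```

A gate like this would not decide anything about suspicion; it only tells the reviewer whether the evidence packet is thick enough for the stage they are about to enter.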


6. Coverage Measurement

Coverage measurement answers:

Are the risks we say we monitor actually represented by scenarios,
data, evidence, review capacity and quality feedback?

6.1 Coverage Dimensions

| Dimension | Question |
| --- | --- |
| Typology | Is the typology in the registry and active? |
| Product | Which products are covered? |
| Channel | Branch, ATM, mobile, online, ACH, wire, RTP, card, P2P, crypto on-ramp |
| Customer segment | Consumer, SMB, MSB, NPO, private banking, senior customer |
| Data | Are required fields available and reliable? |
| Scenario | Is detection automated, hybrid, manual or gap? |
| Evidence | Does the alert provide required evidence? |
| Review capacity | Can analysts review within SLA? |
| QA | Are outcomes sampled and defects remediated? |
| Change control | Does tuning preserve coverage? |

6.2 Coverage Matrix

| Typology | Product | Channel | Segment | Scenario | Data | Evidence | Coverage |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Structuring | deposit | branch cash | consumer / SMB | scn_cash_multi_branch | good | E3 | active |
| Mule account | DDA | digital + ACH | consumer | scn_mule_graph_velocity | partial device | E3 | active hybrid |
| Elder exploitation | deposit | wire / branch | senior | manual referral + wire red flag | good | E2/E3 | partial |
| TBML | commercial | wire / trade | SMB | specialist review | invoice partial | E2 | manual-only |
| Scam crypto exit | deposit | ACH / crypto on-ramp | consumer | payment purpose + velocity | partial | E2 | gap/partial |
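
A matrix like this becomes more useful once it can be queried. The following minimal sketch holds three columns of the rows above as plain dictionaries and lists the typologies whose coverage is not fully active. Treating "active hybrid" as fully active is an assumption made here for illustration; `coverage_gaps` is a hypothetical helper.

```python
# Rows mirror the coverage matrix; only the columns needed here are kept.
MATRIX = [
    {"typology": "Structuring", "scenario": "scn_cash_multi_branch", "coverage": "active"},
    {"typology": "Mule account", "scenario": "scn_mule_graph_velocity", "coverage": "active hybrid"},
    {"typology": "Elder exploitation", "scenario": "manual referral + wire red flag", "coverage": "partial"},
    {"typology": "TBML", "scenario": "specialist review", "coverage": "manual-only"},
    {"typology": "Scam crypto exit", "scenario": "payment purpose + velocity", "coverage": "gap/partial"},
]

# Assumption: hybrid coverage counts as fully active for reporting purposes.
FULLY_ACTIVE = {"active", "active hybrid"}


def coverage_gaps(matrix: list[dict]) -> list[str]:
    """Return typologies whose coverage state is not fully active."""
    return [row["typology"] for row in matrix if row["coverage"] not in FULLY_ACTIVE]


print(coverage_gaps(MATRIX))
```

With the rows above this prints `['Elder exploitation', 'TBML', 'Scam crypto exit']`, which is the list a coverage gap memo would start from.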

6.3 Coverage Score

coverage_score =
  typology_scope_score
  * data_availability_score
  * scenario_activation_score
  * evidence_quality_score
  * review_capacity_score
  * QA_feedback_score

| Score | Meaning |
| --- | --- |
| 0.00 | no known coverage |
| 0.25 | manual or ad hoc coverage |
| 0.50 | partial scenario or weak data |
| 0.75 | active scenario with usable evidence |
| 1.00 | active scenario, strong evidence, QA and change control |
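
Because the coverage score is a product, one weak factor pulls the whole score down and a zero factor removes coverage entirely. A minimal sketch of the calculation follows, assuming each factor is scored between 0 and 1; the function name and input checks are illustrative additions.

```python
import math

# Factor names follow the coverage score formula.
FACTORS = (
    "typology_scope_score",
    "data_availability_score",
    "scenario_activation_score",
    "evidence_quality_score",
    "review_capacity_score",
    "QA_feedback_score",
)


def coverage_score(scores: dict) -> float:
    """Multiply the six factor scores; each must lie in [0, 1]."""
    missing = [name for name in FACTORS if name not in scores]
    if missing:
        raise ValueError(f"missing factors: {missing}")
    for name in FACTORS:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return math.prod(scores[name] for name in FACTORS)


example = {name: 1.0 for name in FACTORS}
example["data_availability_score"] = 0.5   # weak data halves the score
print(coverage_score(example))             # 0.5
```

The multiplicative form is a deliberate design choice: an active scenario with no review capacity scores zero, which matches the idea that an unreviewed alert is not coverage.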

6.4 Dashboard Metrics

| Metric | Why it matters |
| --- | --- |
| Typology coverage percentage | Board / management view of risk coverage |
| Scenario active vs partial vs gap | Shows control maturity |
| Coverage by product/channel | Finds new rail blind spots |
| Evidence completeness by typology | Measures investigation readiness |
| Alert-to-case conversion by scenario | Shows triage usefulness |
| Case-to-SAR consideration by typology | Shows escalation behavior |
| SAR QA defect by typology | Finds narrative/evidence issues |
| Scenario stale age | Finds outdated thresholds |
| Advisory-to-scenario SLA | Measures threat responsiveness |

7. Synthetic vs Real-Case Eval

Financial crime AI eval must balance confidentiality, realism, coverage and repeatability.

7.1 Synthetic Eval

Synthetic cases are useful for typology coverage testing, red flag recognition, edge case generation, prompt regression, model comparison, analyst training and SAR narrative rubric calibration.

Strengths: repeatable, safe to share, can cover rare typologies, can create known ground truth, can stress specific evidence gaps.

Weaknesses: too clean, weak operational noise, may miss real customer ambiguity, may overrepresent obvious red flags.

7.2 Real-Case Eval

Real cases are useful for operational realism, data quality issues, analyst workflow friction, source-system gaps, alert disposition variation, SAR narrative quality and false positive drivers.

Controls needed:

  • SAR confidentiality review.
  • Data minimization.
  • De-identification / tokenization.
  • Access control.
  • Retention policy.
  • Sampling approval.
  • Exclusion of prohibited data from external model training.

7.3 Eval Set Composition

| Bucket | Example | Purpose |
| --- | --- | --- |
| Clear suspicious | obvious structuring pattern | recall baseline |
| Benign lookalike | seasonal cash business | false positive control |
| Ambiguous | rapid movement with partial explanation | uncertainty handling |
| Missing evidence | no expected activity profile | gap detection |
| Multi-typology | elder scam through mule account | relationship reasoning |
| Advisory-specific | new FinCEN typology | update responsiveness |
| Negative control | normal payroll / merchant settlement | over-alert prevention |
| Confidentiality trap | prompt asks for SAR existence disclosure | safety control |

7.4 Eval Metrics

| Metric | Meaning |
| --- | --- |
| Red flag recall | Did AI identify relevant indicators? |
| False red flag rate | Did AI invent unsupported indicators? |
| Evidence grounding | Are claims linked to source records? |
| Gap detection | Did AI identify missing evidence? |
| Typology classification accuracy | Did AI map to correct typology family? |
| Narrative completeness | Who, what, when, where, why, how |
| Human decision respect | Did AI avoid file/no-file recommendation? |
| Confidentiality compliance | Did AI avoid impermissible disclosure? |

7.5 Acceptance Criteria

  • Unsupported red flag invention <= 2% on the gold set.
  • SAR confidentiality breaches: zero tolerated in the release test.
  • Evidence grounding >= 95% for cited transaction facts.
  • Gap detection recall >= 90% for intentionally incomplete cases.
  • File/no-file directive phrases blocked at 100% in SAR decision context.
  • Analyst usefulness score improves without lowering evidence completeness.
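
The numeric criteria above lend themselves to an automated release gate. This is a minimal sketch, assuming eval results arrive as a dictionary of rates and counts; the metric keys and `failed_criteria` are hypothetical names, and the comparative criterion on analyst usefulness is left out because it needs a baseline run.

```python
# Thresholds are the five numeric acceptance criteria; keys are illustrative.
CRITERIA = {
    "unsupported_red_flag_rate": lambda v: v <= 0.02,
    "confidentiality_breaches": lambda v: v == 0,
    "evidence_grounding": lambda v: v >= 0.95,
    "gap_detection_recall": lambda v: v >= 0.90,
    "directive_block_rate": lambda v: v == 1.0,
}


def failed_criteria(metrics: dict) -> list[str]:
    """Return criteria that are missing or not met; empty means release-ready."""
    return [
        name
        for name, passes in CRITERIA.items()
        if name not in metrics or not passes(metrics[name])
    ]
```

A missing metric is treated as a failure on purpose, so a release cannot pass simply because a test was never run.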

8. SAR Narrative Evidence Bundle

SAR narrative support must be evidence-first. An LLM draft should not be promoted unless the supporting evidence index is complete enough for human review.

8.1 Bundle Structure

case_id: case_aml_2026_00421
trace_id: trace_00421
alert_ids: [alert_7781]
typologies: [typ_structuring_cash]
scenarios: [scn_cash_multi_branch_001]
red_flags: [rf_multiple_below_threshold_cash_deposits, rf_activity_inconsistent_with_profile]
evidence_index:
  transaction_ids: [txn_001, txn_002]
  customer_profile_ref: cdd_2026_03
  branch_notes_ref: note_884
AI_assistance:
  model_version: llm_gateway_aml_summary_v3
  prompt_version: sar_evidence_gap_check_v2
  output_hash: hash_value
human_review:
  analyst_id: analyst_17
  reviewer_id: sar_committee_03
  decision_owner: BSA_Officer_delegate
  decision: SAR_considered
confidentiality:
  retention_class: SAR_sensitive
  access_policy: AML_need_to_know

8.2 Narrative Building Blocks

| Block | Evidence source |
| --- | --- |
| Subject identity | KYC/CDD, account records |
| Transaction chronology | transaction timeline |
| Unusual activity explanation | customer profile contrast |
| Red flags | scenario output and analyst notes |
| Amounts and dates | source transactions |
| Counterparties | beneficiary/payee/counterparty records |
| Customer explanation | call notes, branch notes, written response |
| Analyst research | case notes, external permitted research |
| Uncertainty | missing evidence and limitations |
| Key terms | applicable FinCEN advisory guidance if relevant |

8.3 LLM Guardrails

The LLM may summarize transaction chronology, extract candidate red flags, map facts to typology candidates, list evidence gaps, draft analyst-facing narrative sections, check the narrative against the rubric and identify unsupported claims.

The LLM must not decide SAR filing, state that a crime occurred, notify a customer of SAR existence, invent explanations, hide uncertainty, remove adverse evidence to make the narrative cleaner, bypass human review or access SAR evidence outside need-to-know policy.
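
One of the prohibited behaviors, telling the reviewer to file or not file, can be partly screened with a phrase filter on LLM output. The sketch below is deliberately simple and its pattern list is hypothetical and incomplete; a production policy gate would likely combine this with a classifier and human QA sampling rather than rely on regular expressions alone.

```python
import re

# Illustrative directive patterns; not an exhaustive or validated list.
DIRECTIVE_PATTERNS = [
    re.compile(
        r"\b(should|must|recommend(?:s|ed)?|advise(?:s|d)?)\b[^.]{0,40}"
        r"\b(not\s+)?fil(e|ing)\b[^.]{0,20}\bSAR\b",
        re.IGNORECASE,
    ),
    re.compile(r"\bdo\s+not\s+file\b", re.IGNORECASE),
    re.compile(r"\bno\s+SAR\s+(is\s+)?(needed|required|warranted)\b", re.IGNORECASE),
]


def contains_filing_directive(text: str) -> bool:
    """True if the text appears to instruct a file / no-file SAR decision."""
    return any(pattern.search(text) for pattern in DIRECTIVE_PATTERNS)


draft = "We recommend filing a SAR on this subject."
if contains_filing_directive(draft):
    print("blocked: draft contains a file/no-file directive")
```

Factual chronology such as "The account received 12 cash deposits across 4 branches." passes through, while directive sentences are held back for rewording before a reviewer sees them.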


9. Model / Rule / LLM Triage Comparison

| Method | Strength | Weakness | Best use |
| --- | --- | --- | --- |
| Deterministic rule | explainable, stable, easy to audit | threshold brittleness, high false positives | known red flags and regulatory-sensitive patterns |
| Statistical anomaly | finds deviations | hard to explain, can drift | unusual customer behavior vs baseline |
| Supervised model | learns patterns from labels | label bias, concept drift | prioritization with strong labels |
| Graph model | network detection | data integration complexity | mule networks and entity clusters |
| LLM triage | text synthesis, evidence organization | hallucination, confidentiality, automation bias | analyst summary, gap check, narrative assist |
| Hybrid orchestration | combines strengths | governance complexity | high-risk typology coverage |

Architecture pattern:

rules and models produce signals
  -> graph links related entities
  -> LLM summarizes only approved evidence
  -> policy gate blocks decision overreach
  -> human analyst investigates
  -> evidence ledger records versions and actions

Decision boundary:

| Decision | Rule/model/LLM role | Human role |
| --- | --- | --- |
| Alert generation | rule/model may trigger | monitoring owner approves scenario |
| Alert prioritization | model may rank | analyst reviews |
| Evidence summary | LLM may summarize | analyst validates |
| Case escalation | workflow may recommend | analyst/supervisor decides per policy |
| SAR narrative draft | LLM may draft from evidence | SAR owner reviews and edits |
| SAR filing decision | no autonomous AI decision | authorized compliance owner |
| Account closure | AI may provide facts | business/compliance decision owner |

10. Alert-to-Case-to-SAR Traceability

Traceability is the spine of the architecture.

10.1 Trace Chain

source event -> feature calculation -> scenario trigger -> alert id
  -> queue assignment -> analyst actions -> case id -> evidence collection
  -> disposition -> SAR consideration -> filing/non-filing decision
  -> SAR package or decision record -> QA sample / audit replay

10.2 Minimum Event Schema

| Field | Purpose |
| --- | --- |
| trace_id | end-to-end correlation |
| source_event_ids | transactions, notes, KYC events |
| feature_version | reproducible feature calculation |
| scenario_id | scenario that triggered |
| scenario_version | rule/model version |
| alert_id | alert object |
| case_id | investigation object |
| typology_ids | mapped typologies |
| red_flag_ids | observed indicators |
| AI_assist_ids | LLM/model assistance logs |
| analyst_id | human investigator |
| reviewer_id | checker / supervisor |
| disposition | close, escalate, SAR considered, continuing review |
| decision_rationale_ref | structured rationale |
| SAR_package_ref | if applicable and access-restricted |

10.3 Acceptance Criteria

  • Every alert maps to scenario id and source events.
  • Every case maps to alert or manual referral.
  • Every LLM summary maps to source evidence and output hash.
  • Every SAR consideration maps to human decision owner.
  • Every non-filing decision has documented rationale per internal policy.
  • Every scenario change can be tied to coverage impact.
  • Every SAR-sensitive object has access control and audit log.
  • Audit can replay a sampled case from source events to final disposition.
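
Several of these criteria can be checked mechanically on each trace record. The sketch below assumes an alert-originated record stored as a dictionary using the field names from the minimum event schema; `decision_owner` and `output_hash` are borrowed from the bundle structure in section 8.1, and `trace_gaps` is a hypothetical helper. Manual referrals would need a separate rule set.

```python
# Linkage fields every alert-originated record is assumed to carry.
ALWAYS_REQUIRED = (
    "trace_id", "source_event_ids", "scenario_id", "scenario_version", "alert_id",
)


def trace_gaps(record: dict) -> list[str]:
    """Return the linkage fields that are missing or empty in a trace record."""
    gaps = [name for name in ALWAYS_REQUIRED if not record.get(name)]
    # A SAR consideration must map to a human decision owner.
    if record.get("disposition") == "SAR considered" and not record.get("decision_owner"):
        gaps.append("decision_owner")
    # An LLM summary must map to an output hash.
    if record.get("AI_assist_ids") and not record.get("output_hash"):
        gaps.append("output_hash")
    return gaps
```

Running a check like this over a QA sample gives a quick count of cases that could not be replayed end to end, before anyone attempts the replay by hand.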

11. Operating Model

11.1 RACI

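| Activity | AML Compliance | Fraud | Sanctions | AI PM | Senior BA | Architect | Data/ML | Model Risk | Operations | Legal | Audit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Typology library ownership | A/R | C | C | C | R | C | C | C | C | C | I |
| Scenario inventory | A | C | C | R | R | A/R | R | C | C | C | I |
| Red flag evidence map | A/R | R | R | C | R | C | C | C | C | C | I |
| Data contracts | C | C | C | C | R | A/R | R | C | C | C | I |
| Model/rule development | C | C | C | C | C | R | A/R | C | C | I | I |
| LLM triage design | C | C | C | R | R | A/R | R | C | C | C | I |
| SAR decision | A/R | C | C | I | I | I | I | I | C | C | I |
| SAR evidence access policy | A/R | C | C | I | R | R | C | C | C | A/R | I |
| QA sampling | A/R | C | C | C | R | C | C | C | R | C | C |
| Audit replay | C | C | C | I | I | C | C | C | C | C | A/R |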

R = Responsible, A = Accountable, C = Consulted, I = Informed.

11.2 Governance Forums

| Forum | Cadence | Agenda |
| --- | --- | --- |
| Typology review | monthly / event-driven | FinCEN advisories, source updates, emerging threats |
| Scenario performance review | monthly | alert volume, conversion, evidence, QA defects |
| SAR quality review | monthly | narrative defects, supporting documentation, key terms |
| AI model/rule change board | per release | coverage regression, validation, risk acceptance |
| Operations capacity review | weekly | queue SLA, backlog, analyst workload |
| Management risk review | quarterly | coverage gaps, KRI trend, investment needs |
| Audit / assurance review | risk-based | replay evidence, control design, operating effectiveness |

11.3 Human Oversight Design

Human oversight is not a button. It requires authority to disagree, access to source evidence, enough time, reason codes, escalation routes, QA feedback, training on AI limitations and automation bias monitoring.

The reviewer UI should show typology candidates, red flags and source records, the AI summary with grounding links, evidence gaps, customer expected activity, counterparty graph, scenario version, prior dispositions, available actions and required rationale fields.


12. Metrics / KRIs

12.1 Executive Metrics

| Metric | Meaning |
| --- | --- |
| Typology coverage score | Are priority risks covered? |
| Coverage gaps by product/channel | Where are blind spots? |
| High-risk scenario health | Are key scenarios functioning? |
| Advisory response SLA | How fast threats enter controls? |
| Evidence completeness | Can cases support decisions? |
| SAR quality defect rate | Are narratives and evidence adequate? |
| Case backlog by risk tier | Is review capacity sufficient? |
| AI assist usage with QA result | Is AI improving work without degrading quality? |
| SAR confidentiality incidents | Is sensitive information protected? |
| Open remediation aging | Are defects fixed? |

12.2 Product Metrics

| Metric | Product question |
| --- | --- |
| Analyst time to evidence packet | Does AI reduce prep time? |
| Gap detection rate | Does AI find missing evidence? |
| Unsupported claim rate | Does LLM invent or overstate? |
| Reviewer edit distance | Are drafts usable? |
| Reviewer challenge rate | Are humans exercising judgment? |
| Case reopen rate | Are closures weak? |
| False positive drivers by scenario | What causes noise? |
| Alerts with complete typology mapping | Is taxonomy embedded? |
| Manual referral conversion | Are frontline signals useful? |

12.3 Risk KRIs

| KRI | Yellow | Red |
| --- | --- | --- |
| Critical typology without active coverage | partial/manual only | no owner or no plan |
| Scenario stale age | review overdue | high-risk scenario overdue and active |
| Evidence completeness | below target | SAR-considered cases missing source evidence |
| Unsupported AI claims | recurring low severity | material unsupported facts in narrative |
| File/no-file AI language | detected and blocked | reached reviewer or filing package |
| SAR confidentiality access | unusual access | unauthorized access or disclosure |
| QA defect repeat | same scenario repeated | no remediation owner |
| Coverage regression after tuning | small decrease accepted | material gap without approval |
Coverage regression after tuningsmall decrease acceptedmaterial gap without approval

13. Financial Retail Scenario Patterns

13.1 Structuring

Risk pattern: repeated cash deposits below reporting threshold, multiple branches or ATMs, business profile does not support cash level. Evidence: cash timeline, branch pattern, customer expected activity, business profile, explanation and source of funds if available. AI assist: summarize pattern, compare expected vs actual, flag missing CDD. Control: a red flag triggers scrutiny, not a conclusion; a human determines escalation or SAR consideration.

13.2 Mule Account

Risk pattern: new account, inbound funds from unrelated parties, rapid outbound movement, shared device / address / phone cluster. Evidence: account age, velocity, counterparty graph, device network, KYC attributes. AI assist: graph explanation, similar case retrieval, narrative chronology. Control: AI cannot close a cluster as no-suspicion; apply cluster-level review and QA sampling.

13.3 Elder Exploitation

Risk pattern: senior customer, unusual wire or withdrawal, new beneficiary, pressure/confusion/third-party control. Evidence: customer history, branch/call notes, beneficiary profile, transaction purpose. AI assist: summarize protective concerns and separate the customer care path from the SAR evidence path. Control: customer protection protocols are jurisdiction-specific; SAR consideration remains human-owned.

13.4 Check Fraud And Mail Theft

Risk pattern: altered check, duplicate deposit, payee mismatch, return codes and account clusters. Evidence: check image, deposit channel, return reason, payee/account relationship, fraud case notes. AI assist: cross-case pattern recognition and fraud-to-AML referral summary. Control: fraud loss recovery and AML suspicion evaluation are related but distinct.

13.5 Scam Proceeds And Crypto Exit

Risk pattern: customer coerced or deceived, repeated transfers to new beneficiary, crypto on-ramp or high-risk platform. Evidence: payment purpose, customer messages where permitted, beneficiary, transaction sequence, scam report or complaint. AI assist: extract scam indicators, map to advisory key terms, draft customer-protection evidence summary. Control: do not disclose SAR existence; payment intervention, complaint, fraud claim and SAR review have different owners.

13.6 TBML And Sanctions Evasion

Risk pattern: invoice mismatch, unusual routing, shell counterparties, high-risk corridor, front companies or intermediary layering. Evidence: invoice, shipping records, counterparty profile, wire details, screening result, ownership/network data. AI assist: document comparison and entity relationship summary. Control: specialist review and document provenance are critical; sanctions controls are distinct from AML SAR evidence.


14. Templates

| Template | Required fields |
| --- | --- |
| Typology Card | typology id, risk family, source anchors, business definition, customer segments, products/channels, red flags, scenario coverage, evidence requirements, false positive drivers, SAR relevance, owner, review cadence |
| Scenario Card | scenario id, typology id, detection method, business logic, data dependencies, threshold/model configuration, alert routing, expected evidence, false positive drivers, human review questions, coverage state, QA plan |
| Red Flag Evidence Map | red flag id, description, observable data, evidence question, source artifact, typology |
| SAR Evidence Bundle | case id, trace id, typologies, scenarios, red flags, source events, supporting documents, profile reference, AI assistance trace, human review, confidentiality policy |
| Coverage Review Memo | decision needed, scope, coverage findings, evidence findings, risk, recommendation |

14.1 SAR Narrative QA Rubric

| Criterion | Pass signal | Defect |
| --- | --- | --- |
| Who | subject and counterparties clear | vague subjects |
| What | specific suspicious activity | generic label |
| When | dates and period | missing chronology |
| Where | accounts, channels, geography | unclear channel |
| Why unusual | profile contrast | no expected activity |
| How | mechanism described | no pattern explanation |
| Evidence | source-linked facts | unsupported statements |
| Uncertainty | limitations disclosed | overclaiming |

15. Product And Architecture Requirements

15.1 Functional Requirements

  • Maintain a versioned typology registry with owner and source anchors.
  • Maintain a scenario inventory mapped to typologies and red flags.
  • Map every production alert to scenario id, scenario version and source events.
  • Support manual referral objects with typology candidates and evidence.
  • Generate the evidence packet before the LLM summary is shown to the reviewer.
  • Require source-linked grounding for LLM factual claims.
  • Block LLM output that instructs file / do not file SAR.
  • Record the human decision owner for SAR consideration.
  • Track non-filing rationale according to internal policy.
  • Link the SAR narrative draft to the supporting documentation index.
  • Maintain access controls for SAR-sensitive evidence.
  • Measure coverage by typology, product, channel and segment.
  • Run scenario coverage regression before model/rule/prompt releases.
  • Route QA findings back to typology/scenario owners.

15.2 Non-Functional Requirements

| Requirement | Target |
| --- | --- |
| Traceability | 100% alerts have source scenario and version |
| Evidence grounding | >= 95% factual AI claims linked to source |
| SAR confidentiality | zero unauthorized disclosure tolerated |
| Human decision ownership | 100% SAR considerations have authorized owner |
| Coverage regression | no material regression without approval |
| Audit replay | sampled cases replayable end to end |
| Data lineage | critical fields have lineage and quality status |
| Retention | aligned to SAR/supporting documentation policy |
| Access control | least privilege and need-to-know |

16. 30-Day Lab

Goal: complete a presentable portfolio pack within 30 days. Recommended use cases: Retail mule account, Elder exploitation wire, Check fraud referral, Small business structuring or Scam crypto exit.

| Days | Theme | Artifacts |
| --- | --- | --- |
| 1-7 | Typology and scenario foundations | use-case boundary, source-anchor map, 8 typology cards, 12 scenarios, 30 red flags, coverage matrix, coverage gap memo |
| 8-14 | Evidence and workflow | alert event schema, case state machine, evidence bundle schema, manual referral spec, LLM assist policy, reviewer workbench, SAR confidentiality control |
| 15-21 | Eval and controls | synthetic case set, sanitized real-case protocol, SAR narrative rubric, grounding/confidentiality tests, model-rule-LLM comparison, coverage regression gate, QA sampling plan |
| 22-30 | Operating model and interview pack | KRI dashboard, RACI, advisory response runbook, automation-bias tabletop, coverage-regression tabletop, executive memo, portfolio case study, interview answers, audit replay demo |

Completion standard:

  • Can explain 8 typologies and their red flags.
  • Can show active, partial, manual and gap coverage.
  • Can trace red flag to source evidence and narrative claim.
  • Can show SAR decision remains human-owned.
  • Can show LLM allowed/prohibited actions and tests.
  • Can compare synthetic and governed real-case eval.
  • Can show RACI, KRI and governance cadence.
  • Can explain why coverage beats alert volume.

17. Interview Answers

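Q1: How do you explain the relationship among typology, scenario, red flag, alert, case and SAR evidence?

30 seconds:

A typology is a financial crime pattern; a scenario is the detection or review mechanism that covers that pattern; a red flag is an indicator of possible suspicious activity; an alert is the work object a scenario triggers; a case is the container for human investigation; and SAR evidence is the evidence chain that supports SAR consideration and the human decision. The value of AI lies in connecting these objects, not in replacing SAR judgment.

2 minutes:

I would start by building a typology library covering, for example, structuring, mule accounts, elder exploitation, check fraud and scam proceeds. Each typology carries source anchors, red flags, product and channel scope, and evidence requirements. The scenario inventory defines which rules, ML models, graph detections or manual referrals cover those typologies. Once an alert fires it enters the case workflow, and the evidence ledger records source events, scenario version, red flags, AI assistance, analyst notes, reviewer decision and SAR consideration. The SAR narrative is only one part of the output; what really matters is the supporting documentation and the human-owned decision.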

Q2: Why is false positive reduction not the only goal of AML AI?

False positive reduction improves efficiency, but without a coverage view, tuning can remove coverage of low-frequency, high-risk typologies or emerging threats. AML AI also has to measure typology coverage, evidence completeness, SAR quality, false negative risk, scenario freshness, advisory response and human review quality. Efficiency must not come at the cost of blind spots.
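
One way to keep tuning from silently removing coverage is a regression gate that compares coverage states before and after a proposed release. The sketch below assumes the four coverage states used elsewhere in this playbook (active, partial, manual, gap). The ranking among them, the cell key shape and the function names are assumptions made for illustration.

```python
# Assumed ordering of coverage states from weakest to strongest.
# Whether "manual" ranks below "partial" is a policy choice, not a given.
COVERAGE_RANK = {"gap": 0, "manual": 1, "partial": 2, "active": 3}


def coverage_regressions(before, after):
    """List cells whose coverage state got weaker in the proposed release.

    Both arguments map a cell key, here a hypothetical
    (typology, product, channel, segment) tuple, to a coverage state.
    A cell missing from `after` is treated as a gap.
    """
    regressions = []
    for cell, old_state in before.items():
        new_state = after.get(cell, "gap")
        if COVERAGE_RANK[new_state] < COVERAGE_RANK[old_state]:
            regressions.append((cell, old_state, new_state))
    return sorted(regressions)


def release_allowed(before, after, approved_cells=frozenset()):
    """Allow the release only if every regression has an explicit approval."""
    return all(cell in approved_cells
               for cell, _, _ in coverage_regressions(before, after))


if __name__ == "__main__":
    structuring = ("structuring", "deposit", "branch", "retail")
    elder = ("elder_exploitation", "wire", "online", "retail")
    before = {structuring: "active", elder: "partial"}
    after = {structuring: "active"}  # tuning dropped the elder wire scenario
    print(coverage_regressions(before, after))
    print(release_allowed(before, after))
```

A rule change that cuts alert volume but drops a cell from partial to gap would be held until the typology owner approves that specific cell.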

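Q3: What can an LLM do in the SAR workflow?

An LLM can support evidence organization, transaction chronology, red flag candidate extraction, gap detection, draft narrative support and QA rubric checks. It cannot decide file or no-file, cannot conclude that a crime occurred, cannot disclose SAR-related information to the customer, cannot generate facts without a source, and cannot bypass the human AML/BSA owner.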

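Q4: How would you design a typology coverage matrix?

I would build the matrix as typology x product x channel x customer segment. Each cell records scenario id, data status, evidence quality, coverage state, owner, QA result and gap action. Management then sees not alert volume but which risks are covered on which business surfaces, where only a manual control exists, and where there is a known gap.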
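
The coverage matrix cell described above can be sketched as a small record type with a management-level summary. The field names follow the attributes listed in the answer. The class name, default values and helper functions are hypothetical and shown only to make the structure concrete.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CoverageCell:
    # Matrix coordinates
    typology: str
    product: str
    channel: str
    segment: str
    # One of "active", "partial", "manual", "gap"
    coverage_state: str
    # Attributes recorded per cell (default values are illustrative)
    scenario_ids: tuple = ()
    data_status: str = "unknown"
    evidence_quality: str = "unknown"
    owner: str = "unassigned"
    qa_result: str = "not_reviewed"
    gap_action: str = ""


def coverage_summary(cells):
    """Count coverage states per typology for a management view."""
    summary = defaultdict(Counter)
    for cell in cells:
        summary[cell.typology][cell.coverage_state] += 1
    return {typology: dict(counts) for typology, counts in summary.items()}


def open_gaps(cells):
    """Cells with only a manual control or no coverage at all."""
    return [c for c in cells if c.coverage_state in ("manual", "gap")]


if __name__ == "__main__":
    cells = [
        CoverageCell("mule_account", "checking", "mobile", "retail", "active",
                     scenario_ids=("SCN-101",), owner="fraud-ops"),
        CoverageCell("mule_account", "checking", "branch", "retail", "gap",
                     gap_action="assess branch data feed"),
        CoverageCell("structuring", "deposit", "branch", "small_business", "manual"),
    ]
    print(coverage_summary(cells))
    print([(c.typology, c.channel) for c in open_gaps(cells)])
```

Keeping the cell immutable makes it easier to snapshot the matrix per release and compare snapshots later.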

Q5: How do you show that a SAR narrative has sufficient evidence?

With an evidence bundle. Every narrative claim links to a transaction id, customer profile, counterparty record, case note or supporting document. The bundle also records typology id, scenario id, red flag ids, AI output hash, analyst notes, reviewer decision, decision owner, timestamp, access policy and retention class. Audit can replay from the narrative back to source evidence.
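
A reduced version of such a bundle, with a check that every narrative claim resolves to indexed evidence, might look like the sketch below. Only a subset of the listed fields is modeled. The class names, field names and the `unsupported_claims` helper are assumptions for illustration, and SHA-256 is used as one reasonable choice for the AI output hash.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NarrativeClaim:
    claim_id: str
    text: str
    evidence_ids: List[str] = field(default_factory=list)


@dataclass
class EvidenceBundle:
    case_id: str
    typology_id: str
    scenario_id: str
    red_flag_ids: List[str]
    decision_owner: str              # the authorized human, never the model
    evidence_index: Dict[str, str]   # evidence id -> source description
    claims: List[NarrativeClaim]
    ai_output_hash: str = ""


def hash_ai_output(text: str) -> str:
    """Fingerprint the AI draft so later edits are detectable on replay."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def unsupported_claims(bundle: EvidenceBundle) -> List[str]:
    """Return ids of claims with no evidence link or a link that does not resolve."""
    flagged = []
    for claim in bundle.claims:
        missing = [e for e in claim.evidence_ids if e not in bundle.evidence_index]
        if not claim.evidence_ids or missing:
            flagged.append(claim.claim_id)
    return flagged


if __name__ == "__main__":
    draft = "Three cash deposits occurred on consecutive days."
    bundle = EvidenceBundle(
        case_id="CASE-1", typology_id="TYP-STRUCT", scenario_id="SCN-7",
        red_flag_ids=["RF-12"], decision_owner="bsa.officer",
        evidence_index={"TXN-1": "core banking ledger"},
        claims=[
            NarrativeClaim("C1", draft, ["TXN-1"]),
            NarrativeClaim("C2", "Customer was evasive."),      # no link
            NarrativeClaim("C3", "Funds left by wire.", ["DOC-9"]),  # dangling link
        ],
        ai_output_hash=hash_ai_output(draft),
    )
    print(unsupported_claims(bundle))
```

The check confirms only that links exist and resolve. Whether the linked evidence actually supports the claim remains a reviewer judgment.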

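Q6: How do you keep an LLM from writing a "fluent but dangerous" SAR draft?

Start with an evidence-first workflow in which the LLM can generate only from the evidence index. Then apply an unsupported-claim scanner, a file/no-file phrase blocker, confidentiality tests, a gap detection rubric and human review. The reviewer UI shows source evidence, missing evidence and uncertainty, and does not put the AI narrative on the first screen as if it were the conclusion.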

Q7: How do you handle FinCEN advisory updates?

Establish an advisory-to-typology runbook. A new advisory enters intake, and the typology owner assesses source relevance, red flags, key terms, affected products/channels, scenario gaps, data availability, QA sample and release priority. The typology library, scenario inventory, eval cases, analyst guidance and dashboard are then updated, and the review evidence is recorded.

Q8: How do synthetic eval and real-case eval fit together?

Synthetic eval covers rare typologies, edge cases, prompt regression and cases with known ground truth. Real-case eval covers operational realism, messy evidence, data quality and analyst workflow. Real cases must go through SAR confidentiality controls, de-identification, access control, retention and approval. Only the combination delivers both safety and realism.

Q9: How do you explain alert-to-case-to-SAR traceability?

Traceability is the complete chain from source event to final disposition. Each alert has a scenario id/version and source events; each case has analyst actions and evidence; each SAR consideration has a human decision owner and rationale; and the narrative has a supporting documentation index. Without traceability, SAR assist AI is just a text generation tool.
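
An audit replay check over that chain can be sketched as a walk through the four stages that reports any missing link. The stage names, required field names and the dictionary record shape are hypothetical. A real evidence ledger would define its own schema, and a chain without a narrative may be legitimate when no SAR was drafted.

```python
from typing import Dict, List

# Hypothetical required links per stage, following the chain described above.
REQUIRED_LINKS = {
    "alert": ("scenario_id", "scenario_version", "source_event_ids"),
    "case": ("analyst_actions", "evidence_ids"),
    "sar_consideration": ("decision_owner", "rationale"),
    "narrative": ("supporting_doc_index",),
}


def trace_gaps(chain: Dict[str, dict]) -> List[str]:
    """Return the missing links in one alert-to-case-to-SAR chain.

    An empty list means the sampled chain is replayable end to end
    under the assumed schema.
    """
    gaps = []
    for stage, required_fields in REQUIRED_LINKS.items():
        record = chain.get(stage)
        if record is None:
            gaps.append(f"{stage}: missing record")
            continue
        for name in required_fields:
            if not record.get(name):  # absent, None or empty counts as missing
                gaps.append(f"{stage}.{name}: missing")
    return gaps


if __name__ == "__main__":
    chain = {
        "alert": {"scenario_id": "SCN-7", "scenario_version": "v3",
                  "source_event_ids": ["EVT-1", "EVT-2"]},
        "case": {"analyst_actions": ["reviewed KYC"], "evidence_ids": ["TXN-1"]},
        "sar_consideration": {"decision_owner": "", "rationale": "pattern of deposits"},
    }
    for gap in trace_gaps(chain):
        print(gap)
```

Running this over a QA sample gives a concrete replayability rate rather than an assertion that the chain exists.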

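Q10: How do you explain the investment value to executives?

This is not about adding compliance paperwork; it is about making financial crime AI scalable, auditable and tunable. Typology coverage reduces blind spots, the evidence bundle improves SAR quality, LLM assist reduces analyst preparation time, and traceability lowers audit and remediation cost. The point is to use AI to raise quality and efficiency without sacrificing human accountability.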


18. Portfolio Deliverables

The final portfolio should include:

  1. typology-library.md: 8-12 typology cards.
  2. scenario-inventory.md: rule / ML / graph / LLM / manual referral scenarios.
  3. coverage-matrix.md: typology x product x channel x segment coverage.
  4. red-flag-evidence-map.md: red flags to observable evidence.
  5. architecture-diagram.md: source systems to evidence ledger.
  6. alert-to-case-state-machine.md: workflow states and controls.
  7. SAR-evidence-bundle.yaml: sample structured evidence package.
  8. LLM-assist-policy.md: allowed/prohibited actions and guardrails.
  9. eval-plan.md: synthetic and governed real-case eval.
  10. KRI-dashboard-spec.md: executive, product and risk metrics.
  11. RACI.md: operating model.
  12. executive-memo.md: coverage gap and investment rationale.
  13. interview-pack.md: 30-second, 2-minute, deep-dive answers.

Portfolio narrative:

I designed a typology-driven financial crime AI architecture.
It maps source anchors and emerging advisories to typology objects,
maps typologies to scenarios and red flags,
measures coverage across products and channels,
uses LLM only as an evidence assistant,
preserves human SAR ownership,
and provides audit-replayable alert-to-case-to-SAR traceability.

19. Common Pitfalls

| Pitfall | Consequence | Better design |
| --- | --- | --- |
| Treating SAR AI as writing assistant only | faster weak narratives | evidence-first SAR bundle |
| Treating red flags as criminal proof | overclaiming and poor decisions | additional scrutiny and human review |
| Tuning rules without coverage review | hidden false negatives | coverage regression gate |
| LLM recommends file/no-file | governance breach and automation bias | decision-boundary guardrail |
| No source grounding | hallucinated facts | evidence index and claim trace |
| Synthetic-only testing | unrealistic quality | combine with governed real-case eval |
| Real-case testing without confidentiality | SAR/privacy risk | de-identification and access controls |
| Dashboard shows alert volume only | blind spots hidden | typology coverage dashboard |
| Manual referrals ignored | frontline red flags lost | referral object and routing |
| No advisory change process | emerging threats stale | advisory-to-scenario runbook |
| SAR package access too broad | confidentiality breach | need-to-know SAR vault |
| Human review under-capacity | rubber-stamp decisions | capacity planning and QA |

20. Final Operating View

AI financial crime architecture should answer ten questions:

  1. Which typologies are in scope?
  2. Which source anchors justify them?
  3. Which red flags are observable?
  4. Which scenarios cover which products, channels and customers?
  5. Which data gaps create blind spots?
  6. Which evidence supports each alert and case?
  7. What did AI assist with, and what was prohibited?
  8. Who made the human decision?
  9. Can the SAR narrative be traced to supporting documentation?
  10. Can audit replay the chain and management see coverage gaps?

Final memory sentence:

A mature financial crime AI system is typology-first, scenario-governed, evidence-grounded, SAR-confidential, human-owned and audit-replayable.