AI Document Intelligence / Unstructured Data / Evidence Quality Playbook

Positioning: For CBAP+ Senior BAs, senior AI PMs, Product Architects, Enterprise Architects, Operations Architects, and owners of Records / Information Governance, Fraud Risk, Model Risk, KYC/KYB, Claims, Disputes, and Loan and Insurance Servicing. The aim is to take document intelligence from "automatically reading documents" to an evidence-grade, workflow-ready, records-aware, auditable operating system.

Scope: bank statements, paystubs, claims packages, payment disputes, KYC/KYB documents, insurance and loan servicing documents, complaints, operational correspondence, mailroom automation, agent assist and workflow automation.

Core outputs: executive framing, taxonomy, decision gates, target architecture, required artifacts, RACI/operating model, implementation roadmap, evidence pack, release checklists, metrics, anti-patterns, tabletop scenarios and portfolio deliverables.

Core thesis:

Financial document AI succeeds when the institution can rely on extracted evidence, not when the model can read more PDFs.


0. Disclaimer

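This document is material for learning, portfolio building, architecture training and internal governance discussion. It does not constitute legal advice, a compliance conclusion, a records retention conclusion, e-discovery advice, a KYC/KYB sufficiency judgment, a loan or insurance underwriting conclusion, a consumer dispute handling opinion, a fraud handling instruction, a model validation report or a vendor recommendation.

A formal project must be assessed jointly by Legal, Compliance, Privacy, Records Management, Information Governance, Model Risk, Fraud Risk, Financial Crime, Operations, Product, Architecture, Information Security, Data Governance, Vendor Management, Internal Audit and the relevant business owners. The specific applicability of records, evidence, legal hold, customer notification, KYC/KYB, credit, insurance, complaint, dispute, claim, cross-border data and e-discovery requirements depends on product, record type, jurisdiction, retention schedule, legal hold status, customer segment, channel, policy, contract and Legal / Compliance / Records interpretation.

Boundary principles:

  • An OCR result is extracted text, not a verified fact.
  • An LLM summary is a productivity aid, not a substitute for the original evidence.
  • A confidence score is a routing/control input, not a business conclusion.
  • Fraud/tamper model output is a risk signal, not a fraud conclusion.
  • The specific applicability of records retention, legal hold, e-discovery, KYC/KYB, lending, insurance and dispute obligations must be interpreted by the corresponding governance functions.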

1. Executive Framing

Common executive goals:

Reduce manual document review.
Accelerate onboarding, claims, disputes and servicing.
Use AI to summarize and extract unstructured documents.

The real project goal should be restated as:

Create an evidence-grade document intelligence capability
that automates low-risk extraction,
routes ambiguity to skilled review,
prevents unsupported decisions,
preserves records and legal hold controls,
and can replay every material document-driven action.

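Without an evidence architecture, document AI introduces hidden risks:

  • A bank statement is misread by OCR and income calculations go wrong in bulk.
  • A forged paystub template is extracted with high confidence and enters the underwriting workflow.
  • A claim photo is reused and the fraud signal never reaches the adjuster.
  • A dispute evidence summary omits key merchant proof, leading to improper handling.
  • A KYB document leaves the authorized representative unclear, but AI prefill pushes account opening forward anyway.
  • A legal hold has been triggered, but OCR JSON, summaries, review notes or vendor copies are purged.
  • When a customer complains, the team cannot find what the model saw, who reviewed what, or why the action was taken.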

Executive one-liner:

Document AI is a controlled evidence supply chain for operations, risk and records.

1.1 Steering Committee Questions

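  1. Which document classes and fields can be extracted automatically, and which can only assist human review?
  2. Which fields affect customer rights, funds, identity, insurance, credit, disputes or compliance?
  3. How do we prove which document, page, region and model version a field came from?
  4. How are confidence thresholds calibrated, and how do reviewer overturns feed back into them?
  5. How are records retention, legal hold, vendor copies and derived AI artifacts governed?
  6. When a complaint, audit, regulatory inquiry or litigation hold arises, can the case be fully replayed?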

2. Source Anchors

| Anchor | Official link | How this playbook uses it |
| --- | --- | --- |
| NIST AI Risk Management Framework | https://www.nist.gov/itl/ai-risk-management-framework | Govern / Map / Measure / Manage organizes document AI risk, eval, monitoring, human oversight, incident and evidence controls |
| NIST Privacy Framework | https://www.nist.gov/privacy-framework | Privacy risk management, data minimization, purpose limitation, processing controls and monitoring shape the document data boundaries |
| NARA Records Management | https://www.archives.gov/records-mgmt | Records lifecycle, disposition, records program and accountability shape the records governance interface |
| NARA Electronic Records Management | https://www.archives.gov/records-mgmt/policy/transfer-guidance-tables.html | Electronic records metadata, format and transfer/readiness guidance frame the architecture discussion on records that can be preserved, retrieved and migrated |
| CFPB Consumer Complaint Database | https://www.consumerfinance.gov/data-research/consumer-complaints/ | A consumer complaint operations perspective shapes complaint linkage, evidence replay, root cause and remediation learning loop |
| FFIEC Authentication and Access to Financial Institution Services and Systems | https://www.ffiec.gov/press/pr081121.htm | Financial institution access/authentication and layered security thinking shape reviewer access, document workflow action and privileged operation controls |
| ISO/IEC 42001 overview | https://www.iso.org/standard/42001 | AI management system, roles, operation, performance evaluation, internal audit and continual improvement underpin the operating model |

Source-to-control pattern:

source anchor -> control objective -> product decision
  -> workflow requirement -> evidence artifact -> owner -> metric

3. Taxonomy

3.1 Document Use Case Classes

| Class | Examples | Primary decision impact |
| --- | --- | --- |
| Income and affordability | bank statements, paystubs, payroll summaries, benefit letters | income evidence, servicing treatment, affordability package |
| Identity and entity | IDs, business registration, ownership docs, licenses, utility bills | KYC/KYB evidence, authority, onboarding routing |
| Claims and loss | claim forms, photos, invoices, police reports, medical bills | claim triage, payout support, fraud review |
| Payment disputes | receipts, shipping proof, merchant correspondence, customer statements | dispute reason code, evidence package, SLA |
| Account authority | POA, court order, death certificate, consent forms, corporate resolution | account access, maintenance, servicing authority |
| Complaints and correspondence | letters, emails, transcripts, attachments | complaint classification, response evidence, RCA |
| Internal operations | branch scans, mailroom forms, agent notes, back office forms | task routing, QA, operational control |

3.2 Evidence Classes

| Evidence class | Meaning | Governance need |
| --- | --- | --- |
| Raw artifact | original file or image | hash, source, access, retention |
| Rendered artifact | system-generated page image / normalized PDF | renderer version, lineage |
| OCR/layout artifact | text, table, reading order, bounding boxes | source anchoring, versioning |
| Extracted field | schema-bound field and value | confidence, validation, review |
| Derived fact | calculations or reconciled facts | formula, input trace, policy use |
| AI summary | source-linked narrative for reviewer | prohibited conclusions, citation |
| Human decision | reviewer correction, acceptance, escalation | reason code, role, timestamp |
| Workflow action | case update, payment, request-for-info, escalation | policy decision id, evidence link |
| Records metadata | record class, retention rule, legal hold flag | Records/Legal governance |

3.3 Criticality Levels

| Level | Examples | Default treatment |
| --- | --- | --- |
| Low | document title, non-decision routing label | auto-populate with sampling |
| Medium | address, product type, non-material date | validation and review on conflict |
| High | income amount, claim amount, policy number, account holder, dispute reason | field-level threshold, source link, validations, review triggers |
| Restricted / specialist | POA, court order, beneficial ownership, medical record, fraud signal, legal hold | specialized queue, access controls, policy review |

4. Target Operating Architecture

channel intake
  -> document provenance and storage
  -> normalization / rendering / safety scan
  -> classification and package splitting
  -> OCR / layout / table extraction
  -> multimodal field and entity extraction
  -> normalization and validation
  -> confidence calibration
  -> fraud and tamper checks
  -> evidence policy gates
  -> human review workbench
  -> workflow orchestration
  -> records retention and legal hold integration
  -> evidence ledger, complaint linkage, monitoring and governance

Architecture capabilities:

| Capability | What it must do |
| --- | --- |
| Document provenance | Assign document id, hash, source channel, received time, custody events |
| Document classification | Identify type/subtype, package boundaries, language, quality, ambiguity |
| Layout understanding | Preserve page, coordinates, reading order, tables, checkboxes, signatures |
| Schema-constrained extraction | Extract only governed fields with allowed values and source anchors |
| Entity normalization | Normalize names, dates, addresses, monetary values, IDs, account masks |
| Evidence validation | Run arithmetic, chronology, cross-document and system-of-record checks |
| Confidence and triage | Route by calibrated confidence, field criticality and customer impact |
| Fraud/tamper detection | Detect altered files, duplicates, fake templates, metadata anomalies, prompt injection |
| Human review | Source-first review, structured overrides, QA sampling and dual control |
| Workflow integration | Feed KYC, claims, disputes, servicing, complaints and ops queues |
| Records/hold integration | Apply record class, retention metadata, legal hold propagation and disposition controls |
| Governance monitoring | Track evals, model drift, review outcomes, complaints, incidents and CAPA |

5. Decision Gates

Gate 0: Use Case Boundary

| Question | Pass condition |
| --- | --- |
| Which journey is in scope? | one workflow named: KYC, KYB, claims, disputes, servicing, complaints or operations |
| Which documents are accepted? | document class/subtype list and unsupported formats |
| What decisions may use extracted evidence? | allowed workflow actions and prohibited actions defined |
| Is AI summarizing, extracting, classifying or recommending? | AI role and decision boundary documented |
| Does the use case touch records, legal hold, privacy, KYC/KYB, lending, insurance or dispute obligations? | governance functions identified |

Gate 1: Evidence Schema

| Question | Pass condition |
| --- | --- |
| What fields are required? | field dictionary with definitions, data types, source requirements |
| Which fields are high impact? | criticality and customer harm assessment |
| What source anchor is required? | page/coordinate/table cell/paragraph anchor rule |
| What validations are required? | arithmetic, chronology, cross-document, system match |
| What summaries are allowed? | source-linked summary rules and prohibited conclusions |

Gate 2: Confidence and Review Policy

| Question | Pass condition |
| --- | --- |
| Are confidence scores calibrated? | calibration evidence by field and document class |
| Are thresholds field-specific? | threshold matrix by criticality and journey risk |
| What routes to human review? | review triggers and queue ownership |
| Is QA sampling defined? | sample rules for auto-processed and reviewed cases |
| Are reviewer overrides captured? | structured reason codes and source-linked corrections |
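
The threshold matrix and review triggers in Gate 2 can be sketched as a small routing function. This is a minimal illustration, not a reference policy: the `FieldResult` type, the queue names and the numeric values in `MIN_CONFIDENCE` are all hypothetical, and real thresholds would have to come from calibration evidence by field and document class.

```python
from dataclasses import dataclass

# Hypothetical minimum calibrated confidence per criticality level.
# Real values must come from calibration evidence, not from this sketch.
MIN_CONFIDENCE = {"low": 0.80, "medium": 0.90, "high": 0.97}


@dataclass(frozen=True)
class FieldResult:
    name: str
    criticality: str          # "low", "medium", "high" or "restricted"
    confidence: float         # calibrated confidence in [0, 1]
    has_source_anchor: bool
    validations_passed: bool
    fraud_signal: bool = False


def route_field(field: FieldResult) -> str:
    """Return a (hypothetical) queue name for one extracted field."""
    if field.criticality == "restricted":
        return "specialist_queue"      # restricted fields are never auto-populated
    if field.fraud_signal:
        return "fraud_review"          # a risk signal, not a fraud conclusion
    if not field.has_source_anchor or not field.validations_passed:
        return "human_review"          # score alone cannot compensate for these
    if field.confidence < MIN_CONFIDENCE[field.criticality]:
        return "human_review"
    return "auto_populate_with_qa_sampling"
```

The ordering is the design choice worth noting: criticality, fraud signals, source anchors and validations are all checked before the confidence score, so a high score can never bypass them. An institution could reasonably go further and send every high-criticality field to review.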

Gate 3: Fraud, Security and Access

| Question | Pass condition |
| --- | --- |
| Can the system detect tamper and duplicate patterns? | metadata, visual, duplicate and validation checks |
| Are prompt injection and malicious files controlled? | isolation, safety scan, schema validation |
| Are reviewer and vendor access controlled? | role-based access, logging, least privilege |
| Are privileged workflow actions protected? | step-up/dual control where appropriate |
| Are fraud signals routed without leaking sensitive rules? | reviewer-only signal display and customer-safe language |

Gate 4: Records, Legal Hold and Replay

| Question | Pass condition |
| --- | --- |
| Are raw and derived artifacts classified? | record class and retention metadata assigned |
| Does legal hold propagate? | raw, OCR, extraction, summaries, review notes, exports and vendor copies checked |
| Can disposition be audited? | disposition workflow and exception log |
| Can a complaint or audit replay the case? | evidence pack with document, model, review, workflow and final message |
| Are vendor obligations tracked? | retention, deletion, access, audit and exit evidence |

Gate 5: Production Readiness

| Question | Pass condition |
| --- | --- |
| Are eval sets representative? | document quality, channel, language, layout, fraud pattern, customer segment covered |
| Are monitoring dashboards live? | extraction, confidence, review, records, fraud, complaints and incidents |
| Are operations trained? | reviewer playbook and QA plan completed |
| Is incident response ready? | model defect, vendor outage, legal hold miss, fraud pattern, complaint spike playbooks |
| Is go/no-go tied to evidence quality? | release decision uses balanced scorecard, not automation rate alone |

6. Required Artifacts

| Artifact | What it proves |
| --- | --- |
| Use Case Boundary Card | The journey, documents, AI role, business actions, risk and governance owners are defined |
| Document Class Taxonomy | Document classes, subtypes, unsupported documents and routing are defined |
| Evidence Field Dictionary | Field definitions, source anchors, data types, criticality and allowed uses are defined |
| Extraction and Validation Spec | OCR/layout/extraction/normalization/validation controls exist |
| Confidence and Review Policy | When to automate, when to require human review and when to apply dual control |
| Fraud/Tamper Threat Model | Altered PDF, fake template, duplicate, deepfake, insider and prompt injection are covered |
| Records and Legal Hold Mapping | How raw and derived artifacts are classified, retained, held and disposed |
| Workflow Contract | How the evidence payload enters case/task/decision systems |
| Human Review Workbench Spec | Reviewers can see source, confidence, conflicts, reason codes and history |
| AI Model and Prompt Inventory | Models, rulesets, prompts, vendors, versions and allowed uses are inventoried |
| Eval and QA Scenario Suite | Testing covers document class, field, language, quality, fraud and process outcome |
| Evidence Bundle Schema | Cases can be replayed for complaints, audits and regulatory inquiries |
| RACI and Governance Cadence | Responsibility boundaries across Product, Ops, Records, Legal, Model Risk, Fraud and Architecture |

6.1 Evidence Field Dictionary Pattern

| Field | Example |
| --- | --- |
| field_name | statement_period_end_date |
| document_class | bank statement |
| source_anchor | page + coordinate + OCR text |
| data_type | date |
| criticality | high for income verification, medium for routing |
| normalization | ISO date with locale parsing |
| validations | period continuity, not future date, matches page header |
| review_triggers | low confidence, missing page, conflict with upload metadata |
| allowed_uses | statement completeness, income package preparation |
| prohibited_uses | final lending decision by itself |
| retention_link | derived field linked to statement artifact |
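
One way to make a dictionary entry enforceable rather than descriptive is to hold it as typed data and check intended uses against it. The sketch below encodes the example row values above; the `FieldDefinition` class and its `permits` method are hypothetical names introduced for illustration, not part of any existing system.

```python
from dataclasses import dataclass


@dataclass
class FieldDefinition:
    field_name: str
    document_class: str
    data_type: str
    criticality_by_use: dict      # use -> criticality level
    validations: tuple
    review_triggers: tuple
    allowed_uses: frozenset
    prohibited_uses: frozenset

    def permits(self, use: str) -> bool:
        """A use is permitted only if explicitly allowed and not prohibited."""
        return use in self.allowed_uses and use not in self.prohibited_uses


STATEMENT_PERIOD_END = FieldDefinition(
    field_name="statement_period_end_date",
    document_class="bank statement",
    data_type="date",
    criticality_by_use={"income verification": "high", "routing": "medium"},
    validations=("period continuity", "not future date", "matches page header"),
    review_triggers=("low confidence", "missing page", "conflict with upload metadata"),
    allowed_uses=frozenset({"statement completeness", "income package preparation"}),
    prohibited_uses=frozenset({"final lending decision by itself"}),
)
```

Because `permits` is deny-by-default, a downstream consumer that asks for an unlisted use is refused without anyone having to anticipate that use in `prohibited_uses`.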

6.2 Workflow Contract Pattern

Workflow:
Trigger:
Accepted document classes:
Required fields:
Optional fields:
Excluded data:
Confidence thresholds:
Validation rules:
Fraud/tamper routes:
Human review triggers:
Records metadata:
Legal hold behavior:
Case update payload:
Customer communication constraints:
Monitoring metrics:
Owner:

7. RACI / Operating Model

| Activity | Accountable | Responsible | Consulted | Informed |
| --- | --- | --- | --- | --- |
| Use case prioritization | Product Owner | AI PM / Senior BA | Operations, Risk, CX | Steering Committee |
| Document taxonomy | Operations Owner | BA / Process Architect | Records, Legal, Compliance | Product |
| Evidence field dictionary | Product / Ops | BA / Data Product | Compliance, Model Risk, Fraud | Review teams |
| Extraction architecture | Enterprise Architecture | Engineering / AI Platform | Security, Data Governance, Vendor Mgmt | Operations |
| Confidence policy | Business Risk Owner | AI PM / Model Risk / Ops QA | Fraud, Compliance, Product | Audit |
| Human review design | Operations Owner | Ops Lead / UX / BA | Compliance, Accessibility, Fraud | Product |
| Fraud/tamper model | Fraud Risk | Fraud Analytics / Security | Product, Model Risk, Legal | Ops |
| Records mapping | Records Management | Information Governance / Data | Legal, Compliance, Architecture | Product |
| Legal hold propagation | Legal / Records | Platform / Case Systems | Compliance, Vendor Mgmt | Audit |
| Privacy controls | Privacy / Data Governance | Product / Engineering | Legal, Security, CX | Ops |
| Model inventory and eval | Model Risk / AI Governance | AI Platform / AI PM | Product, Compliance, Ops | Audit |
| Workflow integration | Operations / Product | Engineering / Case Platform | Architecture, Records, Fraud | CX |
| Complaint linkage | Complaint Ops | Case Management / Data | Compliance, Product, Model Risk | Audit |
| Independent assurance | Internal Audit | Audit Team | Risk, Legal, Technology | Board Committee |

Governance cadence:

| Cadence | Forum | Output |
| --- | --- | --- |
| Weekly | Pilot operations review | review backlog, extraction defects, queue SLA |
| Weekly | Fraud/tamper review | emerging patterns, false positives, confirmed fraud yield |
| Monthly | Evidence quality council | field accuracy, calibration, overturns, downstream rework |
| Monthly | Records and hold review | metadata completeness, hold propagation tests, vendor attestations |
| Monthly | AI governance review | eval results, prompt/ruleset changes, incidents |
| Quarterly | Product/risk steering committee | scale, restrict, redesign, retire decisions |
| Quarterly | Complaint learning loop | complaint themes, RCA, CAPA and policy changes |
| Semiannual | Tabletop exercise | legal hold miss, vendor outage, model defect, fraud pattern |

8. Implementation Roadmap

Days 1-30: Baseline and Scope

| Day range | Work | Artifact |
| --- | --- | --- |
| 1-3 | Select one bounded workflow, e.g. paystub income extraction, claims invoice review, dispute evidence package | Use Case Boundary Card |
| 4-7 | Inventory document classes, sources, channels, volumes, current defects | Document Class Taxonomy |
| 8-12 | Define high-impact fields, source anchors, allowed and prohibited uses | Evidence Field Dictionary |
| 13-16 | Map records, retention, legal hold and vendor copy requirements for raw and derived artifacts | Records and Legal Hold Mapping |
| 17-20 | Design extraction, validation, confidence and review thresholds | Extraction and Review Policy |
| 21-24 | Threat model tamper, duplicate, prompt injection, insider and vendor risks | Fraud/Tamper Threat Model |
| 25-27 | Define workflow payload, case states, reason codes and customer communication constraints | Workflow Contract |
| 28-30 | Define metrics, eval sets, QA sampling and evidence bundle | Control Dashboard Spec |

Days 31-60: Controlled Build and Pilot

| Day range | Work | Artifact |
| --- | --- | --- |
| 31-35 | Implement provenance, hash, storage, rendering and source anchoring | Provenance Test Report |
| 36-40 | Configure classification, OCR/layout and schema-constrained extraction | Extraction Test Report |
| 41-45 | Add validation, confidence routing and review queues | Review Routing Report |
| 46-50 | Build human review workbench with source-linked fields and override capture | Reviewer QA Record |
| 51-54 | Integrate fraud/tamper checks and security/access controls | Fraud Control Report |
| 55-57 | Connect records metadata, retention and legal hold flags | Records Integration Report |
| 58-60 | Pilot with limited scope and manual QA sampling | Pilot Evidence Quality Report |

Days 61-90: Scale Decision and Assurance

| Day range | Work | Artifact |
| --- | --- | --- |
| 61-65 | Analyze field accuracy, confidence calibration, review overturn and complaints | Outcome Review |
| 66-70 | Tune thresholds, validations, queues and customer request-for-info language | Change Control Record |
| 71-75 | Test legal hold propagation, disposition exception and vendor deletion evidence | Hold and Disposition Test |
| 76-80 | Run model defect, vendor outage and fraud pattern tabletop | Tabletop Decision Log |
| 81-85 | Complete model risk, privacy, records and compliance review | Governance Review Pack |
| 86-90 | Decide scale, restrict, redesign or retire | Go/No-Go Decision Record |

9. Evidence Pack

Minimum evidence fields:

| Field | Purpose |
| --- | --- |
| case_id | operational case reference |
| workflow_id | KYC, KYB, claims, disputes, servicing, complaints |
| document_id | unique document reference |
| document_version | raw and derived artifact version |
| source_channel | upload, branch scan, mail, email, API, vendor |
| received_timestamp | intake time |
| raw_file_hash | integrity check |
| document_class | type and subtype |
| classification_result | model/rule output and confidence |
| page_count | completeness signal |
| quality_result | blur, missing pages, render errors |
| processing_lineage | OCR/layout/extraction model and rule versions |
| extracted_fields | field values with source anchors |
| validation_results | rules, cross-document and system checks |
| confidence_results | field and case scores with calibration bucket |
| fraud_tamper_signals | duplicate, metadata, visual, arithmetic, prompt injection |
| human_review_records | reviewer decisions, corrections, reason codes |
| policy_decision | accept evidence, request more info, review, reject use |
| workflow_action | case update, payment, escalation, request, communication |
| ai_summary_run_id | model summary trace when used |
| records_metadata | record class, retention rule, disposition state |
| legal_hold_flag | hold status and propagation reference |
| access_events | reviewer/system/vendor access |
| customer_final_message_id | final communication or request-for-info |
| complaint_id | linked complaint if applicable |
| capa_id | corrective action when defect found |
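
A replay-readiness check over these fields might look like the sketch below. It assumes the evidence pack is available as a plain dictionary keyed by the field names in the table, and it treats the three fields the table marks as conditional ("when used", "if applicable", "when defect found") as optional. The function name `missing_fields` and the rule that a field counts as missing only when absent or `None` are assumptions made for illustration.

```python
# Fields from the minimum evidence table that every pack is expected to carry.
REQUIRED_FIELDS = (
    "case_id", "workflow_id", "document_id", "document_version",
    "source_channel", "received_timestamp", "raw_file_hash", "document_class",
    "classification_result", "page_count", "quality_result",
    "processing_lineage", "extracted_fields", "validation_results",
    "confidence_results", "fraud_tamper_signals", "human_review_records",
    "policy_decision", "workflow_action", "records_metadata",
    "legal_hold_flag", "access_events", "customer_final_message_id",
)

# Fields the table describes as conditional, so absence is not a defect.
CONDITIONAL_FIELDS = ("ai_summary_run_id", "complaint_id", "capa_id")


def missing_fields(pack: dict) -> list:
    """List required fields that are absent or None, in table order.

    Empty lists and False are treated as present: "no fraud signals found"
    and "no legal hold" are valid recorded states, not gaps.
    """
    return [name for name in REQUIRED_FIELDS if pack.get(name) is None]
```

Running such a check when a case closes, rather than when a complaint arrives, is what turns the evidence pack from a schema into a control.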

Evidence rules:

  • Store raw artifacts and derived artifacts with clear lineage, not as one mutable blob.
  • Preserve source anchors for every material extracted field.
  • Treat a missing source anchor for a high-impact field as a control defect.
  • Separate reviewer-facing summaries from official records and decision evidence unless Records/Legal approves their role.
  • Make legal hold state visible to deletion, training-data selection, vendor purge and export jobs.
  • Capture final customer/counterparty communication, because disputes often turn on what was said and when.
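
The first rule, raw and derived artifacts with clear lineage, can be illustrated with content hashing and parent references. This is a minimal sketch using Python's standard `hashlib`; the `Artifact` type and the helper names are hypothetical, and a real system would persist these records in an immutable store rather than in memory.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Artifact:
    artifact_id: str
    kind: str                               # e.g. "raw", "ocr", "extraction", "summary"
    content_sha256: str
    parent_id: Optional[str] = None         # None only for raw artifacts
    producer_version: Optional[str] = None  # model/renderer/ruleset version


def sha256_hex(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


def register_raw(artifact_id: str, content: bytes) -> Artifact:
    """Record a raw artifact by its content hash at intake."""
    return Artifact(artifact_id, "raw", sha256_hex(content))


def register_derived(artifact_id: str, kind: str, content: bytes,
                     parent: Artifact, producer_version: str) -> Artifact:
    """Record a derived artifact as a new object that points at its parent."""
    return Artifact(artifact_id, kind, sha256_hex(content),
                    parent.artifact_id, producer_version)


def verify_integrity(artifact: Artifact, content: bytes) -> bool:
    """True when the stored bytes still match the hash recorded earlier."""
    return sha256_hex(content) == artifact.content_sha256
```

Each derived artifact is a separate frozen record with its own hash, so re-running OCR with a new model version adds a sibling under the same parent instead of overwriting what an earlier decision relied on.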

10. Workflow Playbooks

10.1 Bank Statement / Paystub Income Evidence

| Step | Control |
| --- | --- |
| Intake | verify page count, statement/pay period, source channel and raw hash |
| Extract | employer, employee, period, gross/net/YTD, deposits, balances, account mask |
| Validate | arithmetic, period continuity, YTD consistency, name/account match, duplicate check |
| Confidence | high-impact amount and identity fields use stricter thresholds |
| Review | route mismatches, missing pages, fake-template signals, high amount outliers |
| Workflow | create income evidence package, not final affordability conclusion by itself |
| Records | attach raw and derived artifacts to case record class and hold state |
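
Two of the Validate checks, arithmetic and period continuity, can be sketched for bank statements as follows. The `Statement` type and its field names are hypothetical; the sketch assumes each statement exposes opening and closing balances plus credit and debit totals, which not every statement layout provides. `Decimal` is used so that currency arithmetic is exact.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal


@dataclass(frozen=True)
class Statement:
    period_start: date
    period_end: date
    opening_balance: Decimal
    total_credits: Decimal
    total_debits: Decimal
    closing_balance: Decimal


def balance_reconciles(s: Statement) -> bool:
    """Arithmetic check: opening + credits - debits must equal closing."""
    return s.opening_balance + s.total_credits - s.total_debits == s.closing_balance


def periods_are_continuous(statements) -> bool:
    """Chronology check: each period starts the day after the previous one ends."""
    ordered = sorted(statements, key=lambda s: s.period_start)
    return all(
        later.period_start == earlier.period_end + timedelta(days=1)
        for earlier, later in zip(ordered, ordered[1:])
    )


def validation_failures(statements) -> list:
    """Collect human-readable failures to route to review; empty means pass."""
    failures = []
    for s in statements:
        if not balance_reconciles(s):
            failures.append(
                f"balance mismatch for period ending {s.period_end.isoformat()}"
            )
    if not periods_are_continuous(statements):
        failures.append("statement periods are not continuous")
    return failures
```

A non-empty result is a review trigger, not a fraud or income conclusion: a gap may simply mean a missing upload, which is why the failures are phrased for a reviewer rather than acted on automatically.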

10.2 Claims Package

| Step | Control |
| --- | --- |
| Intake | split package into forms, invoices, photos, reports, correspondence |
| Extract | claimant, policy, loss date, invoice totals, provider, photo metadata |
| Validate | timeline consistency, duplicate invoice/photo, coverage period, amount totals |
| Confidence | photo and invoice authenticity signals influence triage |
| Review | adjuster sees source-linked timeline and conflicting evidence |
| Workflow | route to payout support, investigation, request-for-info or denial review per policy |
| Records | preserve package, adjuster notes, AI summary and final communication linkage |

10.3 Payment Dispute Evidence

| Step | Control |
| --- | --- |
| Intake | classify cardholder evidence, merchant evidence, shipping proof, screenshots |
| Extract | transaction, merchant, date, amount, reason code, delivery/tracking details |
| Validate | match transaction system, reason-code evidence checklist, SLA deadline |
| Confidence | summary cannot replace required proof element |
| Review | route weak evidence or conflicting claims to dispute specialist |
| Workflow | create source-linked representment or response package |
| Records | preserve submitted and outgoing evidence with final response |

10.4 KYC/KYB Documents

| Step | Control |
| --- | --- |
| Intake | identify individual, entity, license, ownership, authority and address documents |
| Extract | names, entity IDs, owners, roles, license status, address, dates |
| Validate | freshness, entity resolution, authority scope, conflicting ownership |
| Confidence | high-impact authority and ownership fields require stricter handling |
| Review | route beneficial ownership, authorization ambiguity and stale documents |
| Workflow | evidence package for compliance/ops review, not automatic compliance conclusion |
| Records | retention and hold handling per product, record type and governance interpretation |

10.5 Complaints and Operational Correspondence

| Step | Control |
| --- | --- |
| Intake | detect complaint indicators, product, customer, attachments, urgency |
| Extract | issue theme, harm, requested resolution, dates, referenced transactions |
| Validate | case/customer match, prior interactions, response deadlines |
| Confidence | low confidence complaint classification goes to complaint ops |
| Review | preserve customer voice and avoid summary-only handling |
| Workflow | create complaint case, route RCA, attach evidence and final response |
| Records | link documents, AI runs, agent notes, response and CAPA |

11. Checklists

11.1 Release Checklist

| Check | Passing evidence |
| --- | --- |
| Use case and decision boundary documented | Use Case Boundary Card |
| Document classes and unsupported formats defined | Document Class Taxonomy |
| High-impact fields identified | Evidence Field Dictionary |
| Source anchor required for material fields | extraction schema test |
| Confidence thresholds calibrated | calibration report |
| Human review queues configured | workflow test |
| Reviewer override reasons captured | review workbench test |
| Fraud/tamper checks active | threat model and test cases |
| Records metadata assigned | records integration test |
| Legal hold propagation tested | hold propagation evidence |
| Vendor retention and access controls reviewed | vendor control record |
| Privacy minimization and redaction reviewed | privacy assessment evidence |
| Complaint replay path tested | simulated complaint evidence pack |
| Monitoring dashboard live | production readiness dashboard |

11.2 Extraction QA Checklist

| Check | Passing evidence |
| --- | --- |
| OCR/layout preserves page and coordinates | source anchor sample |
| Table extraction handles merged cells and totals | table validation result |
| Date and currency normalization tested | locale test cases |
| Entity resolution does not overwrite raw value | raw and normalized field pair |
| Low-quality images routed | quality gate log |
| Missing pages detected | completeness rule |
| Model output schema enforced | invalid output rejection |
| Prompt injection ignored | adversarial document test |
| High-confidence wrong fields sampled | QA sampling report |
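
The "raw and normalized field pair" and "locale test cases" checks can be sketched together for dates. The locale keys and formats in `DATE_FORMATS` are hypothetical placeholders; a real deployment would govern that list per channel and jurisdiction. Parsing uses the standard `datetime.strptime`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical per-locale date formats used only for this illustration.
DATE_FORMATS = {"en_US": "%m/%d/%Y", "en_GB": "%d/%m/%Y", "iso": "%Y-%m-%d"}


@dataclass(frozen=True)
class NormalizedDate:
    raw: str                   # text exactly as extracted, never overwritten
    locale: str
    iso_value: Optional[str]   # None when the raw text cannot be parsed


def normalize_date(raw: str, locale: str) -> NormalizedDate:
    """Parse a date under an explicit locale and keep the raw value alongside."""
    fmt = DATE_FORMATS.get(locale)
    if fmt is None:
        return NormalizedDate(raw, locale, None)
    try:
        parsed = datetime.strptime(raw.strip(), fmt).date()
    except ValueError:
        return NormalizedDate(raw, locale, None)
    return NormalizedDate(raw, locale, parsed.isoformat())
```

The same raw text `03/04/2024` normalizes to different dates under `en_US` and `en_GB`, which is exactly why the locale must be an explicit input and why a failed parse returns `None` for review instead of a guessed value.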

11.3 Human Review Checklist

| Check | Passing evidence |
| --- | --- |
| Reviewer sees original source next to extracted field | UI test |
| Source coordinate click works | workbench test |
| Confidence and validation failures visible | reviewer screenshot/spec |
| Sensitive fraud details protected | role-based display |
| Override captures reason and corrected value | review log |
| Dual control applied where required | approval record |
| QA samples auto-pass and manual-pass cases | QA plan |
| Reviewer feedback does not train models without approval | data governance control |

11.4 Records and Legal Hold Checklist

| Check | Passing evidence |
| --- | --- |
| Raw document has record metadata | record metadata sample |
| OCR/layout/extraction artifacts classified | derived artifact inventory |
| AI summaries mapped to retention treatment | Records/Legal decision record |
| Legal hold propagates to all linked artifacts | propagation test |
| Vendor copies checked for hold and deletion | vendor attestation/control |
| Disposition has approval and audit | disposition log |
| Training-data selection checks hold and use restrictions | data pipeline control |
| Retrieval supports case/audit/complaint replay | search/replay test |

11.5 Workflow Integration Checklist

| Check | Passing evidence |
| --- | --- |
| Case system receives structured evidence payload | integration test |
| Policy reason codes captured | decision log |
| Request-for-info language approved | CX/Compliance review |
| Final customer message linked to evidence | communication id |
| SLA timers consider document ambiguity | workflow config |
| Downstream systems do not treat extraction as final decision | contract test |
| Exceptions and fallbacks are operationally staffed | queue capacity plan |

12. Metrics and KRIs

| Metric | Why it matters |
| --- | --- |
| Field-level accuracy by criticality | avoids hiding high-impact errors |
| High-confidence wrong field rate | detects calibration failure |
| Source-anchor completeness | measures auditability |
| Classification ambiguity rate | shows routing risk |
| Missing page detection rate | protects completeness |
| Human review overturn rate | reveals extraction/control quality |
| QA defect rate after auto-pass | estimates residual risk |
| Straight-through processing rate by document class | productivity with risk context |
| Review queue SLA | operational capacity and customer impact |
| Downstream rework rate | quality of evidence entering workflow |
| Duplicate/tamper signal yield | fraud control effectiveness |
| False positive fraud review rate | customer friction and ops burden |
| Records metadata completeness | records readiness |
| Legal hold propagation success | preservation control |
| Vendor deletion/retention exceptions | third-party risk |
| Complaint document trace completeness | complaint/audit readiness |
| AI unsupported conclusion defects | model governance |
| Accessibility defect rate | inclusive operations |
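
Three of these metrics can be computed directly from a QA sample, as in the sketch below. The `QaSample` record and the 0.95 default for "high confidence" are hypothetical; the definition of "wrong" is assumed to come from QA ground truth, not from the model itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QaSample:
    field_name: str
    confidence: float        # calibrated confidence reported at extraction time
    model_correct: bool      # per QA ground truth
    has_source_anchor: bool
    human_reviewed: bool
    overturned: bool         # reviewer changed the model value


def _rate(hits: int, total: int) -> Optional[float]:
    """Return hits/total, or None when there is nothing to measure."""
    return hits / total if total else None


def high_confidence_wrong_rate(samples, threshold: float = 0.95) -> Optional[float]:
    """Share of high-confidence fields that QA found to be wrong."""
    confident = [s for s in samples if s.confidence >= threshold]
    return _rate(sum(not s.model_correct for s in confident), len(confident))


def source_anchor_completeness(samples) -> Optional[float]:
    """Share of sampled fields that carry a source anchor."""
    return _rate(sum(s.has_source_anchor for s in samples), len(samples))


def review_overturn_rate(samples) -> Optional[float]:
    """Share of human-reviewed fields where the reviewer changed the value."""
    reviewed = [s for s in samples if s.human_reviewed]
    return _rate(sum(s.overturned for s in reviewed), len(reviewed))
```

Returning `None` for an empty denominator keeps "not measured" distinct from "zero defects", which matters when a dashboard is used as release evidence.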

Balanced scorecard:

Productivity: fewer manual minutes per case.
Evidence quality: critical fields accurate, anchored and validated.
Risk: fraud, records, privacy and legal hold controls operate.
Customer outcome: fewer unnecessary document requests and complaint defects.
Governance: every material document-driven action is replayable.

13. Anti-Patterns

| Anti-pattern | Why it fails | Better pattern |
| --- | --- | --- |
| “OCR all docs, then ask LLM” | loses layout/source control and invites hallucination | schema-constrained, source-linked extraction |
| One confidence threshold for all fields | treats routing label and income amount the same | threshold by field criticality and workflow risk |
| Summary as evidence | unsupported omissions or invented conclusions | source-linked summary plus structured evidence |
| Auto-approve based on high confidence | ignores validation, fraud, records and policy | evidence policy gate |
| Human review with no source context | reviewer cannot verify efficiently | source-first workbench |
| Reviewer override as free text only | hard to monitor and improve | reason codes + structured corrections |
| No raw artifact hash | cannot prove integrity | immutable raw artifact and lineage |
| No derived-artifact records mapping | OCR/summary/review notes become governance blind spot | artifact inventory and records metadata |
| Legal hold handled manually outside AI pipeline | deletion/training/vendor purge may miss artifacts | hold-aware data and workflow graph |
| Vendor black box extraction | weak evidence and model governance | version, lineage, eval and audit obligations |
| Fraud model makes final decision | false positives and explainability risk | fraud signal plus policy/human review |
| Automation rate as north-star metric | hides customer harm and evidence defects | balanced scorecard |

14. Tabletop Scenarios

Scenario 1: High-Confidence Paystub Error

The model extracts gross pay correctly but misreads pay period and YTD amount.
Confidence is high. The income package is about to move forward.

Expected decisions: field criticality threshold, arithmetic validation, reviewer route, downstream case hold, model defect capture.

Scenario 2: Legal Hold Propagation Miss

A legal hold is applied to a servicing case. Raw PDFs are preserved,
but OCR JSON, AI summaries and reviewer notes are scheduled for deletion.

Expected decisions: hold propagation graph, disposition stop, vendor copy check, Records/Legal escalation, CAPA.
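
The hold propagation graph and disposition stop can be sketched as a breadth-first traversal over artifact lineage. The artifact ids and the `derived_from` mapping (parent id to the ids derived from it) are hypothetical; the sketch assumes lineage is already recorded for every derived artifact and vendor copy, which is itself one of the controls under test in this scenario.

```python
from collections import deque


def artifacts_under_hold(held_ids, derived_from) -> set:
    """Return held artifacts plus everything derived from them, transitively."""
    held = set(held_ids)
    queue = deque(held_ids)
    while queue:
        current = queue.popleft()
        for child in derived_from.get(current, ()):
            if child not in held:
                held.add(child)
                queue.append(child)
    return held


def blocked_deletions(scheduled, held_ids, derived_from) -> list:
    """Artifacts a disposition job must skip because a hold reaches them."""
    held = artifacts_under_hold(held_ids, derived_from)
    return sorted(a for a in scheduled if a in held)


# Illustrative lineage for the scenario: a hold placed on the raw PDF.
lineage = {
    "raw-pdf": ["ocr-json", "vendor-copy"],
    "ocr-json": ["extracted-fields", "ai-summary"],
    "extracted-fields": ["review-notes"],
}
to_skip = blocked_deletions(
    ["ocr-json", "ai-summary", "review-notes", "unrelated-doc"],
    ["raw-pdf"],
    lineage,
)
```

Here `to_skip` contains the OCR JSON, the AI summary and the review notes but not `unrelated-doc`. The traversal only protects what the lineage graph knows about, so an artifact created without a recorded parent is the real failure mode to test for.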

Scenario 3: Dispute Summary Omits Merchant Evidence

The AI summary says the customer provided strong evidence,
but the merchant delivery proof contradicts the customer statement.

Expected decisions: source-linked summary defect, dispute specialist review, reason-code evidence checklist, eval update.

Scenario 4: Fake Bank Statement Template

A bank statement matches a known visual template but metadata and transaction pattern are inconsistent.
The extraction fields look clean.

Expected decisions: fraud/tamper signal route, no customer-facing rule leakage, manual review, duplicate/template detection improvement.

Scenario 5: KYB Authority Ambiguity

A business registration document and board resolution are uploaded.
The model extracts an officer name but cannot establish authority to open the product.

Expected decisions: authority field specialist review, KYC/KYB interpretation boundary, request-for-info wording, evidence pack.

Scenario 6: Complaint Replay Failure

A customer complains that an AI rejected their claim documents.
The team can find the raw PDF but not the model run, reviewer override or final message.

Expected decisions: evidence pack gap, incident classification, complaint remediation, logging and workflow contract update.


15. Portfolio Deliverables

| Deliverable | What it demonstrates |
| --- | --- |
| Executive one-pager | You can frame document AI as an evidence operating system, not just OCR automation |
| Use case boundary card | You can control scope and decision impact |
| Document class taxonomy | You can connect documents, processes, risk and records governance |
| Evidence field dictionary | You can define source-linked, criticality-aware extraction |
| Confidence and review policy | You can design calibrated automation and human oversight |
| Records/legal hold mapping | You can bring raw and derived artifacts under governance |
| Fraud/tamper threat model | You can cover fake documents, duplicates, prompt injection and insider risks |
| Reviewer workbench spec | You can design human review as an effective control |
| Workflow contract | You can get AI evidence into case systems correctly |
| Evidence bundle schema | You can support complaint, audit and regulatory replay |
| Metrics dashboard | You can balance efficiency, quality, risk, customer and governance |

Portfolio storyline:

I designed an AI document intelligence architecture for financial retail operations.
It converts unstructured documents into source-linked evidence,
uses calibrated confidence and validation to route work,
integrates human review, fraud/tamper checks, records retention and legal hold,
and preserves a replayable evidence trail from intake to final workflow action.

16. Interview Answers

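Q1: How do you explain the boundary of document intelligence to executives?

30 seconds:

Document intelligence is not OCR automation; it is an evidence supply chain. It connects document capture, provenance, layout, field extraction, confidence, validation, human review, fraud checks, workflow action, records retention and legal hold into one auditable system. Only low-risk, high-quality, verifiable fields can be automated; high-impact or conflicting evidence must be reviewed.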

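Q2: Which fields qualify for straight-through processing?

30 seconds:

Five conditions must hold: the field is low or medium risk, the source anchor is complete, confidence has been calibrated, business validations pass, and there is no fraud/tamper/legal/authority trigger. High-impact fields such as income, claim amount, beneficial ownership, POA and dispute reason usually cannot move forward on model score alone.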

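Q3: How do you design human-in-the-loop so that it is not a formality?

30 seconds:

Reviewers must see the source text and field location, confidence, validation failures, conflicts, history and risk signals, and must record accept, correct or escalate with a structured reason code. QA sampling, overturn monitoring, dual control and feedback governance are also required. Otherwise human review easily becomes a rubber stamp.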

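Q4: Why does document AI need records retention and legal hold governance?

30 seconds:

Because the system does not only store the original PDF; it also generates OCR text, layout JSON, extracted fields, AI summaries, review notes, exports and final decisions. Which of these are records, how long they are retained and whether legal hold covers them depends on product, record type, jurisdiction, retention schedule, hold status and Legal/Compliance/Records interpretation. The architecture must make these artifacts classifiable, retainable, freezable and retrievable.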

Q5: How do you measure whether document AI is actually usable?

30 seconds:

Look at field-level accuracy, high-confidence wrong rate, source-anchor completeness, review overturn, QA defects, downstream rework, fraud/tamper yield, records metadata completeness, legal hold propagation, complaint replay success and customer impact. Looking only at OCR accuracy or automation rate misleads decisions.


17. Practical Templates

17.1 Use Case Boundary Card

Use case:
Workflow:
Product:
Customer / business segment:
Jurisdiction / policy scope:
Channels:
Document classes:
Unsupported documents:
AI role:
Allowed workflow actions:
Prohibited workflow actions:
High-impact fields:
Fraud risks:
Records owner:
Legal hold considerations:
Human review triggers:
Evidence replay requirement:
Product owner:
Risk owner:

17.2 Document Class Card

Document class:
Subtypes:
Typical source channels:
Expected pages / sections:
Required fields:
Optional fields:
High-impact fields:
Known fraud/tamper patterns:
Quality gates:
Classification ambiguity handling:
Retention metadata:
Legal hold propagation:
Workflow route:
Specialist review triggers:

17.3 Evidence Acceptance Rule

Rule ID:
Workflow:
Document class:
Field:
Criticality:
Required source anchor:
Minimum confidence:
Required validations:
Cross-document checks:
Fraud/tamper exclusions:
Human review required when:
Auto-populate allowed:
Policy acceptance state:
Customer-facing reason:
Evidence fields:
Owner:
Review cadence:

17.4 Reviewer Decision Record

Review task ID:
Case ID:
Document ID:
Field / evidence item:
Model value:
Model confidence:
Source anchor:
Validation result:
Fraud/tamper signal:
Reviewer decision:
Corrected value:
Reason code:
Rationale:
Second approval:
Workflow action:
Customer communication:
Timestamp:
QA sample flag:

17.5 Evidence Replay Script

Case:
Question to answer:
Raw documents:
Derived artifacts:
Extraction model versions:
Field source anchors:
Validation results:
Fraud/tamper results:
Human reviews:
Policy decisions:
Workflow actions:
Customer/counterparty messages:
Records metadata:
Legal hold status:
Complaint / audit / CAPA links:
Replay conclusion:

18. Final Operating Principle

The maturity of this playbook can be tested with one question:

When a bank statement, paystub, claim package, dispute file, KYB document
or servicing letter changes a customer or business outcome,
can the institution prove exactly what document was used,
what fields were extracted,
where they came from,
how confidence and validation were handled,
who reviewed exceptions,
how fraud and records controls applied,
and why the workflow action was appropriate at that time?

If the answer is unclear, the gap is not a stronger OCR vendor. The problem is that document intelligence, operations workflow, records governance, fraud controls, model risk and evidence quality have not yet become a single operating model.