
AI Records / Retention / Legal Hold / eDiscovery Playbook

798AI_RECORDS_RETENTION_LEGAL_HOLD_EDISCOVERY_PLAYBOOK.md

AI Records / Retention / Legal Hold / eDiscovery Architecture Playbook

Positioning: For senior AI PMs, Senior BAs, AI Product Architects, Enterprise Architects, Records Management Leads, Legal Operations Partners, Compliance Technology Leads, Risk Technology Leads and Internal Audit Partners. The goal is to upgrade AI records from "log storage" into a financial retail production capability in which records can be classified, retained, preserved, searched, produced, deleted and audited.

Scope: AI copilot, agentic workflow, customer service RAG, payment dispute assistant, credit underwriting copilot, AML / fraud assistant, complaint remediation agent, wealth advisory assistant, AI platform gateway, model evaluation and incident management.

Important note: This document is learning, portfolio and internal architecture training material. It is not legal advice, a compliance conclusion, a regulatory interpretation, an eDiscovery strategy, a records schedule approval or litigation hold advice. Formal projects must be confirmed, where applicable, by Legal, Compliance, Privacy, Records Management, Risk, Model Risk, Security, Internal Audit, the Business Owner, the Operations Owner and outside counsel. Applicability depends on entity, product, jurisdiction, customer type, communication channel, record category, regulator, contract and litigation posture.


1. Executive Framing

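AI systems are breaking traditional business activity into many digital fragments that are observable but easy to lose.

A single customer service RAG answer may involve a customer prompt, identity context, uploaded document metadata, system prompt, retrieval query, retrieved chunks, model output, agent edit, approval decision, final customer message, QA evaluation and incident flag.

Not all of these fragments need long-term retention. Some of them, however, may be a business record, regulatory record, litigation-relevant ESI, model governance evidence, complaint evidence or audit evidence.

The core thesis of this playbook: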

AI recordkeeping is not log retention.
It is obligation-aware evidence architecture.

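Put plainly:

AI records governance is not about keeping logs a few days longer; it is about turning the business facts that AI has touched into governable evidence.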

1.1 Questions A Mature Capability Must Answer

| Question | Mature answer |
| --- | --- |
| What is an AI record | record object taxonomy, not raw log dump |
| Why retain | retention authority, business need, legal / regulatory / audit need |
| How long to retain | retention matrix by record category and trigger event |
| Who can freeze | legal hold authority and approval workflow |
| Where to freeze | app DB, object store, vector store, log store, warehouse, vendor store |
| How to search | indexed metadata, scoped search, privilege and sensitivity filters |
| How to produce | chain-of-custody, redaction log, hash manifest, export package |
| How to delete | disposition engine with hold and regulatory conflict checks |
| How to prove | immutable audit trail and replayable evidence |

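1.2 Design Goals

  • Give every AI record a clear owner.
  • Create records close to the real event.
  • Decide retention class at the record-object level.
  • Let legal holds propagate across systems.
  • Always check for an active hold before deletion.
  • Give regulators, auditors, litigation and internal investigations an explainable production.
  • Manage privacy minimization and records obligations together.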

2. Source Anchors

The official sources below are the architectural anchors for this document. It does not interpret the legal obligations themselves; it translates these sources into design language for AI records, retention, legal hold, audit trail and production capability.

| Anchor | Official link | How this document uses it |
| --- | --- | --- |
| FINRA Rule 4511 | https://www.finra.org/rules-guidance/rulebooks/finra-rules/4511 | Uses books and records, preserve, the six-year default and the SEA Rule 17a-4 reference to show that financial record retention needs a formal mapping |
| SEC Rule 17a-4 electronic recordkeeping overview | https://www.sec.gov/investment/amendments-electronic-recordkeeping-requirements-broker-dealers | Uses the WORM option, audit-trail alternative, third-party recordkeeping and prompt production to design the electronic record retention layer |
| CFTC 17 CFR 1.31 | https://www.ecfr.gov/current/title-17/chapter-I/part-1/subject-group-ECFR26e2c365a191fa7/section-1.31 | Uses authenticity, reliability, metadata, system inventory, readily accessible and production to design regulatory record capability |
| Federal Rule of Civil Procedure 37(e) | https://www.law.cornell.edu/rules/frcp/rule_37 | Uses ESI preservation, reasonable steps, routine operation and loss consequences to design the legal hold control |
| NARA records management guidance | https://www.archives.gov/records-mgmt/policy | Uses records lifecycle, electronic records modernization, ERM requirements and disposition governance to design the records program |
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Uses Govern / Map / Measure / Manage to organize AI record risk, evidence quality, measurement and continuous improvement |

2.1 Regulatory Nuance

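  • The FINRA / SEC / CFTC anchors mainly cover specific regulated entities and record categories; they cannot be applied mechanically to every financial retail AI use case.
  • Obligations may differ across banking, credit cards, payments, insurance, brokerage, investment advisers, futures, crypto, RWA and vendor AI.
  • Customer communications, advertising, complaints, credit, AML, trading, investment advice, privacy and employment records may each have a different schedule.
  • A legal hold may originate from litigation, regulatory inquiry, government investigation, subpoena, internal investigation or reasonable anticipation.
  • A privacy deletion request does not mean that all related records can be deleted.
  • Expiry of a retention period does not permit destruction under an active legal hold.
  • A vendor holding AI records does not mean the institution has prompt production or audit replay capability.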

2.2 NIST AI RMF Mapping

| Function | AI records question | Evidence |
| --- | --- | --- |
| Govern | Who owns the record taxonomy, schedule, hold policy and export approval | RACI, policy, governance minutes |
| Map | Which AI use cases produce regulated, business, legal or audit records | record inventory, data flow, system inventory |
| Measure | Is capture complete, do holds propagate, is export replayable | completeness metrics, audit replay, exception report |
| Manage | How are missing records, wrong deletions, hold failures and production failures handled | incident runbook, CAPA, management report |

3. Record Object Taxonomy For AI Systems

3.1 Core Terms

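| Term | Practical meaning |
| --- | --- |
| AI record | a record object in an AI workflow that has business, regulatory, legal, audit or model governance significance |
| Raw telemetry | technical events used for debugging / observability; not inherently a long-term record |
| ESI | electronically stored information, in the litigation / discovery context |
| Record class | the category that determines owner, retention, access, hold eligibility and export profile |
| Retention trigger | the event that starts the retention clock, such as created, case closed, account closed, transaction maturity |
| System inventory | describes which systems maintain records and how to access / search / produce them |
| Legal hold | suspends routine disposition of relevant records to meet a preservation obligation |
| Disposition | deletion, archive, transfer or destruction certification after expiry |
| Chain-of-custody | custody evidence for a record from capture through review to export |
| Immutable audit trail | a tamper-evident trail that can prove record creation, modification, deletion, access, hold and export |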

3.2 Top-Level Record Families

| Record family | Examples | Typical owner |
| --- | --- | --- |
| Interaction records | prompt, chat transcript, uploaded file metadata, final response | Channel / business owner |
| Retrieval records | RAG query, retrieved chunks, source ids, ranking, source hash | Knowledge / platform owner |
| Generation records | system prompt version, model output, refusal, confidence, citation | AI platform / product owner |
| Decision records | recommendation, human decision, approval, override, action hash | Business process owner |
| Execution records | tool call, API request, side effect, tool response, error | Tool owner / operations |
| Communication records | customer email, chat, letter, SMS, in-app notice | Channel / communications owner |
| Model governance records | eval dataset, score, defect, validation, release approval | Model risk / AI governance |
| Incident records | alert, triage, impact, containment, notification, CAPA | Incident / risk owner |
| Audit records | access, modification, deletion, hold, export, replay | Security / records / audit |

3.3 Obligation-Based Categories

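| Obligation category | Description | Examples |
| --- | --- | --- |
| Business record | supports business decisions and customer service | dispute decision packet, complaint resolution note |
| Regulatory record | required to be retained by law, regulatory rule or supervisory expectation | customer communication, broker-dealer books and records |
| Litigation-relevant ESI | relevant to an existing or reasonably anticipated dispute | prompt and output for disputed decision |
| Model governance evidence | demonstrates model design, testing, monitoring and change control | eval results, approval memo, monitoring report |
| Security / operational evidence | demonstrates access, tool calls and incident handling | audit event, incident timeline |
| Transient telemetry | short-term observability with no business conclusion | latency traces, token counts without content |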

3.4 Classification Dimensions

| Dimension | Examples |
| --- | --- |
| record_type | user_prompt, rag_retrieval, tool_call, approval, eval_result |
| business_domain | payments, lending, AML, fraud, complaints, wealth, servicing |
| business_object | case id, account id, application id, transaction id |
| actor_chain | user, agent, model, service account, approver |
| customer_impact | none, internal only, customer-visible, financial action, adverse action |
| legal_sensitivity | normal, privileged candidate, litigation hold, investigation |
| privacy_class | no PII, PII, financial data, sensitive data, confidential |
| retention_class | transient, business, regulatory, model governance, incident |
| source_authority | policy, contract, regulatory rule, business schedule, investigation |
| export_profile | audit, regulator, litigation, privacy request, management |

3.5 Record Lifecycle

event occurs
  -> capture
  -> classify
  -> enrich metadata
  -> assign retention and access policy
  -> persist in retention-capable store
  -> monitor hold eligibility
  -> search / review / export when authorized
  -> disposition when eligible and no hold conflict
  -> disposition certificate

4.1 High-Level Architecture

AI channels and agents
  -> Record Capture SDK
  -> Metadata Enrichment Service
  -> AI Record Classifier
  -> Retention Policy Decision Point
  -> Records Registry
  -> Retention-Capable Storage
  -> Legal Hold Service
  -> Search and Review Workbench
  -> eDiscovery / Regulator Export Service
  -> Disposition Engine
  -> Immutable Audit Trail

4.2 Component Responsibilities

| Component | Responsibility |
| --- | --- |
| Record Capture SDK | standard event capture from chat, RAG, tool, approval, eval and incident flows |
| Metadata Enrichment | add actor, business object, model, source, channel, privacy and obligation metadata |
| AI Record Classifier | classify record type, retention class, legal sensitivity and hold eligibility |
| Policy Decision Point | apply retention schedule, access rule, hold rule and export restriction |
| Records Registry | searchable catalog of records, locations, owners and lifecycle state |
| Retention Store | preserve payload and metadata under WORM or audit-trail capable controls |
| Legal Hold Service | create, propagate, monitor, release holds and block disposition |
| Review Workbench | legal / compliance / audit search, review, tagging, redaction and privilege workflow |
| Export Service | controlled production package with hashes, load files and chain-of-custody |
| Disposition Engine | delete, archive or transfer only when eligible and no conflict exists |
| Immutable Audit Trail | tamper-evident history of create, read, update, delete, hold, export and policy decisions |

4.3 Storage Pattern

Metadata ledger: structured, immutable or append-only
Payload store: encrypted prompt / output / document / eval payloads
Index store: search index with rebuild lineage
Vector store: retrieval embeddings and chunk pointers
Archive store: customer communications and regulatory records
Evidence binder: exportable bundle for case, audit, regulator or litigation

4.4 WORM And Audit-Trail Capable Design

The AI architecture language should focus on:

  • original record preservation.
  • modification and deletion history.
  • actor identity and trusted timestamp.
  • reason code and policy version.
  • ability to recreate original record.
  • integrity controls, signatures or hash chain.
  • split duties for log administration.

Implementation patterns:

  • object lock with retention mode.
  • append-only event store.
  • cryptographic hash chain.
  • externalized audit ledger.
  • periodic integrity verification.

4.5 Cross-System Hold Propagation

A legal hold cannot be set on a single primary database only. AI-related records may be spread across the application database, chat transcript store, object store, vector database, search index, data warehouse, model observability platform, ticketing system, case management system, communication archive, vendor AI platform, and backup and disaster recovery store.

Hold propagation evidence should record the hold id, scope definition, target systems, propagated time, acknowledgement, failed targets, retry status, exception owner and release time.
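
As an illustration of the propagation evidence listed above, here is a minimal in-memory sketch. The class names, field names and system names (`HoldPropagation`, `TargetStatus`, `app_db` and so on) are hypothetical; a real Legal Hold Service would persist this state and call each target system's own hold mechanism.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class TargetStatus:
    """Propagation state for one target system (hypothetical structure)."""
    system: str
    propagated_at: Optional[datetime] = None
    acknowledged: bool = False
    retries: int = 0


@dataclass
class HoldPropagation:
    """Tracks which target systems have acknowledged a hold."""
    hold_id: str
    scope: str
    targets: Dict[str, TargetStatus] = field(default_factory=dict)

    def add_target(self, system: str) -> None:
        self.targets[system] = TargetStatus(system=system)

    def acknowledge(self, system: str, when: datetime) -> None:
        status = self.targets[system]
        status.propagated_at = when
        status.acknowledged = True

    def record_failure(self, system: str) -> None:
        # Count a failed propagation attempt so retry status is visible.
        self.targets[system].retries += 1

    def failed_targets(self) -> List[str]:
        """Systems that have not acknowledged the hold yet."""
        return sorted(s for s, t in self.targets.items() if not t.acknowledged)

    def fully_propagated(self) -> bool:
        return bool(self.targets) and not self.failed_targets()


hold = HoldPropagation(hold_id="hold-001", scope="complaint_case_8814")
for name in ("app_db", "vector_store", "vendor_platform"):
    hold.add_target(name)
hold.acknowledge("app_db", datetime.now(timezone.utc))
hold.record_failure("vendor_platform")
print(hold.failed_targets())  # ['vector_store', 'vendor_platform']
```

Any system still listed by `failed_targets()` is a preservation gap that needs an exception owner, which is why the sketch treats "no acknowledgement" the same as "failed".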


5. Prompt / RAG / Tool / Approval / Output / Eval / Incident Records

5.1 Prompt Records

Prompt records can include user message, system prompt version, developer prompt version, hidden policy instruction id, uploaded file metadata, identity context, channel metadata and safety classification.

Design questions:

  • Is this prompt a customer communication, employee workpaper, privileged request or transient instruction?
  • Was it used to produce a customer-impacting output?
  • Does a legal hold need to find it by customer, product, model version or topic?

Controls:

  • field-level minimization where lawful and usable.
  • payload encryption.
  • retention class at message level.
  • prompt template version control.

5.2 RAG Records

RAG records include normalized query, retrieval query, repository, document id, chunk id, chunk text hash, source version, rank, score, filters and citation.

Design questions:

  • Can the organization prove which policy or document version was retrieved?
  • Can it distinguish retrieved-but-not-used from cited-and-used?
  • Can legal hold freeze a source document version and retrieval metadata?

Controls:

  • immutable source snapshots for regulated knowledge.
  • source version id and content hash.
  • retrieval event linked to generation event.
  • index rebuild audit trail.

5.3 Tool Call Records

Tool records include requested tool, schema version, parameters, approval id, action hash, service account, execution time, response payload, side effect id, error and rollback result.

Design questions:

  • Did the tool change money, account status, customer communication or case closure?
  • Was approval required and bound to exact parameters?
  • Is the tool call a record even if it failed?

Controls:

  • approval-before-action.
  • single-use approval token.
  • immutable execution log.
  • parameter hash.
  • dual-control evidence for high-impact actions.
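
One way to bind an approval to exact parameters is to hash a canonical form of the tool call and let the approval token be spent only against that hash. The sketch below assumes SHA-256 over sorted-key JSON; `action_hash`, `ApprovalStore` and the tool name are hypothetical, and a production system would use a durable store rather than a dictionary.

```python
import hashlib
import json
from typing import Any, Dict


def action_hash(tool: str, schema_version: str, params: Dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON form of the tool call."""
    canonical = json.dumps(
        {"tool": tool, "schema_version": schema_version, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ApprovalStore:
    """In-memory stand-in for single-use, action-bound approval tokens."""

    def __init__(self) -> None:
        self._approved: Dict[str, str] = {}  # approval_id -> approved action hash

    def approve(self, approval_id: str, approved_hash: str) -> None:
        self._approved[approval_id] = approved_hash

    def consume(self, approval_id: str, requested_hash: str) -> bool:
        """Allow execution only on an exact hash match, then spend the token."""
        approved = self._approved.get(approval_id)
        if approved is None or approved != requested_hash:
            return False
        del self._approved[approval_id]
        return True


approved = action_hash("issue_provisional_credit", "v1", {"case": "c1", "amount": 50})
store = ApprovalStore()
store.approve("apr-1", approved)

# A call with a different amount does not match the approved hash.
tampered = action_hash("issue_provisional_credit", "v1", {"case": "c1", "amount": 500})
print(store.consume("apr-1", tampered))  # False
print(store.consume("apr-1", approved))  # True
print(store.consume("apr-1", approved))  # False, token already spent
```

Sorting keys before hashing means that two calls with the same parameters in a different order produce the same hash, so the approval is tied to content rather than serialization order.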

5.4 Approval And Override Records

Approval records include reviewer identity, role, authority, evidence viewed, decision, reason code, edits, override flag, time spent, conflict check and approval expiration.

Design questions:

  • Was the reviewer independent of maker?
  • Did the reviewer see original evidence or only AI summary?
  • Was approval attached to final output or only to general intent?

Controls:

  • evidence-first review UI.
  • mandatory reason code.
  • action-bound approval.
  • conflict-of-interest rule.
  • rubber-stamp monitoring.

5.5 Output Records

Output records include model output, edited output, final sent output, delivery channel, recipient, timestamp, citation list, refusal or escalation and required language.

Design questions:

  • Is this customer-visible?
  • Does it contain financial advice, credit decision, complaint response or fee information?
  • Is final sent output different from model draft?

Controls:

  • channel archive integration.
  • final-output hash.
  • pre-send review for sensitive categories.
  • source citation preservation.

5.6 Evaluation Records

Eval records include dataset id, test case, expected behavior, model version, prompt version, score, defect category, reviewer note and release gate decision.

Design questions:

  • Is eval evidence part of model governance retention?
  • Does it include real customer data?
  • Can audit reproduce the eval?

Controls:

  • dataset lineage.
  • sampling protocol.
  • reviewer independence.
  • test result immutability.
  • release approval linkage.

5.7 Incident Records

Incident records include detection alert, timeline, affected model / prompt / tool, customer impact, records impact, containment, legal / compliance notification, recovery and CAPA.

Design questions:

  • Did the incident involve lost records, wrong deletion, missing hold or improper output?
  • Does it trigger legal hold or regulator production?
  • Are incident communications themselves records?

Controls:

  • incident evidence binder.
  • preservation snapshot.
  • impacted-record query.
  • management signoff.
  • post-incident audit replay.

6. Retention Matrix

The following is a design template, not a formal retention schedule. Actual periods are approved by Legal / Compliance / Records Management according to applicable obligations.

| Record class | Examples | Trigger | Design retention approach | Hold eligible |
| --- | --- | --- | --- | --- |
| Transient observability | latency, token count, non-content traces | event created | short operational window | limited |
| Prompt with no business impact | internal brainstorming prompt | session closed | short business need | possible |
| Customer communication prompt | customer asks product / account question | message created | align to communication schedule | yes |
| Customer-visible AI output | final chat / email / letter | message sent | align to channel and product rules | yes |
| RAG retrieval evidence | policy chunks used in answer | output generated | align to related output / decision | yes |
| Business recommendation | dispute, credit, complaint recommendation | recommendation issued | align to case record | yes |
| Human approval | approve / reject / override | decision made | align to underlying action | yes |
| Tool execution | payment, account status, case closure | execution attempted | align to transaction / case | yes |
| Model eval | test result, release gate evidence | eval completed | model governance schedule | yes |
| Incident record | AI error, missing record, wrong output | incident opened / closed | incident schedule | yes |
| Audit trail | create, modify, delete, hold, export | event created | linked record period by policy | yes |
| Legal hold notice | hold notice and acknowledgement | hold issued | until release plus policy period | yes |

6.1 Retention Trigger Examples

| Trigger | Use case |
| --- | --- |
| record_created | prompt, retrieval, tool event |
| output_sent | customer communication |
| case_closed | dispute, complaint, AML alert |
| account_closed | account servicing records |
| loan_paid_off_or_denied | lending records |
| model_retired | model governance records |
| incident_closed | incident and CAPA |
| hold_released | preserved ESI disposition review |

6.2 Retention Classifier Output

{
  "record_id": "air_20260630_119_0001",
  "record_type": "rag_retrieval",
  "business_domain": "complaints",
  "business_object": "complaint_case_8814",
  "customer_impact": "customer_visible",
  "privacy_class": "financial_data",
  "retention_class": "business_regulatory_case_record",
  "retention_trigger": "case_closed",
  "hold_eligible": true,
  "export_profile": ["audit", "regulator", "litigation"],
  "policy_version": "records-ai-2026.06.30"
}

6.3 Retention Clock Controls

  • clock_start_event must be explicit.
  • system must prevent silent trigger changes.
  • retention policy version must be preserved.
  • derived records must link to source records.
  • deletion eligibility must be calculated, not guessed.
  • disposition must be blocked by active hold.
  • manual deletion must require authorized exception and audit trail.
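
To make "calculated, not guessed" concrete, the following sketch derives disposition eligibility from an explicit trigger date, a retention period and the hold flag. The `RetentionState` fields and the 365-day period are illustrative assumptions only; actual periods come from an approved schedule.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class RetentionState:
    record_id: str
    retention_trigger: str        # e.g. "case_closed"
    trigger_date: Optional[date]  # None until the trigger event has occurred
    retention_days: int           # hypothetical period, not a real schedule
    policy_version: str           # preserved so the decision can be explained later
    active_hold: bool = False


def eligible_on(state: RetentionState) -> Optional[date]:
    """Date the retention period ends, or None if the clock has not started."""
    if state.trigger_date is None:
        return None
    return state.trigger_date + timedelta(days=state.retention_days)


def can_dispose(state: RetentionState, today: date) -> bool:
    """Disposition requires a started and expired clock and no active hold."""
    end = eligible_on(state)
    if end is None or state.active_hold:
        return False
    return today >= end


record = RetentionState(
    record_id="air_20260630_119_0001",
    retention_trigger="case_closed",
    trigger_date=date(2024, 1, 1),
    retention_days=365,
    policy_version="records-ai-2026.06.30",
)
print(eligible_on(record))                    # 2024-12-31
print(can_dispose(record, date(2024, 6, 1)))  # False
```

Because the dataclass is frozen, changing the trigger or period means creating a new state object, which fits the rule that trigger changes must not happen silently.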

7.1 Trigger Sources

Legal hold may be triggered by filed litigation, reasonably anticipated litigation, regulatory inquiry, subpoena, civil investigative demand, enforcement investigation, internal investigation, customer dispute escalation, whistleblower report, data breach, AI incident, product-wide complaint pattern or employment dispute involving AI monitoring.

7.2 Hold Scope Dimensions

| Dimension | Examples |
| --- | --- |
| custodians | employee, supervisor, model owner, operations analyst |
| systems | chat, RAG, case management, tool gateway, archive, vendor platform |
| date range | before and after trigger |
| business objects | account, loan, dispute case, complaint, alert |
| record types | prompt, output, retrieval, approval, tool, eval, incident |
| model context | model id, prompt version, policy version |
| topics / keywords | product, fee, merchant, policy, protected term |
| channels | email, chat, SMS, call transcript, in-app message |
| sensitivity | privileged, PII, SAR-sensitive, confidential |

7.3 Hold Workflow

trigger intake
  -> legal review and matter id
  -> scope definition
  -> system inventory mapping
  -> hold creation
  -> hold propagation
  -> custodian notice and acknowledgement
  -> preservation verification
  -> collection and review
  -> periodic scope refresh
  -> release approval
  -> disposition review

7.4 Preservation Verification

| Control | Evidence |
| --- | --- |
| target system acknowledgement | hold propagation receipt |
| deletion job blocked | disposition job conflict log |
| records sampled | sample ids and preserved status |
| vendor confirmed hold | vendor acknowledgement and SLA evidence |
| backup policy reviewed | backup exception note |
| custodian notice completed | acknowledgement log |
| scope refreshed | updated query and delta collection |

8. Deletion vs Hold Conflict Handling

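When a deletion request, retention expiry, privacy minimization, regulatory retention and legal hold conflict, the product must not resolve the conflict with a simple judgment in the UI layer.

The recommended priority decision is handled by a policy service: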

active legal hold
  -> block disposition and preserve
regulatory retention still active
  -> block deletion unless legal authority permits a controlled transformation
business retention still active
  -> preserve until schedule permits disposition
privacy deletion request
  -> evaluate exemptions, delete eligible non-held data, defer blocked data
retention expired and no hold
  -> disposition workflow
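
The priority order above can be sketched as a first-match-wins function inside the policy service. The enum values and field names are hypothetical, and the sketch deliberately omits the "controlled transformation" exception for regulatory retention, which would need a separate, legally reviewed branch.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PRESERVE_HOLD = "block disposition and preserve"
    BLOCK_REGULATORY = "block deletion, regulatory retention active"
    PRESERVE_BUSINESS = "preserve until schedule permits disposition"
    EVALUATE_PRIVACY = "evaluate exemptions and delete eligible data"
    DISPOSITION = "start disposition workflow"


@dataclass(frozen=True)
class RecordStatus:
    active_legal_hold: bool
    regulatory_retention_active: bool
    business_retention_active: bool
    privacy_deletion_requested: bool


def decide(status: RecordStatus) -> Decision:
    """Check conditions in priority order; the first match wins."""
    if status.active_legal_hold:
        return Decision.PRESERVE_HOLD
    if status.regulatory_retention_active:
        return Decision.BLOCK_REGULATORY
    if status.business_retention_active:
        return Decision.PRESERVE_BUSINESS
    if status.privacy_deletion_requested:
        return Decision.EVALUATE_PRIVACY
    # Nothing blocks the record: retention has expired and no hold applies.
    return Decision.DISPOSITION


# A privacy deletion request does not override an active legal hold.
status = RecordStatus(
    active_legal_hold=True,
    regulatory_retention_active=False,
    business_retention_active=False,
    privacy_deletion_requested=True,
)
print(decide(status))  # Decision.PRESERVE_HOLD
```

Keeping the order in one function makes the precedence reviewable by Legal and Records, instead of being spread across UI handlers and batch jobs.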

8.1 Conflict Decision Record

| Field | Example |
| --- | --- |
| request_id | privacy or deletion request |
| record_id | affected AI record |
| decision | deleted, deferred, denied, transformed, archived |
| reason | active hold, regulatory retention, business need, no longer needed |
| authority | policy version and legal / privacy reviewer |
| next_review_date | date for release or re-evaluation |
| evidence | decision packet and audit event |

8.2 Design Rules

  • A user-facing delete confirmation must not imply records under legal hold were destroyed.
  • Deferral reason should be understandable but not disclose privileged matter details unnecessarily.
  • Disposition jobs must query hold service in real time or use a verified hold snapshot.
  • Hold release does not automatically delete; it triggers disposition eligibility review.
  • Backups need a documented approach: restore controls, overwrite cycle or legal exception.
  • Derived records must be checked; deleting prompt payload may not delete final business record.

9. eDiscovery / Export Workflow

9.1 Workflow

request intake
  -> authority verification
  -> scope definition
  -> search plan
  -> collection
  -> de-duplication and threading
  -> sensitivity and privilege review
  -> redaction
  -> quality control
  -> export package creation
  -> chain-of-custody signoff
  -> production delivery
  -> post-production audit

9.2 Search Scope

Search should support matter id, customer id or hashed customer reference, account / loan / transaction / case id, date range, channel, record type, model id, prompt template version, policy version, source document id, tool id, approver id, output text, keyword and semantic query, legal sensitivity tag and retention class.

9.3 Export Package

| Package element | Purpose |
| --- | --- |
| production manifest | list of exported record ids |
| metadata dictionary | explains fields and values |
| native payloads | prompts, outputs, attachments, source snapshots |
| rendered views | human-readable transcript or case packet |
| load file | import into review platform |
| hash manifest | integrity verification |
| redaction log | why content was redacted |
| privilege log support | where applicable, managed by legal process |
| chain-of-custody | who collected, reviewed, exported and delivered |
| completeness statement | authorized certification of scope |
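
A hash manifest for an export package can be as simple as one digest per exported record, recomputed by the recipient. This is a minimal sketch assuming SHA-256 and in-memory payloads; the function names and record ids are hypothetical, and real packages would hash files on disk and sign the manifest.

```python
import hashlib
from typing import Dict, List


def build_manifest(payloads: Dict[str, bytes]) -> Dict[str, str]:
    """Map each exported record id to the SHA-256 of its payload bytes."""
    return {rid: hashlib.sha256(data).hexdigest() for rid, data in sorted(payloads.items())}


def verify_manifest(payloads: Dict[str, bytes], manifest: Dict[str, str]) -> List[str]:
    """Return record ids that are missing or whose hash no longer matches."""
    problems = []
    for rid, expected in manifest.items():
        data = payloads.get(rid)
        if data is None or hashlib.sha256(data).hexdigest() != expected:
            problems.append(rid)
    return problems


export = {
    "rec-1": b"customer prompt text",
    "rec-2": b"final sent output",
}
manifest = build_manifest(export)

# Simulate a payload altered after the manifest was produced.
received = dict(export)
received["rec-2"] = b"final sent output (edited)"
print(verify_manifest(received, manifest))  # ['rec-2']
```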

9.4 AI-Specific Export Issues

| Issue | Design response |
| --- | --- |
| Prompt contains multiple customers | field-level segmentation or review tagging |
| RAG chunk source changed later | source version snapshot and content hash |
| Model output regenerated | preserve each generation and final selected output |
| Human edited AI draft | keep draft, edit diff and final sent version |
| Tool call produced side effect | export approval packet and execution result |
| Vector store cannot produce text | store chunk id and source snapshot outside vector index |
| Search index was rebuilt | index rebuild audit and source-of-truth registry |
| Privileged legal prompt | privilege workflow and restricted access |

10. Immutable Audit Trail

10.1 Required Events

Audit trail should capture record created, record classified, retention policy assigned, record accessed, record modified, record deleted or dispositioned, legal hold applied, legal hold released, export searched, export collected, export redacted, export produced, policy changed and vendor export received.

10.2 Event Fields

| Field | Description |
| --- | --- |
| event_id | unique event id |
| event_type | create, classify, access, modify, delete, hold, export |
| record_id | affected record |
| actor | human, service, agent or system |
| actor_role | role and authority |
| timestamp | trusted time source |
| before_hash / after_hash | prior and new payload or metadata hash |
| policy_version | policy applied |
| reason_code | business or control reason |
| correlation_id | workflow, matter or case id |

10.3 Tamper-Evidence Patterns

  • append-only event store.
  • object lock.
  • hash chain.
  • periodic notarization or external hash anchoring.
  • admin action monitoring.
  • split role for audit configuration.
  • immutable export of audit logs.
  • automated integrity checks.
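
The hash chain pattern can be illustrated with a small append-only list in which each entry commits to the previous entry's hash. This is a sketch under simple assumptions (SHA-256, JSON-serializable events, an all-zero genesis value); the helper names are hypothetical, and it does not replace object lock or external anchoring.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # assumed starting value for the first entry


def _digest(prev_hash: str, event: Dict[str, Any]) -> str:
    """Hash the previous entry's hash together with the canonical event body."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: List[Dict[str, Any]], event: Dict[str, Any]) -> None:
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    stored = dict(event)  # copy so later caller mutations do not alter the entry
    chain.append({"event": stored, "prev_hash": prev_hash, "hash": _digest(prev_hash, stored)})


def verify_chain(chain: List[Dict[str, Any]]) -> bool:
    """Recompute every link; any edited, removed or reordered entry fails."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


trail: List[Dict[str, Any]] = []
append_event(trail, {"event_type": "create", "record_id": "rec-1", "actor": "capture-sdk"})
append_event(trail, {"event_type": "hold", "record_id": "rec-1", "actor": "legal-hold-service"})
print(verify_chain(trail))  # True

trail[0]["event"]["actor"] = "someone-else"  # simulated tampering
print(verify_chain(trail))  # False
```

A chain like this only shows that entries changed; pairing it with periodic external anchoring of the latest hash is what makes wholesale replacement of the chain detectable.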

10.4 Audit Replay Test

For one customer-impacting AI decision, audit should reconstruct:

customer input
  -> retrieved sources and versions
  -> model and prompt versions
  -> AI recommendation
  -> human edit / approval / override
  -> tool execution
  -> customer-visible output
  -> retention class
  -> legal hold status
  -> access and export history
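
A replay test can start with a mechanical completeness check before any human review: for each step in the chain above, is there at least one linked record id? The step keys below are hypothetical names for those steps, and the evidence dictionary stands in for a query against the Records Registry.

```python
from typing import Dict, List

# Hypothetical keys, one per step of the replay chain above.
REPLAY_STEPS = [
    "customer_input",
    "retrieved_sources",
    "model_and_prompt_versions",
    "ai_recommendation",
    "human_review",
    "tool_execution",
    "customer_visible_output",
    "retention_class",
    "legal_hold_status",
    "access_and_export_history",
]


def missing_replay_steps(evidence: Dict[str, List[str]]) -> List[str]:
    """Return the replay steps that have no linked record ids."""
    return [step for step in REPLAY_STEPS if not evidence.get(step)]


evidence = {step: [f"rec-{i}"] for i, step in enumerate(REPLAY_STEPS)}
evidence["retrieved_sources"] = []  # simulate a RAG event with no preserved source
print(missing_replay_steps(evidence))  # ['retrieved_sources']
```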

11. Regulator / Audit Production

11.1 Production Principles

  • Produce from governed workflow, not ad hoc database dumps.
  • Preserve original metadata and export metadata.
  • Include chain-of-custody.
  • Explain record taxonomy and system inventory.
  • Separate responsive, non-responsive, privileged and sensitive materials.
  • Maintain repeatability of search and collection.
  • Record all redaction and exclusion decisions.
  • Validate export completeness before delivery.

11.2 Regulator Readiness Pack

| Artifact | Content |
| --- | --- |
| AI records inventory | systems, record types, owners, locations |
| retention matrix | category, trigger, period, authority, disposition |
| system inventory | where AI records live and how they are produced |
| legal hold procedure | trigger, propagation, verification, release |
| sample replay binder | one end-to-end decision chain |
| immutable audit design | WORM or audit-trail capability |
| vendor recordkeeping schedule | data retention, export SLA, audit rights |
| incident record protocol | record loss, missed hold, production failure |

12. Operating Model

12.1 RACI

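| Activity | Business | AI PM | Senior BA | Architect | Legal | Compliance | Privacy | Records | Security | Model Risk | Ops | Audit |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Define AI record taxonomy | A | R | R | C | C | C | C | A/R | C | C | C | I |
| Map use case records | A | R | R | C | C | C | C | R | C | C | R | I |
| Approve retention schedule | C | C | C | C | A/R | A/R | C | A/R | I | C | C | I |
| Design capture architecture | C | R | C | A/R | C | C | C | C | C | C | C | I |
| Operate legal hold | I | C | C | C | A/R | C | C | R | C | I | C | I |
| Configure disposition | C | C | C | R | C | C | C | A/R | C | I | R | I |
| Produce records | C | C | C | R | A/R | C | C | R | C | C | C | I |
| Test replay | C | C | C | R | C | C | C | R | C | C | C | A/R |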

R = Responsible, A = Accountable, C = Consulted, I = Informed.

12.2 Governance Cadence

| Review | Frequency | Output |
| --- | --- | --- |
| AI records inventory review | monthly for active rollout | new record types and systems |
| Retention policy review | quarterly or on regulatory / product change | updated matrix and approvals |
| Legal hold operations review | monthly | active holds, exceptions, propagation failures |
| Disposition review | monthly | eligible records, blocked deletions, certificates |
| Vendor recordkeeping review | quarterly | export tests and SLA evidence |
| Audit replay sample | quarterly or risk-based | replay result and CAPA |
| Incident review | event-driven | record loss, wrong deletion, missed hold |

12.3 Role-Specific Focus

| Role | Focus |
| --- | --- |
| AI PM | PRD record requirements, customer impact, archive needs, vendor constraints |
| Senior BA | process state, record triggers, taxonomy, evidence fields, legal hold workflow |
| Architect | capture SDK, evidence ledger, policy engine, immutable store, export and disposition |
| Legal / Compliance / Records | schedule approval, hold procedure, production protocol, conflict decisions |
| Security / Privacy | access, encryption, minimization, deletion request workflow |
| Internal Audit | independent replay, control design and operating effectiveness testing |

13. Metrics And KRIs

13.1 Completeness Metrics

| Metric | Purpose |
| --- | --- |
| capture success rate | detect dropped AI record events |
| unclassified record rate | identify taxonomy gaps |
| records missing business object | prevent orphan records |
| RAG events missing source version | ensure replayability |
| tool calls missing approval id | detect control bypass |
| outputs missing channel archive id | detect communication gaps |
| eval records missing model version | ensure governance evidence |
| audit trail integrity pass rate | prove tamper evidence |

13.2 Hold And Export Metrics

| Metric | Purpose |
| --- | --- |
| hold propagation SLA | speed from hold creation to system freeze |
| failed target systems | identify preservation gaps |
| deletion blocks due to hold | prove control operation |
| custodian acknowledgement rate | operational compliance |
| vendor hold acknowledgement time | third-party risk |
| export package defect rate | quality of production |
| chain-of-custody completeness | defensibility |
| audit replay success | evidence integrity |

13.3 Risk KRIs

| KRI | Yellow | Red |
| --- | --- | --- |
| high-impact output without retained source context | isolated exception | repeated or customer-impacting pattern |
| active hold propagation failure | one retry needed | any system not frozen within required SLA |
| deletion attempted on held record | blocked and explained | executed deletion or missing evidence |
| vendor cannot export AI records | export delayed | no contractual or technical access |
| audit trail gap | missing non-critical metadata | cannot reconstruct create / modify / delete |
| RAG source version missing | sample gap | systemic missing source version |
| eDiscovery export manual workaround | controlled exception | unrepeatable production |
| over-retention of sensitive prompts | limited scope | broad indefinite retention without authority |

14. Financial Retail Scenario Patterns

| Scenario | Record chain | Key controls |
| --- | --- | --- |
| Payment dispute assistant | intake, transaction retrieval, policy retrieval, AI recommendation, approval, provisional credit tool call, customer notice | source version, approval packet, action hash, channel archive |
| Credit underwriting copilot | application data, policy citations, AI memo draft, underwriter edits, decision, override reason, eval | distinguish draft from decision, preserve adverse action support, restrict sensitive data |
| AML / fraud assistant | alert data, transaction summary, red flag analysis, investigation notes, closure or escalation, QA sample | restricted access, human closure evidence, investigation hold, access audit |
| Complaint remediation agent | complaint intake, severity, policy/account evidence, remediation recommendation, review, response, closure | escalation triggers, source context, closure checklist, topic-wide hold |
| Wealth / advisory assistant | client question, suitability context, product retrieval, AI explanation, advisor review, final communication | distinguish education from recommendation, preserve advisor edits, archive final communication |

15. Vendor And Third-Party Design

Vendor arrangements should address record ownership, retention periods, legal hold propagation, export SLA, format and metadata, audit rights, deletion certificates, subprocessor handling, incident notification, data residency and privileged / confidential content.

| Risk | Control |
| --- | --- |
| Vendor stores prompt but cannot export | require API export and periodic test |
| Vendor deletes traces before institution schedule | institution captures required copy or vendor aligns retention |
| Vendor cannot apply granular hold | route sensitive workflows to controlled store |
| Vendor changes model without evidence | preserve model id and release metadata |
| Vendor support accesses records | access logs and approval workflow |
| Subprocessor holds copies | contract inventory and deletion evidence |

Vendor exit must include full record export, metadata dictionary, hold-preserved records, deletion or transfer certificate, hash manifest validation, continuity of legal hold and archive import test.


16. 30-Day Lab

Goal: Within 30 days, complete a presentable AI Records / Retention / Legal Hold / eDiscovery architecture portfolio pack.

Week 1: Inventory And Taxonomy

| Day | Artifact | Task |
| --- | --- | --- |
| 1 | use-case-boundary-card.md | Define product, entity assumption, channel, customer impact and AI role |
| 2 | ai-record-inventory.md | List prompt, RAG, tool, approval, output, eval, incident and audit records |
| 3 | record-object-taxonomy.md | Define record types, owners, privacy class and business objects |
| 4 | system-inventory.md | Map chat, RAG, vector, object, case, archive, warehouse and vendor stores |
| 5 | obligation-map.md | Map business, regulatory, legal, privacy and audit reasons for preservation |
| 6 | authoritative-copy-map.md | Decide source of truth for payload, metadata and final business record |
| 7 | taxonomy-review.md | Review orphan records, missing source versions and vendor-held records |

Week 2: Retention And Hold Design

| Day | Artifact | Task |
| --- | --- | --- |
| 8 | retention-matrix.md | Define class, trigger, approach, owner and disposition |
| 9 | retention-classifier-spec.md | Define classifier inputs, outputs, review override and policy version |
| 10 | legal-hold-trigger-table.md | Define triggers and scope dimensions |
| 11 | legal-hold-workflow.md | Draw intake, propagation, verification, release and disposition review |
| 12 | deletion-conflict-rules.md | Define privacy deletion, expiry, legal hold and regulatory conflict rules |
| 13 | hold-propagation-map.md | Map each system to hold API, manual process or vendor process |
| 14 | disposition-certificate.md | Define deletion evidence and approval fields |

Week 3: Evidence And Production

| Day | Artifact | Task |
| --- | --- | --- |
| 15 | evidence-ledger-schema.md | Define create, classify, access, modify, delete, hold and export events |
| 16 | immutable-audit-design.md | Select WORM, audit trail, hash chain or object lock pattern |
| 17 | rag-replay-spec.md | Define source id, chunk id, content hash and index rebuild evidence |
| 18 | tool-call-replay-spec.md | Define approval id, action hash, execution result and rollback evidence |
| 19 | ediscovery-export-spec.md | Define search, collection, redaction, manifest, load file and custody |
| 20 | regulator-production-binder.md | Build sample package for one AI decision |
| 21 | vendor-recordkeeping-addendum.md | Define vendor retention, hold, export and audit requirements |

Week 4: Operations, Metrics And Interview Pack

| Day | Artifact | Task |
| --- | --- | --- |
| 22 | records-raci.md | Build operating model |
| 23 | kri-dashboard-spec.md | Define capture, hold, export, deletion and vendor KRIs |
| 24 | incident-runbook-record-loss.md | Define response to missed hold, wrong deletion or export failure |
| 25 | audit-replay-test.md | Reconstruct one customer-impacting AI workflow end to end |
| 26 | tabletop-privacy-vs-hold.md | Run customer deletion request during litigation hold |
| 27 | tabletop-vendor-export-failure.md | Run vendor cannot produce prompt records scenario |
| 28 | executive-memo.md | Summarize architecture, risk, controls, residual risk and investment |
| 29 | interview-qa.md | Prepare 30-second and 2-minute answers |
| 30 | portfolio-index.md | Package artifacts into a role-ready portfolio story |

17. Interview Answers

Q1: What is the difference between AI logs and AI records?

30 seconds:

Logs describe technical events. AI records are obligation-aware evidence objects that may support business decisions, customer communications, regulatory books and records, litigation preservation, model governance or audit replay.

2 minutes:

I would not treat every token trace as a long-term record, but I also would not keep only the final answer. A financial retail AI workflow may create prompts, RAG retrieval evidence, model outputs, approvals, tool calls, final customer messages, eval results and incident events. Each object needs classification: business domain, customer impact, privacy class, retention class, legal hold eligibility and export profile.

Q2: How would you build retention for AI systems?

Start with record inventory and taxonomy, then assign retention classes at record-object level. The retention clock may start at creation, output sent, case closed, account closed, model retired or incident closed. Storage should support WORM or audit-trail capabilities where required, and the disposition engine must query legal hold before deletion.
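
As a rough sketch of the retention clock and the hold check described in this answer, the snippet below computes a disposition status per record. The class names, status strings and retention periods are illustrative assumptions, not an approved records schedule.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    retention_class: str
    trigger_event: str   # e.g. "output_sent", "case_closed", "account_closed"
    retention_days: int  # placeholder value, not a real schedule


@dataclass(frozen=True)
class Record:
    record_id: str
    retention_class: str
    events: dict[str, date]  # event name -> date the event occurred


def disposition_status(
    record: Record,
    rules: dict[str, RetentionRule],
    held_ids: set[str],
    today: date,
) -> str:
    """Return a disposition status; the legal hold check always runs first."""
    if record.record_id in held_ids:
        return "blocked_by_hold"
    rule = rules[record.retention_class]
    trigger_date = record.events.get(rule.trigger_event)
    if trigger_date is None:
        # The clock has not started, e.g. the case is still open.
        return "clock_not_started"
    if today < trigger_date + timedelta(days=rule.retention_days):
        return "retained"
    return "eligible_for_disposition"
```

Keeping the hold lookup inside the same function as the expiry check makes it harder for a deletion job to skip the hold query by accident.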

Q3: What does legal hold mean for AI systems, and how would you implement it?

Legal hold means the organization must preserve relevant ESI and stop routine deletion for scoped records. For AI, scope can include prompts, outputs, RAG sources, vector metadata, approvals, tool calls, evals, incidents and vendor-held records. I would implement it as a cross-system service with matter id, scope, propagation, acknowledgement, verification and release workflow.
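
A cross-system hold with acknowledgement and verification could be modeled roughly as below. The `LegalHold` class, its method names and the system names are hypothetical; a real service would also persist every state change as evidence.

```python
from dataclasses import dataclass, field


@dataclass
class LegalHold:
    matter_id: str
    scope_systems: set[str]  # stores the hold must reach
    acknowledged: set[str] = field(default_factory=set)
    verified: set[str] = field(default_factory=set)
    released: bool = False

    def acknowledge(self, system: str) -> None:
        """A system owner confirms the hold notice was received."""
        if system not in self.scope_systems:
            raise ValueError(f"{system} is not in scope for {self.matter_id}")
        self.acknowledged.add(system)

    def verify(self, system: str) -> None:
        """Record that deletion suspension was actually tested in the system."""
        if system not in self.acknowledged:
            raise ValueError(f"{system} must acknowledge before verification")
        self.verified.add(system)

    def gaps(self) -> list[str]:
        """In-scope systems where the hold is not yet verified."""
        return sorted(self.scope_systems - self.verified)

    def blocks_disposition(self, system: str) -> bool:
        # The hold applies from issuance, even before a system acknowledges it.
        return not self.released and system in self.scope_systems
```

The `gaps()` list is the natural input to a hold propagation KRI: any system that stays in it too long is a candidate for escalation.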

Q4: How do you handle a deletion request when some records are under legal hold or mandatory retention?

Route the request to a policy decision service. Records under active legal hold or mandatory retention are deferred or otherwise handled according to Legal, Privacy and Compliance direction. Eligible unrelated records can be deleted. The system records the decision, reason, authority, next review date and evidence.
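
One way to picture the policy decision service is a function that returns a decision record per requested record. This is only a sketch: the outcome labels, the authority names and the 90-day review interval are placeholder assumptions, and the actual handling would follow Legal, Privacy and Compliance direction.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class Decision:
    record_id: str
    outcome: str   # "delete" or "defer"
    reason: str
    authority: str
    next_review: Optional[date]


def decide_deletion(
    record_ids: list[str],
    held: set[str],
    mandatory_retention: set[str],
    today: date,
    review_days: int = 90,  # hypothetical review interval
) -> list[Decision]:
    """Return one decision record per requested record, in request order."""
    review = today + timedelta(days=review_days)
    decisions = []
    for rid in record_ids:
        if rid in held:
            decisions.append(Decision(rid, "defer", "active_legal_hold", "Legal", review))
        elif rid in mandatory_retention:
            decisions.append(Decision(rid, "defer", "mandatory_retention", "Compliance", review))
        else:
            decisions.append(Decision(rid, "delete", "eligible", "Privacy", None))
    return decisions
```

Each `Decision` is itself a record worth retaining, since it is the evidence of why a deletion was executed or deferred.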

Q5: What is needed for eDiscovery export of AI records?

A defensible export needs scoped search, collection protocol, metadata dictionary, native payloads, rendered views, redaction log, hash manifest, chain-of-custody and production manifest. AI-specific metadata includes prompt version, model version, RAG source ids, chunk hashes, human edits, approvals, tool calls, final sent output and policy decisions.

Q6: How do you prove an AI answer used the correct RAG source?

Preserve retrieval event details: query, repository, document id, chunk id, source version, content hash, rank, score and citation. The export package should include source snapshot or a verifiable reference to the authoritative source as it existed at the time of generation.
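
The retrieval event fields listed above can be sketched as a small record plus a check against an archived source snapshot. The field names and the snapshot keying by `(document_id, source_version, chunk_id)` are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalEvent:
    query: str
    repository: str
    document_id: str
    chunk_id: str
    source_version: str
    content_hash: str  # sha256 of the chunk text at retrieval time
    rank: int
    score: float
    citation: str


def chunk_hash(text: str) -> str:
    """Hash chunk text so later replay can detect source drift."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def verify_against_snapshot(
    event: RetrievalEvent,
    snapshot: dict[tuple[str, str, str], str],
) -> bool:
    """Check that the archived chunk matches what the model was shown.

    `snapshot` maps (document_id, source_version, chunk_id) to archived text.
    """
    key = (event.document_id, event.source_version, event.chunk_id)
    archived = snapshot.get(key)
    return archived is not None and chunk_hash(archived) == event.content_hash
```

A `False` result means either the source version was not preserved or the archived text differs from what was retrieved, and both cases weaken the replay evidence.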

Q7: How would you explain WORM vs audit-trail alternative in architecture terms?

WORM prevents rewriting and erasure for preserved records. An audit-trail capable approach maintains complete time-stamped evidence of creation, modification and deletion so the original record can be recreated if changed. The exact regulatory acceptability depends on the entity and rule, but architecturally both require integrity, metadata, access controls and production readiness.
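
To make the integrity requirement concrete, here is a minimal hash chain sketch for a tamper-evident audit trail. It illustrates tamper evidence only and is not a WORM implementation; the entry layout and function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder previous-hash for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry hash together with a canonical event body."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list[dict], event: dict) -> None:
    """Append an event, linking it to the hash of the previous entry."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev_hash": prev, "hash": entry_hash(prev, event)})


def verify(chain: list[dict]) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True
```

A chain like this detects edits but does not prevent them: someone who can rewrite the whole log can also recompute every hash, so the latest hash would need to be anchored in a store the log administrators cannot change.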

Q8: What would you put in PRD acceptance criteria?

Every customer-impacting AI output must link to prompt, model version, RAG source version, final sent message, human edits, approval if required and retention class. Legal hold must block disposition across all in-scope stores. Export must produce a manifest, metadata, payloads, hashes and chain-of-custody. Deletion must create a disposition certificate or conflict decision record.
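
The linkage criterion can be turned into an automated check along these lines. The field names are hypothetical stand-ins for whatever the evidence ledger schema actually defines.

```python
# Links that must be non-empty on every customer-impacting output (assumed names).
REQUIRED_LINKS = (
    "prompt_id",
    "model_version",
    "rag_source_version",
    "final_message_id",
    "retention_class",
)


def missing_links(output_record: dict) -> list[str]:
    """Return the evidence links that are absent from an output record."""
    missing = [key for key in REQUIRED_LINKS if not output_record.get(key)]
    # Human edits may legitimately be an empty list, but the field must exist.
    if "human_edits" not in output_record:
        missing.append("human_edits")
    # Approval is only required when the workflow says so.
    if output_record.get("approval_required") and not output_record.get("approval_id"):
        missing.append("approval_id")
    return missing
```

A release gate or nightly control test could fail whenever `missing_links` returns a non-empty list for any sampled output.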


18. Portfolio Deliverables

| Deliverable | Shows |
| --- | --- |
| Executive memo | ability to frame risk and business value |
| AI records inventory | ability to discover record-producing events |
| Record taxonomy | ability to classify beyond generic logs |
| Retention matrix | ability to connect business process to lifecycle |
| Legal hold architecture | ability to design preservation controls |
| eDiscovery workflow | ability to support production and chain-of-custody |
| Immutable audit trail schema | ability to prove integrity |
| Deletion conflict decision tree | ability to balance privacy, retention and hold |
| Vendor recordkeeping requirements | ability to manage third-party AI risk |
| Audit replay binder | ability to prove end-to-end evidence |
| KRI dashboard spec | ability to operate and measure controls |
| Interview answer pack | ability to communicate at senior level |

19. Common Pitfalls

| Pitfall | Why it fails | Better design |
| --- | --- | --- |
| Treating AI records as chat history only | misses RAG, tool, approval, eval and incident evidence | record object taxonomy |
| Keeping everything forever | increases privacy, breach, cost and discovery exposure | schedule by obligation and business need |
| Keeping too little | loses decision evidence and preservation ability | classify customer-impacting and regulated records |
| Legal hold only in case system | prompt, vector, archive and vendor records may be deleted | cross-system hold propagation |
| No RAG source version | cannot replay why answer was generated | source id, version and hash |
| Tool calls not retained | business action cannot be reconstructed | approval-bound tool execution log |
| Approval separate from output | reviewer approved a different artifact than final sent | action hash and final output linkage |
| Manual eDiscovery exports | unrepeatable, weak custody | controlled export workflow |
| Vendor data inaccessible | production failure | contract export SLA and periodic test |
| Deletion ignores hold | spoliation and regulatory risk | disposition conflict check |
| Audit trail mutable by admins | evidence credibility weak | immutable or tamper-evident audit |
| No record owner | policies drift and exceptions accumulate | RACI and governance cadence |

20. Final Operating Principle

AI recordkeeping is a product and architecture capability, not a compliance afterthought.

For an advanced AI PM, Senior BA or Architect, the practical skill is to turn every AI-influenced business event into a governed evidence object:

captured at source,
classified by obligation,
retained by policy,
preserved under hold,
produced with custody,
deleted only when eligible,
and replayable under audit.