AI Data Lifecycle Governance / Provenance / Retention Playbook
AI Data Lifecycle Governance, Provenance, Retention and Deletion Playbook
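Purpose
This playbook covers AI data lifecycle governance in financial retail. It addresses the end-to-end architecture from the point where business source data enters an AI system, through retrieval, prompts, evaluation, logs, feedback and model adaptation, to retention, deletion and audit evidence. It is not a general data governance overview; it is a working framework for BA/PM and architecture reviews, regulatory inquiries, model release gates, vendor due diligence and audit evidence preparation.
It has four core goals:
- Separate the data objects in an AI system clearly: source data, derived data, vectors, prompts, retrieval context, eval datasets, human feedback, runtime logs, incident evidence and model adaptation artifacts.
- Upgrade provenance from a "data lineage diagram" to a "provable chain of responsibility": who supplied, transformed, authorized, invoked, evaluated, approved and deleted the data.
- Turn retention and deletion from compliance clauses into system capabilities: classification, retention clocks, legal hold, deletion orchestration, consistency across caches, vectors, logs and backups, and verifiable evidence.
- Help people with a financial retail PM/BA/developer background explain the data lifecycle architecture of an AI project, rather than stopping at "do not send PII to the model".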
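Audience
- Financial retail AI product owners, BAs, platform PMs, enterprise architects, data architects, and security and privacy governance staff.
- Teams responsible for intelligent customer service, credit assistance, marketing recommendation, complaint handling, customer 360, knowledge-base RAG, operations analytics, risk explanation and employee Copilots.
- Teams that need to map NIST AI RMF, ISO/IEC 42001, W3C PROV and the NIST Privacy Framework onto actual system design, requirements documents, audit evidence and release gates.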
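Core Viewpoint
The hard part of AI data governance is not whether a data catalog exists. It is that AI systems keep creating new data states: retrieved snippets, context windows, reasoning traces, scoring samples, feedback labels, distillation data, fine-tuning samples, embeddings, trace logs, prompt variants and eval failure cases. Any of these objects can change the risk boundary.
Traditional data governance usually centers on source-to-report lineage; AI governance must extend to source-to-prompt-to-output-to-feedback-to-adaptation lineage. Governing only training data is not enough, nor is governing only the model or only the prompt. A genuinely auditable AI architecture must be able to answer six questions:
- Which raw data, document versions, vector index versions and retrieval rules did this answer use?
- Was this data permitted for this purpose, this user, this channel and this model vendor?
- Did this data enter a prompt, log, eval set, feedback set or model adaptation process?
- Which event starts the retention clock, and who approves an extension or freeze?
- How does a deletion request propagate to the vector store, caches, object storage, traces, eval sets, the feedback store, vendor environments and backup recovery points?
- How can an auditor reproduce the evidence with queries instead of relying on verbal explanations?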
Source Anchors
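The following official anchors constrain the governance language and architectural approach of this playbook:
| Anchor | Official source | How this playbook uses it |
|---|---|---|
| NIST AI RMF | NIST AI Risk Management Framework and the AI RMF 1.0 publication | Organizes the AI risk management loop around Govern, Map, Measure and Manage, with particular attention to data quality, transparency, traceability, validity and monitoring. |
| ISO/IEC 42001:2023 | ISO/IEC 42001:2023 - AI management systems | Applies the AI management system approach to policies, roles, processes, controls, continual improvement and evidence-based management. |
| W3C PROV | PROV-DM: The PROV Data Model, PROV-O, PROV Overview | Uses Entity, Activity, Agent, wasDerivedFrom, wasGeneratedBy, wasAssociatedWith and actedOnBehalfOf to express AI data objects, processing activities and responsible parties. |
| NIST Privacy Framework | NIST Privacy Framework | Uses the privacy risk management approach to connect data minimization, purpose limitation, identifiability, impact on individuals, control activities and privacy evidence. |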
AI Data Lifecycle: From Source to Prompt, Retrieval, Eval, Log, Feedback and Model Adaptation
1. Source Intake
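Typical source data includes customer master data, account records, transaction history, card transactions, loan applications, KYC/KYB, CRM interactions, complaint tickets, call-center recording transcripts, web clickstream, marketing consent status, risk scores, policy documents, product terms, training materials and legacy knowledge bases.
Architecture requirements:
- Every source dataset must have an owner, system of record, purpose, lawful basis or authorization basis, classification, PII/PCI/financial secrecy flags, a cross-border flag and a retention class.
- Each AI use case must have an AI use-purpose mapping, for example "customer service answers account questions", "employee searches policy", "credit application summarization" or "marketing content generation".
- Before source data enters the AI pipeline, complete sensitivity tagging, the consent/permission check, the contractual restriction check, quality profiling and the allowed-use decision.
- For external vendors, third-party data, open web pages and partner data, record the license, terms of use, data usage restrictions, refresh cadence and withdrawal mechanism.
Key artifacts: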
- Source dataset record
- Data processing purpose map
- AI allowed-use policy
- Data quality profile
- Consent and restriction snapshot
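To make the allowed-use decision concrete, the sketch below checks a requested AI purpose against a source dataset record. The class name, field names and reason codes are illustrative assumptions, not part of any standard or of an existing platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceDatasetRecord:
    # Hypothetical subset of the intake fields listed above.
    dataset_id: str
    owner: str
    system_of_record: str
    classification: str
    retention_class: str
    allowed_purposes: frozenset
    contains_pii: bool = False


def allowed_use_decision(record, purpose, consent_ok):
    """Return (allowed, reason) for one dataset and one AI purpose."""
    if purpose not in record.allowed_purposes:
        return False, "purpose_not_mapped"
    if record.contains_pii and not consent_ok:
        return False, "consent_or_permission_missing"
    return True, "allowed"


crm = SourceDatasetRecord(
    dataset_id="crm_interactions",
    owner="CRM Data Owner",
    system_of_record="CRM",
    classification="confidential",
    retention_class="customer_service",
    allowed_purposes=frozenset({"agent_policy_lookup", "dispute_status_summary"}),
    contains_pii=True,
)

decision_ok = allowed_use_decision(crm, "dispute_status_summary", consent_ok=True)
decision_blocked = allowed_use_decision(crm, "marketing_content_generation", consent_ok=True)
```

Returning a reason code alongside the boolean means the decision can be logged as evidence, not only enforced.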
2. Transformation and Feature/Document Preparation
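AI systems routinely convert source data into document chunks, features, summaries, labels, embeddings, metadata envelopes, masked copies, synthetic samples or evaluation items. Every such transformation must be treated as producing a new governed entity.
Architecture requirements:
- A transformation job must record the input version, transformation code version, parameters, execution environment, the executing person or service principal, and the output validation result.
- Chunking, summarization, redaction, normalization, deduplication, OCR and speech-to-text must each generate a traceable activity record.
- Do not assume embeddings carry no personal-data risk. If sensitive information can be inferred from an embedding through sparse context, metadata or similarity queries, the embedding remains in scope for protection and deletion.
- For masked, aggregated or synthetic data, record the technique, residual risk, re-identification controls and scope of applicability. "Masked" does not automatically mean "free to reuse".
Key artifacts: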
- Transformation lineage
- Document chunk registry
- Embedding index manifest
- Redaction proof
- Quality gate result
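A chunk registry entry can be sketched as follows, assuming a simple fixed-size chunker. The record fields, the `chunk_document` helper and the rule version label are hypothetical; a real pipeline would plug in its own chunking logic and identifiers.

```python
import hashlib
from dataclasses import dataclass


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ChunkRecord:
    chunk_id: str
    source_document_id: str
    source_version: str
    start: int              # source offset where the chunk begins
    end: int                # source offset where the chunk ends (exclusive)
    content_hash: str       # lets auditors verify content without storing it here
    chunking_rule_version: str


def chunk_document(doc_id, version, text, size, rule_version="fixed-size-v1"):
    """Split text into fixed-size chunks and emit one registry record per chunk."""
    records = []
    for i, start in enumerate(range(0, len(text), size)):
        piece = text[start:start + size]
        records.append(ChunkRecord(
            chunk_id=f"{doc_id}:{version}:{i}",
            source_document_id=doc_id,
            source_version=version,
            start=start,
            end=start + len(piece),
            content_hash=sha256_hex(piece),
            chunking_rule_version=rule_version,
        ))
    return records


policy_text = "Annual fee refunds require a dispute within 60 days."
chunks = chunk_document("fee-policy", "v3", policy_text, size=20)
```

Because each record carries the source offsets and a content hash, a later deletion or policy update can locate every chunk derived from a given document version.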
3. Prompt Construction
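The prompt is a high-risk node in the AI data lifecycle because it merges user input, system instructions, retrieval results, conversation history, tool responses, policy rules and implicit metadata into a single model call.
Architecture requirements:
- The prompt assembly service must generate a prompt manifest rather than keep only the final string. The manifest records at least the system instruction version, policy version, retrieved document IDs, redaction status, user role, channel, tool calls, model endpoint and retention class.
- Apply purpose-aware masking to sensitive information such as customer identity, account numbers, card numbers, SSN, income, health, family circumstances and complaint details. A customer authentication scenario may temporarily use the necessary fields; an employee knowledge search scenario should normally not place customer-level data in the prompt.
- Prompt template changes must go through change control, especially changes that alter the data exposure scope, tool permissions, log content or user impact.
- Protection against prompt injection and data exfiltration cannot sit only at the model layer; it must also be built into the retrieval filter, tool permissions, output policy and logging policy.
Key artifacts: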
- Prompt manifest
- Prompt template registry
- Context permission decision
- Redaction and masking record
- Tool invocation policy
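The manifest fields named above can be sketched as a small record that stores a hash of the final prompt instead of the prompt text. The `PromptManifest` class, `build_manifest` helper and all example values are assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PromptManifest:
    system_instruction_version: str
    policy_version: str
    retrieved_document_ids: tuple
    redaction_status: str
    user_role: str
    channel: str
    tool_calls: tuple
    model_endpoint: str
    retention_class: str
    prompt_hash: str  # hash of the assembled prompt, not the prompt itself


def build_manifest(final_prompt, **fields):
    """Build a manifest that references the prompt by hash only."""
    digest = hashlib.sha256(final_prompt.encode("utf-8")).hexdigest()
    return PromptManifest(prompt_hash=digest, **fields)


manifest = build_manifest(
    "Explain the fee for dispute ref TXN-REF-7.",
    system_instruction_version="sys-v4",
    policy_version="policy-2026-05",
    retrieved_document_ids=("fee-policy:v3:0",),
    redaction_status="masked",
    user_role="card_agent",
    channel="agent_desktop",
    tool_calls=("customer_summary_lookup",),
    model_endpoint="hypothetical-endpoint",
    retention_class="customer_service_trace",
)
manifest_json = json.dumps(asdict(manifest), sort_keys=True)
```

Serializing with sorted keys keeps the stored manifest stable, which helps when manifests are compared or hashed again during an audit.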
4. Retrieval and RAG Context
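The governance focus for RAG is not whether a document has entered the knowledge base, but which chunks are permitted to enter the model as context for a specific user, channel, purpose and point in time.
Architecture requirements:
- A vector index must carry an index version, source snapshot, embedding model version, chunking rule version, metadata schema, access policy and deletion propagation status.
- Retrieval must apply policy-aware filtering on user identity, employee role, region, business line, customer relationship, data authorization, document validity period, legal hold and product applicability.
- For every retrieval, record the query hash, retrieved chunk IDs, score, rank, filter decision, excluded reason summary and response correlation ID.
- Expired policies, withdrawn authorizations, customer deletion requests and product term updates must trigger an index rebuild or a targeted purge.
Key artifacts: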
- Retrieval trace
- Vector index manifest
- Access filter decision log
- Chunk-to-source mapping
- Index purge certificate
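The following sketch shows one way policy-aware filtering and the retrieval trace could fit together, applied to candidates that a vector search has already returned. The candidate dictionary shape, the reason codes and the `retrieval_trace` helper are illustrative assumptions.

```python
import hashlib
from datetime import date


def filter_candidates(candidates, user, today):
    """Split recalled chunks into allowed chunks and a count of exclusion reasons."""
    allowed, excluded = [], {}
    for c in candidates:
        if c["deleted"]:
            reason = "deletion_suppressed"
        elif not (c["valid_from"] <= today <= c["valid_to"]):
            reason = "outside_validity_window"
        elif user["region"] not in c["regions"]:
            reason = "region_not_applicable"
        elif user["role"] not in c["roles"]:
            reason = "role_not_permitted"
        else:
            allowed.append(c)
            continue
        excluded[reason] = excluded.get(reason, 0) + 1
    return allowed, excluded


def retrieval_trace(query, candidates, user, today, correlation_id):
    allowed, excluded = filter_candidates(candidates, user, today)
    return {
        "query_hash": hashlib.sha256(query.encode("utf-8")).hexdigest(),
        "retrieved": [
            {"chunk_id": c["chunk_id"], "score": c["score"], "rank": rank}
            for rank, c in enumerate(allowed, start=1)
        ],
        "excluded_reason_summary": excluded,
        "response_correlation_id": correlation_id,
    }


def _chunk(chunk_id, score, year, regions):
    # Helper for the example data only.
    return {"chunk_id": chunk_id, "score": score, "deleted": False,
            "valid_from": date(year, 1, 1), "valid_to": date(year, 12, 31),
            "regions": regions, "roles": {"card_agent"}}


candidates = [
    _chunk("fee-v3-0", 0.91, 2026, {"CA", "NY"}),
    _chunk("fee-v2-0", 0.88, 2025, {"CA", "NY"}),   # expired policy version
    _chunk("fee-tx-0", 0.80, 2026, {"TX"}),         # other region
]
trace = retrieval_trace("annual fee refund", candidates,
                        {"role": "card_agent", "region": "CA"},
                        date(2026, 5, 12), "corr-001")
```

The trace keeps only a summary of exclusion reasons, so it shows that filtering ran without listing documents the user was not allowed to see.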
5. Evaluation Data Lifecycle
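Eval datasets are often overlooked, yet they hold real questions, failure cases, sensitive outputs, human annotations, bias samples and regulatory focus points for long periods.
Architecture requirements:
- Each eval item must record its origin: synthetic, production sampled, incident derived, expert authored, regression case or regulatory scenario.
- Production-sampled eval items require sampling authorization, de-identification, minimization of sensitive fields and retention classification.
- Eval datasets should be tiered: golden set, red-team set, incident regression set, privacy test set, policy compliance set, bias/fairness set and business acceptance set.
- Access rights differ by eval set. An incident regression set that contains real customer context should not be open to every prompt engineer or vendor.
- An eval result must be traceable back to the model version, prompt version, retrieval index version, policy version and test environment.
Key artifacts: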
- Eval dataset register
- Eval item provenance
- Test run manifest
- Failure case classification
- Regression evidence pack
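As a rough sketch, the admission rules for eval items could be expressed as a check that returns the list of unmet requirements. The enum values mirror the origins listed above; the `EvalItem` fields and issue codes are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class EvalSource(Enum):
    SYNTHETIC = "synthetic"
    PRODUCTION_SAMPLED = "production_sampled"
    INCIDENT_DERIVED = "incident_derived"
    EXPERT_AUTHORED = "expert_authored"
    REGRESSION_CASE = "regression_case"
    REGULATORY_SCENARIO = "regulatory_scenario"


@dataclass(frozen=True)
class EvalItem:
    eval_item_id: str
    source: EvalSource
    eval_set: str
    retention_class: str = ""
    sampling_approved: bool = False
    redacted: bool = False


def admission_issues(item):
    """Return the unmet requirements; an empty list means the item may be admitted."""
    issues = []
    if not item.retention_class:
        issues.append("missing_retention_class")
    if item.source is EvalSource.PRODUCTION_SAMPLED:
        if not item.sampling_approved:
            issues.append("sampling_not_approved")
        if not item.redacted:
            issues.append("not_redacted")
    return issues


raw_sample = EvalItem("ev-1", EvalSource.PRODUCTION_SAMPLED, "golden_set",
                      retention_class="eval_standard")
synthetic = EvalItem("ev-2", EvalSource.SYNTHETIC, "privacy_test_set",
                     retention_class="eval_standard")
```

Whether incident-derived items need the same checks as production-sampled ones is a local policy decision; this sketch applies them only to the case the requirements state explicitly.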
6. Runtime Logs and Trace Evidence
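An AI runtime log is at once a diagnostic asset, a piece of audit evidence and a source of privacy risk. The logging strategy requires an architecture-level trade-off between being able to investigate and minimizing what is kept.
Architecture requirements:
- Traces should not store the full prompt, full output, full user input and full tool response by default. Prefer storing the manifest, hashes, reference IDs, policy decisions, safety scores and a controlled encrypted payload.
- For high-risk scenarios such as credit, complaints, fraud alerts, account restrictions, customer commitments and fee explanations, retain an evidence trace sufficient to reproduce the decision chain.
- Logs should be layered: operational metrics, security audit log, AI decision trace, privacy event log, model quality telemetry and incident evidence vault.
- For vendor APIs, establish whether the provider stores requests and responses, for how long, whether they are used for training, whether deletion is supported and how proof of deletion is provided.
Key artifacts: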
- Trace schema
- Logging minimization policy
- Evidence vault
- Provider retention attestation
- Audit replay package
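Manifest-first logging can be sketched as a trace event that stores hashes and references by default and adds a payload reference only for high-risk use cases. The use-case names, the `vault://` reference format and the `build_trace` helper are assumptions; payload encryption and vault storage are deliberately left out of the sketch.

```python
import hashlib

# Hypothetical labels for the high-risk scenarios named above.
HIGH_RISK_USE_CASES = {"credit", "complaint", "fraud_alert",
                       "account_restriction", "fee_explanation"}


def _sha(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_trace(correlation_id, use_case, prompt, output, manifest_id):
    """Create a trace event that never contains the raw prompt or output."""
    event = {
        "correlation_id": correlation_id,
        "use_case": use_case,
        "prompt_manifest_id": manifest_id,
        "prompt_hash": _sha(prompt),
        "output_hash": _sha(output),
        "payload_ref": None,
    }
    if use_case in HIGH_RISK_USE_CASES:
        # Assumed convention: the encrypted payload lives in a separate
        # evidence vault and the trace only points to it.
        event["payload_ref"] = f"vault://evidence/{correlation_id}"
    return event


low_risk = build_trace("corr-1", "policy_lookup", "What is the ATM limit?",
                       "The limit is described in the policy.", "pm-1")
high_risk = build_trace("corr-2", "fee_explanation", "Why was I charged?",
                        "The fee follows the annual fee terms.", "pm-2")
```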
7. Feedback Data Governance
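Feedback is more than a thumbs up or thumbs down. In financial retail AI, feedback can come from customers, agents, quality assurance, legal, compliance, model monitoring, appeals, regulatory examinations and incident reviews.
Architecture requirements:
- Feedback must be classified as user preference, business correctness, policy violation, harm signal, privacy complaint, bias signal, tool failure, retrieval failure or model hallucination.
- Before customer feedback enters training or fine-tuning, authorization and purpose compatibility must be established. Employee feedback can also contain customer information and cannot be treated as low risk by default.
- Negative feedback should feed a triage workflow that distinguishes wrong answers, outdated knowledge, retrieval failure, permission misconfiguration, prompt flaws, insufficient model capability and missing business rules.
- High-value feedback may enter eval regression; before it enters fine-tuning or adapter training it must pass sanitization, sampling approval and model adaptation review.
- The feedback loop must prevent the model from learning complaint content as a future answer style, and from generalizing a single customer case into a business rule.
Key artifacts: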
- Feedback taxonomy
- Feedback triage record
- Training eligibility decision
- Human review outcome
- Feedback-to-eval mapping
8. Model Adaptation and Learning Loop
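Model adaptation includes prompt tuning, retrieval tuning, reranker training, fine-tuning, LoRA/adapters, distillation, RLHF/RLAIF, tool policy updates and knowledge base rebuilds. Each form of adaptation changes the data risk and the chain of responsibility.
Architecture requirements:
- Distinguish non-learning use, evaluation-only use, retrieval improvement use, supervised fine-tuning use and preference optimization use.
- A model adaptation package must record input dataset provenance, selection rationale, exclusion rules, PII treatment, license restrictions, approval record, training run ID, output artifact ID and rollback path.
- For a trained model or adapter, the team must be able to answer whether it contains a given class of data, whether customer conversations were used, whether complaint or credit information was used, and how a deletion request affects weights that are already trained.
- Where data cannot be precisely removed from weights, set a higher bar before the data enters training and use de-identification, aggregation, synthetic substitution, adapter isolation, model retirement or compensating controls.
Key artifacts: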
- Model adaptation register
- Training data bill of materials
- Approval evidence
- Model card/data card supplement
- Deletion impact assessment
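A deletion impact assessment can start from the training data bill of materials. The sketch below assumes the bill of materials maps each model artifact to its training samples and that each sample records the interaction it came from; both structures and all identifiers are hypothetical.

```python
def deletion_impact(training_bom, affected_interactions):
    """Map each affected model artifact to the training samples that must be reviewed.

    training_bom: artifact_id -> list of {"sample_id", "source_interaction_id"}
    affected_interactions: interaction IDs in scope of a deletion request
    """
    impact = {}
    for artifact_id, samples in training_bom.items():
        hits = sorted(
            s["sample_id"] for s in samples
            if s["source_interaction_id"] in affected_interactions
        )
        if hits:
            impact[artifact_id] = hits
    return impact


training_bom = {
    "adapter-dispute-v2": [
        {"sample_id": "s-1", "source_interaction_id": "int-100"},
        {"sample_id": "s-2", "source_interaction_id": "int-200"},
    ],
    "reranker-v5": [
        {"sample_id": "s-9", "source_interaction_id": "int-300"},
    ],
}

impact = deletion_impact(training_bom, {"int-200"})
```

The result identifies which artifacts need a decision (retrain, isolate, retire or apply a compensating control); it does not by itself remove anything from trained weights.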
Provenance Graph
W3C PROV Mapping
In the AI data lifecycle, the three core W3C PROV concepts are a good basis for a minimum viable provenance graph:
- Entity: dataset, document, chunk, embedding, prompt manifest, retrieval result, model output, eval item, feedback record, training sample, model artifact, audit package.
- Activity: ingest, classify, mask, chunk, embed, retrieve, assemble prompt, invoke model, evaluate, log, review feedback, approve training, delete, purge index.
- Agent: customer, employee, product owner, data owner, model owner, service account, vendor API, human reviewer, compliance approver, deletion orchestrator.
Suggested relations:
- wasDerivedFrom: chunk derived from policy document; eval item derived from production interaction.
- wasGeneratedBy: prompt manifest generated by prompt assembly activity.
- used: model invocation used prompt manifest and retrieval result.
- wasAssociatedWith: retrieval activity associated with service account and application.
- wasAttributedTo: model output attributed to AI service under product owner accountability.
- actedOnBehalfOf: vendor model endpoint acted on behalf of financial institution under contract.
Minimal Provenance Graph Structure
flowchart LR
S[Source System\nCRM / Core Banking / Policy Docs] --> I[Ingest Activity]
I --> C[Classified Dataset Entity]
C --> T[Transform / Mask / Chunk Activity]
T --> D[Document Chunk Entity]
T --> E[Embedding Entity]
D --> R[Retrieval Activity]
E --> R
R --> RC[Retrieved Context Entity]
U[User Input Entity] --> P[Prompt Assembly Activity]
RC --> P
P --> PM[Prompt Manifest Entity]
PM --> M[Model Invocation Activity]
M --> O[Output Entity]
O --> L[Runtime Trace Entity]
O --> F[Feedback Activity]
F --> FE[Feedback Entity]
FE --> EV[Eval / Adaptation Decision]
EV --> AD[Model Adaptation Artifact]
G[Governance Agents\nData Owner / Model Owner / Compliance / Vendor] -.associated with.-> I
G -.associated with.-> R
G -.associated with.-> M
G -.associated with.-> EV
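Provenance Design Principles
- The graph does not need to cover every field at the start, but it must cover every risk transition point: source data entering AI, context entering the prompt, production samples entering eval, feedback entering training, and deletion reaching vectors and logs.
- Do not reduce provenance to an offline document. Key activities must be written automatically by the system to an event table or graph store.
- For content that cannot be stored in plaintext, use hashes, reference IDs, an encrypted evidence vault and keyed access to establish verifiable links.
- Graph queries must support regulatory questions such as "did this customer's data ever enter a given model adaptation?" and "which document versions and prompt version produced this wrong answer?".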
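A regulatory question such as "did this customer's data reach any model adaptation artifact?" can be answered with a simple traversal over derivation edges. This is a minimal sketch: the edge list, the ID prefixes and the choice to model an adapter as `wasDerivedFrom` a training sample are assumptions, and a production system would more likely query a graph store or lineage tables.

```python
from collections import defaultdict, deque

# (subject, relation, object): the subject was derived from the object.
edges = [
    ("chunk:42", "wasDerivedFrom", "interaction:cust-001-chat-9"),
    ("eval:17", "wasDerivedFrom", "interaction:cust-001-chat-9"),
    ("sample:5", "wasDerivedFrom", "eval:17"),
    ("adapter:v2", "wasDerivedFrom", "sample:5"),
    ("adapter:v3", "wasDerivedFrom", "sample:8"),
]


def downstream(entity, edges, relations=("wasDerivedFrom",)):
    """Return every entity derived, directly or indirectly, from the given entity."""
    children = defaultdict(set)
    for subj, rel, obj in edges:
        if rel in relations:
            children[obj].add(subj)
    seen, queue = set(), deque([entity])
    while queue:
        node = queue.popleft()
        for child in children[node]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


derived = downstream("interaction:cust-001-chat-9", edges)
affected_adapters = sorted(e for e in derived if e.startswith("adapter:"))
```

The same traversal supports deletion scoping: everything in `derived` is a candidate for the deletion task planner.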
Retention and Deletion Architecture
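Retention Is a Set of Clocks, Not a Single Field
AI data objects have at least five kinds of retention clock:
- Source retention clock: the period the source system retains data by law or business policy.
- Derived artifact retention clock: the period for chunks, embeddings, summaries, masked copies, features and eval items.
- Runtime trace retention clock: the period for the operational log, security log, decision trace and evidence vault entries of a single call.
- Feedback retention clock: the period for customer feedback, employee feedback, human annotations, incident reviews and regulatory inquiry evidence.
- Model artifact retention clock: the period for the training data inventory, training configuration, model card, evaluation reports, approval records, deployed versions and rollback packages.
These clocks should not simply inherit the source system period. A complaint case that enters the incident regression set may carry a longer audit retention requirement; a marketing prompt log may need to be deleted or aggregated within a very short cycle.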
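The idea of separate clocks can be sketched as a lookup from retention class to a period, with the expiry computed from each object's own trigger event. The retention classes and day counts below are placeholder assumptions, not recommended values; actual periods come from legal and policy review.

```python
from datetime import date, timedelta

# Placeholder periods in days, keyed by a hypothetical retention class.
RETENTION_DAYS = {
    "prompt_payload": 30,
    "runtime_trace": 180,
    "incident_evidence": 2555,
}


def expiry(retention_class, trigger_date, legal_hold=False):
    """Return the deletion due date, or None while a legal hold freezes the clock."""
    if legal_hold:
        return None
    return trigger_date + timedelta(days=RETENTION_DAYS[retention_class])


def is_due(retention_class, trigger_date, today, legal_hold=False):
    due = expiry(retention_class, trigger_date, legal_hold)
    return due is not None and today >= due


trace_due = expiry("runtime_trace", date(2026, 5, 12))
held_due = expiry("runtime_trace", date(2026, 5, 12), legal_hold=True)
```

Keeping the trigger date per object, rather than per source system, is what lets a prompt payload expire long before the trace or evidence derived from the same interaction.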
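Deletion Architecture Components
Build a deletion orchestration layer rather than letting each system handle deletion on its own:
- Data subject/request intake: receives customer requests for deletion, access, correction and restriction of processing.
- Identity resolution: links customer ID, account ID, device ID, session ID, chat ID, ticket ID, embedding metadata and trace correlation ID.
- Scope decision engine: determines the deletion scope, legal retention, contractual restrictions, regulatory retention, business exceptions and where anonymization is an acceptable substitute.
- Deletion task planner: generates tasks for the source, lake, warehouse, vector DB, cache, object store, log store, eval set, feedback store and vendor endpoint.
- Deletion executor: performs hard delete, soft delete, crypto-shredding, tombstone, index purge, metadata severing and anonymization.
- Verification engine: uses evidence queries to verify that records cannot be retrieved, the index cannot recall them, caches are invalidated and the vendor has confirmed.
- Evidence ledger: stores the deletion request, decision, execution result, exceptions, approver, timestamps and proof.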
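The task planner step might look like the sketch below, which fans a resolved identity out into one task per target object and defers anything under legal hold. The task field names follow the `ai_deletion_task` columns used in the evidence queries later in this playbook; the target systems, actions and IDs are illustrative assumptions.

```python
def plan_deletion(request_id, resolved_ids, held_ids, target_actions):
    """Generate deletion tasks from an identity resolution result.

    resolved_ids: target_system -> list of object IDs linked to the data subject
    held_ids: object IDs currently under legal or regulatory hold
    target_actions: target_system -> deletion action to apply
    """
    tasks = []
    for system, action in target_actions.items():
        for obj_id in resolved_ids.get(system, []):
            on_hold = obj_id in held_ids
            tasks.append({
                "deletion_request_id": request_id,
                "target_system": system,
                "target_object_id": obj_id,
                "action": "defer_legal_hold" if on_hold else action,
                "execution_status": "blocked" if on_hold else "pending",
            })
    return tasks


tasks = plan_deletion(
    "DR-1",
    resolved_ids={"vector_db": ["chunk-42"], "trace_store": ["trace-9", "trace-10"]},
    held_ids={"trace-10"},
    target_actions={"vector_db": "index_purge", "trace_store": "hard_delete",
                    "eval_store": "tombstone"},
)
```

Blocked tasks stay in the plan instead of being dropped, so the evidence ledger shows why an object was not deleted and the task can be resumed when the hold is released.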
Deletion Propagation Patterns
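| Pattern | Applies to | Design points | Risk |
|---|---|---|---|
| Hard delete | Temporary prompt payloads, caches, low-value logs | Physically delete and record a deletion receipt | Mistaken deletion hampers incident review; keep minimal evidence. |
| Soft delete with suppression | Source-system customer records, tickets, interaction records | Mark as unavailable for AI retrieval, training and evaluation; the business system may still retain the record as required | If the retrieval filter misses the suppression flag, the data is exposed again. |
| Tombstone | Vector index, chunk registry, eval items | Keep the ID and deletion status; delete the content and identifiable metadata | An overly broad tombstone can leak sensitive relationships. |
| Crypto-shredding | Encrypted evidence vault, long-term archives | Delete or rotate keys so the ciphertext cannot be recovered | Requires proof of the key destruction process and access isolation. |
| Index purge and rebuild | Vector DB, search index, reranker cache | Delete the chunk/embedding and rebuild the affected partitions | Deleting only the source document and not the embedding leaves residue. |
| Aggregation/anonymization | Metrics, quality statistics, model monitoring reports | Keep statistical results that cannot be traced back to individuals | Small samples and rare events may allow re-identification. |
| Model retirement or adapter isolation | Model adaptation artifacts trained on sensitive samples | Take the affected adapter offline and roll back to an unaffected version | High cost; training eligibility controls must be strict before release. |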
Legal Hold and Regulatory Evidence
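In financial retail, deletion is not an absolute action. Complaints, disputed transactions, anti-fraud investigations, AML, regulatory examinations, litigation holds and model risk events can trigger a legal hold or regulatory hold.
Architecture requirements:
- Every deletion decision must record the legal hold check result.
- The hold scope must be precise down to the data object and purpose, so that "regulatory risk" does not become a reason to keep all AI traces indefinitely.
- Data under hold must not be put to new uses; for example, preserving complaint evidence does not permit using the complaint content to train a marketing copy generator.
- When a hold is released, the deletion planner must be re-triggered automatically rather than relying on someone remembering to do it.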
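The hold rules above can be sketched as a small register that scopes each hold to specific objects and a purpose, blocks other uses while the hold is active, and queues objects for deletion re-planning on release. The class, method names and purposes are hypothetical.

```python
class HoldRegister:
    """In-memory sketch of an object-scoped, purpose-scoped hold register."""

    def __init__(self):
        self._holds = {}          # hold_id -> {"object_ids": set, "purpose": str}
        self.replan_queue = []    # objects to hand back to the deletion planner

    def place(self, hold_id, object_ids, purpose):
        self._holds[hold_id] = {"object_ids": set(object_ids), "purpose": purpose}

    def is_held(self, object_id):
        return any(object_id in h["object_ids"] for h in self._holds.values())

    def hold_permits_use(self, object_id, purpose):
        """While held, an object may only be used for the purpose of one of its holds."""
        purposes = {h["purpose"] for h in self._holds.values()
                    if object_id in h["object_ids"]}
        return not purposes or purpose in purposes

    def release(self, hold_id):
        hold = self._holds.pop(hold_id)
        for object_id in sorted(hold["object_ids"]):
            if not self.is_held(object_id):   # another hold may still apply
                self.replan_queue.append(object_id)


register = HoldRegister()
register.place("H-1", ["trace-9", "trace-10"], "complaint_investigation")
register.place("H-2", ["trace-10"], "regulator_request")
blocked_use = register.hold_permits_use("trace-9", "marketing_copy_training")
register.release("H-1")
```

Note that `hold_permits_use` only answers the hold question; an object that is not on hold still needs the normal allowed-use decision.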
Data Minimization
AI data minimization does not mean simply using fewer fields; it means purpose-aware minimization at each lifecycle stage.
Minimization Control Points
| Lifecycle stage | Minimization question | Recommended controls |
|---|---|---|
| Source intake | Does this AI use case need customer-level detail? | Use a field-level allowed-use policy and purpose mapping to limit which fields enter the AI pipeline. |
| Transformation | Can derived features, summaries or classification labels replace the original text? | Use redaction, aggregation, tokenization, document-level summarization. |
| Retrieval | Is the current user entitled to see this chunk? | Use ABAC/RBAC, relationship-based access, validity window, consent filter. |
| Prompt | Does the model need plaintext PII to complete the task? | Prompt-time masking, reference tokens, tool-mediated lookup, no-PII prompt policy. |
| Logging | Does troubleshooting need the full input and output? | Manifest-first logging, payload encryption, sampling, short retention. |
| Eval | Does the test objective need real customer cases? | Synthetic scenarios, sanitized incident cases, field substitution, restricted golden set. |
| Feedback | Must feedback be bound to customer identity? | Separate feedback content from identity; role-limited reviewer access. |
| Adaptation | Does model learning need raw production data? | Preference labels, synthetic examples, adapter isolation, training exclusion rules. |
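How PM/BA Should Express Requirements
Rewrite "protect privacy" as requirements that can be accepted or rejected in testing:
- The system must apply field-level minimization rules before prompt assembly and generate a prompt manifest that contains the rule version, masked fields and exception reason.
- For an employee Copilot, injecting customer account numbers, card numbers, SSN, DOB, income, transaction details and full complaint text into a general-purpose model is prohibited by default; when customer-level context is needed, it must be returned as a minimal result through a controlled tool call.
- For a RAG knowledge base, the retrieval service must apply permission filtering after vector recall and before context injection, and record the number and categories of excluded documents.
- For logs, store the manifest and hashes by default; store the full payload, encrypted, only for a high-risk incident, a customer appeal, regulatory evidence or an approved debugging window.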
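The first requirement can be illustrated with a prompt-time masking sketch that replaces sensitive values with reference tokens and returns a masking record for the manifest. The two regular expressions are simplifying assumptions (an unformatted 13 to 19 digit card number and a US-style SSN); real detection needs far broader coverage and validation.

```python
import re

# Simplified patterns for illustration only.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d{13,19}\b")


def mask_for_prompt(text, rule_version="mask-v1"):
    """Replace sensitive values with reference tokens and report what was masked."""
    masked_fields = []

    def _replacer(kind):
        def _replace(match):
            masked_fields.append(kind)
            return f"[{kind}_REF_{len(masked_fields)}]"
        return _replace

    text = SSN_RE.sub(_replacer("SSN"), text)
    text = CARD_RE.sub(_replacer("CARD"), text)
    record = {
        "rule_version": rule_version,
        "masked_fields": masked_fields,
        "exception_reason": None,   # set when an approved exception skips masking
    }
    return text, record


masked_text, masking_record = mask_for_prompt(
    "Card 4111111111111111 fee dispute, SSN 123-45-6789."
)
```

The reference tokens let a controlled tool resolve the real value later if the task genuinely needs it, while the model and the logs only ever see the token.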
Feedback Data Governance
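Feedback Classification
| Type | Example | Governance implication |
|---|---|---|
| Preference feedback | A customer downvotes an answer; an agent flags "I do not like this wording" | Usable for UX improvement, but not a signal of factual correctness. |
| Correctness feedback | QA points out an incorrect interest rate explanation | Feeds business rules, the knowledge base or eval regression. |
| Policy feedback | An answer violates fee disclosure, marketing consent or complaint handling requirements | Feeds a compliance event and the prompt/policy fix. |
| Harm feedback | A customer suffers financial loss or unfair treatment because of wrong advice | Feeds the incident process, root cause analysis and a high-priority gate. |
| Retrieval feedback | An answer cites expired terms or the wrong product region | Fix document versions, metadata and the retrieval filter first. |
| Privacy feedback | Output exposes another customer's information or unnecessary PII | Triggers a privacy event, deletion review and log isolation. |
| Training feedback | An expert rewrites the answer into a reference answer | Must pass a data eligibility review before entering eval or adaptation. |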
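Gate for Feedback Entering Training
Before feedback enters model adaptation, the following must be answered:
- Does the feedback content contain customer identity, account, transaction, health, family, income, location, complaint or other sensitive context?
- Does the feedback source permit use for model improvement, or only for service quality handling?
- Does the feedback represent a stable business rule or a single customer's situation?
- Has de-identification, aggregation or synthetic substitution been completed?
- Is there a risk of bias amplification, for example feedback from a few high-value customers disproportionately shaping model behavior?
- Has it already entered eval regression, to verify first whether a prompt, RAG or rule fix is sufficient?
- Can the derivatives of this feedback be located when a future deletion request or authorization withdrawal arrives?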
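The seven gate questions can be encoded as explicit checks so that the training eligibility decision is recorded rather than implied. The check names below are one hypothetical phrasing of the questions, each stated so that `True` means the concern has been resolved.

```python
# One check per gate question, phrased so that True means "passed".
GATE_CHECKS = (
    "sensitive_context_removed",
    "source_permits_model_improvement",
    "represents_stable_business_rule",
    "sanitized_or_synthesized",
    "bias_amplification_reviewed",
    "validated_in_eval_regression",
    "derivatives_traceable_for_deletion",
)


def training_eligibility(feedback):
    """Return an eligibility decision; any missing or false check fails the gate."""
    failed = [check for check in GATE_CHECKS if not feedback.get(check, False)]
    return {
        "feedback_id": feedback["feedback_id"],
        "eligible": not failed,
        "failed_checks": failed,
    }


reviewed = {"feedback_id": "fb-1", **{check: True for check in GATE_CHECKS}}
unreviewed = {"feedback_id": "fb-2", "sensitive_context_removed": True,
              "source_permits_model_improvement": True}

decision_pass = training_eligibility(reviewed)
decision_fail = training_eligibility(unreviewed)
```

Treating a missing answer as a failure keeps the default conservative: feedback stays eval-only or evidence-only until every question has an explicit answer.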
Feedback Loop Architecture
flowchart TD
A[Runtime Output] --> B[Feedback Capture]
B --> C[Feedback Classification]
C --> D{Risk Type}
D -->|Knowledge Error| E[KB / Metadata Fix]
D -->|Prompt Failure| F[Prompt Policy Fix]
D -->|Model Limitation| G[Eval Regression]
D -->|Sensitive Incident| H[Incident / Privacy Review]
G --> I[Adaptation Eligibility Review]
I --> J{Eligible for Learning}
J -->|Yes| K[Sanitized Training Package]
J -->|No| L[Eval-only or Evidence-only Store]
K --> M[Model Adaptation Review]
Financial Retail Case: AI Assistant for Credit Card Dispute and Fee Explanation
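Scenario
A financial retail institution launches an employee Copilot that helps customer service agents explain credit card disputed transactions, fees, refund status and product terms. The system uses RAG to retrieve policy documents, customer account summaries, historical tickets and transaction status, and requires the agent to confirm the answer before replying to the customer.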
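Data Objects
| Object | Example | Risk |
|---|---|---|
| Source customer data | Customer identity, account status, card transactions, dispute records, call summaries | PII, financial privacy, dispute handling evidence. |
| Policy documents | Fee terms, disputed transaction SOP, regulatory disclosure requirements | An outdated version leads to incorrect commitments. |
| Retrieval context | Recent transactions, dispute status, applicable term snippets | Over-retrieval exposes unrelated transactions to the model. |
| Prompt manifest | Agent role, customer relationship, tool calls, document IDs, masking rule | Serves as audit evidence, but can itself become a sensitive trace. |
| Output | Suggested explanation for the agent | May produce improper commitments, misleading statements or differential treatment. |
| Feedback | Agent corrections, QA annotations, customer complaints | Usable for improvement, but may contain real customer issues. |
| Eval set | Disputed transaction, annual fee, refund and fraud false-positive cases | High-value regression asset that needs strong access control. |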
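Architecture Decisions
- Customer detail does not enter the general knowledge vector store; a minimal customer summary is obtained through a controlled tool call.
- Policy documents entering RAG are tagged with metadata for product, state/region, effective date, channel and customer type.
- The prompt contains only the transaction reference token, a dispute status summary and the necessary amounts; full card numbers, SSN, DOB and the full transaction list do not enter the prompt.
- Every answer generates a prompt manifest that records the retrieved policy versions, customer summary tool call ID, masking rule version, agent ID and model version.
- Agent feedback first enters quality triage; only de-identified reference Q&A and business rule fix samples may enter eval regression.
- Traces involving customer financial loss, complaint escalation or regulatory disclosure errors enter the evidence vault, with a longer retention period and a legal hold check.
- A customer deletion or restriction-of-processing request triggers identity resolution and a check of CRM, tickets, traces, feedback, eval, vector chunks and vendor call records.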
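Questions the Audit Trail Can Answer
- Which version of the terms did the answer given to a particular customer on 2026-05-12 about an annual fee refund cite?
- Did that answer send the customer's full card number or SSN to the model vendor?
- Did the feedback from the agent's correction enter the training data?
- After the customer requested deletion, can the corresponding ticket summary still be recalled by vector retrieval?
- During the regulatory examination, which high-risk interaction traces were placed under legal hold, who approved it, and when was it released?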
Templates
Template 1: Data Lifecycle Inventory
| Field | Description | Example |
|---|---|---|
| AI use case ID | Unique identifier of the AI use case | AICOPILOT-CARD-DISPUTE-001 |
| Business owner | Accountable business owner | Card Servicing Product Owner |
| Data owner | Accountable owner of the source data | Core Banking Data Owner |
| Data object | Data object being inventoried | Dispute case summary |
| Source system | System the data comes from | CRM dispute module |
| System of record | Authoritative system | Card dispute platform |
| Purpose | Purpose of use | Help agents explain dispute status |
| Allowed AI use | Permitted AI uses | Retrieval context via controlled tool call |
| Prohibited AI use | Prohibited uses | Fine-tuning, marketing personalization, external sharing |
| Data classification | Classification level | Confidential, customer financial data |
| Sensitive fields | Sensitive fields present | Account ID, transaction ID, merchant, amount, dispute reason |
| Minimization rule | Minimization rule applied | Include only active dispute status and relevant transaction reference |
| Transformation | Transformation activities | Summarization and field masking |
| Derived artifacts | Artifacts derived from the object | Customer summary, prompt manifest, trace hash |
| Retention class | Retention classification | Customer service trace: 180 days; complaint evidence: policy-defined extended period |
| Deletion propagation | Where deletion must propagate | CRM, summary store, trace, feedback, eval exception list |
| Legal hold logic | Conditions that trigger a hold | Complaint, dispute investigation, litigation, regulator request |
| Vendor involvement | How vendors are involved | Model API processes masked prompt; no training use |
| Evidence owner | Accountable owner of the evidence | AI Governance Lead |
Template 2: Provenance Table
| PROV Concept | AI Object | ID Pattern | Relationship | Evidence Field |
|---|---|---|---|---|
| Entity | Source policy document | policy_doc_id | wasAttributedTo Policy Owner | Owner, version, effective date |
| Entity | Document chunk | chunk_id | wasDerivedFrom Source policy document | Chunking rule, source offset, hash |
| Entity | Embedding | embedding_id | wasGeneratedBy Embed activity | Embedding model, index version |
| Activity | Retrieval | retrieval_run_id | used query and vector index | Query hash, filters, ranks, exclusions |
| Entity | Prompt manifest | prompt_manifest_id | wasGeneratedBy prompt assembly | Template version, retrieved chunks, masking |
| Activity | Model invocation | model_call_id | used prompt manifest | Model endpoint, region, policy decision |
| Entity | Output | output_id | wasGeneratedBy model invocation | Output hash, safety score, channel |
| Entity | Feedback | feedback_id | wasDerivedFrom output | Reviewer role, category, training eligibility |
| Activity | Deletion | deletion_job_id | used identity resolution result | Scope, action, verification result |
| Agent | Vendor model endpoint | vendor_agent_id | actedOnBehalfOf institution | Contract control, retention attestation |
Template 3: Retention and Deletion Matrix
| Data Object | Default Retention Trigger | Default Retention Logic | Deletion Action | Propagation Targets | Evidence |
|---|---|---|---|---|---|
| Raw user input | Interaction completed | Short operational window unless incident-related | Hard delete or encrypted vault purge | Chat store, prompt payload store, cache | Payload deletion receipt, hash retained if permitted |
| Prompt manifest | Model invocation completed | Retain for audit replay according to risk tier | Keep manifest; delete sensitive payload references | Trace DB, evidence vault | Manifest ID, policy version, masking record |
| Retrieved chunk | Source document effective period | Follows source policy and index validity | Tombstone chunk and purge embedding | Vector DB, search index, reranker cache | Index purge certificate |
| Embedding | Index version active | Retain while source is valid and allowed | Delete vector and rebuild affected index partition | Vector DB, backup, cache | Vector deletion job result |
| Runtime output | Interaction completed | Risk-tiered trace retention | Delete payload; retain minimal quality metrics | Conversation store, trace, analytics | Output hash, deletion receipt |
| Feedback record | Feedback submitted | Based on category: UX, correctness, incident, privacy | Delete identity link or purge content | Feedback store, eval queue, adaptation review | Feedback disposition record |
| Eval item | Accepted into eval set | Retain while test objective remains valid | Remove item and regression references | Eval store, test reports, notebooks | Eval set diff and approval |
| Training sample | Approved for adaptation | Retain for model lineage and reproducibility | Remove from future training; assess trained artifact impact | Training store, feature store, model registry | Training exclusion proof |
| Model adapter | Deployed or archived | Retain while deployed and for rollback period | Retire adapter if deletion impact is material | Model registry, deployment platform | Retirement decision and replacement version |
| Incident evidence | Incident opened | Retain under incident, complaint, legal or regulatory policy | Delete after hold release and closure period | Evidence vault, legal hold register | Hold release, final purge proof |
Template 4: DPIA/PIA Prompts
Use these prompts as analysis questions in a DPIA/PIA workshop. They are written for BA/PM facilitation, not for sending sensitive details to an external model.
| Area | Prompt |
|---|---|
| Purpose | What specific customer or employee outcome does this AI use case support, and which uses are explicitly outside scope? |
| Necessity | Which data fields are strictly necessary for the model or retrieval service to complete the task? |
| Alternatives | Can the same outcome be achieved with policy lookup, rules, synthetic examples, aggregation, masked references or tool-mediated access instead of raw personal data? |
| Context | Does the AI system combine data from contexts that customers, employees or regulators would not reasonably expect to be combined? |
| Prompt exposure | Which personal, financial or confidential fields can enter prompt context, and under what role, channel and purpose? |
| Retrieval | How does the system prevent retrieval of documents that are expired, regionally inapplicable, unauthorized or subject to deletion suppression? |
| Logging | What exact data is retained in logs, what is hashed, what is encrypted, and who can access payload-level evidence? |
| Feedback | Can customer or employee feedback enter eval or training data, and what approval is required? |
| Adaptation | Does any production data influence model weights, adapters, rerankers or prompt selection policies? |
| Vendor | Does any vendor retain prompts, outputs, embeddings, logs or feedback, and can deletion be verified? |
| Deletion | How does a deletion or restriction request propagate to source, derived artifacts, vector index, trace, feedback, eval and vendor systems? |
| Fairness | Could the dataset, retrieval policy or feedback loop create differential treatment by protected or vulnerable groups? |
| Evidence | Which query or report proves that controls executed for a specific interaction, customer, model version or deletion request? |
Template 5: Evidence Query
The following SQL-like examples assume governance events are stored in auditable tables. Adjust names to the local platform.
-- 1. Prove what data entered a specific model call.
SELECT
mc.model_call_id,
mc.model_version,
pm.prompt_manifest_id,
pm.template_version,
pm.masking_rule_version,
rt.retrieved_chunk_id,
rt.source_document_id,
rt.source_document_version,
rt.access_filter_result,
mc.vendor_endpoint,
mc.provider_retention_class
FROM ai_model_call mc
JOIN ai_prompt_manifest pm
ON mc.prompt_manifest_id = pm.prompt_manifest_id
LEFT JOIN ai_retrieval_trace rt
ON pm.retrieval_run_id = rt.retrieval_run_id
WHERE mc.model_call_id = :model_call_id;
-- 2. Show whether a customer's data entered feedback, eval or adaptation.
SELECT
ir.customer_id,
ir.interaction_id,
fb.feedback_id,
fb.feedback_category,
fb.training_eligibility,
ev.eval_item_id,
ev.eval_set_name,
tr.training_package_id,
tr.approval_status
FROM ai_identity_resolution ir
LEFT JOIN ai_feedback fb
ON ir.interaction_id = fb.interaction_id
LEFT JOIN ai_eval_item ev
ON fb.feedback_id = ev.source_feedback_id
LEFT JOIN ai_training_sample tr
ON ev.eval_item_id = tr.source_eval_item_id
WHERE ir.customer_id = :customer_id;
-- 3. Verify deletion propagation across AI stores.
SELECT
dr.deletion_request_id,
dr.customer_id,
dt.target_system,
dt.target_object_type,
dt.target_object_id,
dt.action,
dt.execution_status,
dt.verification_status,
dt.completed_at,
dt.evidence_uri
FROM ai_deletion_request dr
JOIN ai_deletion_task dt
ON dr.deletion_request_id = dt.deletion_request_id
WHERE dr.deletion_request_id = :deletion_request_id
ORDER BY dt.target_system, dt.target_object_type;
-- 4. Find model outputs generated from an outdated policy version.
SELECT
o.output_id,
o.interaction_id,
mc.model_version,
rt.source_document_id,
rt.source_document_version,
pd.effective_to,
o.generated_at
FROM ai_output o
JOIN ai_model_call mc
ON o.model_call_id = mc.model_call_id
JOIN ai_prompt_manifest pm
ON mc.prompt_manifest_id = pm.prompt_manifest_id
JOIN ai_retrieval_trace rt
ON pm.retrieval_run_id = rt.retrieval_run_id
JOIN policy_document pd
ON rt.source_document_id = pd.document_id
AND rt.source_document_version = pd.version
WHERE pd.effective_to < o.generated_at;
-- 5. Evidence pack for a high-risk customer interaction.
SELECT
i.interaction_id,
i.channel,
i.agent_id,
pm.prompt_manifest_id,
pm.policy_version,
pm.masking_rule_version,
mc.model_call_id,
mc.model_version,
mc.vendor_endpoint,
o.output_id,
o.output_hash,
sr.safety_result,
hr.human_review_outcome,
lg.legal_hold_status
FROM customer_interaction i
JOIN ai_prompt_manifest pm
ON i.interaction_id = pm.interaction_id
JOIN ai_model_call mc
ON pm.prompt_manifest_id = mc.prompt_manifest_id
JOIN ai_output o
ON mc.model_call_id = o.model_call_id
LEFT JOIN ai_safety_review sr
ON o.output_id = sr.output_id
LEFT JOIN ai_human_review hr
ON o.output_id = hr.output_id
LEFT JOIN ai_legal_hold lg
ON i.interaction_id = lg.interaction_id
WHERE i.interaction_id = :interaction_id;
Implementation Checklist for Architecture Review
| Control Area | Review Question | Evidence |
|---|---|---|
| Data inventory | Are all AI data objects inventoried beyond source datasets? | Lifecycle inventory, data cards, source-to-derived mapping |
| Allowed use | Is each data object mapped to allowed and prohibited AI uses? | Purpose map, policy decision logs |
| Provenance | Can the system trace source-to-prompt-to-output-to-feedback? | PROV graph, trace IDs, lineage tables |
| Retrieval | Are access filters applied after vector recall and before prompt injection? | Retrieval trace, excluded reason summary |
| Prompt | Is prompt assembly controlled, versioned and minimized? | Prompt manifest, template registry |
| Logging | Are logs minimized while preserving high-risk evidence? | Trace schema, evidence vault access log |
| Feedback | Is feedback classified before eval or training reuse? | Feedback taxonomy, triage decisions |
| Adaptation | Is production data barred from model adaptation unless approved? | Training eligibility, model adaptation register |
| Retention | Does each object have a retention clock and trigger? | Retention matrix, policy mappings |
| Deletion | Does deletion propagate to vector, cache, eval, feedback and vendor stores? | Deletion task records, purge certificates |
| Legal hold | Are holds precise and released through workflow? | Hold register, release record |
| Vendor | Are provider retention and training-use commitments evidenced? | Contract controls, attestations, deletion confirmations |
BA/PM Requirement Patterns
Use requirement language that is testable:
- As a compliance reviewer, I need to retrieve the prompt manifest, retrieval context IDs, model version, masking rule and human review result for a high-risk AI response, so that I can reconstruct the decision chain without exposing unnecessary payload data.
- As a data owner, I need to approve which customer data fields can enter each AI use case, so that allowed AI use is controlled at field, purpose, role and channel level.
- As a privacy lead, I need deletion requests to generate tasks for source records, derived summaries, vector embeddings, prompt traces, feedback items, eval cases and vendor stores, so that AI-derived data does not survive outside the source system.
- As a model owner, I need production feedback to be classified and approved before it enters eval or training, so that single-case corrections do not silently become model behavior.
- As an auditor, I need evidence queries that prove which policy document version and retrieval filters were used for a specific output, so that accountability is based on system records.
Interview Expression
30-Second Version
For AI data governance, I do not treat data lineage as ending at the data lake. In an AI system the lineage must continue through retrieval, prompt assembly, model invocation, logging, feedback, eval and model adaptation. My architecture approach is to create a governed data lifecycle inventory, use a PROV-style graph to connect entities, activities and agents, and design retention and deletion as orchestration across source systems, vector indexes, logs, feedback stores, eval sets and vendors.
2-Minute Version
In financial retail, the risk is not only that sensitive customer data exists; the risk is that AI creates new derived artifacts that teams forget to govern. A customer complaint can become a prompt, a trace log, a feedback label, an eval case and later a fine-tuning sample. If we cannot trace that chain, we cannot answer regulators, customers or internal audit.
I would start with NIST AI RMF for the risk management lifecycle and ISO/IEC 42001 for management system controls. Then I would model provenance using W3C PROV: entities like source records, chunks, embeddings, prompt manifests and outputs; activities like masking, retrieval, prompt assembly and deletion; agents like data owners, model services, vendors and reviewers.
For architecture, I would require prompt manifests, retrieval traces, vector index manifests, feedback classification, adaptation approvals and deletion orchestration. Retention would be object-specific: source data, embeddings, logs, eval items, feedback and model artifacts each need different clocks. Deletion must propagate beyond the source system into vector DBs, caches, trace stores, eval sets, feedback queues and provider environments, with evidence queries proving completion.
The key message is that AI governance must be designed as an operating architecture, not a policy PDF. The strongest control is the ability to answer, for one output: what data was used, why it was allowed, who was responsible, how long it is retained, and how it can be deleted or held for audit.
Senior-Level Follow-Up Points
- If a team says “we do not train on customer data,” I still ask whether customer data enters prompt logs, eval sets, feedback review, vector stores or vendor telemetry.
- If a team says “data is anonymized,” I ask what technique was used, what residual re-identification risk remains, and whether embedding similarity can leak sensitive relationships.
- If a team says “we can delete customer data,” I ask whether deletion covers derived summaries, chunks, embeddings, prompt manifests, feedback, eval cases, backups and vendor retention.
- If a team says “RAG is safer than fine-tuning,” I agree only conditionally: RAG improves freshness and control, but retrieval permission, chunk provenance, index deletion and prompt exposure still need governance.
- If a team wants to reuse production feedback for training, I require feedback classification, purpose compatibility, sanitization, sampling approval, bias review and model adaptation evidence.
Final Architecture Principle
A mature financial retail AI platform should make data provenance and deletion evidence as ordinary as API logs and deployment records. The architecture is successful when a regulator, auditor, product owner or data subject can ask a precise question about one AI output, one customer, one dataset, one model version or one deletion request, and the system can answer with records rather than explanations.