AI Data Residency / Cross-Border / Sovereign AI Architecture Playbook
Audience: retail financial services AI Product Manager, Senior BA, CBAP-level learner, Product Architect, Data Architect, Security Architect, Privacy Architect, Model Risk, Third-Party Risk, Compliance Product Owner.
Core question: how to turn data residency, cross-border transfer and sovereign AI from legal and risk language into AI product requirements, runtime architecture, vendor controls, evidence chains and portfolio assets.
Scope boundary: this playbook does not cover cloud compliance in general. It focuses on AI RAG, tool calling, prompt/log/eval, model provider, vendor telemetry, encryption key residency, transfer impact review and sovereign deployment patterns.
0. Disclaimer
This document is learning, portfolio and architecture training material. It does not constitute legal, privacy, compliance, audit, regulatory, tax, outsourcing, data protection, model risk management or security advice.
In a real project, the applicable requirements must be confirmed jointly by Legal, Privacy, Compliance, Security, Data Governance, Model Risk, Third-Party Risk, Enterprise Architecture, Product, Operations, Customer Experience and the accountable business owner.
Do not read any source anchor as a universal legal requirement.
Applicability depends on:
jurisdiction and regulator context.
data subject: customer, prospect, employee, merchant, representative, household, business entity.
customer segment: retail, wealth, SME, corporate, vulnerable customer, minor where applicable.
product: deposit, card, lending, insurance, brokerage, wallet, loyalty, open banking.
data class: personal data, financial account data, credit data, card data, KYC/AML, complaint, employee data.
purpose: service, fraud, compliance, marketing, advice, eval, model monitoring, vendor support.
vendor and processor/subprocessor chain.
contract, transfer mechanism, outsourcing terms and customer disclosure.
actual runtime data path, including logs, eval, support access and backups.
1. Executive Framing
The goal of AI data residency architecture is not to pick a region in a cloud console.
The goal is to let the organization answer, and prove, at runtime:
This AI use of this data followed the approved jurisdiction,
purpose, processor, model, tool, log, eval, backup and key route.
In retail financial services AI, residency risk comes from context expansion.
A single customer question may pass through:
mobile app or contact center.
identity and consent service.
customer profile.
account and transaction systems.
card dispute or lending system.
RAG corpus and vector index.
prompt assembly service.
external or internal model endpoint.
tool gateway.
logging and tracing platform.
eval sampling pipeline.
human review queue.
vendor telemetry.
backup and disaster recovery system.
KMS/HSM and break-glass workflow.
A team that only asks "where is the database?" will miss most AI processing paths.
Senior-level judgment:
Data residency is a runtime architecture property. Cloud region is only one input.
Outputs of this playbook:
data residency decision tree.
data classification model.
jurisdiction-purpose-processor matrix.
cross-border AI data path map.
RAG/tool/log/eval/vendor control design.
sovereign deployment pattern comparison.
model/provider region control register.
encryption and key residency design.
transfer impact review workflow.
evidence ledger schema.
operating model and RACI.
metrics and KRIs.
30-day lab.
interview answers.
portfolio deliverables.
2. Source Anchors
The official sources below serve as concept and control-design anchors. The access date is recorded as 2026-06-30.
Standards-to-artifact:
| Source lens | Architecture artifact | Interview phrasing |
| --- | --- | --- |
| NIST Privacy Framework | Processing context and privacy control map | "I turn privacy risk into data paths and control evidence." |
| NIST AI RMF | AI residency risk register and monitoring dashboard | "I use Map / Measure / Manage to validate cross-border paths and vendor paths." |
| FTC Safeguards Rule | Customer information safeguard matrix | "Protection of financial customer information must cover AI prompts, tools, logs, vendors and support access." |
| CFPB data rights | Open banking authorization and revocation model | "Customer-authorized data sharing and internal AI secondary use must be designed separately." |
| EDPB transfer guidance | Transfer impact review pack | "Cross-border paths need a record of route, necessity, safeguards and residual risk." |
| ISO/IEC 42001 | AI management system operating model | "I place data residency inside AIMS policy, roles, audit and continuous improvement." |
3. Operating Principles
| Principle | Meaning | Architecture behavior |
| --- | --- | --- |
| Path over place | Look at the full AI data path, not only the database region | Map prompt, RAG, tools, logs, eval, backup, key and support access. |
| Purpose before transfer | Confirm the AI purpose first, then assess the route | Service, fraud, marketing, eval and vendor support must not be mixed. |
| Least data movement | Process locally where possible; send a summary instead of the original where possible | local pre-processing, masking, summarization and aggregation. |
| Policy as code | Do not leave residency rules in a wiki | PDP, model gateway, RAG metadata, tool gateway and release gate. |
| Derived artifacts inherit risk | Embeddings, summaries, labels and eval samples may still be restricted | classification and lineage for derived AI artifacts. |
| Keys are part of sovereignty | Data held locally with keys held externally is not strong sovereignty | local KMS/HSM, key owner, access log and break-glass control. |
| Evidence by design | Evidence reconstructed after the fact usually fails | decision log, route manifest, approval ID and ledger schema. |
| Degraded mode is governed | Outage fallback must not cross borders without approval | region-safe fallback, capability downgrade and kill switch. |
4. Data Residency Decision Tree
Use this decision tree as the entry point for requirements clarification and architecture review.
1. Who is the subject?
-> customer / prospect / employee / merchant / representative / organization
2. Which jurisdiction and product entity apply?
-> subject location, booking entity, branch, channel, product terms, contract
3. What is the AI purpose?
-> service / fraud / compliance / marketing / advice / open banking / eval / operations
4. What data classes are used?
-> public / internal / PII / account / transaction / credit / card / KYC / complaint / employee
5. Which AI artifacts are created?
-> prompt / RAG chunk / embedding / output / tool payload / log / trace / eval sample / memory
6. Which processors and subprocessors touch them?
-> cloud / model provider / vector DB / observability / annotation / support / DR provider
7. Which regions and access paths exist?
-> compute, storage, logs, backup, admin access, support access, key access
8. Is cross-border processing necessary for the purpose?
-> yes with review / no local route / minimize / pseudonymize / aggregate / deny
9. Which safeguards and evidence are required by policy?
-> contract, encryption, key residency, access control, monitoring, transfer review, audit trail
10. What is the runtime decision?
-> allow / deny / localize / minimize / pseudonymize / require review / human approval
Decision outputs:
| Output | Meaning |
| --- | --- |
| allow_local | Process inside approved local or sovereign boundary. |
| allow_regional | Process in approved regional route with evidence. |
| allow_cross_border_with_controls | Transfer path approved with documented safeguards and residual risk. |
| minimize_before_transfer | Redact, summarize, tokenize, aggregate or pseudonymize before route. |
| deny_route | Data/purpose/vendor/region combination not approved. |
| review_required | Legal, Privacy, Security, Vendor Risk or Model Risk review needed. |
| human_approval_required | Manual approval before enabling production route. |
| degrade_capability | Disable high-risk context and provide lower-risk answer. |
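A minimal sketch of how the decision tree might be encoded as a function. Only the output names come from the table above; `APPROVED_ROUTES`, `RESTRICTED_CLASSES`, the `RouteRequest` fields and the rule ordering are illustrative assumptions, not an approved policy.

```python
from dataclasses import dataclass

# Hypothetical approvals: (jurisdiction, purpose) -> regions approved for processing.
APPROVED_ROUTES = {
    ("EU", "payment_dispute_support"): {"EU"},
    ("US", "customer_service_account_help"): {"US"},
    ("UK", "public_product_education"): {"UK", "EU"},
}

# Hypothetical data classes that are denied, not minimized, on an unapproved route.
RESTRICTED_CLASSES = {"card", "credit", "kyc_aml"}


@dataclass(frozen=True)
class RouteRequest:
    subject_jurisdiction: str
    purpose_id: str
    data_classes: frozenset
    target_region: str


def decide(req: RouteRequest) -> str:
    """Return one decision output for a proposed AI processing route."""
    approved = APPROVED_ROUTES.get((req.subject_jurisdiction, req.purpose_id))
    if approved is None:
        # Nobody has reviewed this jurisdiction/purpose pair yet.
        return "review_required"
    if req.target_region in approved:
        if req.target_region == req.subject_jurisdiction:
            return "allow_local"
        return "allow_regional"
    if req.data_classes & RESTRICTED_CLASSES:
        # Restricted data headed to a region outside the approval.
        return "deny_route"
    return "minimize_before_transfer"
```

The ordering is a design choice: an unreviewed combination goes to review before any region check, so a missing matrix entry can never resolve to an allow.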
5. Data Classification for AI Residency
Classification must cover source data and AI-derived artifacts.
5.1 Source Data Classes
| Class | Examples | Default architecture posture |
| --- | --- | --- |
| Public | public product FAQ, published fees, branch hours | Low restriction, still validate source integrity. |
| Internal | procedure manuals, internal training, non-sensitive metrics | Region preference by enterprise policy. |
| Confidential business | pricing strategy, roadmap, partner terms | Approved internal or contractual route only. |
| PII | name, address, email, phone, customer identifier | Purpose-bound processing, minimization and access control. |
| Financial account | account number, balance, transactions, statements | Strict purpose, scoped retrieval, masked logs. |
| Payment card | PAN, CVV, token, dispute evidence | Specialized controls, avoid prompt/log exposure. |
| Credit and underwriting | credit score, bureau data, adverse action info | High-risk route, human review and explainability boundary. |
| KYC / AML / fraud | identity verification, sanctions, suspicious activity | Need-to-know route, restricted disclosure. |
| Complaint and vulnerable customer | complaint narrative, hardship, vulnerability indicators | Enhanced access, retention and harm controls. |
| Employee data | performance, HR, monitoring, workforce notes | Workforce notice, role-based and monitoring minimization. |
5.2 AI-Derived Artifact Classes
| Artifact | Why it matters | Control |
| --- | --- | --- |
| Prompt | May contain raw customer data and instructions | prompt manifest, masking, region routing |
| Completion | May reveal or infer sensitive data | output classification and retention |
| Embedding | May encode restricted source content | index residency, deletion propagation |
| RAG chunk | Carries source data and ACL | corpus manifest and purpose filters |
| Tool payload | Can move operational data across systems | tool gateway and scoped token |
| Tool result | Often contains account or transaction data | minimization before prompt/log |
| Trace | Captures chain-of-thought-like workflow metadata or tool plans | structured trace without raw payload where feasible |
| Feedback | User correction or thumbs-down note can include sensitive content | classification and redaction |
| Eval sample | Production data reused for QA or regression | approval, anonymization, synthetic preference |
| Fine-tuning sample | Strongest secondary-use risk | separate approval and data lineage |
| AI memory | Persistent user or employee state | explicit purpose, TTL, deletion workflow |
| Vendor telemetry | Metadata, errors, safety flags, support packets | vendor controls and telemetry inventory |
5.3 Classification Rules
| Rule | Product/architecture implication |
| --- | --- |
| Derived artifact inherits the highest relevant source restriction unless approved otherwise. | Embeddings from restricted documents stay restricted. |
| Logs are data products, not exhaust. | Observability needs classification, retention and access control. |
| Aggregation can reduce risk but does not automatically remove obligations. | Confirm threshold, re-identification risk and purpose. |
| Pseudonymization is a safeguard, not a magic deletion of risk. | Keep linkage key governance and re-identification controls. |
| Data class can change after model output. | Generated complaint summary may become complaint record. |
| Human review queue is a processing location. | Reviewer location and access matter. |
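The first rule (a derived artifact inherits the highest relevant source restriction unless approved otherwise) could be sketched as below. The `RESTRICTION_RANK` ordering and the class keys are hypothetical; a real ranking would come from the data governance policy.

```python
# Hypothetical restriction ranking, lowest to highest.
RESTRICTION_RANK = {
    "public": 0,
    "internal": 1,
    "confidential": 2,
    "pii": 3,
    "financial_account": 4,
    "payment_card": 5,
}


def classify_derived(source_classes, approved_override=None):
    """Classify a derived artifact (embedding, summary, eval sample) from its sources."""
    if not source_classes:
        raise ValueError("a derived artifact needs at least one classified source")
    unknown = [c for c in source_classes if c not in RESTRICTION_RANK]
    if unknown:
        raise ValueError(f"unclassified source classes: {unknown}")
    inherited = max(source_classes, key=RESTRICTION_RANK.__getitem__)
    # Only an explicit, separately approved override can change the inherited class.
    return approved_override if approved_override is not None else inherited
```

For example, `classify_derived(["public", "pii", "internal"])` returns `"pii"`, so an embedding built from a mix of public and PII sources stays in the PII class.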
6. Jurisdiction / Purpose / Processor Matrix
This matrix is the core BA artifact.
It converts legal/privacy/vendor review into an executable architecture table.
6.1 Matrix Fields
| Field | Description |
| --- | --- |
| matrix_id | Stable ID for route approval. |
| subject_type | customer, prospect, employee, merchant, representative. |
| subject_jurisdiction | Location or legal context used for routing. |
| product_entity | Booking entity, branch, tenant or regulated affiliate. |
| customer_segment | retail, wealth, SME, vulnerable, employee, minor where relevant. |
| purpose_id | Purpose catalog reference. |
| data_classes | Source and derived data classes. |
| processor | Internal platform, cloud, model provider, vector DB, observability vendor. |
| subprocessors | Downstream providers where known and approved. |
| model_endpoint_region | Region where inference runs. |
| tool_destinations | Systems and regions receiving tool calls. |
| log_region | Where operational logs and evidence are stored. |
| eval_region | Where eval samples and labels are stored. |
| key_region | KMS/HSM and key control location. |
| approved_decision | allow, localize, minimize, deny, review required. |
| safeguards | Encryption, access control, contracts, minimization, monitoring. |
| evidence_ref | Transfer review, DPIA/PIA where applicable, vendor review, risk acceptance. |
| expiry_review_date | When approval must be recertified. |
6.2 Example Matrix
| Subject | Product | Purpose | Data | Processor route | Decision |
| --- | --- | --- | --- | --- | --- |
| EU retail customer | EU card | payment_dispute_support | transaction, dispute evidence | EU app, EU RAG, EU model endpoint, EU logs, EU keys | allow_local |
| US customer | US deposit | customer_service_account_help | account, transaction summary | US app, US model endpoint, US logs, US keys | allow_regional |
| EU customer | marketing | marketing_personalization | segment-level features | EU feature store, approved campaign tool | minimize_before_transfer if external creative model used |
| Global employee | policy search | employee_productivity_copilot | internal policy docs | regional RAG, central observability with masked logs | allow_cross_border_with_controls subject to policy |
| Open banking user | data sharing | open_banking_data_sharing | account and transaction API data | API gateway to authorized third party | allow_with_scope |
| Wealth client | RM copilot | rm_meeting_preparation | portfolio, notes, suitability | local/private route, no external raw prompt | allow_local or review_required |
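As a sketch of how a matrix row might become executable, the snippet below compares an observed runtime route with one approved entry. It uses a subset of the matrix fields; the `MatrixEntry` class, the `check_route` helper and the rule that a lapsed approval falls back to review are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MatrixEntry:
    matrix_id: str
    subject_jurisdiction: str
    purpose_id: str
    model_endpoint_region: str
    log_region: str
    key_region: str
    approved_decision: str
    expiry_review_date: date


def check_route(entry: MatrixEntry, runtime_route: dict, today: date) -> str:
    """Compare an observed runtime route with its approved matrix entry."""
    if today > entry.expiry_review_date:
        # A lapsed approval is treated as no approval until it is recertified.
        return "review_required"
    for field in ("model_endpoint_region", "log_region", "key_region"):
        if runtime_route.get(field) != getattr(entry, field):
            # Inference, logs and keys must all match; one stray region denies the route.
            return "deny"
    return entry.approved_decision
```

Checking the log and key regions alongside the model endpoint reflects the point that the database or inference region alone does not describe the route.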
6.3 Compatibility Rules
| Proposed reuse | Default stance | Reason |
| --- | --- | --- |
| Service transcript to marketing AI | deny or separate review | purpose mismatch and customer expectation risk |
| Fraud signal to customer-facing explanation | minimize and human review | disclosure and abuse risk |
| Open banking authorization to internal model training | deny unless separately approved | secondary use risk |
| Complaint data to eval set | conditional with minimization and evidence | quality purpose may be valid but high sensitivity |
| Employee copilot logs to productivity scoring | deny or separate workforce review | monitoring and fairness risk |
| Regional model fallback to global endpoint | deny unless explicit fallback route approved | hidden cross-border transfer risk |
7. Cross-Border AI Architecture
7.1 Reference Architecture
User / Employee / API Client
-> channel ingress and region resolver
-> identity, consent and authorization
-> purpose catalog and data classification
-> residency policy decision point
-> AI orchestrator
-> prompt assembly service
-> RAG gateway
-> tool gateway
-> model/provider region gateway
-> memory service
-> logging and evidence gateway
-> eval sampling gateway
-> vendor telemetry controller
-> policy decision log
-> evidence ledger
-> dashboards and recertification workflows
7.2 Residency PDP Inputs
| Input | Example |
| --- | --- |
| Subject | customer cust_123, employee rm_789 |
| Subject jurisdiction | EU, US state, UK, SG, CA, booking entity context |
| Product entity | bank_eu_card, bank_us_deposit, wealth_sg |
| Purpose | payment_dispute_support |
| Consent/authorization | grant, withdrawal, open banking token, employee notice |
| Data class | transaction, credit, card, complaint, employee |
| Action | retrieve, summarize, draft, submit, log, evaluate, train |
| Artifact | prompt, embedding, tool payload, log, eval sample |
| Processor | internal, cloud, model provider, observability, annotation |
| Endpoint region | eu-west, us-east, sovereign cloud region, on-prem |
| Key policy | local KMS, external KMS, HSM, BYOK, HYOK |
| Contract route | approved processor and subprocessor chain |
7.3 Residency PDP Decisions
| Decision | Meaning |
| --- | --- |
| allow | Route matches approved matrix and controls. |
| deny | Route conflicts with jurisdiction, purpose, data class or vendor policy. |
| localize | Use local model, local RAG, local logs and local keys. |
| minimize | Redact, summarize, aggregate, tokenize or pseudonymize before processing. |
| split_route | Use local processing for restricted fields and external route for public context. |
| synthetic_only | Use synthetic or de-identified eval data instead of production data. |
| review_required | Trigger Legal/Privacy/Security/Model Risk/Vendor Risk workflow. |
| contract_review_required | Vendor, subprocessor, retention or support access changed. |
| step_up_approval | Human approval required before high-impact action or transfer. |
| kill_switch | Stop capability or route due to policy breach or unresolved risk. |
8. Cross-Border RAG Architecture
8.1 RAG Data Path
Source system
-> classification and jurisdiction labeling
-> corpus manifest
-> chunking in approved region
-> embedding in approved region
-> vector index with ACL and purpose metadata
-> retrieval with subject/purpose/region filters
-> prompt assembly with minimized context
-> inference through approved model route
-> retrieval trace and evidence ledger
8.2 Corpus Manifest
| Field | Example |
| --- | --- |
| corpus_id | eu_card_dispute_policy_v3 |
| source_system | policy management system |
| source_region | EU |
| allowed_jurisdictions | EU product entities |
| allowed_purposes | payment_dispute_support, complaint_resolution |
| data_classes | public policy, internal procedure, customer case where applicable |
| contains_personal_data | no / yes with constraints |
| embedding_region | EU |
| vector_index_region | EU |
| retrieval_acl | role, tenant, product, case assignment |
| deletion_propagation | source deletion to index tombstone |
| evidence_level | standard or enhanced |
8.3 RAG Controls
| Risk | Control | Evidence |
| --- | --- | --- |
| Wrong region corpus retrieved | region metadata filter | retrieval trace with corpus ID |
| Purpose mismatch | purpose allowlist per corpus | denied retrieval decision |
| ACL mismatch | source ACL mirrored in vector index | positive and negative tests |
| Restricted customer data embedded globally | local embedding pipeline | embedding job region log |
| Revoked data remains retrievable | tombstone and purge workflow | deletion propagation evidence |
| Prompt overexposure | chunk budget and redaction | prompt manifest |
| Eval sampling leak | synthetic or local eval queue | eval lineage record |
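The region metadata filter and the per-corpus purpose allowlist could look roughly like this pre-retrieval check. It is a sketch under assumptions: `CorpusManifest` carries only four of the manifest fields, and the reason codes are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorpusManifest:
    corpus_id: str
    vector_index_region: str
    allowed_jurisdictions: frozenset
    allowed_purposes: frozenset


def filter_corpora(manifests, jurisdiction, purpose_id, region):
    """Split corpora into those eligible for retrieval and those denied, with reasons."""
    eligible, denied = [], []
    for m in manifests:
        if m.vector_index_region != region:
            denied.append((m.corpus_id, "region_mismatch"))
        elif jurisdiction not in m.allowed_jurisdictions:
            denied.append((m.corpus_id, "jurisdiction_not_allowed"))
        elif purpose_id not in m.allowed_purposes:
            denied.append((m.corpus_id, "purpose_not_allowed"))
        else:
            eligible.append(m.corpus_id)
    return eligible, denied
```

Returning the denied list with reasons, not only the eligible corpora, gives the retrieval trace something to record as a denied retrieval decision.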
9. Tool Calling Architecture
Tool calling is often riskier than model inference because tools touch systems of record.
AI orchestrator
-> tool intent
-> policy decision point
-> scoped token issuer
-> payload minimizer
-> region-aware tool gateway
-> system of record
-> tool result classifier
-> prompt/log minimizer
-> evidence ledger
| Tool | Data path risk | Control |
| --- | --- | --- |
| transactions.read | account and transaction data may cross region | local API endpoint, scoped token, masked result |
| card_dispute.create_draft | dispute narrative and evidence | local case system, customer confirmation |
| crm.note.write | persistent customer record | purpose check, role check, output classifier |
| marketing.offer.generate | customer profile to creative model | segment-level input, preference suppression |
| open_banking.token.revoke | customer-authorized data sharing | authorization scope, immediate revocation |
| fraud.case.triage | restricted fraud signals | need-to-know access, no external raw prompt |
| employee.hr.lookup | workforce data | employee policy route and high restriction |
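A minimal sketch of the payload minimizer step in the tool path. The tool names come from the table above; the `TOOL_FIELD_ALLOWLIST` contents and the payload field names are hypothetical.

```python
# Hypothetical per-tool allowlist of payload fields.
TOOL_FIELD_ALLOWLIST = {
    "transactions.read": {"account_token", "date_from", "date_to"},
    "card_dispute.create_draft": {"case_id", "transaction_ref", "narrative"},
}


def minimize_payload(tool_name: str, payload: dict) -> dict:
    """Keep only the fields a tool is approved to receive; unregistered tools are rejected."""
    allowed = TOOL_FIELD_ALLOWLIST.get(tool_name)
    if allowed is None:
        raise PermissionError(f"tool not registered for AI use: {tool_name}")
    return {k: v for k, v in payload.items() if k in allowed}
```

An allowlist is used instead of a blocklist so that a field added to an upstream payload later is dropped by default until someone approves it.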
10. Logs, Traces, Eval and Human Review
10.1 Logging Architecture
| Log object | Recommended content | Avoid by default |
| --- | --- | --- |
| Operational metrics | latency, cost, route, error code | raw prompt or account data |
| Policy decision log | purpose, data class, route, decision, reason | full customer payload |
| Prompt manifest | template ID, source IDs, masking flags | full retrieved chunks |
| Tool trace | tool name, object scope, region, decision | full tool result |
| Evidence vault | encrypted payload when justified | broad engineering access |
| Vendor log | endpoint, region, retention class | uncontrolled provider debug packets |
| Security audit log | identity, access, denial, anomaly | sensitive content beyond need |
10.2 Eval Architecture
Eval is a product and risk control, but it can create hidden data reuse.
| Eval type | Data residency concern | Preferred pattern |
| --- | --- | --- |
| Synthetic regression | no direct production data | use for baseline coverage |
| Golden set from production | production data copied into eval store | local store, approval and minimization |
| Red-team prompts | may include sensitive scenarios | synthetic or sanitized scenarios |
| Human labels | reviewer location and access | region-approved review queue |
| Vendor eval service | third-party processing path | vendor review and payload minimization |
| Fine-tuning data | strong secondary use | separate approval and lineage |
11. Vendor and Processor Architecture
11.1 Vendor Inventory
| Vendor type | AI residency question |
| --- | --- |
| Cloud provider | Where are compute, storage, backup, admin access and support processed? |
| Model provider | Which endpoint region, retention policy, training use and subprocessors apply? |
| Vector database | Where are embeddings and indexes stored and replicated? |
| Observability vendor | Does telemetry include prompt, completion, tool payload or identifiers? |
| Annotation vendor | Where are reviewers and work queues located? |
| Security vendor | Are logs or payloads inspected outside approved regions? |
| Data enrichment vendor | Is customer data sent for enrichment or matching? |
| Customer support platform | Are AI transcripts stored in global tenant? |
11.2 Vendor Route Decision
If provider endpoint region is approved
and retention setting matches policy
and no-training boundary is active
and subprocessor chain is approved
and logs/telemetry are minimized
and key policy matches matrix
then route may be enabled.
Otherwise route is denied, localized, minimized or sent to review.
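The rule above can be sketched as a checklist that reports which conditions block a route. The `VendorRouteCheck` field names are hypothetical labels for the six conditions; how each flag is established (contract review, provider configuration, telemetry inventory) is outside this sketch.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class VendorRouteCheck:
    endpoint_region_approved: bool
    retention_matches_policy: bool
    no_training_active: bool
    subprocessors_approved: bool
    telemetry_minimized: bool
    key_policy_matches: bool


def failed_conditions(check: VendorRouteCheck) -> list:
    """Return the names of conditions that block the route; empty means it may be enabled."""
    return [name for name, ok in asdict(check).items() if not ok]
```

Returning the failed condition names, instead of a single boolean, lets the caller decide whether the route is denied, localized, minimized or sent to review.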
12. Sovereign Deployment Patterns
Sovereign AI can mean different operating models. Define it before using it in strategy or marketing.
| Pattern | Description | Use when | Trade-off |
| --- | --- | --- | --- |
| Local SaaS region | Managed provider in approved local region | moderate sensitivity and fast launch | provider control remains material |
| Regional private cloud | Dedicated tenant or private deployment in region | higher control and regulated workload | higher cost and operations effort |
| Sovereign cloud | Cloud operated under jurisdiction-specific controls | public sector or strict regulated route | service catalog may be narrower |
| On-prem model serving | Model hosted in bank-controlled data center | highest control and restricted data | model quality, scaling and patching burden |
| Hybrid split route | Restricted data local, public context external | balance quality and residency | complex orchestration and evidence |
| Edge/local inference | small model near channel or branch | low latency and offline mode | limited model capability |
| Confidential computing | workload protected in TEE | reduce exposure to infrastructure operator | attestation and side-channel governance |
| Local RAG plus external reasoning | local retrieval/minimization, external model receives summary | reduce data movement | summary quality and leakage risk |
12.1 Pattern Selection Criteria
| Criterion | Question |
| --- | --- |
| Data sensitivity | Is raw customer, credit, card, KYC, complaint or employee data needed? |
| Purpose criticality | Is this service, fraud, regulated advice, marketing or eval? |
| Latency | Can local route meet SLA? |
| Model quality | Does local model meet task accuracy and language needs? |
| Cost | Is sovereign route economically sustainable? |
| Evidence | Can route, key, log and operator control be proven? |
| Vendor exit | Can model/provider be replaced without data lock-in? |
| Resilience | What happens if local provider or region is down? |
13. Model / Provider Region Controls
13.1 Model Gateway
AI orchestrator
-> model request classifier
-> data class and purpose policy
-> provider/model region registry
-> route selection
-> payload minimizer
-> provider endpoint
-> response classifier
-> log/evidence gateway
13.2 Provider Region Register
| Field | Example |
| --- | --- |
| provider_id | provider_x |
| model_id | model_x_large_2026_05 |
| endpoint_region | EU, US, UK, SG, sovereign-region-1 |
| supported_data_classes | public, internal, masked PII, transaction summary |
| blocked_data_classes | raw PAN, credit bureau, AML notes |
| allowed_purposes | customer_service_account_help, public_product_education |
| training_use | no training, opt-out, separate agreement |
| retention_policy | zero retention or configured retention class |
| telemetry_policy | metadata only, redacted, disabled where available |
| subprocessor_ref | approved vendor record |
| key_policy | provider-managed, BYOK, HYOK, local HSM |
| fallback_route | local smaller model or deny |
| approval_ref | model risk and vendor risk decision |
| review_expiry | date for recertification |
13.3 Routing Rules
| Request | Route |
| --- | --- |
| Public product explanation | global or regional model allowed if product policy allows. |
| Authenticated account explanation | approved regional model with masked account fields. |
| Card dispute draft | local/regional model, no raw PAN, controlled logs. |
| Credit adverse action explanation | high-control route, human review, explainability artifacts. |
| AML suspicious activity triage | restricted internal route, no external raw prompt. |
| Marketing creative generation | segment-level prompt, preference suppression, campaign evidence. |
| Eval regression | synthetic or region-local approved sample. |
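One way a model gateway might select a route from the provider region register, following only explicitly registered fallbacks. This is a sketch: `ProviderRoute` keeps a few register fields, and the `healthy` set (providers currently available) and the `select_route` helper are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProviderRoute:
    provider_id: str
    endpoint_region: str
    blocked_data_classes: frozenset
    allowed_purposes: frozenset
    fallback_route: Optional[str]  # provider_id of an approved fallback, or None for deny


def select_route(register, provider_id, region, purpose_id, data_classes, healthy):
    """Pick an approved provider route, or None when no approved route is available."""
    seen = set()
    while provider_id is not None and provider_id not in seen:
        seen.add(provider_id)  # guards against a fallback chain that loops
        route = register.get(provider_id)
        if route is None:
            return None
        permitted = (
            route.endpoint_region == region
            and purpose_id in route.allowed_purposes
            and not (set(data_classes) & route.blocked_data_classes)
        )
        if permitted and provider_id in healthy:
            return route.provider_id
        provider_id = route.fallback_route
    # No approved route: deny instead of silently falling back to a global endpoint.
    return None
```

Every fallback candidate passes the same region, purpose and data-class checks as the primary route, so an outage cannot widen what the request is allowed to do.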
14. Encryption and Key Residency
Encryption supports residency but does not replace routing and purpose controls.
14.1 Key Questions
| Question | Why it matters |
| --- | --- |
| Where are keys generated? | Generation location can matter for control claims. |
| Who controls keys? | Provider-managed keys and customer-managed keys have different risk profiles. |
| Where can keys be used? | Decryption path may cross boundaries. |
| Who can access keys? | Admin and break-glass access must be logged and approved. |
| Are logs and backups encrypted with local keys? | Evidence and DR artifacts also need controls. |
| Can keys be destroyed on exit? | Vendor exit and deletion depend on key lifecycle. |
14.2 Key Residency Patterns
| Pattern | Description | Fit |
| --- | --- | --- |
| Provider-managed key | Vendor controls key lifecycle | lower sensitivity or low-risk artifact |
| Customer-managed key | Organization controls key in provider KMS | common regulated cloud pattern |
| Bring your own key | Organization imports or manages key material | stronger enterprise control |
| Hold your own key | Key never leaves organization-controlled HSM | high sovereignty posture |
| Split key / dual control | Multiple parties required for key operation | high-risk evidence vault |
| Local HSM | Hardware security module in approved region | restricted data and sovereign route |
15. Data Minimization Patterns
Minimization is the most practical cross-border risk reducer.
15.1 Minimization Ladder
| Level | Pattern | Example |
| --- | --- | --- |
| 0 | deny | raw AML notes never leave internal route |
| 1 | field suppression | remove PAN, SSN, account number |
| 2 | masking | last4 only, merchant category only |
| 3 | tokenization | replace customer ID with scoped token |
| 4 | summarization | "three card transactions in dispute window" |
| 5 | aggregation | segment-level campaign prompt |
| 6 | synthetic data | generated eval cases |
| 7 | local processing | keep raw data local and send only answer |
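Levels 1 to 3 of the ladder could be sketched as three small helpers. The function names are hypothetical, and the keyed-hash token is a stand-in assumption: a production tokenization service would typically use a vault or a managed key, not a secret passed as an argument.

```python
import hashlib
import hmac


def suppress_fields(record: dict, blocked: set) -> dict:
    """Level 1: drop fields that must not leave the local route."""
    return {k: v for k, v in record.items() if k not in blocked}


def mask_last4(value: str) -> str:
    """Level 2: replace all but the last four characters with asterisks."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


def scoped_token(customer_id: str, scope: str, secret: bytes) -> str:
    """Level 3: derive a token that is stable within one scope and differs across scopes."""
    digest = hmac.new(secret, f"{scope}:{customer_id}".encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]
```

Binding the token to a scope means a token issued for dispute handling cannot be joined against one issued for marketing without access to the secret, which keeps the linkage key under governance.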
15.2 Minimization by AI Artifact
| Artifact | Minimization tactic |
| --- | --- |
| User prompt | classify and redact before model call. |
| RAG context | retrieve fewer chunks, mask sensitive fields, include source IDs. |
| Tool payload | send object ID and required fields only. |
| Tool result | summarize or mask before prompt re-entry. |
| Log | store manifest, hashes and decision IDs. |
| Eval sample | synthetic first, then approved sanitized production samples. |
| Memory | store preference or stable fact only when purpose allows. |
| Vendor ticket | attach redacted trace and route decision, not raw payload. |
16. Transfer Impact Review Workflow
This workflow is an architecture governance artifact, not a legal conclusion.
16.1 Trigger Events
| Trigger | Example |
| --- | --- |
| New model provider | adding an external LLM endpoint |
| New endpoint region | routing EU prompts to non-EU endpoint |
| New data class | adding credit bureau data to assistant |
| New purpose | service bot data reused for marketing AI |
| New vendor telemetry | safety monitoring sends payload samples |
| New eval process | production chat transcripts sampled for QA |
| New subprocessor | provider adds downstream analytics processor |
| New support path | offshore vendor support can view traces |
| New fallback route | outage route sends data to global endpoint |
| New key control | provider-managed keys replace local KMS |
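Some of these triggers can be detected mechanically by diffing an approved route manifest against a proposed one. The sketch below assumes a flat manifest dictionary; the field names and trigger codes in `TRIGGER_BY_FIELD` are hypothetical and cover only a subset of the triggers.

```python
# Hypothetical mapping from a changed manifest field to the trigger it raises.
TRIGGER_BY_FIELD = {
    "provider_id": "new_model_provider",
    "endpoint_region": "new_endpoint_region",
    "data_classes": "new_data_class",
    "purpose_id": "new_purpose",
    "subprocessors": "new_subprocessor",
    "fallback_route": "new_fallback_route",
    "key_policy": "new_key_control",
}


def review_triggers(approved: dict, proposed: dict) -> list:
    """List the review triggers raised by differences between two route manifests."""
    return sorted(
        trigger
        for field, trigger in TRIGGER_BY_FIELD.items()
        if approved.get(field) != proposed.get(field)
    )
```

A check like this could run in a release pipeline so that a changed endpoint region or key policy opens a review before the route reaches production.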
16.2 Review Steps
| Step | Output |
| --- | --- |
| 1. Define use case and business necessity | use case brief and purpose ID |
| 2. Map data classes and subjects | classification table |
| 3. Draw full AI data path | source, RAG, prompt, model, tools, logs, eval, backups, keys |
| 4. Identify processors/subprocessors | vendor inventory and contract refs |
| 5. Assess minimization alternatives | local, masked, aggregated, synthetic options |
| 6. Define technical safeguards | encryption, key, access, logging, deletion, monitoring |
| 7. Define contractual and operational safeguards | vendor terms, support access, incident, exit |
| 8. Record residual risk and approvals | review decision and expiry |
| 9. Convert decision to runtime policy | model gateway, RAG filters, tool rules, log config |
| 10. Test positive and negative routes | evidence pack and launch gate |
17. Evidence Ledger
Evidence ledger is the production proof that policy became runtime behavior.
Ledger events cover residency decisions, RAG retrieval, model route, tool route, log retention, eval sampling, key access, transfer review linkage, vendor changes and kill-switch actions.
17.1 Ledger Schema
| Field | Description |
| --- | --- |
| event_id | Unique ledger event ID. |
| event_time | UTC timestamp. |
| interaction_id | User session, case or workflow ID. |
| subject_type | customer, employee, merchant, representative. |
| subject_jurisdiction | Routing jurisdiction context. |
| product_entity | Booking entity or tenant. |
| purpose_id | Purpose catalog reference. |
| data_classes | Source and derived classes. |
| artifact_type | prompt, RAG chunk, tool payload, log, eval sample. |
| processor_id | Internal or vendor processor. |
| subprocessor_ref | Subprocessor inventory reference. |
| source_region | Where data originated. |
| destination_region | Where data was processed or stored. |
| model_id | Model and version. |
| endpoint_region | Inference endpoint region. |
| key_policy_id | KMS/HSM policy reference. |
| decision | allow, deny, localize, minimize, review, kill switch. |
| reason_code | Structured reason. |
| policy_version | Runtime policy bundle version. |
| review_ref | Transfer/vendor/model review ID. |
| evidence_hash | Hash of evidence manifest where payload is not stored. |
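The `evidence_hash` field could be produced along these lines. This is a sketch, assuming the manifest is a JSON-serializable dictionary and that SHA-256 over a canonical JSON encoding is acceptable; the `ledger_event` helper fills only a few of the schema fields.

```python
import hashlib
import json


def evidence_hash(manifest: dict) -> str:
    """Hash a manifest deterministically so the ledger can reference it without the payload."""
    # Sorted keys and fixed separators make the encoding independent of insertion order.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def ledger_event(event_id: str, interaction_id: str, decision: str, manifest: dict) -> dict:
    """Build a ledger row that stores the manifest hash instead of raw prompt or tool data."""
    return {
        "event_id": event_id,
        "interaction_id": interaction_id,
        "decision": decision,
        "evidence_hash": evidence_hash(manifest),
    }
```

Because the encoding is canonical, two services that build the same manifest with keys in a different order still produce the same hash, which makes later verification against the stored manifest straightforward.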
17.2 Evidence Query
Proof that a specific interaction used approved route:
SELECT
event_time,
interaction_id,
purpose_id,
artifact_type,
processor_id,
source_region,
destination_region,
endpoint_region,
decision,
reason_code,
policy_version,
review_ref
FROM ai_residency_ledger
WHERE interaction_id = :interaction_id
ORDER BY event_time;
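To try a query of this shape end to end, the sketch below loads a few synthetic rows into an in-memory SQLite table and flags events whose destination falls outside an approved set. The rows, the `build_demo_ledger` and `unapproved_events` helpers, and the use of SQLite are assumptions for illustration; a production ledger would live in a governed store.

```python
import sqlite3

# Synthetic demo rows, not real ledger data.
DEMO_ROWS = [
    ("2026-06-01T10:00:00Z", "int_001", "payment_dispute_support", "prompt", "internal",
     "EU", "EU", "EU", "allow", "ROUTE_MATCH", "policy_v1", "TIR-001"),
    ("2026-06-01T10:00:01Z", "int_001", "payment_dispute_support", "log", "internal",
     "EU", "EU", "EU", "allow", "ROUTE_MATCH", "policy_v1", "TIR-001"),
    ("2026-06-01T11:00:00Z", "int_002", "payment_dispute_support", "prompt", "provider_x",
     "EU", "US", "US", "deny", "REGION_NOT_APPROVED", "policy_v1", "TIR-001"),
]


def build_demo_ledger() -> sqlite3.Connection:
    """Create an in-memory ledger table with the columns used by the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE ai_residency_ledger (
            event_time TEXT, interaction_id TEXT, purpose_id TEXT, artifact_type TEXT,
            processor_id TEXT, source_region TEXT, destination_region TEXT,
            endpoint_region TEXT, decision TEXT, reason_code TEXT,
            policy_version TEXT, review_ref TEXT)"""
    )
    conn.executemany(
        "INSERT INTO ai_residency_ledger VALUES (?,?,?,?,?,?,?,?,?,?,?,?)", DEMO_ROWS
    )
    return conn


def unapproved_events(conn, interaction_id, approved_regions):
    """Return events for one interaction whose destination is outside the approved set."""
    cursor = conn.execute(
        "SELECT event_time, artifact_type, destination_region, decision "
        "FROM ai_residency_ledger WHERE interaction_id = :interaction_id "
        "ORDER BY event_time",
        {"interaction_id": interaction_id},
    )
    return [row for row in cursor if row[2] not in approved_regions]
```

An empty result for an interaction is only as strong as the ledger's coverage: it shows that no recorded event left the approved regions, so completeness of event capture still has to be evidenced separately.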
18. Operating Model
18.1 RACI
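| Activity | PM | Senior BA | Architect | Privacy | Legal | Security | Data Gov | Model Risk | Vendor Risk | Ops |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Define AI purpose and customer value | A | R | C | C | C | C | C | C | C | C |
| Build data path map | C | R | A | C | C | R | R | C | C | C |
| Classify source and derived artifacts | C | R | C | C | C | C | A | C | C | C |
| Maintain jurisdiction-purpose-processor matrix | C | R | A | C | C | C | R | C | R | C |
| Approve legal/privacy interpretation | C | C | C | R | A | C | C | C | C | C |
| Design runtime policy controls | C | C | A | C | C | R | R | C | C | C |
| Approve model route | C | C | R | C | C | C | C | A | C | C |
| Approve vendor route | C | C | C | C | C | C | C | C | A | C |
| Operate evidence ledger | C | R | C | C | C | R | A | C | C | R |
| Respond to route incident | C | C | R | R | C | A | R | C | R | R |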
Legend: R = responsible, A = accountable, C = consulted.
18.2 Governance Forums
| Forum | Scope |
| --- | --- |
| AI product review | purpose, user value, customer journey, data need. |
| Architecture review board | data path, region routing, keys, logs, resilience. |
| Privacy/legal review | applicability, notices, transfer review, contractual terms. |
| Model risk committee | model route, eval, monitoring, fallback and risk tier. |
| Vendor risk review | provider, subprocessor, support access, telemetry, exit. |
| Security review | access control, encryption, KMS/HSM, incident response. |
| Operational readiness | runbooks, dashboards, support model, kill switch. |
18.3 Release Gate
| Gate question | Evidence |
| --- | --- |
| Is the purpose approved? | purpose catalog entry |
| Is the data path mapped end to end? | data path diagram and matrix |
| Are processors and subprocessors approved? | vendor inventory and approval refs |
| Are model endpoints and regions allowlisted? | provider region register |
| Are RAG/tool/log/eval controls configured? | policy bundle and tests |
| Are keys and backups aligned with route? | KMS/HSM and DR evidence |
| Is transfer impact review complete where triggered? | review ID and residual risk |
| Are dashboards and KRIs live? | monitoring links and alert rules |
| Is fallback route safe? | degraded-mode test |
| Is exit plan feasible? | vendor exit and deletion runbook |
19. Metrics and KRIs
19.1 Product Metrics
| Metric | What it reveals |
| --- | --- |
| Feature adoption by region | Whether local route supports customer value. |
| Completion rate by route | Whether minimized/local route hurts task completion. |
| Human handoff by residency denial | Friction caused by blocked routes. |
| Latency by model region | Customer experience impact. |
| Cost by sovereign pattern | Unit economics of local/private deployment. |
| Customer complaint rate about data use | Trust and disclosure risk. |
19.2 Control KRIs
| KRI | Signal |
| --- | --- |
| Denied cross-border attempts | Misconfiguration, product drift or abuse. |
| Route mismatch rate | Requests not matching approved matrix. |
| Unclassified artifact count | Data governance gap. |
| Prompt/log payload policy violations | Observability risk. |
| Eval samples without lineage | Hidden secondary-use risk. |
| Vendor endpoint outside allowlist | Provider route breach. |
| Key access anomalies | Sovereignty and security risk. |
| Transfer review overdue | Governance backlog. |
| Subprocessor change unreviewed | Vendor risk gap. |
| Fallback route activation | Resilience and policy stress. |
| Withdrawal/consent route conflict | Runtime state propagation gap. |
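As an illustration of how one KRI might be computed from ledger events, the sketch below derives a route mismatch rate. Treating an approved route as a (`purpose_id`, `endpoint_region`) pair is a simplifying assumption; the full matrix has more dimensions.

```python
def route_mismatch_rate(events, approved_routes):
    """Share of ledger events whose (purpose_id, endpoint_region) is not an approved pair."""
    if not events:
        return 0.0
    mismatches = sum(
        1 for e in events
        if (e["purpose_id"], e["endpoint_region"]) not in approved_routes
    )
    return mismatches / len(events)
```

The rate is most useful as a trend per product and region; a single aggregate number can hide one route that mismatches every time.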
19.3 Executive Dashboard
| Theme | Executive question |
| --- | --- |
| Customer trust | Can we explain where customer data goes? |
| Regulatory readiness | Can we prove route and controls for high-risk products? |
| Operational resilience | Can local routes survive provider and region outages? |
| Vendor concentration | Which AI capabilities depend on one provider route? |
| Cost | What is the premium for sovereign routes and is it justified? |
| Risk appetite | Which residual risks have business acceptance? |
20. 30-Day Lab
Goal: within 30 days, produce a presentable AI Data Residency / Cross-Border / Sovereign AI Architecture portfolio pack. Suggested use case: Retail Banking AI Dispute Assistant.

| Days | Focus | Outputs |
| --- | --- | --- |
| 1-5 | Frame use case, subject, product, jurisdiction, purpose and caveats | use case brief, purpose entry, applicability statement |
| 6-10 | Map source, RAG, prompt, model, tool, log, eval, backup and key paths | end-to-end data path diagram |
| 11-15 | Build data classification, processor inventory and provider region register | classification table, vendor inventory, endpoint register |
| 16-20 | Design PDP, RAG, tool, log, eval and key controls | policy spec, corpus manifest, tool policy, key policy |
| 21-25 | Simulate transfer impact review and evidence ledger | review record, ledger schema, SQL proof query, KRI list |
| 26-30 | Package PM/BA/architect narrative | executive memo, ADR, requirements, interview answers, portfolio deck |
21. Interview Answers
| Question | 30-second version | 2-minute focus |
| --- | --- | --- |
| How do data residency, cross-border transfer and sovereign AI differ? | Residency asks where AI artifacts are stored, processed and accessed; transfer asks whether they cross a jurisdiction boundary; sovereign AI asks how much local control exists. | Break the path into prompt, RAG, tool, log, eval, backup, telemetry, keys and support access; state that applicability is determined by Legal/Privacy/Compliance per jurisdiction, subject, product, vendor and contract. |
| Why is a cloud region not enough? | The AI data path extends beyond the database region. | Give examples: prompts sent to an external model, global observability, offshore labeling, vendor support tickets; control them with a residency PDP and an evidence ledger. |
| How do you design cross-border RAG? | Every corpus has a source region, allowed purposes, data classes, embedding/index region, ACL and deletion propagation. | Before retrieval, filter by jurisdiction, purpose, role, object scope and consent/authorization; eval uses synthetic or region-local samples. |
| What should the matrix contain? | Subject jurisdiction, product entity, purpose, data class, processor, endpoint, tool/log/eval/key region, decision, safeguards, review ref. | It is the bridge from BA work to runtime policy; a new vendor, data class, eval route or fallback route triggers review. |
| How do you control the provider region? | All model requests go through a centralized model gateway. | The provider register records model, endpoint, allowed data classes, retention, no-training, telemetry, subprocessors, key policy, fallback and approval. |
| What does key residency affect? | Encryption cannot replace route control; key location and control affect the sovereignty claim. | Put KMS/HSM region, BYOK/HYOK, rotation, break-glass, backup encryption and exit destruction into the matrix. |
| How do you govern eval data? | The eval pipeline is a separate processing path. | Default to synthetic first; real failure samples need classification, redaction, a local queue, reviewer control, lineage and retention evidence. |
| When do you choose a sovereign/private route? | When sensitive data, strict jurisdiction, key control, support access, resilience or trust require stronger control. | Trade off data sensitivity, purpose criticality, quality, latency, cost, operator control, audit evidence, fallback and exit. |
| How do you prove that no unapproved boundary was crossed? | The evidence ledger records RAG, model, tool, log, eval and key decisions by interaction_id. | Evidence can store manifests, source IDs, hashes, policy versions and review refs, which reduces raw payload overcollection. |
22. Portfolio Deliverables
| Deliverable | What good looks like |
| --- | --- |
| Executive memo | Explains why residency is product trust, architecture and risk control. |
| Use case brief | Customer value, scope, non-scope, jurisdiction assumptions and caveats. |
| Data classification matrix | Source and derived AI artifacts classified. |
| Data path diagram | Source, RAG, prompt, model, tools, logs, eval, backup and keys shown. |
| Jurisdiction-purpose-processor matrix | Route decisions, safeguards, evidence refs and review expiry. |
| Sovereign pattern ADR | Compares local SaaS, private cloud, sovereign cloud, on-prem and hybrid. |
| Provider region register | Model endpoints, retention, telemetry, subprocessors, fallback and approvals. |
| Transfer impact review sample | Route, necessity, safeguards, residual risk and approval trail. |
| Evidence ledger schema | Event types, fields and one SQL proof query. |
| KRI dashboard spec | Denied attempts, route mismatch, eval lineage, key anomalies, vendor changes. |
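For the evidence ledger deliverable, one way to prototype the schema and its SQL proof query is an in-memory SQLite database. This is a sketch under stated assumptions: the table names `evidence_ledger` and `approved_regions`, the column set and the helper functions are hypothetical, chosen to echo the fields named in this playbook (`interaction_id`, policy version, review ref). The ledger holds decisions and references only, not raw payloads.

```python
import sqlite3

# Hypothetical schema: one row per routing decision, no raw payload stored.
SCHEMA = """
CREATE TABLE approved_regions (region TEXT PRIMARY KEY);
CREATE TABLE evidence_ledger (
    interaction_id TEXT NOT NULL,
    event_type     TEXT NOT NULL,  -- rag, model, tool, log, eval or key
    region         TEXT NOT NULL,
    decision       TEXT NOT NULL,  -- allow or deny
    policy_version TEXT NOT NULL,
    review_ref     TEXT
);
"""

# Proof query: any allowed event that ran in a region outside the approved set.
PROOF_QUERY = """
SELECT interaction_id, event_type, region
FROM evidence_ledger
WHERE decision = 'allow'
  AND region NOT IN (SELECT region FROM approved_regions)
ORDER BY interaction_id, event_type;
"""


def build_ledger(approved_regions, events):
    """Create an in-memory ledger and load approved regions plus events."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO approved_regions VALUES (?)",
        [(r,) for r in approved_regions],
    )
    conn.executemany(
        "INSERT INTO evidence_ledger VALUES (?, ?, ?, ?, ?, ?)", events
    )
    conn.commit()
    return conn


def unapproved_allowed_events(conn):
    """Return (interaction_id, event_type, region) rows that break the claim."""
    return conn.execute(PROOF_QUERY).fetchall()
```

An empty result supports the claim only for the events the ledger actually captured, so ledger completeness needs its own check.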
Portfolio storyline:
I treated data residency as a runtime AI architecture problem.
I mapped every artifact from user message to RAG, model, tools,
logs, eval, backup and keys, then translated jurisdiction,
purpose, processor and data class into executable policy.
23. Production Readiness Checklist
Every AI capability has approved purpose, data class and route.
Full data path covers RAG, prompts, tools, logs, eval, vendor telemetry, backups and keys.
Jurisdiction-purpose-processor matrix and provider region register are reviewed.
RAG corpus manifests include allowed purposes, regions, ACL and deletion propagation.
Tool gateway enforces purpose, object scope, region and payload minimization.
Logs and eval samples avoid raw payload unless justified and controlled.
Vendor subprocessors, support access, telemetry and exit path are inventoried.
Transfer impact review triggers on route, vendor, data class, purpose, eval or key changes.
Encryption and key residency align with data path and evidence requirements.
Fallback routes, kill switch, KRIs and evidence ledger are tested.
Named owners exist across Product, BA, Architecture, Legal, Privacy, Compliance, Security, Data Governance, Model Risk, Vendor Risk and Ops.
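The checklist item on transfer impact review triggers can be approximated as a diff between the currently approved configuration and a proposed one. The sketch below assumes both are flat dicts; the field names in `TRIGGER_FIELDS` are hypothetical stand-ins for route, vendor, data class, purpose, eval and key attributes.

```python
# Hypothetical field names covering the trigger categories in the checklist.
TRIGGER_FIELDS = (
    "route", "vendor", "data_class", "purpose", "eval_route", "key_region",
)


def review_triggers(current, proposed):
    """Return the trigger fields whose values differ between two configs."""
    return [f for f in TRIGGER_FIELDS if current.get(f) != proposed.get(f)]


def needs_transfer_review(current, proposed):
    """A change to any trigger field requires a new transfer impact review."""
    return bool(review_triggers(current, proposed))
```

A change to a field outside `TRIGGER_FIELDS` raises no trigger in this sketch; a real policy would likely default unknown changes to review.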
24. Closing View
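Data residency maturity in retail-finance AI does not lie in saying "we use region X". It lies in being able to prove which data was used, why it was used, which processors/subprocessors it passed through, in which regions it was processed and logged, which keys were used, which routes were allowed or denied, and which evidence can be re-verified.
The real goal is not to lock all AI into one place, but to make sure every AI data use can be designed, restricted, routed, minimized, monitored, revoked, exited and proven.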