AI Data Residency / Cross-Border / Sovereign AI Playbook

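This document is learning, portfolio and architecture training material. It does not constitute legal, privacy, compliance, audit, regulatory, tax, outsourcing, data protection, model risk management or security advice.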

800AI_DATA_RESIDENCY_CROSS_BORDER_SOVEREIGN_AI_PLAYBOOK.md

AI Data Residency / Cross-Border / Sovereign AI Architecture Playbook

Audience: retail financial services AI Product Manager, Senior BA, CBAP-level learner, Product Architect, Data Architect, Security Architect, Privacy Architect, Model Risk, Third-Party Risk, Compliance Product Owner.

Core question: how to turn data residency, cross-border transfer and sovereign AI from legal and risk language into AI product requirements, runtime architecture, vendor controls, an evidence chain and portfolio assets.

Scope boundary: this playbook does not discuss cloud compliance in general. It focuses on AI RAG, tool calling, prompt/log/eval, model provider, vendor telemetry, encryption key residency, transfer impact review and sovereign deployment patterns.


0. Disclaimer

Production projects must have their applicable requirements confirmed jointly by Legal, Privacy, Compliance, Security, Data Governance, Model Risk, Third-Party Risk, Enterprise Architecture, Product, Operations, Customer Experience and the accountable business owner.

Do not read any source anchor as a universal legal requirement.

Applicability depends on:

  • jurisdiction and regulator context.
  • data subject: customer, prospect, employee, merchant, representative, household, business entity.
  • customer segment: retail, wealth, SME, corporate, vulnerable customer, minor where applicable.
  • product: deposit, card, lending, insurance, brokerage, wallet, loyalty, open banking.
  • data class: personal data, financial account data, credit data, card data, KYC/AML, complaint, employee data.
  • purpose: service, fraud, compliance, marketing, advice, eval, model monitoring, vendor support.
  • vendor and processor/subprocessor chain.
  • contract, transfer mechanism, outsourcing terms and customer disclosure.
  • actual runtime data path, including logs, eval, support access and backups.

1. Executive Framing

The goal of an AI data residency architecture is not to pick a region in a cloud console.

Its goal is to let the organization answer, and prove, at runtime:

This AI use of this data followed the approved jurisdiction,
purpose, processor, model, tool, log, eval, backup and key route.

In retail financial AI, residency risk comes from context expansion.

The same customer question may pass through:

  • mobile app or contact center.
  • identity and consent service.
  • customer profile.
  • account and transaction systems.
  • card dispute or lending system.
  • RAG corpus and vector index.
  • prompt assembly service.
  • external or internal model endpoint.
  • tool gateway.
  • logging and tracing platform.
  • eval sampling pipeline.
  • human review queue.
  • vendor telemetry.
  • backup and disaster recovery system.
  • KMS/HSM and break-glass workflow.

If the team only asks "where is the database", it will miss most of the AI processing path.

Senior-level judgment:

Data residency is a runtime architecture property. Cloud region is only one input.

Outputs of this playbook:

  • data residency decision tree.
  • data classification model.
  • jurisdiction-purpose-processor matrix.
  • cross-border AI data path map.
  • RAG/tool/log/eval/vendor control design.
  • sovereign deployment pattern comparison.
  • model/provider region control register.
  • encryption and key residency design.
  • transfer impact review workflow.
  • evidence ledger schema.
  • operating model and RACI.
  • metrics and KRIs.
  • 30-day lab.
  • interview answers.
  • portfolio deliverables.

2. Source Anchors

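The following official sources serve as anchors for concepts and control design. The access date is recorded as 2026-06-30.

| Anchor | Official source | Where it is used in this playbook |
| --- | --- | --- |
| NIST Privacy Framework | https://www.nist.gov/privacy-framework | Uses Identify-P, Govern-P, Control-P, Communicate-P and Protect-P to organize privacy risk, processing context, data minimization, communication and evidence. |
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Uses Govern / Map / Measure / Manage to bring residency, vendor, model route, eval and monitoring into the AI risk lifecycle. |
| FTC Safeguards Rule | https://www.ftc.gov/business-guidance/resources/ftc-safeguards-rule-what-your-business-needs-know | Uses customer information protection, access control, service provider oversight, risk assessment and the information security program to constrain financial AI data paths. |
| CFPB Personal Financial Data Rights | https://www.consumerfinance.gov/personal-financial-data-rights/ | Used for product discussion of open banking, customer authorization, third-party access, revocation, API scope and customer control. |
| EDPB International Transfers | https://www.edpb.europa.eu/our-work-tools/our-documents/topic/international-transfers_en | Serves as an index anchor for international data transfers, transfer assessments, supplementary measures and regulatory interpretation. |
| ISO/IEC 42001 | https://www.iso.org/standard/42001 | Uses AI management system language to establish policy, roles, processes, vendors, monitoring, evidence and continuous improvement. |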

Standards-to-artifact:

| Source lens | Architecture artifact | Interview phrasing |
| --- | --- | --- |
| NIST Privacy Framework | Processing context and privacy control map | "I turn privacy risk into data paths and control evidence." |
| NIST AI RMF | AI residency risk register and monitoring dashboard | "I use Map / Measure / Manage to validate cross-border and vendor paths." |
| FTC Safeguards Rule | Customer information safeguard matrix | "Protection of financial customer information must cover AI prompts, tools, logs, vendors and support access." |
| CFPB data rights | Open banking authorization and revocation model | "Customer-authorized data sharing and internal AI secondary use must be designed separately." |
| EDPB transfer guidance | Transfer impact review pack | "A cross-border path needs its route, necessity, safeguards and residual risk recorded." |
| ISO/IEC 42001 | AI management system operating model | "I place data residency inside AIMS policy, roles, audit and continuous improvement." |

3. Operating Principles

| Principle | Meaning | Architecture behavior |
| --- | --- | --- |
| Path over place | Look beyond the database region to the full AI data path | Map prompt, RAG, tools, logs, eval, backup, key and support access. |
| Purpose before transfer | Confirm the AI purpose first, then assess the route | Service, fraud, marketing, eval and vendor support must not be mixed. |
| Least data movement | Process locally where possible; send a summary instead of the original where possible | Local pre-processing, masking, summarization and aggregation. |
| Policy as code | Do not leave residency in a wiki | PDP, model gateway, RAG metadata, tool gateway and release gate. |
| Derived artifacts inherit risk | Embeddings, summaries, labels and eval samples may still be restricted | Classification and lineage for derived AI artifacts. |
| Keys are part of sovereignty | Data held locally with keys held externally is not strong sovereignty | Local KMS/HSM, key owner, access log and break-glass control. |
| Evidence by design | Reconstructing evidence after the fact usually fails | Decision log, route manifest, approval ID and ledger schema. |
| Degraded mode is governed | Outage fallback must not cross borders ad hoc | Region-safe fallback, capability downgrade and kill switch. |

4. Data Residency Decision Tree

Use this decision tree as the entry point for requirements clarification and architecture review.

1. Who is the subject?
   -> customer / prospect / employee / merchant / representative / organization

2. Which jurisdiction and product entity apply?
   -> subject location, booking entity, branch, channel, product terms, contract

3. What is the AI purpose?
   -> service / fraud / compliance / marketing / advice / open banking / eval / operations

4. What data classes are used?
   -> public / internal / PII / account / transaction / credit / card / KYC / complaint / employee

5. Which AI artifacts are created?
   -> prompt / RAG chunk / embedding / output / tool payload / log / trace / eval sample / memory

6. Which processors and subprocessors touch them?
   -> cloud / model provider / vector DB / observability / annotation / support / DR provider

7. Which regions and access paths exist?
   -> compute, storage, logs, backup, admin access, support access, key access

8. Is cross-border processing necessary for the purpose?
   -> yes with review / no, use local route / minimize / pseudonymize / aggregate / deny

9. Which safeguards and evidence are required by policy?
   -> contract, encryption, key residency, access control, monitoring, transfer review, audit trail

10. What is the runtime decision?
   -> allow / deny / localize / minimize / pseudonymize / require review / human approval

Decision outputs:

| Output | Meaning |
| --- | --- |
| allow_local | Process inside approved local or sovereign boundary. |
| allow_regional | Process in approved regional route with evidence. |
| allow_cross_border_with_controls | Transfer path approved with documented safeguards and residual risk. |
| minimize_before_transfer | Redact, summarize, tokenize, aggregate or pseudonymize before route. |
| deny_route | Data/purpose/vendor/region combination not approved. |
| review_required | Legal, Privacy, Security, Vendor Risk or Model Risk review needed. |
| human_approval_required | Manual approval before enabling production route. |
| degrade_capability | Disable high-risk context and provide lower-risk answer. |
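
The decision tree can be approximated in code for requirement workshops. The following is a minimal sketch, not a definitive policy engine: the `RouteRequest` fields, the `NEVER_CROSS_BORDER` set and the two boolean flags are hypothetical simplifications, and only a subset of the decision outputs is produced.

```python
from dataclasses import dataclass

# Hypothetical data classes that must never leave the subject's jurisdiction.
NEVER_CROSS_BORDER = frozenset({"aml_notes", "raw_pan"})


@dataclass(frozen=True)
class RouteRequest:
    subject_jurisdiction: str  # e.g. "EU"
    purpose_id: str            # purpose catalog reference
    data_classes: frozenset    # source and derived data classes
    endpoint_region: str       # where processing would run
    route_approved: bool       # assumption: the matrix holds an approved entry
    minimizable: bool          # assumption: masking or summarizing is feasible


def decide(req: RouteRequest) -> str:
    """Walk a simplified decision tree and return one decision output."""
    local = req.endpoint_region == req.subject_jurisdiction
    if not local and (req.data_classes & NEVER_CROSS_BORDER):
        return "deny_route"
    if not req.route_approved:
        return "review_required"
    if local:
        return "allow_local"
    if req.minimizable:
        return "minimize_before_transfer"
    return "allow_cross_border_with_controls"
```

The ordering is a design choice: hard denials come first, missing approvals second, so an unapproved route is never allowed just because it happens to be local.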

5. Data Classification for AI Residency

Classification must cover source data and AI-derived artifacts.

5.1 Source Data Classes

| Class | Examples | Default architecture posture |
| --- | --- | --- |
| Public | public product FAQ, published fees, branch hours | Low restriction, still validate source integrity. |
| Internal | procedure manuals, internal training, non-sensitive metrics | Region preference by enterprise policy. |
| Confidential business | pricing strategy, roadmap, partner terms | Approved internal or contractual route only. |
| PII | name, address, email, phone, customer identifier | Purpose-bound processing, minimization and access control. |
| Financial account | account number, balance, transactions, statements | Strict purpose, scoped retrieval, masked logs. |
| Payment card | PAN, CVV, token, dispute evidence | Specialized controls, avoid prompt/log exposure. |
| Credit and underwriting | credit score, bureau data, adverse action info | High-risk route, human review and explainability boundary. |
| KYC / AML / fraud | identity verification, sanctions, suspicious activity | Need-to-know route, restricted disclosure. |
| Complaint and vulnerable customer | complaint narrative, hardship, vulnerability indicators | Enhanced access, retention and harm controls. |
| Employee data | performance, HR, monitoring, workforce notes | Workforce notice, role-based and monitoring minimization. |

5.2 AI-Derived Artifact Classes

| Artifact | Why it matters | Control |
| --- | --- | --- |
| Prompt | May contain raw customer data and instructions | prompt manifest, masking, region routing |
| Completion | May reveal or infer sensitive data | output classification and retention |
| Embedding | May encode restricted source content | index residency, deletion propagation |
| RAG chunk | Carries source data and ACL | corpus manifest and purpose filters |
| Tool payload | Can move operational data across systems | tool gateway and scoped token |
| Tool result | Often contains account or transaction data | minimization before prompt/log |
| Trace | Captures chain-of-thought-like workflow metadata or tool plans | structured trace without raw payload where feasible |
| Feedback | User correction or thumbs-down note can include sensitive content | classification and redaction |
| Eval sample | Production data reused for QA or regression | approval, anonymization, synthetic preference |
| Fine-tuning sample | Strongest secondary-use risk | separate approval and data lineage |
| AI memory | Persistent user or employee state | explicit purpose, TTL, deletion workflow |
| Vendor telemetry | Metadata, errors, safety flags, support packets | vendor controls and telemetry inventory |

5.3 Classification Rules

| Rule | Product/architecture implication |
| --- | --- |
| Derived artifact inherits the highest relevant source restriction unless approved otherwise. | Embeddings from restricted documents stay restricted. |
| Logs are data products, not exhaust. | Observability needs classification, retention and access control. |
| Aggregation can reduce risk but does not automatically remove obligations. | Confirm threshold, re-identification risk and purpose. |
| Pseudonymization is a safeguard, not a magic deletion of risk. | Keep linkage key governance and re-identification controls. |
| Data class can change after model output. | Generated complaint summary may become complaint record. |
| Human review queue is a processing location. | Reviewer location and access matter. |
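
The first rule (a derived artifact inherits the highest relevant source restriction) is easy to express as code. The sketch below assumes a hypothetical linear ranking in `RESTRICTION_RANK`; real classification schemes are often not strictly ordered, so treat the ranking as illustrative only.

```python
# Hypothetical ranking: a higher number means a more restricted class.
RESTRICTION_RANK = {
    "public": 0,
    "internal": 1,
    "confidential_business": 2,
    "pii": 3,
    "financial_account": 4,
    "payment_card": 5,
}


def derived_class(source_classes, approved_override=None):
    """Return the class a derived artifact (embedding, summary, eval sample) inherits.

    The artifact takes the most restricted source class unless an explicit,
    separately approved override is supplied.
    """
    if approved_override is not None:
        return approved_override
    if not source_classes:
        raise ValueError("a derived artifact needs at least one source class")
    return max(source_classes, key=lambda c: RESTRICTION_RANK[c])
```

For example, an embedding built from `internal` and `pii` sources is classified as `pii`.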

6. Jurisdiction / Purpose / Processor Matrix

This matrix is the core BA artifact.

It converts legal/privacy/vendor review into an executable architecture table.

6.1 Matrix Fields

| Field | Description |
| --- | --- |
| matrix_id | Stable ID for route approval. |
| subject_type | customer, prospect, employee, merchant, representative. |
| subject_jurisdiction | Location or legal context used for routing. |
| product_entity | Booking entity, branch, tenant or regulated affiliate. |
| customer_segment | retail, wealth, SME, vulnerable, employee, minor where relevant. |
| purpose_id | Purpose catalog reference. |
| data_classes | Source and derived data classes. |
| processor | Internal platform, cloud, model provider, vector DB, observability vendor. |
| subprocessors | Downstream providers where known and approved. |
| model_endpoint_region | Region where inference runs. |
| tool_destinations | Systems and regions receiving tool calls. |
| log_region | Where operational logs and evidence are stored. |
| eval_region | Where eval samples and labels are stored. |
| key_region | KMS/HSM and key control location. |
| approved_decision | allow, localize, minimize, deny, review required. |
| safeguards | Encryption, access control, contracts, minimization, monitoring. |
| evidence_ref | Transfer review, DPIA/PIA where applicable, vendor review, risk acceptance. |
| expiry_review_date | When approval must be recertified. |

6.2 Example Matrix

| Subject | Product | Purpose | Data | Processor route | Decision |
| --- | --- | --- | --- | --- | --- |
| EU retail customer | EU card | payment_dispute_support | transaction, dispute evidence | EU app, EU RAG, EU model endpoint, EU logs, EU keys | allow_local |
| US customer | US deposit | customer_service_account_help | account, transaction summary | US app, US model endpoint, US logs, US keys | allow_regional |
| EU customer | marketing | marketing_personalization | segment-level features | EU feature store, approved campaign tool | minimize_before_transfer if external creative model used |
| Global employee | policy search | employee_productivity_copilot | internal policy docs | regional RAG, central observability with masked logs | allow_cross_border_with_controls subject to policy |
| Open banking user | data sharing | open_banking_data_sharing | account and transaction API data | API gateway to authorized third party | allow_with_scope |
| Wealth client | RM copilot | rm_meeting_preparation | portfolio, notes, suitability | local/private route, no external raw prompt | allow_local or review_required |
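
To show how the matrix becomes executable, here is a hedged sketch of a lookup that returns the approved decision for a jurisdiction, product entity and purpose. `MatrixEntry` carries only a few of the matrix fields, and the default-deny and expiry behavior are assumptions for illustration, not a stated requirement of this playbook.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MatrixEntry:
    matrix_id: str
    subject_jurisdiction: str
    product_entity: str
    purpose_id: str
    approved_decision: str
    expiry_review_date: date


def lookup(matrix, jurisdiction, product_entity, purpose_id, today):
    """Return the approved decision, or a safe default when no valid entry exists."""
    key = (jurisdiction, product_entity, purpose_id)
    for entry in matrix:
        if (entry.subject_jurisdiction, entry.product_entity, entry.purpose_id) == key:
            # An expired approval is sent back to review instead of being honored.
            if today > entry.expiry_review_date:
                return "review_required"
            return entry.approved_decision
    # Assumption: combinations missing from the matrix are denied by default.
    return "deny_route"
```

Passing `today` explicitly keeps the function deterministic, which makes positive and negative route tests straightforward to write.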

6.3 Compatibility Rules

| Proposed reuse | Default stance | Reason |
| --- | --- | --- |
| Service transcript to marketing AI | deny or separate review | purpose mismatch and customer expectation risk |
| Fraud signal to customer-facing explanation | minimize and human review | disclosure and abuse risk |
| Open banking authorization to internal model training | deny unless separately approved | secondary use risk |
| Complaint data to eval set | conditional with minimization and evidence | quality purpose may be valid but high sensitivity |
| Employee copilot logs to productivity scoring | deny or separate workforce review | monitoring and fairness risk |
| Regional model fallback to global endpoint | deny unless explicit fallback route approved | hidden cross-border transfer risk |

7. Cross-Border AI Architecture

7.1 Reference Architecture

User / Employee / API Client
  -> channel ingress and region resolver
  -> identity, consent and authorization
  -> purpose catalog and data classification
  -> residency policy decision point
  -> AI orchestrator
       -> prompt assembly service
       -> RAG gateway
       -> tool gateway
       -> model/provider region gateway
       -> memory service
       -> logging and evidence gateway
       -> eval sampling gateway
       -> vendor telemetry controller
  -> policy decision log
  -> evidence ledger
  -> dashboards and recertification workflows

7.2 Residency PDP Inputs

| Input | Example |
| --- | --- |
| Subject | customer cust_123, employee rm_789 |
| Subject jurisdiction | EU, US state, UK, SG, CA, booking entity context |
| Product entity | bank_eu_card, bank_us_deposit, wealth_sg |
| Purpose | payment_dispute_support |
| Consent/authorization | grant, withdrawal, open banking token, employee notice |
| Data class | transaction, credit, card, complaint, employee |
| Action | retrieve, summarize, draft, submit, log, evaluate, train |
| Artifact | prompt, embedding, tool payload, log, eval sample |
| Processor | internal, cloud, model provider, observability, annotation |
| Endpoint region | eu-west, us-east, sovereign cloud region, on-prem |
| Key policy | local KMS, external KMS, HSM, BYOK, HYOK |
| Contract route | approved processor and subprocessor chain |

7.3 Residency PDP Decisions

| Decision | Meaning |
| --- | --- |
| allow | Route matches approved matrix and controls. |
| deny | Route conflicts with jurisdiction, purpose, data class or vendor policy. |
| localize | Use local model, local RAG, local logs and local keys. |
| minimize | Redact, summarize, aggregate, tokenize or pseudonymize before processing. |
| split_route | Use local processing for restricted fields and external route for public context. |
| synthetic_only | Use synthetic or de-identified eval data instead of production data. |
| review_required | Trigger Legal/Privacy/Security/Model Risk/Vendor Risk workflow. |
| contract_review_required | Vendor, subprocessor, retention or support access changed. |
| step_up_approval | Human approval required before high-impact action or transfer. |
| kill_switch | Stop capability or route due to policy breach or unresolved risk. |

8. Cross-Border RAG Architecture

8.1 RAG Data Path

Source system
  -> classification and jurisdiction labeling
  -> corpus manifest
  -> chunking in approved region
  -> embedding in approved region
  -> vector index with ACL and purpose metadata
  -> retrieval with subject/purpose/region filters
  -> prompt assembly with minimized context
  -> inference through approved model route
  -> retrieval trace and evidence ledger

8.2 Corpus Manifest

| Field | Example |
| --- | --- |
| corpus_id | eu_card_dispute_policy_v3 |
| source_system | policy management system |
| source_region | EU |
| allowed_jurisdictions | EU product entities |
| allowed_purposes | payment_dispute_support, complaint_resolution |
| data_classes | public policy, internal procedure, customer case where applicable |
| contains_personal_data | no / yes with constraints |
| embedding_region | EU |
| vector_index_region | EU |
| retrieval_acl | role, tenant, product, case assignment |
| deletion_propagation | source deletion to index tombstone |
| evidence_level | standard or enhanced |

8.3 RAG Controls

| Risk | Control | Evidence |
| --- | --- | --- |
| Wrong region corpus retrieved | region metadata filter | retrieval trace with corpus ID |
| Purpose mismatch | purpose allowlist per corpus | denied retrieval decision |
| ACL mismatch | source ACL mirrored in vector index | positive and negative tests |
| Restricted customer data embedded globally | local embedding pipeline | embedding job region log |
| Revoked data remains retrievable | tombstone and purge workflow | deletion propagation evidence |
| Prompt overexposure | chunk budget and redaction | prompt manifest |
| Eval sampling leak | synthetic or local eval queue | eval lineage record |
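
Several of these controls (region filter, purpose allowlist, ACL mirror, tombstone) can be applied as one metadata filter before prompt assembly. The sketch below is illustrative: `Chunk` and `filter_chunks` are hypothetical names, the ACL is reduced to a set of roles, and a real vector store would usually apply such filters inside the query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    corpus_id: str
    vector_index_region: str
    allowed_purposes: frozenset
    retrieval_acl: frozenset  # simplified: roles permitted to read the chunk
    tombstoned: bool = False  # set when the source record was deleted or revoked


def filter_chunks(chunks, *, region, purpose_id, role):
    """Return (allowed_chunks, trace) with one trace entry per candidate chunk."""
    allowed, trace = [], []
    for c in chunks:
        if c.tombstoned:
            decision = "deny_tombstoned"
        elif c.vector_index_region != region:
            decision = "deny_region"
        elif purpose_id not in c.allowed_purposes:
            decision = "deny_purpose"
        elif role not in c.retrieval_acl:
            decision = "deny_acl"
        else:
            decision = "allow"
            allowed.append(c)
        # The trace stores IDs and decisions only, never chunk text.
        trace.append({"chunk_id": c.chunk_id, "corpus_id": c.corpus_id, "decision": decision})
    return allowed, trace
```

Recording denied candidates as well as allowed ones gives the "denied retrieval decision" evidence listed in the table.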

9. Cross-Border Tool Architecture

Tool calling is often riskier than model inference because tools touch systems of record.

9.1 Tool Gateway Pattern

AI orchestrator
  -> tool intent
  -> policy decision point
  -> scoped token issuer
  -> payload minimizer
  -> region-aware tool gateway
  -> system of record
  -> tool result classifier
  -> prompt/log minimizer
  -> evidence ledger

9.2 Tool Control Table

| Tool | Data path risk | Control |
| --- | --- | --- |
| transactions.read | account and transaction data may cross region | local API endpoint, scoped token, masked result |
| card_dispute.create_draft | dispute narrative and evidence | local case system, customer confirmation |
| crm.note.write | persistent customer record | purpose check, role check, output classifier |
| marketing.offer.generate | customer profile to creative model | segment-level input, preference suppression |
| open_banking.token.revoke | customer-authorized data sharing | authorization scope, immediate revocation |
| fraud.case.triage | restricted fraud signals | need-to-know access, no external raw prompt |
| employee.hr.lookup | workforce data | employee policy route and high restriction |
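
A region-aware tool gateway can be sketched as an allowlist check followed by payload minimization. In the example below the `TOOL_POLICY` registry, its region sets and its field allowlists are hypothetical values chosen for illustration; the tool names come from the table above.

```python
# Hypothetical registry: approved destination regions and payload fields per tool.
TOOL_POLICY = {
    "transactions.read": {
        "regions": {"EU"},
        "fields": {"account_id", "date_from", "date_to"},
    },
    "card_dispute.create_draft": {
        "regions": {"EU"},
        "fields": {"case_id", "narrative"},
    },
}


def authorize_tool_call(tool, destination_region, payload):
    """Deny unknown tools or unapproved regions; otherwise strip unlisted fields."""
    policy = TOOL_POLICY.get(tool)
    if policy is None or destination_region not in policy["regions"]:
        return {"decision": "deny", "payload": None}
    minimized = {k: v for k, v in payload.items() if k in policy["fields"]}
    return {"decision": "allow", "payload": minimized}
```

Unknown tools are denied by default, so adding a tool forces an explicit policy entry.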

10. Logs, Traces, Eval and Human Review

10.1 Logging Architecture

| Log object | Recommended content | Avoid by default |
| --- | --- | --- |
| Operational metrics | latency, cost, route, error code | raw prompt or account data |
| Policy decision log | purpose, data class, route, decision, reason | full customer payload |
| Prompt manifest | template ID, source IDs, masking flags | full retrieved chunks |
| Tool trace | tool name, object scope, region, decision | full tool result |
| Evidence vault | encrypted payload when justified | broad engineering access |
| Vendor log | endpoint, region, retention class | uncontrolled provider debug packets |
| Security audit log | identity, access, denial, anomaly | sensitive content beyond need |

10.2 Eval Architecture

Eval is a product and risk control, but it can create hidden data reuse.

| Eval type | Data residency concern | Preferred pattern |
| --- | --- | --- |
| Synthetic regression | no direct production data | use for baseline coverage |
| Golden set from production | production data copied into eval store | local store, approval and minimization |
| Red-team prompts | may include sensitive scenarios | synthetic or sanitized scenarios |
| Human labels | reviewer location and access | region-approved review queue |
| Vendor eval service | third-party processing path | vendor review and payload minimization |
| Fine-tuning data | strong secondary use | separate approval and lineage |

11. Vendor and Processor Architecture

11.1 Vendor Inventory

| Vendor type | AI residency question |
| --- | --- |
| Cloud provider | Where are compute, storage, backup, admin access and support processed? |
| Model provider | Which endpoint region, retention policy, training use and subprocessors apply? |
| Vector database | Where are embeddings and indexes stored and replicated? |
| Observability vendor | Does telemetry include prompt, completion, tool payload or identifiers? |
| Annotation vendor | Where are reviewers and work queues located? |
| Security vendor | Are logs or payloads inspected outside approved regions? |
| Data enrichment vendor | Is customer data sent for enrichment or matching? |
| Customer support platform | Are AI transcripts stored in global tenant? |

11.2 Vendor Route Decision

If provider endpoint region is approved
and retention setting matches policy
and no-training boundary is active
and subprocessor chain is approved
and logs/telemetry are minimized
and key policy matches matrix
then route may be enabled.
Otherwise route is denied, localized, minimized or sent to review.
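
The rule above is a conjunction of six checks, which maps directly onto a small data structure. This is a minimal sketch with hypothetical field names; how each flag is established (contract review, provider configuration, monitoring) is outside the example.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class VendorRoute:
    endpoint_region_approved: bool
    retention_matches_policy: bool
    no_training_active: bool
    subprocessors_approved: bool
    telemetry_minimized: bool
    key_policy_matches: bool


def failed_checks(route: VendorRoute) -> list:
    """List the names of all checks that did not pass, in field order."""
    return [name for name, ok in asdict(route).items() if not ok]


def route_may_be_enabled(route: VendorRoute) -> bool:
    """The route may be enabled only when every check passes."""
    return not failed_checks(route)
```

Returning the failed check names, not just a boolean, lets the caller choose between deny, localize, minimize and review, and gives the evidence ledger a structured reason.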

12. Sovereign Deployment Patterns

Sovereign AI can mean different operating models. Define it before using it in strategy or marketing.

| Pattern | Description | Use when | Trade-off |
| --- | --- | --- | --- |
| Local SaaS region | Managed provider in approved local region | moderate sensitivity and fast launch | provider control remains material |
| Regional private cloud | Dedicated tenant or private deployment in region | higher control and regulated workload | higher cost and operations effort |
| Sovereign cloud | Cloud operated under jurisdiction-specific controls | public sector or strict regulated route | service catalog may be narrower |
| On-prem model serving | Model hosted in bank-controlled data center | highest control and restricted data | model quality, scaling and patching burden |
| Hybrid split route | Restricted data local, public context external | balance quality and residency | complex orchestration and evidence |
| Edge/local inference | small model near channel or branch | low latency and offline mode | limited model capability |
| Confidential computing | workload protected in TEE | reduce exposure to infrastructure operator | attestation and side-channel governance |
| Local RAG plus external reasoning | local retrieval/minimization, external model receives summary | reduce data movement | summary quality and leakage risk |

12.1 Pattern Selection Criteria

| Criterion | Question |
| --- | --- |
| Data sensitivity | Is raw customer, credit, card, KYC, complaint or employee data needed? |
| Purpose criticality | Is this service, fraud, regulated advice, marketing or eval? |
| Latency | Can local route meet SLA? |
| Model quality | Does local model meet task accuracy and language needs? |
| Cost | Is sovereign route economically sustainable? |
| Evidence | Can route, key, log and operator control be proven? |
| Vendor exit | Can model/provider be replaced without data lock-in? |
| Resilience | What happens if local provider or region is down? |

13. Model / Provider Region Controls

13.1 Model Gateway

AI orchestrator
  -> model request classifier
  -> data class and purpose policy
  -> provider/model region registry
  -> route selection
  -> payload minimizer
  -> provider endpoint
  -> response classifier
  -> log/evidence gateway

13.2 Provider Region Register

| Field | Example |
| --- | --- |
| provider_id | provider_x |
| model_id | model_x_large_2026_05 |
| endpoint_region | EU, US, UK, SG, sovereign-region-1 |
| supported_data_classes | public, internal, masked PII, transaction summary |
| blocked_data_classes | raw PAN, credit bureau, AML notes |
| allowed_purposes | customer_service_account_help, public_product_education |
| training_use | no training, opt-out, separate agreement |
| retention_policy | zero retention or configured retention class |
| telemetry_policy | metadata only, redacted, disabled where available |
| subprocessor_ref | approved vendor record |
| key_policy | provider-managed, BYOK, HYOK, local HSM |
| fallback_route | local smaller model or deny |
| approval_ref | model risk and vendor risk decision |
| review_expiry | date for recertification |

13.3 Routing Rules

| Request | Route |
| --- | --- |
| Public product explanation | global or regional model allowed if product policy allows. |
| Authenticated account explanation | approved regional model with masked account fields. |
| Card dispute draft | local/regional model, no raw PAN, controlled logs. |
| Credit adverse action explanation | high-control route, human review, explainability artifacts. |
| AML suspicious activity triage | restricted internal route, no external raw prompt. |
| Marketing creative generation | segment-level prompt, preference suppression, campaign evidence. |
| Eval regression | synthetic or region-local approved sample. |
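
Route selection against the provider region register can be sketched as a filter over register entries. `ProviderEntry` below keeps only four register fields, and the rule that an unmatched request is deninstead of falling back to another region is an assumption that mirrors the "deny unless explicit fallback route approved" stance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderEntry:
    provider_id: str
    endpoint_region: str
    supported_data_classes: frozenset
    allowed_purposes: frozenset


def select_route(register, *, region, purpose_id, data_classes):
    """Return the first provider approved for this region, purpose and data classes."""
    for entry in register:
        if (
            entry.endpoint_region == region
            and purpose_id in entry.allowed_purposes
            and data_classes <= entry.supported_data_classes
        ):
            return entry.provider_id
    # No silent fallback to an endpoint in another region.
    return "deny"
```

The subset test `data_classes <= entry.supported_data_classes` means a single unsupported class blocks the whole request, which then has to be minimized or routed elsewhere.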

14. Encryption and Key Residency

Encryption supports residency but does not replace routing and purpose controls.

14.1 Key Questions

| Question | Why it matters |
| --- | --- |
| Where are keys generated? | Generation location can matter for control claims. |
| Who controls keys? | Provider-managed keys and customer-managed keys have different risk profiles. |
| Where can keys be used? | Decryption path may cross boundaries. |
| Who can access keys? | Admin and break-glass access must be logged and approved. |
| Are logs and backups encrypted with local keys? | Evidence and DR artifacts also need controls. |
| Can keys be destroyed on exit? | Vendor exit and deletion depend on key lifecycle. |

14.2 Key Residency Patterns

| Pattern | Description | Fit |
| --- | --- | --- |
| Provider-managed key | Vendor controls key lifecycle | lower sensitivity or low-risk artifact |
| Customer-managed key | Organization controls key in provider KMS | common regulated cloud pattern |
| Bring your own key | Organization imports or manages key material | stronger enterprise control |
| Hold your own key | Key never leaves organization-controlled HSM | high sovereignty posture |
| Split key / dual control | Multiple parties required for key operation | high-risk evidence vault |
| Local HSM | Hardware security module in approved region | restricted data and sovereign route |

15. Data Minimization Patterns

Minimization is the most practical cross-border risk reducer.

15.1 Minimization Ladder

| Level | Pattern | Example |
| --- | --- | --- |
| 0 | deny | raw AML notes never leave internal route |
| 1 | field suppression | remove PAN, SSN, account number |
| 2 | masking | last4 only, merchant category only |
| 3 | tokenization | replace customer ID with scoped token |
| 4 | summarization | "three card transactions in dispute window" |
| 5 | aggregation | segment-level campaign prompt |
| 6 | synthetic data | generated eval cases |
| 7 | local processing | keep raw data local and send only answer |

15.2 Minimization by AI Artifact

| Artifact | Minimization tactic |
| --- | --- |
| User prompt | classify and redact before model call. |
| RAG context | retrieve fewer chunks, mask sensitive fields, include source IDs. |
| Tool payload | send object ID and required fields only. |
| Tool result | summarize or mask before prompt re-entry. |
| Log | store manifest, hashes and decision IDs. |
| Eval sample | synthetic first, then approved sanitized production samples. |
| Memory | store preference or stable fact only when purpose allows. |
| Vendor ticket | attach redacted trace and route decision, not raw payload. |
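
Levels 1 to 3 of the minimization ladder (field suppression, masking, tokenization) are small enough to sketch with the standard library. The function names are hypothetical, and the keyed-hash token is one possible tokenization approach assumed for illustration; a production design would also need key governance and a controlled re-identification path.

```python
import hashlib
import hmac


def suppress(record: dict, blocked: set) -> dict:
    """Level 1: drop blocked fields entirely."""
    return {k: v for k, v in record.items() if k not in blocked}


def mask_last4(value: str) -> str:
    """Level 2: keep only the last four characters visible."""
    return "*" * max(len(value) - 4, 0) + value[-4:]


def scoped_token(customer_id: str, scope: str, secret: bytes) -> str:
    """Level 3: derive a stable token that differs per scope (keyed HMAC-SHA256)."""
    message = f"{scope}:{customer_id}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()[:16]
```

Because the scope is part of the hashed message, the same customer gets different tokens for different purposes, which limits linkage across routes.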

16. Transfer Impact Review Workflow

This workflow is an architecture governance artifact, not a legal conclusion.

16.1 Trigger Events

| Trigger | Example |
| --- | --- |
| New model provider | adding an external LLM endpoint |
| New endpoint region | routing EU prompts to non-EU endpoint |
| New data class | adding credit bureau data to assistant |
| New purpose | service bot data reused for marketing AI |
| New vendor telemetry | safety monitoring sends payload samples |
| New eval process | production chat transcripts sampled for QA |
| New subprocessor | provider adds downstream analytics processor |
| New support path | offshore vendor support can view traces |
| New fallback route | outage route sends data to global endpoint |
| New key control | provider-managed keys replace local KMS |
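
Trigger events can be detected mechanically by comparing the old and new route configuration at release time. The sketch below covers six of the triggers; the configuration keys and trigger identifiers in `TRIGGER_FIELDS` are hypothetical names, not a defined schema.

```python
# Hypothetical mapping from a route configuration field to the review trigger it raises.
TRIGGER_FIELDS = {
    "provider_id": "new_model_provider",
    "endpoint_region": "new_endpoint_region",
    "data_classes": "new_data_class",
    "purpose_id": "new_purpose",
    "fallback_route": "new_fallback_route",
    "key_policy": "new_key_control",
}


def review_triggers(old: dict, new: dict) -> list:
    """Return the triggers raised by differences between two route configurations."""
    return [
        trigger
        for field, trigger in TRIGGER_FIELDS.items()
        if old.get(field) != new.get(field)
    ]
```

An empty result means the change raises none of the modeled triggers; it does not prove that no review is needed, because telemetry, subprocessor and support-path changes are not represented here.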

16.2 Review Steps

| Step | Output |
| --- | --- |
| 1. Define use case and business necessity | use case brief and purpose ID |
| 2. Map data classes and subjects | classification table |
| 3. Draw full AI data path | source, RAG, prompt, model, tools, logs, eval, backups, keys |
| 4. Identify processors/subprocessors | vendor inventory and contract refs |
| 5. Assess minimization alternatives | local, masked, aggregated, synthetic options |
| 6. Define technical safeguards | encryption, key, access, logging, deletion, monitoring |
| 7. Define contractual and operational safeguards | vendor terms, support access, incident, exit |
| 8. Record residual risk and approvals | review decision and expiry |
| 9. Convert decision to runtime policy | model gateway, RAG filters, tool rules, log config |
| 10. Test positive and negative routes | evidence pack and launch gate |

17. Evidence Ledger

The evidence ledger is the production proof that policy became runtime behavior.

Ledger events cover residency decisions, RAG retrieval, model route, tool route, log retention, eval sampling, key access, transfer review linkage, vendor changes and kill-switch actions.

17.1 Ledger Schema

| Field | Description |
| --- | --- |
| event_id | Unique ledger event ID. |
| event_time | UTC timestamp. |
| interaction_id | User session, case or workflow ID. |
| subject_type | customer, employee, merchant, representative. |
| subject_jurisdiction | Routing jurisdiction context. |
| product_entity | Booking entity or tenant. |
| purpose_id | Purpose catalog reference. |
| data_classes | Source and derived classes. |
| artifact_type | prompt, RAG chunk, tool payload, log, eval sample. |
| processor_id | Internal or vendor processor. |
| subprocessor_ref | Subprocessor inventory reference. |
| source_region | Where data originated. |
| destination_region | Where data was processed or stored. |
| model_id | Model and version. |
| endpoint_region | Inference endpoint region. |
| key_policy_id | KMS/HSM policy reference. |
| decision | allow, deny, localize, minimize, review, kill switch. |
| reason_code | Structured reason. |
| policy_version | Runtime policy bundle version. |
| review_ref | Transfer/vendor/model review ID. |
| evidence_hash | Hash of evidence manifest where payload is not stored. |
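
The `evidence_hash` field lets the ledger prove what was processed without storing the payload. The following sketch assumes the manifest is a JSON-serializable dict and uses SHA-256 over a canonical JSON encoding; both choices, and the helper names, are illustrative assumptions. Only a subset of the schema fields is populated.

```python
import hashlib
import json
import uuid
from datetime import datetime, timezone


def evidence_hash(manifest: dict) -> str:
    """Hash a manifest deterministically: sorted keys, compact separators."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(canonical).hexdigest()


def ledger_event(interaction_id, decision, reason_code, policy_version, manifest):
    """Build a ledger event that references the manifest by hash only."""
    return {
        "event_id": str(uuid.uuid4()),
        "event_time": datetime.now(timezone.utc).isoformat(),
        "interaction_id": interaction_id,
        "decision": decision,
        "reason_code": reason_code,
        "policy_version": policy_version,
        "evidence_hash": evidence_hash(manifest),
    }
```

Canonical encoding matters: without sorted keys, two logically identical manifests could hash differently and break later verification.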

17.2 Evidence Query

Proof that a specific interaction used an approved route:

SELECT
  event_time,
  interaction_id,
  purpose_id,
  artifact_type,
  processor_id,
  source_region,
  destination_region,
  endpoint_region,
  decision,
  reason_code,
  policy_version,
  review_ref
FROM ai_residency_ledger
WHERE interaction_id = :interaction_id
ORDER BY event_time;

18. Operating Model

18.1 RACI

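| Activity | PM | Senior BA | Architect | Privacy | Legal | Security | Data Gov | Model Risk | Vendor Risk | Ops |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Define AI purpose and customer value | A | R | C | C | C | C | C | C | C | C |
| Build data path map | C | R | A | C | C | R | R | C | C | C |
| Classify source and derived artifacts | C | R | C | C | C | C | A | C | C | C |
| Maintain jurisdiction-purpose-processor matrix | C | R | A | C | C | C | R | C | R | C |
| Approve legal/privacy interpretation | C | C | C | R | A | C | C | C | C | C |
| Design runtime policy controls | C | C | A | C | C | R | R | C | C | C |
| Approve model route | C | C | R | C | C | C | C | A | C | C |
| Approve vendor route | C | C | C | C | C | C | C | C | A | C |
| Operate evidence ledger | C | R | C | C | C | R | A | C | C | R |
| Respond to route incident | C | C | R | R | C | A | R | C | R | R |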

Legend: R = responsible, A = accountable, C = consulted.

18.2 Governance Forums

| Forum | Scope |
| --- | --- |
| AI product review | purpose, user value, customer journey, data need. |
| Architecture review board | data path, region routing, keys, logs, resilience. |
| Privacy/legal review | applicability, notices, transfer review, contractual terms. |
| Model risk committee | model route, eval, monitoring, fallback and risk tier. |
| Vendor risk review | provider, subprocessor, support access, telemetry, exit. |
| Security review | access control, encryption, KMS/HSM, incident response. |
| Operational readiness | runbooks, dashboards, support model, kill switch. |

18.3 Release Gate

| Gate question | Evidence |
| --- | --- |
| Is the purpose approved? | purpose catalog entry |
| Is the data path mapped end to end? | data path diagram and matrix |
| Are processors and subprocessors approved? | vendor inventory and approval refs |
| Are model endpoints and regions allowlisted? | provider region register |
| Are RAG/tool/log/eval controls configured? | policy bundle and tests |
| Are keys and backups aligned with route? | KMS/HSM and DR evidence |
| Is transfer impact review complete where triggered? | review ID and residual risk |
| Are dashboards and KRIs live? | monitoring links and alert rules |
| Is fallback route safe? | degraded-mode test |
| Is exit plan feasible? | vendor exit and deletion runbook |

19. Metrics and KRIs

19.1 Product Metrics

| Metric | What it reveals |
| --- | --- |
| Feature adoption by region | Whether local route supports customer value. |
| Completion rate by route | Whether minimized/local route hurts task completion. |
| Human handoff by residency denial | Friction caused by blocked routes. |
| Latency by model region | Customer experience impact. |
| Cost by sovereign pattern | Unit economics of local/private deployment. |
| Customer complaint rate about data use | Trust and disclosure risk. |

19.2 Control KRIs

| KRI | Signal |
| --- | --- |
| Denied cross-border attempts | Misconfiguration, product drift or abuse. |
| Route mismatch rate | Requests not matching the approved matrix. |
| Unclassified artifact count | Data governance gap. |
| Prompt/log payload policy violations | Observability risk. |
| Eval samples without lineage | Hidden secondary-use risk. |
| Vendor endpoint outside allowlist | Provider route breach. |
| Key access anomalies | Sovereignty and security risk. |
| Transfer review overdue | Governance backlog. |
| Subprocessor change unreviewed | Vendor risk gap. |
| Fallback route activation | Resilience and policy stress. |
| Withdrawal/consent route conflict | Runtime state propagation gap. |
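
Two of these KRIs, route mismatch rate and denied cross-border attempts, can be derived directly from route decision events. The sketch below is one possible way to compute them. The event fields (`subject_jurisdiction`, `purpose`, `processing_region`, `decision`) and the shape of the approved matrix are assumptions made for this example, not a schema defined by the playbook, and treating "cross-border" as a simple code mismatch is a deliberate simplification.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical approved matrix: (subject_jurisdiction, purpose, processing_region).
Route = Tuple[str, str, str]


def control_kris(events: Iterable[Dict[str, str]], approved: Set[Route]) -> Dict[str, float]:
    """Compute two KRIs from route decision events.

    Each event is assumed to carry subject_jurisdiction, purpose,
    processing_region and decision ("allow" or "deny").
    """
    allowed: List[Dict[str, str]] = []
    denied_cross_border = 0
    for e in events:
        if e["decision"] == "deny":
            # Simplification: cross-border means region code differs from jurisdiction code.
            if e["processing_region"] != e["subject_jurisdiction"]:
                denied_cross_border += 1
        else:
            allowed.append(e)
    mismatches = sum(
        1 for e in allowed
        if (e["subject_jurisdiction"], e["purpose"], e["processing_region"]) not in approved
    )
    rate = mismatches / len(allowed) if allowed else 0.0
    return {"route_mismatch_rate": rate, "denied_cross_border_attempts": denied_cross_border}


def _event(jurisdiction: str, purpose: str, region: str, decision: str) -> Dict[str, str]:
    return {"subject_jurisdiction": jurisdiction, "purpose": purpose,
            "processing_region": region, "decision": decision}


approved = {("SG", "dispute_service", "SG"), ("EU", "dispute_service", "EU")}
events = [
    _event("SG", "dispute_service", "SG", "allow"),
    _event("SG", "dispute_service", "SG", "allow"),
    _event("EU", "dispute_service", "EU", "allow"),
    _event("EU", "dispute_service", "US", "allow"),  # executed but not in the matrix
    _event("SG", "marketing", "US", "deny"),         # blocked cross-border attempt
]
kris = control_kris(events, approved)
```

The mismatch rate here is computed over executed (allowed) requests only. Whether denied requests belong in the denominator is a definitional choice that the KRI owner would need to settle.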

19.3 Executive Dashboard

| Theme | Executive question |
| --- | --- |
| Customer trust | Can we explain where customer data goes? |
| Regulatory readiness | Can we prove route and controls for high-risk products? |
| Operational resilience | Can local routes survive provider and region outages? |
| Vendor concentration | Which AI capabilities depend on one provider route? |
| Cost | What is the premium for sovereign routes and is it justified? |
| Risk appetite | Which residual risks have business acceptance? |

20. 30-Day Lab

Goal: within 30 days, produce a presentable AI Data Residency / Cross-Border / Sovereign AI Architecture portfolio pack. Suggested use case: Retail Banking AI Dispute Assistant.

| Days | Focus | Outputs |
| --- | --- | --- |
| 1-5 | Frame use case, subject, product, jurisdiction, purpose and caveats | Use case brief, purpose entry, applicability statement |
| 6-10 | Map source, RAG, prompt, model, tool, log, eval, backup and key paths | End-to-end data path diagram |
| 11-15 | Build data classification, processor inventory and provider region register | Classification table, vendor inventory, endpoint register |
| 16-20 | Design PDP, RAG, tool, log, eval and key controls | Policy spec, corpus manifest, tool policy, key policy |
| 21-25 | Simulate transfer impact review and evidence ledger | Review record, ledger schema, SQL proof query, KRI list |
| 26-30 | Package PM/BA/architect narrative | Executive memo, ADR, requirements, interview answers, portfolio deck |

21. Interview Answers

| Question | 30-second version | 2-minute focus |
| --- | --- | --- |
| How do data residency, cross-border transfer and sovereign AI differ? | Residency asks where AI artifacts are stored, processed and accessed; transfer asks whether they cross a jurisdiction boundary; sovereign AI asks how much local control exists. | Break it down by prompt, RAG, tool, log, eval, backup, telemetry, keys and support access; state that applicability is determined by Legal/Privacy/Compliance based on jurisdiction, subject, product, vendor and contract. |
| Why is a cloud region not enough? | The AI data path extends beyond the database region. | Give examples: prompts sent to an external model, global observability, offshore labeling, vendor support tickets; control them with a residency PDP and an evidence ledger. |
| How would you design cross-border RAG? | Each corpus carries a source region, allowed purposes, data classes, embedding/index region, ACL and deletion propagation. | Filter before retrieval by jurisdiction, purpose, role, object scope and consent/authorization; run eval on synthetic or region-local samples. |
| What should the matrix contain? | Subject jurisdiction, product entity, purpose, data class, processor, endpoint, tool/log/eval/key region, decision, safeguards and review ref. | It is the bridge from BA work to runtime policy; a new vendor, data class, eval route or fallback route triggers a review. |
| How do you control the provider region? | Every model request goes through a centralized model gateway. | The provider register records model, endpoint, allowed data classes, retention, no-training, telemetry, subprocessors, key policy, fallback and approval. |
| What does key residency affect? | Encryption does not replace route control; key location and control affect the sovereignty claim. | Put KMS/HSM region, BYOK/HYOK, rotation, break-glass, backup encryption and exit destruction into the matrix. |
| How do you govern eval data? | The eval pipeline is a separate processing path. | Default to synthetic first; real failure samples need classification, redaction, a local queue, reviewer control, lineage and retention evidence. |
| When do you choose a sovereign/private route? | When sensitive data, a strict jurisdiction, key control, support access, resilience or trust calls for stronger control. | Trade off data sensitivity, purpose criticality, quality, latency, cost, operator control, audit evidence, fallback and exit. |
| How do you prove that no unapproved boundary was crossed? | The evidence ledger records RAG, model, tool, log, eval and key decisions by interaction_id. | Evidence can store manifests, source IDs, hashes, policy version and review refs, which reduces raw payload overcollection. |
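
The last answer, storing hashes and references instead of raw payloads, can be illustrated with a short sketch. It assumes a hypothetical ledger event shape. Only `interaction_id` comes from the playbook; the other field names and the `ledger_record` helper are invented for this example.

```python
import hashlib
from typing import Any, Dict, Iterable


def ledger_record(
    interaction_id: str,
    event_type: str,
    payload: str,
    source_ids: Iterable[str],
    policy_version: str,
    review_ref: str,
) -> Dict[str, Any]:
    """Build an evidence entry that references the payload without storing it."""
    # SHA-256 digest of the payload; the raw text itself is not kept in the record.
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {
        "interaction_id": interaction_id,
        "event_type": event_type,
        "payload_sha256": digest,
        "source_ids": sorted(source_ids),
        "policy_version": policy_version,
        "review_ref": review_ref,
    }


# Illustrative values only.
record = ledger_record(
    "int-001",
    "rag_retrieval",
    "customer disputes card charge",
    ["doc-2", "doc-1"],
    "policy-v3",
    "TIR-42",
)
```

A plain hash is not anonymization: short or predictable payloads can be guessed and re-hashed. A keyed hash such as HMAC with a key held in the approved region may be a better fit, subject to review by Security and Privacy.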

22. Portfolio Deliverables

| Deliverable | What good looks like |
| --- | --- |
| Executive memo | Explains why residency is product trust, architecture and risk control. |
| Use case brief | Customer value, scope, non-scope, jurisdiction assumptions and caveats. |
| Data classification matrix | Source and derived AI artifacts classified. |
| Data path diagram | Source, RAG, prompt, model, tools, logs, eval, backup and keys shown. |
| Jurisdiction-purpose-processor matrix | Route decisions, safeguards, evidence refs and review expiry. |
| Sovereign pattern ADR | Compares local SaaS, private cloud, sovereign cloud, on-prem and hybrid. |
| Provider region register | Model endpoints, retention, telemetry, subprocessors, fallback and approvals. |
| Transfer impact review sample | Route, necessity, safeguards, residual risk and approval trail. |
| Evidence ledger schema | Event types, fields and one SQL proof query. |
| KRI dashboard spec | Denied attempts, route mismatch, eval lineage, key anomalies, vendor changes. |

Portfolio storyline:

I treated data residency as a runtime AI architecture problem.
I mapped every artifact from user message to RAG, model, tools,
logs, eval, backup and keys, then translated jurisdiction,
purpose, processor and data class into executable policy.

23. Production Readiness Checklist

  • Every AI capability has approved purpose, data class and route.
  • Full data path covers RAG, prompts, tools, logs, eval, vendor telemetry, backups and keys.
  • Jurisdiction-purpose-processor matrix and provider region register are reviewed.
  • RAG corpus manifests include allowed purposes, regions, ACL and deletion propagation.
  • Tool gateway enforces purpose, object scope, region and payload minimization.
  • Logs and eval samples avoid raw payload unless justified and controlled.
  • Vendor subprocessors, support access, telemetry and exit path are inventoried.
  • Transfer impact review triggers on route, vendor, data class, purpose, eval or key changes.
  • Encryption and key residency align with data path and evidence requirements.
  • Fallback routes, kill switch, KRIs and evidence ledger are tested.
  • Named owners exist across Product, BA, Architecture, Legal, Privacy, Compliance, Security, Data Governance, Model Risk, Vendor Risk and Ops.
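
The checklist item on transfer impact review triggers can be expressed as a comparison between the approved route record and a proposed change. This is a minimal sketch, assuming a flat record with one hypothetical field per trigger named in the checklist. `TRIGGER_FIELDS`, `review_triggers` and the sample values are illustrative only.

```python
from typing import Dict, List

# Assumed field names, one per trigger named in the checklist.
TRIGGER_FIELDS = ("route", "vendor", "data_class", "purpose", "eval_route", "key_region")


def review_triggers(approved: Dict[str, str], proposed: Dict[str, str]) -> List[str]:
    """Return the trigger fields whose value differs between the two records."""
    return [f for f in TRIGGER_FIELDS if approved.get(f) != proposed.get(f)]


# Illustrative values only.
approved_route = {
    "route": "sg-local-model",
    "vendor": "vendor-a",
    "data_class": "financial_account",
    "purpose": "dispute_service",
    "eval_route": "sg-synthetic",
    "key_region": "SG",
}
proposed_route = dict(approved_route, vendor="vendor-b", key_region="EU")

changed = review_triggers(approved_route, proposed_route)
```

Any non-empty result would open a review in this sketch. Which changes actually require a transfer impact review remains a Legal, Privacy and Compliance decision.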

24. Closing View

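For retail financial AI, data residency maturity is not the ability to say "we use region X." It is the ability to prove which data was used, why it was used, which processors and subprocessors it passed through, in which regions it was processed and logged, which keys were used, which routes were allowed or denied, and which evidence can be independently reviewed.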

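The real goal is not to lock all AI into one place. It is to make every AI data use something that can be designed, restricted, routed, minimized, monitored, revoked, exited and proven.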