
AI Data Residency / Cross-Border / Sovereign AI Playbook

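This document is learning, portfolio and architecture-training material. It does not constitute legal, privacy, compliance, audit, regulatory, tax, outsourcing, data protection, model risk management or security advice.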


AI Data Residency / Cross-Border / Sovereign AI Architecture Playbook

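Audience: retail financial services AI Product Managers, Senior BAs, CBAP-level learners, Product Architects, Data Architects, Security Architects, Privacy Architects, Model Risk, Third-Party Risk and Compliance Product Owners.

Core question: how to turn data residency, cross-border transfer and sovereign AI from legal and risk language into AI product requirements, runtime architecture, vendor controls, an evidence chain and portfolio assets.

Scope boundary: this playbook does not cover cloud compliance in general. It focuses on AI RAG, tool calling, prompt/log/eval, model providers, vendor telemetry, encryption key residency, transfer impact review and sovereign deployment patterns.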

0. Disclaimer


In a real project, Legal, Privacy, Compliance, Security, Data Governance, Model Risk, Third-Party Risk, Enterprise Architecture, Product, Operations, Customer Experience and the accountable business owner must jointly confirm the applicable requirements.

Do not read any source anchor as a universal legal requirement.

Applicability depends on:

jurisdiction and regulator context.
data subject: customer, prospect, employee, merchant, representative, household, business entity.
customer segment: retail, wealth, SME, corporate, vulnerable customer, minor where applicable.
product: deposit, card, lending, insurance, brokerage, wallet, loyalty, open banking.
data class: personal data, financial account data, credit data, card data, KYC/AML, complaint, employee data.
purpose: service, fraud, compliance, marketing, advice, eval, model monitoring, vendor support.
vendor and processor/subprocessor chain.
contract, transfer mechanism, outsourcing terms and customer disclosure.
actual runtime data path, including logs, eval, support access and backups.

1. Executive Framing

The goal of an AI data residency architecture is not to pick a region in a cloud console.

Its goal is to let the organization answer, and prove, at runtime:

This AI use of this data followed the approved jurisdiction,
purpose, processor, model, tool, log, eval, backup and key route.

In retail financial AI, residency risk comes from context expansion.

A single customer question may pass through:

mobile app or contact center.
identity and consent service.
customer profile.
account and transaction systems.
card dispute or lending system.
RAG corpus and vector index.
prompt assembly service.
external or internal model endpoint.
tool gateway.
logging and tracing platform.
eval sampling pipeline.
human review queue.
vendor telemetry.
backup and disaster recovery system.
KMS/HSM and break-glass workflow.

A team that asks only "where is the database?" will miss most AI processing paths.

Senior-level judgment:

Data residency is a runtime architecture property. Cloud region is only one input.

This playbook produces:

data residency decision tree.
data classification model.
jurisdiction-purpose-processor matrix.
cross-border AI data path map.
RAG/tool/log/eval/vendor control design.
sovereign deployment pattern comparison.
model/provider region control register.
encryption and key residency design.
transfer impact review workflow.
evidence ledger schema.
operating model and RACI.
metrics and KRIs.
30-day lab.
interview answers.
portfolio deliverables.

2. Source Anchors

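The official sources below serve as anchors for concepts and control design. The access date is recorded as 2026-06-30.

Anchor	Official source	Where it is used in this playbook
NIST Privacy Framework	https://www.nist.gov/privacy-framework	Uses Identify-P, Govern-P, Control-P, Communicate-P and Protect-P to organize privacy risk, processing context, data minimization, communication and evidence.
NIST AI RMF	https://www.nist.gov/itl/ai-risk-management-framework	Uses Govern / Map / Measure / Manage to bring residency, vendor, model route, eval and monitoring into the AI risk lifecycle.
FTC Safeguards Rule	https://www.ftc.gov/business-guidance/resources/ftc-safeguards-rule-what-your-business-needs-know	Uses customer information protection, access control, service provider oversight, risk assessment and the information security program to constrain financial AI data paths.
CFPB Personal Financial Data Rights	https://www.consumerfinance.gov/personal-financial-data-rights/	Supports the product discussion of open banking, customer authorization, third-party access, revocation, API scope and customer control.
EDPB International Transfers	https://www.edpb.europa.eu/our-work-tools/our-documents/topic/international-transfers_en	Serves as the index anchor for international data transfers, transfer assessments, supplementary measures and regulatory interpretation.
ISO/IEC 42001	https://www.iso.org/standard/42001	Uses AI management system language to establish policy, roles, processes, vendors, monitoring, evidence and continuous improvement.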

Standards-to-artifact:

Source lens	Architecture artifact	Interview phrasing
NIST Privacy Framework	Processing context and privacy control map	“I turn privacy risk into data paths and control evidence.”
NIST AI RMF	AI residency risk register and monitoring dashboard	“I use Map / Measure / Manage to verify cross-border paths and vendor paths.”
FTC Safeguards Rule	Customer information safeguard matrix	“Protection of financial customer information must cover AI prompts, tools, logs, vendors and support access.”
CFPB data rights	Open banking authorization and revocation model	“Customer-authorized data sharing and internal AI secondary use must be designed separately.”
EDPB transfer guidance	Transfer impact review pack	“Cross-border paths need a record of route, necessity, safeguards and residual risk.”
ISO/IEC 42001	AI management system operating model	“I put data residency into AIMS policy, roles, audit and continuous improvement.”

3. Operating Principles

Principle	Meaning	Architecture behavior
Path over place	Look at the complete AI data path, not only the database region	Map prompt, RAG, tools, logs, eval, backup, key and support access.
Purpose before transfer	Confirm the AI purpose first, then assess the route	Service, fraud, marketing, eval and vendor support must not be mixed.
Least data movement	Process locally where possible; send a summary instead of the original text where possible	local pre-processing, masking, summarization and aggregation.
Policy as code	Do not leave residency rules in a wiki	PDP, model gateway, RAG metadata, tool gateway and release gate.
Derived artifacts inherit risk	Embeddings, summaries, labels and eval samples may still be restricted	classification and lineage for derived AI artifacts.
Keys are part of sovereignty	Local data with an externally held key is not strong sovereignty	local KMS/HSM, key owner, access log and break-glass control.
Evidence by design	Evidence reconstructed after the fact usually fails	decision log, route manifest, approval ID and ledger schema.
Degraded mode is governed	An outage fallback must not cross borders arbitrarily either	region-safe fallback, capability downgrade and kill switch.

4. Data Residency Decision Tree

Use this decision tree as the entry point for requirements clarification and architecture review.

1. Who is the subject?
   -> customer / prospect / employee / merchant / representative / organization

2. Which jurisdiction and product entity apply?
   -> subject location, booking entity, branch, channel, product terms, contract

3. What is the AI purpose?
   -> service / fraud / compliance / marketing / advice / open banking / eval / operations

4. What data classes are used?
   -> public / internal / PII / account / transaction / credit / card / KYC / complaint / employee

5. Which AI artifacts are created?
   -> prompt / RAG chunk / embedding / output / tool payload / log / trace / eval sample / memory

6. Which processors and subprocessors touch them?
   -> cloud / model provider / vector DB / observability / annotation / support / DR provider

7. Which regions and access paths exist?
   -> compute, storage, logs, backup, admin access, support access, key access

8. Is cross-border processing necessary for the purpose?
   -> yes, with review / no, use local route / minimize / pseudonymize / aggregate / deny

9. Which safeguards and evidence are required by policy?
   -> contract, encryption, key residency, access control, monitoring, transfer review, audit trail

10. What is the runtime decision?
   -> allow / deny / localize / minimize / pseudonymize / require review / human approval

Decision outputs:

Output	Meaning
`allow_local`	Process inside approved local or sovereign boundary.
`allow_regional`	Process in approved regional route with evidence.
`allow_cross_border_with_controls`	Transfer path approved with documented safeguards and residual risk.
`minimize_before_transfer`	Redact, summarize, tokenize, aggregate or pseudonymize before route.
`deny_route`	Data/purpose/vendor/region combination not approved.
`review_required`	Legal, Privacy, Security, Vendor Risk or Model Risk review needed.
`human_approval_required`	Manual approval before enabling production route.
`degrade_capability`	Disable high-risk context and provide lower-risk answer.
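
To make the tree easier to discuss in a review, here is a minimal Python sketch that collapses several of the ten questions into boolean inputs and returns one of the decision outputs from the table above. The `RouteRequest` fields, the order of the checks and the choice to return `deny_route` when a transfer is not necessary are illustrative assumptions, not an approved policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteRequest:
    # Hypothetical, simplified inputs; a real PDP would read the matrix.
    purpose_approved: bool      # question 3: purpose is in the catalog
    processor_approved: bool    # question 6: processor chain is approved
    inside_boundary: bool       # question 7: every region is local
    transfer_necessary: bool    # question 8: purpose needs the transfer
    review_on_file: bool        # question 9: transfer review is recorded
    restricted_data: bool       # question 4: account, card, credit, KYC...


def decide(req: RouteRequest) -> str:
    """Return one decision output for a proposed route."""
    if not (req.purpose_approved and req.processor_approved):
        return "deny_route"
    if req.inside_boundary:
        return "allow_local"
    if not req.transfer_necessary:
        # Assumption: an unnecessary transfer is refused outright.
        return "deny_route"
    if not req.review_on_file:
        return "review_required"
    if req.restricted_data:
        return "minimize_before_transfer"
    return "allow_cross_border_with_controls"
```

The early returns put the deny conditions first, so an unapproved purpose or processor can never reach an allow branch.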

5. Data Classification for AI Residency

Classification must cover source data and AI-derived artifacts.

5.1 Source Data Classes

Class	Examples	Default architecture posture
Public	public product FAQ, published fees, branch hours	Low restriction, still validate source integrity.
Internal	procedure manuals, internal training, non-sensitive metrics	Region preference by enterprise policy.
Confidential business	pricing strategy, roadmap, partner terms	Approved internal or contractual route only.
PII	name, address, email, phone, customer identifier	Purpose-bound processing, minimization and access control.
Financial account	account number, balance, transactions, statements	Strict purpose, scoped retrieval, masked logs.
Payment card	PAN, CVV, token, dispute evidence	Specialized controls, avoid prompt/log exposure.
Credit and underwriting	credit score, bureau data, adverse action info	High-risk route, human review and explainability boundary.
KYC / AML / fraud	identity verification, sanctions, suspicious activity	Need-to-know route, restricted disclosure.
Complaint and vulnerable customer	complaint narrative, hardship, vulnerability indicators	Enhanced access, retention and harm controls.
Employee data	performance, HR, monitoring, workforce notes	Workforce notice, role-based and monitoring minimization.

5.2 AI-Derived Artifact Classes

Artifact	Why it matters	Control
Prompt	May contain raw customer data and instructions	prompt manifest, masking, region routing
Completion	May reveal or infer sensitive data	output classification and retention
Embedding	May encode restricted source content	index residency, deletion propagation
RAG chunk	Carries source data and ACL	corpus manifest and purpose filters
Tool payload	Can move operational data across systems	tool gateway and scoped token
Tool result	Often contains account or transaction data	minimization before prompt/log
Trace	Captures chain-of-thought-like workflow metadata or tool plans	structured trace without raw payload where feasible
Feedback	User correction or thumbs-down note can include sensitive content	classification and redaction
Eval sample	Production data reused for QA or regression	approval, anonymization, synthetic preference
Fine-tuning sample	Strongest secondary-use risk	separate approval and data lineage
AI memory	Persistent user or employee state	explicit purpose, TTL, deletion workflow
Vendor telemetry	Metadata, errors, safety flags, support packets	vendor controls and telemetry inventory

5.3 Classification Rules

Rule	Product/architecture implication
Derived artifact inherits the highest relevant source restriction unless approved otherwise.	Embeddings from restricted documents stay restricted.
Logs are data products, not exhaust.	Observability needs classification, retention and access control.
Aggregation can reduce risk but does not automatically remove obligations.	Confirm threshold, re-identification risk and purpose.
Pseudonymization is a safeguard, not a magic deletion of risk.	Keep linkage key governance and re-identification controls.
Data class can change after model output.	Generated complaint summary may become complaint record.
Human review queue is a processing location.	Reviewer location and access matter.
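
The first rule (a derived artifact inherits the highest relevant source restriction unless approved otherwise) can be sketched as a small function. The ranking in `RESTRICTION_RANK` and the class names are hypothetical; a real ordering would come from Data Governance.

```python
# Hypothetical ranking: a higher number means a more restrictive class.
RESTRICTION_RANK = {
    "public": 0,
    "internal": 1,
    "confidential_business": 2,
    "pii": 3,
    "financial_account": 4,
    "payment_card": 5,
}


def derived_restriction(source_classes, approved_override=None):
    """Classify a derived artifact (embedding, summary, eval sample)."""
    if approved_override is not None:
        # "Unless approved otherwise": an explicit, recorded approval wins.
        return approved_override
    if not source_classes:
        # No lineage means the artifact cannot be classified safely.
        raise ValueError("unclassified artifact: no source classes recorded")
    # An unknown class raises KeyError instead of being treated as low risk.
    return max(source_classes, key=RESTRICTION_RANK.__getitem__)
```

For example, an embedding built from public, internal and PII sources would be classified as `pii` under this ranking.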

6. Jurisdiction / Purpose / Processor Matrix

This matrix is the core BA artifact.

It converts legal/privacy/vendor review into an executable architecture table.

6.1 Matrix Fields

Field	Description
`matrix_id`	Stable ID for route approval.
`subject_type`	customer, prospect, employee, merchant, representative.
`subject_jurisdiction`	Location or legal context used for routing.
`product_entity`	Booking entity, branch, tenant or regulated affiliate.
`customer_segment`	retail, wealth, SME, vulnerable, employee, minor where relevant.
`purpose_id`	Purpose catalog reference.
`data_classes`	Source and derived data classes.
`processor`	Internal platform, cloud, model provider, vector DB, observability vendor.
`subprocessors`	Downstream providers where known and approved.
`model_endpoint_region`	Region where inference runs.
`tool_destinations`	Systems and regions receiving tool calls.
`log_region`	Where operational logs and evidence are stored.
`eval_region`	Where eval samples and labels are stored.
`key_region`	KMS/HSM and key control location.
`approved_decision`	allow, localize, minimize, deny, review required.
`safeguards`	Encryption, access control, contracts, minimization, monitoring.
`evidence_ref`	Transfer review, DPIA/PIA where applicable, vendor review, risk acceptance.
`expiry_review_date`	When approval must be recertified.

6.2 Example Matrix

Subject	Product	Purpose	Data	Processor route	Decision
EU retail customer	EU card	`payment_dispute_support`	transaction, dispute evidence	EU app, EU RAG, EU model endpoint, EU logs, EU keys	`allow_local`
US customer	US deposit	`customer_service_account_help`	account, transaction summary	US app, US model endpoint, US logs, US keys	`allow_regional`
EU customer	marketing	`marketing_personalization`	segment-level features	EU feature store, approved campaign tool	`minimize_before_transfer` if external creative model used
Global employee	policy search	`employee_productivity_copilot`	internal policy docs	regional RAG, central observability with masked logs	`allow_cross_border_with_controls` subject to policy
Open banking user	data sharing	`open_banking_data_sharing`	account and transaction API data	API gateway to authorized third party	`allow_with_scope`
Wealth client	RM copilot	`rm_meeting_preparation`	portfolio, notes, suitability	local/private route, no external raw prompt	`allow_local` or `review_required`
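
One way to make the matrix executable is a keyed lookup that denies by default and forces recertification once an approval expires. The sketch below assumes a key of (`subject_jurisdiction`, `product_entity`, `purpose_id`); the `mx-001` ID and the expiry date are invented for illustration.

```python
from datetime import date

# Hypothetical single-row matrix keyed by (jurisdiction, entity, purpose).
MATRIX = {
    ("EU", "bank_eu_card", "payment_dispute_support"): {
        "matrix_id": "mx-001",
        "approved_decision": "allow_local",
        "model_endpoint_region": "EU",
        "expiry_review_date": date(2026, 12, 31),
    },
}


def lookup_route(jurisdiction, product_entity, purpose_id, today):
    """Return the approved decision, or a safe default."""
    row = MATRIX.get((jurisdiction, product_entity, purpose_id))
    if row is None:
        return "deny_route"        # no approved row: deny by default
    if today > row["expiry_review_date"]:
        return "review_required"   # approval must be recertified
    return row["approved_decision"]
```

Passing `today` as an argument, instead of reading the clock inside the function, keeps the expiry behaviour easy to test.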

6.3 Compatibility Rules

Proposed reuse	Default stance	Reason
Service transcript to marketing AI	deny or separate review	purpose mismatch and customer expectation risk
Fraud signal to customer-facing explanation	minimize and human review	disclosure and abuse risk
Open banking authorization to internal model training	deny unless separately approved	secondary use risk
Complaint data to eval set	conditional with minimization and evidence	quality purpose may be valid but high sensitivity
Employee copilot logs to productivity scoring	deny or separate workforce review	monitoring and fairness risk
Regional model fallback to global endpoint	deny unless explicit fallback route approved	hidden cross-border transfer risk

7. Cross-Border AI Architecture

7.1 Reference Architecture

User / Employee / API Client
  -> channel ingress and region resolver
  -> identity, consent and authorization
  -> purpose catalog and data classification
  -> residency policy decision point
  -> AI orchestrator
       -> prompt assembly service
       -> RAG gateway
       -> tool gateway
       -> model/provider region gateway
       -> memory service
       -> logging and evidence gateway
       -> eval sampling gateway
       -> vendor telemetry controller
  -> policy decision log
  -> evidence ledger
  -> dashboards and recertification workflows

7.2 Residency PDP Inputs

Input	Example
Subject	customer `cust_123`, employee `rm_789`
Subject jurisdiction	EU, US state, UK, SG, CA, booking entity context
Product entity	bank_eu_card, bank_us_deposit, wealth_sg
Purpose	`payment_dispute_support`
Consent/authorization	grant, withdrawal, open banking token, employee notice
Data class	transaction, credit, card, complaint, employee
Action	retrieve, summarize, draft, submit, log, evaluate, train
Artifact	prompt, embedding, tool payload, log, eval sample
Processor	internal, cloud, model provider, observability, annotation
Endpoint region	eu-west, us-east, sovereign cloud region, on-prem
Key policy	local KMS, external KMS, HSM, BYOK, HYOK
Contract route	approved processor and subprocessor chain

7.3 Residency PDP Decisions

Decision	Meaning
allow	Route matches approved matrix and controls.
deny	Route conflicts with jurisdiction, purpose, data class or vendor policy.
localize	Use local model, local RAG, local logs and local keys.
minimize	Redact, summarize, aggregate, tokenize or pseudonymize before processing.
split_route	Use local processing for restricted fields and external route for public context.
synthetic_only	Use synthetic or de-identified eval data instead of production data.
review_required	Trigger Legal/Privacy/Security/Model Risk/Vendor Risk workflow.
contract_review_required	Vendor, subprocessor, retention or support access changed.
step_up_approval	Human approval required before high-impact action or transfer.
kill_switch	Stop capability or route due to policy breach or unresolved risk.
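
As an illustration of the `split_route` decision, the sketch below partitions a payload so that restricted fields stay on the local route and only the remaining context is eligible for an external route. The field names in `RESTRICTED_FIELDS` are assumptions made for this example.

```python
# Hypothetical set of fields that must never leave the local route.
RESTRICTED_FIELDS = frozenset({"account_number", "pan", "credit_score"})


def split_route(payload):
    """Split a payload into (local_part, external_part)."""
    local_part = {k: v for k, v in payload.items() if k in RESTRICTED_FIELDS}
    external_part = {k: v for k, v in payload.items() if k not in RESTRICTED_FIELDS}
    return local_part, external_part
```

A field-name denylist is the weakest form of this control, because any restricted field missing from the set goes external. An allowlist of fields approved for the external route would fail closed instead.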

8. Cross-Border RAG Architecture

8.1 RAG Data Path

Source system
  -> classification and jurisdiction labeling
  -> corpus manifest
  -> chunking in approved region
  -> embedding in approved region
  -> vector index with ACL and purpose metadata
  -> retrieval with subject/purpose/region filters
  -> prompt assembly with minimized context
  -> inference through approved model route
  -> retrieval trace and evidence ledger

8.2 Corpus Manifest

Field	Example
`corpus_id`	`eu_card_dispute_policy_v3`
`source_system`	policy management system
`source_region`	EU
`allowed_jurisdictions`	EU product entities
`allowed_purposes`	payment_dispute_support, complaint_resolution
`data_classes`	public policy, internal procedure, customer case where applicable
`contains_personal_data`	no / yes with constraints
`embedding_region`	EU
`vector_index_region`	EU
`retrieval_acl`	role, tenant, product, case assignment
`deletion_propagation`	source deletion to index tombstone
`evidence_level`	standard or enhanced

8.3 RAG Controls

Risk	Control	Evidence
Wrong region corpus retrieved	region metadata filter	retrieval trace with corpus ID
Purpose mismatch	purpose allowlist per corpus	denied retrieval decision
ACL mismatch	source ACL mirrored in vector index	positive and negative tests
Restricted customer data embedded globally	local embedding pipeline	embedding job region log
Revoked data remains retrievable	tombstone and purge workflow	deletion propagation evidence
Prompt overexposure	chunk budget and redaction	prompt manifest
Eval sampling leak	synthetic or local eval queue	eval lineage record
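
The first three controls, plus the tombstone check, can be sketched as a filter that runs before prompt assembly and emits a retrieval trace. The `Chunk` shape reuses corpus manifest field names; the reason strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    corpus_id: str
    vector_index_region: str
    allowed_purposes: frozenset
    retrieval_acl: frozenset     # roles allowed to retrieve this chunk
    tombstoned: bool = False     # source deleted or consent revoked


def filter_chunks(chunks, region, purpose_id, role):
    """Return (allowed_chunks, trace) for one retrieval request."""
    allowed, trace = [], []
    for chunk in chunks:
        if chunk.tombstoned:
            reason = "tombstoned"
        elif chunk.vector_index_region != region:
            reason = "region_mismatch"
        elif purpose_id not in chunk.allowed_purposes:
            reason = "purpose_mismatch"
        elif role not in chunk.retrieval_acl:
            reason = "acl_mismatch"
        else:
            reason = "allowed"
            allowed.append(chunk)
        # The trace stores corpus IDs and reasons, never chunk text.
        trace.append((chunk.corpus_id, reason))
    return allowed, trace
```

Recording the denied chunks as well as the allowed ones gives the negative-test evidence the table asks for.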

9. Cross-Border Tool Architecture

Tool calling is often riskier than model inference because tools touch systems of record.

9.1 Tool Gateway Pattern

AI orchestrator
  -> tool intent
  -> policy decision point
  -> scoped token issuer
  -> payload minimizer
  -> region-aware tool gateway
  -> system of record
  -> tool result classifier
  -> prompt/log minimizer
  -> evidence ledger

9.2 Tool Control Table

Tool	Data path risk	Control
`transactions.read`	account and transaction data may cross region	local API endpoint, scoped token, masked result
`card_dispute.create_draft`	dispute narrative and evidence	local case system, customer confirmation
`crm.note.write`	persistent customer record	purpose check, role check, output classifier
`marketing.offer.generate`	customer profile to creative model	segment-level input, preference suppression
`open_banking.token.revoke`	customer-authorized data sharing	authorization scope, immediate revocation
`fraud.case.triage`	restricted fraud signals	need-to-know access, no external raw prompt
`employee.hr.lookup`	workforce data	employee policy route and high restriction
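
The payload minimizer step in the gateway pattern can be sketched as a per-tool field allowlist. The tool names come from the table above; the field names in `TOOL_FIELD_ALLOWLIST` are hypothetical.

```python
# Hypothetical allowlist: the only fields each tool may receive.
TOOL_FIELD_ALLOWLIST = {
    "transactions.read": {"account_token", "date_from", "date_to"},
    "card_dispute.create_draft": {"case_id", "transaction_ids", "narrative"},
}


def minimize_payload(tool_name, payload):
    """Return (kept_fields, dropped_field_names) for a tool call."""
    allowed = TOOL_FIELD_ALLOWLIST.get(tool_name)
    if allowed is None:
        # Unregistered tools are refused, not passed through.
        raise PermissionError(f"tool not registered: {tool_name}")
    kept = {k: v for k, v in payload.items() if k in allowed}
    dropped = sorted(set(payload) - allowed)
    return kept, dropped
```

Returning the dropped field names, without their values, lets the evidence ledger show what the model tried to send.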

10. Logs, Traces, Eval and Human Review

10.1 Logging Architecture

Log object	Recommended content	Avoid by default
Operational metrics	latency, cost, route, error code	raw prompt or account data
Policy decision log	purpose, data class, route, decision, reason	full customer payload
Prompt manifest	template ID, source IDs, masking flags	full retrieved chunks
Tool trace	tool name, object scope, region, decision	full tool result
Evidence vault	encrypted payload when justified	broad engineering access
Vendor log	endpoint, region, retention class	uncontrolled provider debug packets
Security audit log	identity, access, denial, anomaly	sensitive content beyond need
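
A prompt manifest in the sense of this table can be sketched as a record that identifies the prompt without storing it. The field names beyond `template_id` are assumptions. A hash of a short or predictable prompt can be recovered by guessing, so the digest is a linkage aid and not a confidentiality control.

```python
import hashlib


def prompt_manifest(template_id, source_ids, masked_fields, prompt_text):
    """Build a loggable manifest that omits the raw prompt."""
    digest = hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()
    return {
        "template_id": template_id,
        "source_ids": sorted(source_ids),
        "masking_flags": sorted(masked_fields),
        "prompt_sha256": digest,          # links to the evidence vault copy
        "prompt_chars": len(prompt_text), # size only, no content
    }
```

The manifest can go to the operational log, while the encrypted payload, when its retention is justified, stays in the evidence vault.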

10.2 Eval Architecture

Eval is a product and risk control, but it can create hidden data reuse.

Eval type	Data residency concern	Preferred pattern
Synthetic regression	no direct production data	use for baseline coverage
Golden set from production	production data copied into eval store	local store, approval and minimization
Red-team prompts	may include sensitive scenarios	synthetic or sanitized scenarios
Human labels	reviewer location and access	region-approved review queue
Vendor eval service	third-party processing path	vendor review and payload minimization
Fine-tuning data	strong secondary use	separate approval and lineage

11. Vendor and Processor Architecture

11.1 Vendor Inventory

Vendor type	AI residency question
Cloud provider	Where are compute, storage, backup, admin access and support processed?
Model provider	Which endpoint region, retention policy, training use and subprocessors apply?
Vector database	Where are embeddings and indexes stored and replicated?
Observability vendor	Does telemetry include prompt, completion, tool payload or identifiers?
Annotation vendor	Where are reviewers and work queues located?
Security vendor	Are logs or payloads inspected outside approved regions?
Data enrichment vendor	Is customer data sent for enrichment or matching?
Customer support platform	Are AI transcripts stored in global tenant?

11.2 Vendor Route Decision

If provider endpoint region is approved
and retention setting matches policy
and no-training boundary is active
and subprocessor chain is approved
and logs/telemetry are minimized
and key policy matches matrix
then route may be enabled.
Otherwise route is denied, localized, minimized or sent to review.
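
The rule above translates almost directly into code. In this sketch the check names are hypothetical labels for the six conditions, and a missing check counts as a failure.

```python
# One label per condition in the rule above (names are assumptions).
REQUIRED_CHECKS = (
    "endpoint_region_approved",
    "retention_matches_policy",
    "no_training_active",
    "subprocessors_approved",
    "telemetry_minimized",
    "key_policy_matches_matrix",
)


def vendor_route_decision(checks):
    """Return ("may_enable", []) or ("not_enabled", failed_checks)."""
    failed = [name for name in REQUIRED_CHECKS if checks.get(name) is not True]
    if failed:
        # The caller decides: deny, localize, minimize or send to review.
        return "not_enabled", failed
    return "may_enable", []
```

The list of failed checks tells the reviewer which condition blocked the route.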

12. Sovereign Deployment Patterns

Sovereign AI can mean different operating models. Define it before using it in strategy or marketing.

Pattern	Description	Use when	Trade-off
Local SaaS region	Managed provider in approved local region	moderate sensitivity and fast launch	provider control remains material
Regional private cloud	Dedicated tenant or private deployment in region	higher control and regulated workload	higher cost and operations effort
Sovereign cloud	Cloud operated under jurisdiction-specific controls	public sector or strict regulated route	service catalog may be narrower
On-prem model serving	Model hosted in bank-controlled data center	highest control and restricted data	model quality, scaling and patching burden
Hybrid split route	Restricted data local, public context external	balance quality and residency	complex orchestration and evidence
Edge/local inference	small model near channel or branch	low latency and offline mode	limited model capability
Confidential computing	workload protected in TEE	reduce exposure to infrastructure operator	attestation and side-channel governance
Local RAG plus external reasoning	local retrieval/minimization, external model receives summary	reduce data movement	summary quality and leakage risk

12.1 Pattern Selection Criteria

Criterion	Question
Data sensitivity	Is raw customer, credit, card, KYC, complaint or employee data needed?
Purpose criticality	Is this service, fraud, regulated advice, marketing or eval?
Latency	Can local route meet SLA?
Model quality	Does local model meet task accuracy and language needs?
Cost	Is sovereign route economically sustainable?
Evidence	Can route, key, log and operator control be proven?
Vendor exit	Can model/provider be replaced without data lock-in?
Resilience	What happens if local provider or region is down?

13. Model / Provider Region Controls

13.1 Model Gateway

AI orchestrator
  -> model request classifier
  -> data class and purpose policy
  -> provider/model region registry
  -> route selection
  -> payload minimizer
  -> provider endpoint
  -> response classifier
  -> log/evidence gateway

13.2 Provider Region Register

Field	Example
`provider_id`	provider_x
`model_id`	model_x_large_2026_05
`endpoint_region`	EU, US, UK, SG, sovereign-region-1
`supported_data_classes`	public, internal, masked PII, transaction summary
`blocked_data_classes`	raw PAN, credit bureau, AML notes
`allowed_purposes`	customer_service_account_help, public_product_education
`training_use`	no training, opt-out, separate agreement
`retention_policy`	zero retention or configured retention class
`telemetry_policy`	metadata only, redacted, disabled where available
`subprocessor_ref`	approved vendor record
`key_policy`	provider-managed, BYOK, HYOK, local HSM
`fallback_route`	local smaller model or deny
`approval_ref`	model risk and vendor risk decision
`review_expiry`	date for recertification

13.3 Routing Rules

Request	Route
Public product explanation	global or regional model allowed if product policy allows.
Authenticated account explanation	approved regional model with masked account fields.
Card dispute draft	local/regional model, no raw PAN, controlled logs.
Credit adverse action explanation	high-control route, human review, explainability artifacts.
AML suspicious activity triage	restricted internal route, no external raw prompt.
Marketing creative generation	segment-level prompt, preference suppression, campaign evidence.
Eval regression	synthetic or region-local approved sample.
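
A minimal sketch of route selection against the provider region register follows. The single register entry reuses the example values from 13.2, normalized to snake_case identifiers; returning the string `deny` when nothing matches is an assumption that corresponds to the "local smaller model or deny" fallback.

```python
# Hypothetical register with one entry, based on the 13.2 examples.
REGISTER = [
    {
        "provider_id": "provider_x",
        "model_id": "model_x_large_2026_05",
        "endpoint_region": "EU",
        "supported_data_classes": {"public", "internal", "masked_pii",
                                   "transaction_summary"},
        "blocked_data_classes": {"raw_pan", "credit_bureau", "aml_notes"},
        "allowed_purposes": {"customer_service_account_help",
                             "public_product_education"},
    },
]


def select_route(register, region, purpose_id, data_classes):
    """Return a model_id for an approved route, else "deny"."""
    requested = set(data_classes)
    for entry in register:
        if entry["endpoint_region"] != region:
            continue
        if purpose_id not in entry["allowed_purposes"]:
            continue
        if requested & entry["blocked_data_classes"]:
            continue
        if not requested <= entry["supported_data_classes"]:
            continue  # an unlisted class is treated as unsupported
        return entry["model_id"]
    return "deny"  # never fall through to a global endpoint
```

A failed match yields `deny` and never another region, which matches the compatibility rule against unapproved global fallback.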

14. Encryption and Key Residency

Encryption supports residency but does not replace routing and purpose controls.

14.1 Key Questions

Question	Why it matters
Where are keys generated?	Generation location can matter for control claims.
Who controls keys?	Provider-managed keys and customer-managed keys have different risk profiles.
Where can keys be used?	Decryption path may cross boundaries.
Who can access keys?	Admin and break-glass access must be logged and approved.
Are logs and backups encrypted with local keys?	Evidence and DR artifacts also need controls.
Can keys be destroyed on exit?	Vendor exit and deletion depend on key lifecycle.

14.2 Key Residency Patterns

Pattern	Description	Fit
Provider-managed key	Vendor controls key lifecycle	lower sensitivity or low-risk artifact
Customer-managed key	Organization controls key in provider KMS	common regulated cloud pattern
Bring your own key	Organization imports or manages key material	stronger enterprise control
Hold your own key	Key never leaves organization-controlled HSM	high sovereignty posture
Split key / dual control	Multiple parties required for key operation	high-risk evidence vault
Local HSM	Hardware security module in approved region	restricted data and sovereign route

15. Data Minimization Patterns

Minimization is the most practical cross-border risk reducer.

15.1 Minimization Ladder

Level	Pattern	Example
0	deny	raw AML notes never leave internal route
1	field suppression	remove PAN, SSN, account number
2	masking	last4 only, merchant category only
3	tokenization	replace customer ID with scoped token
4	summarization	“three card transactions in dispute window”
5	aggregation	segment-level campaign prompt
6	synthetic data	generated eval cases
7	local processing	keep raw data local and send only answer

15.2 Minimization by AI Artifact

Artifact	Minimization tactic
User prompt	classify and redact before model call.
RAG context	retrieve fewer chunks, mask sensitive fields, include source IDs.
Tool payload	send object ID and required fields only.
Tool result	summarize or mask before prompt re-entry.
Log	store manifest, hashes and decision IDs.
Eval sample	synthetic first, then approved sanitized production samples.
Memory	store preference or stable fact only when purpose allows.
Vendor ticket	attach redacted trace and route decision, not raw payload.
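
Levels 1 to 3 of the ladder (field suppression, masking, tokenization) can be sketched together. The field names and the HMAC-based token scheme are illustrative assumptions; a production tokenizer would normally be a governed service with managed keys.

```python
import hashlib
import hmac


def scoped_token(customer_id, purpose_id, secret):
    """Level 3: a token that differs per purpose for the same customer."""
    message = f"{purpose_id}:{customer_id}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def minimize_record(record, purpose_id, secret):
    """Apply suppression, masking and tokenization to one record."""
    out = dict(record)  # never mutate the caller's record
    for field in ("ssn", "account_number"):
        out.pop(field, None)                       # level 1: suppression
    if "pan" in out:
        out["pan_last4"] = str(out.pop("pan"))[-4:]  # level 2: masking
    if "customer_id" in out:
        customer_id = out.pop("customer_id")
        out["customer_token"] = scoped_token(customer_id, purpose_id, secret)
    return out
```

Scoping the token by purpose means that records minimized for two different purposes cannot be joined on the token alone.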

16. Transfer Impact Review Workflow

This workflow is an architecture governance artifact, not a legal conclusion.

16.1 Trigger Events

Trigger	Example
New model provider	adding an external LLM endpoint
New endpoint region	routing EU prompts to non-EU endpoint
New data class	adding credit bureau data to assistant
New purpose	service bot data reused for marketing AI
New vendor telemetry	safety monitoring sends payload samples
New eval process	production chat transcripts sampled for QA
New subprocessor	provider adds downstream analytics processor
New support path	offshore vendor support can view traces
New fallback route	outage route sends data to global endpoint
New key control	provider-managed keys replace local KMS
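
Several of these triggers can be detected mechanically by diffing the route manifest before and after a change. The manifest keys in `WATCHED_FIELDS` are hypothetical; the idea is only that any change to a watched field opens a review.

```python
# Hypothetical manifest fields whose change should trigger a review.
WATCHED_FIELDS = (
    "provider_id",
    "endpoint_region",
    "data_classes",
    "purpose_id",
    "subprocessors",
    "key_policy",
    "fallback_route",
)


def review_triggers(old_manifest, new_manifest):
    """Return the watched fields that differ between two manifests."""
    return [
        field
        for field in WATCHED_FIELDS
        if old_manifest.get(field) != new_manifest.get(field)
    ]
```

A non-empty result could block the release pipeline until a review reference is attached to the new manifest.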

16.2 Review Steps

Step	Output
1. Define use case and business necessity	use case brief and purpose ID
2. Map data classes and subjects	classification table
3. Draw full AI data path	source, RAG, prompt, model, tools, logs, eval, backups, keys
4. Identify processors/subprocessors	vendor inventory and contract refs
5. Assess minimization alternatives	local, masked, aggregated, synthetic options
6. Define technical safeguards	encryption, key, access, logging, deletion, monitoring
7. Define contractual and operational safeguards	vendor terms, support access, incident, exit
8. Record residual risk and approvals	review decision and expiry
9. Convert decision to runtime policy	model gateway, RAG filters, tool rules, log config
10. Test positive and negative routes	evidence pack and launch gate

17. Evidence Ledger

Evidence ledger is the production proof that policy became runtime behavior.

Ledger events cover residency decisions, RAG retrieval, model route, tool route, log retention, eval sampling, key access, transfer review linkage, vendor changes and kill-switch actions.

17.1 Ledger Schema

Field	Description
`event_id`	Unique ledger event ID.
`event_time`	UTC timestamp.
`interaction_id`	User session, case or workflow ID.
`subject_type`	customer, employee, merchant, representative.
`subject_jurisdiction`	Routing jurisdiction context.
`product_entity`	Booking entity or tenant.
`purpose_id`	Purpose catalog reference.
`data_classes`	Source and derived classes.
`artifact_type`	prompt, RAG chunk, tool payload, log, eval sample.
`processor_id`	Internal or vendor processor.
`subprocessor_ref`	Subprocessor inventory reference.
`source_region`	Where data originated.
`destination_region`	Where data was processed or stored.
`model_id`	Model and version.
`endpoint_region`	Inference endpoint region.
`key_policy_id`	KMS/HSM policy reference.
`decision`	allow, deny, localize, minimize, review, kill switch.
`reason_code`	Structured reason.
`policy_version`	Runtime policy bundle version.
`review_ref`	Transfer/vendor/model review ID.
`evidence_hash`	Hash of evidence manifest where payload is not stored.

17.2 Evidence Query

Proof that a specific interaction used approved route:

SELECT
  event_time,
  interaction_id,
  purpose_id,
  artifact_type,
  processor_id,
  source_region,
  destination_region,
  endpoint_region,
  decision,
  reason_code,
  policy_version,
  review_ref
FROM ai_residency_ledger
WHERE interaction_id = :interaction_id
ORDER BY event_time;
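
To try the query without a real ledger, the sketch below loads a reduced version of the schema into an in-memory SQLite database using Python's standard `sqlite3` module. The rows are invented sample data, and only a subset of the ledger columns is created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE ai_residency_ledger (
        event_time TEXT, interaction_id TEXT, purpose_id TEXT,
        artifact_type TEXT, endpoint_region TEXT, decision TEXT)"""
)
conn.executemany(
    "INSERT INTO ai_residency_ledger VALUES (?, ?, ?, ?, ?, ?)",
    [  # invented sample events, inserted out of time order on purpose
        ("2026-06-30T10:00:02Z", "int_1", "payment_dispute_support", "prompt", "EU", "allow"),
        ("2026-06-30T10:00:01Z", "int_1", "payment_dispute_support", "RAG chunk", "EU", "allow"),
        ("2026-06-30T10:00:03Z", "int_2", "marketing_personalization", "prompt", "US", "deny"),
    ],
)


def route_evidence(interaction_id):
    """Run the evidence query for one interaction, oldest event first."""
    cursor = conn.execute(
        "SELECT event_time, artifact_type, endpoint_region, decision "
        "FROM ai_residency_ledger WHERE interaction_id = :interaction_id "
        "ORDER BY event_time",
        {"interaction_id": interaction_id},
    )
    return cursor.fetchall()


def all_allowed_in_region(rows, region):
    """True only if events exist and every one was allowed in the region."""
    return bool(rows) and all(r[2] == region and r[3] == "allow" for r in rows)
```

An interaction with no ledger events returns `False`, because the absence of evidence is not proof of an approved route.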

18. Operating Model

18.1 RACI

Activity	PM	Senior BA	Architect	Privacy	Legal	Security	Data Gov	Model Risk	Vendor Risk	Ops
Define AI purpose and customer value	A	R	C	C	C	C	C	C	C	C
Build data path map	C	R	A	C	C	R	R	C	C	C
Classify source and derived artifacts	C	R	C	C	C	C	A	C	C	C
Maintain jurisdiction-purpose-processor matrix	C	R	A	C	C	C	R	C	R	C
Approve legal/privacy interpretation	C	C	C	R	A	C	C	C	C	C
Design runtime policy controls	C	C	A	C	C	R	R	C	C	C
Approve model route	C	C	R	C	C	C	C	A	C	C
Approve vendor route	C	C	C	C	C	C	C	C	A	C
Operate evidence ledger	C	R	C	C	C	R	A	C	C	R
Respond to route incident	C	C	R	R	C	A	R	C	R	R

Legend: R = responsible, A = accountable, C = consulted.

18.2 Governance Forums

Forum	Scope
AI product review	purpose, user value, customer journey, data need.
Architecture review board	data path, region routing, keys, logs, resilience.
Privacy/legal review	applicability, notices, transfer review, contractual terms.
Model risk committee	model route, eval, monitoring, fallback and risk tier.
Vendor risk review	provider, subprocessor, support access, telemetry, exit.
Security review	access control, encryption, KMS/HSM, incident response.
Operational readiness	runbooks, dashboards, support model, kill switch.

18.3 Release Gate

Gate question	Evidence
Is the purpose approved?	purpose catalog entry
Is the data path mapped end to end?	data path diagram and matrix
Are processors and subprocessors approved?	vendor inventory and approval refs
Are model endpoints and regions allowlisted?	provider region register
Are RAG/tool/log/eval controls configured?	policy bundle and tests
Are keys and backups aligned with the route?	KMS/HSM and DR evidence
Is transfer impact review complete where triggered?	review ID and residual risk
Are dashboards and KRIs live?	monitoring links and alert rules
Is the fallback route safe?	degraded-mode test
Is the exit plan feasible?	vendor exit and deletion runbook
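The release gate can be sketched as a simple check that blocks release when any applicable gate lacks an evidence reference. This is an illustrative sketch only: the `Gate` class, the `applicable` flag (used for the "where triggered" transfer review) and the evidence reference strings are assumptions, not part of any real tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Gate:
    question: str
    evidence_ref: Optional[str]  # link or ID of the evidence, None if missing
    applicable: bool = True      # False when the gate is not triggered


def release_decision(gates):
    """Return (approved, missing_questions) for a list of gates."""
    missing = [g.question for g in gates if g.applicable and not g.evidence_ref]
    return len(missing) == 0, missing


# Hypothetical gate status for one release candidate.
gates = [
    Gate("Is the purpose approved?", "purpose-catalog/dispute_service"),
    Gate("Is the data path mapped end to end?", "diagram-v3"),
    Gate("Is transfer impact review complete where triggered?", None, applicable=False),
    Gate("Is the fallback route safe?", None),
]

approved, missing = release_decision(gates)
print(approved, missing)
```

In this example the release is blocked because the fallback route has no degraded-mode test evidence, while the untriggered transfer review does not block it.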

19. Metrics and KRIs

19.1 Product Metrics

Metric	What it reveals
Feature adoption by region	Whether local route supports customer value.
Completion rate by route	Whether minimized/local route hurts task completion.
Human handoff by residency denial	Friction caused by blocked routes.
Latency by model region	Customer experience impact.
Cost by sovereign pattern	Unit economics of local/private deployment.
Customer complaint rate about data use	Trust and disclosure risk.

19.2 Control KRIs

KRI	Signal
Denied cross-border attempts	Misconfiguration, product drift or abuse.
Route mismatch rate	Requests not matching approved matrix.
Unclassified artifact count	Data governance gap.
Prompt/log payload policy violations	Observability risk.
Eval samples without lineage	Hidden secondary-use risk.
Vendor endpoint outside allowlist	Provider route breach.
Key access anomalies	Sovereignty and security risk.
Transfer review overdue	Governance backlog.
Subprocessor change unreviewed	Vendor risk gap.
Fallback route activation	Resilience and policy stress.
Withdrawal/consent route conflict	Runtime state propagation gap.
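Two of the KRIs above, route mismatch rate and denied cross-border attempts, can be computed directly from ledger events. The sketch below assumes events are dictionaries with the ledger field names, that the approved matrix is reduced to `(purpose_id, processor_id, endpoint_region)` tuples, and that a region difference stands in for a jurisdiction boundary. All three are simplifications for illustration.

```python
# Hypothetical approved routes: (purpose_id, processor_id, endpoint_region).
APPROVED_ROUTES = {
    ("dispute_service", "proc-a", "eu-1"),
    ("dispute_service", "proc-b", "eu-1"),
}


def route_mismatch_rate(events, approved=APPROVED_ROUTES):
    """Share of events whose route is not in the approved matrix."""
    if not events:
        return 0.0
    mismatches = sum(
        1 for e in events
        if (e["purpose_id"], e["processor_id"], e["endpoint_region"]) not in approved
    )
    return mismatches / len(events)


def denied_cross_border_attempts(events):
    """Count denied events whose destination region differs from the source region."""
    return sum(
        1 for e in events
        if e["decision"] == "deny" and e["source_region"] != e["destination_region"]
    )


def event(purpose, processor, src, dst, decision):
    # Helper for building sample events; endpoint_region mirrors the destination here.
    return {"purpose_id": purpose, "processor_id": processor, "source_region": src,
            "destination_region": dst, "endpoint_region": dst, "decision": decision}


events = [
    event("dispute_service", "proc-a", "eu-1", "eu-1", "allow"),
    event("dispute_service", "proc-b", "eu-1", "eu-1", "allow"),
    event("dispute_service", "proc-c", "eu-1", "us-1", "deny"),
    event("marketing", "proc-a", "eu-1", "eu-1", "deny"),
]

print(route_mismatch_rate(events), denied_cross_border_attempts(events))
```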

19.3 Executive Dashboard

Theme	Executive question
Customer trust	Can we explain where customer data goes?
Regulatory readiness	Can we prove route and controls for high-risk products?
Operational resilience	Can local routes survive provider and region outages?
Vendor concentration	Which AI capabilities depend on one provider route?
Cost	What is the premium for sovereign routes and is it justified?
Risk appetite	Which residual risks have business acceptance?

20. 30-Day Lab

Goal: within 30 days, produce a presentable AI Data Residency / Cross-Border / Sovereign AI Architecture portfolio pack. Suggested use case: Retail Banking AI Dispute Assistant.

Days	Focus	Outputs
1-5	Frame use case, subject, product, jurisdiction, purpose and caveats	use case brief, purpose entry, applicability statement
6-10	Map source, RAG, prompt, model, tool, log, eval, backup and key paths	end-to-end data path diagram
11-15	Build data classification, processor inventory and provider region register	classification table, vendor inventory, endpoint register
16-20	Design PDP, RAG, tool, log, eval and key controls	policy spec, corpus manifest, tool policy, key policy
21-25	Simulate transfer impact review and evidence ledger	review record, ledger schema, SQL proof query, KRI list
26-30	Package PM/BA/architect narrative	executive memo, ADR, requirements, interview answers, portfolio deck

21. Interview Answers

Question	30-second version	2-minute key points
How do data residency, cross-border transfer and sovereign AI differ?	Residency asks where AI artifacts are stored, processed and accessed; transfer asks whether they cross a jurisdiction boundary; sovereign AI asks how much local control exists.	Break the path into prompt, RAG, tool, log, eval, backup, telemetry, keys and support access; state that applicability is determined by Legal/Privacy/Compliance based on jurisdiction, subject, product, vendor and contract.
Why is a cloud region not enough?	The AI data path extends beyond the database region.	Give examples: prompts sent to an external model, global observability, offshore labeling, vendor support tickets; control them with a residency PDP and an evidence ledger.
How do you design cross-border RAG?	Each corpus has a source region, allowed purposes, data classes, embedding/index region, ACL and deletion propagation.	Before retrieval, filter by jurisdiction, purpose, role, object scope and consent/authorization; eval uses synthetic or region-local samples.
What should the matrix contain?	Subject jurisdiction, product entity, purpose, data class, processor, endpoint, tool/log/eval/key region, decision, safeguards, review ref.	It is the bridge from BA analysis to runtime policy; a new vendor, data class, eval route or fallback route triggers a review.
How do you control the provider region?	All model requests go through a centralized model gateway.	The provider register records model, endpoint, allowed data classes, retention, no-training, telemetry, subprocessors, key policy, fallback and approval.
What does key residency affect?	Encryption cannot replace route control; key location and control affect the sovereignty claim.	Put KMS/HSM region, BYOK/HYOK, rotation, break-glass, backup encryption and exit destruction into the matrix.
How do you manage eval data?	The eval pipeline is a separate processing path.	Default to synthetic first; real failure samples need classification, redaction, a local queue, reviewer control, lineage and retention evidence.
When do you choose a sovereign/private route?	When sensitive data, a strict jurisdiction, key control, support access, resilience or trust call for stronger control.	Trade off data sensitivity, purpose criticality, quality, latency, cost, operator control, audit evidence, fallback and exit.
How do you prove that no unapproved boundary was crossed?	The evidence ledger records RAG, model, tool, log, eval and key decisions by interaction_id.	Evidence can be stored as manifest, source IDs, hash, policy version and review refs, which reduces raw payload overcollection.
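The cross-border RAG answer above can be made concrete with a pre-retrieval filter over corpus manifests. This is a minimal sketch covering only region, purpose and data class. Role, object scope and consent checks are left out, and the `CorpusManifest` fields and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CorpusManifest:
    corpus_id: str
    source_region: str
    index_region: str
    allowed_purposes: frozenset
    data_classes: frozenset


def eligible_corpora(manifests, *, request_region, purpose_id, permitted_data_classes):
    """Return IDs of corpora that may be searched for this request."""
    return [
        m.corpus_id
        for m in manifests
        if m.index_region == request_region          # index stays in the request region
        and purpose_id in m.allowed_purposes         # purpose is approved for the corpus
        and m.data_classes <= permitted_data_classes  # no data class beyond what is permitted
    ]


manifests = [
    CorpusManifest("dispute-policy-eu", "eu-1", "eu-1",
                   frozenset({"dispute_service"}), frozenset({"public_policy"})),
    CorpusManifest("dispute-cases-eu", "eu-1", "eu-1",
                   frozenset({"dispute_service"}), frozenset({"financial_account"})),
    CorpusManifest("dispute-policy-us", "us-1", "us-1",
                   frozenset({"dispute_service"}), frozenset({"public_policy"})),
]

selected = eligible_corpora(
    manifests,
    request_region="eu-1",
    purpose_id="dispute_service",
    permitted_data_classes=frozenset({"public_policy"}),
)
print(selected)
```

Filtering before retrieval, rather than redacting results afterwards, keeps out-of-scope content from ever entering the prompt.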

22. Portfolio Deliverables

Deliverable	What good looks like
Executive memo	Explains why residency is product trust, architecture and risk control.
Use case brief	Customer value, scope, non-scope, jurisdiction assumptions and caveats.
Data classification matrix	Source and derived AI artifacts classified.
Data path diagram	Source, RAG, prompt, model, tools, logs, eval, backup and keys shown.
Jurisdiction-purpose-processor matrix	Route decisions, safeguards, evidence refs and review expiry.
Sovereign pattern ADR	Compares local SaaS, private cloud, sovereign cloud, on-prem and hybrid.
Provider region register	Model endpoints, retention, telemetry, subprocessors, fallback and approvals.
Transfer impact review sample	Route, necessity, safeguards, residual risk and approval trail.
Evidence ledger schema	Event types, fields and one SQL proof query.
KRI dashboard spec	Denied attempts, route mismatch, eval lineage, key anomalies, vendor changes.

Portfolio storyline:

I treated data residency as a runtime AI architecture problem.
I mapped every artifact from user message to RAG, model, tools,
logs, eval, backup and keys, then translated jurisdiction,
purpose, processor and data class into executable policy.

23. Production Readiness Checklist

Every AI capability has approved purpose, data class and route.
Full data path covers RAG, prompts, tools, logs, eval, vendor telemetry, backups and keys.
Jurisdiction-purpose-processor matrix and provider region register are reviewed.
RAG corpus manifests include allowed purposes, regions, ACL and deletion propagation.
Tool gateway enforces purpose, object scope, region and payload minimization.
Logs and eval samples avoid raw payload unless justified and controlled.
Vendor subprocessors, support access, telemetry and exit path are inventoried.
Transfer impact review triggers on route, vendor, data class, purpose, eval or key changes.
Encryption and key residency align with data path and evidence requirements.
Fallback routes, kill switch, KRIs and evidence ledger are tested.
Named owners exist across Product, BA, Architecture, Legal, Privacy, Compliance, Security, Data Governance, Model Risk, Vendor Risk and Ops.
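The transfer impact review trigger in the checklist can be sketched as a diff between the approved route and a proposed route. The field names below loosely follow the ledger columns, but the route record shape, `TRIGGER_FIELDS` and `review_triggers` are assumptions for illustration.

```python
# Fields whose change is assumed to trigger a transfer impact review.
TRIGGER_FIELDS = (
    "endpoint_region",  # route change
    "processor_id",     # vendor change
    "data_class",
    "purpose_id",
    "eval_region",
    "key_region",
)


def review_triggers(approved: dict, proposed: dict) -> list:
    """Return the trigger fields that differ between approved and proposed routes."""
    return [f for f in TRIGGER_FIELDS if approved.get(f) != proposed.get(f)]


# Hypothetical approved route and a proposed change to it.
approved_route = {
    "purpose_id": "dispute_service",
    "processor_id": "proc-b",
    "data_class": "financial_account",
    "endpoint_region": "eu-1",
    "eval_region": "eu-1",
    "key_region": "eu-1",
}
proposed_route = {**approved_route, "processor_id": "proc-c", "eval_region": "us-1"}

changed = review_triggers(approved_route, proposed_route)
print(changed)
```

A non-empty result would hold the change until a review reference is attached; an empty result means none of the listed trigger fields changed.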

24. Closing View

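The maturity of data residency in retail financial AI lies not in being able to say "we use region X", but in being able to prove which data was used, why it was used, which processors/subprocessors it passed through, in which regions it was processed and logged, which keys were used, which routes were allowed or denied, and which evidence can be independently reviewed.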

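The real goal is not to lock all AI down in one place, but to make every AI data use something that can be designed, restricted, routed, minimized, monitored, revoked, exited and proven.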