AI Open Banking / Open Finance: Consented Data Sharing Architecture

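This document is learning, architecture-training and portfolio material. It does not constitute legal advice, a regulatory opinion, a CFPB 1033 applicability determination, a compliance-deadline determination, an entity-obligation determination, an assessment of whether a third-party authorization is sufficient, a privacy impact assessment conclusion, a model validation report, an information security certification, consumer notice advice, a vendor recommendation or approval for business launch.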

Audience: Advanced AI PM / Senior BA / Product Architect / Enterprise Architect / Financial Retail Architect / Data Product Owner / API Platform Lead / Fraud Risk Architect / Privacy / Compliance / Model Risk / Third-Party Risk / Customer Trust Lead.

Core question: How can an AI system use customer-authorized financial data in the open banking / open finance ecosystem so that it delivers real customer value while controlling consent, data minimization, revocation, API contract, third-party risk, RAG grounding, model feature boundary, fraud/scam, audit evidence and customer trust?

Learning objectives: build a working understanding of consented data sharing architecture, the data provider / data recipient / aggregator role model, developer onboarding, API ecosystem governance, AI/RAG over account data, feature boundary, data quality lineage, revocation handling, fraud controls, third-party oversight, evidence replay and a senior PM/architect decision framework.

0. Disclaimer

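A production project must be assessed jointly by Legal, Compliance, Privacy, Information Security, Fraud Risk, Financial Crime, Model Risk, Data Governance, API Platform, Third-Party Risk Management, Vendor Management, Customer Experience, Operations, Accessibility, Product Owner, Architecture, Internal Audit and, where necessary, external advisers. Whether CFPB 1033 or any other open banking / open finance requirement applies depends on product, entity role, data type, customer relationship, jurisdiction, rule status, litigation / regulatory developments, contract structure, partner role, data flow, use case and Legal / Compliance interpretation.

This document discusses architecture and governance methods only. It does not infer whether any institution is a data provider, authorized third party, data aggregator, service provider or covered entity; it does not infer whether any data field is covered data; and it does not infer any compliance date, grace period, exemption, litigation impact or regulatory interpretation. All rule-status and implementation questions must defer to Legal / Compliance's latest reading of official sources.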

Source Anchors

Source	Link	Use in this document
CFPB Personal Financial Data Rights final rule	https://www.consumerfinance.gov/rules-policy/final-rules/required-rulemaking-on-personal-financial-data-rights/	Official anchor for U.S. personal financial data rights, secure and reliable consumer / authorized third-party data access, privacy protections and the direction of open banking
CFPB Regulation 1033 / Personal Financial Data Rights	https://www.consumerfinance.gov/rules-policy/regulations/1033/	The § 1033 structure (covered data, developer interface, authorized third party, authorization disclosure, third-party obligations, revocation and recordkeeping) is used to organize the architecture questions
CFPB Personal financial data rights compliance resource	https://www.consumerfinance.gov/compliance/compliance-resources/other-applicable-requirements/personal-financial-data-rights/	Entry point for rule status, implementation resources, official interpretations and regulatory-development watch; this document does not infer applicability or deadlines
CFPB guidance / circulars landing page	https://www.consumerfinance.gov/compliance/circulars/	Official entry point for CFPB guidance / circulars / supervisory materials, supporting horizon scanning and the policy update workflow
FFIEC Authentication and Access to Financial Institution Services and Systems	https://www.ffiec.gov/press/pr081121.htm	Authentication, access risk assessment, layered security, MFA / equivalent controls and third-party / customer-permissioned entity access risk are used to organize access controls
NIST Privacy Framework	https://www.nist.gov/privacy-framework	Privacy risk management, data processing purpose, governance, control design, customer trust and minimization are used to organize privacy-by-architecture
NIST AI RMF	https://www.nist.gov/itl/ai-risk-management-framework	The Govern / Map / Measure / Manage approach is used to organize AI risk identification, eval, monitoring, incident response and evidence
ISO/IEC 42001 overview	https://www.iso.org/standard/42001	AI management system, policy, roles, operation, performance evaluation, internal audit and continual improvement are used to build the operating model

In one sentence:

Consented financial data sharing is not "AI can read bank data". It is a purpose-bound, revocable, API-mediated trust architecture that lets AI consume only authorized data, reason within governed boundaries, and act only with explicit customer and policy authorization.

1. Thesis

The core architectural change in AI open banking / open finance is not replacing screen scraping with APIs, nor handing bank statements to an LLM to summarize. The real change is:

from: broad data harvesting + credential sharing + opaque enrichment
to: customer-authorized access + standardized API contract
  + purpose-bound use + revocation + evidence replay
  + AI data-use boundary + fraud and third-party controls

A mature architecture must answer twelve questions at the same time:

Which data recipient has the customer authorized, for which requested product or service, to access which data categories, for how long and how often?
What role does each of the data provider, data recipient, data aggregator, developer app, AI platform and downstream processor play?
How does the API contract express scope, field semantics, quality, latency, pagination, error, consent state, revocation and incident behavior?
Does the AI use only data authorized for the current purpose, and does it block cross-sell, targeted advertising, unrelated profiling and unrestricted training?
Can the customer revoke as easily as they authorized, and after revocation how do collection, use, retention, cache, vector index, features and agent sessions stop or degrade?
Is data minimization a shared constraint across product strategy, API scope, feature engineering, prompt grounding and retention policy, or is it only privacy copy?
Does RAG over account data achieve tenant / customer isolation, source citation, freshness, entitlement check, prompt-injection defense and evidence capture?
Do model features have allowed uses / prohibited uses, and do they turn transaction patterns into unauthorized sensitive inferences?
Do fraud / scam controls cover account takeover, authorized push payment scam, malicious third-party app, data exfiltration, synthetic behavior and agent overreach?
Does developer onboarding include security, privacy, data-use, operational, financial crime, model-risk and exit controls?
Is the customer value clear enough that customers understand why they should share data and what they get in return?
Can the authorization disclosure, consent receipt, API calls, data transformations, AI run, human decision, customer communication and revocation handling be replayed after the fact?

Key principles:

Consented does not mean unlimited.
Available through API does not mean usable for every model.
Data portability does not mean downstream data resale.
Account history does not mean behavioral surveillance license.
AI personalization does not override purpose limitation.
Revocation must affect data, features, embeddings, agents and evidence.

2. Why It Matters

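Retail financial AI is pushing open banking from a data-access program toward an AI decision and action ecosystem:

Pressure	How it shows up	Architecture implication
Consumer data portability	Customers want to take account, transaction, balance, bill and account verification information to a new service	Requires API-based, standardized, machine-readable, secure access that does not depend on customers sharing credentials
Personal financial management	AI budgeting assistants, cash-flow forecasting, subscription optimization, debt guidance and wealth guidance need account data	Requires purpose-bound access, explainability, data freshness and a recommendation suitability boundary
Open finance expansion	Expansion beyond banking data to payroll, tax, brokerage, insurance, pension, merchant, wallet and loyalty	Requires a cross-domain consent registry, semantic mapping, data quality and jurisdiction policy
Embedded finance	Third-party apps call financial data inside lending, payments, savings, BNPL, insurance and merchant tools	Requires developer onboarding, TPRM, API controls, consumer disclosure and monitoring
Agentic AI	AI agents query accounts, prepare applications, compare products and raise service requests on the customer's behalf	Requires tiered read / reason / recommend / act authorization, action-bound approval and revoke propagation
Fraud industrialization	Malicious apps, scam scripts, ATO, social engineering, data exfiltration, credential stuffing	Requires FFIEC-style layered access controls, behavioral monitoring and third-party anomaly controls
Customer trust	Customers worry that "bank data is used to train models, sell ads and cross-sell, and cannot be withdrawn"	Requires a transparent value proposition, minimization, revocation receipt, complaint route and audit evidence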

The question for a senior PM / Architect is not:

How do we plug Plaid-like data into the AI feature?

but rather:

For this customer-requested outcome,
what data is necessary,
which party is authorized to collect it,
how is access technically constrained,
how will AI use be bounded and evidenced,
what happens on revocation,
and how do we prove the ecosystem behaved as promised?

3. Consented Data Sharing vs. Unrestricted Data Harvesting

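The boundary between consented data sharing and unrestricted data harvesting must be written into the product, API, data and model architecture, not only into the privacy notice.

Dimension	Consented data sharing	Unrestricted data harvesting
Customer intent	The customer authorizes specific data access for a requested product / service	The customer is forced to accept broad terms and the data is used for unrelated purposes
Access method	API / developer interface / scoped token / consent state / audit log	Screen scraping, credential sharing, opaque SDKs, brokered datasets
Scope	Data categories, duration, frequency, provider, recipient and purpose are limited	Take as much as possible, retain long term, reuse across products
Data minimization	Request only the fields needed to deliver the current value proposition	Collect first, find a use later
AI use	Grounded, purpose-bound, with an explicit feature boundary	Used for training, profiling, advertising, cross-sell or inferring sensitive attributes
Revocation	Revocation affects collection, API token, cache, feature, embedding and agent workflow	Only new collection stops; historical data keeps drifting into further use
Evidence	Consent receipt, API calls, data lineage, AI runs and decision logs can be replayed	No way to prove after the fact what the customer agreed to
Customer control	Can view, revoke, complain, correct and port	The customer does not know who took the data or how to stop it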

Litmus test:

If the customer cannot explain who gets what data for what purpose,
and the institution cannot technically enforce that boundary,
the architecture is data harvesting with a consent screen.

4. Ecosystem Role Model

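AI open banking must establish role clarity before data flow. A single institution may simultaneously act as data provider, data recipient, aggregator client, model operator or service provider across different journeys.

Role	Responsibility	AI architecture question
Customer / consumer	Authorizes, revokes, views sharing history, requests services, corrects errors	Does the customer understand how AI will use the data, and can they revoke and obtain an explanation?
Data provider	The financial services party that holds account, transaction, balance, terms, bill or account verification information	Is the developer interface secure, reliable, standardized and monitorable, without requiring third parties to use customer credentials?
Authorized data recipient	The party that accesses data for the product/service the customer requested	Does it collect, use and retain only reasonably necessary data, and can it prove purpose limitation?
Data aggregator	An intermediary or service provider connecting provider and recipient	Does the aggregator have contract pass-through, security, revocation propagation, accuracy and incident controls?
Developer app	The customer-facing application, which may embed an AI assistant	Has the app passed onboarding, security review, data-use attestation and monitoring?
AI system / model operator	Uses data for summarization, prediction, recommendation, automation or agent action	Can the model access only authorized context, and are data class and purpose enforced as runtime controls?
API platform	Manages client registration, token, scope, rate limit, schema, observability	Is scope granular enough, and is error / revocation / outage behavior specified in the contract?
Data governance	Manages lineage, quality, retention, catalog, feature store, embedding store	Is open banking data tagged as a consent-bound asset?
Fraud / security	Prevents unauthorized access, scam, ATO, malicious recipient, exfiltration	Are third-party access patterns and customer-permissioned entity risk monitored?
Legal / Compliance / Privacy	Interprets obligations; reviews notice, approval, retention, customer rights	Are rule status, jurisdiction, entity role and data type kept continuously up to date?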

Architecturally, model each use case as a data sharing passport:

use_case_id
customer_requested_product_or_service
data_provider
data_recipient
aggregator_if_any
data_categories
purpose
duration
frequency
consent_version
revocation_route
ai_allowed_uses
ai_prohibited_uses
retention_rule
evidence_bundle_id
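
As a rough illustration, the passport fields above could be held in a typed, immutable record with a few basic consistency checks. This is a minimal sketch, not a prescribed schema: the field types, the `duration_days` unit (the list above only says `duration`), the `validation_errors` helper and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataSharingPassport:
    """One record per use case; field names mirror the passport list."""
    use_case_id: str
    customer_requested_product_or_service: str
    data_provider: str
    data_recipient: str
    data_categories: tuple
    purpose: str
    duration_days: int            # assumption: duration expressed in days
    frequency: str                # assumption: free-text cadence such as "daily"
    consent_version: str
    revocation_route: str
    ai_allowed_uses: frozenset
    ai_prohibited_uses: frozenset
    retention_rule: str
    evidence_bundle_id: str
    aggregator_if_any: Optional[str] = None

    def validation_errors(self) -> list:
        """Return human-readable problems; an empty list means the record is consistent."""
        errors = []
        if not self.data_categories:
            errors.append("data_categories must not be empty")
        if self.duration_days <= 0:
            errors.append("duration_days must be positive")
        overlap = self.ai_allowed_uses & self.ai_prohibited_uses
        if overlap:
            errors.append(f"uses marked both allowed and prohibited: {sorted(overlap)}")
        return errors


# Hypothetical sample values, for illustration only.
example_passport = DataSharingPassport(
    use_case_id="uc-cash-flow-assistant",
    customer_requested_product_or_service="cash-flow assistant",
    data_provider="bank-a",
    data_recipient="budget-app",
    data_categories=("transactions", "balances"),
    purpose="cash_flow_assistant",
    duration_days=90,
    frequency="daily",
    consent_version="v1",
    revocation_route="in-app settings",
    ai_allowed_uses=frozenset({"cash_flow_summary"}),
    ai_prohibited_uses=frozenset({"model_training", "targeted_ads"}),
    retention_rule="delete 30 days after revocation",
    evidence_bundle_id="ev-0001",
)
```

Keeping the record frozen means a scope change produces a new passport version rather than a silent mutation, which fits the evidence-replay goal.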

5. Data Scope and Open Finance Taxonomy

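Open banking often starts from account and payment-related data; open finance extends to a broader map of the customer's financial life. The AI architecture must not read that expansion of scope as "any useful data can be taken".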

Data family	Examples	AI value	Boundary
Account identity / verification	name, address, email, phone, account identifier, account ownership signal	account opening, payout setup, fraud review	not equivalent to full KYC; should not be extended into unrelated identity profiling
Transaction history	amount, date, merchant, category, pending status, fees	cash-flow intelligence, affordability, budget, scam detection	high sensitive-inference risk; requires purpose and feature controls
Balance and liquidity	current / available balance, overdraft, credit limit	cash-flow warning, transfer timing, credit line review	freshness and timing critical, stale data can harm customers
Terms and conditions	fees, APR/APY, credit limit, rewards, overdraft opt-in	product comparison, fee optimization, advice	AI must cite source and avoid unsupported legal interpretation
Upcoming bills / scheduled payments	due amount, due date, biller, recurring obligations	proactive reminders, liquidity planning, hardship support	notification and action authorization must be separate
Payment initiation information	routing/account/tokenized information where applicable	account funding, verification, payment setup	high-risk; action permission and fraud controls separate from data access
Payroll / income	employer, pay frequency, net/gross pay, tax data	income verification, affordability, benefits	open finance extension; stronger consent, fairness, accuracy controls
Brokerage / pension / insurance	holdings, contributions, claims, premiums	holistic advice, risk planning	regulated advice and suitability boundaries likely apply
Merchant / small-business finance	POS sales, invoices, receivables, inventory, bank feeds	SME underwriting, cash-flow forecast, working capital	business consent, multi-user authority, data quality and role authority

Design rule:

Each data family must carry:
source, consent, purpose, freshness, quality, allowed uses,
AI use boundary, downstream sharing rule, retention, revocation action.

6. Consent, Authorization and Authentication

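Consent is the customer's expression of permission to access and use data. Authorization is the system and policy allowing a given actor to do a given thing. Authentication confirms who controls the current session / user / client. The three must not be conflated.

Layer	Question	Artifact
Customer disclosure	Did the customer see the third party, data provider, product/service, data categories, duration and revocation method?	authorization disclosure version, UI capture
Express consent	Did the customer explicitly authorize, and can the signature/confirmation record be replayed?	consent receipt, timestamp, channel, language
API authorization	Did the client / aggregator obtain a scoped token that can access only the authorized categories?	OAuth grant, scope, token hash, mTLS / client credential
Runtime entitlement	Does every API / RAG / feature call check consent state and purpose?	entitlement check log
Revocation	After the customer revokes, does new collection stop, are provider/aggregator/downstream parties notified, and are cache and retention handled?	revocation receipt, propagation log
Reauthorization	Is there a reauthorization mechanism for long-lived or continuous access?	reauthorization event
Action authorization	When an AI agent or app is about to execute a payment, submit an application or change an account, is separate approval obtained?	action approval, transaction confirmation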

Suggested fields for the consent object:

consent_id
customer_ref
data_provider_id
data_recipient_id
aggregator_id
requested_product_or_service
data_categories
field_scope
purpose
collection_frequency
duration_limit
granted_at
expires_at
revoked_at
revocation_method
language
disclosure_version
ai_allowed_uses
ai_prohibited_uses
training_allowed_flag
downstream_sharing_allowed_flag
evidence_bundle_id

Weak design:

Customer clicked "Connect bank"; all transaction data enters generic data lake.

Strong design:

Customer authorized cash-flow assistant to access transaction history and balance
from Bank A for 90 days.
API tokens are scoped.
RAG and feature store check consent before use.
Cross-sell, targeted ads and model training are prohibited unless separately authorized and approved.
Revocation stops collection, disables derived features where required, and closes agent sessions.
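
The strong design implies that every API, RAG and feature-store call consults the consent object before touching data. The sketch below shows one possible shape of such a runtime entitlement check. It uses a subset of the suggested consent fields; the `check_entitlement` function, its reason codes and the sample values are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class ConsentRecord:
    """Subset of the suggested consent object fields."""
    consent_id: str
    customer_ref: str
    purpose: str
    data_categories: frozenset
    granted_at: datetime
    expires_at: datetime
    ai_allowed_uses: frozenset
    ai_prohibited_uses: frozenset
    revoked_at: Optional[datetime] = None


def check_entitlement(consent, *, customer_ref, data_category, ai_use, now):
    """Return (allowed, reason). Deny by default; reason codes are illustrative."""
    if consent.customer_ref != customer_ref:
        return False, "customer_mismatch"
    if consent.revoked_at is not None and now >= consent.revoked_at:
        return False, "consent_revoked"
    if not (consent.granted_at <= now < consent.expires_at):
        return False, "consent_not_active"
    if data_category not in consent.data_categories:
        return False, "category_out_of_scope"
    # A use must be explicitly allowed and not explicitly prohibited.
    if ai_use in consent.ai_prohibited_uses or ai_use not in consent.ai_allowed_uses:
        return False, "ai_use_not_permitted"
    return True, "allowed"


# Hypothetical consent: cash-flow assistant, transactions and balance, 90 days.
_granted = datetime(2025, 1, 1, tzinfo=timezone.utc)
example_consent = ConsentRecord(
    consent_id="c-1",
    customer_ref="cust-1",
    purpose="cash_flow_assistant",
    data_categories=frozenset({"transactions", "balance"}),
    granted_at=_granted,
    expires_at=_granted + timedelta(days=90),
    ai_allowed_uses=frozenset({"cash_flow_summary"}),
    ai_prohibited_uses=frozenset({"cross_sell", "targeted_ads", "model_training"}),
)
```

Returning a reason code alongside the boolean lets each denial be written to the entitlement check log as evidence.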

7. Reference Architecture

Reference architecture:

customer
  -> consent and authorization disclosure UX
  -> data recipient / developer app
  -> aggregator or direct API connection
  -> data provider developer interface
  -> API gateway / auth / scope / rate limit
  -> raw consent-bound data zone
  -> normalization / quality / lineage service
  -> purpose-bound feature store
  -> customer-isolated vector / retrieval index
  -> AI orchestration and tool policy
  -> recommendation / decision / agent action workflow
  -> human review / fraud / operations
  -> customer communication
  -> evidence ledger / monitoring / revocation service

Key components:

Component	Responsibility	Senior design question
Consent registry	Stores authorization disclosure, scope, duration, revocation, AI allowed uses	Is it the runtime enforcement source, or only a record?
Developer onboarding portal	App registration, certification, security review, sandbox, contract, data-use attestations	Can onboarding keep low-trust apps out of the production API?
API gateway	client authentication, token validation, scope, rate limit, mTLS/FAPI-style controls, logging	Can the API restrict access by data category and purpose?
Data provider interface	A standardized, machine-readable, reliable, monitorable data interface	Is there an uptime, error semantics, support, dispute and incident process?
Data normalization layer	merchant/category mapping, schema mapping, dedup, timezone, currency, pending/posted status	Does enrichment preserve source lineage and uncertainty?
Quality service	completeness, freshness, accuracy feedback, provider variance, drift	Does the AI know when data is incomplete, stale or pending confirmation?
Feature boundary service	Maps consent-purpose to allowed features and prohibited features	Can model features be traced to consent and purpose?
Retrieval layer	customer-specific source retrieval, embedding lifecycle, source citation, entitlement check	Does RAG avoid cross-customer, cross-purpose and cross-product leakage?
AI policy engine	Controls prompt, tools, outputs, action approval, human review	Can policy actually deny the AI, or is there only a reminder in the prompt?
Fraud and scam controls	third-party anomaly, ATO, APP scam, malicious recipient, velocity, device	Does open banking data become leverage for scammers?
Evidence ledger	Links consent, API, data quality, AI run, decision, communication, revocation	Can complaints and audits replay the whole reliance chain?

Architecture boundary:

consent registry
  -> entitlement checks
  -> API scopes
  -> data lake tags
  -> feature store policies
  -> vector index lifecycle
  -> prompt/tool policies
  -> evidence retention

If consent is only displayed in the front end and does not enter the control chain above, it is not production-grade consented data sharing.
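
One way to make that control chain concrete is to let the consent registry fan a revocation out to each downstream control and keep a propagation log. The sketch below is a simplified in-memory illustration: the `propagate_revocation` function, the handler names, the status strings and the demo stores are all hypothetical, and a real system would call token, feature-store, vector-index and partner services instead.

```python
from typing import Callable


def propagate_revocation(consent_id: str, handlers: dict) -> dict:
    """Call every downstream handler and record its outcome.

    A failing target must not stop the others; failures stay in the log
    so they can be retried and evidenced.
    """
    log = {}
    for target, handler in handlers.items():
        try:
            log[target] = handler(consent_id)
        except Exception as exc:  # broad on purpose: any failure is logged
            log[target] = f"FAILED: {exc}"
    return log


def unresolved_targets(log: dict) -> list:
    """Targets where revocation has not yet taken effect."""
    return sorted(t for t, status in log.items() if status.startswith("FAILED"))


# In-memory stand-ins for real stores (illustrative only).
tokens = {"c-1": "tok-abc", "c-2": "tok-def"}
vector_index = [
    {"consent_id": "c-1", "text": "grocery 42.10"},
    {"consent_id": "c-2", "text": "rent 900.00"},
]


def terminate_token(consent_id: str) -> str:
    tokens.pop(consent_id, None)
    return "token_terminated"


def purge_embeddings(consent_id: str) -> str:
    before = len(vector_index)
    vector_index[:] = [row for row in vector_index if row["consent_id"] != consent_id]
    return f"purged_{before - len(vector_index)}_embeddings"


def notify_aggregator(consent_id: str) -> str:
    # Simulated outage to show how a failed propagation is recorded.
    raise ConnectionError("aggregator unreachable")


demo_handlers: dict = {
    "api_tokens": terminate_token,
    "vector_index": purge_embeddings,
    "aggregator": notify_aggregator,
}
```

Calling `propagate_revocation("c-1", demo_handlers)` and then `unresolved_targets` on the result shows which parts of the chain still need a retry, which is the kind of evidence a revocation test would look for.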

8. API Contract and Developer Onboarding

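The open banking product experience is determined by the API contract, not by any single AI prompt. The API contract should cover data semantics, reliability, authorization state, errors, revocation and operational commitments.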

8.1 API Contract

Domain	Contract requirement	AI impact
Data categories	clear categories such as transaction, balance, terms, bill and verification	the model does not apply fields to the wrong purpose
Schema semantics	posted vs pending, merchant vs payee, balance type, timezone, currency	cash-flow models and RAG answers do not reason incorrectly
Consent state	active, expired, revoked, suspended, reauthorization needed	runtime entitlement and agent stop conditions
Field-level scope	read categories and optional fields tied to consent	minimization and feature gating
Freshness	last_updated_at, as_of_time, provider latency	AI output must state the point in time the data reflects
Error model	unavailable, unauthorized, provider_denied, scope_missing, customer_action_needed	the AI must not treat an API error as a fact about the customer's finances
Pagination and history	historical window, cursor, duplicates, revision handling	transaction analysis reproducible
Quality feedback	dispute, correction, categorization feedback, provider issue	model drift and data quality improvement
Rate and performance	response SLO, scheduled downtime, throttling	agent workflow degrade gracefully
Security	client auth, token binding, encryption, key rotation, logging	third-party access monitorable
Revocation	notification, token termination, downstream propagation, retention action	model features and embeddings lifecycle controlled
Audit	correlation id, consent id, request id, response hash	evidence replay possible
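
The consent-state and error-model rows suggest that an agent should branch on contract states rather than improvise. The sketch below maps the consent states and error codes listed in the table to a next step. The state and error names come from the table; the `plan_agent_step` function and the step names it returns are hypothetical.

```python
from enum import Enum
from typing import Optional


class ConsentState(Enum):
    ACTIVE = "active"
    EXPIRED = "expired"
    REVOKED = "revoked"
    SUSPENDED = "suspended"
    REAUTHORIZATION_NEEDED = "reauthorization_needed"


# Errors that mean "no access", as opposed to "no data right now".
ACCESS_ERRORS = {"unauthorized", "provider_denied", "scope_missing"}


def plan_agent_step(consent_state: ConsentState, api_error: Optional[str] = None) -> str:
    """Pick the agent's next step from contract state; never infer finances from an error."""
    if consent_state is ConsentState.REVOKED:
        return "stop_and_close_session"
    if consent_state in (ConsentState.EXPIRED, ConsentState.REAUTHORIZATION_NEEDED):
        return "ask_customer_to_reauthorize"
    if consent_state is ConsentState.SUSPENDED:
        return "pause_and_escalate"
    # Consent is active from here on.
    if api_error is None:
        return "answer_with_as_of_time"
    if api_error in ACCESS_ERRORS:
        return "decline_without_financial_claims"
    if api_error == "customer_action_needed":
        return "ask_customer_to_act"
    if api_error == "unavailable":
        return "degrade_with_freshness_warning"
    raise ValueError(f"unknown api_error: {api_error!r}")
```

Raising on an unknown error code is a deliberate choice in this sketch: an unrecognized contract state fails loudly instead of being turned into an answer.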

8.2 Developer / Third-Party Onboarding

Gate	Evidence	Reject / restrict when
Business purpose review	product/service description, data categories, customer value	purpose too broad, data request not necessary
Security review	secure SDLC, auth, encryption, secrets, vulnerability mgmt, incident response	credential sharing, weak token handling, no monitoring
Privacy review	minimization, retention, downstream sharing, AI use, customer rights	targeted ads / resale / unrelated training built into model
Operational review	support, uptime, reconciliation, error handling, revocation process	no customer support or revocation workflow
Model-risk review	AI use case, features, prompts, evals, human oversight	identity/eligibility/credit decisions unsupported
Fraud review	scam controls, anomalous access, device/account risk	high-risk journeys lack step-up or monitoring
Contracting	data-use restrictions, audit rights, subprocessors, breach notice, exit	cannot enforce obligations downstream
Sandbox certification	test cases, negative scenarios, consent/revocation tests	app fails scope, revocation, data minimization

Developer onboarding must turn "can call the API" into "can be governed". For an AI app, additionally collect:

model providers
prompt / tool architecture
whether customer data enters model training
feature store and vector store lifecycle
customer-facing claims
human review triggers
advice / recommendation boundaries
eval evidence
complaint and correction flow

9. Data Quality, Enrichment and Lineage

Availability of open banking data does not imply high quality. Transaction data is especially prone to problems with merchant normalization, pending/posted duplication, category inconsistency, refund matching, subscription detection, timezone, currency, joint accounts and business/personal mix.

Quality issue	AI failure	Control
Pending and posted duplicates	AI overstates spending or raises false anomaly alerts	transaction lifecycle state and dedup logic
Merchant alias ambiguity	misidentifies merchants, subscriptions or scams	merchant enrichment confidence + source trace
Category inconsistency	wrong budgeting advice, cash-flow forecast drift	model-aware taxonomy mapping and feedback
Missing historical data	affordability or trend model biased	coverage metric and answer caveat
Stale balance	AI suggests a transfer/payment that causes an overdraft	freshness gate and action block
Joint account context	AI attributes joint-account activity to a single customer	account ownership and household boundary
Business/personal commingling	SME cash-flow model misreads the state of the business	account purpose classification with review
Data provider variance	field semantics differ across banks	provider-specific schema adapters
Customer correction ignored	errors keep affecting recommendations	correction workflow and feature recompute
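
As an example of the first control in the table, the sketch below collapses pending and posted versions of the same transaction before any spending total is computed. It rests on a strong assumption that is not guaranteed by any provider: that both versions share a stable `transaction_id`. Where that does not hold, matching would need fuzzier logic and a confidence score. The function name and row shape are hypothetical.

```python
def dedup_pending_posted(transactions: list) -> list:
    """Keep one row per transaction_id, preferring the posted version.

    Assumes the provider reuses transaction_id across the pending and
    posted lifecycle states, which is an assumption of this sketch.
    """
    best = {}
    for txn in transactions:
        current = best.get(txn["transaction_id"])
        if current is None:
            best[txn["transaction_id"]] = txn
        elif current["status"] == "pending" and txn["status"] == "posted":
            best[txn["transaction_id"]] = txn  # posted supersedes pending
    return list(best.values())


# Amounts are in integer cents to avoid floating-point rounding.
sample_transactions = [
    {"transaction_id": "t1", "status": "pending", "amount_cents": 1250},
    {"transaction_id": "t1", "status": "posted", "amount_cents": 1275},
    {"transaction_id": "t2", "status": "posted", "amount_cents": 4000},
]
```

Without the dedup step the sample would count `t1` twice; with it, the total reflects the posted amount only.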

Enrichment should be layered:

raw_provider_data
normalized_data
enriched_data
derived_features
model_inferences
customer_visible_answer

Each layer should retain:

source
transformation_version
confidence
consent_id
purpose
quality_score
freshness
allowed_uses
retention_rule
revocation_action

Senior design principle:

Never let enrichment erase consent, uncertainty or lineage.
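
One way to honor this principle is to make lineage travel with every derived record, so that no layer can be created without it. The sketch below carries a subset of the per-layer metadata through the enrichment layers. The `Lineage`, `Record` and `derive` names are hypothetical, and taking the minimum confidence across steps is only one conservative convention chosen for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Lineage:
    source: str
    consent_id: str
    purpose: str
    confidence: float
    transformations: tuple = ()


@dataclass(frozen=True)
class Record:
    layer: str
    payload: dict
    lineage: Lineage


def derive(parent: Record, *, layer: str, payload: dict,
           transformation_version: str, confidence: float) -> Record:
    """Create the next-layer record, inheriting consent and purpose from the parent.

    Confidence never increases: the result is the minimum of the parent's
    confidence and this step's confidence (an assumption of this sketch).
    """
    lineage = replace(
        parent.lineage,
        confidence=min(parent.lineage.confidence, confidence),
        transformations=parent.lineage.transformations + (transformation_version,),
    )
    return Record(layer=layer, payload=payload, lineage=lineage)


raw = Record(
    layer="raw_provider_data",
    payload={"memo": "SQ *COFFEE 123", "amount_cents": 450},
    lineage=Lineage(source="bank_a_api", consent_id="c-1",
                    purpose="cash_flow_assistant", confidence=1.0),
)
normalized = derive(raw, layer="normalized_data",
                    payload={"merchant_raw": "SQ *COFFEE 123", "amount_cents": 450},
                    transformation_version="normalize-v1", confidence=0.95)
enriched = derive(normalized, layer="enriched_data",
                  payload={"merchant": "Coffee shop", "amount_cents": 450},
                  transformation_version="merchant-enrich-v3", confidence=0.7)
feature = derive(enriched, layer="derived_features",
                 payload={"recurring": False},
                 transformation_version="recurring-bill-v2", confidence=0.9)
```

Because `derive` is the only constructor used for downstream layers, the final feature still names its consent, its purpose and the weakest step in its history.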

10. Model Feature Boundaries

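The most common risk when AI uses open banking data is turning transaction history into unbounded behavioral profiling. The feature boundary must be enforceable in the feature store, model serving, RAG retrieval and the policy engine.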

Feature class	Example	Allowed use	Prohibited or high-risk use
Customer-requested utility	recurring bill detection, cash-flow forecast, fee alert	budgeting, alerts, customer-selected recommendation	unrelated cross-sell without authorization
Risk / fraud	unusual payee, device + transaction anomaly, scam pattern	protect account, step-up, human review	opaque denial without evidence and review
Affordability / credit	income volatility, expense obligations, cash buffer	governed underwriting support where approved	proxy discrimination, unsupported adverse action
Vulnerability / hardship	missed payments, overdraft stress, income drop	supportive outreach and relief options	exploitative pricing or pressure sales
Marketing segmentation	spending category affinity, life event	only where separately permitted and governed	targeted ads or sale if outside purpose
Sensitive inference	health, religion, union, political, gambling, addiction, immigration	generally avoid, mask, or route to privacy review	model training or eligibility use without explicit policy

Suggested feature record:

feature_name
source_data_categories
consent_purpose
customer_requested_service
allowed_products
allowed_decisions
prohibited_decisions
model_ids
human_review_required
fairness_review_required
retention
revocation_action
evidence_refs
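
To make the feature record enforceable rather than descriptive, a serving-time check can compare the requested decision against it. The following is a minimal sketch using a subset of the fields above; the `feature_decision` function, its `allow` / `review` / `deny` outcomes and the sample policy are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeaturePolicy:
    """Subset of the suggested feature record."""
    feature_name: str
    consent_purpose: str
    allowed_decisions: frozenset
    prohibited_decisions: frozenset
    human_review_required: bool = False


def feature_decision(policy: FeaturePolicy, *, consent_purpose: str, decision: str) -> str:
    """Return "allow", "review" or "deny" for using a feature in a decision."""
    if policy.consent_purpose != consent_purpose:
        return "deny"  # feature was built under a different authorization
    if decision in policy.prohibited_decisions:
        return "deny"
    if decision not in policy.allowed_decisions:
        return "deny"  # default deny: anything not listed is out of bounds
    return "review" if policy.human_review_required else "allow"


# Hypothetical policy for a hardship-related feature.
overdraft_stress = FeaturePolicy(
    feature_name="overdraft_stress_90d",
    consent_purpose="cash_flow_assistant",
    allowed_decisions=frozenset({"supportive_outreach"}),
    prohibited_decisions=frozenset({"pricing", "pressure_sales"}),
    human_review_required=True,
)
```

The default-deny branch matters most here: a new decision type cannot use the feature until someone adds it to the record.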

Do not only ask "can the model predict it?". Ask:

Should this feature exist under this customer authorization?
Can the customer understand the value?
Can the institution explain the use?
Can revocation unwind future use?
Can monitoring detect misuse?

11. RAG over Account Data

RAG over account data is a high-value scenario and also a high-risk one. Account data is not a general knowledge base; it is consent-bound, customer-specific, time-sensitive financial evidence.

11.1 Safe RAG Pattern

user question
  -> authenticate user/session
  -> check consent and purpose
  -> determine allowed accounts/data categories/time window
  -> retrieve source transactions/balances/terms
  -> apply data quality/freshness filters
  -> ground answer with citations and caveats
  -> block prohibited advice/actions
  -> capture AI run and sources

11.2 RAG Controls

Control	What it prevents
Per-customer index or strict row-level retrieval	cross-customer leakage
Consent-aware retriever	use after revocation or outside purpose
Time-window filters	overcollection and stale answers
Source-grounded response	hallucinated financial facts
Quality and freshness tags	false precision in cash-flow advice
Prompt-injection filtering	merchant memo or transaction text manipulating the AI
Output policy	prohibited legal, tax, investment or credit conclusions
Embedding lifecycle	revoked data remaining searchable
Retrieval audit	inability to prove which data informed answer
Human escalation	hardship, scam, complaint or high-impact decision
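
Several of these controls (row-level isolation, consent-aware retrieval, time-window filters and retrieval audit) can be combined in the filter that runs before any similarity search or prompt assembly. The sketch below shows that filter over plain dictionaries. The `retrieve_rows` function, the row shape and the sample rows are hypothetical, and the sketch deliberately leaves out embedding search and prompt-injection handling.

```python
from datetime import date


def retrieve_rows(rows, *, customer_ref, active_consent_ids, allowed_categories,
                  window_start, window_end):
    """Return (selected_rows, audit_record) for one customer question.

    A row is eligible only if it belongs to the customer, its consent is
    still active, its category is in scope and it falls inside the window.
    """
    selected = [
        row for row in rows
        if row["customer_ref"] == customer_ref
        and row["consent_id"] in active_consent_ids
        and row["category"] in allowed_categories
        and window_start <= row["as_of"] <= window_end
    ]
    audit = {
        "customer_ref": customer_ref,
        "source_row_ids": [row["row_id"] for row in selected],
    }
    return selected, audit


sample_rows = [
    {"row_id": "r1", "customer_ref": "cust-1", "consent_id": "c-1",
     "category": "transactions", "as_of": date(2025, 3, 10)},
    {"row_id": "r2", "customer_ref": "cust-2", "consent_id": "c-9",
     "category": "transactions", "as_of": date(2025, 3, 11)},   # other customer
    {"row_id": "r3", "customer_ref": "cust-1", "consent_id": "c-0",
     "category": "transactions", "as_of": date(2025, 3, 12)},   # revoked consent
    {"row_id": "r4", "customer_ref": "cust-1", "consent_id": "c-1",
     "category": "transactions", "as_of": date(2024, 11, 1)},   # outside window
    {"row_id": "r5", "customer_ref": "cust-1", "consent_id": "c-1",
     "category": "payroll", "as_of": date(2025, 3, 15)},        # category out of scope
]
```

The audit record lists the exact source rows, which is what a later complaint or audit would need to see which data informed an answer.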

Weak answer:

"You can afford this loan because your bank data looks healthy."

Strong answer:

"Based on transactions you authorized for this budgeting feature,
your average monthly inflow over the selected three-month period was X
and recurring obligations we detected were Y.
This is a budgeting estimate, not a credit approval or financial advice.
Some transactions may be pending or miscoded."

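RAG should not:

Keep answering from historical embeddings whose consent has been revoked.
Treat a transaction memo as absolute fact.
Infer sensitive attributes from transactions and use them for eligibility.
Let an agent initiate a payment or application automatically without action approval.
Present data from a single account as the customer's complete financial picture.
Be unable, in a complaint, to list the source rows that supported an answer.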

12. Agentic AI and Action Boundaries

An AI agent may read, explain and recommend, but "action" must be governed separately. Open banking data access does not equal payment authorization, product application authorization, account change authorization or advice acceptance.

Agent capability	Example	Required control
Read	query balances, transactions, bills	consent + scope + customer/session auth
Reason	summarize cash flow, find subscriptions, identify fees	grounded RAG + quality caveats + prohibited-use guardrail
Recommend	suggest budget adjustments, remind about bills, suggest contacting the bank	recommendation boundary + customer explanation
Prepare	pre-fill applications, generate a dispute / hardship package	source citation + customer review
Submit	submit applications, send proofs, open a case	action-bound approval + evidence
Initiate value movement	payment, transfer, withdrawal	separate payment authorization, step-up, scam controls
Change account	update address, cancel subscription, close product	business authorization + customer confirmation

Agent authorization record:

agent_id
workflow_run_id
customer_ref
data_access_consent_id
allowed_tools
allowed_data_categories
purpose
time_bound
action_bound_approval_required
approval_id
revocation_reference
human_review_triggers
evidence_bundle_id

Key principles:

Data access consent lets the agent know.
Action authorization lets the agent do.
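
The split between knowing and doing can be expressed as a gate in front of every tool call. The sketch below uses a subset of the agent authorization record and the capability tiers from the table. The `authorize_tool_call` function, the capability identifiers, the `approved_capability` field and the tool names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Capability tiers from the table: the first group needs data-access consent only,
# the second group additionally needs an action-bound approval.
KNOW_CAPABILITIES = {"read", "reason", "recommend", "prepare"}
DO_CAPABILITIES = {"submit", "initiate_value_movement", "change_account"}


@dataclass(frozen=True)
class AgentAuthorization:
    """Subset of the agent authorization record."""
    agent_id: str
    customer_ref: str
    data_access_consent_id: str
    allowed_tools: frozenset
    approval_id: Optional[str] = None
    approved_capability: Optional[str] = None  # assumption: one approval covers one capability


def authorize_tool_call(auth: AgentAuthorization, *, tool: str, capability: str,
                        consent_active: bool):
    """Return (allowed, reason) for a single tool call."""
    if not consent_active:
        return False, "consent_inactive"
    if tool not in auth.allowed_tools:
        return False, "tool_not_allowed"
    if capability in DO_CAPABILITIES:
        # Data-access consent is never enough to act.
        if auth.approval_id is None or auth.approved_capability != capability:
            return False, "action_approval_missing"
    elif capability not in KNOW_CAPABILITIES:
        return False, "unknown_capability"
    return True, "allowed"


read_only_agent = AgentAuthorization(
    agent_id="agent-7",
    customer_ref="cust-1",
    data_access_consent_id="c-1",
    allowed_tools=frozenset({"get_balance", "create_transfer"}),
)
```

In this sketch, having `create_transfer` in `allowed_tools` is still not sufficient: the transfer is denied until an approval for that specific capability is attached to the record.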

13. Fraud, Scam and Access Controls

Open banking / open finance changes the fraud attack surface. Data sharing itself can become a fraud vector, especially when a malicious app obtains customer authorization, a scammer induces authorization, or account data is used for social engineering.

Threat	Pattern	Control
Malicious data recipient	an app collects large volumes of data under the guise of a budgeting service, then leaks or misuses it	developer onboarding, data-use attestation, monitoring, audit rights, suspension
Consent phishing	the customer is steered into authorizing a spoofed app or scammer-controlled service	verified developer identity, customer warning, domain/app reputation
Account takeover	an attacker logs into the customer's app and then authorizes data sharing	MFA/step-up, device risk, session risk, revocation notification
Credential sharing fallback	a third party bypasses the API and asks the customer to hand over bank credentials	contractual prohibition, detection, education, ecosystem enforcement
API exfiltration	a legitimate token is abused in bulk	rate limit, anomaly detection, token binding, IP/device intelligence
Authorized push payment scam	the AI / app recommends or prepares a payment to a scam payee	payee risk, cooling-off, step-up, scam intervention, human support
Data poisoning	merchant text / memo injects a prompt or causes misclassification	prompt-injection defense, data sanitation, source weighting
Synthetic affordability	a fraudster manipulates account inflows and outflows to fabricate income	graph/velocity controls, payroll/source diversity, fraud review
Revocation bypass	a downstream party keeps using cached or derived data	revocation propagation, retention rules, control testing
Third-party outage	the app cannot fetch data and the AI gives wrong advice	degrade mode, freshness warnings, retry and customer message

FFIEC-style design lessons:

Authentication risk assessment must include customers, employees, third parties, applications, service accounts and devices.
MFA or equivalent strength controls should be risk-proportionate, especially for high-risk access and actions.
Layered controls matter because any single control can fail.
Customer-permissioned entity access needs specific risk assessment, monitoring, logging and reporting.

14. Governance Model

Combine the NIST Privacy Framework, NIST AI RMF and ISO/IEC 42001 into an operating model:

Governance lens	Open banking question	AI control
Privacy risk	Have processing purpose, data minimization, retention, customer rights and data sharing risk been identified?	consent registry, purpose binding, privacy review, DPIA-like evidence
AI risk	Has the AI been mapped to customer impact, data inputs, model behavior, evals, monitoring and incident response?	AI RMF-style govern/map/measure/manage workflow
Management system	Are there policy, roles, process, performance evaluation, internal audit and continual improvement?	ISO 42001-style AIMS operating rhythm
Security and access	Is risk-based authentication applied to customer / third party / app / API / service account?	FFIEC-style risk assessment and layered access controls
Regulatory horizon	Are rule status, guidance, litigation, standard-setting and state/international changes kept up to date?	regulatory change workflow and product impact assessment

14.1 Operating Evidence

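Evidence	Why it matters
Use case risk assessment	Demonstrates why the data and the AI are needed
Consent / disclosure artifact	Demonstrates that the customer saw and authorized the scope
Data minimization matrix	Demonstrates that data was not over-collected
API contract review	Demonstrates that the technical boundary is enforceable
Third-party due diligence	Demonstrates that the recipient / aggregator is governed
Model/data policy	Demonstrates that AI use is bounded
Eval and red-team result	Demonstrates that the model does not exceed its authority, hallucinate or leak
Fraud threat model	Demonstrates that abuse scenarios are covered
Revocation test	Demonstrates that revocation affects collection/use/cache/features/embeddings
Evidence replay test	Demonstrates that complaints/audits can be replayed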
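
An evidence replay test can start with something as simple as checking that each case-level bundle links every step of the reliance chain. The sketch below computes missing links per bundle and a completeness ratio across bundles. The required link names are an assumed encoding of the chain (consent, API calls, data transformations, AI run, human decision, customer communication), and the function names and sample bundles are hypothetical.

```python
# Assumed link names for one evidence bundle; a real ledger would define its own schema.
REQUIRED_LINKS = (
    "consent_id",
    "api_request_ids",
    "transformation_versions",
    "ai_run_id",
    "decision_id",
    "customer_message_id",
)


def missing_links(bundle: dict) -> list:
    """Names of required links that are absent or empty in the bundle."""
    return [name for name in REQUIRED_LINKS if not bundle.get(name)]


def replay_completeness(bundles: list) -> float:
    """Share of bundles that can be replayed end to end (0.0 when there are none)."""
    if not bundles:
        return 0.0
    complete = sum(1 for bundle in bundles if not missing_links(bundle))
    return complete / len(bundles)


sample_bundles = [
    {"consent_id": "c-1", "api_request_ids": ["req-1"], "transformation_versions": ["normalize-v1"],
     "ai_run_id": "run-1", "decision_id": "dec-1", "customer_message_id": "msg-1"},
    {"consent_id": "", "api_request_ids": ["req-2"], "transformation_versions": ["normalize-v1"],
     "ai_run_id": None, "decision_id": "dec-2", "customer_message_id": "msg-2"},
]
```

A presence check like this only shows that the links exist; a fuller replay test would also follow each link and confirm the referenced artifact can be retrieved.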

15. Product / Architecture Decisions

Decision	Weak answer	Strong architecture answer
What data should we request?	“All transactions, just in case”	Request minimum categories, accounts and time window for the customer-requested service
How long should access last?	“Until user disconnects”	Purpose-specific duration, reauthorization, runtime consent check and expiration behavior
Can AI train on the data?	“It improves the product”	Separate explicit policy/consent, de-identification review, model-risk approval and opt-out handling
Can we use data for cross-sell?	“Customer connected bank data”	Only if separately permitted and consistent with purpose, policy and customer expectation
Direct API or aggregator?	“Fastest SDK”	Evaluate coverage, consent UX, contract, security, revocation propagation, data quality and exit
RAG architecture?	“Index all account data”	Customer/purpose-isolated retrieval, source grounding, freshness, revocation-aware embedding lifecycle
Fraud controls?	“Bank authenticated the customer”	Risk-based authentication, app reputation, access anomaly, scam intervention and human escalation
Revocation?	“Delete token”	Stop collection, notify ecosystem, enforce retention, disable derived features and purge/restrict embeddings
Advice boundary?	“AI gives financial advice”	Define information, education, recommendation, regulated advice and decision boundaries
Evidence?	“Logs exist”	Case-level bundle linking consent, API calls, data transforms, model run, decision and customer communication

16. Control Matrix

Control objective	Control activity	Evidence
Establish lawful/policy basis through governance	Legal/Compliance/Privacy review of role, jurisdiction, data type and product scope	approval record, interpretation memo
Minimize data collection	Data category/time-window/account selection tied to customer-requested service	minimization matrix, API scopes
Preserve informed authorization	Clear disclosure of recipient, provider, service, categories, duration and revocation	disclosure version, signed consent receipt
Enforce consent at runtime	Entitlement check for API, feature store, RAG and agent tools	consent check log, denied access events
Support revocation	Easy revocation, ecosystem notification, collection stop, retention and derived-data handling	revocation receipt, propagation log
Govern API access	Client registration, authentication, token binding, scopes, rate limits, anomaly monitoring	developer record, gateway logs
Manage third-party risk	Due diligence, contractual restrictions, audit rights, incident obligations, exit plan	TPRM file, contract clauses
Assure data quality	Freshness, completeness, dedup, correction, source lineage and quality scores	quality dashboard, lineage records
Bound AI features	Allowed/prohibited uses for features, inferences, training and decisions	feature policy, model card, eval result
Secure RAG	Customer/purpose isolation, source citations, prompt injection controls, retrieval logs	RAG eval, source trace
Control agent actions	Separate action-bound approval for submission/payment/account changes	approval id, action hash
Detect fraud/scam	Monitor malicious recipient, ATO, API exfiltration, APP scam and synthetic behavior	fraud dashboard, case evidence
Preserve audit replay	Link consent, API, data, AI, human and final customer message	evidence bundle
Improve continuously	Incident, complaint, model drift and data-quality CAPA	RCA, CAPA owner, closure evidence

17. Metrics and KRIs

Metric family	Examples
Customer value	successful connections, time-to-value, budgeting accuracy, fee savings, subscription cancellation success, cash-flow alert usefulness
Consent quality	disclosure comprehension, consent drop-off, over-scope defects, reauthorization completion, revocation success
Data minimization	average data categories requested, account/time-window scope, full-history request rate, unused data ratio
API ecosystem	developer approval time, API uptime, response time, error rates, provider coverage, schema variance
Data quality	freshness, completeness, duplicate rate, categorization confidence, correction rate, provider issue backlog
AI/RAG safety	grounded-answer rate, unsupported conclusion rate, source citation completeness, prompt-injection block rate
Feature governance	features with consent lineage, prohibited-use violations, sensitive inference incidents, revocation feature disablement
Fraud/scam	malicious recipient attempts, abnormal API calls, ATO-linked authorizations, scam escalation, manual review outcomes
Third-party risk	overdue reviews, subprocessor changes, audit findings, incident notifications, exit-test completion
Customer trust	complaints about data sharing, unclear use, revocation failure, wrong AI advice, perceived surveillance
Evidence	replay completeness, missing consent id rate, missing AI trace rate, retention-rule defects

Balanced executive dashboard:

Value: customers get useful financial outcomes.
Control: every data use is purpose-bound and revocable.
Safety: AI does not exceed consent, evidence or advice boundaries.
Trust: customers can understand, revoke and contest.
Ecosystem: APIs and third parties perform reliably and securely.
Evidence: each answer, recommendation and action can be replayed.
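
The Evidence metric family above (replay completeness, missing consent id rate, missing AI trace rate) could be computed from case-level evidence records roughly as follows. This is a sketch under stated assumptions: the `CaseEvidence` fields and the rule that a case is replayable only when all four identifiers are present are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseEvidence:
    # Hypothetical case-level record; None means the link is missing.
    case_id: str
    consent_id: Optional[str]
    api_request_id: Optional[str]
    ai_run_id: Optional[str]
    customer_message_id: Optional[str]


def rate(count: int, total: int) -> float:
    """Return count/total, or 0.0 when there are no cases."""
    return count / total if total else 0.0


def evidence_kris(cases: list[CaseEvidence]) -> dict[str, float]:
    total = len(cases)
    missing_consent = sum(1 for c in cases if not c.consent_id)
    missing_trace = sum(1 for c in cases if not c.ai_run_id)
    # Assumption: a case can be replayed only if every link is present.
    replayable = sum(
        1
        for c in cases
        if all([c.consent_id, c.api_request_id, c.ai_run_id, c.customer_message_id])
    )
    return {
        "missing_consent_id_rate": rate(missing_consent, total),
        "missing_ai_trace_rate": rate(missing_trace, total),
        "replay_completeness": rate(replayable, total),
    }


cases = [
    CaseEvidence("c1", "con-1", "req-1", "run-1", "msg-1"),
    CaseEvidence("c2", None, "req-2", "run-2", "msg-2"),
    CaseEvidence("c3", "con-3", "req-3", None, "msg-3"),
    CaseEvidence("c4", "con-4", "req-4", "run-4", "msg-4"),
]
kris = evidence_kris(cases)
```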

18. Failure Modes

Failure mode	Why dangerous	Better control
Consent screen as blanket waiver	customer authorization is read as permission for unlimited use	purpose-bound consent object and runtime enforcement
Generic data lake ingestion	open banking data loses consent, purpose and revocation metadata	consent-bound data zone and lineage tags
RAG indexes all account data	cross-purpose and post-revocation leakage	isolated, entitlement-aware retrieval
AI infers sensitive traits from transactions	unfairness, privacy harm, unsupported decisions	sensitive inference policy and prohibited uses
Revocation only deletes API token	cache/features/embeddings continue to use data	revocation propagation and lifecycle testing
API error treated as financial fact	model says customer has no funds when API failed	error-aware prompts and freshness gates
Aggregator selected only for coverage	hidden security, quality, contract and exit risks	TPRM and architecture review
Customer credentials still collected	defeats safer open banking model	prohibit credential sharing and monitor fallback
Developer app over-requests data	privacy harm and customer distrust	scope review, sandbox certification, monitoring
Model training on consented data by default	violates purpose expectation and data-use boundary	explicit AI training policy and approval
Agent initiates action under data consent	unauthorized payment/application/account changes	separate action authorization
Complaint lacks evidence chain	cannot explain or remediate	consent-to-AI-to-decision evidence bundle
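
The "API error treated as financial fact" row can be illustrated with a small freshness gate placed in front of the model. The sketch below is an assumption-laden example: `BalanceResponse`, the status labels and the 24-hour default `max_age` are hypothetical, and a real threshold would come from the scope contract.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class BalanceResponse:
    # Hypothetical shape of a provider or aggregator balance response.
    ok: bool
    balance_cents: Optional[int]
    as_of: Optional[datetime]
    error_code: Optional[str] = None


def balance_statement(
    resp: BalanceResponse,
    now: datetime,
    max_age: timedelta = timedelta(hours=24),
) -> dict:
    """Classify a response as a usable fact, stale data, or unavailable."""
    if not resp.ok or resp.balance_cents is None or resp.as_of is None:
        # A failed call says nothing about the customer's funds.
        return {"status": "unavailable", "reason": resp.error_code or "no_data"}
    if now - resp.as_of > max_age:
        return {"status": "stale", "as_of": resp.as_of.isoformat()}
    return {
        "status": "fact",
        "balance_cents": resp.balance_cents,
        "as_of": resp.as_of.isoformat(),
    }


now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)
failed = BalanceResponse(ok=False, balance_cents=None, as_of=None,
                         error_code="provider_timeout")
stale = BalanceResponse(ok=True, balance_cents=1200, as_of=now - timedelta(days=3))
fresh_zero = BalanceResponse(ok=True, balance_cents=0, as_of=now - timedelta(hours=1))
```

The point of the design is that a genuine zero balance and a failed call produce different statuses, so a prompt template can say "balance unavailable" instead of letting the model conclude the customer has no funds.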

19. Interview-Ready Takeaways

Q1: How does consented data sharing differ from data harvesting?

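Consented data sharing is purpose-bound, scoped, revocable and evidenced data access under a customer-requested service. Data harvesting uses broad consent to collect as much data as possible, then reuses it for unrelated purposes, training, advertising, cross-sell or resale. The architectural difference is whether consent actually reaches the API scope, data lineage, feature store, RAG retrieval, agent tools and revocation enforcement.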

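Q2: What is the most important control point in an AI open banking architecture?

Not any single API or model, but the consent registry acting as a runtime control plane. It has to drive API scopes, data lake tags, feature policies, RAG retrieval, agent permissions, retention and revocation. Without this control plane, AI can easily turn authorized access into unbounded use.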
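
As a rough sketch of the consent registry as a runtime control plane, the example below has every consumer (API gateway, feature store, RAG retriever, agent tool) call one `check` method and records each decision. The class and field names are hypothetical, and an in-memory dictionary stands in for whatever store a real registry would use.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Consent:
    # Hypothetical consent object; fields are assumptions for illustration.
    consent_id: str
    customer_id: str
    purpose: str
    data_categories: frozenset
    expires_at: datetime
    revoked: bool = False


class ConsentRegistry:
    def __init__(self) -> None:
        self._consents: dict[str, Consent] = {}
        self.decision_log: list[dict] = []

    def register(self, consent: Consent) -> None:
        self._consents[consent.consent_id] = consent

    def revoke(self, consent_id: str) -> None:
        self._consents[consent_id].revoked = True

    def check(self, consumer: str, customer_id: str, purpose: str,
              category: str, now: datetime) -> bool:
        """Allow access only under an active consent for this exact purpose."""
        allowed = any(
            c.customer_id == customer_id
            and c.purpose == purpose
            and category in c.data_categories
            and not c.revoked
            and now < c.expires_at
            for c in self._consents.values()
        )
        # Every decision, including denials, is kept as evidence.
        self.decision_log.append({
            "consumer": consumer, "customer_id": customer_id,
            "purpose": purpose, "category": category, "allowed": allowed,
        })
        return allowed


now = datetime(2025, 1, 15, tzinfo=timezone.utc)
registry = ConsentRegistry()
registry.register(Consent(
    "con-1", "cust-1", "budgeting",
    frozenset({"transactions", "balance"}),
    datetime(2025, 7, 1, tzinfo=timezone.utc),
))
```

In this sketch a request for the same data under a different purpose, such as `model_training`, is denied even though the category is covered, which is the boundary between authorized access and unbounded use.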

Q3: How should RAG over account data be designed?

Isolate retrieval by customer, purpose, consent and time window; check entitlement on every retrieval; require outputs to cite source rows and freshness; put prompt injection, sensitive inference, prohibited advice and agent actions under policy-engine control; after revocation, stop future retrieval and handle the embeddings lifecycle.
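
A minimal sketch of isolated retrieval with source citations follows. It assumes a hypothetical `RetrievalScope` derived from the customer's consent, and it uses plain keyword matching as a stand-in for vector search; only the filtering and citation shape are the point.

```python
from dataclasses import dataclass, replace
from datetime import date, datetime, timezone


@dataclass(frozen=True)
class RetrievalScope:
    # Hypothetical scope derived from the active consent.
    customer_id: str
    purpose: str
    categories: frozenset
    window_start: date
    consent_active: bool


@dataclass(frozen=True)
class SourceRow:
    row_id: str
    customer_id: str
    category: str
    posted_on: date
    text: str
    fetched_at: datetime


def retrieve(rows: list[SourceRow], scope: RetrievalScope,
             query_terms: list[str]) -> list[dict]:
    """Return only in-scope rows, each carrying its source id and freshness."""
    if not scope.consent_active:
        return []  # revoked or expired consent: no future retrieval
    hits = []
    for r in rows:
        in_scope = (
            r.customer_id == scope.customer_id
            and r.category in scope.categories
            and r.posted_on >= scope.window_start
        )
        # Keyword match is a placeholder for a real retriever.
        if in_scope and any(t.lower() in r.text.lower() for t in query_terms):
            hits.append({
                "row_id": r.row_id,
                "text": r.text,
                "fetched_at": r.fetched_at.isoformat(),
            })
    return hits


fetched = datetime(2025, 4, 1, tzinfo=timezone.utc)
rows = [
    SourceRow("r1", "cust-1", "transactions", date(2025, 3, 10), "Rent payment", fetched),
    SourceRow("r2", "cust-2", "transactions", date(2025, 3, 11), "Rent payment", fetched),
    SourceRow("r3", "cust-1", "transactions", date(2023, 1, 5), "Rent payment", fetched),
    SourceRow("r4", "cust-1", "identity", date(2025, 3, 1), "Rent address", fetched),
]
scope = RetrievalScope("cust-1", "budgeting", frozenset({"transactions"}),
                       date(2025, 1, 1), consent_active=True)
hits = retrieve(rows, scope, ["rent"])
```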

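Q4: Can open banking data be used for model training?

Not by default. Training use has to be assessed together with the customer-requested service, the authorization, privacy policy, contracts, applicable law and regulatory interpretation, de-identification, retention, model-risk approval and customer expectation. In particular, data authorized for budgeting or account verification must not be automatically reused for unrelated model training.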

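Q5: How should a Senior PM evaluate an open banking AI feature?

Look at value, consent, minimization, API reliability, data quality, third-party risk, model boundary, fraud/scam controls, revocation, evidence and customer trust together. What earns a launch is not "the model is smart" but "customer authorization is clear, system boundaries are enforceable, revocation works, results are explainable, and risk can be monitored."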

20. Practical Templates

Field	Example
use_case	AI cash-flow assistant
customer_requested_service	monthly cash-flow insight and bill reminder
data_provider	customer-selected bank or wallet provider
data_recipient	financial planning app
aggregator	approved aggregator if used
data_categories	transaction history, balance, upcoming bills
time_window	last 6 months plus current balance
purpose	budgeting and bill reminder
excluded_uses	targeted ads, unrelated cross-sell, sale of covered data, default model training
ai_allowed_uses	summarize, forecast, explain, alert
ai_prohibited_uses	credit eligibility, identity proofing, regulated advice without approval
revocation_action	stop collection, disable refresh, restrict features, purge/restrict embeddings per policy
evidence	consent receipt, API request ids, source rows, AI run id, final answer

20.2 API Scope Contract

Scope name:
Data category:
Fields:
Purpose:
Required consent version:
Maximum lookback:
Refresh frequency:
Freshness requirement:
Error semantics:
Revocation behavior:
Downstream sharing:
AI allowed uses:
AI prohibited uses:
Retention:
Evidence fields:
Owner:
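
A scope contract like the one above becomes enforceable once a few of its fields are machine-readable. The sketch below takes only the purpose and maximum-lookback fields and shows one possible way to clamp a request window; the `ScopeContract` type, the `plan_request` function and the 180-day value are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ScopeContract:
    # Hypothetical machine-readable subset of the scope contract template.
    scope_name: str
    data_category: str
    purpose: str
    max_lookback_days: int


def plan_request(contract: ScopeContract, purpose: str,
                 requested_from: date, today: date) -> date:
    """Return the earliest date the caller may request under this contract."""
    if purpose != contract.purpose:
        raise PermissionError(
            f"scope {contract.scope_name} is not approved for purpose {purpose!r}"
        )
    earliest_allowed = today - timedelta(days=contract.max_lookback_days)
    # Never reach further back than the contract allows.
    return max(requested_from, earliest_allowed)


contract = ScopeContract(
    scope_name="transactions.read",
    data_category="transaction history",
    purpose="budgeting",
    max_lookback_days=180,
)
today = date(2025, 6, 30)
clamped = plan_request(contract, "budgeting", date(2023, 6, 30), today)
unchanged = plan_request(contract, "budgeting", date(2025, 6, 1), today)
```

Clamping rather than rejecting is one design choice among several; a stricter gateway could refuse an over-scoped request and count it as an over-scope defect instead.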

20.3 AI Feature Boundary Record

Feature:
Source data categories:
Customer-requested service:
Consent ids:
Transformation lineage:
Allowed model ids:
Allowed decisions:
Prohibited decisions:
Sensitive inference risk:
Fairness review:
Human review triggers:
Revocation action:
Monitoring metric:
Evidence reference:

20.4 RAG Answer Evidence Record

Question:
Customer/session:
Consent check:
Purpose:
Retriever policy:
Source records:
Data freshness:
Quality caveats:
Prompt version:
Model version:
Output policy checks:
Final answer:
Customer message id:
Retention rule:

20.5 Revocation Runbook

Revocation event:
Customer:
Data recipient:
Data provider:
Aggregator:
Consent ids:
API token action:
Data collection stop time:
Cache action:
Feature action:
Embedding action:
Agent/session action:
Downstream notification:
Customer receipt:
Exceptions:
Evidence bundle:
Owner:
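
The runbook fields above map naturally onto an ordered list of propagation steps that each write to one evidence bundle. The following is a sketch under assumptions: the in-memory stores, handler names and the simulated downstream failure are all hypothetical, and real handlers would call the token service, cache, vector store and partner notification endpoints.

```python
from datetime import datetime, timezone

# Hypothetical in-memory stores keyed by consent id.
tokens = {"con-1": "tok-abc"}
cache = {"con-1": ["row-1", "row-2"]}
embeddings = {"con-1": ["vec-1"], "con-2": ["vec-9"]}


def revoke_token(consent_id: str) -> str:
    tokens.pop(consent_id, None)
    return "token revoked"


def purge_cache(consent_id: str) -> str:
    return f"{len(cache.pop(consent_id, []))} cached rows purged"


def notify_downstream(consent_id: str) -> str:
    # Simulated failure to show how exceptions are recorded.
    raise ConnectionError("aggregator endpoint unreachable")


def purge_embeddings(consent_id: str) -> str:
    return f"{len(embeddings.pop(consent_id, []))} embeddings purged"


def run_revocation(consent_id: str, handlers: list, now: datetime) -> dict:
    """Run every step, recording outcomes instead of stopping at a failure."""
    bundle = {"consent_id": consent_id, "started_at": now.isoformat(),
              "steps": [], "exceptions": []}
    for name, handler in handlers:
        try:
            detail = handler(consent_id)
            bundle["steps"].append({"step": name, "status": "done", "detail": detail})
        except Exception as exc:
            bundle["steps"].append({"step": name, "status": "failed"})
            bundle["exceptions"].append({"step": name, "error": str(exc)})
    bundle["complete"] = not bundle["exceptions"]
    return bundle


handlers = [
    ("api_token_action", revoke_token),
    ("cache_action", purge_cache),
    ("downstream_notification", notify_downstream),
    ("embedding_action", purge_embeddings),
]
bundle = run_revocation("con-1", handlers,
                        datetime(2025, 1, 15, tzinfo=timezone.utc))
```

Continuing past a failed step means an unreachable partner does not block local purges, while the `exceptions` list keeps the revocation visibly incomplete until someone owns the retry.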

21. Final Operating Principle

A mature AI open banking / open finance architecture can be tested with one question:

Can the institution prove that every AI answer, feature, recommendation and action
used only the financial data the customer authorized for that purpose,
through governed APIs and third parties,
with quality and freshness understood,
with revocation propagated,
with fraud and scam controls active,
with inferences separated from facts,
and with evidence sufficient for audit, complaint and customer trust?

If the answer is unclear, the enterprise is not missing an open banking connector. It is missing a consent control plane, API ecosystem governance, AI feature boundaries, RAG evidence, third-party risk controls, fraud/scam monitoring, a revocation lifecycle and a customer trust operating model.