
AI Open Banking / Open Finance: Consented Data Sharing Architecture


802ai-foundations/papers/135-ai-open-banking-open-finance-consented-data-sharing-architecture.md

AI Open Banking / Open Finance / Consented Data Sharing Architecture: An Explainer

Audience: Advanced AI PM / Senior BA / Product Architect / Enterprise Architect / Financial Retail Architect / Data Product Owner / API Platform Lead / Fraud Risk Architect / Privacy / Compliance / Model Risk / Third-Party Risk / Customer Trust Lead.

Core question: How can an AI system use customer-authorized financial data in the open banking / open finance ecosystem so that it delivers real customer value while controlling consent, data minimization, revocation, API contract, third-party risk, RAG grounding, model feature boundary, fraud/scam, audit evidence and customer trust?

Learning objectives: Build a working model of consented data sharing architecture, the data provider / data recipient / aggregator role model, developer onboarding, API ecosystem governance, AI/RAG over account data, feature boundary, data quality lineage, revocation handling, fraud controls, third-party oversight, evidence replay and a senior PM/architect decision framework.

0. Disclaimer

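This document is learning, architecture-training and portfolio material. It does not constitute legal advice, a regulatory opinion, a CFPB 1033 applicability determination, a compliance-deadline determination, an entity-obligation determination, a judgment on the sufficiency of third-party authorization, a privacy impact assessment conclusion, a model validation report, an information security certification, a consumer notice recommendation, a vendor recommendation or approval for business launch.

Real projects must be assessed jointly by Legal, Compliance, Privacy, Information Security, Fraud Risk, Financial Crime, Model Risk, Data Governance, API Platform, Third-Party Risk Management, Vendor Management, Customer Experience, Operations, Accessibility, Product Owner, Architecture, Internal Audit and, where necessary, external advisers. Whether CFPB 1033 or any other open banking / open finance requirement applies depends on product, entity role, data type, customer relationship, jurisdiction, rule status, litigation / regulatory developments, contract structure, partner role, data flow, use case and Legal / Compliance interpretation.

This document discusses architecture and governance methods only. It does not infer whether any institution is a data provider, authorized third party, data aggregator, service provider or covered entity; whether any data field is covered data; or any compliance date, grace period, exemption, litigation impact or regulatory interpretation. All rule-status and implementation questions must defer to Legal / Compliance's latest reading of official sources.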


Source Anchors

| Source | Link | Purpose |
| --- | --- | --- |
| CFPB Personal Financial Data Rights final rule | https://www.consumerfinance.gov/rules-policy/final-rules/required-rulemaking-on-personal-financial-data-rights/ | Official anchor for U.S. personal financial data rights, secure and reliable consumer / authorized third-party data access, privacy protections and the direction of open banking |
| CFPB Regulation 1033 / Personal Financial Data Rights | https://www.consumerfinance.gov/rules-policy/regulations/1033/ | Uses the § 1033 structure (covered data, developer interface, authorized third party, authorization disclosure, third-party obligations, revocation and recordkeeping) to organize the architecture questions |
| CFPB Personal financial data rights compliance resource | https://www.consumerfinance.gov/compliance/compliance-resources/other-applicable-requirements/personal-financial-data-rights/ | Entry point for rule status, implementation resources, official interpretations and regulatory-development watch; this document does not infer applicability or deadlines |
| CFPB guidance / circulars landing page | https://www.consumerfinance.gov/compliance/circulars/ | Official entry point for CFPB guidance / circulars / supervisory materials; supports horizon scanning and the policy update workflow |
| FFIEC Authentication and Access to Financial Institution Services and Systems | https://www.ffiec.gov/press/pr081121.htm | Organizes access controls around authentication, access risk assessment, layered security, MFA / equivalent controls and third-party / customer-permissioned entity access risk |
| NIST Privacy Framework | https://www.nist.gov/privacy-framework | Organizes privacy-by-architecture around privacy risk management, data processing purpose, governance, control design, customer trust and minimization |
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Uses the Govern / Map / Measure / Manage approach to organize AI risk identification, eval, monitoring, incident response and evidence |
| ISO/IEC 42001 overview | https://www.iso.org/standard/42001 | Builds the operating model from AI management system, policy, roles, operation, performance evaluation, internal audit and continual improvement |

In one sentence:

Consented financial data sharing is not "AI can read bank data". It is a purpose-bound, revocable, API-mediated trust architecture that lets AI consume only authorized data, reason within governed boundaries, and act only with explicit customer and policy authorization.


1. Thesis

The core architectural change in AI open banking / open finance is not replacing screen scraping with APIs, nor handing bank statements to an LLM to summarize. The real change is:

from: broad data harvesting + credential sharing + opaque enrichment
to: customer-authorized access + standardized API contract
  + purpose-bound use + revocation + evidence replay
  + AI data-use boundary + fraud and third-party controls

A mature architecture must answer twelve questions at once:

  1. Which data recipient has the customer authorized, for which requested product or service, to access which data categories, for how long and how often?
  2. What role does each of the data provider, data recipient, data aggregator, developer app, AI platform and downstream processor play?
  3. How does the API contract express scope, field semantics, quality, latency, pagination, error, consent state, revocation and incident behavior?
  4. Does the AI use only data authorized for the current purpose, and does it block cross-sell, targeted advertising, unrelated profiling and unrestricted training?
  5. Can the customer revoke as easily as they authorized, and after revocation how do collection, use, retention, cache, vector index, features and agent sessions stop or degrade?
  6. Is data minimization a shared constraint across product strategy, API scope, feature engineering, prompt grounding and retention policy, or is it only privacy copy?
  7. Does RAG over account data achieve tenant / customer isolation, source citation, freshness, entitlement check, prompt-injection defense and evidence capture?
  8. Do model features have allowed uses / prohibited uses, and do they turn transaction patterns into unauthorized sensitive inferences?
  9. Do fraud / scam controls cover account takeover, authorized push payment scam, malicious third-party app, data exfiltration, synthetic behavior and agent overreach?
  10. Does developer onboarding include security, privacy, data-use, operational, financial crime, model-risk and exit controls?
  11. Is the customer value clear enough that customers understand why they should share data and what they get in return?
  12. Can authorization disclosure, consent receipt, API calls, data transformations, AI run, human decision, customer communication and revocation handling be replayed after the fact?

Key principles:

Consented does not mean unlimited.
Available through API does not mean usable for every model.
Data portability does not mean downstream data resale.
Account history does not mean behavioral surveillance license.
AI personalization does not override purpose limitation.
Revocation must affect data, features, embeddings, agents and evidence.

2. Why It Matters

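Retail financial AI is pushing open banking from a data-access program toward an AI decision and action ecosystem:

| Pressure | How it shows up | Architecture implication |
| --- | --- | --- |
| Consumer data portability | Customers want to bring accounts, transactions, balances, bills and account verification information to new services | Needs API-based, standardized, machine-readable, secure access; cannot rely on customers sharing credentials |
| Personal financial management | AI budgeting assistants, cash-flow forecasting, subscription optimization, debt advice and wealth advice need account data | Needs purpose-bound access, explainability, data freshness, recommendation suitability boundary |
| Open finance expansion | Extends beyond banking data to payroll, tax, brokerage, insurance, pension, merchant, wallet, loyalty | Needs cross-domain consent registry, semantic mapping, data quality and jurisdiction policy |
| Embedded finance | Third-party apps call financial data in lending, payments, savings, BNPL, insurance and merchant tools | Needs developer onboarding, TPRM, API controls, consumer disclosure and monitoring |
| Agentic AI | AI agents query accounts, prepare applications, compare products and initiate service requests on the customer's behalf | Needs tiered read / reason / recommend / act authorization, action-bound approval and revoke propagation |
| Fraud industrialization | Malicious apps, scam scripts, ATO, social engineering, data exfiltration, credential stuffing | Needs FFIEC-style layered access controls, behavioral monitoring, third-party anomaly controls |
| Customer trust | Customers worry that "bank data is used to train models, sell ads and cross-sell, and cannot be withdrawn" | Needs transparent value proposition, minimization, revocation receipt, complaint route and audit evidence |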

The question for a senior PM / architect is not:

How do we plug Plaid-like data into the AI feature?

but rather:

For this customer-requested outcome,
what data is necessary,
which party is authorized to collect it,
how is access technically constrained,
how will AI use be bounded and evidenced,
what happens on revocation,
and how do we prove the ecosystem behaved as promised?

3. Consented Data Sharing vs Unrestricted Data Harvesting

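The boundary between consented data sharing and unrestricted data harvesting must be written into the product, API, data and model architecture, not only into the privacy notice.

| Dimension | Consented data sharing | Unrestricted data harvesting |
| --- | --- | --- |
| Customer intent | Customer authorizes specific data access for a requested product / service | Customer is forced to accept broad terms; data is used for unrelated purposes |
| Access method | API / developer interface / scoped token / consent state / audit log | Screen scraping, credential sharing, opaque SDKs, brokered datasets |
| Scope | Data categories, duration, frequency, provider, recipient and purpose are limited | Take as much as possible, retain long term, reuse across products |
| Data minimization | Request only the fields needed for the current value proposition | Collect first, look for a use later |
| AI use | Grounded, purpose-bound, with an explicit feature boundary | Used for training, profiling, advertising, cross-sell or inferring sensitive attributes |
| Revocation | Revocation affects collection, API token, cache, feature, embedding, agent workflow | Only new collection stops; historical data keeps drifting into use |
| Evidence | Consent receipt, API calls, data lineage, AI runs and decision logs can be replayed | No way to prove afterwards what the customer agreed to |
| Customer control | Can view, revoke, complain, correct, port | Customer does not know who holds the data or how to stop it |

Litmus test: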

If the customer cannot explain who gets what data for what purpose,
and the institution cannot technically enforce that boundary,
the architecture is data harvesting with a consent screen.

4. Ecosystem Role Model

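AI open banking must establish role clarity before data flow. One institution may act as data provider, data recipient, aggregator client, model operator or service provider at the same time across different journeys.

| Role | Responsibility | AI architecture question |
| --- | --- | --- |
| Customer / consumer | Authorize, revoke, view sharing history, request service, correct errors | Does the customer understand how AI will use the data, and can they revoke and get an explanation? |
| Data provider | Financial service provider that holds account, transaction, balance, terms, bill or account verification information | Is the developer interface secure, reliable, standardized and monitorable, without requiring third parties to use customer credentials? |
| Authorized data recipient | Party that accesses data for the product/service the customer requested | Does it collect, use and retain only reasonably necessary data, and can it prove purpose limitation? |
| Data aggregator | Intermediary or service provider connecting provider and recipient | Does the aggregator have contract pass-through, security, revocation propagation, accuracy and incident controls? |
| Developer app | Customer-facing application, possibly embedding an AI assistant | Has the app passed onboarding, security review, data-use attestation and monitoring? |
| AI system / model operator | Uses data for summarization, prediction, recommendation, automation or agent action | Can the model access only authorized context, and are data class and purpose treated as runtime controls? |
| API platform | Manages client registration, token, scope, rate limit, schema, observability | Is scope granular enough, and is error / revocation / outage behavior part of the contract? |
| Data governance | Manages lineage, quality, retention, catalog, feature store, embedding store | Is open banking data tagged as a consent-bound asset? |
| Fraud / security | Prevents unauthorized access, scam, ATO, malicious recipient, exfiltration | Are third-party access patterns and customer-permissioned entity risk monitored? |
| Legal / Compliance / Privacy | Interprets obligations; reviews notice, approval, retention, customer rights | Are rule status, jurisdiction, entity role and data type kept continuously up to date? |

Architecturally, it helps to model each use case as a data sharing passport: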

use_case_id
customer_requested_product_or_service
data_provider
data_recipient
aggregator_if_any
data_categories
purpose
duration
frequency
consent_version
revocation_route
ai_allowed_uses
ai_prohibited_uses
retention_rule
evidence_bundle_id

5. Data Scope and Open Finance Taxonomy

Open banking often starts with account and payment data; open finance extends to a broader map of financial life. The AI architecture must avoid reading that broader scope as "any useful data is fair game".

| Data family | Examples | AI value | Boundary |
| --- | --- | --- | --- |
| Account identity / verification | name, address, email, phone, account identifier, account ownership signal | account opening, payout setup, fraud review | Not equivalent to full KYC; should not expand into unrelated identity profiling |
| Transaction history | amount, date, merchant, category, pending status, fees | cash-flow intelligence, affordability, budget, scam detection | High sensitive-inference risk; needs purpose and feature controls |
| Balance and liquidity | current / available balance, overdraft, credit limit | cash-flow warning, transfer timing, credit line review | freshness and timing critical; stale data can harm customers |
| Terms and conditions | fees, APR/APY, credit limit, rewards, overdraft opt-in | product comparison, fee optimization, advice | AI must cite source and avoid unsupported legal interpretation |
| Upcoming bills / scheduled payments | due amount, due date, biller, recurring obligations | proactive reminders, liquidity planning, hardship support | notification and action authorization must be separate |
| Payment initiation information | routing/account/tokenized information where applicable | account funding, verification, payment setup | high-risk; action permission and fraud controls separate from data access |
| Payroll / income | employer, pay frequency, net/gross pay, tax data | income verification, affordability, benefits | open finance extension; stronger consent, fairness, accuracy controls |
| Brokerage / pension / insurance | holdings, contributions, claims, premiums | holistic advice, risk planning | regulated advice and suitability boundaries likely apply |
| Merchant / small-business finance | POS sales, invoices, receivables, inventory, bank feeds | SME underwriting, cash-flow forecast, working capital | business consent, multi-user authority, data quality and role authority |

Design rule:

Each data family must carry:
source, consent, purpose, freshness, quality, allowed uses,
AI use boundary, downstream sharing rule, retention, revocation action.

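Consent is the customer's expression of permission to access and use data. Authorization is the system and policy allowing a given actor to do a given thing. Authentication is the control that confirms who is in control of the current session / user / client. The three must not be conflated.

| Layer | Question | Artifact |
| --- | --- | --- |
| Customer disclosure | Did the customer see the third party, data provider, product/service, data categories, duration and revocation method? | authorization disclosure version, UI capture |
| Express consent | Did the customer explicitly authorize, and can the signature/confirmation record be replayed? | consent receipt, timestamp, channel, language |
| API authorization | Did the client / aggregator obtain a scoped token that can access only the authorized categories? | OAuth grant, scope, token hash, mTLS / client credential |
| Runtime entitlement | Does every API / RAG / feature call check consent state and purpose? | entitlement check log |
| Revocation | After revocation, does new collection stop, are provider/aggregator/downstream notified, and are cache and retention handled? | revocation receipt, propagation log |
| Reauthorization | Is there a reauthorization mechanism for long-lived or continuous access? | reauthorization event |
| Action authorization | When an AI agent or app is about to make a payment, submit an application or change an account, is separate approval obtained? | action approval, transaction confirmation |

Suggested fields for the consent object: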

consent_id
customer_ref
data_provider_id
data_recipient_id
aggregator_id
requested_product_or_service
data_categories
field_scope
purpose
collection_frequency
duration_limit
granted_at
expires_at
revoked_at
revocation_method
language
disclosure_version
ai_allowed_uses
ai_prohibited_uses
training_allowed_flag
downstream_sharing_allowed_flag
evidence_bundle_id
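
A minimal sketch of how a subset of the consent fields above could be held in a typed object whose state is derived at read time instead of stored as a free-text flag. The `ConsentState` enum, the `state()` method, the default-deny flags and all sample values are illustrative assumptions, not part of any rule or standard.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import FrozenSet, Optional


class ConsentState(Enum):
    ACTIVE = "active"
    EXPIRED = "expired"
    REVOKED = "revoked"


@dataclass(frozen=True)
class Consent:
    consent_id: str
    customer_ref: str
    data_provider_id: str
    data_recipient_id: str
    requested_product_or_service: str
    data_categories: FrozenSet[str]
    purpose: str
    granted_at: datetime
    expires_at: datetime
    revoked_at: Optional[datetime] = None
    # Assumption: training and downstream sharing are denied unless set explicitly.
    training_allowed_flag: bool = False
    downstream_sharing_allowed_flag: bool = False

    def state(self, now: datetime) -> ConsentState:
        # Revocation takes precedence over expiry.
        if self.revoked_at is not None and now >= self.revoked_at:
            return ConsentState.REVOKED
        if now >= self.expires_at:
            return ConsentState.EXPIRED
        return ConsentState.ACTIVE


# Hypothetical example: 90-day access for a cash-flow assistant.
granted = datetime(2025, 1, 1, tzinfo=timezone.utc)
consent = Consent(
    consent_id="c-001",
    customer_ref="cust-1",
    data_provider_id="bank-a",
    data_recipient_id="cashflow-app",
    requested_product_or_service="cash-flow assistant",
    data_categories=frozenset({"transaction_history", "balance"}),
    purpose="cash_flow_assistant",
    granted_at=granted,
    expires_at=granted + timedelta(days=90),
)
revoked = replace(consent, revoked_at=granted + timedelta(days=10))
```

Deriving the state from timestamps means an expired or revoked consent cannot stay "active" just because nobody updated a status column.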

Weak design:

Customer clicked "Connect bank"; all transaction data enters generic data lake.

Strong design:

Customer authorized cash-flow assistant to access transaction history and balance
from Bank A for 90 days.
API tokens are scoped.
RAG and feature store check consent before use.
Cross-sell, targeted ads and model training are prohibited unless separately authorized and approved.
Revocation stops collection, disables derived features where required, and closes agent sessions.
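
The strong design can be read as a runtime entitlement check that every API, RAG or feature call passes through. The sketch below is one possible shape under stated assumptions: `ConsentGrant`, `Decision`, `check_entitlement` and the reason codes are hypothetical names, and a real system would look the grant up from a consent registry.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ConsentGrant:
    consent_id: str
    active: bool
    purpose: str
    data_categories: FrozenSet[str]
    prohibited_uses: FrozenSet[str]


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str  # recorded in the entitlement check log


def check_entitlement(grant: ConsentGrant, purpose: str,
                      category: str, use: str) -> Decision:
    """Deny unless consent is active and purpose, category and use all fit."""
    if not grant.active:
        return Decision(False, "consent_not_active")
    if purpose != grant.purpose:
        return Decision(False, "purpose_mismatch")
    if category not in grant.data_categories:
        return Decision(False, "category_out_of_scope")
    if use in grant.prohibited_uses:
        return Decision(False, "prohibited_use")
    return Decision(True, "ok")


# Hypothetical grant mirroring the cash-flow assistant example.
grant = ConsentGrant(
    consent_id="c-001",
    active=True,
    purpose="cash_flow_assistant",
    data_categories=frozenset({"transaction_history", "balance"}),
    prohibited_uses=frozenset({"cross_sell", "targeted_ads", "model_training"}),
)

print(check_entitlement(grant, "cash_flow_assistant", "balance", "forecast"))
print(check_entitlement(grant, "cash_flow_assistant", "balance", "model_training"))
```

Returning a reason code rather than a bare boolean gives the evidence ledger something to store for denied access events.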

7. Reference Architecture

Reference architecture:

customer
  -> consent and authorization disclosure UX
  -> data recipient / developer app
  -> aggregator or direct API connection
  -> data provider developer interface
  -> API gateway / auth / scope / rate limit
  -> raw consent-bound data zone
  -> normalization / quality / lineage service
  -> purpose-bound feature store
  -> customer-isolated vector / retrieval index
  -> AI orchestration and tool policy
  -> recommendation / decision / agent action workflow
  -> human review / fraud / operations
  -> customer communication
  -> evidence ledger / monitoring / revocation service

Key components:

| Component | Responsibility | Senior design question |
| --- | --- | --- |
| Consent registry | Stores authorization disclosure, scope, duration, revocation, AI allowed uses | Is it the runtime enforcement source, or only a record? |
| Developer onboarding portal | App registration, certification, security review, sandbox, contract, data-use attestations | Can onboarding keep low-trust apps out of the production API? |
| API gateway | client authentication, token validation, scope, rate limit, mTLS/FAPI-style controls, logging | Can the API restrict access by data category and purpose? |
| Data provider interface | Standardized, machine-readable, reliable, monitorable data interface | Is there an uptime, error semantics, support, dispute and incident process? |
| Data normalization layer | merchant/category mapping, schema mapping, dedup, timezone, currency, pending/posted status | Does enrichment preserve source lineage and uncertainty? |
| Quality service | completeness, freshness, accuracy feedback, provider variance, drift | Does the AI know when data is incomplete, stale or unconfirmed? |
| Feature boundary service | Maps consent-purpose to allowed features and prohibited features | Can model features be traced to consent and purpose? |
| Retrieval layer | customer-specific source retrieval, embedding lifecycle, source citation, entitlement check | Does RAG avoid leakage across customers, purposes and products? |
| AI policy engine | Controls prompt, tools, outputs, action approval, human review | Can the AI be denied by policy, or only reminded by a prompt? |
| Fraud and scam controls | third-party anomaly, ATO, APP scam, malicious recipient, velocity, device | Does open banking data become leverage for scammers? |
| Evidence ledger | Links consent, API, data quality, AI run, decision, communication, revocation | Can a complaint or audit replay the whole reliance chain? |

Architecture boundary:

consent registry
  -> entitlement checks
  -> API scopes
  -> data lake tags
  -> feature store policies
  -> vector index lifecycle
  -> prompt/tool policies
  -> evidence retention

If consent is only displayed in the front end and does not feed the control chain above, it is not production-grade consented data sharing.
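
One way to picture consent feeding that control chain is a revocation service that fans a single revocation out to every downstream component and records the outcome of each step. This is a sketch under assumptions: `RevocationService`, the component names and the in-memory dictionaries stand in for real token stores, vector indexes, feature stores and agent session managers.

```python
from typing import Callable, Dict, List, Tuple

Handler = Callable[[str], None]


class RevocationService:
    """Fans a revocation out to registered components and logs each result."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, component: str, handler: Handler) -> None:
        self._handlers[component] = handler

    def revoke(self, consent_id: str) -> List[Tuple[str, str]]:
        propagation_log: List[Tuple[str, str]] = []
        for component, handler in self._handlers.items():
            try:
                handler(consent_id)
                propagation_log.append((component, "done"))
            except Exception as exc:
                # A failing component must not block the others; it is logged
                # so that it can be retried and evidenced.
                propagation_log.append((component, f"failed: {exc}"))
        return propagation_log


# Hypothetical in-memory stand-ins for downstream stores.
tokens = {"c-001": "tok-abc"}
embeddings = {"c-001": [[0.1, 0.2]], "c-002": [[0.3, 0.4]]}
sessions = {"c-001": "open"}


def close_sessions(consent_id: str) -> None:
    sessions[consent_id] = "closed"


def broken_feature_store(consent_id: str) -> None:
    raise RuntimeError("feature store unavailable")


service = RevocationService()
service.register("api_token", lambda cid: tokens.pop(cid, None))
service.register("vector_index", lambda cid: embeddings.pop(cid, None))
service.register("feature_store", broken_feature_store)
service.register("agent_sessions", close_sessions)

log = service.revoke("c-001")
```

The returned list plays the role of the propagation log: a partial failure is visible per component instead of being hidden behind a single "revoked" flag.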


8. API Contract and Developer Onboarding

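The open banking product experience is determined by the API contract, not by any single AI prompt. The API contract should cover data semantics, reliability, authorization state, errors, revocation and operational commitments.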

8.1 API Contract

| Domain | Contract requirement | AI impact |
| --- | --- | --- |
| Data categories | Clear classification into transaction, balance, terms, bill, verification and so on | The model does not misuse fields for the wrong purpose |
| Schema semantics | posted vs pending, merchant vs payee, balance type, timezone, currency | Cash-flow models and RAG answers do not reason incorrectly |
| Consent state | active, expired, revoked, suspended, reauthorization needed | runtime entitlement and agent stop conditions |
| Field-level scope | read categories and optional fields tied to consent | minimization and feature gating |
| Freshness | last_updated_at, as_of_time, provider latency | AI output must state the data's as-of time |
| Error model | unavailable, unauthorized, provider_denied, scope_missing, customer_action_needed | AI must not treat an API error as a fact about the customer's finances |
| Pagination and history | historical window, cursor, duplicates, revision handling | transaction analysis is reproducible |
| Quality feedback | dispute, correction, categorization feedback, provider issue | model drift and data quality improvement |
| Rate and performance | response SLO, scheduled downtime, throttling | agent workflows degrade gracefully |
| Security | client auth, token binding, encryption, key rotation, logging | third-party access is monitorable |
| Revocation | notification, token termination, downstream propagation, retention action | lifecycle of model features and embeddings is controlled |
| Audit | correlation id, consent id, request id, response hash | evidence replay is possible |
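
To illustrate the error-model and freshness rows, the sketch below turns an API response into the single line that is handed to the model as grounding. The error codes come from the table; `BalanceResponse`, `grounding_line` and the wording of the messages are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Error codes listed in the contract table above.
ERROR_CODES = {
    "unavailable",
    "unauthorized",
    "provider_denied",
    "scope_missing",
    "customer_action_needed",
}


@dataclass(frozen=True)
class BalanceResponse:
    ok: bool
    available_balance: Optional[float] = None
    as_of_time: Optional[str] = None
    error_code: Optional[str] = None


def grounding_line(resp: BalanceResponse) -> str:
    """Build grounding text; an error never becomes a financial fact."""
    if resp.ok:
        if resp.available_balance is None or resp.as_of_time is None:
            raise ValueError("successful response must carry balance and as_of_time")
        return (f"Available balance {resp.available_balance:.2f} "
                f"as of {resp.as_of_time}.")
    if resp.error_code not in ERROR_CODES:
        raise ValueError(f"unknown error code: {resp.error_code}")
    return (f"Balance data is not available ({resp.error_code}); "
            "do not state a balance.")


good = BalanceResponse(ok=True, available_balance=1250.5,
                       as_of_time="2025-01-31T09:00:00Z")
denied = BalanceResponse(ok=False, error_code="scope_missing")

print(grounding_line(good))
print(grounding_line(denied))
```

Without this separation, a failed call could be read as a zero balance or an empty transaction history, which is a different claim from "the data could not be fetched".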

8.2 Developer / Third-Party Onboarding

| Gate | Evidence | Reject / restrict when |
| --- | --- | --- |
| Business purpose review | product/service description, data categories, customer value | purpose too broad, data request not necessary |
| Security review | secure SDLC, auth, encryption, secrets, vulnerability mgmt, incident response | credential sharing, weak token handling, no monitoring |
| Privacy review | minimization, retention, downstream sharing, AI use, customer rights | targeted ads / resale / unrelated training built into model |
| Operational review | support, uptime, reconciliation, error handling, revocation process | no customer support or revocation workflow |
| Model-risk review | AI use case, features, prompts, evals, human oversight | identity/eligibility/credit decisions unsupported |
| Fraud review | scam controls, anomalous access, device/account risk | high-risk journeys lack step-up or monitoring |
| Contracting | data-use restrictions, audit rights, subprocessors, breach notice, exit | cannot enforce obligations downstream |
| Sandbox certification | test cases, negative scenarios, consent/revocation tests | app fails scope, revocation, data minimization |

Developer onboarding must turn "can call the API" into "can be governed". For AI apps, additionally collect:

model providers
prompt / tool architecture
whether customer data enters model training
feature store and vector store lifecycle
customer-facing claims
human review triggers
advice / recommendation boundaries
eval evidence
complaint and correction flow

9. Data Quality, Enrichment and Lineage

Availability of open banking data does not imply high quality. Transaction data is especially prone to problems with merchant normalization, pending/posted duplication, category inconsistency, refund matching, subscription detection, timezone, currency, joint accounts and business/personal mixing.

| Quality issue | AI failure | Control |
| --- | --- | --- |
| Pending and posted duplicates | AI overestimates spending or raises false anomaly alerts | transaction lifecycle state and dedup logic |
| Merchant alias ambiguity | Misidentifies merchants, subscriptions or fraud | merchant enrichment confidence + source trace |
| Category inconsistency | Wrong budgeting advice, cash-flow forecast drift | model-aware taxonomy mapping and feedback |
| Missing historical data | affordability or trend model biased | coverage metric and answer caveat |
| Stale balance | AI suggests a transfer/payment that causes an overdraft | freshness gate and action block |
| Joint account context | AI attributes joint-account behavior to a single customer | account ownership and household boundary |
| Business/personal commingling | SME cash-flow model misreads the state of the business | account purpose classification with review |
| Data provider variance | Field semantics differ across banks | provider-specific schema adapters |
| Customer correction ignored | Errors keep affecting recommendations | correction workflow and feature recompute |
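
As an illustration of the first row, the sketch below drops pending transactions that a posted transaction has superseded. It assumes the provider links the two through a `pending_txn_id` field on the posted record; that field, the `Txn` type and the function name are hypothetical, and providers without such a link would need fuzzier matching. Amounts are held in integer cents to avoid floating-point drift.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Txn:
    txn_id: str
    amount_cents: int
    status: str  # "pending" or "posted"
    # Assumption: set on a posted transaction that replaces a pending one.
    pending_txn_id: Optional[str] = None


def drop_superseded_pending(txns: List[Txn]) -> List[Txn]:
    """Remove pending transactions that already have a posted counterpart."""
    superseded = {
        t.pending_txn_id
        for t in txns
        if t.status == "posted" and t.pending_txn_id
    }
    return [
        t for t in txns
        if not (t.status == "pending" and t.txn_id in superseded)
    ]


feed = [
    Txn("p1", 4200, "pending"),
    Txn("t1", 4200, "posted", pending_txn_id="p1"),  # same purchase as p1
    Txn("p2", 999, "pending"),                       # not yet posted
]

clean = drop_superseded_pending(feed)
naive_total = sum(t.amount_cents for t in feed)
deduped_total = sum(t.amount_cents for t in clean)
```

In this example the naive sum counts the same purchase twice, which is exactly how an assistant ends up overstating spending.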

Enrichment should be layered:

raw_provider_data
normalized_data
enriched_data
derived_features
model_inferences
customer_visible_answer

Each layer should retain:

source
transformation_version
confidence
consent_id
purpose
quality_score
freshness
allowed_uses
retention_rule
revocation_action
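
A minimal sketch of carrying a few of these attributes from one layer to the next. The `Record` type, the `derive` helper and the transformation version strings are illustrative assumptions; the rule that a derived record's confidence never exceeds its parent's is a design choice made for this example, not something the source prescribes.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Record:
    layer: str
    value: object
    source: str
    consent_id: str
    purpose: str
    confidence: float
    transformations: Tuple[str, ...] = ()


def derive(parent: Record, layer: str, value: object,
           transformation_version: str, confidence: float) -> Record:
    """Create a next-layer record that inherits consent, purpose and source."""
    return replace(
        parent,
        layer=layer,
        value=value,
        # Assumption: uncertainty accumulates, so confidence cannot increase.
        confidence=min(parent.confidence, confidence),
        transformations=parent.transformations + (transformation_version,),
    )


raw = Record(
    layer="raw_provider_data",
    value="AMZN Mktp US*2K4",
    source="bank-a",
    consent_id="c-001",
    purpose="budgeting",
    confidence=1.0,
)
normalized = derive(raw, "normalized_data", "Amazon Marketplace",
                    "merchant-map-v3", 0.8)
feature = derive(normalized, "derived_features", {"category": "shopping"},
                 "categorizer-v1", 0.9)
```

Because `derive` copies the parent and overrides only what changed, a derived feature cannot silently lose its `consent_id` or `purpose`.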

Guiding principle:

Never let enrichment erase consent, uncertainty or lineage.

10. Model Feature Boundaries

The most common risk when AI uses open banking data is turning transaction history into unbounded behavioral profiling. The feature boundary must be enforceable in the feature store, model serving, RAG retrieval and the policy engine.

| Feature class | Example | Allowed use | Prohibited or high-risk use |
| --- | --- | --- | --- |
| Customer-requested utility | recurring bill detection, cash-flow forecast, fee alert | budgeting, alerts, customer-selected recommendation | unrelated cross-sell without authorization |
| Risk / fraud | unusual payee, device + transaction anomaly, scam pattern | protect account, step-up, human review | opaque denial without evidence and review |
| Affordability / credit | income volatility, expense obligations, cash buffer | governed underwriting support where approved | proxy discrimination, unsupported adverse action |
| Vulnerability / hardship | missed payments, overdraft stress, income drop | supportive outreach and relief options | exploitative pricing or pressure sales |
| Marketing segmentation | spending category affinity, life event | only where separately permitted and governed | targeted ads or sale if outside purpose |
| Sensitive inference | health, religion, union, political, gambling, addiction, immigration | generally avoid, mask, or route to privacy review | model training or eligibility use without explicit policy |

Suggested feature record:

feature_name
source_data_categories
consent_purpose
customer_requested_service
allowed_products
allowed_decisions
prohibited_decisions
model_ids
human_review_required
fairness_review_required
retention
revocation_action
evidence_refs
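
The feature record becomes enforceable once model serving filters features through it. The sketch below uses a subset of the fields above; `FeaturePolicy`, `servable_features`, the purposes and the decision names are hypothetical, and the filter is default-deny: a feature is served only for a decision it explicitly allows.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class FeaturePolicy:
    feature_name: str
    consent_purpose: str
    allowed_decisions: FrozenSet[str]
    prohibited_decisions: FrozenSet[str]


def servable_features(policies: List[FeaturePolicy],
                      consent_purpose: str, decision: str) -> List[str]:
    """Return the features that may feed this decision under this purpose."""
    return [
        p.feature_name
        for p in policies
        if p.consent_purpose == consent_purpose
        and decision in p.allowed_decisions
        and decision not in p.prohibited_decisions
    ]


policies = [
    FeaturePolicy(
        feature_name="recurring_bill_detection",
        consent_purpose="budgeting",
        allowed_decisions=frozenset({"budget_alert"}),
        prohibited_decisions=frozenset({"cross_sell_offer", "credit_eligibility"}),
    ),
    FeaturePolicy(
        feature_name="spending_category_affinity",
        consent_purpose="marketing",  # only where separately permitted
        allowed_decisions=frozenset({"marketing_segment"}),
        prohibited_decisions=frozenset({"credit_eligibility"}),
    ),
]

print(servable_features(policies, "budgeting", "budget_alert"))
print(servable_features(policies, "budgeting", "cross_sell_offer"))
```

A budgeting consent therefore never exposes the marketing feature, even though both features may be computed from the same transactions.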

Do not only ask "can the model predict this?". Ask:

Should this feature exist under this customer authorization?
Can the customer understand the value?
Can the institution explain the use?
Can revocation unwind future use?
Can monitoring detect misuse?

11. RAG over Account Data

RAG over account data is a high-value scenario and also a high-risk one. Account data is not a general knowledge base; it is consent-bound, customer-specific, time-sensitive financial evidence.

11.1 Safe RAG Pattern

user question
  -> authenticate user/session
  -> check consent and purpose
  -> determine allowed accounts/data categories/time window
  -> retrieve source transactions/balances/terms
  -> apply data quality/freshness filters
  -> ground answer with citations and caveats
  -> block prohibited advice/actions
  -> capture AI run and sources
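
The retrieval half of this pattern can be sketched without any model call: check consent and session, restrict to the allowed categories and time window, and tag every retrieved row with an id that the answer can cite. `Row`, `Grant`, `retrieve` and `grounded_context` are hypothetical names, and the in-memory list stands in for a real per-customer index.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import FrozenSet, List


@dataclass(frozen=True)
class Row:
    row_id: str
    customer_ref: str
    category: str
    booked_on: date
    text: str


@dataclass(frozen=True)
class Grant:
    customer_ref: str
    active: bool
    data_categories: FrozenSet[str]
    window_start: date
    window_end: date


def retrieve(rows: List[Row], grant: Grant, session_customer: str) -> List[Row]:
    """Return only rows the authenticated customer has authorized."""
    if not grant.active or grant.customer_ref != session_customer:
        return []
    return [
        r for r in rows
        if r.customer_ref == grant.customer_ref
        and r.category in grant.data_categories
        and grant.window_start <= r.booked_on <= grant.window_end
    ]


def grounded_context(rows: List[Row]) -> str:
    """Format rows with citable ids for the prompt and the retrieval audit."""
    return "\n".join(
        f"[{r.row_id}] {r.booked_on.isoformat()} {r.text}" for r in rows
    )


rows = [
    Row("r1", "cust-1", "transactions", date(2025, 2, 3), "GROCER 42.10"),
    Row("r2", "cust-2", "transactions", date(2025, 2, 4), "FUEL 30.00"),
    Row("r3", "cust-1", "transactions", date(2024, 11, 1), "RENT 900.00"),
    Row("r4", "cust-1", "payroll", date(2025, 2, 10), "PAYROLL 2100.00"),
]
grant = Grant("cust-1", True, frozenset({"transactions"}),
              date(2025, 1, 1), date(2025, 3, 31))

hits = retrieve(rows, grant, session_customer="cust-1")
context = grounded_context(hits)
after_revocation = retrieve(rows, replace(grant, active=False), "cust-1")
```

Filtering before generation means revoked, out-of-window or other-customer rows never reach the prompt, so the output policy is not the only line of defense.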

11.2 RAG Controls

| Control | What it prevents |
| --- | --- |
| Per-customer index or strict row-level retrieval | cross-customer leakage |
| Consent-aware retriever | use after revocation or outside purpose |
| Time-window filters | overcollection and stale answers |
| Source-grounded response | hallucinated financial facts |
| Quality and freshness tags | false precision in cash-flow advice |
| Prompt-injection filtering | merchant memo or transaction text manipulating the AI |
| Output policy | prohibited legal, tax, investment or credit conclusions |
| Embedding lifecycle | revoked data remaining searchable |
| Retrieval audit | inability to prove which data informed answer |
| Human escalation | hardship, scam, complaint or high-impact decision |

Weak answer:

"You can afford this loan because your bank data looks healthy."

Strong answer:

"Based on transactions you authorized for this budgeting feature,
your average monthly inflow over the selected three-month period was X
and recurring obligations we detected were Y.
This is a budgeting estimate, not a credit approval or financial advice.
Some transactions may be pending or miscoded."

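RAG should not:

  • Keep answering from historical embeddings whose consent has been revoked.
  • Treat a transaction memo as absolute fact.
  • Infer sensitive attributes from transactions and use them for eligibility.
  • Let an agent initiate a payment or application automatically without action approval.
  • Present single-account data as the customer's complete financial picture.
  • Be unable, when a complaint arrives, to list the source rows that supported an answer.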

12. Agentic AI and Action Boundaries

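An AI agent may read, explain and recommend, but "acting" must be governed separately. Open banking data access does not equal payment authorization, product application authorization, account change authorization or advice acceptance.

| Agent capability | Example | Required control |
| --- | --- | --- |
| Read | Query balances, transactions, bills | consent + scope + customer/session auth |
| Reason | Summarize cash flow, find subscriptions, identify fees | grounded RAG + quality caveats + prohibited-use guardrail |
| Recommend | Suggest budget adjustments, bill reminders, contacting the bank | recommendation boundary + customer explanation |
| Prepare | Pre-fill applications, generate dispute / hardship package | source citation + customer review |
| Submit | Submit applications, send proof, open a case | action-bound approval + evidence |
| Initiate value movement | payment, transfer, withdrawal | separate payment authorization, step-up, scam controls |
| Change account | Update address, cancel subscription, close product | business authorization + customer confirmation |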

Agent authorization record:

agent_id
workflow_run_id
customer_ref
data_access_consent_id
allowed_tools
allowed_data_categories
purpose
time_bound
action_bound_approval_required
approval_id
revocation_reference
human_review_triggers
evidence_bundle_id

Key principle:

Data access consent lets the agent know.
Action authorization lets the agent do.
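
A sketch of how the know/do split might be enforced at the tool boundary: read tools need only to be on the agent's allowed list, while action tools also need an approval bound to the exact action through a hash of its parameters. The tool names, `action_hash`, `may_call` and the approvals dictionary are assumptions made for this example.

```python
import hashlib
import json
from typing import Dict, Optional, Set

READ_TOOLS: Set[str] = {"get_balance", "list_transactions"}
ACTION_TOOLS: Set[str] = {"initiate_payment", "submit_application"}


def action_hash(tool: str, params: dict) -> str:
    """Stable fingerprint of one specific action and its parameters."""
    payload = json.dumps({"tool": tool, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def may_call(tool: str, params: dict, allowed_tools: Set[str],
             approvals: Dict[str, str],
             approval_id: Optional[str] = None) -> bool:
    if tool not in allowed_tools:
        return False
    if tool in READ_TOOLS:
        return True  # data access consent is checked elsewhere
    if tool in ACTION_TOOLS:
        approved = approvals.get(approval_id) if approval_id else None
        # The approval must match this exact tool and parameter set.
        return approved is not None and approved == action_hash(tool, params)
    return False  # unknown tools are denied


allowed = {"get_balance", "initiate_payment"}
payment = {"payee": "utility-co", "amount_cents": 5000}
# Hypothetical approval captured when the customer confirmed this payment.
approvals = {"appr-1": action_hash("initiate_payment", payment)}

tampered = {"payee": "utility-co", "amount_cents": 50000}
```

Binding the approval to a hash means an agent cannot reuse a customer's confirmation for a different payee or amount.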

13. Fraud, Scam and Access Controls

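Open banking / open finance changes the fraud attack surface. Data sharing itself can become a fraud vector, especially when a malicious app obtains customer authorization, a scammer induces authorization, or account data is used for social engineering.

| Threat | Pattern | Control |
| --- | --- | --- |
| Malicious data recipient | App collects large volumes of data under the guise of a budgeting service, then leaks or misuses it | developer onboarding, data-use attestation, monitoring, audit rights, suspension |
| Consent phishing | Customer is steered into authorizing a spoofed app or scammer-controlled service | verified developer identity, customer warning, domain/app reputation |
| Account takeover | Attacker logs in to the customer's app and authorizes data sharing | MFA/step-up, device risk, session risk, revocation notification |
| Credential sharing fallback | Third party bypasses the API and asks the customer to hand over bank credentials | contractual prohibition, detection, education, ecosystem enforcement |
| API exfiltration | A valid token is abused in bulk | rate limit, anomaly detection, token binding, IP/device intelligence |
| Authorized push payment scam | AI / app recommends or prepares a payment to a scam payee | payee risk, cooling-off, step-up, scam intervention, human support |
| Data poisoning | Merchant text / memo injects a prompt or causes misclassification | prompt-injection defense, data sanitation, source weighting |
| Synthetic affordability | Fraudster manipulates account inflows and outflows to fabricate income | graph/velocity controls, payroll/source diversity, fraud review |
| Revocation bypass | Downstream party keeps using cached or derived data | revocation propagation, retention rules, control testing |
| Third-party outage | App cannot fetch data and the AI gives wrong advice | degrade mode, freshness warnings, retry and customer message |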

FFIEC-style design lessons:

  • Authentication risk assessment must include customers, employees, third parties, applications, service accounts and devices.
  • MFA or equivalent-strength controls should be risk-proportionate, especially for high-risk access and actions.
  • Layered controls matter because any single control can fail.
  • Customer-permissioned entity access needs specific risk assessment, monitoring, logging and reporting.

14. Governance Model

Combine the NIST Privacy Framework, NIST AI RMF and ISO/IEC 42001 into an operating model:

| Governance lens | Open banking question | AI control |
| --- | --- | --- |
| Privacy risk | Are processing purpose, data minimization, retention, customer rights and data sharing risk identified? | consent registry, purpose binding, privacy review, DPIA-like evidence |
| AI risk | Is the AI mapped to customer impact, data inputs, model behavior, evals, monitoring and incident response? | AI RMF-style govern/map/measure/manage workflow |
| Management system | Are there policy, roles, process, performance evaluation, internal audit and continual improvement? | ISO 42001-style AIMS operating rhythm |
| Security and access | Is risk-based authentication applied to customer / third party / app / API / service account? | FFIEC-style risk assessment and layered access controls |
| Regulatory horizon | Are rule status, guidance, litigation, standard-setting and state/international changes kept up to date? | regulatory change workflow and product impact assessment |

14.1 Operating Evidence

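| Evidence | Why it matters |
| --- | --- |
| Use case risk assessment | Shows why the data and the AI are needed |
| Consent / disclosure artifact | Shows the customer saw and authorized the scope |
| Data minimization matrix | Shows there was no over-collection |
| API contract review | Shows the technical boundary is enforceable |
| Third-party due diligence | Shows the recipient / aggregator is governed |
| Model/data policy | Shows AI use is bounded |
| Eval and red-team result | Shows the model does not overreach, hallucinate or leak |
| Fraud threat model | Shows abuse scenarios are covered |
| Revocation test | Shows revocation affects collection/use/cache/features/embeddings |
| Evidence replay test | Shows a complaint or audit can be replayed |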

15. Product / Architecture Decisions

| Decision | Weak answer | Strong architecture answer |
| --- | --- | --- |
| What data should we request? | "All transactions, just in case" | Request minimum categories, accounts and time window for the customer-requested service |
| How long should access last? | "Until user disconnects" | Purpose-specific duration, reauthorization, runtime consent check and expiration behavior |
| Can AI train on the data? | "It improves the product" | Separate explicit policy/consent, de-identification review, model-risk approval and opt-out handling |
| Can we use data for cross-sell? | "Customer connected bank data" | Only if separately permitted and consistent with purpose, policy and customer expectation |
| Direct API or aggregator? | "Fastest SDK" | Evaluate coverage, consent UX, contract, security, revocation propagation, data quality and exit |
| RAG architecture? | "Index all account data" | Customer/purpose-isolated retrieval, source grounding, freshness, revocation-aware embedding lifecycle |
| Fraud controls? | "Bank authenticated the customer" | Risk-based authentication, app reputation, access anomaly, scam intervention and human escalation |
| Revocation? | "Delete token" | Stop collection, notify ecosystem, enforce retention, disable derived features and purge/restrict embeddings |
| Advice boundary? | "AI gives financial advice" | Define information, education, recommendation, regulated advice and decision boundaries |
| Evidence? | "Logs exist" | Case-level bundle linking consent, API calls, data transforms, model run, decision and customer communication |

16. Control Matrix

| Control objective | Control activity | Evidence |
| --- | --- | --- |
| Establish lawful/policy basis through governance | Legal/Compliance/Privacy review of role, jurisdiction, data type and product scope | approval record, interpretation memo |
| Minimize data collection | Data category/time-window/account selection tied to customer-requested service | minimization matrix, API scopes |
| Preserve informed authorization | Clear disclosure of recipient, provider, service, categories, duration and revocation | disclosure version, signed consent receipt |
| Enforce consent at runtime | Entitlement check for API, feature store, RAG and agent tools | consent check log, denied access events |
| Support revocation | Easy revocation, ecosystem notification, collection stop, retention and derived-data handling | revocation receipt, propagation log |
| Govern API access | Client registration, authentication, token binding, scopes, rate limits, anomaly monitoring | developer record, gateway logs |
| Manage third-party risk | Due diligence, contractual restrictions, audit rights, incident obligations, exit plan | TPRM file, contract clauses |
| Assure data quality | Freshness, completeness, dedup, correction, source lineage and quality scores | quality dashboard, lineage records |
| Bound AI features | Allowed/prohibited uses for features, inferences, training and decisions | feature policy, model card, eval result |
| Secure RAG | Customer/purpose isolation, source citations, prompt injection controls, retrieval logs | RAG eval, source trace |
| Control agent actions | Separate action-bound approval for submission/payment/account changes | approval id, action hash |
| Detect fraud/scam | Monitor malicious recipient, ATO, API exfiltration, APP scam and synthetic behavior | fraud dashboard, case evidence |
| Preserve audit replay | Link consent, API, data, AI, human and final customer message | evidence bundle |
| Improve continuously | Incident, complaint, model drift and data-quality CAPA | RCA, CAPA owner, closure evidence |

17. Metrics and KRIs

| Metric family | Examples |
| --- | --- |
| Customer value | successful connections, time-to-value, budgeting accuracy, fee savings, subscription cancellation success, cash-flow alert usefulness |
| Consent quality | disclosure comprehension, consent drop-off, over-scope defects, reauthorization completion, revocation success |
| Data minimization | average data categories requested, account/time-window scope, full-history request rate, unused data ratio |
| API ecosystem | developer approval time, API uptime, response time, error rates, provider coverage, schema variance |
| Data quality | freshness, completeness, duplicate rate, categorization confidence, correction rate, provider issue backlog |
| AI/RAG safety | grounded-answer rate, unsupported conclusion rate, source citation completeness, prompt-injection block rate |
| Feature governance | features with consent lineage, prohibited-use violations, sensitive inference incidents, revocation feature disablement |
| Fraud/scam | malicious recipient attempts, abnormal API calls, ATO-linked authorizations, scam escalation, manual review outcomes |
| Third-party risk | overdue reviews, subprocessor changes, audit findings, incident notifications, exit-test completion |
| Customer trust | complaints about data sharing, unclear use, revocation failure, wrong AI advice, perceived surveillance |
| Evidence | replay completeness, missing consent id rate, missing AI trace rate, retention-rule defects |
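
The Evidence metrics above can be computed directly from case-level evidence records. The sketch below assumes a hypothetical `EvidenceRecord` shape and treats a case as replayable when it carries a consent id, an AI trace id and at least one API request id; that definition is an assumption for illustration, not a standard.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EvidenceRecord:
    # Hypothetical minimal shape of one case-level evidence record.
    case_id: str
    consent_id: Optional[str] = None
    ai_trace_id: Optional[str] = None
    api_request_ids: list[str] = field(default_factory=list)


def _rate(count: int, total: int) -> float:
    """Return count/total, or 0.0 when there are no records."""
    return count / total if total else 0.0


def evidence_kris(records: list[EvidenceRecord]) -> dict[str, float]:
    """Compute three evidence KRIs over a batch of records."""
    total = len(records)
    missing_consent = sum(1 for r in records if not r.consent_id)
    missing_trace = sum(1 for r in records if not r.ai_trace_id)
    replayable = sum(
        1 for r in records if r.consent_id and r.ai_trace_id and r.api_request_ids
    )
    return {
        "missing_consent_id_rate": _rate(missing_consent, total),
        "missing_ai_trace_rate": _rate(missing_trace, total),
        "replay_completeness": _rate(replayable, total),
    }


records = [
    EvidenceRecord("case-1", "con-1", "run-1", ["req-1"]),
    EvidenceRecord("case-2", None, "run-2", ["req-2"]),
    EvidenceRecord("case-3", "con-3", None, ["req-3"]),
    EvidenceRecord("case-4", "con-4", "run-4", ["req-4", "req-5"]),
]
print(evidence_kris(records))
```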

Balanced executive dashboard:

Value: customers get useful financial outcomes.
Control: every data use is purpose-bound and revocable.
Safety: AI does not exceed consent, evidence or advice boundaries.
Trust: customers can understand, revoke and contest.
Ecosystem: APIs and third parties perform reliably and securely.
Evidence: each answer, recommendation and action can be replayed.

18. Failure Modes

| Failure mode | Why dangerous | Better control |
| --- | --- | --- |
| Consent screen as blanket waiver | customer authorization is read as permission for unlimited use | purpose-bound consent object and runtime enforcement |
| Generic data lake ingestion | open banking data loses consent, purpose and revocation metadata | consent-bound data zone and lineage tags |
| RAG indexes all account data | cross-purpose and post-revocation leakage | isolated, entitlement-aware retrieval |
| AI infers sensitive traits from transactions | unfairness, privacy harm, unsupported decisions | sensitive inference policy and prohibited uses |
| Revocation only deletes API token | cache/features/embeddings continue to use data | revocation propagation and lifecycle testing |
| API error treated as financial fact | model says customer has no funds when API failed | error-aware prompts and freshness gates |
| Aggregator selected only for coverage | hidden security, quality, contract and exit risks | TPRM and architecture review |
| Customer credentials still collected | defeats safer open banking model | prohibit credential sharing and monitor fallback |
| Developer app over-requests data | privacy harm and customer distrust | scope review, sandbox certification, monitoring |
| Model training on consented data by default | violates purpose expectation and data-use boundary | explicit AI training policy and approval |
| Agent initiates action under data consent | unauthorized payment/application/account changes | separate action authorization |
| Complaint lacks evidence chain | cannot explain or remediate | consent-to-AI-to-decision evidence bundle |
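
The "API error treated as financial fact" row can be illustrated with a small freshness gate that runs before any balance reaches a prompt. The `BalanceFetch` type, the `balance_statement` function and the 24-hour threshold are assumptions made for this sketch, not values taken from any rule or provider contract.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class BalanceFetch:
    # Hypothetical result of one balance API call.
    ok: bool
    balance_cents: Optional[int]
    fetched_at: Optional[datetime]


def balance_statement(
    fetch: BalanceFetch,
    now: datetime,
    max_age: timedelta = timedelta(hours=24),  # assumed freshness requirement
) -> str:
    """Return text that is safe to place in a prompt: fact, stale, or unavailable."""
    if not fetch.ok or fetch.balance_cents is None or fetch.fetched_at is None:
        # A failed call is not evidence of a zero balance.
        return "UNAVAILABLE: balance could not be retrieved; do not state a balance."
    if now - fetch.fetched_at > max_age:
        return (
            f"STALE: last known balance_cents={fetch.balance_cents} "
            f"as of {fetch.fetched_at.isoformat()}; present it as possibly outdated."
        )
    return f"FACT: balance_cents={fetch.balance_cents} as of {fetch.fetched_at.isoformat()}"


now = datetime(2024, 6, 2, 12, 0, tzinfo=timezone.utc)
fresh = BalanceFetch(True, 0, now - timedelta(hours=1))
stale = BalanceFetch(True, 12500, now - timedelta(days=3))
failed = BalanceFetch(False, None, None)

for fetch in (fresh, stale, failed):
    print(balance_statement(fetch, now))
```

A genuine zero balance and a failed call produce different statements, which is the distinction the failure mode is about.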

19. Interview-Ready Takeaways

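Q1: What is the essential difference between consented data sharing and data harvesting?

Consented data sharing is purpose-bound, scoped, revocable and evidenced data access in support of a customer-requested service. Data harvesting uses broad consent to collect as much data as possible, then applies it to unrelated purposes, training, advertising, cross-selling or resale. Architecturally, the difference is whether consent reaches the API scope, data lineage, feature store, RAG retrieval, agent tools and revocation enforcement.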

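Q2: What is the most important control point in an AI open banking architecture?

Not any single API or model, but the consent registry acting as a runtime control plane. It must drive API scopes, data lake tags, feature policies, RAG retrieval, agent permissions, retention and revocation. Without this control plane, AI can easily turn authorized access into unbounded use.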
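
A minimal sketch of a consent registry used as a runtime check, assuming an in-memory store. `Consent`, `ConsentRegistry` and `authorize` are hypothetical names; a production registry would be a shared service with durable storage, versioned disclosures and audit logging.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Consent:
    consent_id: str
    customer_id: str
    purpose: str
    data_categories: frozenset[str]
    expires_at: datetime


class ConsentRegistry:
    """In-memory stand-in for a consent registry queried on every data use."""

    def __init__(self) -> None:
        self._consents: dict[str, Consent] = {}
        self._revoked: set[str] = set()

    def register(self, consent: Consent) -> None:
        self._consents[consent.consent_id] = consent

    def revoke(self, consent_id: str) -> None:
        self._revoked.add(consent_id)

    def authorize(
        self, customer_id: str, purpose: str, category: str, now: datetime
    ) -> Optional[str]:
        """Return the id of a consent that covers this use, or None to deny."""
        for consent in self._consents.values():
            if (
                consent.consent_id not in self._revoked
                and consent.customer_id == customer_id
                and consent.purpose == purpose
                and category in consent.data_categories
                and now < consent.expires_at
            ):
                return consent.consent_id
        return None


registry = ConsentRegistry()
registry.register(
    Consent(
        "con-1",
        "cust-1",
        "budgeting",
        frozenset({"transactions", "balances"}),
        datetime(2025, 1, 1, tzinfo=timezone.utc),
    )
)
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(registry.authorize("cust-1", "budgeting", "transactions", now))  # con-1
print(registry.authorize("cust-1", "marketing", "transactions", now))  # None
registry.revoke("con-1")
print(registry.authorize("cust-1", "budgeting", "transactions", now))  # None
```

Returning the consent id rather than a boolean lets each caller (API gateway, feature store, retriever, agent tool) log which consent authorized the access.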

Q3: How should RAG over account data be designed?

Isolate retrieval by customer, purpose, consent and time window; check entitlement on every retrieval; require outputs to cite source rows and freshness; place prompt injection, sensitive inference, prohibited advice and agent actions under policy engine control; and, after revocation, stop future retrieval and handle the embeddings lifecycle.
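
The isolation described above can be sketched as a filter that runs before any relevance ranking. The example uses a toy keyword-overlap score in place of a real vector search, and the `Chunk` fields and `retrieve` signature are hypothetical. What it illustrates is the ordering: entitlement filtering first, ranking second.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Chunk:
    # Hypothetical indexed unit carrying its own consent lineage.
    row_id: str
    customer_id: str
    consent_id: str
    purpose: str
    posted_on: date
    text: str


def retrieve(
    chunks: list[Chunk],
    *,
    customer_id: str,
    purpose: str,
    active_consent_ids: set[str],
    window_start: date,
    query: str,
    k: int = 3,
) -> list[Chunk]:
    """Filter by entitlement first, then rank the survivors by keyword overlap."""
    allowed = [
        c
        for c in chunks
        if c.customer_id == customer_id
        and c.purpose == purpose
        and c.consent_id in active_consent_ids
        and c.posted_on >= window_start
    ]
    terms = set(query.lower().split())
    scored = [(len(terms & set(c.text.lower().split())), c) for c in allowed]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [c for _, c in ranked[:k]]


index = [
    Chunk("r1", "cust-1", "con-1", "budgeting", date(2024, 5, 1), "electric utility bill payment"),
    Chunk("r2", "cust-2", "con-9", "budgeting", date(2024, 5, 2), "electric utility bill payment"),
    Chunk("r3", "cust-1", "con-2", "budgeting", date(2024, 5, 3), "utility bill refund"),
    Chunk("r4", "cust-1", "con-1", "budgeting", date(2023, 1, 1), "utility bill"),
]
hits = retrieve(
    index,
    customer_id="cust-1",
    purpose="budgeting",
    active_consent_ids={"con-1"},  # con-2 is treated as revoked in this example
    window_start=date(2024, 1, 1),
    query="utility bill",
)
print([c.row_id for c in hits])  # ['r1']
```

The returned chunks keep their `row_id` and `posted_on` values, so an answer built from them can cite source rows and freshness.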

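Q4: Can open banking data be used for model training?

Not by default. Training use must be evaluated together with the customer-requested service, the authorization, the privacy policy, contracts, applicable law and regulatory interpretation, de-identification, retention, model-risk approval and customer expectation. In particular, data authorized for budgeting or account verification must not be automatically reused for unrelated model training.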

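Q5: How should a Senior PM evaluate an open banking AI feature?

Look at value, consent, minimization, API reliability, data quality, third-party risk, model boundary, fraud/scam controls, revocation, evidence and customer trust together. What qualifies for launch is not “the model is smart” but “customer authorization is clear, system boundaries are enforceable, revocation works, results are explainable and risks can be monitored”.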


20. Practical Templates

20.1 Use Case Data Sharing Card

| Field | Example |
| --- | --- |
| use_case | AI cash-flow assistant |
| customer_requested_service | monthly cash-flow insight and bill reminder |
| data_provider | customer-selected bank or wallet provider |
| data_recipient | financial planning app |
| aggregator | approved aggregator if used |
| data_categories | transaction history, balance, upcoming bills |
| time_window | last 6 months plus current balance |
| purpose | budgeting and bill reminder |
| excluded_uses | targeted ads, unrelated cross-sell, sale of covered data, default model training |
| AI allowed uses | summarize, forecast, explain, alert |
| AI prohibited uses | credit eligibility, identity proofing, regulated advice without approval |
| revocation_action | stop collection, disable refresh, restrict features, purge/restrict embeddings per policy |
| evidence | consent receipt, API request ids, source rows, AI run id, final answer |

20.2 API Scope Contract

Scope name:
Data category:
Fields:
Purpose:
Required consent version:
Maximum lookback:
Refresh frequency:
Freshness requirement:
Error semantics:
Revocation behavior:
Downstream sharing:
AI allowed uses:
AI prohibited uses:
Retention:
Evidence fields:
Owner:
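
A few fields of the scope contract template can be made machine-checkable. The sketch below is one possible encoding under stated assumptions: `ScopeContract`, `validate_request`, the scope name `transactions.read.budgeting` and the field names are all hypothetical, and only three of the template's fields are modeled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ScopeContract:
    # Hypothetical subset of the template: Fields, Maximum lookback, AI prohibited uses.
    scope_name: str
    fields: frozenset[str]
    max_lookback_days: int
    ai_prohibited_uses: frozenset[str]


def validate_request(
    contract: ScopeContract,
    requested_fields: set[str],
    lookback_days: int,
    ai_use: str,
) -> list[str]:
    """Return a list of violations; an empty list means the request fits the contract."""
    violations: list[str] = []
    extra = sorted(requested_fields - contract.fields)
    if extra:
        violations.append(f"fields outside scope: {', '.join(extra)}")
    if lookback_days > contract.max_lookback_days:
        violations.append(
            f"lookback {lookback_days}d exceeds maximum {contract.max_lookback_days}d"
        )
    if ai_use in contract.ai_prohibited_uses:
        violations.append(f"AI use prohibited: {ai_use}")
    return violations


contract = ScopeContract(
    scope_name="transactions.read.budgeting",
    fields=frozenset({"amount", "posted_date", "merchant_name", "category"}),
    max_lookback_days=180,
    ai_prohibited_uses=frozenset({"credit_eligibility", "identity_proofing"}),
)

print(validate_request(contract, {"amount", "category"}, 90, "summarize"))
print(validate_request(contract, {"amount", "account_holder_ssn"}, 400, "credit_eligibility"))
```

Returning every violation at once, rather than failing on the first, gives scope review and sandbox certification a complete list to act on.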

20.3 AI Feature Boundary Record

Feature:
Source data categories:
Customer-requested service:
Consent ids:
Transformation lineage:
Allowed model ids:
Allowed decisions:
Prohibited decisions:
Sensitive inference risk:
Fairness review:
Human review triggers:
Revocation action:
Monitoring metric:
Evidence reference:

20.4 RAG Answer Evidence Record

Question:
Customer/session:
Consent check:
Purpose:
Retriever policy:
Source records:
Data freshness:
Quality caveats:
Prompt version:
Model version:
Output policy checks:
Final answer:
Customer message id:
Retention rule:

20.5 Revocation Runbook

Revocation event:
Customer:
Data recipient:
Data provider:
Aggregator:
Consent ids:
API token action:
Data collection stop time:
Cache action:
Feature action:
Embedding action:
Agent/session action:
Downstream notification:
Customer receipt:
Exceptions:
Evidence bundle:
Owner:
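
The runbook fields above imply that revocation is a sequence of steps whose outcomes are each recorded. A minimal sketch, assuming in-memory stores and hypothetical step functions (`revoke_token`, `purge_cache`, `purge_embeddings`), shows one way to run every step, capture failures instead of stopping, and emit a receipt.

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import Callable

# A step takes a consent id and returns a short outcome description.
Step = Callable[[str], str]


def run_revocation(consent_id: str, steps: dict[str, Step], now: datetime) -> dict:
    """Run every step, record failures rather than aborting, and return a receipt."""
    results: dict[str, dict[str, str]] = {}
    for name, step in steps.items():
        try:
            results[name] = {"status": "done", "detail": step(consent_id)}
        except Exception as exc:  # a failed step must stay visible in the evidence
            results[name] = {"status": "failed", "detail": str(exc)}
    return {
        "consent_id": consent_id,
        "started_at": now.isoformat(),
        "steps": results,
        "complete": all(r["status"] == "done" for r in results.values()),
    }


# Hypothetical in-memory stores standing in for the token service, cache and vector index.
tokens = {"con-1": "tok-abc"}
cache = {"con-1": ["row-1", "row-2"]}
embeddings = {"con-1": ["vec-1"]}


def revoke_token(cid: str) -> str:
    tokens.pop(cid, None)
    return "token revoked"


def purge_cache(cid: str) -> str:
    return f"{len(cache.pop(cid, []))} cached rows purged"


def purge_embeddings(cid: str) -> str:
    return f"{len(embeddings.pop(cid, []))} embeddings purged"


receipt = run_revocation(
    "con-1",
    {"api_token": revoke_token, "cache": purge_cache, "embeddings": purge_embeddings},
    datetime(2024, 6, 1, tzinfo=timezone.utc),
)
print(receipt["complete"], receipt["steps"]["cache"]["detail"])
```

Whether embeddings and derived features are purged or only restricted is a policy decision; the sketch purges them only to keep the example short.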

21. Final Operating Principle

A mature AI open banking / open finance architecture can be tested with one question:

Can the institution prove that every AI answer, feature, recommendation and action
used only the financial data the customer authorized for that purpose,
through governed APIs and third parties,
with quality and freshness understood,
with revocation propagated,
with fraud and scam controls active,
with inferences separated from facts,
and with evidence sufficient for audit, complaint and customer trust?

If the answer is unclear, the enterprise is not missing an open banking connector. It is missing a consent control plane, API ecosystem governance, AI feature boundaries, RAG evidence, third-party risk controls, fraud/scam monitoring, a revocation lifecycle and a customer trust operating model.