
AI Capability-Based Planning / Business Architecture Playbook

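These sources serve as learning anchors and do not constitute legal, compliance, regulatory, procurement, or certification advice. Any formal project must be reviewed by legal, compliance, risk, security, privacy, the architecture board, the data owner, and the business owner.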


AI Capability-Based Planning / Business Architecture / Capability Map Playbook

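Positioning: an advanced business architecture and capability planning handbook for AI BAs, AI PMs, Enterprise Architects, and AI Transformation Leads.
Goal: upgrade enterprise AI transformation from a use case list to a capability portfolio, value streams, business architecture, an architecture roadmap, and funding gates.
Core thesis: enterprise AI maturity depends not on the number of pilots but on which reusable business capabilities are built, governed, measured, and continuously evolved.
Scope: AI transformation in financial retail enterprises, including AML, customer service, credit, wealth/branch, the enterprise AI platform, and the AI operating model.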

Source Anchors

Anchor	Official / Primary Source	Usage
The Open Group TOGAF	https://www.opengroup.org/togaf	Use Enterprise Architecture language to connect strategy, business, applications, data, technology, and governance
TOGAF Standard, 10th Edition	https://www.opengroup.org/togaf-standard-10th-edition-downloads	Use the ADM, architecture governance, roadmap, and implementation and migration thinking to organize the transformation
TOGAF Capability-Based Planning	https://pubs.opengroup.org/architecture/togaf9-doc/arch/chap28.html	Use capability-based planning to connect business outcomes, capability increments, resources, and the roadmap
TOGAF Business Architecture Foundation	https://help.opengroup.org/hc/en-us/articles/32127305940882-TOGAF-Business-Architecture-Foundation-Certification-Overview	Use business capabilities, value streams, and business modeling to support organizational change
NIST AI RMF	https://www.nist.gov/itl/ai-risk-management-framework	Use Govern, Map, Measure, Manage to organize AI risk, governance, measurement, and ongoing management
NIST AI RMF Playbook	https://airc.nist.gov/AI_RMF_Knowledge_Base/Playbook	Turn AI risk management into actionable controls, evidence, and responsibilities
ISO/IEC/IEEE 42010	https://www.iso-architecture.org/ieee-1471/	Use architecture description, stakeholder, concern, viewpoint, and view to standardize how architecture is expressed
ISO/IEC 42001	https://www.iso.org/standard/81230.html	Use AI management system thinking to connect accountability, lifecycle, policy, and continual improvement
BIAN	https://bian.org/	Use as a reference language for banking business capabilities, service domains, and API design

1. One-Sentence Positioning

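AI capability-based planning upgrades AI investment from "delivering a batch of use cases" to "building a set of reusable, governable, measurable, and fundable enterprise business capabilities," and manages the full path from strategy to delivery through value streams, the capability map, the maturity model, the architecture runway, and funding gates.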

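2. Why Capability Planning Is the Next Step After CBAP

You are already a CBAP. The next stage is not about proving again that you can write requirements, draw processes, or do stakeholder analysis; it is about raising BA skills to the level of enterprise AI transformation design.

AI projects rarely fail because someone "cannot write a user story." The more common causes are:

Business units treat AI as a use case collection, with no enterprise-level capability thesis.
Different teams build knowledge bases, RAG, prompts, model integration, eval, and audit logging in duplicate.
The PoC demos well but cannot enter a regulated production process.
Data, process, risk control, architecture, operations, and funding cadence are never put on the same map.
AI budget is fragmented by project, so nobody invests in reusable capabilities, while platform capabilities get built as a "big platform" that never lands.
AI is evaluated by single-point ROI, ignoring capability reuse, risk reduction, process redesign, and workforce adoption.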

Upgrade directions after CBAP:

CBAP baseline skill	AI enterprise architecture upgrade
Requirements analysis	capability outcome, maturity gap, investment increment
Process modeling	value stream to capability mapping
Stakeholder engagement	capability ownership, funding sponsor, risk accountability
Solution evaluation	portfolio value, architecture fit, capability reuse, eval evidence
Business case	capability funding gate, option value, reuse economics
Change strategy	operating model, adoption telemetry, control effectiveness
Requirements traceability	strategy -> value stream -> capability -> architecture decision -> eval gate -> KPI

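Key shifts:

From "Can this use case use AI?" to "Should this enterprise capability be augmented by AI?"
From "Which department raised the request?" to "Which capability owner is accountable for outcomes, risk, and budget?"
From "launching an assistant" to "building up reusable knowledge, model, eval, workflow, control, and adoption capabilities."
From "one-off project acceptance" to "quarterly governance of capability maturity, risk evidence, business outcomes, and architecture evolution."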

3. 从 Use Case List 到 Capability Portfolio

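3.1 Typical Failure Path

Business units propose 30 AI use cases
  -> PoCs are picked by hype
  -> Multiple tools are procured
  -> Each team builds its own knowledge base and prompts
  -> No unified eval, audit, access, cost, ownership
  -> Many PoCs, little production value

3.2 Recommended Path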

Enterprise strategy
  -> AI transformation thesis
  -> Priority value streams
  -> Capability map and heatmap
  -> Capability maturity gaps
  -> Capability portfolio
  -> Architecture runway
  -> Funding gates
  -> Pilot, production, scale
  -> Quarterly capability review

3.3 Five Core Questions

Question	Good evidence
Which value streams carry the most important strategic outcomes?	Revenue, cost-to-serve, risk exposure, cycle time, customer trust, regulatory pressure
Which capabilities constrain value stream performance?	Capability heatmap, maturity gap, incident trend, manual effort, control failure
Which AI capabilities can be reused across multiple value streams?	Shared knowledge, model gateway, eval, workflow automation, decision intelligence
Which architecture runway items must be built first?	Entitlement-aware retrieval, audit logging, model gateway, data contracts, eval pipeline
Which funding gates can stop ineffective expansion?	Data readiness, risk tier, eval pass rate, adoption threshold, cost per case, owner sign-off

3.4 Minimum Structure of a Capability Portfolio

Portfolio layer	Content	Examples
Strategic capabilities	Capabilities tied directly to enterprise strategy	AI-enabled financial crime operations, AI-assisted credit lifecycle, omnichannel service intelligence
Domain capabilities	Capabilities belonging to a specific business domain	AML case intelligence, lending policy reasoning, branch advisor copilot
Shared AI capabilities	AI capabilities reused across business lines	model gateway, RAG, eval, prompt registry, AI observability, tool permission gateway
Control capabilities	Capabilities that make AI controllable, auditable, and open to regulatory supervision	human review, evidence lineage, policy versioning, incident response, risk monitoring
Adoption capabilities	Capabilities that make the organization actually change how it works	frontline enablement, workflow redesign, champion network, quality calibration

4. AI Capability Taxonomy

An AI capability taxonomy should be organized by enterprise capability, business outcome, and reuse boundary, not by model vendor or algorithm name.

4.1 L0 Capability Domains

L0 Domain	Definition	Typical Owners
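AI Strategy and Portfolio	Defines the AI investment portfolio, strategic thesis, priorities, and funding gates	CIO, COO, CDAO, Enterprise Architect, AI PM Lead
Business Architecture	Manages transformation through value streams, the capability map, and the operating model	Enterprise Architect, Business Architect, AI BA
Decision Intelligence	Supports business decisions with prediction, recommendation, scoring, and optimization	Risk, Credit, Fraud, Operations, Data Science
Generative Experience	Uses GenAI to improve knowledge, content, conversation, and employee experience	Product, Service, Sales, Operations
Agentic Workflow	Executes controlled actions through tool invocation, task orchestration, and approval chains	Product, Operations, Engineering, Risk
Data and Knowledge Foundation	Manages source data, knowledge, metadata, entitlements, retrieval, and knowledge freshness	Data Owner, Knowledge Owner, Security
EvalOps and Quality	Manages AI quality with golden sets, scenario sets, rubrics, regression, and monitoring	EvalOps, QA, Risk, Product
AI Platform and Integration	Provides model gateway, RAG, tool gateway, observability, deployment, and integration capabilities	Platform, Architecture, Engineering
Risk, Security and Compliance	Manages privacy, security, model risk, compliance, audit, and incidents	Risk, Compliance, Security, Privacy
Operating Model and Adoption	Manages RACI, process change, training, incentives, feedback, and operating cadence	Operations, HR, Product Ops, Frontline Leaders
AI Economics and FinOps	Manages TCO, unit economics, budget, chargeback, and vendor economics	Finance, Procurement, Platform Owner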

4.2 L1 / L2 Capability Map

L0	L1 Capability	L2 Capabilities
AI Strategy and Portfolio	AI transformation thesis	strategic themes, outcome tree, risk appetite alignment, no-AI option
AI Strategy and Portfolio	Portfolio governance	intake, scoring, funding gate, quarterly review, scale/stop decision
Business Architecture	Value stream architecture	value stream map, pain metrics, control points, customer/employee outcome
Business Architecture	Capability management	capability inventory, heatmap, maturity model, owner registry, roadmap
Decision Intelligence	Predictive decision support	risk scoring, next-best-action, anomaly detection, forecast, propensity
Decision Intelligence	Human decision augmentation	rationale, evidence pack, challenger signals, decision record, override capture
Generative Experience	Knowledge assistance	policy Q&A, cited answers, case summarization, product guidance
Generative Experience	Content operations	customer response drafting, advisor notes, regulatory narrative support
Agentic Workflow	Tool and action orchestration	tool registry, action policy, approval workflow, idempotency, rollback
Agentic Workflow	Case workflow automation	task routing, evidence collection, checklist completion, exception escalation
Data and Knowledge Foundation	Data readiness	source inventory, data contract, lineage, quality score, retention
Data and Knowledge Foundation	Knowledge readiness	source of truth, versioning, effective date, jurisdiction, entitlement metadata
EvalOps and Quality	Offline eval	golden set, edge cases, rubric, regression, release threshold
EvalOps and Quality	Production quality monitoring	sampling, feedback loop, drift signal, incident taxonomy, quality dashboard
AI Platform and Integration	Model gateway	provider routing, model versioning, policy enforcement, telemetry, fallback
AI Platform and Integration	Retrieval and context platform	hybrid search, vector index, reranking, citation, context composer
Risk, Security and Compliance	AI risk controls	risk classification, control pack, human oversight, audit evidence
Risk, Security and Compliance	AI security	prompt injection defense, data exfiltration prevention, secrets handling, access control
Operating Model and Adoption	AI operating ownership	RACI, release calendar, incident runbook, change approval, owner cadence
Operating Model and Adoption	Workforce adoption	role redesign, training, champion network, trust metrics, productivity measurement
AI Economics and FinOps	Cost governance	cost per case, token budget, platform chargeback, vendor usage controls
AI Economics and FinOps	Benefit realization	baseline, benefit tracking, reuse credit, risk-adjusted ROI, value leakage review

4.3 Capability Map Design Rules

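Express a capability as a stable noun phrase, not as a project, product, or vendor name.
A capability describes what the organization can do, not how a particular system implements it.
An AI capability must be bound to a business outcome, risk concern, owner, metric, and architecture dependency.
Distinguish business capabilities from enabling platform capabilities, but do not separate the two.
Every capability must at least answer: who the owner is, what the current maturity level is, what the target maturity level is, where the funding comes from, and which value streams reuse it.
Do not treat "Chatbot," "RAG," or "Agent" as top-level capabilities; they are usually implementation patterns or platform capabilities.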
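
The design rules can be applied as a simple lint over capability records. The following is a minimal Python sketch, not a prescribed tool: the field names (`owner`, `funding_source`, and so on), the `PATTERN_LABELS` set, and the `capability_issues` helper are all illustrative assumptions.

```python
# Hypothetical lint for capability map entries; field names are illustrative.
PATTERN_LABELS = {"chatbot", "rag", "agent"}  # implementation patterns, not capabilities
REQUIRED_FIELDS = ("owner", "current_level", "target_level", "funding_source", "value_streams")


def capability_issues(cap: dict) -> list[str]:
    """Return the design-rule violations found in one capability record."""
    issues = []
    name = str(cap.get("name", "")).strip()
    if not name:
        issues.append("missing name")
    elif name.lower() in PATTERN_LABELS:
        issues.append("name is an implementation pattern, not a capability")
    for field in REQUIRED_FIELDS:
        # Level 0 is a valid maturity level, so only None or empty values count as missing.
        if cap.get(field) in (None, "", []):
            issues.append(f"missing {field}")
    return issues


example = {
    "name": "RAG",
    "owner": "Knowledge Platform Owner",
    "current_level": 0,
    "target_level": 3,
    "funding_source": "",
    "value_streams": ["Customer service", "AML"],
}
print(capability_issues(example))
```

A check like this only catches missing fields and obvious naming slips; whether a name really describes "what the organization can do" still needs human review.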

5. Value Stream to Capability Mapping

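A value stream describes how value is delivered; a capability map describes which capabilities the organization needs to deliver that value reliably. AI transformation has to connect the two, otherwise the result is "many process pain points, many platform capabilities, and no way to rank investments."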

5.1 Mapping Method

Select a strategic value stream, for example Resolve AML alert, Originate personal loan, or Serve retail banking customer.
Mark the value stream stages, including customer, employee, risk, compliance, and system interactions.
Record the pain metric, control point, and decision point for each stage.
Map the required business capabilities, shared AI capabilities, and control capabilities.
Mark the capability maturity gap, architecture dependency, and funding gate.
Consolidate use cases into capability increments to form the roadmap.
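
The final step, consolidating use cases into capability increments, is essentially a grouping operation. A minimal Python sketch follows; the use case names, the capability labels in `rows`, and the `consolidate` helper are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def consolidate(use_cases: list[tuple[str, str, str]]) -> dict[str, dict]:
    """Group (use_case, value_stream, capability) rows by capability."""
    grouped = defaultdict(lambda: {"use_cases": [], "value_streams": set()})
    for use_case, value_stream, capability in use_cases:
        grouped[capability]["use_cases"].append(use_case)
        grouped[capability]["value_streams"].add(value_stream)
    return dict(grouped)


# Illustrative rows only; real input would come from the value stream mapping.
rows = [
    ("AML alert summary", "Resolve AML alert", "Evidence-grounded knowledge retrieval"),
    ("Policy Q&A for agents", "Serve retail banking customer", "Evidence-grounded knowledge retrieval"),
    ("Loan document checklist", "Originate personal loan", "Document intelligence"),
]
increments = consolidate(rows)
for capability, info in increments.items():
    # A capability that serves several value streams is a reuse candidate.
    print(capability, len(info["use_cases"]), sorted(info["value_streams"]))
```

Capabilities that appear under more than one value stream are natural candidates for shared AI capabilities and runway investment.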

5.2 Generic Matrix

Value Stream Stage	Business Outcome	Required Business Capabilities	Required AI Capabilities	Controls	Metrics
Sense / Trigger	Detect opportunities or risks in time	event detection, customer/entity understanding	anomaly detection, intent classification, signal enrichment	threshold governance, bias checks, data lineage	detection rate, false positive rate, latency
Understand	Establish facts and context	case evidence management, product/policy interpretation	summarization, retrieval, entity graph, explanation	citation, entitlement, evidence freshness	time to understand, missing evidence rate
Decide	Make explainable business judgments	decision policy, risk assessment, approval authority	recommendation, decision support, challenger model	human oversight, override reason, model risk review	decision cycle time, override rate, error rate
Act	Execute controlled actions	workflow execution, customer communication, system update	agentic workflow, tool invocation, draft generation	action approval, idempotency, audit log	straight-through rate, rework rate, incident rate
Learn	Feed improvements back into the capability	QA, training, performance management	eval loop, production sampling, feedback mining	quality review, release gate, incident taxonomy	eval pass rate, adoption, benefit realization

5.3 AML Value Stream Example

AML Stage	Capability Gap	AI Capability Increment	Architecture Dependency	Gate Evidence
Alert intake	Alert context is scattered; analysts manually query multiple systems	Alert enrichment and entity context assembly	customer 360, transaction graph, case API, entitlement-aware retrieval	baseline time, source inventory, access approval
Investigation	Narrative writing is slow; evidence citation is inconsistent	Evidence-grounded investigation copilot	citation store, policy-aware summarizer, audit log	golden set, hallucination eval, reviewer calibration
Disposition	Decision rationale is inconsistent; overrides cannot be analyzed	Decision support and rationale capture	decision record schema, policy version registry	false positive trend, QA pass rate, override taxonomy
SAR support	Assembling report drafts and evidence packs is time-consuming	Controlled narrative drafting	approved templates, prohibited-decision guardrail, human approval	compliance sign-off, full audit reconstruction
QA / feedback	Issues found by QA do not flow back into the system	EvalOps and learning loop	eval dataset, defect taxonomy, prompt/index versioning	regression gate, production sampling dashboard

6. Capability Maturity Model

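Capability maturity is not as simple as "a stronger AI model." Financial retail scenarios require looking at business outcomes, data, knowledge, architecture, risk, operations, and economics together.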

6.1 Six-Level Model

Level	Name	Signal
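0	Fragmented	Individuals or teams try AI ad hoc; no owner, no path to production
1	Experimenting	A PoC exists, but data, entitlement, eval, risk, architecture, and adoption evidence is incomplete
2	Controlled Pilot	Has a defined business process, sample set, risk tiering, human review, and pilot metrics
3	Production Capability	Runs in a controlled production process with owner, runbook, monitoring, audit, and release gate
4	Reusable Enterprise Capability	Multiple value streams reuse the same capability, with platform interfaces, a cost model, and quarterly governance
5	Adaptive Capability System	Evolves continuously with feedback, risk, business change, and model change, and influences strategy and organization design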

6.2 Maturity Assessment Dimensions

Dimension	Level 1 Evidence	Level 3 Evidence	Level 5 Evidence
Business ownership	Sponsor supports the PoC	Capability owner is accountable for KPI, risk, budget	Capability owner participates in quarterly portfolio rebalancing
Value stream fit	Use case comes from a pain-point list	Mapped to a value stream stage and baseline	Value stream is redesigned; roles and control points change in step
Data readiness	Sample data exists	Has source of truth, lineage, quality and retention	Data contracts, quality monitoring, and knowledge freshness trigger automatically
Knowledge readiness	Documents can be uploaded	Has owner, version, jurisdiction, effective date	Knowledge is productized; policy changes flow automatically into eval and release
Model and eval	Demo quality	Golden set, edge cases, release threshold	Continuous eval, drift signal, failure mining, challenger strategy
Architecture	Point integration	Standard model gateway, RAG, logging, entitlement, rollback	Swappable provider, multi-business reuse, architecture decision traceability
Risk and compliance	Verbal risk assessment	AI RMF mapped controls, human oversight, audit reconstruction	Control effectiveness trend, incident learning, regulator-ready evidence
Adoption	User trial feedback	Workflow training, champions, trust metric	Workforce redesign, incentive alignment, capability coaching
Economics	Rough ROI	Cost per case, benefit baseline, budget cap	Reuse economics, chargeback, scale/stop rules

6.3 Heatmap Convention

Color	Meaning	Decision
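Red	Current maturity is low and constrains value stream outcomes	Prioritize discovery or stop; do not enter production
Amber	Valuable but missing key dependencies	Enter targeted runway or controlled pilot
Green	Production-ready	Extend reuse or optimize economics
Blue	Differentiating capability	Protect the investment, codify the method, and turn it into work-portfolio evidence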
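
The heatmap convention does not fix numeric thresholds, so any automated coloring has to choose some. The Python sketch below is one possible reading, under stated assumptions: Level 3 and above counts as production-ready, Levels 0-1 count as "low", and everything else falls to Amber. The `Maturity` enum mirrors the six-level model, and `heatmap_color` is a hypothetical helper.

```python
from enum import IntEnum


class Maturity(IntEnum):
    """The six maturity levels, 0 through 5."""
    FRAGMENTED = 0
    EXPERIMENTING = 1
    CONTROLLED_PILOT = 2
    PRODUCTION = 3
    REUSABLE_ENTERPRISE = 4
    ADAPTIVE = 5


def heatmap_color(level: Maturity, constrains_value_stream: bool,
                  differentiating: bool = False) -> str:
    """Assign a heatmap color; the level thresholds are assumptions, not part of the convention."""
    if level >= Maturity.PRODUCTION:
        # Production-ready capabilities are Green, or Blue when they differentiate.
        return "Blue" if differentiating else "Green"
    if level <= Maturity.EXPERIMENTING and constrains_value_stream:
        return "Red"
    return "Amber"


print(heatmap_color(Maturity.EXPERIMENTING, constrains_value_stream=True))
print(heatmap_color(Maturity.CONTROLLED_PILOT, constrains_value_stream=True))
print(heatmap_color(Maturity.REUSABLE_ENTERPRISE, constrains_value_stream=False, differentiating=True))
```

In practice the color would be agreed in a review with the capability owner; the function only makes the default rule explicit and repeatable.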

7. Portfolio Prioritization

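AI portfolio prioritization must avoid two extremes:

Looking only at business hype, so high-risk, low-readiness use cases go first.
Looking only at technical feasibility, so tools get built that change nobody's way of working.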

7.1 Prioritization Scorecard

Dimension	Weight	1 Point	3 Points	5 Points
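Strategic alignment	12	Local efficiency	Supports a department goal	Supports an enterprise strategic theme
Value stream pain	12	Minor pain point	Clear bottleneck	Core revenue, risk, or experience bottleneck
Capability reuse	12	Single-point use	Reuse within one domain	Reuse across business domains
Baseline and measurable outcome	10	No baseline	Partial baseline	End-to-end value stream baseline
Data and knowledge readiness	10	Sources unclear	Sources usable but need governance	Source of truth, owner, quality, entitlement are clear
Risk acceptability	10	Risk uncontrollable	Controllable with human review and restrictions	Controls mature and risk accountability clear
Architecture fit	10	Bespoke integration	Fits some standards	Fits the enterprise AI runway
Adoption readiness	8	Weak user involvement	Has a champion	Process owner commits to changing how work is done
Economic leverage	8	Cost unclear	Preliminary TCO	Cost per case and reuse benefit are clear
Time-to-learning	8	Long learning cycle	Verifiable in 1-2 quarters	High-quality evidence in 30-60 days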

Interpretation:

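80-100: Candidate for portfolio priority; proceeds to architecture and funding gate.
60-79: Candidate for controlled pilot; red/amber dependencies must be closed first.
40-59: Suited to discovery or sandbox learning; no production commitment.
0-39: Deferred, unless regulatory, incident, or strategic pressure changes the priority.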
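
The scorecard gives weights that sum to 100 and ratings from 1 to 5, but it does not spell out how the two combine into a 0-100 total. One plausible reading, assumed here, is total = sum(weight x points) / 5, so that a candidate rated 5 everywhere scores 100. The Python sketch below implements that assumption; the dictionary keys, `weighted_score`, and `band` are illustrative names.

```python
# Weights copied from the scorecard table; they sum to 100.
WEIGHTS = {
    "strategic_alignment": 12,
    "value_stream_pain": 12,
    "capability_reuse": 12,
    "baseline": 10,
    "readiness": 10,
    "risk_acceptability": 10,
    "architecture_fit": 10,
    "adoption_readiness": 8,
    "economic_leverage": 8,
    "time_to_learning": 8,
}
MAX_POINTS = 5  # each dimension is rated on a 1-5 scale


def weighted_score(points: dict[str, int]) -> float:
    """Return sum(weight * points) / MAX_POINTS (assumed normalization to 0-100)."""
    missing = WEIGHTS.keys() - points.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(WEIGHTS[d] * points[d] for d in WEIGHTS) / MAX_POINTS


def band(score: float) -> str:
    """Map a score to the interpretation bands."""
    if score >= 80:
        return "portfolio priority"
    if score >= 60:
        return "controlled pilot"
    if score >= 40:
        return "discovery or sandbox"
    return "defer"


# Illustrative ratings for a hypothetical candidate.
ratings = dict(zip(WEIGHTS, [4, 4, 5, 4, 4, 4, 4, 5, 4, 5]))
score = weighted_score(ratings)
print(round(score, 1), band(score))
```

Under this normalization a candidate rated 1 on every dimension still scores 20, so the lowest band effectively covers 20-39.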

7.2 Funding Gates

Gate	Decision	Required Evidence	Stop Signal
Gate 0: Strategic fit	Whether it is worth entering discovery	AI transformation thesis, value stream candidate, sponsor	Only "we want to try AI," with no business outcome
Gate 1: Capability discovery	Whether a capability increment takes shape	value stream map, capability gap, baseline, owner	Use case cannot be mapped to a capability or owner
Gate 2: Architecture option	Whether to approve the pilot architecture	ADR, data/knowledge readiness, risk tier, build/buy/hybrid decision	Architecture bypasses entitlement, audit, eval, or rollback
Gate 3: Controlled pilot	Whether to enter a controlled pilot	eval set, pilot cohort, HITL, success metrics, runbook draft	No golden set or no human-review accountability
Gate 4: Production	Whether to enter production	eval report, security/risk sign-off, audit reconstruction, operating RACI	Quality, compliance, cost, or adoption below threshold
Gate 5: Scale	Whether to scale reuse	adoption dashboard, benefit evidence, incident trend, cost per case	High usage but poor quality, or value cannot be demonstrated
Gate 6: Refresh / retire	Whether to keep investing	maturity trend, vendor review, model/platform change impact	Capability obsolete, cost out of control, or risk beyond appetite
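
A funding gate can be treated as an evidence checklist: the gate passes only when every required item is present. The Python sketch below shows this for two gates. The evidence keys paraphrase the "Required Evidence" column, and both the key names and the `gate_decision` helper are illustrative assumptions.

```python
# Evidence keys paraphrase the "Required Evidence" column; the key names are illustrative.
GATE_EVIDENCE = {
    "Gate 3: Controlled pilot": {
        "eval_set", "pilot_cohort", "hitl", "success_metrics", "runbook_draft",
    },
    "Gate 4: Production": {
        "eval_report", "security_risk_signoff", "audit_reconstruction", "operating_raci",
    },
}


def gate_decision(gate: str, evidence: set[str]) -> tuple[bool, list[str]]:
    """Pass only when every required evidence item is present; also list what is missing."""
    missing = sorted(GATE_EVIDENCE[gate] - evidence)
    return (not missing, missing)


passed, missing = gate_decision(
    "Gate 3: Controlled pilot",
    {"eval_set", "pilot_cohort", "success_metrics"},
)
print(passed, missing)
```

Presence of evidence is only a precondition; the stop signals in the table still call for a judgment by the gate owner.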

7.3 Portfolio Balancing

A mature AI portfolio contains at least four types of investment:

Portfolio Type	Purpose	Examples
Business outcome bets	Directly improve a key value stream	AML investigation, service containment, loan origination
Shared runway investments	Provide capabilities reused across use cases	model gateway, eval platform, knowledge governance
Risk reduction investments	Reduce regulatory, security, and operational risk	audit reconstruction, access controls, incident runbook
Learning options	Quickly validate new technologies or new patterns	agentic workflow sandbox, advisor copilot shadow mode

8. Architecture Runway

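The architecture runway is the technology, data, governance, and operations foundation that supports the next several capability increments. It is neither a one-off "big platform purchase" nor a separate stack built by every project.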

8.1 Runway Principles

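Build only the shared capabilities that will clearly be used in the next 2-3 quarters.
Every runway item must be tied to at least two value streams or one high-risk, heavily regulated use case.
The runway backlog is driven by capability gaps, not by vendor roadmaps.
Every platform capability must have consumers, an SLO, a cost model, and an owner.
For high-risk business, the runway must first cover access, audit, eval, incident, and rollback.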
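
Several of these principles are mechanical enough to check on each backlog item. The Python sketch below is a hedged illustration: the `RunwayItem` fields, the three-quarter cutoff, and the sample values (including the SLO text) are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RunwayItem:
    """Hypothetical backlog record for one runway item."""
    name: str
    needed_within_quarters: int
    value_streams: list[str]
    high_risk_regulated: bool = False
    consumers: list[str] = field(default_factory=list)
    slo: str = ""
    cost_model: str = ""
    owner: str = ""


def runway_violations(item: RunwayItem) -> list[str]:
    """Return the runway principles this item does not yet satisfy."""
    violations = []
    if item.needed_within_quarters > 3:
        violations.append("not needed within the next 2-3 quarters")
    if len(set(item.value_streams)) < 2 and not item.high_risk_regulated:
        violations.append("fewer than two value streams and not a high-risk regulated use case")
    for attr in ("consumers", "slo", "cost_model", "owner"):
        if not getattr(item, attr):
            violations.append(f"missing {attr}")
    return violations


gateway = RunwayItem(
    name="Model gateway MVP",
    needed_within_quarters=1,
    value_streams=["service", "AML"],
    consumers=["service", "AML"],
    slo="99.5% availability",  # illustrative value
    cost_model="per-call chargeback",  # illustrative value
    owner="Platform Owner",
)
moonshot = RunwayItem(name="Unified AI platform", needed_within_quarters=6, value_streams=["service"])
print(runway_violations(gateway))
print(runway_violations(moonshot))
```

The principle that the backlog is driven by capability gaps rather than vendor roadmaps is not checkable this way and stays a review question.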

8.2 Runway Components

Runway Component	Capability Enabled	Financial Retail Importance
Model gateway	provider routing, model versioning, fallback, usage telemetry	Avoids vendor lock-in; supports audit and cost governance
Retrieval and context platform	cited knowledge, evidence grounding, entitlement-aware search	Customer service, AML, wealth advisors, and credit policy all depend on it
Knowledge governance	source owner, effective date, jurisdiction, versioning	Keeps expired policies and unentitled content out of answers
EvalOps platform	golden set, regression, release gate, failure taxonomy	The entry ticket for high-risk use cases moving from PoC to production
Tool permission gateway	action policy, approval, idempotency, audit	The control layer before an agent executes payments, case updates, or CRM actions
AI observability	latency, cost, quality, retrieval, tool, incident telemetry	Supports SLO, risk review, vendor review, FinOps
Data contracts	schema, lineage, quality, retention, ownership	Prevents model output from being built on untraceable data
Human review workbench	review queue, reason codes, QA calibration	Supports AML, lending, complaints, suitability
AI risk registry	use case inventory, risk tier, controls, evidence	Aligns with NIST AI RMF, the architecture board, and compliance review
Adoption dashboard	activation, frequency, trust, override, productivity	Prevents launch without any change in how work is done

8.3 Example Roadmap

Horizon	Capability Increment	Runway Focus	Gate
0-90 days	Service knowledge copilot pilot, AML investigation shadow mode	knowledge owner registry, model gateway MVP, golden set, audit schema	Gate 2 / Gate 3
3-6 months	Service production, AML controlled pilot, lending policy assistant pilot	entitlement-aware retrieval, QA workbench, production monitoring, incident runbook	Gate 4
6-12 months	Cross-domain knowledge platform, lending production, branch advisor pilot	tool permission gateway, integrated workflow, cost allocation, portfolio dashboard	Gate 5
12-18 months	Agentic operations, enterprise reuse, adaptive eval	automated regression, control effectiveness monitoring, multi-provider strategy	Gate 6

9. Financial Retail Cases

9.1 AML: 从 Alert Copilot 到 Financial Crime Intelligence Capability

Capability thesis:

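AML AI is not "helping analysts write summaries"; it builds the evidence organization, investigative reasoning, narrative generation, QA feedback, and control effectiveness capabilities of financial crime operations.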

Layer	Design
Value stream	Alert intake -> enrichment -> investigation -> disposition -> SAR support -> QA -> learning
Business capabilities	alert triage, entity risk understanding, evidence management, investigation narrative, QA calibration
Shared AI capabilities	evidence-grounded summarization, transaction graph context, policy retrieval, narrative drafting
Control capabilities	source citation, SAR decision boundary, human approval, audit reconstruction, model/prompt versioning
Architecture runway	case management API, transaction graph, entitlement-aware RAG, eval set, audit log
Funding gate	Pilot only after evidence lineage, golden set, compliance-reviewed narrative boundary and reviewer workflow
Metrics	investigation time, evidence completeness, QA pass rate, false positive reduction, reviewer override rate

Recommended roadmap:

Shadow mode: AI prepares evidence pack, analyst does not rely on output for final disposition.
Controlled pilot: AI drafts investigation summary with citations and uncertainty flags.
Production: AI integrated into case workflow, all outputs reviewed and logged.
Scale: QA defects feed eval, scenarios expand by typology and jurisdiction.

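9.2 Customer Service: From Chatbot to Omnichannel Service Intelligence

Capability thesis:

The core of customer service AI is not "a bot answering questions"; it is unifying knowledge, intent, identity, context, service process, and next action, lowering cost-to-serve while protecting customer trust.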

Layer	Design
Value stream	Customer contact -> authentication -> intent -> resolution -> follow-up -> feedback
Business capabilities	intent management, service policy interpretation, case resolution, complaint handling, knowledge operations
Shared AI capabilities	agent assist, cited policy Q&A, conversation summarization, next-best-action
Control capabilities	identity boundary, prohibited advice controls, escalation, complaint detection, transcript audit
Architecture runway	contact center integration, CRM context, knowledge versioning, channel policy, quality sampling
Funding gate	Production only after answer accuracy, escalation precision, policy freshness and supervisor QA metrics pass
Metrics	first contact resolution, average handle time, containment with quality, escalation accuracy, CSAT, complaint rate

Key design choice:

High-risk financial advice, fee dispute, hardship, fraud and complaint scenarios should route to human or constrained guidance.
Low-risk servicing, status inquiry, document guidance and internal agent assist can scale earlier.

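9.3 Credit: From Policy Assistant to AI-Assisted Credit Lifecycle

Capability thesis:

Credit AI cannot be limited to "approval recommendations." A more prudent enterprise capability path starts with policy reasoning, document intelligence, underwriter assist, exception routing, and monitoring, and only then moves step by step into decision augmentation.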

Layer	Design
Value stream	Application -> data collection -> verification -> underwriting -> offer -> closing -> monitoring
Business capabilities	borrower understanding, policy eligibility, credit risk assessment, exception management, adverse action support
Shared AI capabilities	document extraction, policy RAG, income reasoning support, risk signal explanation
Control capabilities	fair lending review, adverse action boundary, override capture, model risk management, explainability evidence
Architecture runway	LOS integration, document pipeline, policy versioning, feature lineage, decision record
Funding gate	No automated adverse decision without model risk, fair lending, human oversight and audit evidence
Metrics	application cycle time, stipulation rate, manual touch rate, policy exception rate, defect rate, fairness monitoring

Practical sequence:

Start with document intelligence and policy assistant.
Add underwriter evidence pack and exception checklist.
Add decision support with challenger signals and override reason capture.
Consider constrained automation only for low-risk, well-defined decisions with strong monitoring.

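9.4 Wealth / Branch: From Advisor Copilot to Relationship Intelligence

Capability thesis:

The value of wealth and branch AI is not only better sales scripts; it strengthens customer understanding, suitability compliance, product knowledge, relationship management, and frontline execution quality.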

Layer	Design
Value stream	Customer review -> needs discovery -> suitability -> recommendation -> meeting notes -> follow-up
Business capabilities	relationship planning, product suitability, financial needs analysis, branch productivity, advisor supervision
Shared AI capabilities	meeting summarization, product/policy retrieval, next-best-conversation, portfolio insight
Control capabilities	suitability guardrails, disclosure prompts, approved language, supervisory review, complaint detection
Architecture runway	CRM, portfolio data, product catalog, policy knowledge, branch/advisor role permissions
Funding gate	Rollout only after suitability boundaries, approved content library and supervisory workflow are live
Metrics	preparation time, follow-up completion, advisor adoption, compliance defects, customer engagement, revenue quality

Design warning:

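The advisor copilot must not become an unreviewed investment-advice generator.
Explicit boundaries must be set for product recommendations, return expectations, risk ratings, and customer suitability.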

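9.5 Enterprise AI Platform: From Tool Procurement to Shared Enterprise AI Capability

Capability thesis:

An AI platform is not "buying an LLM gateway" or "building one unified RAG." The platform's value lies in getting business capabilities into production faster, more safely, and more reusably.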

Platform Capability	Business Capability Enabled	Evidence of Value
Model gateway	Model access and rollback across business lines	A provider change does not break business processes; cost is traceable
Retrieval platform	Trusted knowledge for customer service, AML, credit, and wealth	cited answers, entitlement, freshness, reduced duplicate indexes
EvalOps	Quality gating from pilot to production	release blocked by eval failures, defect trend improving
Tool gateway	Controlled actions for agentic workflows	action approval, audit, idempotency, kill switch
AI observability	Production quality, cost, and risk management	model, prompt, retrieval, tool, user feedback traces
Governance registry	AI inventory and risk evidence	architecture board, risk review, audit package

Platform funding rule:

Do not request budget on a "unified platform vision."
Request budget on the shared dependencies of 2-3 high-value capability increments.
Every shared component must demonstrate reuse, adoption, SLO, cost model, and retirement rule.

10. Templates

10.1 AI Capability Brief

Field	Content
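Capability name	Stable noun phrase, for example Evidence-Grounded AML Investigation
Capability owner	The person accountable for KPI, risk, budget, and roadmap
Strategic theme	The corresponding enterprise strategic theme
Value streams supported	The end-to-end value streams supported
Current maturity	Level 0-5, with evidence
Target maturity	Target level and time window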
Business outcome	revenue, cost, risk, experience, resilience, speed
AI pattern	RAG, decision support, agentic workflow, predictive model, document intelligence
Data / knowledge dependencies	source of truth, owner, quality, entitlement, retention
Architecture dependencies	model gateway, eval, workflow, audit, integration, security
Risk tier	Low, medium, high, regulated critical
Control design	human review, guardrail, audit, incident, fallback
Metrics	business KPI, quality KPI, risk KPI, adoption KPI, cost KPI
Funding ask	discovery, pilot, production, scale, refresh
Exit rule	stop condition, retire condition, vendor exit trigger

10.2 Capability Heatmap

Capability	Owner	Current Level	Target Level	Value Stream Impact	Risk Exposure	Reuse Potential	Priority
Evidence-grounded knowledge retrieval	Knowledge Platform Owner	2	4	Customer service, AML, wealth	High	High	Priority 1
AI eval and release gate	EvalOps Owner	1	4	All AI value streams	High	High	Priority 1
Advisor meeting intelligence	Wealth Ops Owner	1	3	Wealth and branch	Medium	Medium	Priority 2

10.3 Value Stream Capability Matrix

Value Stream Stage	Pain Metric	Business Capability	AI Capability	Control Capability	Runway Dependency	Decision
Investigation evidence assembly	40 minutes per case	Case evidence management	Evidence summarization	Citation and access control	Case API, retrieval platform	Pilot
Customer policy answer	25 percent escalation due to knowledge gaps	Service knowledge management	Cited Q&A	Approved policy and escalation	Knowledge registry	Production candidate

10.4 Maturity Assessment

Dimension	Evidence Observed	Current Level	Target Level	Gap	Next Investment
Business ownership	Sponsor named, no capability owner yet	1	3	Accountability	Appoint owner and define KPI/RACI
EvalOps	Manual sample review only	1	3	Release gate	Build golden set and regression runner
Architecture	Direct vendor UI, no integration	1	3	Audit and workflow	Define ADR and integrate with system of record

10.5 Portfolio Prioritization Scorecard

Candidate	Strategic	Pain	Reuse	Baseline	Readiness	Risk	Architecture	Adoption	Economics	Learning	Total	Decision
AML investigation capability	5	5	4	4	3	3	3	4	4	4	78	Controlled pilot after runway gap closure
Customer service knowledge copilot	4	4	5	4	4	4	4	5	4	5	86	Production candidate
Advisor autonomous recommendation agent	4	3	3	2	2	1	2	3	3	3	52	Discovery only

10.6 Architecture Runway Backlog

Runway Item	Enables	Consumers	Owner	Done Evidence	Sequence
Model gateway MVP	model routing, telemetry, fallback	service, AML, lending	Platform Owner	versioned model calls, logs, budget caps	First
Knowledge registry	source owner, version, freshness	service, wealth, lending	Knowledge Owner	owner map, effective dates, access metadata	First
Eval release gate	regression and production promotion	all AI capabilities	EvalOps Owner	golden set, threshold, release report	First
Tool permission gateway	controlled agent action	operations, payments, CRM	Security / Platform	action policy, approval log, kill switch	Later

10.7 Funding Gate Decision Memo

# Funding Gate Decision Memo

## Decision
Approve controlled pilot for [capability name] / Do not approve production expansion for [capability name].

## Business architecture evidence
- Strategic theme:
- Value stream:
- Capability gap:
- Capability owner:
- Current maturity:
- Target maturity:

## Architecture evidence
- Chosen pattern:
- ADR summary:
- Data and knowledge sources:
- Integration boundary:
- Audit and rollback:

## Risk and control evidence
- Risk tier:
- Human oversight:
- Eval result:
- Security/privacy controls:
- Incident runbook:

## Economics
- Baseline:
- Expected benefit:
- Cost per case or user:
- Reuse potential:
- Budget cap:

## Conditions
- Production entry condition:
- Scale condition:
- Stop condition:

10.8 Capability Owner Charter

Field	Content
Owner	Name and role
Scope	Capabilities, value streams, user groups
Accountability	KPI, risk, funding, adoption, quality
Decision rights	Scope, release, stop, scale, vendor escalation
Cadence	Weekly pilot review, monthly risk review, quarterly portfolio review
Evidence pack	KPI dashboard, eval report, incident log, cost report, adoption dashboard

11. Review Checklist

Strategy and Portfolio

Is there a clear AI transformation thesis beyond isolated use cases?
Are priority value streams named and tied to enterprise outcomes?
Is the candidate mapped to a capability, not only a feature or vendor tool?
Does the portfolio balance business bets, shared runway, risk reduction and learning options?
Are stop, scale and refresh rules defined before funding approval?

Business Architecture

Is the value stream mapped end to end, including controls and exceptions?
Are capability gaps visible as heatmap evidence?
Is each capability assigned to a real owner with budget and KPI accountability?
Are organization, role, policy and workflow changes included?
Are customer, employee, risk and regulatory concerns represented as architecture concerns?

Data and Knowledge

Are source of truth, owner, quality, lineage, retention and access documented?
Are knowledge sources versioned by effective date, jurisdiction and product?
Are retrieved documents treated as evidence, not instructions?
Is entitlement enforced before retrieval and generation?
Is stale or conflicting evidence handled explicitly?

AI Quality and Eval

Is there a golden set with realistic positive, negative and edge cases?
Does eval include domain quality, citation quality, refusal, escalation and control behavior?
Can eval failures block release?
Are production feedback and incidents converted into regression cases?
Are model, prompt, retrieval index and tool versions traceable?

Architecture and Security

Is the architecture described through stakeholder concerns and views?
Are model gateway, retrieval, tool, audit and fallback boundaries explicit?
Are prompt injection, data exfiltration, excessive agency and unsafe tool use addressed?
Are human review, approval and rollback implemented in workflow, not only policy text?
Can the enterprise reconstruct who saw what, which evidence was used, which version produced output and who approved final action?

Operating Model and Adoption

Is there a RACI for product, process, data, knowledge, model, eval, risk, security and operations?
Are frontline users trained on when to trust, challenge, escalate and ignore AI output?
Are adoption metrics linked to workflow redesign instead of login counts only?
Are supervisors and QA reviewers calibrated?
Does the operating cadence include quality, risk, cost and benefit review?

Funding

Does the funding request distinguish discovery, pilot, production, scale and runway?
Is reuse value credited to shared capabilities?
Is cost per case, user, workflow or decision measured?
Are vendor and platform costs visible across model, storage, retrieval, observability and support?
Is there a retirement or exit trigger?

12. Anti-Patterns

Anti-Pattern	Symptom	Better Pattern
Use case zoo	50 AI ideas, no architecture thesis	Capability portfolio tied to value streams
PoC theater	Demo success, no production owner	Funding gates with eval, risk and operating evidence
Model-first architecture	Team starts with model benchmark	Start with capability gap, risk tier and workflow
Platform moonshot	Huge AI platform before business consumers	Runway built for named capability increments
Vendor-led architecture	Vendor demo becomes target architecture	Enterprise-owned ADR, control pack and exit plan
RAG as strategy	Every problem becomes document search	Match AI pattern to decision, workflow and risk
One-size copilot	Same assistant for analyst, advisor, agent and customer	Role-specific context, permissions, output and controls
HITL as decoration	Human reviewer rubber-stamps AI output	Reviewer authority, reason codes, QA calibration
Eval after launch	Quality checked by anecdotal feedback	Golden set and release gate before production
Governance theater	Policy deck exists, system has no controls	Controls embedded in workflow, logs and approval paths
ROI theater	Benefits assumed from time saved	Baseline, adoption, quality and cost per case tracked
Capability without owner	Everyone wants platform, nobody owns outcomes	Capability owner charter and quarterly review
Architecture roadmap as procurement list	Roadmap equals vendor modules	Roadmap equals capability increments plus runway
Automation before redesign	AI accelerates broken workflow	Redesign value stream, controls and roles first
Compliance as final sign-off	Risk sees solution after build	Risk and compliance join at Gate 0 and Gate 1

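13. 30-Day Training Plan

Goal: within 30 days, build a presentable AI capability-based planning portfolio package aimed at financial-retail AI transformation, enterprise architecture, and senior AI PM/BA interviews.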

Day	Focus	Output
1	Choose one strategic theme: financial crime, service transformation, credit lifecycle, branch/wealth, or AI platform	One-page AI transformation thesis
2	Select 1-2 priority value streams	Value stream scope and baseline metrics
3	Draw the AS-IS value stream and mark pain points, decisions and controls	Value stream map v1
4	Define target outcomes and risk appetite	Outcome tree and risk boundary
5	Build the L0-L2 capability map	Capability map v1
6	Build the capability heatmap	Heatmap with owner and maturity
7	Consolidate use cases into capability increments	Use case to capability consolidation table
8	Design the capability maturity model	Maturity assessment v1
9	Define data and knowledge readiness	Source inventory and ownership map
10	Define AI patterns	Pattern decision matrix
11	Write architecture concerns and viewpoints	Stakeholder concern matrix
12	Write the first ADR set: RAG/model/eval/workflow	ADR set v1
13	Define the EvalOps strategy	Golden set outline and quality rubric
14	Define the control pack	AI RMF mapped control table
15	Design the architecture runway	Runway backlog v1
16	Build the portfolio scorecard	Prioritization model and scored candidates
17	Define funding gates	Gate evidence checklist
18	Design the operating model	RACI and governance cadence
19	Design the adoption dashboard	Activation, trust, quality and benefit metrics
20	Work out the economics	Cost per case, TCO and reuse economics
21	AML case deep dive	AML capability brief
22	Customer service case deep dive	Service intelligence capability brief
23	Credit case deep dive	Credit lifecycle capability brief
24	Wealth/branch case deep dive	Relationship intelligence capability brief
25	AI platform case deep dive	Shared AI platform capability brief
26	Integrate the roadmap	0-18 month architecture roadmap
27	Write the executive decision memo	Funding gate memo
28	Prepare the interview story	5-minute portfolio narrative
29	Self-review anti-patterns and gaps	Review checklist evidence
30	Assemble the portfolio package	Final deck outline and artifact index

Weekly practice rule:

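Rewrite at least one scenario from a use case into a capability every week.
Write at least one funding gate decision every week.
Challenge your own architecture roadmap with at least one risk question every week.
Fill in at least one template as a complete worked example every week.

14. Interview Answers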

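Q1: How do you move an enterprise AI transformation from a use case list to a capability portfolio?

30-second version:

I don't start by collecting and ranking AI scenarios. I start from enterprise strategy and the key value streams, and identify the capability gaps that constrain business outcomes. Then I consolidate scattered use cases into capability increments and manage the investment with a maturity model, architecture runway, risk controls and funding gates. This avoids PoC sprawl and puts platform capabilities, risk governance and business value on the same roadmap.

2-minute version:

My method has five steps. First, state the AI transformation thesis, for example lowering the cost of financial crime investigation, raising first-contact resolution in customer service, or shortening the credit cycle. Second, select the priority value streams and map their stages, decision points, control points and baseline. Third, build the capability map, separating business capabilities, shared AI capabilities, control capabilities and adoption capabilities. Fourth, assess each capability's maturity and investment priority, scoring on strategic value, pain intensity, reuse potential, data readiness, risk controllability, architecture fit, adoption and economics. Fifth, use funding gates to control discovery, pilot, production and scale, with every gate requiring business, architecture, eval, risk, operations and cost evidence. The final deliverable is not a list of scenarios but a capability portfolio, a roadmap and a governance cadence.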

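Q2: How does an AI capability map differ from a traditional capability map?

30-second version:

A traditional capability map describes what the organization can do. An AI capability map must also make explicit the AI pattern, data/knowledge dependencies, eval, risk controls, architecture runway, adoption and unit economics, because the production stability of an AI capability depends on these operating conditions.

2-minute version:

I keep the stability principle of capability mapping: capabilities are not named after systems or projects. In the AI context, though, each capability must additionally link to six kinds of information. First, which value stream stage it enhances. Second, which AI pattern it uses, such as RAG, decision support, document intelligence or agentic workflow. Third, which data and knowledge sources it depends on, and whether those sources have an owner, lineage, retention and permissions. Fourth, how it is evaluated, including golden set, rubric and release gate. Fifth, which risks and controls apply, such as human oversight, audit reconstruction and prompt injection defense. Sixth, how its adoption and cost are measured. Only then does the capability map turn from a static business diagram into a tool for AI investment and architecture governance.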

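Q3: How do you prioritize AI investments across AML, customer service, credit, wealth and the AI platform?

30-second version:

I use a portfolio scorecard and look beyond ROI. The core dimensions are strategic alignment, value stream pain, reuse potential, baseline, data/knowledge readiness, risk controllability, architecture fit, adoption, economics and time-to-learning. A high-risk scenario, however valuable, must first close its gaps in eval, audit, human review and risk controls.

2-minute version:

A customer service knowledge copilot can probably reach production early, because its knowledge boundary is clear, its user base is large and its reuse is high, but it still needs policy freshness and escalation controls. AML is high-value and important to regulators, but it should start in shadow mode and a controlled pilot, with the focus on evidence lineage, audit reconstruction and reviewer calibration. Credit calls for more caution: I would start with document intelligence and a policy assistant, then move to underwriter assist, and would not go straight to automated declines or pricing. Wealth and branch are a good fit for starting with meeting summary, approved product knowledge and next-best-conversation, with strict control of suitability and investment-advice boundaries. AI platform investment must be tied to the shared dependencies of these business capabilities, such as model gateway, entitlement-aware retrieval, eval release gate and observability, and cannot be built in isolation from business consumers.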
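
The scorecard logic in this answer can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a calibrated model: the 1-5 scale, the equal default weights, the 3.5 funding threshold and all identifiers are hypothetical; only the ten dimensions and the rule that high-risk candidates must first close control gaps come from the text.

```python
from typing import Optional

DIMENSIONS = (
    "strategic_alignment", "value_stream_pain", "reuse_potential", "baseline",
    "data_knowledge_readiness", "risk_controllability", "architecture_fit",
    "adoption", "economics", "time_to_learning",
)
REQUIRED_CONTROLS = frozenset({"eval", "audit", "human_review", "risk_controls"})


def score_candidate(scores: dict[str, int],
                    weights: Optional[dict[str, float]] = None) -> float:
    """Weighted average of 1-5 scores across all ten dimensions."""
    weights = weights or {d: 1.0 for d in DIMENSIONS}
    missing = set(DIMENSIONS) - set(scores)
    if missing:
        raise ValueError(f"missing scores: {sorted(missing)}")
    total_weight = sum(weights[d] for d in DIMENSIONS)
    return sum(scores[d] * weights[d] for d in DIMENSIONS) / total_weight


def decide(score: float, high_risk: bool, controls_in_place: set[str],
           fund_threshold: float = 3.5) -> str:
    """High-risk candidates cannot be funded until all controls exist."""
    if high_risk and not REQUIRED_CONTROLS <= controls_in_place:
        return "remediate controls first"
    return "fund" if score >= fund_threshold else "defer"


# Illustrative candidates with identical scores but different risk posture.
flat_scores = {d: 4 for d in DIMENSIONS}
service = decide(score_candidate(flat_scores), high_risk=False, controls_in_place=set())
aml = decide(score_candidate(flat_scores), high_risk=True, controls_in_place={"eval", "audit"})
print(service, "|", aml)
```

Treating control readiness as a hard gate instead of one more weighted dimension keeps a high value score from outvoting a missing audit or human-review control.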

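Q4: How do you define architecture runway in an AI transformation?

30-second version:

AI architecture runway is the shared technical, data, control and operational foundation that the next few capability increments need built ahead of time, including model gateway, retrieval, knowledge governance, EvalOps, tool gateway, observability, audit and FinOps. It must be driven by business capability gaps and must not turn into generic platform building.

2-minute version:

I derive the runway backwards from the capability roadmap. For example, if the next two quarters are to deliver a customer service copilot, AML investigation and a credit policy assistant, the shared runway includes a knowledge owner registry, entitlement-aware retrieval, model gateway, eval golden set, audit schema, production monitoring and an incident runbook. An agentic workflow additionally needs a tool permission gateway, action approval, idempotency and a kill switch. Every runway item needs consumers, an owner, an SLO, a cost model and completion evidence. That way platform investment neither lags the business nor becomes a large platform with no consumers.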
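
The rule that every runway item needs consumers, an owner, an SLO, a cost model and completion evidence can be checked mechanically on a runway backlog. A minimal sketch; the `RunwayItem` type, its field names and the sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RunwayItem:
    """Hypothetical runway backlog entry; field names are assumptions."""
    name: str
    consumers: list[str] = field(default_factory=list)
    owner: str = ""
    slo: str = ""
    cost_model: str = ""
    completion_evidence: str = ""

    def gaps(self) -> list[str]:
        """Return the required attributes that are still empty."""
        required = {
            "consumers": self.consumers,
            "owner": self.owner,
            "slo": self.slo,
            "cost_model": self.cost_model,
            "completion_evidence": self.completion_evidence,
        }
        return [name for name, value in required.items() if not value]


retrieval = RunwayItem(
    name="entitlement-aware retrieval",
    consumers=["service copilot", "AML investigation", "credit policy assistant"],
    owner="platform team",
    slo="illustrative latency and availability target",
)
print(retrieval.gaps())
```

An item with an empty `consumers` list is the early warning for a platform component that nobody has asked for.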

Q5: How do you design an AI funding gate?

30-second version:

I split AI funding into discovery, architecture option, controlled pilot, production, scale and refresh gates. Each gate requires different evidence, ranging from value stream and capability owner to data readiness, ADR, eval, risk controls, operating model, adoption, cost per case and stop rules.

2-minute version:

Gate 0 checks strategic fit, so that AI is not pursued just because it is fashionable. Gate 1 covers capability discovery and requires a value stream map, maturity gap, baseline and owner. Gate 2 covers the architecture option and requires ADRs, data/knowledge readiness, risk tier, a build/buy/hybrid decision and a rollback approach. Gate 3 approves a controlled pilot and requires a golden set, pilot cohort, human review and a runbook draft. Only Gate 4 allows production, and it requires an eval report, risk/security sign-off, audit reconstruction and an operating RACI. Gate 5 governs scale and looks at adoption, benefit, incident trend and cost per case. Gate 6 governs refresh or retire and looks at changes in model, vendor, risk and economics. This mechanism governs AI as a long-lived capability instead of a one-off project.
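
The gate sequence described above maps naturally onto an evidence checklist per gate. A minimal sketch: the evidence items mirror the answer, but the identifiers, the Gate 6 item names and the assumption that gates are passed strictly in order are illustrative choices, not part of the playbook.

```python
GATE_EVIDENCE: dict[int, set[str]] = {
    0: {"strategic_fit"},
    1: {"value_stream_map", "maturity_gap", "baseline", "capability_owner"},
    2: {"adr", "data_knowledge_readiness", "risk_tier",
        "build_buy_hybrid_decision", "rollback_approach"},
    3: {"golden_set", "pilot_cohort", "human_review", "runbook_draft"},
    4: {"eval_report", "risk_security_signoff", "audit_reconstruction",
        "operating_raci"},
    5: {"adoption", "benefit", "incident_trend", "cost_per_case"},
    6: {"model_change_review", "vendor_change_review",
        "risk_change_review", "economics_change_review"},
}


def missing_evidence(gate: int, submitted: set[str]) -> list[str]:
    """Evidence still required before the given gate can be approved."""
    if gate not in GATE_EVIDENCE:
        raise ValueError(f"unknown gate: {gate}")
    return sorted(GATE_EVIDENCE[gate] - submitted)


def highest_gate_passed(submitted: set[str]) -> int:
    """Assumes sequential gates; returns -1 if even Gate 0 is not met."""
    passed = -1
    for gate in sorted(GATE_EVIDENCE):
        if missing_evidence(gate, submitted):
            break
        passed = gate
    return passed


# A candidate that has cleared Gates 0-2 and is preparing for the pilot gate.
submitted = (GATE_EVIDENCE[0] | GATE_EVIDENCE[1] | GATE_EVIDENCE[2]
             | {"golden_set", "pilot_cohort"})
print(highest_gate_passed(submitted), missing_evidence(3, submitted))
```

The output of `missing_evidence` is the natural content of a funding gate memo: what has been proven and what must still be shown before money is released.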

Q6: How do you combine NIST AI RMF with business architecture?

30-second version:

I embed the NIST AI RMF functions Govern, Map, Measure and Manage into the capability lifecycle. Govern maps to owners and funding gates, Map to value streams and risk context, Measure to eval and monitoring, and Manage to controls, incidents, releases and continuous improvement.

2-minute version:

At the business architecture layer, I first define the value stream, stakeholders, concerns, capability gaps and risk appetite, which corresponds to Map. Then I bring each capability under portfolio governance with a clear owner, RACI, review cadence and funding gates, which corresponds to Govern. At the solution and operating layers, I design the golden set, rubric, production sampling, drift signals, adoption metrics and cost metrics, which corresponds to Measure. Finally, I build human oversight, escalation, rollback, incident response, model/prompt/index versioning and quarterly review into the operating model, which corresponds to Manage. In this way AI RMF is not a compliance checklist but the control system for capability planning.

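Q7: How do you explain to executives why use cases alone are not enough?

30-second version:

A use case list can start the discussion, but it cannot manage an enterprise transformation. What executives really need to know is which capabilities will create durable advantage, which can be reused, which risks are controllable, and which investments should continue, stop or scale. A capability portfolio puts these questions into one decision framework.

2-minute version:

I explain with an example. Customer service, AML, credit and wealth may each ask for a "knowledge assistant". Built separately as use cases, they produce four knowledge bases, four permission models, four eval setups and four audit trails. In the short term each PoC is fast; in the long term cost, risk and governance complexity all rise. If we abstract it into an enterprise evidence-grounded knowledge capability, we can build the knowledge registry, entitlement-aware retrieval, citation, eval and audit once, then configure outputs and controls for each business role. This preserves business differences while creating reuse economics and consistent governance. That is the value of moving from use cases to a capability portfolio.

15. Portfolio Deliverables

A senior AI enterprise architecture / product strategy / capability planning portfolio package should contain the following deliverables: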

Artifact	Purpose	Interview Signal
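AI transformation thesis	States the strategic choices and boundaries	Starts from enterprise goals, not technology hype
Priority value stream maps	Show end-to-end business flow, controls and pain points	Can do business architecture, not just requirement lists
AI capability map	Shows the L0-L2 capability taxonomy	Can abstract use cases into reusable capabilities
Capability heatmap	Shows maturity, owner and priority	Can rank investments and align the organization
Value stream to capability matrix	Connects process, capability, AI, controls and metrics	Can put business, architecture and risk in one table
Maturity assessment	Explains current/target gaps	Can design a capability evolution path
Portfolio scorecard	Explains the prioritization logic	Can handle limited resources and risk constraints
Architecture runway	Explains shared platform, data, eval, audit and integration dependencies	Can plan enterprise-grade AI architecture
Funding gate memo	States whether to approve discovery/pilot/production/scale	Can produce executive decision material
AI control and eval pack	Explains quality, risk and release gate	Can turn AI RMF into execution evidence
Financial retail case briefs	AML, customer service, credit, wealth/branch, AI platform	Demonstrates industry understanding and transferability
Operating model / RACI	States who is accountable after launch	Avoids the "launch is the finish line" project mindset
Adoption and benefit dashboard	Shows usage, trust, quality, benefit and cost	Can prove transformation value
Executive narrative deck	Tells the story from strategy to capability to roadmap to funding	Can communicate with CIO/COO/CDAO/business executives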

Recommended storyline:

The problem is not lack of AI ideas.
The problem is lack of reusable, governed, measurable AI capabilities.

I start from enterprise strategy and value streams.
I identify capability gaps and maturity.
I consolidate use cases into capability increments.
I design architecture runway and control gates.
I prioritize portfolio funding using value, readiness, risk, reuse and economics.
I prove the approach through AML, service, lending, wealth/branch and AI platform cases.

16. Practical Operating Cadence

Cadence	Meeting	Inputs	Decisions
Weekly	Capability pilot review	eval results, user feedback, incidents, cost, defects	prompt/index/workflow fixes, pilot scope adjustments
Biweekly	Value stream transformation review	baseline movement, blockers, role changes, control issues	process redesign, adoption actions, dependency escalation
Monthly	AI architecture and risk review	ADRs, risk register, control evidence, security findings	release, rollback, new controls, architecture exceptions
Quarterly	Capability portfolio review	heatmap, scorecard, benefit, cost, incidents, reuse	scale, stop, refresh, fund runway, rebalance portfolio

Quarterly review questions:

Which capabilities moved maturity level?
Which capabilities created reusable assets?
Which value streams show measurable improvement?
Which controls failed or required manual compensation?
Which platform components are underused or over-centralized?
Which use cases should be merged, stopped or reframed?
Which funding gates need stronger evidence next quarter?

17. Final Mental Model

Use case thinking asks:

What AI thing can we build for this department?

Capability-based planning asks:

Which enterprise capabilities must become AI-enabled,
which value streams will improve,
which architecture runway is required,
which controls make it trustworthy,
which funding gates prove it deserves to scale,
and which owners will operate it after launch?

This is the shift from AI experimentation to AI enterprise transformation.