AI Portfolio Management / Funding Governance Playbook
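Positioning: an AI portfolio management and funding governance handbook for AI PMs, AI Portfolio Leads, AI Product Architects, Enterprise Architects, and leaders of AI transformation in retail financial services.
Goal: connect the AI use case portfolio, portfolio kanban, investment thesis, capacity allocation, risk-adjusted value, funding gates, benefits realization, platform runway, and scale/stop rules into one operable governance system.
Core view: AI portfolio management is not about ranking use cases by priority; it is the continuous management of an investment portfolio made up of value evidence, risk evidence, platform capability, and organizational capacity.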
1. Source Anchors
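These sources serve as learning anchors and do not constitute legal, compliance, audit, financial, or regulatory advice.
| Anchor | Link | How this playbook uses it |
|---|---|---|
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Uses Govern / Map / Measure / Manage to organize AI risk tiering, evidence gates, monitoring, and ongoing governance |
| SAFe Lean Portfolio Management | https://scaledagileframework.com/lean-portfolio-management/ | Borrows the portfolio governance vocabulary of portfolio vision, portfolio kanban, lean budgets, guardrails, and capacity allocation |
| COSO ERM | https://www.coso.org/guidance-on-erm | Uses an enterprise risk management lens to connect strategy, risk appetite, performance, review, and information and communication |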
2. One-Sentence Positioning
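AI Portfolio Management / Funding Governance is an investment operating system that manages AI opportunities from the idea funnel through to the scale/stop decision: portfolio kanban governs flow, the investment thesis governs direction, capacity allocation and funding guardrails govern money, risk-adjusted scoring governs priority, gate evidence governs release, benefits realization governs delivery on commitments, and platform runway governs long-term reuse.
A shorter interview version:
I do not manage AI use cases as a project list. I treat them as a risk-adjusted product investment portfolio: every use case must demonstrate business value, data availability, architectural reuse, controllable risk, and achievable adoption; otherwise it is narrowed, paused, or stopped.
3. Why an AI Portfolio Cannot Be Managed Like a Traditional Program
Traditional program management usually revolves around scope, budget, timeline, dependencies, milestones, and the resource plan. An AI portfolio adds several variables that keep moving:
- Uncertain value: the gap between demo value and production value is large, so time-to-evidence matters, not just the business case.
- Unstable capability: model capability, cost, latency, tooling, vendor terms, policy, and regulation keep changing.
- Heavy data dependency: many use cases fail not because the model is weak but because data readiness, knowledge ownership, lineage, label quality, or workflow integration is immature.
- Nonlinear risk: the same summarization capability carries very different risk in internal knowledge retrieval than in explaining a loan denial to a customer.
- Strong platform reuse: a single model gateway, eval harness, RAG pipeline, policy-as-code layer, observability stack, or human review queue can change the economics of dozens of use cases.
- Adoption is the bottleneck: AI value is often blocked by employee trust, process redesign, performance metrics, division of accountability, and frontline training.
- Stopping also creates value: stopping low-quality or high-risk use cases early frees capacity for high-leverage platform capabilities and more clearly defined business problems.
So the core question for an AI portfolio is not: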
Which project should we fund this quarter?
but rather:
Which AI bets deserve capacity now?
Which bets need discovery before funding?
Which bets should share platform runway?
Which bets create unacceptable risk or adoption friction?
Which bets have enough evidence to scale?
Which bets should be stopped before they consume more scarce capacity?
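3.1 Traditional Program vs AI Portfolio
| Dimension | Traditional program management | AI portfolio management |
|---|---|---|
| Funding object | Projects or requirement packages | Product capabilities, platform capabilities, experiment portfolios, risk control capabilities |
| Basis for approval | business case, sponsor priority, annual budget | investment thesis, risk-adjusted value, time-to-evidence, platform leverage |
| Progress language | milestone, delivery date, scope completion | evidence stage, eval result, adoption signal, risk trend, benefits realization |
| Success criteria | Launched on time and on budget | Business outcome validated, risk controlled, operable, and either scalable or stopped in time |
| Risk governance | Pre-launch approval | Continuous gates from idea intake to scale/retire |
| Resource management | Person-months and project budget | capacity allocation, risk budget, platform runway, SME review capacity |
| Change management | change request | learning-based reallocation |
| Exit mechanism | Project closure | stop / retire / merge / convert to platform capability |
3.2 The Three Accounts of an AI Portfolio
AI portfolio governance has to manage three accounts at once, not just the financial budget.
| Account | What it covers | What happens if it is ignored |
|---|---|---|
| Value account | Revenue, cost savings, risk reduction, customer experience, employee productivity, compliance quality | Many demos look interesting but deliver no realizable benefit |
| Risk account | Customer harm, regulatory risk, privacy, security, model risk, operational resilience, reputation | Launches are fast, but risk incidents erode trust and executive support |
| Capacity account | Product, engineering, data, risk, SME, platform, operations training, and change management capacity | The portfolio board looks full while the teams are crushed by reviews, integration, and support |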
4. AI Portfolio Kanban / Funnel
The purpose of an AI portfolio kanban is to control the flow of investment, not to report status. Every column should have entry criteria, exit criteria, a WIP limit, an evidence requirement, and a decision owner.
Idea
-> Discovery
-> Pilot
-> Release
-> Scale
-> Retire / Stop
4.1 Portfolio Kanban Overview
| Stage | Main question | Typical evidence | Key decision | Recommended WIP control |
|---|---|---|---|---|
| Idea | Is it worth entering discovery? | business problem, owner, affected process, initial value, initial risk | accept to discovery / park / reject | A wide intake is acceptable, but every idea needs a sponsor and a problem statement |
| Discovery | Is it worth spending pilot capacity? | baseline metric, workflow map, data readiness, risk tier, AI fit, no-AI option | fund pilot / refine / stop | Limit parallel discovery so product and SME time is not exhausted by interviews |
| Pilot | Is there enough evidence for a controlled release? | eval result, SME review, prototype usage, cost/latency, control design | release candidate / pivot / stop | Pilots must be short-cycle with explicit success criteria |
| Release | Can it go live in a real workflow? | production architecture, monitoring, rollback, training, RACI, risk approval | release / conditional release / no-go | High-risk use cases must release to a small scope |
| Scale | Is it worth expanding to more queues, product lines, regions, or channels? | realized benefits, adoption cohort, incident trend, unit economics, platform capacity | scale / hold / restrict / stop | Scale draws on its own capacity and must not crowd out discovery |
| Retire / Stop | Should it be stopped, merged, or replaced? | value miss, risk breach, low adoption, better platform option, cost drift | retire / merge / rebuild / archive | A stop decision must be treated as mature governance, not as covering up a failure |
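The WIP controls in the table can be made concrete with a small admission check. The sketch below is a minimal illustration rather than a prescribed tool: the stage names come from the kanban flow above, while the numeric limits, the `can_enter` function, and the sample board are hypothetical.

```python
from collections import Counter

# Stage order taken from the kanban flow in this section.
STAGES = ["Idea", "Discovery", "Pilot", "Release", "Scale", "Retire / Stop"]

# Hypothetical WIP limits; None means the column is uncapped.
WIP_LIMITS = {
    "Idea": None,
    "Discovery": 4,
    "Pilot": 3,
    "Release": 2,
    "Scale": 2,
    "Retire / Stop": None,
}


def can_enter(stage, board, limits=WIP_LIMITS):
    """Return True if one more item may enter `stage` without breaching its WIP limit."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    limit = limits.get(stage)
    if limit is None:
        return True
    in_stage = Counter(board.values())[stage]
    return in_stage < limit


# board maps use case name -> current stage (illustrative data only)
board = {
    "AML alert triage assistant": "Pilot",
    "Customer service AI assist": "Pilot",
    "Credit ops memo assistant": "Discovery",
}
print(can_enter("Pilot", board))
print(can_enter("Discovery", board))
```

In practice the limits would be set per quarter by the decision owner of each column; the point of the check is only that a full column forces a stop, merge, or exit decision before anything new is admitted.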
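4.2 Idea Intake: From "We Want to Use AI" to "Investing in a Problem"
An idea should not start with "we want to build a chatbot"; it should start with a business problem and an investment hypothesis.
| Intake field | Acceptable entry |
|---|---|
| Business problem | AML analysts handle a large volume of low-risk alerts every day; repetitive verification and narrative drafting consume time that should go to high-value investigation |
| Target outcome | Reduce average handling time for low-risk alerts while maintaining SAR escalation quality and auditability |
| Process owner | Financial Crime Operations |
| User group | L1 AML analysts, QA reviewers, team leads |
| AI role | Retrieve case context, generate reviewable summaries, and recommend a next-step investigation checklist; does not close alerts automatically |
| Baseline metric | AHT, false positive rate, QA defect rate, escalation rate, backlog age |
| Risk hypothesis | An incorrect summary could degrade investigation quality; sensitive customer information must be processed in a controlled environment |
| Platform dependency | RAG, case system connector, model gateway, trace logging, SME feedback queue |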
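4.3 Discovery: Prove It Is Worth Doing and a Fit for AI
The output of discovery is not a PRD but investment evidence.
| Discovery evidence | Question to answer | Acceptance standard |
|---|---|---|
| Workflow evidence | Which work step does AI enter? How do upstream and downstream responsibilities change? | Current-state process, target process, human decision points, and exception paths are documented |
| Baseline metric | What are the current cost, duration, quality, and risk? | A quantifiable baseline with a data source |
| AI fit | Why is AI a better fit than rules, process redesign, UI changes, or training? | A comparison against a no-AI alternative exists |
| Data readiness | Are data, knowledge sources, labels, and permissions sufficient? | Owner, freshness, quality, access, and retention have been assessed |
| Risk tier | How large is the customer, financial, compliance, privacy, security, and operational impact? | A risk tier and an initial control path have been assigned |
| Adoption friction | Why would users use it? Why might they not? | User evidence and change management hypotheses exist |
| Time-to-evidence | How soon can decision-grade evidence be obtained? | The pilot can validate the key hypotheses within 4-8 weeks |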
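4.4 Pilot: Not a "Small Version" but a Test of Key Hypotheses
An AI pilot should test the hypotheses that are most uncertain and most likely to change the investment decision.
| Pilot type | When it applies | Validation focus |
|---|---|---|
| Concierge pilot | Humans simulate the AI capability behind the scenes | workflow value, adoption, decision journey |
| Offline eval pilot | The model or RAG is tested on historical samples | accuracy, coverage, failure modes, risk controls |
| Shadow mode pilot | Runs alongside the production process without affecting customer or employee decisions | model behavior, latency, cost, case variance |
| Limited live pilot | Real use with a small queue, small user group, or small region | adoption, operational load, incident handling, benefit signal |
| Platform pilot | A capability shared by several use cases is validated | reusable architecture, developer experience, unit economics |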
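4.5 Release: From Experimental Evidence to Controlled Production
The release gate asks not only "is the feature complete" but also "is it operable, monitorable, explainable, reversible, and auditable".
| Release dimension | What must be explicit |
|---|---|
| Scope boundary | Which users, queues, channels, products, languages, and regions are in scope |
| AI authority | Whether the AI performs search, draft, recommend, triage, decision support, or automated action |
| Human oversight | Which outputs require human confirmation, which may be submitted automatically, and which require second-line review |
| Monitoring | quality, risk, cost, latency, adoption, drift, and incident metrics |
| Rollback | Trigger conditions, executor, technical path, user communication, data handling |
| Evidence retention | prompt/model/index/tool version, source citation, trace, review decision, approval record |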
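4.6 Scale: Turn a Successful Use Case into a Repeatable Capability
Scale is not copying a launch; it tests economics and governance in a more complex environment.
| Scale question | Criterion |
|---|---|
| Value repeatability | Whether benefits remain similar across queues, product lines, regions, or user groups |
| Risk stability | Whether incident rate, misuse rate, complaints, and QA defects rise with scale |
| Platform capacity | Whether gateway, retrieval, logging, eval, support, and SME review can carry the load |
| Operating model | Whether RACI, training, runbook, support desk, and change management are mature |
| Unit economics | Whether cost per task, hours saved, risk cost, and platform amortization are acceptable |
| Architecture leverage | Whether reusable connectors, policies, eval sets, and workflow patterns are being built up |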
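4.7 Retire / Stop: Write Stop Rules into Portfolio Governance Up Front
The goal of a stop rule is not to punish the team but to protect portfolio capacity.
| Stop signal | Decision |
|---|---|
| A key value hypothesis did not hold in the pilot | Stop, or return to discovery to redefine the problem |
| Data quality or permissions cannot support production | Pause the use case and convert it into a data product / governance investment |
| Risk control cost exceeds business value | Stop, or narrow to a low-risk assistive scenario |
| Adoption stays below threshold for an extended period | Stop scaling; redesign the workflow and change plan |
| The platform already offers a better general-purpose capability | Merge into or migrate to the platform capability and shut down the standalone implementation |
| Vendor cost, lock-in, or compliance risk is unacceptable | Exit the vendor path and move to build / alternative vendor |
| Production incidents exceed risk appetite | Immediately restrict scope, roll back, or retire |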
5. Funding Governance
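The goal of funding governance is to make money, capacity, and risk boundaries support the speed of learning. AI funding should not be allocated only through departmental budget requests; it should be allocated dynamically by investment category, evidence stage, and risk tier.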
5.1 Investment Thesis
A mature AI portfolio starts with an investment thesis and only then looks at individual use cases.
We invest in AI capabilities that reduce operational decision load,
improve regulated workflow quality,
reuse approved platform controls,
and produce measurable evidence within one quarter.
We avoid AI investments where data rights are unclear,
human accountability is ambiguous,
or value depends on broad behavior change without operational sponsorship.
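Restated in prose:
We prioritize AI capabilities that reduce operational decision load, improve regulated workflow quality, reuse platform controls, and produce evidence within one quarter. For use cases where data rights are unclear, human accountability is unclear, or value depends on large-scale behavior change without an operational sponsor, we hold off on scaling up investment.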
5.2 Capacity Allocation
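Capacity allocation for an AI portfolio must cover product, engineering, data, risk, SME, and platform teams, not just development headcount.
| Capacity bucket | Suggested share | Funding object | Typical investments |
|---|---|---|---|
| Strategic bets | 25% | High-value, high-uncertainty, strategically relevant use cases | Intelligent credit operations, AML investigation copilot, intelligent customer service routing |
| Core workflow optimization | 25% | In-workflow AI with clear business value and controllable risk | case summary, agent assist, QA sampling, knowledge retrieval |
| Platform runway | 25% | Horizontally reusable capabilities | model gateway, eval platform, RAG service, observability, policy controls |
| Risk and assurance | 15% | Risk, audit, model governance, compliance evidence | risk tiering, red team, audit evidence binder, incident drill |
| Discovery and option creation | 10% | Early exploration and fast evidence | opportunity discovery, offline eval, shadow mode, vendor proof |
The shares are not a fixed recipe but a starting point for quarterly governance. In highly regulated retail financial services, platform runway and risk assurance usually cannot be squeezed too low; otherwise near-term use cases will stall repeatedly at the release and scale stages.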
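To show how the suggested shares turn into a quarterly plan, the following sketch splits a total capacity figure across the five buckets. The shares are the ones in the table; the `allocate` helper and the 40 FTE total are assumptions made only for illustration.

```python
# Suggested shares from the capacity table above, in whole percent.
ALLOCATION_PCT = {
    "Strategic bets": 25,
    "Core workflow optimization": 25,
    "Platform runway": 25,
    "Risk and assurance": 15,
    "Discovery and option creation": 10,
}


def allocate(total_fte, shares=ALLOCATION_PCT):
    """Split a total capacity (in FTE) across buckets; shares must sum to 100."""
    if sum(shares.values()) != 100:
        raise ValueError("bucket shares must sum to 100%")
    return {bucket: total_fte * pct / 100 for bucket, pct in shares.items()}


# 40 FTE is a made-up quarterly capacity used only for illustration.
plan = allocate(40)
for bucket, fte in plan.items():
    print(f"{bucket}: {fte:.1f} FTE")
```

Keeping the sum-to-100 check in the helper means that any quarterly rebalancing has to name the bucket that gives up capacity, which is the trade-off conversation the table is meant to force.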
5.3 Funding Guardrails
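Funding guardrails define the boundaries within which a team may adjust on its own inside the budget, and the triggers that require escalation for approval.
| Guardrail | Team may decide on its own | Must escalate |
|---|---|---|
| Spend guardrail | Adjusting prompt, eval, and workflow prototype costs within the approved experiment envelope | Going to production, widening the user scope, bringing in a new vendor, exceeding the quarterly envelope |
| Risk guardrail | Iterating within low-risk internal assistive scenarios | Anything affecting customer rights, credit, AML, investment advice, complaints, or regulatory reporting |
| Data guardrail | Using approved data sources and access patterns | Adding sensitive data, cross-border processing, external-model processing of PII, changing retention |
| Model guardrail | Using allowlisted models and gateway policy | New model, new region, new contractual terms, or a model capability change that shifts the risk boundary |
| Scope guardrail | Running A/B, shadow, or limited live within the approved cohort | Crossing channels, product lines, or regions; customer-visible output |
| Benefit guardrail | Adjusting leading indicators | Changing committed benefits, the financial attribution method, or performance incentives |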
5.4 Platform Runway
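Platform runway is the reserve of capabilities that multiple future AI use cases can reuse. It should not be regarded as a "non-business project", because it directly affects portfolio throughput, unit economics, and risk posture.
| Platform capability | Why it belongs to the runway | Portfolio-level benefit |
|---|---|---|
| Model gateway | Unifies model access, policy, logging, cost, and vendor switching | Lowers vendor risk and integration cost |
| Eval harness | Unifies golden set, rubric, regression, and release threshold | Raises release confidence and shortens the gate cycle |
| RAG / knowledge service | Unifies indexing, permissions, citation, freshness, and knowledge ownership | Avoids rebuilding retrieval for every use case |
| Policy-as-code | Makes compliance, data, permission, and output rules machine-executable | Reduces manual approval load and improves consistency |
| Observability | traces, quality signals, cost, latency, drift, incident signals | Supports benefits realization and risk monitoring |
| Human review workflow | SME sampling, QA, appeal, escalation, feedback loop | Turns human oversight into an operable capability |
| Audit evidence binder | Automatically captures version, decision, approval, evaluation, and incident evidence | Lowers the cost of audit and regulatory response |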
5.5 Risk Budget
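A risk budget does not permit teams to create risk; it translates the organization's risk appetite into the language of portfolio decisions.
| Risk budget type | Question it addresses | Portfolio-level control |
|---|---|---|
| Experiment exposure budget | How many users, cases, or transactions may enter AI experiments in a quarter | cohort size, shadow mode, limited live cap |
| Error budget | Acceptable error rate, mis-routing rate, and draft defect rate | eval threshold, QA sampling, automatic rollback |
| Review capacity budget | How many samples SMEs and risk reviewers can review per week | WIP limit, sampling plan, review SLA |
| Regulatory sensitivity budget | Share of the portfolio taken by use cases with high regulatory impact | High-risk use cases must come with matching assurance capacity |
| Vendor concentration budget | Degree of dependence on a single model or vendor | gateway abstraction, exit plan, multi-vendor strategy |
| Operational disruption budget | How many process changes may happen at the same time | rollout wave, training load, support readiness |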
5.6 Benefits Realization
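AI funding must write benefits realization into the gates; otherwise the portfolio stays stuck at counting launches.
| Benefit type | Measurement | Common pitfall | Governance requirement |
|---|---|---|---|
| Productivity | AHT, case throughput, rework rate, employee time saved | Double-counting saved time, or never converting it into released capacity | Define the baseline, cohort, financial attribution, and capacity release path |
| Quality | QA defect rate, first contact resolution, SAR narrative quality, credit memo quality | Looking only at speed and ignoring quality loss | Quality metrics must be paired with efficiency metrics |
| Risk reduction | false negative reduction, control coverage, audit finding reduction | Risk benefits are hard to attribute | Use risk proxies and control evidence |
| Customer experience | CSAT, complaint rate, wait time, resolution accuracy | AI experience metrics are disconnected from business metrics | Customer-visible scenarios must monitor complaints and the risk of misleading customers |
| Employee experience | adoption, override rate, trust score, training completion | Forced usage produces superficial adoption | Combine usage, survey, and workflow observation |
| Platform leverage | reuse count, time-to-integrate, cost per use case, release cycle time | The platform is measured only on technical metrics | Tie it to use case throughput and gate cycle time |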
6. Scoring Model
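The goal of a scoring model is to support the portfolio conversation, not to manufacture falsely precise scores. Good scoring looks at value, feasibility, data readiness, risk tier, architecture leverage, adoption friction, and time-to-evidence together.
6.1 Two-Stage Scoring: eligibility gate + rank score
Run the eligibility gate first, then the rank score. Otherwise use cases that are high-risk, lack usable data, or have unclear accountability will crowd the portfolio on the strength of an inflated value story.
| Eligibility check | Go condition | No-go / hold condition |
|---|---|---|
| Business owner | An owner with decision authority and a quantifiable outcome | Only technical interest or a vague sponsor |
| Data rights | Data owner, access boundary, retention, and PII handling path are clear | Data ownership or the boundary for sending data outside is unclear |
| Human accountability | AI authority and human responsibility are explicit | Accountability rests on "the system suggested it" or on no one |
| Risk pathway | Risk tier and control path can be defined | Involves a regulated decision but has no review owner |
| Evidence path | Key hypotheses can be validated within 4-8 weeks | Value can only be judged after a year and there is no leading indicator |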
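The eligibility gate is a pass/fail filter that runs before any ranking, so it can be sketched as a function that returns the failed checks instead of a score. This is a minimal sketch: the five check names mirror the table, while the `eligibility_gate` function and the choice to treat a missing answer as a failure are assumptions.

```python
# The five checks from the eligibility table above.
ELIGIBILITY_CHECKS = (
    "business_owner",
    "data_rights",
    "human_accountability",
    "risk_pathway",
    "evidence_path",
)


def eligibility_gate(answers):
    """Return ("go", []) if every check passes, else ("hold", [failed checks]).

    `answers` maps each check name to True (go condition met) or False.
    A missing check is treated as failed, so gaps cannot pass silently.
    """
    failed = [check for check in ELIGIBILITY_CHECKS if not answers.get(check, False)]
    return ("go", []) if not failed else ("hold", failed)


# Illustrative input: a high-value idea with unclear data rights.
decision, failed = eligibility_gate({
    "business_owner": True,
    "data_rights": False,
    "human_accountability": True,
    "risk_pathway": True,
    "evidence_path": True,
})
print(decision, failed)
```

Because the gate never looks at the value score, a compelling value story cannot buy its way past an unclear data or accountability position.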
6.2 Portfolio Scorecard
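Scores should use a 1-5 scale and record confidence. A score without an explanation and an evidence source must not enter a funding decision.
| Dimension | Weight | Score 1 | Score 3 | Score 5 | Evidence |
|---|---|---|---|---|---|
| Value | 25% | Value is vague or not quantifiable | Clear local efficiency or quality gain | Affects core revenue, cost, risk, or customer experience | baseline, business case, process metric |
| Feasibility | 15% | Complex integration, immature capability | Pilot can be built from existing components | Clear technical path, production dependencies under control | architecture sketch, dependency map |
| Data readiness | 15% | Data inaccessible or quality unknown | Data accessible but needs governance fixes | Authoritative data source, labels, permissions, and lineage are clear | data readiness assessment |
| Risk tier | 15% | High risk with high control cost | Medium risk, controllable through HITL and monitoring | Low risk, or the risk/benefit balance is clearly favorable | risk tier memo, control matrix |
| Architecture leverage | 10% | One-off implementation, hard to reuse | Some connectors or evals are reusable | Forms a platform capability or reusable pattern | platform reuse map |
| Adoption friction | 10% | Weak user motivation, large process change | Sponsor in place; needs training and process adjustment | Embedded in the existing workflow with direct user benefit | user evidence, change impact |
| Time-to-evidence | 10% | More than a quarter before any signal | Pilot signal within 4-8 weeks | Key evidence within 2-4 weeks | experiment plan, leading indicator |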
6.3 Risk-Adjusted Value
At the portfolio level, a simple formula can serve as a starting point for discussion:
Risk-adjusted value =
(Value score * Confidence)
+ Architecture leverage
- Risk cost
- Adoption friction
- Platform drag
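Definitions:
| Term | Meaning |
|---|---|
| Value score | Business benefit, risk reduction, customer experience, or strategic value |
| Confidence | Credibility of the evidence, drawn from baseline, user research, pilot, eval, and financial estimates |
| Architecture leverage | Whether the use case builds up a reusable capability |
| Risk cost | Cost of controls, approvals, monitoring, human review, regulatory evidence, and incident response |
| Adoption friction | Training, process change, incentive conflicts, employee trust, and management overhead |
| Platform drag | The burden that temporary architecture, custom integration, vendor lock-in, and non-reusable capabilities place on the future portfolio |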
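The formula can be applied term by term once each input has a scale. The text does not fix those scales, so the sketch below assumes confidence is a 0-1 multiplier and the three cost terms are 0-5 values where higher is worse; `UseCaseInputs` and `risk_adjusted_value` are hypothetical names. Note that adoption friction is a cost here, whereas in the 6.2 scorecard a higher Adoption friction score means less friction, so a scorecard value would need to be inverted before it is used in this formula.

```python
from dataclasses import dataclass


@dataclass
class UseCaseInputs:
    value_score: float            # 1-5, as in the scorecard
    confidence: float             # 0.0-1.0 multiplier (assumed scale)
    architecture_leverage: float  # 1-5 benefit
    risk_cost: float              # 0-5 cost, higher is worse (assumed scale)
    adoption_friction: float      # 0-5 cost, higher is worse (assumed scale)
    platform_drag: float          # 0-5 cost, higher is worse (assumed scale)


def risk_adjusted_value(u):
    """Apply the discussion formula term by term."""
    if not 0.0 <= u.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return (
        u.value_score * u.confidence
        + u.architecture_leverage
        - u.risk_cost
        - u.adoption_friction
        - u.platform_drag
    )


# Made-up inputs for illustration only.
example = UseCaseInputs(value_score=5, confidence=0.5, architecture_leverage=5,
                        risk_cost=3, adoption_friction=2, platform_drag=1)
print(risk_adjusted_value(example))  # 5*0.5 + 5 - 3 - 2 - 1 = 1.5
```

Multiplying only the value term by confidence means weak evidence discounts the claimed upside while the costs count in full, which matches the intent of keeping low-evidence, high-score use cases from crowding out capacity.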
6.4 Scoring Anti-Patterns
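| Anti-pattern | Why it is dangerous | Fix |
|---|---|---|
| Ranking by ROI alone | Ignores risk, data, and platform dependencies | Use risk-adjusted value and an eligibility gate |
| Treating a senior sponsor as value evidence | Sponsor priority is not a business result | Require a baseline metric and a benefit owner |
| Underestimating adoption friction | The AI is accurate but nobody uses it | Every use case must have change impact and user evidence |
| Ignoring platform runway | Every team rebuilds gateway, RAG, and eval | Give architecture leverage and platform drag explicit weights |
| Not recording confidence | High-scoring, low-evidence use cases crowd out capacity | Every score must carry an evidence level |
| Using averages to mask red-line risks | High value cannot offset a privacy or compliance red line | Gate red-line risks first, score afterwards |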
7. Gate Templates
A gate is not an approval form; it spells out whether the next investment is worth making. Every gate should produce an explicit decision: proceed, condition, pivot, hold, stop, scale, or retire.
7.1 Discovery Gate
Goal: decide whether to grant pilot capacity.
| Field | Content |
|---|---|
| Decision | Fund pilot / refine discovery / park / stop |
| Required owners | Business owner, AI product lead, architecture lead, data owner, risk partner |
| Evidence | problem statement, baseline, workflow map, user evidence, AI fit analysis, data readiness, risk tier, time-to-evidence |
| Key questions | Is the problem important enough? Is AI the right path? Is the data available? Is the risk pathway clear? Can the pilot produce evidence quickly? |
| Exit criteria | pilot hypothesis, success criteria, scope boundary, sample set, review plan, and funding envelope are defined |
| Stop criteria | No owner, no baseline, AI is not the right solution, data unavailable with no remediation path, unclear risk accountability |
Discovery Gate Memo Example
| Item | Example |
|---|---|
| Use case | AML alert triage assistant |
| Business thesis | Low-risk alert handling involves repetitive verification and narrative drafting; AI can reduce analyst workload and improve case note consistency |
| Baseline | Low-risk alert AHT 28 minutes, QA defect rate 9%, backlog age 6.4 days |
| AI fit | RAG + draft assistant suits generating reviewable summaries; does not close alerts automatically |
| Pilot scope | 2 queues, offline eval on 500 historical cases, 20 analysts in shadow mode |
| Gate decision | Fund pilot for 6 weeks, capped at approved data sources and internal model gateway |
7.2 Pilot Gate
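Goal: decide whether to move to release candidate.
| Field | Content |
|---|---|
| Decision | Release candidate / extend pilot / pivot / stop |
| Required owners | Product, engineering, data, risk, ops, SME reviewer |
| Evidence | eval results, failure taxonomy, SME review, user adoption signal, cost/latency, security review, control design |
| Key questions | Did the key hypotheses hold? Are the failure modes controllable? Are users willing to use it inside the workflow? Are cost and latency acceptable? |
| Exit criteria | release scope, monitoring plan, rollback plan, training plan, RACI, benefit measurement plan |
| Stop criteria | Quality below the minimum threshold, uncontrollable failure modes, excessive SME review burden, users unwilling to use it, unit cost above benefit |
Pilot Gate Decision Table
| Evidence area | Green | Amber | Red |
|---|---|---|---|
| Quality | Key tasks meet the release threshold | Some queues meet the threshold; scope must be narrowed | High-risk errors are uncontrollable |
| Adoption | Target users use it voluntarily and report time savings | Workflow or training adjustments needed | Users bypass it or do not trust it |
| Risk | Controls effective, no red flags | Additional HITL or sampling needed | Touches unacceptable risk |
| Cost | Unit economics clear and acceptable | Cost acceptable but platform optimization needed | Cost becomes unsustainable at scale |
| Operations | Support and rollback are executable | Runbook needs work | The frontline cannot take it on |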
7.3 Release Gate
Goal: decide whether to enter production or a controlled release.
| Field | Content |
|---|---|
| Decision | Release / conditional release / no-go |
| Required owners | Product accountable owner, Tech lead, Risk approval owner, Ops owner, Support owner |
| Evidence | architecture review, security/privacy approval, model/prompt/index versioning, monitoring dashboards, incident runbook, training completion |
| Key questions | Is the production boundary explicit? Does monitoring cover quality, risk, cost, and adoption? Has rollback been rehearsed? Is audit evidence captured automatically? |
| Exit criteria | release cohort, control evidence, go-live checklist, benefit baseline, support path, rollback trigger |
| No-go criteria | No monitoring, no rollback, no owner, no audit trail, training incomplete, high-risk controls not approved |
7.4 Scale / Stop Gate
Goal: decide whether to scale, hold, restrict, rebuild, or stop.
| Field | Content |
|---|---|
| Decision | Scale / hold / restrict / rebuild / stop / retire |
| Required owners | Portfolio owner, business owner, platform owner, risk owner, finance partner |
| Evidence | realized benefits, adoption cohort, quality trend, incident trend, unit economics, platform capacity, risk trend |
| Key questions | Has the value been realized? Does risk stay stable with scale? Can the platform and operations carry the load? Does it take priority over other investments? |
| Scale criteria | Business benefit on target, risk trend stable, platform runway sufficient, operating model mature |
| Stop criteria | Value unrealized for two consecutive review cycles, no improvement in adoption, control cost above value, incidents beyond appetite, or replacement by a better platform capability |
Scale / Stop Rule Example
Scale if:
- target queue AHT reduced by >= 15%
- QA defect rate does not rise
- analyst weekly active usage >= 70%
- high-risk failure = 0
- cost per case <= approved threshold
- support tickets trend stable for 4 weeks
Hold if:
- value signal positive but adoption, training or workflow design is unstable
Stop if:
- two consecutive monthly reviews miss benefit threshold
- any critical privacy or regulated-output control fails
- SME review load exceeds agreed capacity without quality gain
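The sample rule above is close to executable as written. The sketch below encodes it under two assumptions that the rule itself does not state: stop conditions are evaluated before scale conditions, and any case that neither stops nor fully qualifies to scale is held. The `ReviewEvidence` fields and the sample numbers are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass
class ReviewEvidence:
    aht_reduction_pct: float         # AHT reduction in the target queue
    qa_defect_rate_rising: bool
    weekly_active_usage_pct: float
    high_risk_failures: int
    cost_per_case: float
    cost_threshold: float            # the approved threshold
    support_stable_weeks: int
    consecutive_benefit_misses: int  # monthly reviews missing the benefit threshold
    critical_control_failed: bool    # privacy or regulated-output control
    sme_overload_without_gain: bool


def scale_stop_decision(e):
    """Stop rules are checked first so a control failure can never be outvoted."""
    if (e.consecutive_benefit_misses >= 2
            or e.critical_control_failed
            or e.sme_overload_without_gain):
        return "stop"
    scale_ok = (
        e.aht_reduction_pct >= 15
        and not e.qa_defect_rate_rising
        and e.weekly_active_usage_pct >= 70
        and e.high_risk_failures == 0
        and e.cost_per_case <= e.cost_threshold
        and e.support_stable_weeks >= 4
    )
    # Assumption: anything that neither stops nor fully qualifies to scale is held.
    return "scale" if scale_ok else "hold"


# Made-up review data for illustration only.
review = ReviewEvidence(18.0, False, 76.0, 0, 1.8, 2.0, 4, 0, False, False)
print(scale_stop_decision(review))
print(scale_stop_decision(replace(review, weekly_active_usage_pct=60.0)))
print(scale_stop_decision(replace(review, critical_control_failed=True)))
```

Writing the rule down this way before the pilot starts makes the thresholds part of the funding agreement rather than something negotiated after the results are in.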
8. Financial Retail Case: AML Alert Triage / Credit Ops / Customer Service AI Portfolio
8.1 Portfolio Context
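A retail financial institution is building an AI use case portfolio. The goal is not to "launch more AI" but to reduce operational load, improve quality, and shorten customer wait times in regulated processes while building a reusable AI platform runway.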
Portfolio theme:
Regulated operations intelligence
Investment thesis:
Prioritize AI that assists employees in high-volume regulated workflows,
keeps human accountability explicit,
reuses approved data and model controls,
and produces benefit evidence within one quarter.
8.2 Candidate Use Cases
| Use case | Business outcome | AI role | Risk tier | Platform dependency | Initial decision |
|---|---|---|---|---|---|
| AML alert triage assistant | Reduce AHT for low-risk alerts and improve narrative consistency | RAG + case summary + checklist recommendation | High | case connector, RAG, trace, SME review, audit evidence | Discovery -> Pilot |
| Credit ops memo assistant | Shorten preparation time for supplementary credit documents and approval memos | document extraction + policy retrieval + draft memo | High | document AI, policy RAG, reason code controls, HITL | Discovery |
| Customer service AI assist | Raise first contact resolution and cut knowledge lookup time | internal agent assist, approved answer snippets | Medium | knowledge service, agent desktop integration, quality monitoring | Pilot -> Release |
| Complaint classification | Improve complaint routing accuracy and SLA tracking | classification + routing recommendation | Medium | taxonomy, workflow integration, QA sampling | Idea -> Discovery |
| Branch knowledge assistant | Reduce the time employees spend looking up policies | internal RAG Q&A | Low/Medium | enterprise knowledge index, citation, feedback | Release -> Scale |
| Collections contact strategy | Improve collections efficiency | decisioning recommendation | High | policy rules, fairness monitoring, human approval | Park until risk pathway matures |
8.3 Portfolio Kanban Snapshot
| Stage | Items | Governance focus |
|---|---|---|
| Idea | Complaint classification, collections contact strategy | Check owner, risk tier, and data rights |
| Discovery | Credit ops memo assistant, complaint classification | Establish baseline, AI fit, and data readiness |
| Pilot | AML alert triage assistant, customer service AI assist | eval, SME review, failure taxonomy, adoption signal |
| Release | Branch knowledge assistant | release cohort, monitoring, training, rollback |
| Scale | Customer service AI assist if pilot passes | unit economics, support capacity, knowledge freshness |
| Retire / Stop | Legacy FAQ bot | Migrate to the enterprise knowledge service and stop duplicate maintenance |
8.4 Scoring Snapshot
| Use case | Value | Feasibility | Data readiness | Risk tier score | Architecture leverage | Adoption friction | Time-to-evidence | Decision |
|---|---|---|---|---|---|---|---|---|
| AML alert triage assistant | 5 | 3 | 3 | 2 | 5 | 3 | 3 | Fund pilot with strict risk controls |
| Credit ops memo assistant | 4 | 3 | 2 | 2 | 4 | 3 | 3 | Continue discovery; data readiness first |
| Customer service AI assist | 4 | 4 | 4 | 3 | 4 | 4 | 5 | Move to limited release |
| Branch knowledge assistant | 3 | 5 | 4 | 4 | 5 | 4 | 5 | Scale as platform pattern |
| Collections contact strategy | 4 | 2 | 3 | 1 | 3 | 2 | 2 | Park; risk and fairness path not mature |
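Notes:
- Risk tier score does not mean "the higher the risk, the better"; it measures how feasible risk control is. A high-risk use case with a mature control path can earn a middling score; a high-risk use case with an unclear control path should be held.
- Branch knowledge assistant does not have the highest standalone value, but its architecture leverage is high, so it can serve as the platform template for RAG, citation, feedback, and knowledge ownership.
- Collections contact strategy has potential value but easily touches fairness, customer harm, regulatory, and reputational risk; until the risk pathway matures it should not consume core capacity.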
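As a worked check on the snapshot, the weights from the 6.2 scorecard can be applied to the scores in the table. This is a sketch under one assumption: the rank score is a plain weighted average, which the playbook implies but does not state. `WEIGHTS`, `SNAPSHOT`, and `weighted_score` are illustrative names.

```python
# Weights from the Portfolio Scorecard (section 6.2), in whole percent.
WEIGHTS = {
    "value": 25, "feasibility": 15, "data_readiness": 15, "risk_tier": 15,
    "architecture_leverage": 10, "adoption_friction": 10, "time_to_evidence": 10,
}

# Scores copied from the scoring snapshot table, in column order.
SNAPSHOT = {
    "AML alert triage assistant":   (5, 3, 3, 2, 5, 3, 3),
    "Credit ops memo assistant":    (4, 3, 2, 2, 4, 3, 3),
    "Customer service AI assist":   (4, 4, 4, 3, 4, 4, 5),
    "Branch knowledge assistant":   (3, 5, 4, 4, 5, 4, 5),
    "Collections contact strategy": (4, 2, 3, 1, 3, 2, 2),
}


def weighted_score(scores):
    """Weighted average on the 1-5 scale; integer percent weights avoid float drift."""
    return sum(s * w for s, w in zip(scores, WEIGHTS.values())) / 100


ranking = sorted(SNAPSHOT, key=lambda name: weighted_score(SNAPSHOT[name]), reverse=True)
for name in ranking:
    print(f"{name}: {weighted_score(SNAPSHOT[name]):.2f}")
```

Under these weights Branch knowledge assistant ranks first (4.10) and Collections contact strategy last (2.60), while AML alert triage assistant (3.55) still receives pilot funding as a strategic bet. The rank is an input to the portfolio conversation and does not replace the decision column.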
8.5 Funding Allocation Example
| Bucket | Allocation | Funded items |
|---|---|---|
| Strategic bets | 25% | AML alert triage assistant, credit ops memo assistant |
| Core workflow optimization | 25% | customer service AI assist, complaint classification |
| Platform runway | 25% | enterprise knowledge service, model gateway, eval harness, audit logging |
| Risk and assurance | 15% | high-risk use case control matrix, SME sampling design, incident drill |
| Discovery and option creation | 10% | collections contact strategy risk discovery, vendor scan |
8.6 Benefits Realization Plan
| Use case | Leading indicator | Lagging benefit | Benefit owner | Review cadence |
|---|---|---|---|---|
| AML alert triage | analyst active usage, summary acceptance, QA sample pass rate | AHT reduction, backlog age reduction, QA defect stability | Financial Crime Ops | monthly scale/stop review |
| Credit ops memo | extraction accuracy, memo edit distance, policy citation accuracy | approval cycle time, rework reduction, policy exception clarity | Credit Operations | monthly discovery/pilot review |
| Customer service assist | snippet usage, agent handle time, escalation rate | FCR improvement, wait time reduction, CSAT stability | Contact Center Ops | biweekly release review |
| Branch knowledge | citation helpfulness, deflection from support desk | lower policy query volume, faster branch response | Branch Operations | monthly platform review |
8.7 Scale / Stop Decisions
| Use case | 90-day evidence | Decision | Rationale |
|---|---|---|---|
| Customer service AI assist | AHT down 11%, FCR up 4%, no high-risk incidents, support tickets stable | Scale to two additional queues | Value and risk evidence strong; platform can support |
| AML alert triage | AHT down 8%, QA defects stable, analysts like summaries, but SME review load high | Hold scale, invest in sampling and feedback workflow | Value positive but review capacity is bottleneck |
| Credit ops memo | extraction works, policy citation weak, data ownership fragmented | Continue discovery as data product dependency | The use case has value, but data readiness must be fixed first |
| Legacy FAQ bot | Low usage, duplicate content, no audit trail | Retire and migrate content | Frees maintenance and risk capacity |
9. Artifact Templates
The templates below use concrete fields and example wording; the goal is to support portfolio decisions, not to produce empty forms.
9.1 Portfolio Scorecard
# AI Portfolio Scorecard
Use case: AML alert triage assistant
Portfolio theme: Regulated operations intelligence
Business owner: Financial Crime Operations
Product owner: AI Operations Product Lead
Risk owner: Financial Crime Risk
Stage: Pilot
Decision date: 2026-06-29
## Investment thesis fit
This use case reduces regulated operations workload,
keeps analyst accountability explicit,
and reuses approved RAG, model gateway, trace logging and SME review capabilities.
## Score
| Dimension | Score | Confidence | Evidence |
|---|---:|---:|---|
| Value | 5 | Medium | AHT baseline 28 minutes; backlog age 6.4 days |
| Feasibility | 3 | Medium | Case system connector available; narrative generation needs eval |
| Data readiness | 3 | Medium | Case notes and alerts available; disposition labels need QA |
| Risk control feasibility | 2 | Medium | High-risk workflow; AI limited to draft and recommendation |
| Architecture leverage | 5 | High | Reuses RAG, audit logging and SME feedback platform |
| Adoption | 3 | Medium | Analysts want summary support; trust depends on citation quality |
| Time-to-evidence | 3 | High | Six-week shadow pilot can produce evidence |
## Decision
Fund six-week pilot with capped scope:
- two low-risk alert queues
- no automatic alert closure
- mandatory analyst review
- weekly QA sampling
- rollback if high-risk failure occurs
9.2 Funding Memo
# AI Funding Memo
Portfolio theme: Regulated operations intelligence
Quarter: 2026 Q3
Funding request: Pilot funding envelope for AML alert triage assistant
Requested capacity:
- Product: 0.5 FTE
- Engineering: 2 FTE
- Data engineering: 1 FTE
- Risk/Compliance: 0.3 FTE
- SME reviewers: 8 hours/week
- Platform: shared RAG, model gateway, eval, trace logging
## Investment rationale
The use case targets high-volume AML alert work where analysts spend material time gathering context and drafting case notes.
The AI does not make final compliance decisions.
It supports summary, retrieval and checklist recommendation inside an analyst-owned workflow.
## Expected evidence within funding period
- Offline eval on 500 historical cases
- Shadow mode with 20 analysts
- Summary acceptance rate
- QA defect comparison
- AHT directional signal
- Failure taxonomy and control gaps
## Funding guardrails
- Approved internal model gateway only
- Approved case and policy data sources only
- No customer-facing output
- No automatic closure or SAR escalation
- Weekly risk review
- Stop on critical privacy or regulated-output failure
## Decision requested
Approve six-week pilot envelope and reserve SME review capacity.
Scale funding will require separate scale/stop gate evidence.
9.3 Scale-Stop Memo
# AI Scale / Stop Memo
Use case: Customer service AI assist
Current stage: Limited release
Review period: 2026 Q3 month 2
Decision requested: Scale to two additional servicing queues
## Evidence summary
- AHT decreased by 11% in pilot queue
- First contact resolution improved by 4%
- Approved snippet usage reached 76% weekly active agents
- No high-risk customer misinformation incidents
- Cost per assisted interaction within approved threshold
- Knowledge freshness SLA met for 96% of high-use articles
## Risk and control status
- AI remains internal agent assist
- Customer-visible responses require agent send action
- High-risk topics use approved snippets and escalation
- QA sampling shows no increase in complaint-triggering errors
## Platform readiness
- Model gateway capacity approved
- Knowledge service supports additional queue taxonomy
- Observability dashboard covers usage, latency, cost and QA signals
- Support desk has runbook and incident escalation path
## Decision
Scale to two queues over four weeks.
Do not expand to lending hardship queue until approved language and complaint controls are validated.
9.4 Quarterly Review Agenda
# Quarterly AI Portfolio Review Agenda
Meeting goal:
Decide funding, scale, hold, stop and platform runway allocation for the next quarter.
Participants:
Executive sponsor, portfolio owner, finance partner, AI platform owner,
risk/compliance owner, data owner, product leads, operations leads.
## 1. Portfolio thesis refresh
- Which strategic outcomes remain valid?
- Which risk appetite or regulatory expectations changed?
- Which platform capabilities changed the economics of use cases?
## 2. Portfolio health
- Stage distribution: idea, discovery, pilot, release, scale, retire
- WIP by team and review capacity
- Risk tier distribution
- Platform dependency heatmap
- Benefits realization status
## 3. Funding decisions
- Continue strategic bets
- Fund new discovery
- Convert repeated patterns into platform runway
- Increase or reduce risk assurance capacity
## 4. Scale / stop decisions
- Use cases ready to scale
- Use cases held for adoption, data or risk reasons
- Use cases to retire, merge or stop
## 5. Capacity allocation
- Product, engineering, data, risk, SME and platform capacity
- Review bottlenecks
- Training and change management load
## 6. Decisions and records
- Approved funding envelope
- Conditions and owners
- Stop rules
- Next review date
10. Interview Answers
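10.1 30-Second Version
An AI portfolio cannot be run like an ordinary program that looks only at ROI, schedule, and number of launches. I use a portfolio kanban to manage flow from idea to scale/stop, an investment thesis and capacity allocation to steer funding, and a risk-adjusted score covering value, data readiness, risk, architecture leverage, adoption, and time-to-evidence; then discovery, pilot, release, and scale/stop gates decide whether to keep investing, scale, pause, or stop. That prevents a flood of AI demos and puts money into capabilities that are verifiable, governable, and reusable.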
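10.2 2-Minute Version
I start by turning the AI portfolio from a project list into an investment portfolio. Step one is to define the investment thesis; in retail financial services, for example, prioritize AI capabilities that reduce operational decision load, improve regulated workflow quality, reuse platform controls, and produce evidence within one quarter.
Step two is to set up the portfolio kanban: idea, discovery, pilot, release, scale, retire/stop. Every column has entry criteria, exit criteria, a WIP limit, and an evidence requirement. The idea stage looks at the business owner and the value of the problem; discovery looks at baseline, workflow, AI fit, data readiness, and risk tier; pilot looks at eval, SME review, adoption, and unit economics; release looks at monitoring, rollback, RACI, and audit evidence; scale looks at realized benefits, risk trend, platform capacity, and operating model.
Step three is funding governance. I do not give the whole budget to a single popular use case; I allocate it across strategic bets, core workflow optimization, platform runway, risk assurance, and discovery options. That delivers business value while also building horizontal capabilities such as model gateway, eval, RAG, observability, and audit evidence.
Last comes scale/stop discipline. If an AI use case has not realized its value, its data conditions are immature, its risk control cost exceeds its benefit, or adoption stays low, it should be paused or stopped. A mature AI organization is not the one with the most launches; it is the one that learns fast, scales effective capabilities fast, and stops low-quality investments in time.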
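10.3 CTO Version
When talking with a CTO, I connect AI portfolio governance to architecture runway and platform economics. The ROI of a single use case is not enough to decide an investment, because much of the cost of AI comes from duplicated integration, duplicated eval, duplicated logging, duplicated approvals, and vendor lock-in.
I recommend explicitly reserving at least part of capacity for platform runway: model gateway, eval harness, RAG service, policy-as-code, observability, human review workflow, and audit evidence binder. When scoring, every use case is assessed for architecture leverage and platform drag: does it build up a reusable pattern, or does it create a one-off custom burden?
At the gates, the discovery gate requires an architecture sketch and a data boundary; the pilot gate requires eval, failure taxonomy, and cost/latency; the release gate requires monitoring, rollback, versioning, and an incident path; the scale gate requires proof that platform capacity can carry more queues or regions. What the CTO then sees is not a pile of AI PoCs but an investment path from business value to platform capability to operable scale.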
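10.4 Chief Product Officer Version
When talking with a Chief Product Officer, I emphasize the product investment discipline of the AI portfolio. AI is easily pulled off course by demos and executive enthusiasm, so the portfolio must be managed around outcome, evidence, and adoption.
I first define portfolio themes, for example regulated operations intelligence, customer service augmentation, and credit decision support. Each theme has an investment thesis and a benefit owner. A scorecard then compares value, feasibility, data readiness, risk control feasibility, architecture leverage, adoption friction, and time-to-evidence.
On funding, I keep discovery capacity because early evidence prevents large misallocations; I also keep platform runway and risk assurance capacity because without horizontal capabilities and risk evidence even high-value use cases cannot scale. What a CPO needs most is not "how many AI features shipped this year" but "which AI capabilities actually moved business metrics, which can be reused, and which investments should stop". That is why I make scale/stop decisions and benefits realization mandatory topics in the quarterly portfolio review.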
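10.5 Interview Follow-Up: How Do You Handle a High-Value, High-Risk Use Case?
I do not release a use case just because its value is high. A high-value, high-risk use case first passes the eligibility gate: data rights, human accountability, risk owner, control path, and evidence path must all be clear. Then I narrow the pilot scope, for example shadow mode, low-risk queues, internal assist only, no customer-visible output, and no automated decisions. Only when eval, SME review, risk controls, monitoring, and rollback all meet the bar does it move to release. Scale needs its own gate; a pilot approval is not approval to scale.
10.6 Interview Follow-Up: How Do You Explain Platform Runway Spend to the CFO?
I translate platform runway from a "technology platform budget" into portfolio throughput and risk cost. Without model gateway, eval, RAG, observability, policy controls, and audit evidence, every use case rebuilds the same things, the release gate slows down, vendor and compliance risk rises, and unit economics get worse. The return on platform investment can be measured with time-to-integrate, release cycle time, reuse count, cost per use case, incident reduction, and audit response effort.
10.7 Interview Follow-Up: How Do You Keep the Portfolio Review from Becoming a Formality?
I make the portfolio review take only real decisions: fund, scale, hold, stop, retire, increase platform runway, or adjust risk capacity. The inputs must be the scorecard, gate evidence, benefits realization, risk trend, and capacity bottlenecks, not project status reports. Every decision must have an owner, conditions, a review date, and a stop rule. If the meeting cannot change how funding and capacity are allocated, it is not portfolio governance.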
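11. How to Present This in a Work Portfolio
This playbook can be turned into a senior AI PM / AI Architect work portfolio package:
| Artifact | Capability demonstrated |
|---|---|
| AI use case portfolio map | Can place multiple AI opportunities into one investment portfolio language |
| Portfolio kanban | Can manage flow and WIP from idea to retire/stop |
| Investment thesis | Can connect strategy, risk appetite, and funding direction |
| Scoring model | Can do risk-adjusted prioritization, not just ROI |
| Funding memo | Can write business, technology, data, risk, and capacity into an investment request |
| Gate templates | Can use evidence to make discovery, pilot, release, and scale/stop decisions |
| Benefits realization dashboard | Can demonstrate value realization after launch |
| Platform runway map | Can explain why platform capability is central to portfolio economics |
| Financial retail case | Can use AML, credit ops, and customer service to explain AI governance in retail financial services |
Final interview signal:
I can not only define a single AI product but also design a portfolio operating system that lets an organization continuously select, fund, scale, and stop AI use cases.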