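Positioning: A product line engineering handbook for AI PMs, Platform PMs, Product Architects, Enterprise Architects, AI Governance Leads, and leaders of AI transformation in retail financial services.
Goal: Connect software product line engineering, core assets, variation points, platform capabilities, domain architecture, asset governance, tenant / product customization, eval reuse, architecture runway, and funding model into one operable AI platform asset system.
Core thesis: Scaling AI is not about copying more POCs; it is about putting reusable assets, variability management, risk evidence, and platform investment into a single product line governance system.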
1. Source Anchors
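These sources serve as terminology and governance anchors; they do not constitute legal, compliance, audit, procurement, financial, or regulatory advice. Formal projects must be reviewed against institutional policy, regional regulation, data boundaries, and vendor contracts.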
2. One-Sentence Positioning
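AI Product Line Engineering treats multiple AI applications as variants of one product family: domain architecture defines the commonalities and variation points; core assets capture prompts, RAG, evals, ontologies, connectors, policies, workflows, monitoring, and UX patterns; governance and funding models then continuously decide which assets to platformize, which to keep local, and which to retire.
A shorter interview version:
I do not deliver AI use cases as isolated projects. I place them in a product line engineering system: commonalities become platform assets, differences are managed through variation points, quality and risk are demonstrated with reusable evals, and investment is governed continuously through architecture runway and reuse ROI.
2.1 Core Framework: AI-PLE Loop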
Business capability family
-> domain architecture
-> commonality / variability analysis
-> core asset map
-> variation model
-> application assembly
-> reusable eval and controls
-> adoption telemetry
-> reuse ROI and funding decision
-> architecture runway refresh
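| Step | Core question | Key outputs |
|---|---|---|
| Capability family | Which business-line problems belong to the same product family? | AI product line scope, capability family map |
| Domain architecture | What are the domain's processes, data, rules, risks, and system boundaries? | Domain model, reference architecture, decision boundary |
| Commonality / variability | What must be unified, and what must vary by business line? | Commonality matrix, variation points |
| Core assets | Which assets are worth building into platform capabilities? | Core asset map, asset backlog, owner model |
| Application assembly | How does a single product or tenant assemble platform capabilities? | Configuration profile, integration plan, release gate |
| Eval and controls | How do reused assets demonstrate quality, risk, and audit evidence? | Eval pack, control pack, evidence binder |
| Telemetry and ROI | Does reuse actually lower delivery and operating cost? | Adoption telemetry, reuse ROI, retirement signal |
| Funding runway | How is platform capability funded continuously? | Platform funding memo, capacity allocation, roadmap |

2.2 Senior-Level Judgment Calls

| Question | Mature answer |
|---|---|
| When to build a platform asset | At least two high-value scenarios share the same problem structure, and reuse lowers risk, delivery time, or operating cost |
| When to stay local | The difference comes from regulation, customer commitments, product rules, or data boundaries, and forcing unification would add risk |
| When to refactor into a product line | POCs repeatedly show the same RAG, eval, connector, policy, workflow, and monitoring needs |
| When to stop platform investment | Asset adoption is low, variant cost is high, business lines bypass the platform, or maintenance cost exceeds the value saved |
| When to enter the architecture runway | The asset is a prerequisite for use cases across multiple quarters, and lacking it would cause duplicate builds or release risk |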
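The judgment calls in the table can be sketched as a small decision helper. This is only an illustration under stated assumptions: the `AssetSignal` fields, the order of the checks, and the `observe` fallback are hypothetical and are not part of the handbook's framework.

```python
from dataclasses import dataclass


@dataclass
class AssetSignal:
    """Hypothetical evidence about one candidate asset (field names are illustrative)."""
    high_value_scenarios: int             # scenarios sharing the same problem structure
    reuse_lowers_risk_time_or_cost: bool
    difference_is_mandated: bool          # regulation, customer commitment, product rule, data boundary
    adoption_is_low: bool
    maintenance_exceeds_savings: bool


def platform_decision(signal: AssetSignal) -> str:
    """Map the signals to one of four illustrative outcomes."""
    if signal.difference_is_mandated:
        return "keep_local"       # forcing unification would add risk
    if signal.adoption_is_low or signal.maintenance_exceeds_savings:
        return "stop_investment"
    if signal.high_value_scenarios >= 2 and signal.reuse_lowers_risk_time_or_cost:
        return "platformize"
    return "observe"              # assumption: not enough evidence either way


candidate = AssetSignal(2, True, False, False, False)
print(platform_decision(candidate))  # prints "platformize"
```

In practice the inputs would come from portfolio review evidence rather than hand-set booleans; the point of the sketch is that each row of the table becomes an explicit, reviewable condition.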
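3. Why AI POCs Fail on Non-Reusability
AI POCs usually fail not because the demo looks bad but because no product line engineering sits behind the demo. A POC can answer "can this model complete the sample task?", but it cannot answer "can this capability be used by multiple business lines in a safe, configurable, evaluable, and operable way?"
3.1 Typical Non-Reusability Failure Modes

| Failure mode | Surface symptom | Root cause | Product line engineering fix |
|---|---|---|---|
| Prompts hidden in application code | Every team copies prompts, and changes are untraceable | No prompt / template registry or version governance | Treat prompts, system instructions, output schemas, and policy text as core assets |
| RAG rebuilt for every project | Knowledge bases, chunking, permissions, and citation quality are inconsistent | No shared ingestion / retrieval pipeline | Build a configurable RAG pipeline and manage variation by domain and data boundary |
| Eval serves a single acceptance | Version regressions cannot be compared after launch | No reusable golden set, rubric, or release gate | Build eval assets bound to risk tier and product variant |
| Point-to-point connectors | Every POC re-integrates the case system, CRM, and document repository | Tool access has no tool catalog or owner | Platformize read / write connectors, permissions, audit, and idempotency control |
| Compliance controls bolted on late | Customer-visibility, regional, and privacy restrictions surface only after the demo | Risk tiering never entered the variation model | Treat risk tier, jurisdiction, and data boundary as first-class variation points |
| UX redone every time | Users lack trust; citation, human review, and correction paths differ | No AI trust UX pattern | Capture citation, confidence, handoff, feedback, and override patterns |
| Platform funded from leftover project budget | Assets are half built and nobody maintains them | The funding model rewards only single-point delivery | Manage platform asset investment with platform runway, lean budget, and reuse ROI |
| Business-line customization out of control | "Reusing the platform" ends up as multiple branches | Variability is not modeled or governed | Manage variants with a variation matrix, configuration profile, and deprecation policy |

3.2 From POC Debt to Product Line Assets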
POC asset debt
= hidden assumptions
+ copied prompts
+ unversioned knowledge
+ one-off eval
+ unmanaged connectors
+ local controls
+ no telemetry
+ project-only funding
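The goal of product line engineering is not to build every AI application as one system. It is to turn repeated capabilities into governed assets, turn legitimate differences into explicit variation points, and turn delivery experience into the architecture runway for the next product line.
3.3 Questions Senior BAs, PMs, and Architects Should Ask

| Dimension | Question |
|---|---|
| Business capability | Which capability family does this POC belong to, and will it recur across business lines? |
| Commonality | Which process steps, knowledge sources, tools, risk controls, and UX patterns are common? |
| Variability | Which differences come from domain, jurisdiction, risk tier, channel, segment, language, model, or data boundary? |
| Core asset | Which assets can be registered, versioned, authorized, evaluated, and monitored? |
| Assembly | How does a new product or tenant assemble through configuration rather than forking a codebase? |
| Evidence | How do reused assets retain eval, approval, production monitoring, and audit evidence? |
| Funding | Who pays for platform assets, who benefits from reuse, and how does ROI flow back into the runway? |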
4. Core Asset Taxonomy
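A core asset is not a synonym for "shared code library". In an AI product line, core assets include prompts, data, knowledge, rules, evaluations, tools, workflows, monitoring, interactions, and governance evidence. Only assets managed with an explicit owner, version, applicability boundary, change process, adoption metrics, and retirement rules count as true product line assets.
4.1 Asset Classification Table

| Core asset type | Reusable content | Typical variation points | Required governance evidence | Anti-pattern |
|---|---|---|---|---|
| Prompt / template | system instruction, task prompt, few-shot examples, output schema, tone guide | domain, language, risk tier, customer segment, channel | version diff, owner, approved use, eval result, rollback path | Prompts are copied into every application and edited ad hoc by individuals |
| Eval set | golden set, red-team set, slice set, human rubric, judge criteria, critical failure list | domain, jurisdiction, risk tier, language, model/provider | dataset card, coverage matrix, calibration report, release threshold | Only the average score is used, with no slicing by risk or scenario |
| RAG pipeline | ingestion, chunking, metadata, permission filter, retriever, reranker, citation, freshness check | domain, data boundary, jurisdiction, document type, language | source owner, lineage, index version, retrieval eval, access control | Every team builds its own vector store |
| Ontology | domain entities, relationship, event taxonomy, case status, risk reason code | business line, product, jurisdiction, customer segment | glossary owner, change log, mapping rules, semantic review | Only a technical field table exists, with no business semantics |
| Tool connector | CRM, case system, document repository, payment system, ticketing, workflow engine connector | read/write scope, risk tier, tenant, channel, data boundary | tool card, RBAC, audit log, idempotency, approval rule | Agents call business systems directly, without permissions or audit |
| Policy control | policy-as-code, guardrail, eligibility rule, disclosure rule, refusal rule, HITL rule | jurisdiction, risk tier, customer status, product | policy owner, effective date, test cases, exception log | Compliance rules are written into prompts and cannot be tested |
| Workflow component | intake, triage, summarization, recommendation, approval queue, handoff, case close support | domain, role, SLA, risk tier, system integration | process owner, RACI, SLO, failure path, training asset | AI output never enters the real workflow |
| Monitoring dashboard | quality, risk, cost, latency, adoption, drift, incident, override | product, tenant, risk tier, model/provider, channel | metric definition, alert threshold, owner, review cadence | Only token usage and call counts are tracked |
| UX pattern | citation view, confidence signal, edit-and-approve, source drilldown, feedback, explainability panel | customer-facing / employee-facing, channel, language, risk tier | design rationale, usability evidence, trust metric, accessibility review | Every application redesigns the AI trust experience |

4.2 Required Fields on an Asset Card

| Field | Description |
|---|---|
| Asset ID | Stable identifier that supports lineage and audit tracing |
| Asset type | prompt, eval, RAG, ontology, connector, policy, workflow, monitoring, UX pattern |
| Business capability | Capability family supported, for example case assist, policy assist, decision support |
| Owner | Product owner, technical owner, risk / compliance owner, business SME |
| Approved uses | Products, tenants, channels, and risk tiers where the asset may be used |
| Excluded uses | Uses that are prohibited or require separate approval |
| Variation points | Configurable dimensions and their allowed ranges |
| Dependencies | Model, data source, tool, system, policy, and process dependencies |
| Version | Semantic version, change notes, compatibility notes |
| Eval evidence | Corresponding datasets, metrics, thresholds, and most recent results |
| Operational telemetry | adoption, failure, cost, latency, override, incident |
| Deprecation rule | When to freeze, replace, migrate, or retire |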
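One way to make the asset card concrete is a typed record with a completeness check. The sketch below is a minimal illustration, not a prescribed schema: the class name, the field types, and the `missing_fields` helper are assumptions, and the sample values (including the version string) are illustrative, loosely following the `CA-RAG-001` row of the Core Asset Map template in section 9.1.

```python
from dataclasses import dataclass, field


@dataclass
class AssetCard:
    """Illustrative asset card; field names mirror the table, types are assumptions."""
    asset_id: str
    asset_type: str
    business_capability: str
    owners: dict[str, str]
    approved_uses: list[str]
    excluded_uses: list[str]
    variation_points: dict[str, list[str]]
    dependencies: list[str]
    version: str
    eval_evidence: dict[str, str] = field(default_factory=dict)
    telemetry: dict[str, float] = field(default_factory=dict)
    deprecation_rule: str = ""

    def missing_fields(self) -> list[str]:
        """Names of fields that are still empty and would block registration."""
        return [name for name, value in vars(self).items() if not value]


card = AssetCard(
    asset_id="CA-RAG-001",
    asset_type="RAG pipeline",
    business_capability="evidence gathering",
    owners={"product": "Platform PO", "knowledge": "Knowledge Owner"},
    approved_uses=["employee case assist"],
    excluded_uses=["final automated high-impact decision"],
    variation_points={"domain": ["AML", "KYC"], "data_boundary": ["restricted"]},
    dependencies=["document repository", "case system", "model gateway"],
    version="1.0.0",  # illustrative version number
    eval_evidence={"citation accuracy": "see latest eval run"},
    deprecation_rule="retire when replacement pipeline has passed replay eval",
)
print(card.missing_fields())  # prints ['telemetry']
```

A registry could refuse to promote a card whose `missing_fields()` list is non-empty, which turns the "only governed assets count" rule into a checkable gate.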
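4.3 Asset Maturity

| Level | State | Evidence required to advance |
|---|---|---|
| L0 Local artifact | Exists only within a single POC or team | A second potential reuse scenario and a clear owner |
| L1 Registered asset | Owner, uses, version, and applicability boundary are registered | Eval evidence and a change process |
| L2 Reusable asset | Reused by at least two applications through standard configuration | Adoption telemetry and a support model |
| L3 Platform capability | Delivered as a platform API, SDK, workflow, or controlled service | SLO, cost attribution, risk control, and roadmap |
| L4 Product line asset | Supports multiple product line variants and is included in funding and portfolio review | Reuse ROI, deprecation policy, and architecture runway |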
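The maturity ladder can be read as a cumulative evidence check. The following sketch assumes that reading; the evidence flag names are hypothetical labels for the "evidence required to advance" column, not a fixed vocabulary.

```python
# Evidence needed to advance into each level, paraphrased from the maturity table.
# The flag names are illustrative, not a fixed vocabulary.
ADVANCE_EVIDENCE = [
    ("L1", {"second_reuse_scenario", "clear_owner"}),
    ("L2", {"eval_evidence", "change_process"}),
    ("L3", {"adoption_telemetry", "support_model"}),
    ("L4", {"slo", "cost_attribution", "risk_control", "roadmap"}),
]


def maturity_level(evidence: set[str]) -> str:
    """Return the highest level whose evidence, and all earlier evidence, is present."""
    level = "L0"
    for name, required in ADVANCE_EVIDENCE:
        if not required <= evidence:
            break  # levels are cumulative: a gap blocks every later level
        level = name
    return level


print(maturity_level({"second_reuse_scenario", "clear_owner", "eval_evidence"}))  # prints "L1"
```

The early `break` is the design choice worth noting: an asset with SLOs but no registered owner stays at L0, which matches the idea that later evidence does not compensate for missing earlier governance.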
5. Variation Points
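Variation points are the core of product line engineering. An AI platform does not scale by having "every business use the same configuration"; it scales by clearly distinguishing which differences can be configured, which must be isolated, and which should be prohibited.
5.1 Eight Key Variation Points

| Variation point | What varies | Architecture impact | Governance impact |
|---|---|---|---|
| Domain | Domain differences such as AML, KYC, credit, payments, claims, customer service | Ontology, RAG source, workflow, tools, and eval slices differ | Requires domain owner and SME review |
| Jurisdiction | Country, state, province, city, regulatory framework, and policy effective date | Policy control, disclosure, data residency, and retention differ | Requires legal / compliance signoff and effective date |
| Risk tier | Internal assist, employee advice, customer-visible, high-impact decision support, automated action | HITL, release gate, monitoring, and rollback strength differ | Requires NIST AI RMF risk evidence, control pack, and exception approval |
| Channel | Web, mobile, branch, contact center, back office, API | UX pattern, latency, handoff, and authentication differ | Requires channel owner and customer experience review |
| Customer segment | Retail, micro and small business, merchant, high net worth, vulnerable customers, internal employees | Tone, eligibility, disclosure, and accessibility differ | Requires fairness, explainability, and complaint path review |
| Language | Chinese, English, Spanish, mixed multilingual | Prompt, eval set, RAG index, UX copy, and policy translation differ | Requires language quality, terminology consistency, and regional compliance evidence |
| Model / provider | Different LLM, embedding, reranker, judge, vendor, deployment model | Routing, fallback, cost, latency, eval replay, and data sharing differ | Requires vendor risk, model risk, security, and procurement review |
| Data boundary | PII, PCI, bank secrecy, cross-border data, tenant isolation, internal-public / restricted classification | Retrieval filter, logging, redaction, storage, and network boundary differ | Requires privacy, security, data owner, and audit evidence |

5.2 Variability Management Principles

| Principle | How to operationalize it |
|---|---|
| Variation points must be explicit | Every AI product instance has a configuration profile; implicit assumptions are not allowed |
| Shared by default, isolated when justified | Common assets are shared by default; regulation, risk, data boundaries, and customer commitments can trigger isolation |
| Policy is configuration, not prose | Regional, risk, and customer-segment rules belong in policy control and test cases |
| Eval follows variation | Every key variant must have a corresponding eval slice; running only the global average is not enough |
| Tenant customization is bounded | Tenant or product customization must go through allowed ranges, compatibility checks, and a version strategy |
| Forking requires governance | Forking a core asset requires a business rationale, an owner, and a migration or long-term maintenance plan |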
5.3 Variation Matrix Example

| Product instance | Domain | Jurisdiction | Risk tier | Channel | Segment | Language | Model/provider | Data boundary | Required eval slices |
|---|---|---|---|---|---|---|---|---|---|
| AML case-assist | Financial crime | US federal + state policy | High decision support | Back office | Internal analyst | English | Approved enterprise LLM | Restricted case data | SAR quality, under-escalation, citation accuracy, PII handling |
| KYC policy assistant | Onboarding compliance | US / Canada | Employee advisory | Back office | Analyst / reviewer | English / French | Enterprise LLM + controlled RAG | Policy documents + customer metadata | policy grounding, jurisdiction split, effective date |
| Dispute case assistant | Card operations | US | Medium employee support | Contact center + ops | Agent / supervisor | English / Spanish | Low-latency model with fallback | Card dispute data | next-action accuracy, customer language, handoff quality |
| Branch service copilot | Retail banking | State-specific disclosure | Customer-adjacent draft | Branch desktop | Frontline employee | English / Spanish | Provider selected by channel SLA | Customer profile + product terms | disclosure compliance, tone, refusal, escalation |
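Two of the principles above, "variation points must be explicit" and "eval follows variation", lend themselves to automated checks on a configuration profile. The sketch below assumes a profile is a plain dictionary; the allowed values and the slice rules are illustrative stand-ins for whatever the real variation model defines.

```python
# Allowed values and slice rules below are illustrative; a real product line
# would load them from its variation model rather than hard-code them.
ALLOWED_VALUES = {
    "risk_tier": {"internal_assist", "employee_recommendation",
                  "customer_adjacent_draft", "high_risk_decision_support"},
    "language": {"english", "french", "spanish", "chinese"},
    "data_boundary": {"public", "internal", "restricted", "pii", "pci", "cross_border"},
}

REQUIRED_SLICES = {
    ("risk_tier", "high_risk_decision_support"): {"critical_failure", "hitl"},
    ("data_boundary", "restricted"): {"redaction", "permission"},
    ("language", "french"): {"multilingual"},
}


def validate_profile(profile: dict[str, str]) -> list[str]:
    """List variation points that are missing or outside the allowed range."""
    errors = []
    for point, allowed in ALLOWED_VALUES.items():
        value = profile.get(point)
        if value is None:
            errors.append(f"{point}: not set")
        elif value not in allowed:
            errors.append(f"{point}: '{value}' is not an allowed value")
    return errors


def missing_eval_slices(profile: dict[str, str], available: set[str]) -> set[str]:
    """Eval slices the profile requires that the eval pack does not yet provide."""
    required: set[str] = set()
    for (point, value), slices in REQUIRED_SLICES.items():
        if profile.get(point) == value:
            required |= slices
    return required - available


aml_profile = {"risk_tier": "high_risk_decision_support",
               "language": "english", "data_boundary": "restricted"}
print(validate_profile(aml_profile))  # prints []
print(sorted(missing_eval_slices(aml_profile, {"critical_failure", "redaction"})))
# prints ['hitl', 'permission']
```

A release gate could block an instance whenever either function returns a non-empty result, so that an unset variation point or an uncovered eval slice is caught before pilot rather than in production.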
6. Product Line Architecture: Domain Engineering vs Application Engineering
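The key division of labor in software product line engineering is between domain engineering and application engineering. Carried over to AI, domain engineering defines the product family's commonalities, variation points, reference architecture, and core assets; application engineering uses those assets to assemble specific business-line, product, tenant, or channel instances.
6.1 Dual-Loop Architecture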
Domain engineering loop
capability family
-> domain model
-> reference architecture
-> core asset backlog
-> variation model
-> platform runway
-> asset release
Application engineering loop
product / tenant need
-> configuration profile
-> asset selection
-> local integration
-> variation-specific eval
-> release gate
-> adoption telemetry
-> feedback to domain engineering
6.2 Domain engineering
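| Domain engineering concern | AI product line output | Decision question |
|---|---|---|
| Capability family | Product family scope such as case assist, policy assist, decision support, agent workflow | Which business lines share the same problem structure? |
| Domain model | entity, event, case lifecycle, risk reason, policy taxonomy | Which business semantics must be unified? |
| Reference architecture | model gateway, RAG, tool gateway, policy engine, eval harness, observability | Which components must be platform capabilities? |
| Core asset backlog | prompt, eval, RAG, connector, workflow, UX, monitoring asset | Which assets enter the runway? |
| Variation model | domain, jurisdiction, risk tier, channel, segment, language, model, data boundary | Which differences can be configured, and which require isolation? |
| Release policy | asset version, compatibility, migration, deprecation | How do platform assets upgrade without breaking product instances? |
| Funding thesis | platform investment, reuse ROI, capacity allocation | Why is investing in the platform more cost-effective than continuing point projects? |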
6.3 Application engineering
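| Application engineering concern | AI product instance output | Decision question |
|---|---|---|
| Configuration profile | Business line, tenant, region, risk tier, model, and data boundary configuration | Which standard variants does this instance use? |
| Asset selection | Selected prompt, RAG, eval, connector, policy, UX pattern | Which core assets can be reused directly? |
| Local extension | Local policy, knowledge source, field mapping, process step, language | Which differences need extension rather than a fork? |
| Eval adaptation | Added domain slices, jurisdiction slices, red-team cases | Are this instance's risks covered? |
| Release gate | go / limited go / no-go / rollback decision | Does it meet both product-family and local gates? |
| Production telemetry | adoption, quality, override, incident, cost | Are reused assets effective in the real process? |
| Feedback loop | New failure cases, configuration gaps, asset needs | Which lessons should flow back as platform assets? |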
6.4 Reference architecture
AI Product Line Control Plane
Business capability portfolio
-> product line registry
-> core asset registry
-> variation model
-> configuration service
-> model gateway
-> RAG / knowledge service
-> tool gateway
-> policy control service
-> workflow orchestration
-> eval harness
-> observability and cost telemetry
-> evidence binder
| Layer | Shared platform capability | Variable configuration |
|---|---|---|
| Product line registry | Product family, instance, owner, risk tier, asset dependency | Business line, tenant, channel, region |
| Core asset registry | asset metadata, version, dependency, approval, lineage | asset selection, compatibility rule |
| Configuration service | central config, environment, feature flag, policy bundle | tenant profile, risk tier, language |
| Model gateway | routing, fallback, quota, logging, cost attribution | model/provider, latency tier, data boundary |
| RAG / knowledge service | ingestion, permission filter, retrieval, citation, freshness | source set, ontology, jurisdiction, index |
| Tool gateway | tool catalog, RBAC, audit, approval, idempotency | read/write scope, system connector, risk rule |
| Policy control | guardrail, eligibility, disclosure, refusal, HITL | jurisdiction, product rule, customer segment |
| Workflow orchestration | reusable workflow components, handoff, approval queue | process step, role, SLA, exception path |
| Eval harness | dataset registry, runner, judge, human review, release gate | eval slice, threshold, rubric, critical failure |
| Observability | traces, quality, risk, cost, latency, adoption, incident | dashboard view, alert threshold, review cadence |
| Evidence binder | lineage, approval, eval run, production sample, exception | product instance, audit scope, retention rule |
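To illustrate how a shared layer combines a platform capability with variable configuration, here is a minimal sketch of model gateway routing driven by risk tier, data boundary, and latency needs. The route names, the tier labels, and the precedence of the rules are all assumptions for illustration; they do not refer to real model endpoints.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    primary: str
    fallback: str
    replay_eval_required: bool


def select_route(risk_tier: str, data_boundary: str, latency_sensitive: bool) -> Route:
    """Pick a model route; the model names are placeholders, not real endpoints."""
    if risk_tier == "high" or data_boundary in {"restricted", "pii", "pci"}:
        # Stronger controls win over latency for high-risk or regulated data.
        return Route("controlled-llm", "controlled-llm-standby", True)
    if latency_sensitive:
        return Route("low-latency-llm", "standard-llm", False)
    return Route("standard-llm", "low-latency-llm", False)


print(select_route("medium", "internal", latency_sensitive=True).primary)  # prints "low-latency-llm"
```

Keeping the rule in the gateway rather than in each application means a change in model or provider strategy is made once and can be replayed against every dependent product instance.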
6.5 Architecture runway backlog
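| Runway item | Why it is runway | Exit criteria |
|---|---|---|
| Product line registry | Without a registry there is no way to know which applications reuse which assets | Every AI product instance can trace its asset dependencies and risk tier |
| Configurable RAG pipeline | Multiple business lines all need permissions, citation, freshness, and retrieval eval | At least three knowledge assistants run on the same pipeline with comparable evals |
| Eval pack library | Rebuilding evals per application slows releases and makes regressions unmanageable | Key product families each have a reusable golden set, rubric, and red-team set |
| Tool gateway | Case-assist, ops agent, and workflow automation all need controlled tool calls | Tool calls have RBAC, approval, audit, and idempotency |
| Policy control service | Regional and risk differences need testable rules that are not scattered across prompts | Key policies have an owner, effective date, test cases, and exception log |
| Adoption telemetry | Platform value must be demonstrated through real usage and business outcomes | Adoption, override, cost, and benefit are visible by asset, product line, and business line |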
7. Asset Governance
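The goal of AI core asset governance is not to approve more documents; it is to keep assets understandable, traceable, evaluable, monitorable, and retirable as they are reused. Governance focuses on versioning, ownership, adoption telemetry, deprecation, and reuse ROI.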
7.1 Governance model
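| Governance area | Mechanism | Decision question |
|---|---|---|
| Versioning | semantic version, environment promotion, compatibility rule, rollback plan | Does an asset upgrade affect product instances already in production? |
| Ownership | product owner, engineering owner, domain SME, risk / compliance owner, operations owner | Who may change it, who approves, who is on call, and who explains the risk? |
| Adoption telemetry | asset usage, active product instances, workflow usage, override, support ticket | Is the asset actually reused and producing value? |
| Deprecation | freeze, migration window, replacement asset, exception approval, retirement date | How do low-value or high-risk assets exit? |
| Reuse ROI | avoided build cost, time-to-evidence, control reuse, incident reduction, run cost | Is platform investment better than continuing point projects? |
| Evidence retention | eval runs, approval records, config version, trace sample, incident log | Can audit, model risk management, and retrospectives reconstruct the version in effect at the time? |
| Exception management | approved deviation, risk acceptance, expiry date, review owner | Is a local difference a legitimate variant or an uncontrolled fork? |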
7.2 Versioning strategy
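| Asset type | Versioning strategy | Compatibility check |
|---|---|---|
| Prompt / template | A major version changes task behavior, a minor version refines wording, a patch fixes formatting or small errors | Whether output schema, policy behavior, or eval threshold changes |
| Eval set | Adding a slice or critical case requires recording the threshold impact | Whether historical versions can be replayed and whether new thresholds apply to older products |
| RAG pipeline | Chunking, retriever, reranker, citation, and permission filter versions are recorded independently | Whether retrieval eval, citation accuracy, or permission filtering regresses |
| Ontology | Entity / relationship / code set changes require a migration map | Whether downstream field mappings, reporting, or eval labels are affected |
| Tool connector | Changes to tool schema, permissions, side effects, or idempotency require a separate gate | Whether agent workflow, approval, and audit log remain valid |
| Policy control | Effective date, jurisdiction, and risk tier are bound to test cases | Whether new rules affect historical cases, customer commitments, or disclosures |
| Workflow component | Process step, role, SLA, handoff, and exception path are versioned | Whether user training, RACI, and monitoring are updated |
| Monitoring dashboard | Metric definition and alert threshold are versioned | Whether trend comparisons remain meaningful |
| UX pattern | Component, copy, interaction, and accessibility evidence are versioned | Whether user understanding, trust, or compliance disclosure changes |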
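The prompt / template row maps naturally onto a semantic version bump. The sketch below assumes three hypothetical change labels (`task_behavior`, `wording`, `format_fix`) that correspond to the major, minor, and patch cases in the table; a real registry would likely derive the change type from a review step rather than a free-form string.

```python
# Hypothetical change labels mapped to the bump levels described in the table.
BUMP_BY_CHANGE = {
    "task_behavior": "major",  # changes what the prompt does
    "wording": "minor",        # refines expression without changing the task
    "format_fix": "patch",     # fixes formatting or small errors
}


def next_version(current: str, change: str) -> str:
    """Return the next semantic version for a prompt asset."""
    major, minor, patch = (int(part) for part in current.split("."))
    bump = BUMP_BY_CHANGE[change]  # raises KeyError for an unknown change type
    if bump == "major":
        return f"{major + 1}.0.0"
    if bump == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


print(next_version("1.4.2", "wording"))  # prints "1.5.0"
```

A major bump is the natural trigger for the compatibility check in the third column: output schema, policy behavior, and eval thresholds would be re-verified before dependent product instances upgrade.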
7.3 Ownership model
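| Role | Responsibility for core assets |
|---|---|
| Asset Product Owner | Defines asset value, scope of use, roadmap, adoption metrics, and retirement decisions |
| Asset Engineering Owner | Owns implementation, interfaces, versions, compatibility, SLO, operations, and security fixes |
| Domain SME Owner | Confirms domain semantics, samples, policy interpretation, failure cases, and eval rubric |
| Risk / Compliance Owner | Reviews risk tier, control design, exceptions, customer impact, and regulatory evidence |
| Data / Knowledge Owner | Manages knowledge sources, data quality, permissions, retention, freshness, and lineage |
| Platform Operations Owner | Manages support process, incidents, cost, capacity, and release calendar |
| Product Instance Owner | Accountable for local configuration, adoption, business outcomes, and production feedback |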
7.4 Adoption telemetry
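| Metric | Why it matters | How to interpret it |
|---|---|---|
| Active product instances | Measures whether the asset is reused across business lines | Read alongside risk tier and business value, not as a raw count |
| Reused asset ratio | Share of a new application's capabilities that come from core assets | A high ratio is not automatically good; what matters is whether risk and delivery time fall |
| Time-to-first-pilot | Whether platform assets shorten the time from discovery to pilot | Compare with similar historical projects |
| Time-to-release | Whether eval, RAG, connector, and policy reuse reduces release friction | View low-risk and high-risk use cases separately |
| Config vs fork ratio | Whether local differences stay within controlled bounds | Rising forks indicate a problem with the variation model or platform boundary |
| Override / edit rate | Whether users trust AI output | Interpret together with output quality, training, and workflow fit |
| Asset incident rate | Whether reused assets amplify problems | Slice by asset version and product instance |
| Cost per successful task | Whether platform cost buys real task completion | View together with human baseline, quality, and risk |
| Avoided duplicate build | How much duplicate engineering and governance work reuse saved | Estimate from actual projects; do not overstate |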
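Three of these metrics can be computed from simple per-instance counts. The sketch below is one possible reading, not a standard definition: the `InstanceUsage` fields are hypothetical, forked assets are deliberately excluded from the reused asset ratio, and the config vs fork ratio is expressed as a fork share so that zero forks does not cause a division by zero.

```python
from dataclasses import dataclass


@dataclass
class InstanceUsage:
    """Hypothetical per-instance counts; a real platform would derive them from telemetry."""
    capabilities_total: int   # capabilities the product instance needs
    configured_assets: int    # core assets consumed through standard configuration
    forked_assets: int        # core assets copied and modified locally
    successful_tasks: int
    run_cost: float


def summarize(usages: list[InstanceUsage]) -> dict[str, float]:
    """Aggregate adoption metrics across product instances."""
    total = sum(u.capabilities_total for u in usages)
    configured = sum(u.configured_assets for u in usages)
    forked = sum(u.forked_assets for u in usages)
    tasks = sum(u.successful_tasks for u in usages)
    cost = sum(u.run_cost for u in usages)
    return {
        # Forked copies are not counted as reuse in this sketch.
        "reused_asset_ratio": configured / total if total else 0.0,
        "fork_share": forked / (configured + forked) if configured + forked else 0.0,
        "cost_per_successful_task": cost / tasks if tasks else 0.0,
    }


summary = summarize([
    InstanceUsage(10, 7, 1, 400, 200.0),
    InstanceUsage(5, 1, 1, 100, 50.0),
])
print(summary["cost_per_successful_task"])  # prints 0.5
```

As the table warns, none of these numbers should be read alone: a low fork share with a high override rate, for example, may mean teams are using the asset but not trusting its output.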
7.5 Reuse ROI
Reuse ROI =
avoided duplicate build cost
+ avoided duplicate control / review cost
+ faster time-to-evidence value
+ reduced incident / rework cost
+ operational productivity gain
- platform build and run cost
- governance and support cost
- migration cost
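| ROI component | How to source the data |
|---|---|
| Avoided duplicate build cost | Compare against the engineering effort of each business line building its own prompt, RAG, connector, eval, and monitoring |
| Avoided control cost | Review and rework saved by reusing risk control, policy test, release gate, and evidence binder |
| Faster time-to-evidence | Earlier decisions, earlier stops, or earlier benefits from shorter pilot cycles |
| Reduced incident / rework | Losses avoided because version governance, eval replay, and monitoring catch problems earlier |
| Operational productivity | Actual improvement in case handling time, QA defects, backlog, and rework |
| Platform cost | Engineering, infrastructure, support, compliance, SME, operations, and vendor cost |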
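The Reuse ROI formula above is a plain sum of benefits minus costs, which a short function makes explicit. The parameter names are shorthand chosen for this sketch, and the figures in the usage example are made-up illustrative amounts in an arbitrary currency unit, not data from any project.

```python
def reuse_roi(
    *,
    avoided_build: float,
    avoided_control: float,
    time_to_evidence_value: float,
    reduced_incident: float,
    productivity_gain: float,
    platform_cost: float,
    governance_cost: float,
    migration_cost: float,
) -> float:
    """Net reuse value: the five benefit terms minus the three cost terms."""
    benefits = (avoided_build + avoided_control + time_to_evidence_value
                + reduced_incident + productivity_gain)
    costs = platform_cost + governance_cost + migration_cost
    return benefits - costs


# Illustrative numbers only.
net = reuse_roi(
    avoided_build=500.0, avoided_control=200.0, time_to_evidence_value=100.0,
    reduced_incident=50.0, productivity_gain=150.0,
    platform_cost=400.0, governance_cost=100.0, migration_cost=50.0,
)
print(net)  # prints 450.0
```

Keyword-only arguments force every term to be named at the call site, which makes it harder to quietly drop a cost line when presenting the result in a funding review.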
7.6 Deprecation rules
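| Deprecation signal | Governance action |
|---|---|
| Asset has no active instance for 2 quarters | Enter freeze review and decide whether to retire or merge |
| Fork count keeps rising | Re-examine the variation model, API, configuration range, and platform boundary |
| Eval degrades continuously | Stop onboarding new instances and start remediation or replacement |
| Vendor, model, or data boundary changes | Start migration assessment and replay eval |
| Maintenance cost exceeds value saved | Enter portfolio review and decide to retire, rebuild, or narrow scope |
| Regulatory or policy change | Flag affected instances and update policy control and release gate |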
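Four of the signals in the table are quantitative and can be evaluated automatically. The sketch below is a simplified illustration: the action labels are hypothetical, and "keeps rising" and "degrades continuously" are interpreted, as an assumption, as a strict trend over the last three observations.

```python
def _trend(values: list[float], rising: bool, window: int = 3) -> bool:
    """True if the last `window` observations strictly rise (or strictly fall)."""
    recent = values[-window:]
    if len(recent) < window:
        return False
    pairs = list(zip(recent, recent[1:]))
    return all((b > a) if rising else (b < a) for a, b in pairs)


def deprecation_actions(
    idle_quarters: int,
    fork_counts: list[float],
    eval_scores: list[float],
    maintenance_cost: float,
    value_saved: float,
) -> list[str]:
    """Return the governance actions triggered by the quantitative signals."""
    actions = []
    if idle_quarters >= 2:
        actions.append("freeze_review")
    if _trend(fork_counts, rising=True):
        actions.append("review_variation_model")
    if _trend(eval_scores, rising=False):
        actions.append("stop_new_instances")
    if maintenance_cost > value_saved:
        actions.append("portfolio_review")
    return actions


print(deprecation_actions(2, [1, 2, 4], [0.92, 0.90, 0.91], 10.0, 30.0))
# prints ['freeze_review', 'review_variation_model']
```

The remaining signals (vendor, model, data boundary, regulatory, or policy changes) are events rather than trends and would more naturally be raised by a change notification than computed from telemetry.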
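8.1 Background
A retail financial institution has several high-cost, case-based operations scenarios:
AML investigation: analyze transactions, customer profiles, historical cases, and red flags, and draft the investigation narrative.
KYC onboarding review: look up policy, verify documents, and explain why additional documents are requested.
Card dispute operations: consolidate dispute materials, recommend the next action, and draft customer communications.
Lending operations: consolidate loan application materials, check for missing fields, and draft the credit review memo.
Collections hardship support: summarize the customer's situation, the policy options, and the suggested next contact.
Contact center escalation: turn customer conversations, product terms, and historical cases into reviewable handling recommendations.
The single-POC approach leads every business line to build its own RAG, prompts, case connector, eval, monitoring, and UX. The product line engineering approach builds one AI case-assist platform and lets multiple business lines assemble different product instances through variation-point configuration.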
8.2 Product line scope
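| Product family | Common capabilities | Explicitly out of scope |
|---|---|---|
| AI case-assist platform | case intake, context retrieval, evidence summarization, policy citation, next-action recommendation, draft generation, human approval, quality feedback, audit trace | No automated high-impact final decisions, no bypassing of business system permissions, no replacement of legal or compliance review |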
8.3 Shared core assets
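| Shared asset | How it is reused | How it is localized |
|---|---|---|
| Case summary prompt template | All case scenarios use one summary structure: facts, timeline, evidence, open questions, recommended next step | Add risk reason, policy language, and customer tone per domain |
| RAG pipeline | Shared ingestion, permission filter, citation, freshness, retrieval eval | Configure by domain source set, jurisdiction, document type |
| Case system connector | Shared case read, attachment read, note write draft, audit trace | Configure by system, field mapping, read/write permission |
| Policy control | Shared framework for risk tier, HITL, refusal, customer disclosure, data redaction | Configure by region, product, customer segment, and policy effective date |
| Eval set library | Shared groundedness, citation accuracy, PII handling, format, handoff eval | Add domain slices for AML, KYC, dispute, lending, collections |
| Workflow component | Shared review queue, approve/edit, feedback-to-eval, incident escalation | Configure by role, SLA, case status, and QA process |
| Monitoring dashboard | Shared quality, cost, latency, adoption, override, incident, drift | Slice by business line, risk tier, product instance, and channel |
| Trust UX pattern | Shared citation expansion, evidence highlighting, edit confirmation, low-confidence alert, human takeover | Adjust language and disclosure for employee-facing or customer-adjacent use |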
8.4 Product instances
| Product instance | Core assets reused | Key variations | Release gate emphasis |
|---|---|---|---|
| AML investigation copilot | Case summary template, entity ontology, RAG pipeline, case connector, eval harness, audit trace | Financial crime domain, high risk tier, restricted case data, English back office | under-escalation, SAR narrative quality, source citation, PII handling, analyst override |
| KYC policy assistant | Policy RAG, jurisdiction policy control, citation UX, eval set | onboarding domain, US / Canada jurisdiction, analyst advisory, English / French | policy effective date, jurisdiction split, document checklist accuracy |
| Card dispute assistant | Workflow component, case connector, customer communication draft template, monitoring | card operations domain, contact center + back office, English / Spanish | next-action accuracy, tone, customer disclosure, handoff |
| Lending ops memo copilot | Structured extraction, loan memo template, policy citation, HITL review | credit domain, risk appetite, product rules, customer segment | missing data detection, policy grounding, adverse-action boundary |
| Collections hardship support | Policy option retrieval, conversation summary, recommendation template | collections domain, vulnerable customer segment, customer-adjacent draft | fair treatment, tone, escalation, no unauthorized commitment |
8.5 Domain architecture decisions
| Decision | Recommended stance | Rationale |
|---|---|---|
| Shared platform or separate applications | Shared platform with product-line instances | Case-assist workflow, evidence retrieval, human approval and monitoring are common |
| Single ontology or domain ontologies | Shared meta-model plus domain extensions | Case, customer, document, action are common; red flags and policy reasons differ |
| One eval set or many | Common eval pack plus domain / jurisdiction slices | Grounding and PII are common; under-escalation and policy accuracy are domain-specific |
| One model or model routing | Gateway routing by risk, latency, cost and data boundary | High-risk cases need stronger controls and replay eval; low-risk drafting may optimize latency |
| Unified UX or local UX | Shared trust UX pattern with channel-specific presentation | Citation, edit, approve and feedback should be consistent; branch and back office differ |
| Central funding or business funding | Hybrid: central runway for platform assets, business contribution for product instances | Platform assets create cross-line value; local extensions reflect business demand |
8.6 Reuse economics
| Investment | Without product line | With product line |
|---|---|---|
| RAG | 6 independent pipelines, inconsistent permissions and citation | 1 configurable pipeline, domain source sets and reusable retrieval eval |
| Eval | 6 separate spreadsheets and subjective signoff | Common eval harness, domain slices, comparable release gates |
| Connector | 6 point-to-point integrations | Controlled connector catalog and tool gateway |
| Policy | Rules embedded in prompts and local code | Policy control with jurisdiction, risk and customer segment variation |
| Monitoring | Dashboards only for active POC teams | Platform telemetry by asset, product instance, cost, quality and adoption |
| Funding | Each project competes for ad hoc budget | Platform runway funds reusable assets; business lines fund local configuration and adoption |
8.7 Portfolio governance cadence
| Cadence | Review focus | Output |
|---|---|---|
| Monthly product line review | New instance intake, asset adoption, incident, eval regression, support load | asset backlog adjustment, release decision, risk issue |
| Quarterly funding review | reuse ROI, capacity allocation, runway progress, platform cost, business benefit | continue / expand / constrain / retire funding decision |
| Semiannual architecture review | reference architecture, variation model, model/provider strategy, data boundary | architecture runway update, migration plan, deprecation decision |
| Annual management review | business outcomes, risk posture, audit findings, operating model maturity | executive memo, budget guardrails, strategic roadmap |
9. Artifact Templates
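These templates are for portfolios, interviews, and real project communication. The point is not to fill every table cell but to produce decision evidence that portfolio, architecture, risk, and product leadership can review together.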
9.1 Core Asset Map
# Core Asset Map: AI Case-Assist Product Line
## Product Line Scope
- Capability family: case assist / policy assist / decision support
- Business lines: AML, KYC, disputes, lending ops, collections, contact center
- Excluded uses: final automated high-impact decision, unauthorized system action, unmanaged customer commitment
## Core Assets
| Asset ID | Asset type | Asset name | Business capability | Owner | Approved uses | Variation points | Dependencies | Eval evidence | Telemetry | Deprecation rule |
|---|---|---|---|---|---|---|---|---|---|---|
| CA-RAG-001 | RAG pipeline | Case evidence retrieval | evidence gathering | Platform PO / Knowledge Owner | employee case assist | domain, jurisdiction, data boundary | document repository, case system, model gateway | retrieval recall, citation accuracy, permission test | active instances, retrieval failure, citation complaint | retire when replacement pipeline has passed replay eval |
| CA-EVAL-001 | Eval set | Case summary quality pack | summary and narrative draft | EvalOps Lead / Domain SME | case-assist release gate | domain, risk tier, language | golden set, judge rubric, human review | groundedness, completeness, critical failures | release pass rate, regression trend | version migration after policy or workflow change |
| CA-UX-001 | UX pattern | Evidence citation and edit approval | trust and human oversight | Design Lead / Compliance Owner | employee-facing and customer-adjacent draft | channel, language, risk tier | design system, citation service | usability review, disclosure review | edit rate, source drilldown, feedback rate | retire after new pattern meets adoption and compliance gate |
## Asset Dependency Map
```text
Product instance
-> configuration profile
-> prompt/template
-> RAG pipeline
-> ontology
-> tool connector
-> policy control
-> workflow component
-> eval pack
-> monitoring dashboard
-> evidence binder
```
## Governance
- Versioning rule:
- Approval path:
- Exception path:
- Review cadence:
- Evidence retention:
9.2 Variation Matrix
# Variation Matrix: AI Case-Assist Product Line
## Standard Variation Points
| Variation point | Allowed values | Configuration owner | Gate impact | Eval slice required | Isolation trigger |
|---|---|---|---|---|---|
| Domain | AML, KYC, disputes, lending ops, collections, contact center | Domain PO / SME | domain SME approval | domain-specific failures | unique ontology or policy |
| Jurisdiction | US, Canada, state-specific rules | Compliance Owner | legal / compliance signoff | jurisdiction-specific policy cases | data residency or disclosure conflict |
| Risk tier | internal assist, employee recommendation, customer-adjacent draft, high-risk decision support | Risk Owner | control pack strength | critical failure and HITL cases | automated action or high-impact use |
| Channel | back office, contact center, branch, mobile, API | Channel Owner | UX and SLA review | channel-specific workflow cases | customer-facing disclosure |
| Customer segment | retail, small business, vulnerable customer, employee | Segment Owner | fairness and treatment review | segment harm / tone cases | special legal duty or customer promise |
| Language | English, French, Spanish, Chinese | Language Owner | language quality review | multilingual eval cases | policy translation risk |
| Model/provider | enterprise LLM A, enterprise LLM B, local model, approved vendor | Platform Owner | replay eval and vendor review | model comparison and regression | data sharing or retention conflict |
| Data boundary | public, internal, restricted, PII, PCI, cross-border | Data Owner / Security | privacy and security gate | redaction and permission cases | tenant isolation or regulated data |
## Product Instance Profile
| Field | Selected configuration | Rationale | Required evidence |
|---|---|---|---|
| Product instance | | | |
| Business owner | | | |
| Domain | | | |
| Jurisdiction | | | |
| Risk tier | | | |
| Channel | | | |
| Customer segment | | | |
| Language | | | |
| Model/provider | | | |
| Data boundary | | | |
| Local extensions | | | |
| Release gate | | | |
9.3 Reuse Decision Memo
# Reuse Decision Memo
## Decision
- Reuse decision: reuse existing asset / extend asset / create new core asset / keep local / retire
- Asset or capability:
- Product instance:
- Decision owner:
- Decision date:
## Business Context
- Capability family:
- Workflow step:
- Business outcome:
- Current baseline:
- Risk tier:
## Reuse Analysis
| Question | Assessment |
|---|---|
| Which existing core assets match the need | |
| Which variation points explain the local difference | |
| Which differences can be configured | |
| Which differences require asset extension | |
| Which differences require isolation | |
| Which eval slices prove the reused asset works here | |
| Which controls or policies must change | |
## Options
| Option | Delivery impact | Risk impact | Cost impact | Long-term architecture impact |
|---|---|---|---|---|
| Reuse as-is | | | | |
| Extend core asset | | | | |
| Create local asset | | | | |
| Create new product-line asset | | | | |
## Recommendation
- Recommended option:
- Why this option:
- Conditions:
- Release gate:
- Telemetry to watch:
- Review date:
9.4 Platform Funding Memo
# Platform Funding Memo: Reusable AI Assets
## Investment Thesis
- Platform capability:
- Product line served:
- Business lines served:
- Strategic rationale:
- Architecture runway rationale:
## Problem
- Repeated work observed:
- POC debt created by current approach:
- Risk created by fragmented assets:
- Delivery delay created by missing platform capability:
## Proposed Platform Assets
| Asset | Current duplication | Proposed shared capability | Owner | Release milestone |
|---|---|---|---|---|
| Configurable RAG pipeline | | | | |
| Eval pack library | | | | |
| Tool connector catalog | | | | |
| Policy control service | | | | |
| Monitoring dashboard | | | | |
## Funding Model
- Central platform budget covers:
- Business line contribution covers:
- Capacity allocation:
- Guardrails:
- Stop / continue rule:
## Reuse ROI
| ROI component | Evidence source | Estimate |
|---|---|---|
| Avoided duplicate build cost | | |
| Avoided review and control cost | | |
| Faster time-to-evidence | | |
| Reduced incident and rework cost | | |
| Productivity improvement | | |
| Platform build and run cost | | |
| Governance and support cost | | |
## Governance
- Portfolio review cadence:
- Asset owner model:
- Adoption telemetry:
- Deprecation rule:
- Risk and compliance gate:
- Architecture board review:
## Decision Requested
- Fund / constrain / sequence / stop:
- Budget period:
- Expected product instances:
- Evidence required at next review:
10. Interview Answers
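10.1 30-Second Version
AI POCs often fail not because of the model but because they are not reusable. My approach is to manage this with AI Product Line Engineering: first define the domain architecture for a capability family, then turn prompts, evals, RAG, ontologies, connectors, policies, workflows, monitoring, and UX patterns into core assets. Differences are not solved by copying code; they are managed through variation points: domain, jurisdiction, risk tier, channel, customer segment, language, model/provider, and data boundary. Finally, adoption telemetry, reuse ROI, and a platform funding gate decide whether to keep investing, expand, merge, or retire.
10.2 2-Minute Version
I would lift enterprise AI scaling from "project delivery" to "product line engineering". The first step is not to build a big platform but to identify capability families, such as case assist, policy assist, and decision support in retail financial services. Next comes commonality / variability analysis: which assets are shared across business lines, and which differences come from domain, region, risk tier, channel, customer segment, language, model vendor, or data boundary.
For the common parts, I would build core assets, including prompt/template, eval set, RAG pipeline, ontology, tool connector, policy control, workflow component, monitoring dashboard, and UX pattern. For the differences, I would establish a variation matrix and configuration profile so that business lines assemble product instances through configuration and controlled extension rather than forking an application.
On governance, every asset must have an owner, versioning, approved use, excluded use, eval evidence, adoption telemetry, a deprecation rule, and reuse ROI. On investment, I would put assets with high reuse and high risk-control value into the architecture runway, use lean funding to support a platform team that builds continuously, and make business lines accountable for local configuration, process change, and adoption. The result: new use cases reach pilot faster, release gates are more consistent, risk evidence is reusable, platform value can be measured, and low-value assets are retired in time.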
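10.3 Platform PM Version
As a Platform PM, I manage AI platform assets as products, not as a list of internal tools. My core metrics are not API call volume but active product instances, reused asset ratio, time-to-first-pilot, time-to-release, config vs fork ratio, cost per successful task, incident rate, and avoided duplicate build.
I would platformize four classes of assets first: one, the model gateway, RAG pipeline, eval harness, observability, and policy control that every AI application needs; two, tool connectors that multiple business lines integrate repeatedly; three, eval packs and the evidence binder, which reduce friction for high-risk releases; four, the citation, edit-and-approve, and feedback-to-eval UX patterns that raise user trust.
On funding, I would use hybrid funding: the central platform budget funds the architecture runway, and business lines invest in local configuration and adoption. Every quarter we review reuse ROI, adoption, support load, and fork signals. A bigger platform is not a better one: if an asset has no real reuse, is costly to maintain, or drives business lines to bypass the platform, I would push to merge, shrink, or retire it.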
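10.4 Enterprise Architect Version
As an Enterprise Architect, I focus on the AI product line's reference architecture, variation model, control points, and architecture runway. My goal is to keep AI POCs from forming a batch of unmaintainable shadow architectures inside the enterprise.
Architecturally, I separate domain engineering from application engineering. Domain engineering owns the product family boundary, domain model, reference architecture, core asset backlog, variation model, release policy, and funding thesis. Application engineering owns the configuration profile, asset selection, local extension, variation-specific eval, release gate, and production telemetry for a specific product or tenant.
On governance, I require key AI assets to be registered in the core asset registry with version, owner, dependencies, approved uses, risk tier, eval evidence, production metrics, and retirement rules. For high-risk retail financial scenarios, I translate the risk management language of the NIST AI RMF into executable controls: risk tier determines eval strength, HITL, monitoring, rollback, evidence retention, and exception approval. For investment governance, I bring reusable platform capabilities into portfolio funding and the architecture runway so that technical debt, risk debt, and duplicate builds are visible at the investment level.
10.5 Common Follow-Up Questions

| Follow-up question | Key points |
|---|---|
| How do you decide an asset is worth platformizing? | Look at repetition, risk-control value, delivery bottlenecks, operating cost, configurability, and evidence from at least two high-value instances |
| How do you avoid premature platformization? | Register it as a local artifact first and promote it on a second scenario and reuse evidence; without adoption and variation evidence it does not enter the platform runway |
| How do you handle heavy business-line customization? | Put the differences in the variation matrix and decide whether each is configuration, extension, isolation, or prohibited; a fork must have an owner, a cost, and a migration strategy |
| How are evals reused? | Reuse the common eval pack, then add slices by domain, jurisdiction, risk tier, language, and model/provider; the release gate looks at critical failures, not only the average score |
| How do you persuade executives to invest in the platform? | Tell the ROI story with avoided duplicate build, faster time-to-evidence, reduced control cost, lower incident risk, cost per successful task, and adoption telemetry |
| How is this different from ordinary platform PM work? | AI platform assets must also manage model changes, prompt versions, RAG knowledge lineage, eval replay, risk controls, human oversight, and evidence retention |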