AI Platform Service Catalog / Golden Paths Playbook


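Purpose: Train AI Platform PMs, AI Architects, and AI Transformation Leads to productize an enterprise AI platform, turning a "capability list" into an internal product that is discoverable, self-service, governable, reusable, and measurable.

Audience: AI Platform PMs, platform architects, AI BAs, Enterprise Architects, the CTO office, and leaders of model risk and retail financial services AI transformation.

Core view: The value of an AI platform lies not in exposing more APIs, but in using a service catalog and golden paths to connect models, RAG, eval, tools, policy, observability, HITL, and evidence into a repeatable delivery path.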

1. Source Anchors

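The sources below serve as anchors for terminology and design. Before a real project rollout, re-verify, as of the access date, the latest version, license, regional availability, security terms, and internal institutional policy.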

Source	Link	How this playbook uses it
CNCF Platforms White Paper	https://tag-app-delivery.cncf.io/whitepapers/platforms/	Treats the internal platform as a portfolio of shared capabilities that supports product team delivery, emphasizing value streams, platform teams, adoption, and success measurement.
CNCF Platform Engineering Maturity Model / Glossary	https://tag-app-delivery.cncf.io/whitepapers/platform-eng-maturity-model/ / https://tag-app-delivery.cncf.io/wgs/platforms/glossary/	Calibrates concepts such as platform engineering, platform team, golden path templates, developer experience, adoption, and self-service interface.
Backstage Docs	https://backstage.io/docs/	Reference for the service catalog, software templates, developer portal, ownership metadata, and the platform entry experience.
NIST AI RMF	https://www.nist.gov/itl/ai-risk-management-framework	Applies the Govern / Map / Measure / Manage approach to risk management, bringing policy guardrails, risk tier, approval, monitoring, and evidence into the platform path.
OpenTelemetry Docs	https://opentelemetry.io/docs/	Applies traces, metrics, logs, and semantic conventions to the design of AI service telemetry, SLO, cost, quality, and adoption observability.

2. One-Sentence Positioning

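AI Platform Service Catalog / Golden Paths = a productized catalog plus validated delivery paths that let business teams build AI use cases through self-service while inheriting, by default, model access, RAG, evaluation, tools, policy, observability, human review, cost attribution, and audit evidence.

This playbook does not cover "what a platform is" or "why AI is needed." It is written for people who can already make requirement, process, risk, and architecture judgments, and it trains them on higher-order questions:

Higher-order question	Answer the platform PM must give
How do we avoid every AI project rebuilding the model gateway, RAG, eval, logging, and approval?	Use the service catalog to turn platform capabilities into internal products that are discoverable, requestable, reusable, and operable.
How do we make sure teams do not just receive an API, but complete the path from pilot to release?	Use golden paths to provide reference implementation, template, policy, telemetry, eval gate, and runbook.
How can the platform be both self-service and controlled?	Use risk tier, data boundary, policy-as-code, approval routing, telemetry, and evidence binder to embed governance in the path.
How do we prove the platform is not a cost center?	Use time-to-first-pilot, reuse rate, quality gate pass rate, cost per case, risk exceptions, developer satisfaction, and adoption as proof.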

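3. Why an AI Platform Cannot Offer Only a Model API

A model API only solves "being able to call a model." Enterprise AI products usually get stuck outside the model call:

Breakpoint	Result of offering only a model API	Platform approach
Use case discovery	Teams do not know which scenarios suit AI and repeat the same exploration	The catalog provides use case patterns, risk tiers, and golden path selection.
Data and knowledge	Each team parses documents, builds indexes, and handles permissions on its own	The RAG service manages source, metadata, permission, freshness, citation, and eval.
Quality validation	Launch relies on demo impressions and a small amount of manual trial use	The eval service provides golden sets, rubrics, regression, release gates, and evidence.
Tool calling	Agents connect to systems project by project, with inconsistent permissions, audit, and idempotency	The tool gateway provides approved tools, schema, risk level, HITL, and audit.
Compliance control	Risk rules are scattered across prompts, code, and manual processes	The policy engine uniformly enforces data boundary, allowed use, prohibited action, and exception.
Operability	When something goes wrong, only application logs or vendor bills are available	Observability records trace, span, cost, quality, latency, safety, and adoption by default.
Adoption	The POC demos well, but business teams do not integrate it into real workflows	The golden path provides a reference app, workflow integration, training, support, and dashboard.
Evidence	Materials are assembled after the fact for audits, and changes and approvals are hard to trace	The evidence binder automatically aggregates eval, approval, exception, incident, and monitoring.

The AI Platform PM's product judgment is not "should we build a unified API," but:

Which repeated capabilities should be platformized?
Which business differences should be left to the product team?
Which governance actions must happen by default?
Which self-service paths can shorten pilot time without sacrificing risk control?

A mature AI platform should look like an internal product portfolio, not an SDK package. It should serve four types of users at once:

User	What they want	What the platform should provide
Product team	Validate real business scenarios quickly	golden path, reference implementation, metrics baseline, adoption dashboard.
Developer / AI engineer	Write less repetitive glue code	templates, SDK, model/RAG/tool/eval services, local examples, CI gate.
Risk / Compliance	Controls that happen by default and are auditable	risk tier, policy guardrails, approval workflow, evidence binder.
Platform operator	Ability to operate, optimize, and attribute charges	SLO, cost ledger, usage analytics, incident workflow, roadmap feedback.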

4. Service Catalog Taxonomy

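A service catalog is not a list of platform capabilities; it is the entry point to internal products. Each catalog card should state: the problem it solves, the use cases it fits, risk boundaries, onboarding method, owner, SLO, cost model, evidence output, and support channel.

4.1 Catalog Layers

Catalog layer	Representative services	Product promise	Typical consumers
AI control plane	model gateway, policy engine, eval service, observability, evidence binder	Unified access, gating, observability, and audit	All AI use cases
AI application services	RAG service, tool gateway, HITL queue, document AI, decision service runtime	Reusable application-level capabilities	Business product teams
Developer experience	templates, SDK, reference implementations, sample datasets, CI/CD recipes	Shorten the path from idea to pilot	Developers, AI engineers
Governance experience	risk tier workflow, approval flow, control library, exception process	Reduce review friction while retaining evidence	Risk, Compliance, Model Risk
Adoption experience	onboarding guide, training pack, dashboard, support path	Get business teams to actually use the platform	PMs, BAs, Ops managers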

4.2 Core Service Cards

Service	Problem solved	Minimum capability	Governance by default	Success metrics
Model Gateway	Inconsistent multi-model access, routing, cost, logging, and fallback	approved model routes, quota, fallback, redaction, request metadata	data boundary, model allowlist, risk-based routing, audit log	share of model calls under control, fallback success, cost per successful request, policy block accuracy
RAG Service	Duplicated work on knowledge sources, permissions, retrieval, citation, and refresh	source registry, ingestion, chunking, metadata filter, hybrid retrieval, citation	permission-before-retrieval, freshness policy, source owner, retrieval eval	citation accuracy, retrieval recall, stale source block, reuse across apps
Eval Service	Each team makes its own quality judgments	dataset registry, eval runner, rubric, judge calibration, release gate report	risk-tiered thresholds, critical failure hard stop, approval evidence	quality gate pass rate, regression recurrence, eval coverage by use case
Tool Gateway	Agents call business systems without unified control	tool catalog, schema validation, RBAC/ABAC, idempotency, sandbox mode	read/write risk level, approval requirement, side-effect audit	unsafe tool attempt block, tool success rate, approval SLA, tool reuse
Policy Engine	Rules are scattered across prompts and code	allowed/prohibited use, data handling, content/action policy, exception rule	policy-as-code, versioning, owner approval, runtime decision log	policy decision latency, exception rate, policy drift findings
Observability	AI behavior cannot be explained or operated	trace, model/retrieval/tool/judge/HITL spans, metrics, logs	required trace context, retention policy, sensitive data masking	trace coverage, SLO compliance, debug time, cost attribution completeness
HITL Queue	High-risk outputs and tool actions lack a human review path	queue routing, review rubric, SLA, decision record, override reason	role-based reviewer, dual control, escalation, audit	review turnaround, override rate, critical bypass count
Evidence Binder	Release and audit evidence is scattered	bind eval report, approval, component versions, monitoring, incident records	immutable evidence index, risk owner signoff, retention	evidence completeness, audit response time, open exception aging
Templates	Teams do not know how to start in a standard way	repo templates, config templates, dashboard templates, runbooks	embedded policy, telemetry, eval hooks, owner metadata	template adoption, time-to-first-green-build, manual setup reduction
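
The RAG Service row lists permission-before-retrieval and a freshness policy as default governance. A minimal sketch of that ordering follows; the `SourceDoc` fields, role names, and document ids are hypothetical and are not part of any real platform API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SourceDoc:
    # Hypothetical metadata a source registry might keep per document.
    doc_id: str
    allowed_roles: frozenset
    effective_from: date
    superseded: bool = False


def eligible_docs(docs, user_role, today):
    """Filter documents before retrieval: permission first, then freshness."""
    return [
        d for d in docs
        if user_role in d.allowed_roles   # permission-before-retrieval
        and not d.superseded              # stale source block
        and d.effective_from <= today     # not yet effective means not eligible
    ]


docs = [
    SourceDoc("fee-policy-v2", frozenset({"agent", "branch"}), date(2024, 1, 1)),
    SourceDoc("fee-policy-v1", frozenset({"agent"}), date(2023, 1, 1), superseded=True),
    SourceDoc("aml-procedure", frozenset({"aml_investigator"}), date(2024, 3, 1)),
]
visible = eligible_docs(docs, "agent", date(2024, 6, 1))
print([d.doc_id for d in visible])  # ['fee-policy-v2']
```

Filtering before retrieval, rather than after generation, means an unauthorized or superseded document never reaches the model context in the first place.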

4.3 Required Catalog Card Fields

Field	Description
service_id	Globally unique identifier, e.g. `AI-RAG-SERVICE`.
service_owner	Platform owner, operations owner, product owner.
consumer_persona	Product team, developer, risk reviewer, data owner.
use_case_fit	Applicable patterns, e.g. knowledge assistant, document extraction, agent workflow.
risk_fit	Supported risk tiers, and scenarios that are not supported.
interfaces	Portal, API, SDK, CLI, template, workflow request.
provisioning_mode	Self-service, automated approval, manual approval, architecture review.
input_requirements	Data boundary, metadata, owner, baseline, sample cases.
output_evidence	trace, eval report, approval record, dashboard, evidence binder entry.
SLO	latency, availability, quality gate processing time, support response.
cost_model	Attributed by request, token, document, tool call, review item, or use case.
adoption_metrics	Activated teams, repeat usage, reuse rate, developer satisfaction.
support_path	Documentation, office hours, L1/L2/L3, incident path.
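
A catalog could enforce these fields with a simple completeness check at registration time. The sketch below assumes a card is a plain dictionary; the `missing_fields` helper and the sample card are illustrative only.

```python
# Required fields, copied from the table above.
REQUIRED_FIELDS = (
    "service_id", "service_owner", "consumer_persona", "use_case_fit",
    "risk_fit", "interfaces", "provisioning_mode", "input_requirements",
    "output_evidence", "SLO", "cost_model", "adoption_metrics", "support_path",
)


def missing_fields(card: dict) -> list:
    """Return required fields that are absent or empty in a catalog card."""
    return [name for name in REQUIRED_FIELDS if not card.get(name)]


# Hypothetical, deliberately incomplete card.
card = {
    "service_id": "AI-RAG-SERVICE",
    "service_owner": "AI Platform PM",
    "interfaces": ["portal", "API", "SDK"],
    "provisioning_mode": "self-service",
}
print(missing_fields(card))
```

A portal could refuse to publish a card until this list is empty, which keeps owner, SLO, and cost information from being filled in "later."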

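4.4 Catalog Operating Rules

Rule	Why it matters
Every service must have a clear owner and a deprecation policy	Platform services are products too and must not become unmaintained internal dependencies.
Every service must offer a default path and an escape hatch	Keeps the platform from becoming a mandatory process while preventing teams from bypassing governance.
The catalog card must show onboarding cost and constraints	Internal users need to know time, data, approval, running cost, and responsibility boundaries.
Service metrics must be tied to the roadmap	Services with low adoption, high support cost, and low reuse should be redesigned or consolidated.
The catalog is integrated with the evidence binder	Use cases launched through the platform path automatically form an audit evidence chain.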

5. Golden Paths

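A golden path is a productized path "from a common AI task to a controlled launch." It is not a collection of documentation links, but opinionated template + reference implementation + policy guardrails + telemetry + eval gates + runbook + adoption assets.

5.1 Golden Path Design Principles

Principle	What it means for the AI platform
Opinionated but optional	Offer a recommended path and allow exceptions with a clear rationale.
Self-service first	Low/medium-risk teams can create projects, indexes, evals, dashboards, and deployments through self-service.
Governance embedded	Risk tiering, data boundary, policy, approval, and evidence enter the process by default.
Reference over abstraction	Provide a runnable reference implementation rather than only abstract interfaces.
Observable from day one	Record trace, cost, quality, and adoption from the first pilot request.
Product team ready	Support not only development but also the PRD, eval contract, release memo, and adoption dashboard.
Iteration loop built in	Production feedback flows back into the eval set, RAG source, policy, template, and roadmap.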

5.2 Golden Path 1: Customer-Facing RAG

Applicable scenarios: customer-service policy Q&A, product terms explanation, fee dispute explanation, and assisted replies for branch staff. Customer-visible or semi-customer-visible; medium-to-high risk.

Path step	Platform default capability	Output
Intake	use case card, customer impact, risk tier, approved/prohibited use	RAG use case brief
Knowledge onboarding	source registry, owner, effective date, metadata, permission filter	Knowledge source card
Template provision	RAG app template, citation UI, fallback answer, handoff flow	Running reference app
Eval setup	golden questions, red-team prompts, citation rubric, no-answer test	RAG eval pack
Guardrails	PII redaction, financial advice boundary, stale policy block, handoff trigger	Policy config
Telemetry	retrieval span, citation doc ids, answer quality, agent acceptance	RAG dashboard
Release gate	citation accuracy, unsupported claim, escalation correctness, latency/cost	Release gate memo
Adoption	agent training scenarios, manager QA sample, feedback taxonomy	Adoption dashboard

Key gates:

Metric	Suggested gate
customer-facing unsupported claim	0
permission leakage	0
stale policy citation	0 for regulated answers
citation correctness	>= 95% before scale
no-answer correctness	>= 90%
high-risk escalation	>= 99%
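
These gates can be expressed as data and checked mechanically. The following is a minimal sketch, not a definitive eval-service implementation: the metric keys and the `evaluate_release_gate` function are hypothetical names, and the thresholds mirror the suggested gates above.

```python
# ("max", x): observed value must be <= x; ("min", x): observed value must be >= x.
GATES = {
    "unsupported_claim_count": ("max", 0),
    "permission_leakage_count": ("max", 0),
    "stale_policy_citation_count": ("max", 0),
    "citation_correctness": ("min", 0.95),
    "no_answer_correctness": ("min", 0.90),
    "high_risk_escalation": ("min", 0.99),
}


def evaluate_release_gate(results: dict) -> list:
    """Return the names of failed gates; a missing metric counts as a failure."""
    failures = []
    for metric, (kind, threshold) in GATES.items():
        value = results.get(metric)
        if value is None:
            failures.append(metric)
        elif kind == "max" and value > threshold:
            failures.append(metric)
        elif kind == "min" and value < threshold:
            failures.append(metric)
    return failures


# Hypothetical pilot results: everything passes except citation correctness.
pilot_results = {
    "unsupported_claim_count": 0,
    "permission_leakage_count": 0,
    "stale_policy_citation_count": 0,
    "citation_correctness": 0.93,
    "no_answer_correctness": 0.92,
    "high_risk_escalation": 0.995,
}
print(evaluate_release_gate(pilot_results))  # ['citation_correctness']
```

Treating a missing metric as a failure is a deliberate choice in this sketch: a release should not pass because an eval was never run.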

5.3 Golden Path 2: Employee Copilot

Applicable scenarios: document assistants for BAs/PMs/Architects, internal policy assistants, operations knowledge assistants, and training assistants. Usually internal; risk depends on the data and on whether outputs feed formal decisions.

Path step	Platform default capability	Output
Role scoping	target user, workflow step, human decision retained	Copilot scope memo
Data boundary	allowed repositories, restricted data exclusion, retention policy	Data access rule
Template provision	chat UI, prompt registry, RAG connector, feedback button	Internal copilot app
Eval setup	task-specific rubric, expert sample, bad-answer categories	Copilot eval report
Guardrails	confidential data warning, source citation, non-authoritative disclosure	Policy config
Telemetry	active users, accepted answer, edit distance, feedback reason	Adoption + quality dashboard
Enablement	scenario drills, manager guidance, support route	Launch pack

What the product PM should focus on:

Judgment	Question
Adoption quality	Have users embedded the copilot into real workflows, or only tried it a few times?
Work product quality	Does AI output reduce rework, or shift review cost onto experts?
Data risk	Could employees enter unauthorized customer data, confidential files, or regulatory-sensitive material?
Role redesign	Does AI change how BAs/PMs/Architects work, and how will the new skills be trained?

5.4 Golden Path 3: Agent Workflow

Applicable scenarios: payment dispute assistant, anti-fraud investigation assistant, customer request routing, and operational exception handling. The agent can read information, call tools, create drafts, or trigger approvals.

Path step	Platform default capability	Output
Workflow mapping	AS-IS/TO-BE, tool inventory, side-effect map, human checkpoints	Agent workflow map
Agent identity	service account, user delegation, permission scope, audit subject	Agent identity record
Tool onboarding	tool gateway, schema, risk level, idempotency, dry run	Tool card
Template provision	workflow state machine, tool call wrapper, retry/fallback, HITL hook	Agent reference workflow
Eval setup	trajectory eval, tool selection accuracy, argument correctness, unsafe action tests	Agent eval pack
Guardrails	action policy, approval threshold, rate limit, rollback rule	Tool policy config
Telemetry	tool spans, policy decision, approval ids, state transitions	Agent ops dashboard
Release gate	unsafe tool attempt, approval bypass, trajectory completion, cost/SLO	Agent release memo

Problems a golden path must not hide:

Risk	Default control
Agent performs irreversible actions	High-risk write actions must enter the HITL queue.
Tool arguments are freely generated by the model	schema validation + deterministic checks + dry-run preview.
Human approval becomes a formality	reviewer rubric, override reason, sampling audit.
Agent loops and runaway cost	state limit, step budget, cost budget, circuit breaker.
Unclear accountability	Record agent identity, human approver, business owner, and platform owner separately.
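
Two of these default controls, the step/cost budget and the HITL rule for high-risk writes, are easy to show in code. This is a sketch under stated assumptions: `RunBudget`, `BudgetExceeded`, and `requires_hitl` are hypothetical names, and the limits are illustrative values rather than recommendations.

```python
class BudgetExceeded(Exception):
    """Raised when an agent run exhausts its step or cost budget."""


class RunBudget:
    """Tracks steps and spend for one agent run and acts as a circuit breaker."""

    def __init__(self, max_steps: int, max_cost_usd: float):
        self.max_steps = max_steps
        self.max_cost_usd = max_cost_usd
        self.steps = 0
        self.cost_usd = 0.0

    def charge(self, step_cost_usd: float) -> None:
        """Record one agent step; stop the run once either limit is crossed."""
        self.steps += 1
        self.cost_usd += step_cost_usd
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step budget exceeded after {self.steps} steps")
        if self.cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost budget exceeded at {self.cost_usd:.2f} USD")


def requires_hitl(tool_risk: str, is_write: bool) -> bool:
    """High-risk write actions must go to the human review queue."""
    return is_write and tool_risk == "high"


budget = RunBudget(max_steps=3, max_cost_usd=5.0)
try:
    for _ in range(10):  # a looping agent is cut off by the step budget
        budget.charge(0.10)
except BudgetExceeded as exc:
    print("run stopped:", exc)
```

In an agent template, `charge` would be called inside the tool call wrapper so that no product team has to remember to add it.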

5.5 Golden Path 4: Document AI

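Applicable scenarios: KYC document pre-review, loan document summarization, invoice/contract field extraction, customer email classification, and regulatory letter analysis.

Path step	Platform default capability	Output
Document intake	document type, classification, retention, PII policy	Document processing card
Extraction template	OCR/parser, field schema, confidence score, validation rule	Extraction pipeline
Human review	low-confidence routing, dual review for high-risk fields	Review queue
Eval setup	labeled sample set, field-level precision/recall, layout regression	Document eval report
Guardrails	restricted doc handling, field masking, export control, evidence retention	Data policy config
Telemetry	extraction span, field confidence, review override, cycle time	Document AI dashboard
Release gate	critical field accuracy, review SLA, privacy leakage, cost per document	Release memo

Key product trade-offs:

Trade-off	Recommended judgment
Full automation vs human-in-the-loop	Key fields involving customer identity, credit, funds, or compliance obligations should not take effect fully automatically.
Generic parser vs domain template	High-frequency, high-value documents should be captured as domain templates; long-tail documents stay on the manual path.
Accuracy vs processing time	High-risk fields prioritize accuracy and auditability; low-risk fields can be routed by confidence.
Field extraction vs decision recommendation	Extraction can be platformized; decision rules must connect to business policy and approval accountability.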
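
The first and third trade-offs translate into a small routing rule. In the sketch below the field names, the 0.9 threshold, and the queue labels are assumptions chosen for illustration; a real pipeline would set them per document type.

```python
# Hypothetical set of fields that must never take effect without human review.
HIGH_RISK_FIELDS = {"customer_name", "id_number", "credit_limit"}


def route_field(name: str, confidence: float, threshold: float = 0.9) -> str:
    """Decide how an extracted field is handled.

    High-risk fields always get dual review, regardless of confidence.
    Low-risk fields are auto-accepted only at or above the threshold.
    """
    if name in HIGH_RISK_FIELDS:
        return "dual_review"
    if confidence >= threshold:
        return "auto_accept"
    return "single_review"


extracted = {"id_number": 0.99, "invoice_date": 0.97, "email_category": 0.62}
for field_name, score in extracted.items():
    print(field_name, "->", route_field(field_name, score))
```

Note that `id_number` goes to dual review even at 0.99 confidence: for these fields the model's confidence shortens review time but does not remove the reviewer.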

5.6 Golden Path 5: Decision Service

Applicable scenarios: next-best-action recommendation, risk tiering, complaint escalation, anti-fraud queue prioritization, and marketing compliance screening. Here AI mostly acts as decision support and should not become the final automated decision by default.

Path step	Platform default capability	Output
Decision boundary	recommendation vs decision, human authority, appeal path	Decision scope memo
Input contract	features, data lineage, sensitive attributes, freshness	Decision input card
Policy integration	business rules, model route, risk appetite, override policy	Decision policy config
Template provision	scoring API, explanation payload, override capture, monitoring hooks	Decision service template
Eval setup	backtesting, slice analysis, fairness checks, counterfactual cases	Decision eval report
Guardrails	protected class controls, manual review, adverse action boundary	Control pack
Telemetry	score distribution, override, outcome drift, appeal/complaint signals	Decision monitoring dashboard
Release gate	high-risk slice performance, fairness, stability, SLO, evidence	Decision release memo

PMs should describe a decision service carefully:

This is not "letting AI decide"; it is platformizing recommendations, evidence, risk boundaries, human confirmation, and monitoring metrics.

6. Self-Service With Guardrails

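A truly mature platform is neither "everything needs approval" nor "everyone self-serves freely." It uses risk tier, data boundary, approval, templates, and telemetry to route paths of different risk automatically.

6.1 Risk Tiering

Risk tier	Typical scenarios	Degree of self-service	Required controls
Low	Internal non-sensitive FAQ, draft generation, training drills	Highly self-service	approved model route, basic logging, usage policy acknowledgement
Medium	Employee copilot, customer-service drafts, internal knowledge retrieval	Self-service + automated checks	RAG permission, citation, eval smoke set, cost quota, feedback capture
High	Customer-visible policy explanation, payment disputes, KYC/AML support	Controlled self-service + risk approval	risk owner signoff, golden set, HITL, monitoring, evidence binder
Critical	May affect customer funds, credit, legal obligations, or regulatory reporting	Hard gate	architecture review, model risk validation, dual approval, limited release, incident runbook

6.2 Data Boundary

Boundary	What the platform should check by default
Public / internal	May use the standard route; still requires owner and retention.
Confidential	Requires enterprise route, trace masking, access control, export restriction.
Restricted / regulated	Private endpoint or approved vendor; must not enter unauthorized models, training, external logs, or low-privilege environments.
Customer PII	redaction/tokenization, field minimization, purpose limitation, reviewer role control.
Financial advice / credit / AML sensitive	No automated final conclusions; a human accountable owner and an evidence chain are required.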

6.3 Approval Routing

Trigger	Approval path	Evidence required
New use case at medium risk or above	AI PM + business owner	use case card, workflow step, risk tier, success metric
Use of restricted data	Data owner + security	data classification, access rule, retention, masking design
Customer-visible output	Risk / compliance + business owner	eval report, disclosure, handoff, complaint path
High-risk tool write action	Tool owner + risk owner	tool card, side-effect map, HITL rule, rollback runbook
Release to production	Release gate forum	component versions, quality gate, SLO, cost forecast, evidence binder
Exception approval	Accountable executive + risk owner	exception reason, scope, expiry, compensating control, review date
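
The routing table can be encoded as an ordered list of (trigger, approval path) rules, so a use case collects every approval that applies to it. This is a minimal sketch; the attribute names on the use case dictionary (`is_new`, `uses_restricted_data`, and so on) are hypothetical.

```python
# Each rule pairs a trigger predicate with the approval path from the table above.
RULES = [
    (lambda u: u.get("is_new") and u.get("risk_tier") in {"medium", "high", "critical"},
     "AI PM + business owner"),
    (lambda u: u.get("uses_restricted_data"), "Data owner + security"),
    (lambda u: u.get("customer_visible"), "Risk / compliance + business owner"),
    (lambda u: u.get("high_risk_tool_write"), "Tool owner + risk owner"),
    (lambda u: u.get("target_env") == "production", "Release gate forum"),
    (lambda u: u.get("exception_requested"), "Accountable executive + risk owner"),
]


def approval_paths(use_case: dict) -> list:
    """Return every approval path whose trigger matches the use case."""
    return [path for trigger, path in RULES if trigger(use_case)]


# Hypothetical customer-facing RAG use case heading to production.
cs_rag = {
    "is_new": True,
    "risk_tier": "high",
    "customer_visible": True,
    "target_env": "production",
}
for path in approval_paths(cs_rag):
    print(path)
```

A low-risk internal draft tool matches no rule and returns an empty list, which is what lets it stay fully self-service.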

6.4 Controls Embedded in Templates

Template element	Embedded by default
Repo template	owner metadata, risk tier config, trace context, policy client, eval runner.
RAG template	source registry schema, metadata filter, citation renderer, stale source handling.
Agent template	state machine, tool gateway wrapper, step budget, approval hook, audit span.
Dashboard template	latency, cost, quality, safety, adoption, risk exception tiles.
Release template	eval report link, SLO, evidence binder, rollback, incident contact.
Training template	allowed/prohibited use, example failures, feedback path, human oversight scenarios.

6.5 Telemetry as Governance

Platform governance should not happen only in review meetings. Many controls can be shown to have been executed through telemetry:

Governance question	Telemetry evidence
Do all production requests go through the approved gateway?	trace coverage, gateway route id, direct model call detection.
Was any unauthorized knowledge retrieved?	retrieval permission filter result, blocked doc ids, audit sample.
Was human approval bypassed?	tool span + approval id + reviewer decision.
Is cost out of control?	cost ledger by use case, route, model, risk tier, budget variance.
Is the quality gate effective?	release gate pass/fail, critical failure trend, regression recurrence.
Are users really adopting?	target user activation, repeat use, accepted suggestion, workflow completion.
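
Two of these questions can be answered directly from exported span records. The sketch below assumes spans arrive as plain dictionaries with hypothetical keys (`gateway_route_id`, `approval_id`, `risk`, `is_write`); it does not use or imitate the OpenTelemetry SDK.

```python
def trace_coverage(requests: list) -> float:
    """Share of production requests that carry an approved gateway route id."""
    if not requests:
        return 0.0
    routed = sum(1 for r in requests if r.get("gateway_route_id"))
    return routed / len(requests)


def approval_bypasses(tool_spans: list) -> list:
    """Span ids of high-risk write tool calls that have no approval id."""
    return [
        s["span_id"] for s in tool_spans
        if s["risk"] == "high" and s["is_write"] and not s.get("approval_id")
    ]


# Hypothetical sample data.
requests = [
    {"request_id": "r1", "gateway_route_id": "route-a"},
    {"request_id": "r2", "gateway_route_id": "route-a"},
    {"request_id": "r3", "gateway_route_id": None},  # direct model call
    {"request_id": "r4", "gateway_route_id": "route-b"},
]
tool_spans = [
    {"span_id": "s1", "risk": "high", "is_write": True, "approval_id": "ap-17"},
    {"span_id": "s2", "risk": "high", "is_write": True, "approval_id": None},
    {"span_id": "s3", "risk": "low", "is_write": True, "approval_id": None},
]
print(trace_coverage(requests))       # 0.75
print(approval_bypasses(tool_spans))  # ['s2']
```

Running checks like these on a schedule turns a review-meeting question into a dashboard tile with a named owner.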

7. Platform Product Metrics

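AI platform metrics cannot look only at API call volume. Call volume may indicate reuse, but it may also indicate runaway cost, low-quality trial and error, or platform friction that the business cannot avoid.

7.1 North Star

North Star: Business teams use approved golden paths to move high-value AI use cases to pilot, release, and scale faster, under controlled risk.

7.2 Metric Tree

Dimension	Metric	Definition	Purpose
Delivery speed	time-to-first-pilot	Time from use case intake to the first controlled pilot	Measures whether the platform reduces repeated startup cost.
Reuse	reuse rate	Share of new AI use cases that reuse a catalog service / golden path	Measures whether platform capabilities have become the default path.
Quality	quality gate pass rate	First-pass and post-fix pass rates at the release gate	Measures whether templates and enablement are mature enough.
Unit economics	cost per case	Full cost per successful business case: model, RAG, tool, judge, HITL, observability	Supports scale decisions by the CFO / business owner.
Risk	risk exceptions	Number, type, validity period, and overdue-unclosed status of exceptions	Measures whether governance is being bypassed or is designed too narrowly.
Developer experience	developer satisfaction	Developer ratings of documentation, templates, SLO, support, and ease of use	Surfaces platform friction.
Adoption	product team activation	Share of target teams that complete a real workflow using a golden path	Prevents looking only at technical integration.
Operability	trace coverage / SLO compliance	Whether requests are observable and services are stable	Measures platform operability.
Roadmap fit	feature demand conversion	Share of highly repeated demands converted into platform capabilities	Drives roadmap investment.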
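
Three of these metrics follow directly from their definitions. The sketch below uses made-up use case records to show the arithmetic; the record layout and the ids other than `CS-RAG-001` are assumptions.

```python
from datetime import date


def time_to_first_pilot_days(use_cases: list) -> list:
    """Days from intake to first controlled pilot, for use cases that reached pilot."""
    return [
        (u["first_pilot"] - u["intake"]).days
        for u in use_cases if u.get("first_pilot")
    ]


def reuse_rate(use_cases: list) -> float:
    """Share of new use cases that reuse a catalog service or golden path."""
    if not use_cases:
        return 0.0
    return sum(1 for u in use_cases if u["uses_golden_path"]) / len(use_cases)


def cost_per_case(component_costs: dict, successful_cases: int):
    """Full cost (model, RAG, tool, judge, HITL, observability) per successful case."""
    if successful_cases == 0:
        return None  # undefined until at least one case succeeds
    return sum(component_costs.values()) / successful_cases


use_cases = [
    {"id": "CS-RAG-001", "intake": date(2025, 1, 6), "first_pilot": date(2025, 2, 3),
     "uses_golden_path": True},
    {"id": "KYC-DOC-002", "intake": date(2025, 1, 13), "first_pilot": date(2025, 3, 24),
     "uses_golden_path": False},
    {"id": "AML-DEC-003", "intake": date(2025, 2, 10), "first_pilot": None,
     "uses_golden_path": True},
]
print(time_to_first_pilot_days(use_cases))  # [28, 70]
print(round(reuse_rate(use_cases), 2))      # 0.67
print(cost_per_case({"model": 600.0, "rag": 300.0, "hitl": 300.0}, 400))  # 3.0
```

Dividing by successful cases rather than by requests is what keeps cost per case tied to business outcomes instead of traffic.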

7.3 Metrics Guardrails

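Metric	Common misreading	Better reading
API calls	More calls means more success	Read together with successful tasks, quality, cost, and adoption.
Number of services	A bigger catalog means more maturity	Low-adoption services add cognitive load and should be redesigned or merged.
Time-to-first-pilot	Faster is always better	Speed must not sacrifice risk tier, eval, telemetry, and evidence.
Quality gate pass rate	Higher is always better	A very high rate may mean the gate is too weak or the samples too narrow.
Risk exceptions	Lower is always better	Few exceptions may mean the business is bypassing the platform; check direct calls and shadow AI signals.
Developer satisfaction	A high score is enough	Slice by persona and use case stage.

7.4 Platform Roadmap Prioritization

The platform roadmap should come from adoption friction and repeated demand, not from the technical capabilities the platform team wants to build.

Signal	Roadmap implication
Multiple teams repeatedly hand-write RAG ingestion and citation	Prioritize strengthening the RAG service and the Customer-Facing RAG golden path.
The release gate often fails because eval data is insufficient	Invest in eval dataset tooling, a sample library, and an SME workflow.
High-risk use cases wait too long for approval	Build risk-tiered approval automation and automatic evidence binder aggregation.
Developers bypass templates	Redo developer onboarding, the CLI, the reference implementation, and the documentation structure.
Cost is hard to explain	Prioritize trace tagging, the cost ledger, and route-level unit economics.
The business does not use it after launch	Add the adoption dashboard, training pack, manager cadence, and feedback loop.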

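8. Financial Retail Case: Bank AI Platform Catalog

8.1 Background

A retail bank has completed several AI POCs: a customer-service knowledge assistant, KYC document pre-review, payment dispute summarization, an AML investigation copilot, and an internal BA/PM document assistant. The problem is not that "the model is not good enough"; it is that every team repeats integration, retrieval, evaluation, logging, approval, and training, and the risk team cannot view evidence in one place.

Platform goals:

Within 2 quarters, have 70% of new AI pilots start through an approved golden path;
Have 100% of high-risk use cases enter the eval gate, HITL, observability, and evidence binder;
Reduce average time-to-first-pilot from 10 weeks to within 4 weeks;
Attribute cost, quality, risk, and adoption metrics uniformly to the use case.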

8.2 Catalog Portfolio

Catalog item	Bank use cases	Owner	Default controls
Model Gateway	All AI applications	AI Platform	model allowlist, quota, redaction, route policy, cost attribution
Customer RAG Service	Customer service, branch, and wealth-management policy Q&A	Knowledge Platform	source owner, effective date, permission filter, citation eval
Document AI Service	KYC, loan documents, complaint attachments	Operations AI	field schema, confidence routing, review queue, retention
Eval Service	All pilots/releases	EvalOps	golden/regression/red-team set, risk-tiered gate, judge calibration
Tool Gateway	Payment disputes, AML, CRM lookup	Integration Platform	tool risk level, RBAC, HITL, idempotency, audit
Policy Engine	Customer-visible, decision support, agents	Risk Technology	prohibited use, data boundary, action rule, exception
HITL Queue	KYC/AML/payments/customer commitments	Ops Excellence	reviewer role, SLA, override reason, dual control
Observability	All production AI services	SRE / Platform	OpenTelemetry-style traces, cost ledger, quality dashboard, incident
Evidence Binder	High-risk releases and audits	Model Risk	component versions, eval, approval, monitoring, exception
Golden Path Templates	RAG, copilot, agent, document AI, decision support	Developer Experience	reference repos, dashboards, runbooks, training pack

8.3 Applying the Golden Paths

Use case	Recommended path	Why
Customer-service policy Q&A	Customer-Facing RAG	Customer-visible; needs citation, stale policy block, handoff, and complaint monitoring.
Internal BA/PM assistant	Employee Copilot	Internal knowledge and document drafts; the focus is data boundary, adoption, and editing quality.
Payment dispute summary and next-step recommendation	Agent Workflow	Needs to read transactions, check dispute status, and generate drafts; high-risk actions go to HITL.
KYC document pre-review	Document AI	Field extraction, confidence, human review, evidence retention, and privacy control are the core.
AML alert priority recommendation	Decision Service	Must not make the final regulatory judgment automatically; needs decision boundary, backtesting, and audit.

8.4 Bank Platform Adoption Loop

Use case intake
  -> choose golden path
  -> self-service project and catalog registration
  -> risk tier and data boundary check
  -> provision templates, service bindings, telemetry
  -> build eval set and run release gate
  -> limited pilot with HITL and dashboard
  -> adoption review and cost review
  -> scale / adjust / stop
  -> production feedback back into catalog and golden paths

8.5 Typical Roadmap

Quarter	Platform roadmap	Evidence of success
Q1	model gateway, RAG service, eval runner, basic catalog, RAG golden path	2 customer-service / internal-knowledge pilots use the same path.
Q2	tool gateway, HITL queue, evidence binder, agent golden path	A payment dispute or AML copilot limited pilot launches in an auditable way.
Q3	document AI service, decision service template, cost ledger, adoption dashboards	KYC/loan document and decision support use cases reuse platform capabilities.
Q4	roadmap optimization, self-service approval, catalog lifecycle, platform chargeback/showback	time-to-first-pilot, reuse rate, cost per case, and satisfaction enter the quarterly business review.

9. Artifact Templates

9.1 Service Catalog Card

# AI Service Catalog Card

## Identity
- service_id: AI-RAG-SERVICE
- service_name: Regulated Knowledge RAG Service
- service_owner: AI Platform PM
- operating_owner: AI Platform SRE
- business_sponsor: Retail Banking Operations
- support_channel: AI Platform portal / office hour / incident queue

## Product Promise
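- problem_solved: Let business teams build knowledge assistants with unified permission, citation, evaluation, and observability capabilities.
- target_consumers: Customer service product teams, branch enablement teams, compliance knowledge teams.
- use_case_fit: Customer-service policy Q&A, branch employee assistant, internal SOP lookup.
- unsuitable_use: Automatically making credit, wealth-management suitability, SAR, or customer-funds disposition decisions.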

## Interfaces
- portal: Backstage-style service page with onboarding workflow.
- API: `/ai/rag/query`, `/ai/rag/ingest`, `/ai/rag/eval`.
- SDK / CLI: `ai-platform rag init --path customer-facing-rag`.
- templates: customer-facing RAG app, source card, eval pack, dashboard spec.
- reference_implementation: Retail banking policy assistant with citation UI.

## Risk and Data Boundary
- supported_risk_tiers: Low, Medium, High with additional release gate for customer-facing use.
- data_classes_allowed: Public, internal, confidential with approved access filters.
- prohibited_data: Unapproved restricted customer PII, legal privileged notes, unowned policy drafts.
- required_approvals: Data owner for source onboarding; risk owner for customer-facing release.
- policy_engine_rules: permission-before-retrieval, stale policy block, financial-advice boundary.

## Provisioning
- provisioning_mode: Self-service for Low/Medium; approval-routed for High.
- required_inputs: use_case_id, source owner, metadata schema, risk tier, eval profile.
- default_configuration: hybrid retrieval, metadata filter, citation renderer, trace context.
- environment_support: dev, test, pilot, production.
- estimated_time_to_first_use: one business day for approved internal sources.

## Evidence Outputs
- telemetry: retrieval span, model span, citation doc ids, cost and latency.
- eval_report: retrieval recall, citation correctness, no-answer behavior, red-team result.
- approval_record: source owner signoff, risk owner release signoff.
- audit_log: source version, index version, permission decision, answer trace.
- evidence_binder_entry: release gate report and production monitoring link.

## SLO and Cost
| Metric | Target | Measurement | Owner |
|---|---:|---|---|
| availability | 99.5% during business hours | gateway and retrieval service uptime | AI Platform SRE |
| p95 latency | <= 8s for standard RAG answer | OpenTelemetry-style request trace | AI Platform SRE |
| support response | <= 1 business day for onboarding issue | support queue | Developer Experience Lead |
| cost unit | cost per accepted answer | cost ledger by use_case_id | Platform PM + Finance |

## Adoption and Feedback
- activation_metric: first successful pilot request with trace and eval profile.
- reuse_metric: number of production use cases sharing this RAG service.
- satisfaction_metric: quarterly developer and product team survey.
- roadmap_feedback_path: monthly platform roadmap review using support and adoption data.

9.2 Golden Path Checklist

# Golden Path Checklist

## Path Identity
- path_name: Customer-Facing RAG Golden Path
- recommended_for: regulated knowledge answers with citations and human handoff.
- risk_tiers_supported: Medium and High with customer-facing release gate.
- reference_implementation: Retail banking policy assistant.
- platform_services_used: model gateway, RAG service, eval service, policy engine, observability, evidence binder.

## Intake
- use_case_card_complete: yes, linked to `CS-RAG-001`.
- business_owner_named: Head of Contact Center Operations.
- target_workflow_defined: agent drafts answer before sending to customer.
- approved_use_defined: answer policy questions with citation and human confirmation.
- prohibited_use_defined: no automated fee waiver, credit decision, legal advice or final complaint disposition.

## Data and Knowledge
- data_boundary_confirmed: confidential internal policy documents, no unapproved customer PII.
- source_owner_named: Retail Policy Knowledge Owner.
- metadata_schema_complete: product, region, effective date, document owner, audience.
- permission_filter_defined: user role, product line, jurisdiction and document status.
- freshness_rule_defined: expired or superseded policy pages are blocked before retrieval.

## Build
- template_provisioned: RAG app template with citation UI and handoff.
- service_bindings_created: model gateway, RAG index, eval runner and dashboard.
- local_run_success: sample request returns cited answer and trace id.
- CI_eval_hook_enabled: smoke eval and red-team prompt set run on pull request.
- telemetry_context_enabled: use_case_id, risk_tier, prompt_version and index_version recorded.

## Governance
- risk_tier_approved: High, approved by risk owner for limited pilot.
- policy_rules_attached: stale policy block, PII redaction, financial-advice boundary.
- approval_route_configured: release gate requires business, risk and platform signoff.
- HITL_required_where_applicable: all customer-visible answers require agent confirmation.
- evidence_binder_created: binder id `EB-CS-RAG-001`.

## Eval and Release
- golden_set_ready: 120 policy questions across product, region and high-risk topics.
- regression_set_ready: prior bad answers and complaint-triggering cases included.
- red_team_cases_ready: prompt injection, unsupported claim and stale policy prompts included.
- release_gate_passed: no critical failure and citation threshold met.
- rollback_plan_ready: route can fall back to previous approved policy index.

## Operations and Adoption
- dashboard_live: quality, latency, cost, adoption and risk tiles available.
- cost_attribution_live: cost ledger grouped by use_case_id, route and product line.
- support_path_published: L1 support and platform incident route visible in portal.
- training_scenarios_ready: 15 customer policy scenarios with escalation examples.
- feedback_loop_connected: agent feedback creates eval and knowledge-owner review items.

9.3 Platform Adoption Dashboard

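Tile	Decision supported	Slice
New use cases by golden path	Which paths are actually being adopted	business unit, risk tier, path
Time-to-first-pilot trend	Whether the platform shortens startup time	path, team, quarter
Reuse rate	Whether catalog services have become the default capability	service, use case type
Quality gate pass / fail	Whether templates and enablement are mature enough	failure reason, risk tier
Cost per successful case	Whether unit economics are scalable	route, model, path, business unit
Risk exceptions	Whether governance exceptions are excessive or overdue	exception type, owner, expiry
Developer satisfaction	Whether developer experience has friction	persona, path, team
Target user adoption	Whether the business has really changed how it works	role, workflow, team
Support load	Which services need product improvement	service, issue category
Evidence completeness	Whether high-risk releases are auditable	use case, binder status

Dashboard interpretation rule:

Platform adoption success = new use case on an approved path + quality gate passed + cost explainable + risk exceptions under control + target users in the real workflow.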
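
Read as a conjunction, the rule means a use case counts as a success only when all five conditions hold at once. The sketch below makes that explicit; the `UseCaseStatus` fields are hypothetical, and interpreting "exceptions under control" as "no overdue exceptions" is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass
class UseCaseStatus:
    on_approved_path: bool
    quality_gate_passed: bool
    cost_attributed: bool
    overdue_exceptions: int          # assumption: "under control" means zero overdue
    target_users_in_workflow: bool


def adoption_success(s: UseCaseStatus) -> bool:
    """All five conditions of the interpretation rule must hold together."""
    return all([
        s.on_approved_path,
        s.quality_gate_passed,
        s.cost_attributed,
        s.overdue_exceptions == 0,
        s.target_users_in_workflow,
    ])


demo_only = UseCaseStatus(True, True, True, 0, target_users_in_workflow=False)
print(adoption_success(demo_only))  # False: technically live, but not adopted
```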

9.4 Roadmap Prioritization Matrix

Candidate capability	Reuse potential	Risk reduction	Time saving	Revenue / cost impact	Build complexity	Operating burden	Priority decision
RAG source registry v2	5	5	4	4	3	3	Fund
Agent tool approval automation	4	5	3	3	4	4	Fund with scope control
Generic no-code workflow builder	2	2	2	2	5	5	Defer
Document AI field schema registry	4	4	4	4	3	3	Fund
Advanced model router by cost only	3	2	2	3	3	3	Reframe around quality/cost route

Scoring guide:

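Score	Meaning
5	Repeatedly needed by multiple high-value teams, and reduces risk or significantly shortens delivery.
3	Clear demand, but limited to a few scenarios or needing more evidence.
1	Scattered demand, unclear value, or adds platform complexity.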
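
One way to sanity-check the matrix is to compute a net score per candidate. The formula below (sum of the four benefit columns minus the two cost columns, all unweighted) is an assumption of this sketch, not a rule stated in this playbook; it also assumes that higher values in Build complexity and Operating burden mean more cost.

```python
# (name, reuse, risk reduction, time saving, revenue/cost impact, complexity, burden)
CANDIDATES = [
    ("RAG source registry v2", 5, 5, 4, 4, 3, 3),
    ("Agent tool approval automation", 4, 5, 3, 3, 4, 4),
    ("Generic no-code workflow builder", 2, 2, 2, 2, 5, 5),
    ("Document AI field schema registry", 4, 4, 4, 4, 3, 3),
    ("Advanced model router by cost only", 3, 2, 2, 3, 3, 3),
]


def net_score(row: tuple) -> int:
    """Unweighted benefit minus cost; a hypothetical scoring formula."""
    _, reuse, risk, time_saving, impact, complexity, burden = row
    return (reuse + risk + time_saving + impact) - (complexity + burden)


ranked = sorted(CANDIDATES, key=net_score, reverse=True)
for row in ranked:
    print(f"{net_score(row):>3}  {row[0]}")
```

Under these assumptions the ranking lines up with the decisions in the matrix: the two funded registries score highest and the no-code builder is the only candidate with a negative score. The score supports the discussion; the priority decision still belongs to the roadmap review.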

9.5 Service Lifecycle Review

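Review item	Question	Decision
Discoverability	Can users understand the service's applicable boundaries from the portal?	Revise the catalog card or onboarding.
Provisioning	Does onboarding still need heavy manual work?	Strengthen self-service or templates.
Governance	Do risk controls happen by default?	Add policy, approval, telemetry.
Support	Do support issues cluster around the same kind of friction?	Convert into a roadmap item.
Adoption	Are teams reusing the service?	Keep investing, reposition, or merge.
Cost	Can cost be attributed to business outcomes?	Add tagging, ledger, or showback.
Evidence	Are audit materials aggregated automatically?	Add evidence binder integration.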

10. Interview Answers

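10.1 30-Second Version

I would build the AI platform as a service catalog plus golden paths, rather than offering only a model API. The catalog lets teams discover and request the model gateway, RAG, eval, tool gateway, policy, observability, HITL, and evidence binder; the golden paths turn common scenarios, such as customer-facing RAG, employee copilot, agent workflow, document AI, and decision service, into paths that come with templates, reference implementations, gates, observability, and adoption metrics. That way business teams can start pilots through self-service while inheriting, by default, risk tiering, data boundaries, approval, evaluation, cost attribution, and audit evidence.

10.2 2-Minute Version

I would not position the AI platform as a "unified model API." The model call is only the thinnest layer. What enterprise AI really needs in order to scale is a reusable delivery path: how to choose models, how to connect knowledge, how to authorize tools, how to evaluate outputs, which scenarios need human review, how to attribute cost, how to retain release evidence, and whether the business actually adopts the result.

So I would first establish an AI service catalog in which every service has an owner, applicable scenarios, interfaces, risk tier, data boundary, SLO, cost model, and evidence output. The core services include model gateway, RAG service, eval service, tool gateway, policy engine, observability, HITL queue, evidence binder, and templates. Then I would capture high-frequency use cases as golden paths, such as customer-facing RAG, employee copilot, agent workflow, document AI, and decision service. Each path includes reference implementation, template, policy guardrails, telemetry, eval gate, release memo, runbook, and adoption dashboard.

Platform metrics cannot look only at call volume. I would track time-to-first-pilot, reuse rate, quality gate pass rate, cost per case, risk exceptions, developer satisfaction, and target-user adoption rate. For retail financial services, the value of this design is that the bank's customer service, KYC, AML, payment dispute, and internal copilot use cases can all follow one controlled path, which speeds up delivery and keeps risk, audit, and cost manageable.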

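10.3 AI Platform PM Version

Q: How would you design the AI platform service catalog?

A: I would design the catalog from the internal user journey, not by stacking a list of technical components. Each card must answer six questions: which repeated problem it solves, which use cases it fits, which risk tiers it supports, how to onboard through self-service, which telemetry/evidence it produces by default, and how cost and SLO are calculated. The catalog covers at least model gateway, RAG, eval, tool, policy, observability, HITL, evidence, and templates. Then I would manage the lifecycle with adoption data: keep investing in high-reuse services, and reposition or merge low-reuse ones.

Q: How is a golden path different from an ordinary template?

A: An ordinary template usually only creates a code skeleton. A golden path is the complete path from use case to pilot/release, including reference implementation, service bindings, policy guardrails, eval set, release gate, observability dashboard, runbook, training, and support. Its goal is not to save developers a few lines of code, but to let business teams deliver AI services that are operable, auditable, and adoptable in a consistent way.

Q: Will the platform limit business teams' innovation?

A: That risk exists, so I would insist on paved road + escape hatch. Low/medium-risk scenarios are highly self-service; gates are tightened only for high-risk scenarios. The platform provides an opinionated default but allows teams to request exceptions when they have evidence and an accountable owner. The key is that exceptions must have a scope, an expiry, compensating controls, and a review, so they cannot become a back road around the platform.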

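10.4 CTO Version

Q: Why not let every team connect models and tools on its own?

A: Early exploration can be decentralized, but once a use case enters pilot/release, the control plane must be unified. Otherwise model routing, data boundaries, logs, cost, tool permissions, human approval, and incident response all fragment. A unified platform is not about centralizing all business logic; it is about making horizontal capabilities and governance the default so product teams can focus on business workflow and adoption.

Q: How do you judge whether the platform's abstraction level is right?

A: Look at three kinds of signals. First, whether teams reuse platform capabilities and time-to-first-pilot falls. Second, whether teams can still express business differences rather than being forced to fit the platform. Third, whether risk, cost, and quality evidence is more complete than with self-built solutions. If platform services have low adoption, high support cost, and teams bypass templates, the abstraction is too thick or the experience too poor; if every team still rebuilds RAG, eval, and observability, the abstraction is too thin.

Q: How would you drive the first-year roadmap?

A: I would use 2-3 flagship cases to drive platform capabilities. Phase one builds the model gateway, RAG service, eval runner, basic catalog, and customer-facing RAG path. Phase two adds the tool gateway, HITL, evidence binder, and agent path. Phase three expands to document AI, decision service, cost ledger, and adoption dashboard. Each quarter, I would use reuse, pilot speed, quality gate, risk exceptions, cost per case, and satisfaction to decide whether to keep investing, consolidate, or stop.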

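10.5 Self-Check Questions

Does the catalog card state owner, risk tier, data boundary, SLO, cost, and evidence?
Does the golden path include a reference implementation, rather than only documentation?
Is self-service routed by risk tier, rather than by one-size-fits-all approval?
Do templates enable telemetry, policy, eval, and evidence binder by default?
Does Customer-Facing RAG have citation, freshness, permission, and no-answer gates?
Does the agent workflow have tool risk level, HITL, step budget, and side-effect audit?
Does Document AI have field-level eval, confidence routing, and human review?
Does the decision service make the boundary between recommendation and final decision explicit?
Do platform metrics cover speed, reuse, quality, cost, risk, developer experience, and adoption?
Does the roadmap come from repeated demand and adoption friction, rather than the platform team's technical preferences?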

Final Principle

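The sign of maturity in AI platform productization is not how many services the catalog has, nor how high model API call volume is.

The real sign of maturity is:

A new retail financial services AI use case can, within clear risk boundaries, move steadily from idea to pilot, release, scale, or stop through a discoverable catalog, self-service golden paths, governance embedded by default, observable operating metrics, and an auditable evidence chain.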