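AI Reference Implementation: Pattern Library and Reuse Assurance Architecture
Important notice: This article is learning, portfolio and architecture-training material. It does not constitute legal advice, regulatory interpretation, an audit conclusion, information security certification, a model validation conclusion or production release approval. Formal financial retail projects must be confirmed by authorized business, risk, compliance, privacy, security, model risk, technology and audit roles. Access dates are recorded as 2026-06-30.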
A guide to AI Reference Implementation / Pattern Library / Reuse Assurance Architecture
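Target audience: Senior AI PM / AI Architect / Platform PM / Enterprise Architect / CBAP-level BA / AI Governance Lead / Financial Retail Product and Operations Leader.
Learning objectives: Build an architecture method for distilling repeated AI solution patterns into a reference implementation and a reusable pattern library, while retaining control over quality, security, evidence, lifecycle, deviation and business value.
Core question: When an enterprise finds that multiple AI use cases all repeat RAG, copilot, tool gateway, evidence extraction, human review, eval and observability work, how does it turn that experience into reusable implementation assets instead of copy-pasting code, copying risk and copying governance blind spots?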
Source Anchors
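These sources are used to calibrate the language of AI risk management, AI management systems, architecture description, platform engineering, engineering performance and observability. This article focuses on reference implementation, pattern assurance and reusable evidence; it does not expand into a complete AI platform service catalog, a golden path directory or a product line engineering method.
| Source | Official link | Ideas adopted in this article |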
|---|---|---|
| NIST AI Risk Management Framework | https://www.nist.gov/itl/ai-risk-management-framework | Use Govern / Map / Measure / Manage to organize risk classification, control evidence, monitoring and continual improvement for AI patterns |
| ISO/IEC 42001 AI management system | https://www.iso.org/standard/81230.html | Use the AI management system's policy, objective, operation, performance evaluation, management review and improvement to manage reusable patterns |
| ISO/IEC/IEEE 42010 Architecture Description | https://www.iso.org/standard/74393.html | Use stakeholder concerns, viewpoints, architecture rationale and architecture description to organize the multi-view description of a pattern |
| CNCF Platforms White Paper | https://tag-app-delivery.cncf.io/whitepapers/platforms/ | Use platform as product, self-service and paved-path thinking to define the interface between platform teams and product teams, without turning this article into a service catalog |
| DORA | https://dora.dev/ | Use the thinking behind deployment frequency, lead time, change fail rate and time to restore to measure the effect of a reference implementation on delivery flow and recovery capability |
| OpenTelemetry Documentation | https://opentelemetry.io/docs/ | Use the thinking behind traces, metrics, logs and semantic conventions to design the observability skeleton and evidence trace of an AI pattern |
In one sentence:
A reference implementation is a governed, evidence-backed, runnable embodiment of a reusable AI pattern, designed so teams can reuse quality, security, eval, observability and control evidence without blindly inheriting context-specific assumptions.
1. Executive Summary
Once enterprise AI reaches the scaling stage, similar solution patterns appear again and again:
customer-facing RAG
internal policy copilot
AML investigation workbench
KYC evidence extraction
dispute evidence assistant
regulatory reporting narrative draft
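These use cases have different business goals, but their underlying capabilities repeat heavily:
- Controlled knowledge retrieval and citation.
- Prompt / policy pack / model route.
- Tool gateway and permission boundary.
- Human review, override, approval and audit trail.
- Eval baseline, regression set and red-team set.
- Threat model, privacy classification and logging policy.
- OpenTelemetry trace, cost, latency, quality and adoption metrics.
- Evidence pack, control mapping, release gate and deviation record.
Low-maturity organizations let every team re-implement these capabilities. The result:
| Repeated behavior | Consequence |
|---|---|
| Every team assembles its own RAG | Source authority, ACL, citation quality and freshness are defined inconsistently |
| Every use case writes its own prompt wrapper | Prompt injection defense, privilege-escalation prevention, refusal and logging policies fragment |
| Eval starts from zero every time | There is no regression memory, so old errors recur |
| Human review exists only in the process diagram | Reviewer capacity, queue, override reason and evidence cannot be audited |
| Control evidence is collected by hand per project | Scale decisions are slow and audit reconstructability is weak |
| Code is copied but constraints are not | Reuse is superficial and risk is actually amplified |
The mature approach is not just to build a template library, but to build a reference implementation architecture: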
repeated solution pattern
-> pattern taxonomy
-> reference implementation
-> reusable evidence pack
-> control mapping
-> reuse qualification
-> approved variants and deviations
-> telemetry and ROI monitoring
-> lifecycle, ownership and deprecation
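The value of a reference implementation is not to make every use case identical. It is to let teams start from a validated architecture skeleton, control evidence, eval baseline and operating model, and to make explicit what must be localized, what can be inherited and what needs deviation approval.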
2. Target Audience and Role Expectations
| Role | Questions to master | Typical outputs |
|---|---|---|
| Senior AI PM | Which AI solution patterns are worth productizing for reuse, and how to prove value and adoption after reuse | pattern investment thesis, reuse ROI, adoption telemetry, scale memo |
| AI Architect | How a reference implementation expresses architecture intent, quality attributes, control boundaries and variants | pattern architecture description, reference implementation anatomy, deviation rules |
| Platform PM | How the platform team turns a reusable skeleton into a consumable asset while avoiding becoming an all-encompassing platform catalog | pattern backlog, interface contract, developer experience metrics |
| CBAP-level BA | How to abstract business processes, rules, exceptions, artifacts and control points into reusable requirement patterns | workflow pattern card, reuse qualification checklist, localization map |
| Security / Privacy | How threat model, data boundary, logging, tool permissions and evidence are reused | reusable threat model, privacy class map, security control inheritance |
| Risk / Compliance / Model Risk | Which control evidence can be inherited and which must be re-proven by the use case | control mapping, eval baseline challenge, residual risk decision |
| Engineering Lead | How to deliver quickly with a reference implementation while complying with version, gate and telemetry requirements | implementation fork plan, release evidence, deviation record |
| Internal Audit | How to trace what evidence a use case inherited, what it changed and who approved the deviation | evidence lineage, pattern version trace, exception expiry report |
3. Core Thesis: Reuse Must Include Assurance, Not Only Code
An AI reference implementation is often misread as a starter repo. Genuinely mature reuse covers more than code:
code reuse
+ architecture intent reuse
+ control evidence reuse
+ eval baseline reuse
+ threat model reuse
+ observability reuse
+ operating model reuse
+ approved deviation discipline
If only the code is reused, the risk is copied along with it:
- A customer-facing RAG skeleton is used for regulated advice, but the original threat model only covers internal policy lookup.
- A KYC extraction prompt is reused for dispute evidence, but the evidence quality rubric does not cover chargeback reason codes.
- A tool gateway example defaults to read-only; a later team adds a write action without redoing approval, idempotency and rollback.
- An eval set performs well during the pilot but has no high-risk slice and no regression memory.
The definition of a reference implementation should therefore include:
| Dimension | Must include |
|---|---|
| Runnable skeleton | A minimal runnable implementation, including prompt/RAG/tool/eval/telemetry wiring |
| Architecture decision | Why it is designed this way, and where it does and does not apply |
| Quality baseline | eval dataset, rubric, threshold, known failure taxonomy |
| Security baseline | threat model, abuse cases, control checks, logging minimization |
| Control evidence | gate evidence, review evidence, trace schema, approval path |
| Variants | Permitted parameterized variants and structural variants that need approval |
| Lifecycle | owner, version, deprecation, migration, support level |
| Adoption proof | reuse count, integration lead time, defect reduction, value and risk metrics |
4. Conceptual Distinctions
4.1 Pattern vs Template vs Golden Path vs Reference Implementation
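| Concept | Definition | Question it answers | What not to confuse |
|---|---|---|---|
| Pattern | A recurring problem-context-solution structure | "Which kind of AI problem do we keep running into?" | A pattern is not necessarily runnable |
| Template | A fill-in starting point for documents, code or configuration | "How can teams write less boilerplate?" | A template cannot by itself prove quality and control |
| Golden path | The low-friction delivery path the platform recommends | "How can teams go live quickly with standard capabilities?" | A golden path leans toward developer experience and does not equal complete assurance |
| Reference implementation | A runnable, reviewable pattern embodiment with evidence and variant boundaries | "How can teams reuse implementation and evidence, and know when they cannot?" | A reference implementation must state its applicability boundary and evidence inheritance rules |
| Product line | Product family engineering centered on variant management and common assets | "How do we systematically manage multiple product variants?" | This article does not cover full product line engineering |
Senior-level phrasing: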
A pattern says what tends to work.
A template helps you start.
A golden path helps you move fast.
A reference implementation proves how the pattern works under defined controls.
4.2 Design Authority
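A reference implementation needs a design authority; otherwise it degrades into scattered sample code.
| Decision type | Design authority responsibility |
|---|---|
| Pattern admission | Whether the repeated pattern is worth admitting to the library |
| Baseline controls | Which quality/security/privacy/eval/observability controls are mandatory |
| Variant approval | Which variants are configurable and which need architecture review |
| Evidence inheritance | Which evidence can be inherited and which must be regenerated locally |
| Deprecation | When an old pattern stops being available for new use and how adopting teams migrate |
| Exception | Whether a deviation is acceptable, what the compensating controls are and when it expires |
Design authority is not a synonym for a gatekeeping committee. It should make reuse faster, because teams no longer have to re-argue the same set of architecture questions each time.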
5. Pattern Taxonomy
5.1 AI Reference Implementation Pattern Families
| Pattern family | Repeated problem | Reference implementation focus |
|---|---|---|
| Customer-facing RAG | Customers ask about policies, fees, accounts and dispute status, and need trustworthy citations and safety boundaries | source authority, citation, refusal, handoff, complaint signal |
| Internal policy copilot | Employees look up policies, SOPs and operating guidance, and need permission filtering and version control | policy pack, retrieval ACL, answer boundary, SOP version trace |
| Evidence extraction | Structured evidence is extracted from documents, emails, forms and case notes | extraction schema, confidence band, human validation, source span |
| Investigation workbench | AML, fraud, dispute and complaint cases are analyzed, and need summarization, an evidence chain and decision support | case graph, evidence timeline, analyst notes, escalation boundary |
| Narrative drafting | Reports, explanations, regulatory narratives and customer communication drafts are generated | grounded draft, maker-checker, source trace, approval workflow |
| Tool gateway / agent action | AI calls tools to read data or execute actions | tool contract, authorization, idempotency, approval, audit, kill switch |
| Human review and override | AI output enters a controlled workflow and needs review and override reasons | review queue, override taxonomy, reviewer load, QA sample |
| Eval and regression | Multiple use cases need quality, risk and regression evaluation | eval harness, golden set, failure taxonomy, threshold and gate |
| Observability and evidence | Versions, calls, evidence, cost, latency and control events need to be traced | OTel trace skeleton, metric contract, evidence binder |
5.2 Financial Retail Pattern Map
| Use case | Primary pattern | Secondary patterns |
|---|---|---|
| Customer-facing RAG | Customer-facing RAG | escalation, complaint detection, source freshness |
| Internal policy copilot | Internal policy copilot | policy versioning, access control, adoption telemetry |
| AML investigation workbench | Investigation workbench | evidence extraction, narrative drafting, human review |
| KYC evidence extraction | Evidence extraction | document classification, exception workflow, QA sampling |
| Dispute evidence assistant | Investigation workbench | evidence timeline, card network rule retrieval, customer letter draft |
| Regulatory reporting narrative draft | Narrative drafting | data lineage, maker-checker, evidence binder |
5.3 Pattern Maturity Levels
| Level | Name | Evidence standard |
|---|---|---|
| 0 | Observed repeat | Multiple teams hit a similar problem, but there is no standard solution |
| 1 | Documented pattern | A pattern card, applicability boundary and risk notes exist |
| 2 | Template | Documentation, a sample prompt, a sample schema or starter code exists |
| 3 | Reference implementation | A runnable skeleton, eval baseline, threat model, telemetry and evidence pack exist |
| 4 | Assured reusable pattern | Reused by multiple use cases; control evidence is inheritable and deviations are managed |
| 5 | Managed lifecycle asset | Has an owner, versioning, adoption telemetry, ROI, deprecation and migration |
The goal is not to push every pattern to Level 5. Only high-repetition, high-risk or high-value patterns justify investment at the reference implementation level.
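The maturity levels in the table above can be read as a cumulative checklist. The following is a minimal sketch under that assumption: the `PatternEvidence` fields, the `maturity_level` function and the `min_reuses=2` reading of "multiple use cases" are all hypothetical names and choices made for illustration, not part of any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternEvidence:
    """Hypothetical evidence flags for one pattern (illustrative field names)."""
    repeat_observed: bool = False
    has_pattern_card: bool = False
    has_template: bool = False
    has_reference_impl: bool = False      # skeleton, eval, threat model, telemetry, evidence pack
    qualified_reuses: int = 0
    deviations_managed: bool = False
    has_lifecycle_management: bool = False  # owner, versioning, telemetry, ROI, deprecation


def maturity_level(e: PatternEvidence, min_reuses: int = 2) -> int:
    """Return the highest level reached without skipping a lower one; -1 if none."""
    checks = [
        e.repeat_observed,                                           # Level 0
        e.has_pattern_card,                                          # Level 1
        e.has_template,                                              # Level 2
        e.has_reference_impl,                                        # Level 3
        e.qualified_reuses >= min_reuses and e.deviations_managed,   # Level 4
        e.has_lifecycle_management,                                  # Level 5
    ]
    level = -1
    for i, ok in enumerate(checks):
        if not ok:
            break  # levels are cumulative: stop at the first missing one
        level = i
    return level
```

Treating the levels as cumulative means a pattern with a runnable skeleton but no pattern card still scores below Level 3, which matches the idea that code without documented boundaries is not yet a reference implementation.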
6. Reference Implementation Anatomy
A well-formed AI reference implementation should consist of 12 assets.
| Asset | Contents | Reuse value |
|---|---|---|
| Pattern card | problem, context, forces, solution, applicability, non-applicability | Prevents misuse by teams |
| Architecture description | context view, container/component view, data flow, control view, runtime view | Aligns stakeholder concerns |
| Runnable skeleton | minimal service/app, sample config, local test harness, deployment notes | Shortens delivery lead time |
| Prompt skeleton | system prompt boundary, task prompt, refusal style, evidence requirement | Reuses safety and quality constraints |
| RAG skeleton | source registry, chunking policy, ACL filter, citation contract, freshness check | Reuses knowledge governance |
| Tool gateway skeleton | tool registry, allowlist, auth, approval, idempotency, audit, rollback | Controls agent action risk |
| Eval baseline | golden set, regression set, red-team set, rubric, thresholds, reviewer guide | Reuses quality and risk memory |
| Threat model | misuse cases, prompt injection, data exfiltration, privilege escalation, failure mode | Reuses security analysis |
| Control mapping | NIST/ISO/internal controls to implementation checks and evidence | Reuses governance evidence |
| Observability skeleton | traces, metrics, logs, release identity, cost, latency, quality, user action | Supports production assurance |
| Evidence pack | release evidence, eval run, threat model, approvals, deviations, known limitations | Supports scale and audit |
| Lifecycle metadata | owner, support level, version, compatibility, deprecation, migration | Manages asset health |
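One way to make the asset list above checkable is a manifest that records where each asset lives. The sketch below assumes a simple dictionary manifest; the key names in `REQUIRED_ASSETS`, the `missing_assets` helper and the example paths are hypothetical.

```python
# Hypothetical manifest keys, one per asset in the anatomy table.
REQUIRED_ASSETS = (
    "pattern_card", "architecture_description", "runnable_skeleton",
    "prompt_skeleton", "rag_skeleton", "tool_gateway_skeleton",
    "eval_baseline", "threat_model", "control_mapping",
    "observability_skeleton", "evidence_pack", "lifecycle_metadata",
)


def missing_assets(manifest: dict[str, str]) -> list[str]:
    """Return required assets that are absent or point to an empty location."""
    return [name for name in REQUIRED_ASSETS if not manifest.get(name, "").strip()]


# Example: a manifest that only has a skeleton and a pattern card so far.
draft = {
    "pattern_card": "docs/pattern-card.md",        # illustrative path
    "runnable_skeleton": "src/",                   # illustrative path
    "eval_baseline": "   ",                        # blank entries count as missing
}
gaps = missing_assets(draft)
```

A check like this could run in CI for the pattern repository, so a pattern cannot be labeled a reference implementation while any of the 12 assets is still missing.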
6.1 Prompt / RAG / Tool Gateway Skeletons
| Skeleton | Mandatory design elements |
|---|---|
| Prompt skeleton | role boundary, allowed tasks, prohibited tasks, source requirement, uncertainty behavior, escalation rule, logging class |
| RAG skeleton | approved source registry, source owner, freshness SLA, ACL enforcement, retrieval eval, citation verifier, stale-source stop rule |
| Tool gateway skeleton | tool risk tier, auth context, least privilege, side-effect declaration, approval requirement, idempotency key, audit event, kill switch |
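To illustrate how several of the tool gateway elements interact, here is a minimal in-memory sketch covering the allowlist, side-effect declaration, approval requirement, idempotency key, audit event and kill switch. `ToolContract`, `ToolGateway`, the decision strings and the order of the checks are assumptions made for this example; a real gateway would also bind the caller's auth context and persist its state.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolContract:
    name: str
    has_side_effect: bool     # side-effect declaration
    requires_approval: bool   # approval requirement


@dataclass
class ToolGateway:
    registry: dict[str, ToolContract]
    kill_switch: bool = False
    seen_keys: set[str] = field(default_factory=set)
    audit_log: list[dict] = field(default_factory=list)

    def authorize(self, tool: str, allowed_tools: set[str],
                  idempotency_key: str, approved: bool = False) -> str:
        """Decide whether a call may proceed and record an audit event."""
        contract = self.registry.get(tool)
        if self.kill_switch:
            decision = "blocked:kill_switch"
        elif contract is None or tool not in allowed_tools:
            decision = "denied:not_allowed"          # least privilege via allowlist
        elif contract.requires_approval and not approved:
            decision = "pending:approval_required"
        elif contract.has_side_effect and idempotency_key in self.seen_keys:
            decision = "skipped:duplicate"           # same key, do not repeat the action
        else:
            if contract.has_side_effect:
                self.seen_keys.add(idempotency_key)
            decision = "allowed"
        # Every decision, including denials, produces an audit event.
        self.audit_log.append({"tool": tool, "key": idempotency_key, "decision": decision})
        return decision


gw = ToolGateway({
    "lookup_policy": ToolContract("lookup_policy", False, False),
    "issue_refund": ToolContract("issue_refund", True, True),
})
first = gw.authorize("issue_refund", {"issue_refund"}, "req-1", approved=True)
retry = gw.authorize("issue_refund", {"issue_refund"}, "req-1", approved=True)
```

The point of the sketch is the ordering: the kill switch and allowlist are evaluated before approval, and the idempotency check only applies to tools that declare a side effect.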
6.2 Eval Baseline
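An eval baseline is not a one-off test result; it is the quality memory of the reference implementation.
| Eval asset | Contents |
|---|---|
| Golden set | High-quality samples that represent the target task |
| High-risk slice | vulnerable customer, complaint, AML high-risk, KYC exception, regulatory deadline and similar cases |
| Regression set | Historical defects, near misses, incidents and reviewer disagreements |
| No-answer set | Scenarios that should be refused, escalated or met with a request for more information |
| Prompt injection set | External content that tries to override system policy or leak data |
| Rubric | correctness, groundedness, completeness, policy fit, tone, escalation |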
| Threshold | aggregate threshold and critical failure hard stop |
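The threshold row combines two rules: an aggregate pass rate and a hard stop on any critical failure. A minimal sketch of such a gate follows; `EvalResult`, `release_gate` and the default threshold of 0.9 are hypothetical, and a real baseline would set thresholds per suite and per rubric dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    case_id: str
    suite: str               # e.g. "golden", "high_risk", "no_answer", "prompt_injection"
    passed: bool
    critical: bool = False   # a failed critical case stops the release outright


def release_gate(results: list[EvalResult],
                 aggregate_threshold: float = 0.9) -> tuple[bool, str]:
    """Return (passed, reason) for one eval run."""
    if not results:
        return False, "no eval results"
    critical_failures = [r.case_id for r in results if r.critical and not r.passed]
    if critical_failures:
        # Hard stop: a high aggregate score cannot offset a critical failure.
        return False, f"critical failure hard stop: {critical_failures}"
    pass_rate = sum(r.passed for r in results) / len(results)
    if pass_rate < aggregate_threshold:
        return False, f"pass rate {pass_rate:.2f} below threshold {aggregate_threshold:.2f}"
    return True, f"pass rate {pass_rate:.2f}"
```

Checking critical failures before the aggregate keeps a single prompt injection or high-risk failure from being averaged away by a large golden set.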
7. Reuse Qualification
The reuse qualification question is not "can we copy this repo?" but:
Can this use case inherit the reference implementation's architecture assumptions, controls and evidence without creating hidden risk?
7.1 Qualification Dimensions
| Dimension | Questions |
|---|---|
| Business workflow fit | Do the target process, user roles, artifacts and decision points match the pattern? |
| Risk tier | Are customer impact, regulatory impact, financial loss and privacy sensitivity within the reference scope? |
| Data class | Do PII, financial data, SAR-related data and confidential policy fit the skeleton's data boundary? |
| Human accountability | At which level does the AI operate (draft / read-only / recommend / execute), and does that change the accountability boundary? |
| Source authority | Does each RAG source have an owner, freshness, approval and ACL? |
| Tool action | Does the use case extend from read-only to write/execute, and does it need new controls? |
| Eval coverage | Does the baseline eval cover the target case types, languages and edge cases? |
| Control inheritance | Which controls can be inherited and which must be proven locally? |
| Operating model | Are the review queue, manager cadence and incident path available? |
| Telemetry | Can the use case emit the required events and traces? |
7.2 Reuse Decision Types
| Decision | Meaning | Example |
|---|---|---|
| Reuse as-is | Only configuration and data source binding; the architecture boundary does not change | An internal HR policy copilot reuses the internal policy copilot skeleton |
| Reuse with local controls | The skeleton fits, but local eval, QA or source review must be added | KYC evidence extraction adds a beneficial ownership high-risk slice |
| Reuse as variant | A structural variant that the design authority must record | Customer-facing RAG adds authenticated account-specific retrieval |
| Deviation required | Breaks a default constraint and needs approval and compensating controls | The tool gateway changes from read-only to a customer-impacting write action |
| Do not reuse | The context falls outside the reference implementation's assumptions | Using the internal policy copilot skeleton to handle final credit decline explanations |
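The five decision types can be sketched as a small decision function over the qualification findings. The `QualificationResult` fields and the precedence order (most restrictive outcome wins) are assumptions for illustration; an actual design authority would define its own criteria for each dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualificationResult:
    """Hypothetical summary of a reuse qualification review."""
    within_pattern_assumptions: bool   # workflow, accountability and context fit the pattern
    breaks_baseline_constraint: bool   # e.g. a read-only tool becomes a write action
    structural_change: bool            # e.g. authenticated account-specific retrieval is added
    needs_local_controls: bool         # e.g. an extra high-risk eval slice or source review


def reuse_decision(q: QualificationResult) -> str:
    """Map findings to a decision type; the most restrictive finding wins."""
    if not q.within_pattern_assumptions:
        return "do_not_reuse"
    if q.breaks_baseline_constraint:
        return "deviation_required"
    if q.structural_change:
        return "reuse_as_variant"
    if q.needs_local_controls:
        return "reuse_with_local_controls"
    return "reuse_as_is"


# Example: KYC extraction that fits the pattern but needs an extra high-risk slice.
kyc = QualificationResult(True, False, False, True)
kyc_decision = reuse_decision(kyc)
```

Evaluating the checks from most to least restrictive means a use case that both needs local controls and breaks a baseline constraint is routed to the deviation process rather than being approved as a lighter outcome.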
8. Reusable Evidence Pack
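One of the most important reusable assets of a reference implementation is the evidence pack. It spares teams from explaining the same set of controls from scratch every time.
| Evidence object | Inheritable content | Content that must be localized |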
|---|---|---|
| Pattern rationale | why this architecture exists | use case business problem and outcome thesis |
| Architecture views | common components, data flow, control points | actual systems, data stores, identity and workflow integration |
| Threat model | common threats and mitigations | use-case-specific abuse cases and data exposure |
| Eval baseline | reusable suites and rubrics | local samples, risk slices, source-specific tests |
| Control mapping | common control claims and implementation hooks | control owner, residual risk, local evidence |
| Security tests | prompt injection, access boundary, tool contract checks | local source ACL and tool permissions |
| Privacy analysis | logging minimization and retention classes | actual data class, region, consent and retention |
| Observability | trace schema and metric contract | dashboard thresholds and business outcome joins |
| Release gate | standard evidence checklist | release decision, approvals and exceptions |
| Known limitations | pattern limits | local constraints and accepted residual risk |
Core rules of evidence reuse:
- Reusable evidence reduces duplication, not accountability.
- Evidence inheritance must be explicit, versioned and reviewable.
- A team can inherit the skeleton and still fail local release gates.
- A local deviation can invalidate inherited evidence.
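Two of these rules, explicit versioned inheritance and invalidation by deviation, can be sketched as a small record check. `InheritedEvidence`, `still_valid` and the example values are hypothetical; the sketch assumes each deviation declares which evidence objects it affects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InheritedEvidence:
    evidence_object: str    # e.g. "threat_model", "eval_baseline"
    pattern_version: str    # pattern version the evidence was inherited from
    accepted_by: str        # who accepted the inheritance


def still_valid(evidence: list[InheritedEvidence], adopted_version: str,
                affected_by_deviations: set[str]) -> dict[str, bool]:
    """Evidence stays valid only if its version matches and no deviation touches it."""
    return {
        e.evidence_object: (
            e.pattern_version == adopted_version
            and e.evidence_object not in affected_by_deviations
        )
        for e in evidence
    }


inherited = [
    InheritedEvidence("threat_model", "2.1.0", "security-lead"),
    InheritedEvidence("eval_baseline", "2.0.0", "eval-owner"),
    InheritedEvidence("trace_schema", "2.1.0", "platform-pm"),
]
# A tool-action deviation that changes the threat surface (illustrative).
status = still_valid(inherited, "2.1.0", {"threat_model"})
```

In this example the threat model is invalidated by the deviation, the eval baseline is stale because it was inherited from an older pattern version, and only the trace schema remains inheritable.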
9. Control Mapping
Control mapping turns pattern assurance into manageable governance language.
| Control domain | Reusable control claim | Evidence source |
|---|---|---|
| Governance | Pattern has owner, scope, risk tier, review cadence and deprecation path | pattern registry, lifecycle metadata |
| Architecture | Stakeholder concerns and architecture decisions are documented | architecture description, ADR |
| Data and privacy | Prompt context, logs and traces minimize sensitive payload | privacy class, DLP sample, trace schema |
| Security | Prompt injection, unauthorized retrieval and tool abuse are tested | red-team run, access test, tool contract |
| Model behavior | Outputs meet groundedness, completeness and refusal thresholds | eval run, SME sample, failure taxonomy |
| Human oversight | Human review and override are observable and capacity-aware | review queue metric, override reason log |
| Operations | Release, rollback, incident and evidence paths are defined | runbook, release gate, incident drill |
| Monitoring | Runtime metrics cover quality, safety, cost, latency, adoption and control | OTel dashboard, metric contract |
| Continual improvement | Defects and deviations feed pattern version updates | defect backlog, version notes, quarterly review |
9.1 Mapping to Source Anchors
| Anchor language | Reference implementation translation |
|---|---|
| NIST AI RMF Govern | pattern ownership, design authority, risk roles, evidence review |
| NIST AI RMF Map | applicability, context, users, workflow, data, risk tier |
| NIST AI RMF Measure | eval baseline, telemetry, review sampling, control metrics |
| NIST AI RMF Manage | release gates, deviations, incident response, deprecation |
| ISO/IEC 42001 operation | pattern operation controls, competence, documented information |
| ISO/IEC 42001 performance evaluation | reuse telemetry, management review, audit evidence |
| ISO/IEC/IEEE 42010 | views, stakeholders, concerns, rationale and architecture description |
| CNCF platform thinking | self-service consumption and platform-as-product interface |
| DORA | delivery flow, change quality, recovery and operational learning |
| OpenTelemetry | traces, metrics and logs that connect release identity to outcomes |
10. Lifecycle and Versioning
AI reference implementations age quickly because models, prompts, sources, tools, policies and regulations change.
10.1 Versioned Artifacts
| Artifact | Versioning rule |
|---|---|
| Reference implementation | semantic version with breaking-change notes |
| Prompt pack | version per task boundary and policy change |
| RAG source profile | version per source set, chunking, embedding, reranking and ACL policy |
| Tool contract | version per API schema, permission, side effect and approval model |
| Eval suite | version per sample set, rubric, threshold and known failure set |
| Threat model | version per new tool, channel, data class or abuse case |
| Control mapping | version per control framework, internal policy or evidence path |
| Telemetry schema | version per required event field and semantic convention |
10.2 Lifecycle States
| State | Meaning | Allowed use |
|---|---|---|
| Candidate | pattern observed and under design | discovery and prototype only |
| Beta reference | runnable skeleton exists, limited evidence | controlled pilots |
| Approved | eval, threat model, control mapping and evidence pack reviewed | new use cases may adopt |
| Restricted | known limitation or incident requires narrowed use | only approved scopes |
| Deprecated | no new adoption; migration path exists | existing consumers migrate |
| Retired | unsupported and removed from approved library | no production use |
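The allowed-use column can be expressed as a small policy function. This is a sketch under stated assumptions: the `State` enum, the `use_allowed` function and the purpose labels "prototype", "pilot" and "production" are hypothetical, and Retired is read conservatively here as blocking every use.

```python
from enum import Enum


class State(Enum):
    CANDIDATE = "candidate"
    BETA = "beta_reference"
    APPROVED = "approved"
    RESTRICTED = "restricted"
    DEPRECATED = "deprecated"
    RETIRED = "retired"


def use_allowed(state: State, purpose: str, *, scope_approved: bool = False,
                existing_consumer: bool = False) -> bool:
    """purpose is one of "prototype", "pilot" or "production" (illustrative labels)."""
    if state is State.CANDIDATE:
        return purpose == "prototype"             # discovery and prototype only
    if state is State.BETA:
        return purpose in {"prototype", "pilot"}  # controlled pilots, no production
    if state is State.APPROVED:
        return True                               # new use cases may adopt
    if state is State.RESTRICTED:
        return scope_approved                     # only explicitly approved scopes
    if state is State.DEPRECATED:
        return existing_consumer                  # no new adoption during migration
    return False                                  # RETIRED: conservative reading, no use
```

A pattern registry could call a function like this at intake, so a team proposing a production use of a Beta or Deprecated pattern is redirected before implementation starts.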
10.3 Deprecation Triggers
- Model or vendor change invalidates baseline eval.
- Security incident reveals structural weakness.
- New policy or regulation changes acceptable use.
- Pattern is superseded by a safer or cheaper implementation.
- Adoption telemetry shows low reuse or high deviation burden.
- Support team cannot maintain controls or evidence freshness.
11. Deviation Process
Deviation is not failure. Hidden deviation is failure.
11.1 Deviation Types
| Deviation type | Example | Required review |
|---|---|---|
| Configuration deviation | Different source freshness SLA | pattern owner review |
| Data deviation | New sensitive data class in prompt context | privacy and security review |
| Risk-tier deviation | Internal assistant becomes customer-facing | risk, compliance and architecture review |
| Tool-action deviation | Read-only tool becomes write action | security, architecture, business owner approval |
| Eval deviation | Local use case lacks required high-risk slice | eval owner and risk review |
| Observability deviation | Cannot emit required trace fields | platform and audit evidence review |
| Control deviation | Human review sampling reduced below baseline | risk owner decision with expiry |
11.2 Approved Deviation Record
| Field | Required content |
|---|---|
| Deviation ID | stable identifier tied to pattern version and use case |
| Baseline requirement | reference implementation rule being changed |
| Business reason | why standard approach does not fit |
| Risk analysis | impact, likelihood, affected controls, affected users |
| Compensating controls | what replaces or reduces the lost control |
| Evidence required | eval, test, review, monitoring and approval evidence |
| Expiry | date or trigger when deviation must be renewed or removed |
| Owner | residual risk owner and implementation owner |
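The record fields above map naturally onto a typed structure with a validity check. The sketch below is illustrative only: `DeviationRecord` and its `problems` method are hypothetical, and it models expiry as a calendar date even though the table also allows a trigger-based expiry.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DeviationRecord:
    deviation_id: str
    baseline_requirement: str
    business_reason: str
    risk_analysis: str
    compensating_controls: tuple[str, ...]
    evidence_required: tuple[str, ...]
    expiry: date                      # date-based expiry only in this sketch
    residual_risk_owner: str
    implementation_owner: str

    def problems(self, today: date) -> list[str]:
        """Return reasons why this record should not be treated as an active approval."""
        issues = []
        if not self.compensating_controls:
            issues.append("no compensating controls")
        if not self.evidence_required:
            issues.append("no evidence required")
        if not self.residual_risk_owner or not self.implementation_owner:
            issues.append("missing owner")
        if today > self.expiry:
            issues.append("expired")
        return issues


record = DeviationRecord(
    deviation_id="DEV-rag-2.1.0-001",             # illustrative identifier
    baseline_requirement="tool gateway is read-only",
    business_reason="dispute team needs to submit case notes",
    risk_analysis="customer-impacting write; medium likelihood",
    compensating_controls=("maker-checker approval",),
    evidence_required=("tool contract test", "approval log"),
    expiry=date(2026, 12, 31),
    residual_risk_owner="dispute-ops-head",
    implementation_owner="eng-lead",
)
```

Making expiry a required field, rather than an optional one, is what keeps a temporary exception from quietly becoming permanent.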
12. Adoption Telemetry and Reuse ROI
Pattern library success cannot be measured by number of documents published.
12.1 Reuse Telemetry
| Metric | Meaning |
|---|---|
| qualified reuse count | use cases that passed reuse qualification and adopted the pattern |
| lead time reduction | time from intake to first controlled pilot compared with non-reference approach |
| evidence inheritance rate | percentage of evidence objects reused without rework |
| deviation rate | how often teams need approved deviations |
| defect escape rate | defects found after release by pattern and version |
| eval regression reuse | number of incidents converted into shared regression cases |
| support burden | questions, implementation issues and break-fix effort per adoption |
| deprecation compliance | consumers migrated before retirement date |
12.2 Reuse ROI
Reusable pattern ROI should combine speed, quality and risk:
Reuse ROI
= delivery lead time avoided
+ assurance effort avoided
+ defect and incident reduction
+ platform operating cost efficiency
+ audit and review effort reduction
- pattern maintenance cost
- deviation management cost
- migration and deprecation cost
Senior PM framing:
The business case for a reference implementation is not "developers save time"; it is "the organization learns once, controls once where appropriate, and reuses evidence without losing accountability."
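The ROI expression in this section is a plain sum of benefit terms minus cost terms, so it can be computed directly once each term is estimated in a common unit. In the sketch below the term names mirror the formula, while `reuse_roi` and the example figures are made up for illustration and carry no empirical meaning.

```python
BENEFITS = (
    "delivery_lead_time_avoided",
    "assurance_effort_avoided",
    "defect_and_incident_reduction",
    "platform_operating_cost_efficiency",
    "audit_and_review_effort_reduction",
)
COSTS = (
    "pattern_maintenance_cost",
    "deviation_management_cost",
    "migration_and_deprecation_cost",
)


def reuse_roi(values: dict[str, float]) -> float:
    """Sum the benefit terms and subtract the cost terms; all terms share one unit."""
    missing = [k for k in BENEFITS + COSTS if k not in values]
    if missing:
        # Refuse to compute a partial ROI: an omitted cost would inflate the result.
        raise KeyError(f"missing ROI terms: {missing}")
    return sum(values[k] for k in BENEFITS) - sum(values[k] for k in COSTS)


# Illustrative estimates in person-days per year (invented numbers).
estimate = {
    "delivery_lead_time_avoided": 120.0,
    "assurance_effort_avoided": 80.0,
    "defect_and_incident_reduction": 50.0,
    "platform_operating_cost_efficiency": 30.0,
    "audit_and_review_effort_reduction": 20.0,
    "pattern_maintenance_cost": 60.0,
    "deviation_management_cost": 25.0,
    "migration_and_deprecation_cost": 15.0,
}
roi = reuse_roi(estimate)
```

With these invented figures the benefits total 300 and the costs total 100, so `roi` is 200.0. Requiring every term forces the estimate to account for maintenance, deviation and migration costs rather than reporting benefits alone.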
13. Platform / Team Interface
Reference implementation sits between platform and product teams.
| Responsibility | Platform team | Use case team |
|---|---|---|
| Skeleton | Build and maintain common implementation | Configure and integrate into workflow |
| Pattern card | Maintain approved pattern definition | Confirm fit and local assumptions |
| Eval harness | Provide reusable suites and runner | Add local samples and sign off thresholds |
| Threat model | Provide common threat model | Extend for local data, channel and tool actions |
| Observability | Provide trace/metric schema and dashboards | Emit required events and connect business outcomes |
| Controls | Provide control hooks and evidence templates | Produce local evidence and obtain approvals |
| Deviation | Review and record deviations | Request, justify, monitor and close deviations |
| Lifecycle | Version, support, deprecate, migrate | Track adoption version and migrate on schedule |
13.1 Interface Contracts
| Interface | Contract |
|---|---|
| Pattern registry | pattern id, owner, maturity, versions, approved scopes, support level |
| Implementation package | repo/module/image, config schema, integration guide, test harness |
| Evidence API | standard locations for eval run, threat model, approval, exception and release artifacts |
| Telemetry contract | required traces, metrics, logs, semantic fields and privacy class |
| Deviation workflow | request fields, approvers, expiry, evidence and status |
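A telemetry contract becomes enforceable when emitted events are validated against it. The following sketch uses plain dictionaries rather than the OpenTelemetry SDK; the field names in `REQUIRED_FIELDS` and the `contract_violations` helper are hypothetical and would be replaced by whatever the platform's contract actually specifies.

```python
# Hypothetical required fields and their expected Python types.
REQUIRED_FIELDS = {
    "pattern_id": str,
    "pattern_version": str,
    "release_id": str,        # ties runtime events to release identity
    "privacy_class": str,
    "latency_ms": (int, float),
}


def contract_violations(event: dict) -> list[str]:
    """Return a list of missing or wrongly typed required fields for one event."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in event:
            problems.append(f"missing:{name}")
        elif not isinstance(event[name], expected_type):
            problems.append(f"wrong_type:{name}")
    return problems


event = {
    "pattern_id": "customer-facing-rag",   # illustrative values
    "pattern_version": "2.1.0",
    "release_id": "rel-0042",
    "latency_ms": "fast",                  # wrong type on purpose
}
violations = contract_violations(event)
```

A use case team that cannot pass a check like this would fall under the observability deviation type described in the deviation process, rather than silently emitting partial telemetry.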
14. Financial Retail Examples
14.1 Customer-Facing RAG
| Reference implementation element | Example |
|---|---|
| Pattern | Answer customer questions with approved sources and safe handoff |
| Mandatory controls | citation, source freshness, complaint detection, vulnerable customer escalation |
| Eval baseline | fee policy, dispute status, account servicing, ambiguous customer intent |
| Threat model | prompt injection in customer text, stale policy, unsupported commitment |
| Deviation example | adding account-specific retrieval requires stronger identity, entitlement and audit |
14.2 Internal Policy Copilot
| Reference implementation element | Example |
|---|---|
| Pattern | Employees ask policy and SOP questions inside authenticated workflow |
| Mandatory controls | role-based retrieval, policy version trace, no customer advice beyond employee authority |
| Eval baseline | branch procedure, contact-center policy, KYC exception, escalation rules |
| Reuse evidence | source registry, prompt boundary, retrieval ACL test, answer citation audit |
| Deviation example | using copilot output directly in customer letter requires narrative drafting controls |
14.3 AML Investigation Workbench
| Reference implementation element | Example |
|---|---|
| Pattern | Assist analysts with case summary, evidence timeline and narrative draft |
| Mandatory controls | analyst-owned disposition, source span, high-risk escalation, QA sampling |
| Eval baseline | transaction monitoring alerts, adverse media, structuring indicators, missing evidence |
| Threat model | omitted critical evidence, false comfort, SAR-related data leakage |
| Deviation example | automatic closure recommendation is outside baseline and needs separate approval |
14.4 KYC Evidence Extraction
| Reference implementation element | Example |
|---|---|
| Pattern | Extract beneficial ownership, document type, expiry and missing evidence |
| Mandatory controls | source span, confidence band, human validation, exception queue |
| Eval baseline | individual, SMB, trust, non-resident, high-risk country cases |
| Reuse evidence | extraction schema, document classifier tests, validation UI trace |
| Deviation example | straight-through KYC decision cannot inherit extraction-only evidence |
14.5 Dispute Evidence Assistant
| Reference implementation element | Example |
|---|---|
| Pattern | Build dispute evidence timeline and draft case packet |
| Mandatory controls | card network rule source, customer communication boundary, maker-checker |
| Eval baseline | fraud claim, merchant dispute, recurring charge, provisional credit |
| Threat model | wrong reason code, missing timeline evidence, unsupported customer denial |
| Deviation example | automated chargeback submission requires tool-action controls |
14.6 Regulatory Reporting Narrative Draft
| Reference implementation element | Example |
|---|---|
| Pattern | Draft variance explanation or management narrative from approved data lineage |
| Mandatory controls | data lineage, maker-checker, metric contract, attestation boundary |
| Eval baseline | variance explanation, source-to-report mapping, late adjustment, restatement scenario |
| Threat model | unsupported explanation, stale metric, hallucinated cause, audit trail gap |
| Deviation example | external filing language requires compliance and authorized signer review |
15. Anti-Patterns
| Anti-pattern | Why it fails | Mature replacement |
|---|---|---|
| Starter repo called reference implementation | It has code but no evidence, boundaries or lifecycle | Runnable skeleton plus eval, threat model, controls and telemetry |
| Pattern library as wiki | Documents are not consumable or testable | Pattern registry with maturity, owner, versions and adoption telemetry |
| Copying controls without context | Local risk may exceed baseline assumptions | Reuse qualification and local evidence requirements |
| Golden path replaces architecture judgment | Fast path may not fit high-risk use cases | Design authority and deviation process |
| Eval baseline never changes | New incidents are not learned from | Regression memory and versioned eval suites |
| Every deviation approved forever | Temporary exceptions become architecture drift | Expiry, compensating controls and quarterly review |
| Reuse measured by downloads | Downloads do not prove governed adoption | Qualified reuse, evidence inheritance and defect reduction |
| Platform owns everything | Use case teams avoid local accountability | Shared responsibility model |
| Local teams fork silently | Pattern assurance becomes untraceable | Versioned adoption record and supported variant model |
| No deprecation discipline | Old unsafe patterns remain in production | Lifecycle state, migration guide and retirement date |
16. Interview Answers
Q1: What is the difference between a template and a reference implementation?
A template is a starting point. A reference implementation is a governed, runnable embodiment of a pattern. It includes architecture rationale, quality baseline, threat model, control mapping, observability, evidence pack, versioning and deviation rules. In AI, that distinction matters because copying code does not copy safety, eval coverage or accountability.
Q2: How would you build an AI pattern library without creating shelfware?
I would start from repeated production demand, not abstract taxonomy. For each candidate pattern, I would define a pattern card, reference implementation, eval baseline, threat model, control mapping, telemetry contract and reuse qualification checklist. I would measure qualified reuse, lead time reduction, evidence inheritance, deviation rate, defect reduction and support burden. If a pattern is not reused or creates too many deviations, it should be redesigned or deprecated.
Q3: How do you decide whether a new use case can reuse an existing reference implementation?
I would run reuse qualification across workflow fit, risk tier, data class, human accountability, source authority, tool action, eval coverage, control inheritance, operating model and telemetry. If the use case stays inside the reference assumptions, it can reuse most assets. If it changes data sensitivity, customer impact or tool actions, it needs local controls or approved deviation. Some use cases should not reuse the pattern at all.
Q4: How can reusable evidence reduce governance effort without weakening control?
Reusable evidence reduces repeated explanation of common controls, such as RAG citation checks, prompt injection tests, tool gateway audit fields or OTel trace schema. It does not remove local accountability. The use case still needs to prove its own data sources, workflows, risk tier, eval samples, approvals and residual risk. The key is explicit evidence inheritance: what is reused, what is local, what version, and who accepted it.
Q5: How do you handle variants and deviations?
I separate configured variants from deviations. A configured variant is expected, such as different source registry or workflow labels. A deviation changes a baseline assumption, such as making an internal copilot customer-facing or adding write actions. Deviations require a record with business reason, risk analysis, compensating controls, evidence, owner and expiry. Hidden deviation is architecture drift.
Q6: What would you include in a reference implementation for a customer-facing RAG assistant?
I would include approved source registry, ACL and freshness checks, citation contract, refusal and handoff rules, prompt injection tests, groundedness eval, high-risk customer scenarios, complaint escalation, OTel trace schema, cost and latency metrics, human review sampling, release gate, evidence pack and deprecation rules. I would also define when the skeleton must not be reused, such as final adverse decisions or regulated advice without separate controls.
17. Portfolio Exercise
Scenario
A financial retail institution has 6 AI initiatives:
| Use case | Current problem |
|---|---|
| Customer-facing RAG | Multiple teams build separate policy-answer bots |
| Internal policy copilot | Branch and contact-center teams use different policy search approaches |
| AML investigation workbench | Analysts need faster evidence timeline and narrative preparation |
| KYC evidence extraction | Onboarding teams repeat document extraction logic |
| Dispute evidence assistant | Dispute operations need reusable evidence packet drafting |
| Regulatory reporting narrative draft | Finance and risk teams need controlled narrative drafting from approved metrics |
Required Artifacts
- Pattern taxonomy with at least 8 pattern families.
- Reference implementation anatomy for two patterns.
- Reuse qualification checklist for one proposed adoption.
- Evidence pack showing inherited versus local evidence.
- Control mapping to NIST AI RMF, ISO/IEC 42001, architecture description and observability concepts.
- Eval baseline with golden, high-risk, no-answer and regression sets.
- Threat model reuse plan, including local extension points.
- Variant and deviation process with an example approved deviation.
- Lifecycle/versioning model with deprecation triggers.
- Adoption telemetry and reuse ROI model.
- Platform/team interface contract.
- Scale memo recommending which two patterns should become Level 4 assured reusable patterns first.
Evaluation Rubric
| Criterion | Strong evidence |
|---|---|
| Pattern clarity | Separates pattern, template, golden path and reference implementation |
| Assurance depth | Includes eval, threat model, control mapping, observability and evidence |
| Reuse discipline | Defines qualification, inheritance, variants and deviations |
| Financial retail fit | Uses realistic controls for RAG, AML, KYC, dispute and reporting |
| Lifecycle maturity | Has owner, version, support state, deprecation and migration logic |
| Platform interface | Shows what platform owns and what use case teams remain accountable for |
| Value proof | Measures lead time, defect reduction, evidence reuse and support burden |
18. Final Principle
Reference implementation work is mature when a team can say:
We reused the pattern, inherited the right evidence, localized the right controls, recorded the right deviations, and can prove in production that the reused architecture still fits this workflow.
The senior AI PM, AI Architect, Platform PM and CBAP-level BA should treat pattern reuse as an assurance architecture problem, not only a productivity tactic.