AI Reference Implementation: Pattern Library and Reuse Assurance Architecture

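Important note: This document is learning, portfolio and architecture training material. It does not constitute legal advice, regulatory interpretation, an audit conclusion, an information security certification, a model validation conclusion or production release approval. Formal financial retail projects must be confirmed by authorized business, risk, compliance, privacy, security, model risk, technology and audit roles. Access dates are recorded as of 2026-06-30.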

AI Reference Implementation / Pattern Library / Reuse Assurance Architecture Explained

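Target audience: Senior AI PM / AI Architect / Platform PM / Enterprise Architect / CBAP-level BA / AI Governance Lead / Financial Retail Product and Operations Leader.

Learning objectives: Build an architecture method for turning repeated AI solution patterns into reference implementations and a reusable pattern library, while retaining controls over quality, security, evidence, lifecycle, deviation and business value.

Core question: When an enterprise finds that multiple AI use cases keep rebuilding RAG, copilot, tool gateway, evidence extraction, human review, eval and observability, how can that experience become reusable implementation assets instead of copy-pasted code, copied risk and copied governance blind spots?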

Source Anchors

These sources calibrate the language used for AI risk management, AI management systems, architecture description, platform engineering, engineering performance and observability. This document focuses on reference implementation, pattern assurance and reusable evidence; it does not expand into a complete AI platform service catalog, a golden path catalog or a product line engineering method.

Source	Official link	Ideas adopted in this document
NIST AI Risk Management Framework	https://www.nist.gov/itl/ai-risk-management-framework	Use Govern / Map / Measure / Manage to organize risk classification, control evidence, monitoring and continual improvement for AI patterns
ISO/IEC 42001 AI management system	https://www.iso.org/standard/81230.html	Use the AI management system's policy, objective, operation, performance evaluation, management review and improvement to manage reusable patterns
ISO/IEC/IEEE 42010 Architecture Description	https://www.iso.org/standard/74393.html	Use stakeholder concerns, viewpoints, architecture rationale and architecture description to organize the multi-view description of a pattern
CNCF Platforms White Paper	https://tag-app-delivery.cncf.io/whitepapers/platforms/	Use platform as product, self-service and paved-path thinking to define the interface between platform teams and product teams, without turning this document into a service catalog
DORA	https://dora.dev/	Use the thinking behind deployment frequency, lead time, change fail rate and time to restore to measure how reference implementations affect delivery flow and recovery capability
OpenTelemetry Documentation	https://opentelemetry.io/docs/	Use the thinking behind traces, metrics, logs and semantic conventions to design the observability skeleton and evidence trace of an AI pattern

In one sentence:

A reference implementation is a governed, evidence-backed, runnable embodiment of a reusable AI pattern, designed so teams can reuse quality, security, eval, observability and control evidence without blindly inheriting context-specific assumptions.

1. Executive Summary

Once enterprise AI reaches the scaling stage, similar solution patterns keep recurring:

customer-facing RAG
internal policy copilot
AML investigation workbench
KYC evidence extraction
dispute evidence assistant
regulatory reporting narrative draft

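These use cases have different business goals, but their underlying capabilities repeat heavily:

Controlled knowledge retrieval and citation.
Prompt / policy pack / model route.
Tool gateway and permission boundary.
Human review, override, approval and audit trail.
Eval baseline, regression set, red-team set.
Threat model, privacy classification and logging policy.
OpenTelemetry trace, cost, latency, quality and adoption metrics.
Evidence pack, control mapping, release gate and deviation record.

Low-maturity organizations let every team reimplement these capabilities. The result:

Repeated behavior	Consequence
Every team assembles its own RAG	Inconsistent definitions of source authority, ACL, citation quality and freshness
Every use case writes its own prompt wrapper	Fragmented handling of prompt injection, privilege-escalation prevention, refusal and logging policy
Eval starts from zero each time	No regression memory; old errors keep recurring
Human review exists only in the process diagram	Reviewer capacity, queue, override reason and evidence cannot be audited
Control evidence is collected by hand per project	Slow scale decisions and weak audit reconstructability
Code is copied but constraints are not	Reuse on the surface, amplified risk in practice

The mature approach is not just to build a template library but to build a reference implementation architecture: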

repeated solution pattern
  -> pattern taxonomy
  -> reference implementation
  -> reusable evidence pack
  -> control mapping
  -> reuse qualification
  -> approved variants and deviations
  -> telemetry and ROI monitoring
  -> lifecycle, ownership and deprecation

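The value of a reference implementation is not to make every use case identical. It lets teams start from a validated architecture skeleton, control evidence, eval baseline and operating model, and it makes explicit what must be localized, what can be inherited and what needs deviation approval.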

2. Target Audience and Role Expectations

Role	Questions to master	Typical outputs
Senior AI PM	Which AI solution patterns are worth productizing for reuse, and how to prove value and adoption after reuse	pattern investment thesis, reuse ROI, adoption telemetry, scale memo
AI Architect	How a reference implementation expresses architecture intent, quality attributes, control boundaries and variants	pattern architecture description, reference implementation anatomy, deviation rules
Platform PM	How the platform team turns a reusable skeleton into a consumable asset without becoming an all-encompassing platform catalog	pattern backlog, interface contract, developer experience metrics
CBAP-level BA	How to abstract business processes, rules, exceptions, artifacts and control points into reusable requirement patterns	workflow pattern card, reuse qualification checklist, localization map
Security / Privacy	How threat model, data boundary, logging, tool permissions and evidence are reused	reusable threat model, privacy class map, security control inheritance
Risk / Compliance / Model Risk	Which control evidence can be inherited and which must be re-proven by the use case	control mapping, eval baseline challenge, residual risk decision
Engineering Lead	How to deliver quickly with a reference implementation while complying with versioning, gates and telemetry	implementation fork plan, release evidence, deviation record
Internal Audit	How to trace what evidence a use case inherited, what it changed and who approved the deviations	evidence lineage, pattern version trace, exception expiry report

3. Core Thesis: Reuse Must Include Assurance, Not Only Code

An AI reference implementation is often mistaken for a starter repo. Genuinely mature reuse covers more than code:

code reuse
  + architecture intent reuse
  + control evidence reuse
  + eval baseline reuse
  + threat model reuse
  + observability reuse
  + operating model reuse
  + approved deviation discipline

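If only code is reused, risk is copied along with it:

A customer-facing RAG skeleton is used for regulated advice, but the original threat model only covers internal policy lookup.
A KYC extraction prompt is reused for dispute evidence, but the evidence quality rubric does not cover chargeback reason codes.
A tool gateway example defaults to read-only; a later team adds a write action without redoing approval, idempotency and rollback.
An eval set performs well during the pilot but has no high-risk slice and no regression memory.

The definition of a reference implementation should therefore include:

Dimension	Must include
Runnable skeleton	A minimal runnable implementation, including prompt/RAG/tool/eval/telemetry wiring
Architecture decision	Why it is designed this way, plus its applicability and non-applicability boundaries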
Quality baseline	eval dataset、rubric、threshold、known failure taxonomy
Security baseline	threat model、abuse cases、control checks、logging minimization
Control evidence	gate evidence、review evidence、trace schema、approval path
Variants	允许的参数化变体和需要批准的结构性变体
Lifecycle	owner、version、deprecation、migration、support level
Adoption proof	reuse count、integration lead time、defect reduction、value and risk metrics

4. Conceptual Distinctions

4.1 Pattern vs Template vs Golden Path vs Reference Implementation

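Concept	Definition	Question it answers	What not to confuse
Pattern	A recurring problem-context-solution structure	"Which kinds of AI problem do we keep running into?"	A pattern is not necessarily runnable
Template	A fill-in starting point for documents, code or configuration	"How can teams write less boilerplate?"	A template cannot by itself prove quality or control
Golden path	A low-friction delivery path recommended by the platform	"How can teams go live quickly on standard capabilities?"	A golden path leans toward developer experience and does not equal full assurance
Reference implementation	A runnable, reviewable pattern embodiment with evidence and variant boundaries	"How can teams reuse implementation and evidence, and know when they cannot?"	A reference implementation must state its applicability boundary and evidence inheritance rules
Product line	Product-family engineering centered on variant management and shared assets	"How do we systematically manage multiple product variants?"	This document does not cover full product line engineering

Senior-level framing: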

A pattern says what tends to work.
A template helps you start.
A golden path helps you move fast.
A reference implementation proves how the pattern works under defined controls.

4.2 Design Authority

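A reference implementation needs a design authority; otherwise it degrades into scattered sample code.

Decision type	Design authority responsibility
Pattern admission	Whether the repeated pattern is worth admitting to the library
Baseline controls	Which quality/security/privacy/eval/observability controls are mandatory
Variant approval	Which variants are configurable and which need architecture review
Evidence inheritance	Which evidence can be inherited and which must be regenerated locally
Deprecation	When an old pattern stops accepting new adoption and how adopting teams migrate
Exception	Whether a deviation is acceptable, what the compensating controls are and when it expires

Design authority is not a synonym for a gatekeeping committee. It should make reuse faster, because teams no longer have to re-argue the same architecture questions every time.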

5. Pattern Taxonomy

5.1 AI Reference Implementation Pattern Families

Pattern family	Repeated problem	Reference implementation focus
Customer-facing RAG	Customers ask about policies, fees, accounts and dispute status and need trustworthy citations and safety boundaries	source authority, citation, refusal, handoff, complaint signal
Internal policy copilot	Employees look up policies, SOPs and operating guidance and need permission filtering and version control	policy pack, retrieval ACL, answer boundary, SOP version trace
Evidence extraction	Structured evidence must be extracted from documents, emails, forms and case notes	extraction schema, confidence band, human validation, source span
Investigation workbench	AML, fraud, dispute and complaint cases need summarization, an evidence chain and decision support	case graph, evidence timeline, analyst notes, escalation boundary
Narrative drafting	Reports, explanations, regulatory narratives and customer communication drafts must be generated	grounded draft, maker-checker, source trace, approval workflow
Tool gateway / agent action	AI calls tools to read data or execute actions	tool contract, authorization, idempotency, approval, audit, kill switch
Human review and override	AI output enters a controlled workflow and needs review and override reasons	review queue, override taxonomy, reviewer load, QA sample
Eval and regression	Multiple use cases need quality, risk and regression evaluation	eval harness, golden set, failure taxonomy, threshold and gate
Observability and evidence	Versions, calls, evidence, cost, latency and control events must be traceable	OTel trace skeleton, metric contract, evidence binder

5.2 Financial Retail Pattern Map

Use case	Primary pattern	Secondary patterns
Customer-facing RAG	Customer-facing RAG	escalation, complaint detection, source freshness
Internal policy copilot	Internal policy copilot	policy versioning, access control, adoption telemetry
AML investigation workbench	Investigation workbench	evidence extraction, narrative drafting, human review
KYC evidence extraction	Evidence extraction	document classification, exception workflow, QA sampling
Dispute evidence assistant	Investigation workbench	evidence timeline, card network rule retrieval, customer letter draft
Regulatory reporting narrative draft	Narrative drafting	data lineage, maker-checker, evidence binder

5.3 Pattern Maturity Levels

Level	Name	Evidence standard
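0	Observed repeat	Multiple teams face a similar problem, but there is no standard solution
1	Documented pattern	Has a pattern card, applicability boundary and risk notes
2	Template	Has documents, a sample prompt, a sample schema or starter code
3	Reference implementation	Has a runnable skeleton, eval baseline, threat model, telemetry and evidence pack
4	Assured reusable pattern	Reused by multiple use cases; control evidence is inheritable and deviations are managed
5	Managed lifecycle asset	Has owner, versioning, adoption telemetry, ROI, deprecation and migration

The goal is not to push every pattern to Level 5. Only high-repetition, high-risk or high-value patterns justify investment at reference implementation level.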

6. Reference Implementation Anatomy

A qualified AI reference implementation should consist of 12 assets.

Asset	Content	Reuse value
Pattern card	problem, context, forces, solution, applicability, non-applicability	Prevents misuse by teams
Architecture description	context view, container/component view, data flow, control view, runtime view	Aligns stakeholder concerns
Runnable skeleton	minimal service/app, sample config, local test harness, deployment notes	Shortens delivery lead time
Prompt skeleton	system prompt boundary, task prompt, refusal style, evidence requirement	Reuses safety and quality constraints
RAG skeleton	source registry, chunking policy, ACL filter, citation contract, freshness check	Reuses knowledge governance
Tool gateway skeleton	tool registry, allowlist, auth, approval, idempotency, audit, rollback	Controls agent action risk
Eval baseline	golden set, regression set, red-team set, rubric, thresholds, reviewer guide	Reuses quality and risk memory
Threat model	misuse cases, prompt injection, data exfiltration, privilege escalation, failure mode	Reuses security analysis
Control mapping	NIST/ISO/internal controls to implementation checks and evidence	Reuses governance evidence
Observability skeleton	traces, metrics, logs, release identity, cost, latency, quality, user action	Supports production assurance
Evidence pack	release evidence, eval run, threat model, approvals, deviations, known limitations	Supports scale and audit
Lifecycle metadata	owner, support level, version, compatibility, deprecation, migration	Manages asset health

6.1 Prompt / RAG / Tool Gateway Skeletons

Skeleton	Mandatory design elements
Prompt skeleton	role boundary, allowed tasks, prohibited tasks, source requirement, uncertainty behavior, escalation rule, logging class
RAG skeleton	approved source registry, source owner, freshness SLA, ACL enforcement, retrieval eval, citation verifier, stale-source stop rule
Tool gateway skeleton	tool risk tier, auth context, least privilege, side-effect declaration, approval requirement, idempotency key, audit event, kill switch
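
The tool gateway elements above can be sketched as a single authorization check at call time. This is a minimal sketch, not a reference to any real gateway product: the class names, the `tool:<name>` scope format, the audit event fields and the order of checks are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ToolContract:
    """Hypothetical registry entry; field names are illustrative."""
    name: str
    risk_tier: str            # e.g. "low" or "high"
    has_side_effects: bool    # side-effect declaration
    requires_approval: bool   # approval requirement


@dataclass
class ToolGateway:
    registry: dict[str, ToolContract]
    kill_switch: bool = False
    seen_keys: set[str] = field(default_factory=set)
    audit_log: list[dict] = field(default_factory=list)

    def authorize(self, tool: str, caller_scopes: set[str],
                  idempotency_key: Optional[str] = None,
                  approved_by: Optional[str] = None) -> bool:
        allowed, reason = self._decide(tool, caller_scopes, idempotency_key, approved_by)
        # Every decision, allowed or denied, produces an audit event.
        self.audit_log.append({"tool": tool, "allowed": allowed, "reason": reason})
        return allowed

    def _decide(self, tool, caller_scopes, idempotency_key, approved_by):
        if self.kill_switch:
            return False, "kill switch active"
        contract = self.registry.get(tool)
        if contract is None:
            return False, "tool not in registry"
        if f"tool:{tool}" not in caller_scopes:  # least privilege
            return False, "caller lacks scope"
        if contract.requires_approval and not approved_by:
            return False, "approval required"
        if contract.has_side_effects:
            if not idempotency_key:
                return False, "idempotency key required"
            if idempotency_key in self.seen_keys:
                return False, "duplicate idempotency key"
            self.seen_keys.add(idempotency_key)
        return True, "ok"
```

In this sketch a denied call never consumes its idempotency key, so a request that was rejected for missing approval can be retried with the same key once approval is attached.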

6.2 Eval Baseline

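An eval baseline is not a one-off test result; it is the quality memory of the reference implementation.

Eval asset	Content
Golden set	High-quality samples representative of the target task
High-risk slice	Vulnerable customer, complaint, AML high-risk, KYC exception, regulatory deadline and similar cases
Regression set	Historical defects, near misses, incidents and reviewer disagreements
No-answer set	Scenarios that should be refused, escalated or answered with a request for more information
Prompt injection set	External content that tries to override system policy or leak data
Rubric	correctness, groundedness, completeness, policy fit, tone, escalation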
Threshold	aggregate threshold and critical failure hard stop
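
The threshold row combines two rules: an aggregate score threshold and a hard stop on any critical failure. A minimal sketch of such a gate follows. The `EvalResult` fields, the 0.85 default and the use of a plain mean are assumptions for illustration; a real harness would likely weight suites and slices differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    case_id: str
    suite: str               # e.g. "golden", "high_risk", "no_answer", "prompt_injection"
    score: float             # rubric score normalized to the range 0.0 to 1.0
    critical_failure: bool   # e.g. data leaked or an injected instruction was followed


def release_gate(results: list[EvalResult],
                 aggregate_threshold: float = 0.85) -> tuple[bool, str]:
    """Return (passed, reason). A critical failure blocks release regardless of the mean."""
    if not results:
        return False, "no eval results"
    critical = [r.case_id for r in results if r.critical_failure]
    if critical:
        return False, f"critical failure hard stop: {', '.join(critical)}"
    mean = sum(r.score for r in results) / len(results)
    if mean < aggregate_threshold:
        return False, f"aggregate score {mean:.2f} below threshold {aggregate_threshold:.2f}"
    return True, f"aggregate score {mean:.2f} meets threshold {aggregate_threshold:.2f}"
```

Checking critical failures before the aggregate is deliberate: a single leaked record should not be averaged away by many well-scored golden cases.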

7. Reuse Qualification

The question in reuse qualification is not "can we copy this repo" but:

Can this use case inherit the reference implementation's architecture assumptions, controls and evidence without creating hidden risk?

7.1 Qualification Dimensions

Dimension	Questions
Business workflow fit	Do the target process, user roles, artifacts and decision points match the pattern?
Risk tier	Are customer impact, regulatory impact, financial loss and privacy sensitivity within the reference scope?
Data class	Do PII, financial data, SAR-related data and confidential policy fit the skeleton's data boundary?
Human accountability	Which level does the AI operate at (draft / read-only / recommend / execute), and does it change the accountability boundary?
Source authority	Does each RAG source have an owner, freshness, approval and ACL?
Tool action	Does the use case extend from read-only to write/execute, and does it need new controls?
Eval coverage	Does the baseline eval cover the target case types, languages and edge cases?
Control inheritance	Which controls can be inherited and which must be proven locally?
Operating model	Are the review queue, manager cadence and incident path available?
Telemetry	Can the use case emit the required events and traces?

7.2 Reuse Decision Types

Decision	Meaning	Example
Reuse as-is	Only configuration and data source binding; the architecture boundary does not change	An internal HR policy copilot reuses the internal policy copilot skeleton
Reuse with local controls	The skeleton applies, but local eval, QA or source review must be added	KYC evidence extraction adds a beneficial ownership high-risk slice
Reuse as variant	A structural variant that the design authority must record	Customer-facing RAG adds authenticated account-specific retrieval
Deviation required	Violates a default constraint and needs approval and compensating controls	Tool gateway changes from read-only to a customer-impacting write action
Do not reuse	The context exceeds the reference implementation's assumptions	Using the internal policy copilot skeleton to handle final credit decline explanations
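
The five decision types can be read as an ordered rule set, from most to least restrictive. The sketch below shows one way to encode that. The four boolean inputs and the precedence order are assumptions made for illustration; in practice each flag would be the outcome of reviewing the qualification dimensions in 7.1.

```python
from dataclasses import dataclass
from enum import Enum


class ReuseDecision(Enum):
    REUSE_AS_IS = "reuse as-is"
    REUSE_WITH_LOCAL_CONTROLS = "reuse with local controls"
    REUSE_AS_VARIANT = "reuse as variant"
    DEVIATION_REQUIRED = "deviation required"
    DO_NOT_REUSE = "do not reuse"


@dataclass(frozen=True)
class QualificationInput:
    """Hypothetical summary of a qualification review."""
    within_pattern_assumptions: bool     # context fits the pattern's stated scope
    violates_baseline_constraint: bool   # e.g. read-only tool becomes a write action
    structural_change: bool              # e.g. adds account-specific retrieval
    needs_local_eval_or_review: bool     # e.g. an extra high-risk slice is required


def decide(q: QualificationInput) -> ReuseDecision:
    # The most restrictive applicable outcome wins.
    if not q.within_pattern_assumptions:
        return ReuseDecision.DO_NOT_REUSE
    if q.violates_baseline_constraint:
        return ReuseDecision.DEVIATION_REQUIRED
    if q.structural_change:
        return ReuseDecision.REUSE_AS_VARIANT
    if q.needs_local_eval_or_review:
        return ReuseDecision.REUSE_WITH_LOCAL_CONTROLS
    return ReuseDecision.REUSE_AS_IS
```

For example, `decide(QualificationInput(True, False, False, True))` corresponds to the KYC case in the table, where the skeleton applies but a local high-risk slice is added.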

8. Reusable Evidence Pack

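The evidence pack is one of the most important reusable assets of a reference implementation. It spares teams from explaining the same set of controls from scratch every time.

Evidence object	Inheritable content	Content that must be localized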
Pattern rationale	why this architecture exists	use case business problem and outcome thesis
Architecture views	common components, data flow, control points	actual systems, data stores, identity and workflow integration
Threat model	common threats and mitigations	use-case-specific abuse cases and data exposure
Eval baseline	reusable suites and rubrics	local samples, risk slices, source-specific tests
Control mapping	common control claims and implementation hooks	control owner, residual risk, local evidence
Security tests	prompt injection, access boundary, tool contract checks	local source ACL and tool permissions
Privacy analysis	logging minimization and retention classes	actual data class, region, consent and retention
Observability	trace schema and metric contract	dashboard thresholds and business outcome joins
Release gate	standard evidence checklist	release decision, approvals and exceptions
Known limitations	pattern limits	local constraints and accepted residual risk

Core rules for evidence reuse:

Reusable evidence reduces duplication, not accountability.
Evidence inheritance must be explicit, versioned and reviewable.
A team can inherit the skeleton and still fail local release gates.
A local deviation can invalidate inherited evidence.
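
Two of these rules, versioned inheritance and invalidation by local deviation, lend themselves to a mechanical check. The sketch below assumes each inherited evidence object records the pattern version it was produced against and the baseline requirements it depends on. The class, the field names and the requirement identifiers such as `tool.read_only` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InheritedEvidence:
    evidence_object: str          # e.g. "threat_model"
    pattern_version: str          # pattern version the evidence was produced against
    depends_on: frozenset[str]    # baseline requirements the evidence assumes


def still_valid(evidence: InheritedEvidence,
                adopted_version: str,
                deviated_requirements: set[str]) -> bool:
    """Inherited evidence holds only for the same pattern version and
    only while none of the requirements it assumes has been deviated from."""
    if evidence.pattern_version != adopted_version:
        return False
    return evidence.depends_on.isdisjoint(deviated_requirements)


threat_model = InheritedEvidence("threat_model", "2.1.0", frozenset({"tool.read_only"}))
# A use case that turns the read-only tool into a write action can no longer
# rely on the inherited threat model and must produce local evidence.
needs_local_threat_model = not still_valid(threat_model, "2.1.0", {"tool.read_only"})
```

Exact version matching is the strictest possible policy; a team could reasonably relax it to "same major version", which is why the comparison is isolated in one place.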

9. Control Mapping

Control mapping 把 pattern assurance 变成可管理的治理语言。

Control domain	Reusable control claim	Evidence source
Governance	Pattern has owner, scope, risk tier, review cadence and deprecation path	pattern registry, lifecycle metadata
Architecture	Stakeholder concerns and architecture decisions are documented	architecture description, ADR
Data and privacy	Prompt context, logs and traces minimize sensitive payload	privacy class, DLP sample, trace schema
Security	Prompt injection, unauthorized retrieval and tool abuse are tested	red-team run, access test, tool contract
Model behavior	Outputs meet groundedness, completeness and refusal thresholds	eval run, SME sample, failure taxonomy
Human oversight	Human review and override are observable and capacity-aware	review queue metric, override reason log
Operations	Release, rollback, incident and evidence paths are defined	runbook, release gate, incident drill
Monitoring	Runtime metrics cover quality, safety, cost, latency, adoption and control	OTel dashboard, metric contract
Continual improvement	Defects and deviations feed pattern version updates	defect backlog, version notes, quarterly review

9.1 Mapping to Source Anchors

Anchor language	Reference implementation translation
NIST AI RMF Govern	pattern ownership, design authority, risk roles, evidence review
NIST AI RMF Map	applicability, context, users, workflow, data, risk tier
NIST AI RMF Measure	eval baseline, telemetry, review sampling, control metrics
NIST AI RMF Manage	release gates, deviations, incident response, deprecation
ISO/IEC 42001 operation	pattern operation controls, competence, documented information
ISO/IEC 42001 performance evaluation	reuse telemetry, management review, audit evidence
ISO/IEC/IEEE 42010	views, stakeholders, concerns, rationale and architecture description
CNCF platform thinking	self-service consumption and platform-as-product interface
DORA	delivery flow, change quality, recovery and operational learning
OpenTelemetry	traces, metrics and logs that connect release identity to outcomes

10. Lifecycle and Versioning

AI reference implementations age quickly because models, prompts, sources, tools, policies and regulations change.

10.1 Versioned Artifacts

Artifact	Versioning rule
Reference implementation	semantic version with breaking-change notes
Prompt pack	version per task boundary and policy change
RAG source profile	version per source set, chunking, embedding, reranking and ACL policy
Tool contract	version per API schema, permission, side effect and approval model
Eval suite	version per sample set, rubric, threshold and known failure set
Threat model	version per new tool, channel, data class or abuse case
Control mapping	version per control framework, internal policy or evidence path
Telemetry schema	version per required event field and semantic convention
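
Because each artifact is versioned separately, an adopting team can compare the versions it pinned at adoption with the current ones and see which artifacts changed in a breaking way. The sketch below assumes every artifact uses a plain `MAJOR.MINOR.PATCH` string and that a major bump signals a breaking change that should trigger re-review; the table above only states this for the reference implementation itself, so applying it to all artifacts is an assumption.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string (no pre-release or build tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_breaking_upgrade(adopted: str, target: str) -> bool:
    # Under semantic versioning, a higher major number signals a breaking change.
    return parse_version(target)[0] > parse_version(adopted)[0]


def artifacts_needing_review(adopted: dict[str, str], current: dict[str, str]) -> list[str]:
    """Names of artifacts whose current version is a breaking upgrade over the pinned one."""
    return sorted(
        name
        for name, version in adopted.items()
        if name in current and is_breaking_upgrade(version, current[name])
    )


# Hypothetical adoption record: versions pinned when the use case adopted the pattern.
pinned = {"prompt_pack": "1.4.0", "eval_suite": "2.0.1", "tool_contract": "1.0.0"}
latest = {"prompt_pack": "1.5.2", "eval_suite": "3.0.0", "tool_contract": "2.1.0"}
to_review = artifacts_needing_review(pinned, latest)
```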

10.2 Lifecycle States

State	Meaning	Allowed use
Candidate	pattern observed and under design	discovery and prototype only
Beta reference	runnable skeleton exists, limited evidence	controlled pilots
Approved	eval, threat model, control mapping and evidence pack reviewed	new use cases may adopt
Restricted	known limitation or incident requires narrowed use	only approved scopes
Deprecated	no new adoption; migration path exists	existing consumers migrate
Retired	unsupported and removed from approved library	no production use
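
The "Allowed use" column can be enforced at intake with a small lookup. The sketch below is one reading of the table: the three purpose labels (`"prototype"`, `"pilot"`, `"production"`) are assumed names, the choice to let a beta reference also serve prototypes is an assumption, and a retired pattern is treated as not usable at all, which is stricter than the table's "no production use".

```python
from enum import Enum


class LifecycleState(Enum):
    CANDIDATE = "candidate"
    BETA_REFERENCE = "beta reference"
    APPROVED = "approved"
    RESTRICTED = "restricted"
    DEPRECATED = "deprecated"
    RETIRED = "retired"


def may_use(state: LifecycleState, purpose: str, *,
            scope_approved: bool = False,
            existing_consumer: bool = False) -> bool:
    """purpose is one of "prototype", "pilot" or "production" (assumed labels)."""
    if state is LifecycleState.CANDIDATE:
        return purpose == "prototype"
    if state is LifecycleState.BETA_REFERENCE:
        return purpose in {"prototype", "pilot"}
    if state is LifecycleState.APPROVED:
        return True
    if state is LifecycleState.RESTRICTED:
        return scope_approved          # only approved scopes
    if state is LifecycleState.DEPRECATED:
        return existing_consumer       # no new adoption; existing consumers migrate
    return False                       # RETIRED: conservative, no use at all
```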

10.3 Deprecation Triggers

Model or vendor change invalidates baseline eval.
Security incident reveals structural weakness.
New policy or regulation changes acceptable use.
Pattern is superseded by a safer or cheaper implementation.
Adoption telemetry shows low reuse or high deviation burden.
Support team cannot maintain controls or evidence freshness.

11. Deviation Process

Deviation is not failure. Hidden deviation is failure.

11.1 Deviation Types

Deviation type	Example	Required review
Configuration deviation	Different source freshness SLA	pattern owner review
Data deviation	New sensitive data class in prompt context	privacy and security review
Risk-tier deviation	Internal assistant becomes customer-facing	risk, compliance and architecture review
Tool-action deviation	Read-only tool becomes write action	security, architecture, business owner approval
Eval deviation	Local use case lacks required high-risk slice	eval owner and risk review
Observability deviation	Cannot emit required trace fields	platform and audit evidence review
Control deviation	Human review sampling reduced below baseline	risk owner decision with expiry

11.2 Approved Deviation Record

Field	Required content
Deviation ID	stable identifier tied to pattern version and use case
Baseline requirement	reference implementation rule being changed
Business reason	why standard approach does not fit
Risk analysis	impact, likelihood, affected controls, affected users
Compensating controls	what replaces or reduces the lost control
Evidence required	eval, test, review, monitoring and approval evidence
Expiry	date or trigger when deviation must be renewed or removed
Owner	residual risk owner and implementation owner
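
The record fields above map naturally onto a typed structure with a basic completeness and expiry check. This is a sketch under assumptions: the field names are illustrative, and expiry is modeled only as a calendar date even though the table also allows a trigger.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DeviationRecord:
    deviation_id: str
    pattern_version: str
    use_case: str
    baseline_requirement: str
    business_reason: str
    risk_analysis: str
    compensating_controls: tuple[str, ...]
    evidence_required: tuple[str, ...]
    expiry: date                    # assumption: date-based expiry only
    residual_risk_owner: str
    implementation_owner: str

    def problems(self, today: date) -> list[str]:
        """Return reasons the record should not be treated as an active approval."""
        issues = []
        if not self.compensating_controls:
            issues.append("no compensating controls")
        if not self.evidence_required:
            issues.append("no evidence required")
        if today >= self.expiry:
            issues.append("expired: renew or remove")
        return issues


# Hypothetical example values for illustration only.
record = DeviationRecord(
    deviation_id="DEV-001",
    pattern_version="2.1.0",
    use_case="dispute-evidence-assistant",
    baseline_requirement="tool gateway is read-only",
    business_reason="automated chargeback submission",
    risk_analysis="customer-impacting write action",
    compensating_controls=("maker-checker approval", "idempotency key"),
    evidence_required=("tool contract test", "approval log"),
    expiry=date(2026, 12, 31),
    residual_risk_owner="dispute operations head",
    implementation_owner="engineering lead",
)
```

Running `record.problems(today)` in a scheduled job is one simple way to keep expired deviations from silently becoming permanent.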

12. Adoption Telemetry and Reuse ROI

Pattern library success cannot be measured by number of documents published.

12.1 Reuse Telemetry

Metric	Meaning
qualified reuse count	use cases that passed reuse qualification and adopted the pattern
lead time reduction	time from intake to first controlled pilot compared with non-reference approach
evidence inheritance rate	percentage of evidence objects reused without rework
deviation rate	how often teams need approved deviations
defect escape rate	defects found after release by pattern and version
eval regression reuse	number of incidents converted into shared regression cases
support burden	questions, implementation issues and break-fix effort per adoption
deprecation compliance	consumers migrated before retirement date
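
Three of these metrics can be derived directly from adoption records. The sketch below assumes a hypothetical `Adoption` record per use case and defines deviation rate as the share of qualified adoptions with at least one approved deviation; the table does not fix that definition, so treat it as one possible choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adoption:
    use_case: str
    qualified: bool                 # passed reuse qualification
    evidence_objects_total: int
    evidence_objects_reused: int    # reused without rework
    approved_deviations: int


def reuse_telemetry(adoptions: list[Adoption]) -> dict[str, float]:
    qualified = [a for a in adoptions if a.qualified]
    count = len(qualified)
    total = sum(a.evidence_objects_total for a in qualified)
    reused = sum(a.evidence_objects_reused for a in qualified)
    with_deviation = sum(1 for a in qualified if a.approved_deviations > 0)
    return {
        "qualified_reuse_count": count,
        "evidence_inheritance_rate": reused / total if total else 0.0,
        "deviation_rate": with_deviation / count if count else 0.0,
    }
```

Unqualified adoptions are excluded from every metric on purpose, so silent forks do not inflate the reuse numbers.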

12.2 Reuse ROI

Reusable pattern ROI should combine speed, quality and risk:

Reuse ROI
= delivery lead time avoided
+ assurance effort avoided
+ defect and incident reduction
+ platform operating cost efficiency
+ audit and review effort reduction
- pattern maintenance cost
- deviation management cost
- migration and deprecation cost
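
The formula is a straightforward sum of five benefit terms minus three cost terms. A minimal sketch follows; all terms are assumed to be expressed in the same unit (for example person-days or a currency amount per year), and the sample figures are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReuseRoi:
    # Benefit terms
    delivery_lead_time_avoided: float
    assurance_effort_avoided: float
    defect_and_incident_reduction: float
    platform_operating_cost_efficiency: float
    audit_and_review_effort_reduction: float
    # Cost terms
    pattern_maintenance_cost: float
    deviation_management_cost: float
    migration_and_deprecation_cost: float

    def net(self) -> float:
        benefits = (
            self.delivery_lead_time_avoided
            + self.assurance_effort_avoided
            + self.defect_and_incident_reduction
            + self.platform_operating_cost_efficiency
            + self.audit_and_review_effort_reduction
        )
        costs = (
            self.pattern_maintenance_cost
            + self.deviation_management_cost
            + self.migration_and_deprecation_cost
        )
        return benefits - costs


# Made-up figures in person-days per year: benefits total 290, costs total 100.
example = ReuseRoi(120, 80, 40, 20, 30, 60, 25, 15)
net_value = example.net()  # 190
```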

Senior PM framing:

The business case for a reference implementation is not "developers save time"; it is "the organization learns once, controls once where appropriate, and reuses evidence without losing accountability."

13. Platform / Team Interface

Reference implementation sits between platform and product teams.

Responsibility	Platform team	Use case team
Skeleton	Build and maintain common implementation	Configure and integrate into workflow
Pattern card	Maintain approved pattern definition	Confirm fit and local assumptions
Eval harness	Provide reusable suites and runner	Add local samples and sign off thresholds
Threat model	Provide common threat model	Extend for local data, channel and tool actions
Observability	Provide trace/metric schema and dashboards	Emit required events and connect business outcomes
Controls	Provide control hooks and evidence templates	Produce local evidence and obtain approvals
Deviation	Review and record deviations	Request, justify, monitor and close deviations
Lifecycle	Version, support, deprecate, migrate	Track adoption version and migrate on schedule

13.1 Interface Contracts

Interface	Contract
Pattern registry	pattern id, owner, maturity, versions, approved scopes, support level
Implementation package	repo/module/image, config schema, integration guide, test harness
Evidence API	standard locations for eval run, threat model, approval, exception and release artifacts
Telemetry contract	required traces, metrics, logs, semantic fields and privacy class
Deviation workflow	request fields, approvers, expiry, evidence and status
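
A telemetry contract is easiest to enforce when it can be checked mechanically against each emitted event. The sketch below validates a plain dictionary. The required field names and the privacy class values are hypothetical, chosen only to illustrate "semantic fields and privacy class"; they are not OpenTelemetry semantic conventions.

```python
# Hypothetical contract: every event must carry these fields.
REQUIRED_FIELDS = frozenset(
    {"pattern_id", "pattern_version", "use_case_id", "release_id", "privacy_class"}
)
# Hypothetical privacy classes.
PRIVACY_CLASSES = frozenset({"public", "internal", "confidential", "restricted"})


def contract_violations(event: dict) -> list[str]:
    """Return a sorted list of missing fields plus any unknown privacy class."""
    problems = [f"missing field: {name}" for name in sorted(REQUIRED_FIELDS - set(event))]
    privacy_class = event.get("privacy_class")
    if privacy_class is not None and privacy_class not in PRIVACY_CLASSES:
        problems.append(f"unknown privacy class: {privacy_class}")
    return problems


event = {
    "pattern_id": "customer-facing-rag",
    "pattern_version": "2.1.0",
    "use_case_id": "fee-policy-assistant",
    "privacy_class": "internal",
}
violations = contract_violations(event)  # release_id is missing
```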

14. Financial Retail Examples

14.1 Customer-Facing RAG

Reference implementation element	Example
Pattern	Answer customer questions with approved sources and safe handoff
Mandatory controls	citation, source freshness, complaint detection, vulnerable customer escalation
Eval baseline	fee policy, dispute status, account servicing, ambiguous customer intent
Threat model	prompt injection in customer text, stale policy, unsupported commitment
Deviation example	adding account-specific retrieval requires stronger identity, entitlement and audit

14.2 Internal Policy Copilot

Reference implementation element	Example
Pattern	Employees ask policy and SOP questions inside authenticated workflow
Mandatory controls	role-based retrieval, policy version trace, no customer advice beyond employee authority
Eval baseline	branch procedure, contact-center policy, KYC exception, escalation rules
Reuse evidence	source registry, prompt boundary, retrieval ACL test, answer citation audit
Deviation example	using copilot output directly in customer letter requires narrative drafting controls

14.3 AML Investigation Workbench

Reference implementation element	Example
Pattern	Assist analysts with case summary, evidence timeline and narrative draft
Mandatory controls	analyst-owned disposition, source span, high-risk escalation, QA sampling
Eval baseline	transaction monitoring alerts, adverse media, structuring indicators, missing evidence
Threat model	omitted critical evidence, false comfort, SAR-related data leakage
Deviation example	automatic closure recommendation is outside baseline and needs separate approval

14.4 KYC Evidence Extraction

Reference implementation element	Example
Pattern	Extract beneficial ownership, document type, expiry and missing evidence
Mandatory controls	source span, confidence band, human validation, exception queue
Eval baseline	individual, SMB, trust, non-resident, high-risk country cases
Reuse evidence	extraction schema, document classifier tests, validation UI trace
Deviation example	straight-through KYC decision cannot inherit extraction-only evidence

14.5 Dispute Evidence Assistant

Reference implementation element	Example
Pattern	Build dispute evidence timeline and draft case packet
Mandatory controls	card network rule source, customer communication boundary, maker-checker
Eval baseline	fraud claim, merchant dispute, recurring charge, provisional credit
Threat model	wrong reason code, missing timeline evidence, unsupported customer denial
Deviation example	automated chargeback submission requires tool-action controls

14.6 Regulatory Reporting Narrative Draft

Reference implementation element	Example
Pattern	Draft variance explanation or management narrative from approved data lineage
Mandatory controls	data lineage, maker-checker, metric contract, attestation boundary
Eval baseline	variance explanation, source-to-report mapping, late adjustment, restatement scenario
Threat model	unsupported explanation, stale metric, hallucinated cause, audit trail gap
Deviation example	external filing language requires compliance and authorized signer review

15. Anti-Patterns

Anti-pattern	Why it fails	Mature replacement
Starter repo called reference implementation	It has code but no evidence, boundaries or lifecycle	Runnable skeleton plus eval, threat model, controls and telemetry
Pattern library as wiki	Documents are not consumable or testable	Pattern registry with maturity, owner, versions and adoption telemetry
Copying controls without context	Local risk may exceed baseline assumptions	Reuse qualification and local evidence requirements
Golden path replaces architecture judgment	Fast path may not fit high-risk use cases	Design authority and deviation process
Eval baseline never changes	New incidents are not learned from	Regression memory and versioned eval suites
Every deviation approved forever	Temporary exceptions become architecture drift	Expiry, compensating controls and quarterly review
Reuse measured by downloads	Downloads do not prove governed adoption	Qualified reuse, evidence inheritance and defect reduction
Platform owns everything	Use case teams avoid local accountability	Shared responsibility model
Local teams fork silently	Pattern assurance becomes untraceable	Versioned adoption record and supported variant model
No deprecation discipline	Old unsafe patterns remain in production	Lifecycle state, migration guide and retirement date

16. Interview Answers

Q1: What is the difference between a template and a reference implementation?

A template is a starting point. A reference implementation is a governed, runnable embodiment of a pattern. It includes architecture rationale, quality baseline, threat model, control mapping, observability, evidence pack, versioning and deviation rules. In AI, that distinction matters because copying code does not copy safety, eval coverage or accountability.

Q2: How would you build an AI pattern library without creating shelfware?

I would start from repeated production demand, not abstract taxonomy. For each candidate pattern, I would define a pattern card, reference implementation, eval baseline, threat model, control mapping, telemetry contract and reuse qualification checklist. I would measure qualified reuse, lead time reduction, evidence inheritance, deviation rate, defect reduction and support burden. If a pattern is not reused or creates too many deviations, it should be redesigned or deprecated.

Q3: How do you decide whether a new use case can reuse an existing reference implementation?

I would run reuse qualification across workflow fit, risk tier, data class, human accountability, source authority, tool action, eval coverage, control inheritance, operating model and telemetry. If the use case stays inside the reference assumptions, it can reuse most assets. If it changes data sensitivity, customer impact or tool actions, it needs local controls or approved deviation. Some use cases should not reuse the pattern at all.

Q4: How can reusable evidence reduce governance effort without weakening control?

Reusable evidence reduces repeated explanation of common controls, such as RAG citation checks, prompt injection tests, tool gateway audit fields or OTel trace schema. It does not remove local accountability. The use case still needs to prove its own data sources, workflows, risk tier, eval samples, approvals and residual risk. The key is explicit evidence inheritance: what is reused, what is local, what version, and who accepted it.

Q5: How do you handle variants and deviations?

I separate configured variants from deviations. A configured variant is expected, such as different source registry or workflow labels. A deviation changes a baseline assumption, such as making an internal copilot customer-facing or adding write actions. Deviations require a record with business reason, risk analysis, compensating controls, evidence, owner and expiry. Hidden deviation is architecture drift.

Q6: What would you include in a reference implementation for a customer-facing RAG assistant?

I would include approved source registry, ACL and freshness checks, citation contract, refusal and handoff rules, prompt injection tests, groundedness eval, high-risk customer scenarios, complaint escalation, OTel trace schema, cost and latency metrics, human review sampling, release gate, evidence pack and deprecation rules. I would also define when the skeleton must not be reused, such as final adverse decisions or regulated advice without separate controls.

17. Portfolio Exercise

Scenario

A financial retail institution has 6 AI initiatives:

Use case	Current problem
Customer-facing RAG	Multiple teams build separate policy-answer bots
Internal policy copilot	Branch and contact-center teams use different policy search approaches
AML investigation workbench	Analysts need faster evidence timeline and narrative preparation
KYC evidence extraction	Onboarding teams repeat document extraction logic
Dispute evidence assistant	Dispute operations need reusable evidence packet drafting
Regulatory reporting narrative draft	Finance and risk teams need controlled narrative drafting from approved metrics

Required Artifacts

Pattern taxonomy with at least 8 pattern families.
Reference implementation anatomy for two patterns.
Reuse qualification checklist for one proposed adoption.
Evidence pack showing inherited versus local evidence.
Control mapping to NIST AI RMF, ISO/IEC 42001, architecture description and observability concepts.
Eval baseline with golden, high-risk, no-answer and regression sets.
Threat model reuse plan, including local extension points.
Variant and deviation process with an example approved deviation.
Lifecycle/versioning model with deprecation triggers.
Adoption telemetry and reuse ROI model.
Platform/team interface contract.
Scale memo recommending which two patterns should become Level 4 assured reusable patterns first.

Evaluation Rubric

Criterion	Strong evidence
Pattern clarity	Separates pattern, template, golden path and reference implementation
Assurance depth	Includes eval, threat model, control mapping, observability and evidence
Reuse discipline	Defines qualification, inheritance, variants and deviations
Financial retail fit	Uses realistic controls for RAG, AML, KYC, dispute and reporting
Lifecycle maturity	Has owner, version, support state, deprecation and migration logic
Platform interface	Shows what platform owns and what use case teams remain accountable for
Value proof	Measures lead time, defect reduction, evidence reuse and support burden

18. Final Principle

Reference implementation work is mature when a team can say:

We reused the pattern, inherited the right evidence, localized the right controls, recorded the right deviations, and can prove in production that the reused architecture still fits this workflow.

The senior AI PM, AI Architect, Platform PM and CBAP-level BA should treat pattern reuse as an assurance architecture problem, not only a productivity tactic.