AI Reference Implementation: Pattern Library and Reuse Assurance Architecture

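Important note: This document is learning, portfolio and architecture-training material. It does not constitute legal advice, regulatory interpretation, an audit conclusion, an information security certification, a model validation conclusion or production release approval. Formal financial retail projects must be confirmed by authorized business, risk, compliance, privacy, security, model risk, technology and audit roles. Source access dates are recorded as 2026-06-30.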

AI Reference Implementation / Pattern Library / Reuse Assurance Architecture: A Guided Reading

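Target audience: Senior AI PM / AI Architect / Platform PM / Enterprise Architect / CBAP-level BA / AI Governance Lead / Financial Retail Product and Operations Leader.

Learning objectives: Build an architecture method for turning recurring AI solution patterns into reference implementations and a reusable pattern library, while retaining control over quality, security, evidence, lifecycle, deviations and business value.

Core question: When an enterprise finds that many AI use cases keep rebuilding RAG, copilot, tool gateway, evidence extraction, human review, eval and observability, how does it turn that experience into reusable implementation assets rather than copy-pasted code, copied risk and copied governance blind spots?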


Source Anchors

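These sources calibrate the language used for AI risk management, AI management systems, architecture description, platform engineering, engineering performance and observability. This document focuses on reference implementation, pattern assurance and reusable evidence; it does not expand into a full AI platform service catalog, a golden path directory or a product line engineering method.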

| Source | Official link | Idea adopted in this document |
| --- | --- | --- |
| NIST AI Risk Management Framework | https://www.nist.gov/itl/ai-risk-management-framework | Use Govern / Map / Measure / Manage to organize risk classification, control evidence, monitoring and continual improvement for AI patterns |
| ISO/IEC 42001 AI management system | https://www.iso.org/standard/81230.html | Use the AI management system's policy, objective, operation, performance evaluation, management review and improvement to manage reusable patterns |
| ISO/IEC/IEEE 42010 Architecture Description | https://www.iso.org/standard/74393.html | Use stakeholder concerns, viewpoints, architecture rationale and architecture description to organize a pattern's multi-view description |
| CNCF Platforms White Paper | https://tag-app-delivery.cncf.io/whitepapers/platforms/ | Use platform-as-product, self-service and paved-path thinking to define the interface between platform and product teams, without turning this document into a service catalog |
| DORA | https://dora.dev/ | Use the ideas behind deployment frequency, lead time, change fail rate and time to restore to measure a reference implementation's effect on delivery flow and recovery capability |
| OpenTelemetry Documentation | https://opentelemetry.io/docs/ | Use traces, metrics, logs and semantic conventions to design the observability skeleton and evidence trace for AI patterns |

In one sentence:

A reference implementation is a governed, evidence-backed, runnable embodiment of a reusable AI pattern, designed so teams can reuse quality, security, eval, observability and control evidence without blindly inheriting context-specific assumptions.

1. Executive Summary

Once enterprise AI reaches scale, similar solution patterns keep recurring:

customer-facing RAG
internal policy copilot
AML investigation workbench
KYC evidence extraction
dispute evidence assistant
regulatory reporting narrative draft

These use cases have different business goals, but their underlying capabilities overlap heavily:

  • Controlled knowledge retrieval and citation.
  • Prompt / policy pack / model route.
  • Tool gateway and permission boundaries.
  • Human review, override, approval and audit trail.
  • Eval baseline, regression set, red-team set.
  • Threat model, privacy classification and logging policy.
  • OpenTelemetry trace, cost, latency, quality and adoption metrics.
  • Evidence pack, control mapping, release gate and deviation record.

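Low-maturity organizations let every team re-implement these capabilities. The result:

| Repeated behavior | Consequence |
| --- | --- |
| Every team assembles its own RAG | Source authority, ACL, citation quality and freshness are defined inconsistently |
| Every use case writes its own prompt wrapper | Prompt injection defense, privilege-escalation prevention, refusal behavior and logging policy fragment |
| Eval starts from zero every time | There is no regression memory, so old errors recur |
| Human review exists only in the process diagram | Reviewer capacity, queue, override reason and evidence cannot be audited |
| Control evidence is collected manually per project | Scale decisions are slow and audit reconstructability is weak |
| Code is copied but constraints are not | Reuse is superficial and the actual risk is amplified |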

The mature approach is not just to build a template library, but to establish a reference implementation architecture:

repeated solution pattern
  -> pattern taxonomy
  -> reference implementation
  -> reusable evidence pack
  -> control mapping
  -> reuse qualification
  -> approved variants and deviations
  -> telemetry and ROI monitoring
  -> lifecycle, ownership and deprecation

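The value of a reference implementation is not to make every use case identical. It lets teams start from a validated architecture skeleton, control evidence, eval baseline and operating model, and it makes explicit what must be localized, what can be inherited and what needs deviation approval.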


2. Target Audience and Role Expectations

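| Role | Questions to master | Typical outputs |
| --- | --- | --- |
| Senior AI PM | Which AI solution patterns are worth productizing for reuse, and how to prove value and adoption after reuse | pattern investment thesis, reuse ROI, adoption telemetry, scale memo |
| AI Architect | How a reference implementation expresses architecture intent, quality attributes, control boundaries and variants | pattern architecture description, reference implementation anatomy, deviation rules |
| Platform PM | How the platform team turns a reusable skeleton into a consumable asset without becoming an all-encompassing platform catalog | pattern backlog, interface contract, developer experience metrics |
| CBAP-level BA | How to abstract business processes, rules, exceptions, artifacts and control points into reusable requirement patterns | workflow pattern card, reuse qualification checklist, localization map |
| Security / Privacy | How threat model, data boundary, logging, tool permissions and evidence are reused | reusable threat model, privacy class map, security control inheritance |
| Risk / Compliance / Model Risk | Which control evidence can be inherited and which must be re-proven by the use case | control mapping, eval baseline challenge, residual risk decision |
| Engineering Lead | How to deliver quickly from a reference implementation while honoring versions, gates and telemetry | implementation fork plan, release evidence, deviation record |
| Internal Audit | How to trace which evidence a use case inherited, what it changed and who approved the deviation | evidence lineage, pattern version trace, exception expiry report |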

3. Core Thesis: Reuse Must Include Assurance, Not Only Code

An AI reference implementation is often mistaken for a starter repo. Mature reuse covers more than code:

code reuse
  + architecture intent reuse
  + control evidence reuse
  + eval baseline reuse
  + threat model reuse
  + observability reuse
  + operating model reuse
  + approved deviation discipline

If only the code is reused, the risk is copied along with it:

  • A customer-facing RAG skeleton is used for regulated advice, but the original threat model covers only internal policy lookup.
  • A KYC extraction prompt is reused for dispute evidence, but the evidence quality rubric does not cover chargeback reason codes.
  • A tool gateway example defaults to read-only, and a later team adds a write action without redoing approval, idempotency and rollback.
  • An eval set performs well during the pilot but has no high-risk slice and no regression memory.

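The definition of a reference implementation should therefore include:

| Dimension | Must include |
| --- | --- |
| Runnable skeleton | A minimal runnable implementation, including prompt/RAG/tool/eval/telemetry wiring |
| Architecture decision | Why it is designed this way, with applicability and non-applicability boundaries |
| Quality baseline | eval dataset, rubric, threshold, known failure taxonomy |
| Security baseline | threat model, abuse cases, control checks, logging minimization |
| Control evidence | gate evidence, review evidence, trace schema, approval path |
| Variants | Permitted parameterized variants and structural variants that need approval |
| Lifecycle | owner, version, deprecation, migration, support level |
| Adoption proof | reuse count, integration lead time, defect reduction, value and risk metrics |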

4. Conceptual Distinctions

4.1 Pattern vs Template vs Golden Path vs Reference Implementation

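| Concept | Definition | Question it answers | Do not confuse |
| --- | --- | --- | --- |
| Pattern | A recurring problem-context-solution structure | "Which kind of AI problem do we keep running into?" | A pattern is not necessarily runnable |
| Template | A fill-in starting point for documents, code or configuration | "How can teams write less boilerplate?" | A template does not by itself prove quality or control |
| Golden path | A low-friction delivery path recommended by the platform | "How can teams ship quickly on standard capabilities?" | A golden path leans toward developer experience and is not full assurance |
| Reference implementation | A runnable, reviewable pattern embodiment with evidence and variant boundaries | "How can teams reuse implementation and evidence, and know when they cannot?" | A reference implementation must state applicability boundaries and evidence inheritance rules |
| Product line | Product-family engineering centered on variant management and shared assets | "How do we manage multiple product variants systematically?" | This document does not cover full product line engineering |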

Senior framing:

A pattern says what tends to work.
A template helps you start.
A golden path helps you move fast.
A reference implementation proves how the pattern works under defined controls.

4.2 Design Authority

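A reference implementation needs a design authority; otherwise it degrades into scattered sample code.

| Decision type | Design authority responsibility |
| --- | --- |
| Pattern admission | Whether the recurring pattern is worth admitting to the library |
| Baseline controls | Which quality/security/privacy/eval/observability controls are mandatory |
| Variant approval | Which variants are configurable and which need architecture review |
| Evidence inheritance | Which evidence can be inherited and which must be regenerated locally |
| Deprecation | When an old pattern stops accepting new adoption and how adopting teams migrate |
| Exception | Whether a deviation is acceptable, what the compensating controls are and when it expires |

Design authority is not a synonym for a gatekeeping committee. It should make reuse faster, because teams no longer re-argue the same set of architecture questions each time.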


5. Pattern Taxonomy

5.1 AI Reference Implementation Pattern Families

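| Pattern family | Repeated problem | Reference implementation focus |
| --- | --- | --- |
| Customer-facing RAG | Customers ask about policy, fees, accounts and dispute status, and need trusted citations and safety boundaries | source authority, citation, refusal, handoff, complaint signal |
| Internal policy copilot | Employees look up policy, SOPs and operating guidance, and need permission filtering and version control | policy pack, retrieval ACL, answer boundary, SOP version trace |
| Evidence extraction | Structured evidence is extracted from documents, emails, forms and case notes | extraction schema, confidence band, human validation, source span |
| Investigation workbench | Analysts work AML, fraud, dispute and complaint cases, and need summaries, evidence chains and decision support | case graph, evidence timeline, analyst notes, escalation boundary |
| Narrative drafting | Drafts of reports, explanations, regulatory narratives and customer communications are generated | grounded draft, maker-checker, source trace, approval workflow |
| Tool gateway / agent action | AI calls tools to read data or execute actions | tool contract, authorization, idempotency, approval, audit, kill switch |
| Human review and override | AI output enters a controlled workflow and needs review and recorded override reasons | review queue, override taxonomy, reviewer load, QA sample |
| Eval and regression | Multiple use cases need quality, risk and regression evaluation | eval harness, golden set, failure taxonomy, threshold and gate |
| Observability and evidence | Versions, calls, evidence, cost, latency and control events must be traceable | OTel trace skeleton, metric contract, evidence binder |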

5.2 Financial Retail Pattern Map

| Use case | Primary pattern | Secondary patterns |
| --- | --- | --- |
| Customer-facing RAG | Customer-facing RAG | escalation, complaint detection, source freshness |
| Internal policy copilot | Internal policy copilot | policy versioning, access control, adoption telemetry |
| AML investigation workbench | Investigation workbench | evidence extraction, narrative drafting, human review |
| KYC evidence extraction | Evidence extraction | document classification, exception workflow, QA sampling |
| Dispute evidence assistant | Investigation workbench | evidence timeline, card network rule retrieval, customer letter draft |
| Regulatory reporting narrative draft | Narrative drafting | data lineage, maker-checker, evidence binder |

5.3 Pattern Maturity Levels

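| Level | Name | Evidence standard |
| --- | --- | --- |
| 0 | Observed repeat | Multiple teams face a similar problem, but there is no standard solution |
| 1 | Documented pattern | A pattern card, applicability boundaries and risk notes exist |
| 2 | Template | Documents, a sample prompt, a sample schema or starter code exist |
| 3 | Reference implementation | A runnable skeleton, eval baseline, threat model, telemetry and evidence pack exist |
| 4 | Assured reusable pattern | Reused by multiple use cases; control evidence is inheritable; deviations are managed |
| 5 | Managed lifecycle asset | Has an owner, versioning, adoption telemetry, ROI, deprecation and migration |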

The goal is not to bring every pattern to Level 5. Only high-repetition, high-risk or high-value patterns justify investment at the reference implementation level.
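
The maturity ladder above can be read as a cumulative check: a pattern sits at the highest level whose evidence standard, and every standard below it, is met. The following is a minimal sketch under that assumption. The `PatternEvidence` field names are hypothetical, and reading "multiple use cases" as at least two qualified reuses is an illustrative choice, not something the levels specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternEvidence:
    """Hypothetical evidence flags for one pattern (illustrative names)."""
    repeat_observed: bool = False
    pattern_card: bool = False
    starter_assets: bool = False           # sample prompt, schema or starter code
    runnable_with_assurance: bool = False  # skeleton + eval + threat model + telemetry + evidence pack
    qualified_reuses: int = 0
    deviations_managed: bool = False
    lifecycle_managed: bool = False        # owner, versioning, telemetry, ROI, deprecation, migration


def maturity_level(e: PatternEvidence) -> int:
    """Return the highest level met cumulatively, or -1 if no repeat was observed."""
    checks = [
        e.repeat_observed,                                 # Level 0
        e.pattern_card,                                    # Level 1
        e.starter_assets,                                  # Level 2
        e.runnable_with_assurance,                         # Level 3
        e.qualified_reuses >= 2 and e.deviations_managed,  # Level 4 (assumed: "multiple" means >= 2)
        e.lifecycle_managed,                               # Level 5
    ]
    level = -1
    for index, satisfied in enumerate(checks):
        if not satisfied:
            break
        level = index
    return level
```

A pattern with full lifecycle metadata but only one qualified reuse stays at Level 3 under this reading, which keeps ownership paperwork from masking missing reuse evidence.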


6. Reference Implementation Anatomy

A qualified AI reference implementation should consist of 12 assets.

| Asset | Content | Reuse value |
| --- | --- | --- |
| Pattern card | problem, context, forces, solution, applicability, non-applicability | Prevents misuse by teams |
| Architecture description | context view, container/component view, data flow, control view, runtime view | Aligns stakeholder concerns |
| Runnable skeleton | minimal service/app, sample config, local test harness, deployment notes | Shortens delivery lead time |
| Prompt skeleton | system prompt boundary, task prompt, refusal style, evidence requirement | Reuses safety and quality constraints |
| RAG skeleton | source registry, chunking policy, ACL filter, citation contract, freshness check | Reuses knowledge governance |
| Tool gateway skeleton | tool registry, allowlist, auth, approval, idempotency, audit, rollback | Controls agent action risk |
| Eval baseline | golden set, regression set, red-team set, rubric, thresholds, reviewer guide | Reuses quality and risk memory |
| Threat model | misuse cases, prompt injection, data exfiltration, privilege escalation, failure mode | Reuses security analysis |
| Control mapping | NIST/ISO/internal controls to implementation checks and evidence | Reuses governance evidence |
| Observability skeleton | traces, metrics, logs, release identity, cost, latency, quality, user action | Supports production assurance |
| Evidence pack | release evidence, eval run, threat model, approvals, deviations, known limitations | Supports scale and audit |
| Lifecycle metadata | owner, support level, version, compatibility, deprecation, migration | Manages asset health |
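
One way to make the 12-asset anatomy testable is a manifest check that reports which assets a candidate is still missing. This is a minimal sketch: the snake_case keys and the idea of a manifest mapping each asset to a location string are assumptions made for illustration.

```python
# Hypothetical manifest keys, one per asset in the anatomy table above.
REQUIRED_ASSETS = (
    "pattern_card", "architecture_description", "runnable_skeleton",
    "prompt_skeleton", "rag_skeleton", "tool_gateway_skeleton",
    "eval_baseline", "threat_model", "control_mapping",
    "observability_skeleton", "evidence_pack", "lifecycle_metadata",
)


def missing_assets(manifest: dict[str, str]) -> list[str]:
    """Return required assets that are absent or point to an empty location."""
    return [name for name in REQUIRED_ASSETS if not manifest.get(name, "").strip()]
```

A registry could refuse to list a pattern at Level 3 while `missing_assets` returns a non-empty list.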

6.1 Prompt / RAG / Tool Gateway Skeletons

| Skeleton | Mandatory design elements |
| --- | --- |
| Prompt skeleton | role boundary, allowed tasks, prohibited tasks, source requirement, uncertainty behavior, escalation rule, logging class |
| RAG skeleton | approved source registry, source owner, freshness SLA, ACL enforcement, retrieval eval, citation verifier, stale-source stop rule |
| Tool gateway skeleton | tool risk tier, auth context, least privilege, side-effect declaration, approval requirement, idempotency key, audit event, kill switch |
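
To show how several tool gateway elements interact (allowlist, side-effect declaration, approval requirement, idempotency key, audit event, kill switch), here is a minimal in-memory sketch. `ToolContract`, `ToolGateway` and the decision strings are hypothetical names. Auth context, risk tiers and least privilege are deliberately left out, and a real gateway would persist keys and audit events.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolContract:
    name: str
    has_side_effects: bool   # side-effect declaration
    requires_approval: bool  # approval requirement


@dataclass
class ToolGateway:
    registry: dict[str, ToolContract]  # allowlist: only registered tools may run
    kill_switch: bool = False
    seen_keys: set[str] = field(default_factory=set)
    audit_log: list[dict] = field(default_factory=list)

    def authorize(self, tool: str, idempotency_key: str, approved: bool = False) -> str:
        """Decide whether a call may proceed and record an audit event either way."""
        contract = self.registry.get(tool)
        if self.kill_switch:
            decision = "blocked_kill_switch"
        elif contract is None:
            decision = "denied_not_allowlisted"
        elif contract.has_side_effects and idempotency_key in self.seen_keys:
            decision = "duplicate_ignored"  # same write was already allowed once
        elif contract.requires_approval and not approved:
            decision = "pending_approval"
        else:
            decision = "allowed"
            if contract.has_side_effects:
                self.seen_keys.add(idempotency_key)
        self.audit_log.append({"tool": tool, "key": idempotency_key, "decision": decision})
        return decision
```

Every decision, including denials, produces an audit event, so a reviewer can reconstruct what the agent attempted as well as what it executed.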

6.2 Eval Baseline

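An eval baseline is not a one-off test result; it is the reference implementation's quality memory.

| Eval asset | Content |
| --- | --- |
| Golden set | High-quality samples that represent the target task |
| High-risk slice | vulnerable customer, complaint, AML high-risk, KYC exception, regulatory deadline and similar cases |
| Regression set | Historical defects, near misses, incidents and reviewer disagreements |
| No-answer set | Scenarios that should be refused, escalated or answered with a request for more information |
| Prompt injection set | External content that tries to override system policy or leak data |
| Rubric | correctness, groundedness, completeness, policy fit, tone, escalation |
| Threshold | aggregate threshold and critical failure hard stop |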
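
The threshold row combines two rules: an aggregate score threshold and a hard stop on any critical failure. A minimal sketch of such a gate follows. `EvalResult`, the 0..1 score scale and the default threshold of 0.85 are illustrative assumptions, not values taken from any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    case_id: str
    suite: str                      # e.g. "golden", "high_risk", "no_answer"
    score: float                    # rubric score normalized to 0..1 (assumed scale)
    critical_failure: bool = False  # e.g. leaked data, unsupported commitment


def release_gate(results: list[EvalResult], aggregate_threshold: float = 0.85) -> tuple[bool, str]:
    """Apply the hard stop first, then the aggregate threshold."""
    if not results:
        return False, "no eval evidence"
    critical = [r.case_id for r in results if r.critical_failure]
    if critical:
        return False, f"critical failure hard stop: {', '.join(critical)}"
    mean = sum(r.score for r in results) / len(results)
    if mean < aggregate_threshold:
        return False, f"aggregate {mean:.2f} below threshold {aggregate_threshold:.2f}"
    return True, f"aggregate {mean:.2f} meets threshold {aggregate_threshold:.2f}"
```

Checking critical failures before the mean matters: a single leaked record should block release even when the average score looks healthy.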

7. Reuse Qualification

The reuse qualification question is not "can we copy this repo?" but:

Can this use case inherit the reference implementation's architecture assumptions, controls and evidence without creating hidden risk?

7.1 Qualification Dimensions

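| Dimension | Questions |
| --- | --- |
| Business workflow fit | Do the target process, user roles, artifacts and decision points match the pattern? |
| Risk tier | Are customer impact, regulatory impact, financial loss and privacy sensitivity within the reference scope? |
| Data class | Do PII, financial data, SAR-related data and confidential policy fit the skeleton's data boundary? |
| Human accountability | Is the AI at the draft, read-only, recommend or execute level, and does that change the accountability boundary? |
| Source authority | Do RAG sources have an owner, freshness, approval and ACL? |
| Tool action | Does the use case extend from read-only to write/execute, and does it need new controls? |
| Eval coverage | Does the baseline eval cover the target case types, languages and edge cases? |
| Control inheritance | Which controls can be inherited and which must be proven locally? |
| Operating model | Are the review queue, manager cadence and incident path available? |
| Telemetry | Can the use case emit the required events and traces? |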

7.2 Reuse Decision Types

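| Decision | Meaning | Example |
| --- | --- | --- |
| Reuse as-is | Only configuration and data source binding; the architecture boundary does not change | An internal HR policy copilot reuses the internal policy copilot skeleton |
| Reuse with local controls | The skeleton fits, but local eval, QA or source review must be added | KYC evidence extraction adds a beneficial ownership high-risk slice |
| Reuse as variant | A structural variant that the design authority must record | Customer-facing RAG adds authenticated account-specific retrieval |
| Deviation required | Violates a default constraint; needs approval and compensating controls | The tool gateway changes from read-only to a customer-impacting write action |
| Do not reuse | The context exceeds the reference implementation's assumptions | Using the internal policy copilot skeleton to produce a final credit decline explanation |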
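
The five decision types can be ordered from most to least restrictive, so that the most serious finding from qualification wins. The sketch below assumes the qualification review has already been reduced to four boolean findings. Those parameter names are hypothetical simplifications of the ten qualification dimensions, not a replacement for the review itself.

```python
from enum import Enum


class ReuseDecision(Enum):
    AS_IS = "reuse as-is"
    LOCAL_CONTROLS = "reuse with local controls"
    VARIANT = "reuse as variant"
    DEVIATION = "deviation required"
    DO_NOT_REUSE = "do not reuse"


def decide_reuse(*, within_assumptions: bool, breaks_baseline_constraint: bool,
                 structural_change: bool, needs_local_evidence: bool) -> ReuseDecision:
    """Pick the most restrictive decision that applies (assumed precedence order)."""
    if not within_assumptions:
        return ReuseDecision.DO_NOT_REUSE
    if breaks_baseline_constraint:
        return ReuseDecision.DEVIATION
    if structural_change:
        return ReuseDecision.VARIANT
    if needs_local_evidence:
        return ReuseDecision.LOCAL_CONTROLS
    return ReuseDecision.AS_IS
```

For example, a use case that stays within the pattern's assumptions but turns a read-only tool into a write action breaks a baseline constraint, so it resolves to `DEVIATION` even if it also needs local evidence.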

8. Reusable Evidence Pack

One of a reference implementation's most important reusable assets is the evidence pack. It spares teams from explaining the same set of controls from scratch each time.

| Evidence object | Inheritable content | Content that must be localized |
| --- | --- | --- |
| Pattern rationale | why this architecture exists | use case business problem and outcome thesis |
| Architecture views | common components, data flow, control points | actual systems, data stores, identity and workflow integration |
| Threat model | common threats and mitigations | use-case-specific abuse cases and data exposure |
| Eval baseline | reusable suites and rubrics | local samples, risk slices, source-specific tests |
| Control mapping | common control claims and implementation hooks | control owner, residual risk, local evidence |
| Security tests | prompt injection, access boundary, tool contract checks | local source ACL and tool permissions |
| Privacy analysis | logging minimization and retention classes | actual data class, region, consent and retention |
| Observability | trace schema and metric contract | dashboard thresholds and business outcome joins |
| Release gate | standard evidence checklist | release decision, approvals and exceptions |
| Known limitations | pattern limits | local constraints and accepted residual risk |

Core rules for evidence reuse:

  • Reusable evidence reduces duplication, not accountability.
  • Evidence inheritance must be explicit, versioned and reviewable.
  • A team can inherit the skeleton and still fail local release gates.
  • A local deviation can invalidate inherited evidence.

9. Control Mapping

Control mapping turns pattern assurance into manageable governance language.

| Control domain | Reusable control claim | Evidence source |
| --- | --- | --- |
| Governance | Pattern has owner, scope, risk tier, review cadence and deprecation path | pattern registry, lifecycle metadata |
| Architecture | Stakeholder concerns and architecture decisions are documented | architecture description, ADR |
| Data and privacy | Prompt context, logs and traces minimize sensitive payload | privacy class, DLP sample, trace schema |
| Security | Prompt injection, unauthorized retrieval and tool abuse are tested | red-team run, access test, tool contract |
| Model behavior | Outputs meet groundedness, completeness and refusal thresholds | eval run, SME sample, failure taxonomy |
| Human oversight | Human review and override are observable and capacity-aware | review queue metric, override reason log |
| Operations | Release, rollback, incident and evidence paths are defined | runbook, release gate, incident drill |
| Monitoring | Runtime metrics cover quality, safety, cost, latency, adoption and control | OTel dashboard, metric contract |
| Continual improvement | Defects and deviations feed pattern version updates | defect backlog, version notes, quarterly review |

9.1 Mapping to Source Anchors

| Anchor language | Reference implementation translation |
| --- | --- |
| NIST AI RMF Govern | pattern ownership, design authority, risk roles, evidence review |
| NIST AI RMF Map | applicability, context, users, workflow, data, risk tier |
| NIST AI RMF Measure | eval baseline, telemetry, review sampling, control metrics |
| NIST AI RMF Manage | release gates, deviations, incident response, deprecation |
| ISO/IEC 42001 operation | pattern operation controls, competence, documented information |
| ISO/IEC 42001 performance evaluation | reuse telemetry, management review, audit evidence |
| ISO/IEC/IEEE 42010 | views, stakeholders, concerns, rationale and architecture description |
| CNCF platform thinking | self-service consumption and platform-as-product interface |
| DORA | delivery flow, change quality, recovery and operational learning |
| OpenTelemetry | traces, metrics and logs that connect release identity to outcomes |

10. Lifecycle and Versioning

AI reference implementations age quickly because models, prompts, sources, tools, policies and regulations change.

10.1 Versioned Artifacts

| Artifact | Versioning rule |
| --- | --- |
| Reference implementation | semantic version with breaking-change notes |
| Prompt pack | version per task boundary and policy change |
| RAG source profile | version per source set, chunking, embedding, reranking and ACL policy |
| Tool contract | version per API schema, permission, side effect and approval model |
| Eval suite | version per sample set, rubric, threshold and known failure set |
| Threat model | version per new tool, channel, data class or abuse case |
| Control mapping | version per control framework, internal policy or evidence path |
| Telemetry schema | version per required event field and semantic convention |
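
If the reference implementation follows semantic versioning, a consumer's adopted version can be compared with the current library version to flag when requalification is due. This is a minimal sketch that assumes plain `MAJOR.MINOR.PATCH` strings and the usual convention that breaking changes bump the major version. Pre-release and build tags are out of scope.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into integers (no pre-release tags handled)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_behind(adopted: str, current: str) -> bool:
    """True when the consumer runs an older version than the library's current one."""
    return parse_version(adopted) < parse_version(current)


def needs_requalification(adopted: str, current: str) -> bool:
    """Assumed policy: a newer major version means breaking changes, so requalify."""
    return parse_version(current)[0] > parse_version(adopted)[0]
```

Comparing parsed tuples instead of raw strings avoids the classic mistake of ranking `1.10.0` below `1.9.0`.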

10.2 Lifecycle States

| State | Meaning | Allowed use |
| --- | --- | --- |
| Candidate | pattern observed and under design | discovery and prototype only |
| Beta reference | runnable skeleton exists, limited evidence | controlled pilots |
| Approved | eval, threat model, control mapping and evidence pack reviewed | new use cases may adopt |
| Restricted | known limitation or incident requires narrowed use | only approved scopes |
| Deprecated | no new adoption; migration path exists | existing consumers migrate |
| Retired | unsupported and removed from approved library | no production use |
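
The allowed-use column can be encoded as a small policy function so that adoption requests are checked against the pattern's current state. The sketch below is one possible reading: the purpose labels (`"prototype"`, `"pilot"`, `"production"`) are hypothetical, allowing prototypes on a Beta reference is an assumption, and treating Retired as unusable for any purpose is stricter than the table's "no production use".

```python
from enum import Enum


class LifecycleState(Enum):
    CANDIDATE = "candidate"
    BETA_REFERENCE = "beta reference"
    APPROVED = "approved"
    RESTRICTED = "restricted"
    DEPRECATED = "deprecated"
    RETIRED = "retired"


def may_use(state: LifecycleState, purpose: str, *,
            existing_consumer: bool = False, scope_approved: bool = False) -> bool:
    """Return whether a use of the pattern is allowed in the given lifecycle state."""
    if state is LifecycleState.CANDIDATE:
        return purpose == "prototype"
    if state is LifecycleState.BETA_REFERENCE:
        return purpose in {"prototype", "pilot"}
    if state is LifecycleState.APPROVED:
        return True
    if state is LifecycleState.RESTRICTED:
        return scope_approved      # only scopes the design authority has approved
    if state is LifecycleState.DEPRECATED:
        return existing_consumer   # no new adoption while consumers migrate
    return False                   # RETIRED (assumed: no use of any kind)
```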

10.3 Deprecation Triggers

  • Model or vendor change invalidates baseline eval.
  • Security incident reveals structural weakness.
  • New policy or regulation changes acceptable use.
  • Pattern is superseded by a safer or cheaper implementation.
  • Adoption telemetry shows low reuse or high deviation burden.
  • Support team cannot maintain controls or evidence freshness.

11. Deviation Process

Deviation is not failure. Hidden deviation is failure.

11.1 Deviation Types

| Deviation type | Example | Required review |
| --- | --- | --- |
| Configuration deviation | Different source freshness SLA | pattern owner review |
| Data deviation | New sensitive data class in prompt context | privacy and security review |
| Risk-tier deviation | Internal assistant becomes customer-facing | risk, compliance and architecture review |
| Tool-action deviation | Read-only tool becomes write action | security, architecture, business owner approval |
| Eval deviation | Local use case lacks required high-risk slice | eval owner and risk review |
| Observability deviation | Cannot emit required trace fields | platform and audit evidence review |
| Control deviation | Human review sampling reduced below baseline | risk owner decision with expiry |

11.2 Approved Deviation Record

| Field | Required content |
| --- | --- |
| Deviation ID | stable identifier tied to pattern version and use case |
| Baseline requirement | reference implementation rule being changed |
| Business reason | why standard approach does not fit |
| Risk analysis | impact, likelihood, affected controls, affected users |
| Compensating controls | what replaces or reduces the lost control |
| Evidence required | eval, test, review, monitoring and approval evidence |
| Expiry | date or trigger when deviation must be renewed or removed |
| Owner | residual risk owner and implementation owner |
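
The record fields above map naturally onto a typed structure with a review check. The following is a minimal sketch: field names mirror the table, `problems` is a hypothetical helper, and expiry is modeled as a calendar date only, so trigger-based expiry is not covered.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DeviationRecord:
    deviation_id: str
    pattern_version: str
    use_case: str
    baseline_requirement: str
    business_reason: str
    risk_analysis: str
    compensating_controls: tuple[str, ...]
    evidence_required: tuple[str, ...]
    expiry: date  # date-based expiry only; trigger-based expiry is out of scope
    residual_risk_owner: str
    implementation_owner: str

    def problems(self, today: date) -> list[str]:
        """List reasons this deviation should not be treated as currently approved."""
        issues = []
        if not self.compensating_controls:
            issues.append("no compensating controls")
        if not self.evidence_required:
            issues.append("no evidence listed")
        if self.expiry <= today:
            issues.append("expired: renew or remove")
        return issues
```

Running `problems` on a schedule is one way to produce the exception expiry report that audit roles need.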

12. Adoption Telemetry and Reuse ROI

Pattern library success cannot be measured by number of documents published.

12.1 Reuse Telemetry

| Metric | Meaning |
| --- | --- |
| qualified reuse count | use cases that passed reuse qualification and adopted the pattern |
| lead time reduction | time from intake to first controlled pilot compared with non-reference approach |
| evidence inheritance rate | percentage of evidence objects reused without rework |
| deviation rate | how often teams need approved deviations |
| defect escape rate | defects found after release by pattern and version |
| eval regression reuse | number of incidents converted into shared regression cases |
| support burden | questions, implementation issues and break-fix effort per adoption |
| deprecation compliance | consumers migrated before retirement date |
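
Three of these metrics can be derived directly from adoption records. The sketch below assumes a hypothetical `Adoption` record per use case. It treats the deviation rate as the share of qualified adoptions with at least one approved deviation, which is one possible reading of "how often teams need approved deviations".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adoption:
    use_case: str
    qualified: bool               # passed reuse qualification
    evidence_objects_total: int
    evidence_objects_reused: int  # reused without rework
    approved_deviations: int


def reuse_metrics(adoptions: list[Adoption]) -> dict[str, float]:
    """Compute reuse telemetry over qualified adoptions only."""
    qualified = [a for a in adoptions if a.qualified]
    total = sum(a.evidence_objects_total for a in qualified)
    reused = sum(a.evidence_objects_reused for a in qualified)
    with_deviation = sum(1 for a in qualified if a.approved_deviations > 0)
    return {
        "qualified_reuse_count": len(qualified),
        "evidence_inheritance_rate": reused / total if total else 0.0,
        "deviation_rate": with_deviation / len(qualified) if qualified else 0.0,
    }
```

Unqualified copies are excluded on purpose, so silent forks do not inflate the reuse count.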

12.2 Reuse ROI

Reusable pattern ROI should combine speed, quality and risk:

Reuse ROI
= delivery lead time avoided
+ assurance effort avoided
+ defect and incident reduction
+ platform operating cost efficiency
+ audit and review effort reduction
- pattern maintenance cost
- deviation management cost
- migration and deprecation cost
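
The formula above is a signed sum, so it can be computed once every term is estimated in the same unit (for example person-days or currency per year). This is a minimal sketch: the term keys are hypothetical labels for the eight lines of the formula, and how each term is estimated is left open.

```python
BENEFIT_TERMS = (
    "delivery_lead_time_avoided",
    "assurance_effort_avoided",
    "defect_and_incident_reduction",
    "platform_operating_cost_efficiency",
    "audit_and_review_effort_reduction",
)
COST_TERMS = (
    "pattern_maintenance",
    "deviation_management",
    "migration_and_deprecation",
)


def reuse_roi(estimates: dict[str, float]) -> float:
    """Sum the benefit terms and subtract the cost terms; all terms share one unit."""
    missing = [term for term in BENEFIT_TERMS + COST_TERMS if term not in estimates]
    if missing:
        raise ValueError(f"missing estimates: {missing}")
    benefits = sum(estimates[term] for term in BENEFIT_TERMS)
    costs = sum(estimates[term] for term in COST_TERMS)
    return benefits - costs
```

Requiring every term, rather than defaulting missing ones to zero, forces the business case to state its cost assumptions explicitly.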

Senior PM framing:

The business case for a reference implementation is not "developers save time"; it is "the organization learns once, controls once where appropriate, and reuses evidence without losing accountability."


13. Platform / Team Interface

Reference implementation sits between platform and product teams.

| Responsibility | Platform team | Use case team |
| --- | --- | --- |
| Skeleton | Build and maintain common implementation | Configure and integrate into workflow |
| Pattern card | Maintain approved pattern definition | Confirm fit and local assumptions |
| Eval harness | Provide reusable suites and runner | Add local samples and sign off thresholds |
| Threat model | Provide common threat model | Extend for local data, channel and tool actions |
| Observability | Provide trace/metric schema and dashboards | Emit required events and connect business outcomes |
| Controls | Provide control hooks and evidence templates | Produce local evidence and obtain approvals |
| Deviation | Review and record deviations | Request, justify, monitor and close deviations |
| Lifecycle | Version, support, deprecate, migrate | Track adoption version and migrate on schedule |

13.1 Interface Contracts

| Interface | Contract |
| --- | --- |
| Pattern registry | pattern id, owner, maturity, versions, approved scopes, support level |
| Implementation package | repo/module/image, config schema, integration guide, test harness |
| Evidence API | standard locations for eval run, threat model, approval, exception and release artifacts |
| Telemetry contract | required traces, metrics, logs, semantic fields and privacy class |
| Deviation workflow | request fields, approvers, expiry, evidence and status |
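
A telemetry contract becomes enforceable once required fields are checked at emit time or in CI. The sketch below validates a single event represented as a plain dictionary. The field names and privacy class values are hypothetical and are not OpenTelemetry semantic conventions, so a real contract would map them onto whatever attribute names the platform team standardizes.

```python
# Hypothetical contract: fields every pattern event must carry.
REQUIRED_FIELDS = frozenset(
    {"pattern_id", "pattern_version", "release_id", "use_case_id", "privacy_class"}
)
# Hypothetical privacy classes used to drive logging minimization.
ALLOWED_PRIVACY_CLASSES = frozenset({"public", "internal", "confidential", "restricted"})


def contract_violations(event: dict[str, str]) -> list[str]:
    """Return human-readable violations; an empty list means the event conforms."""
    problems = [f"missing field: {name}" for name in sorted(REQUIRED_FIELDS - set(event))]
    privacy = event.get("privacy_class")
    if privacy is not None and privacy not in ALLOWED_PRIVACY_CLASSES:
        problems.append(f"unknown privacy_class: {privacy}")
    return problems
```

A use case that cannot pass this check has an observability deviation by definition and should enter the deviation workflow rather than ship with silent gaps.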

14. Financial Retail Examples

14.1 Customer-Facing RAG

| Reference implementation element | Example |
| --- | --- |
| Pattern | Answer customer questions with approved sources and safe handoff |
| Mandatory controls | citation, source freshness, complaint detection, vulnerable customer escalation |
| Eval baseline | fee policy, dispute status, account servicing, ambiguous customer intent |
| Threat model | prompt injection in customer text, stale policy, unsupported commitment |
| Deviation example | adding account-specific retrieval requires stronger identity, entitlement and audit |

14.2 Internal Policy Copilot

| Reference implementation element | Example |
| --- | --- |
| Pattern | Employees ask policy and SOP questions inside authenticated workflow |
| Mandatory controls | role-based retrieval, policy version trace, no customer advice beyond employee authority |
| Eval baseline | branch procedure, contact-center policy, KYC exception, escalation rules |
| Reuse evidence | source registry, prompt boundary, retrieval ACL test, answer citation audit |
| Deviation example | using copilot output directly in customer letter requires narrative drafting controls |

14.3 AML Investigation Workbench

| Reference implementation element | Example |
| --- | --- |
| Pattern | Assist analysts with case summary, evidence timeline and narrative draft |
| Mandatory controls | analyst-owned disposition, source span, high-risk escalation, QA sampling |
| Eval baseline | transaction monitoring alerts, adverse media, structuring indicators, missing evidence |
| Threat model | omitted critical evidence, false comfort, SAR-related data leakage |
| Deviation example | automatic closure recommendation is outside baseline and needs separate approval |

14.4 KYC Evidence Extraction

| Reference implementation element | Example |
| --- | --- |
| Pattern | Extract beneficial ownership, document type, expiry and missing evidence |
| Mandatory controls | source span, confidence band, human validation, exception queue |
| Eval baseline | individual, SMB, trust, non-resident, high-risk country cases |
| Reuse evidence | extraction schema, document classifier tests, validation UI trace |
| Deviation example | straight-through KYC decision cannot inherit extraction-only evidence |

14.5 Dispute Evidence Assistant

| Reference implementation element | Example |
| --- | --- |
| Pattern | Build dispute evidence timeline and draft case packet |
| Mandatory controls | card network rule source, customer communication boundary, maker-checker |
| Eval baseline | fraud claim, merchant dispute, recurring charge, provisional credit |
| Threat model | wrong reason code, missing timeline evidence, unsupported customer denial |
| Deviation example | automated chargeback submission requires tool-action controls |

14.6 Regulatory Reporting Narrative Draft

| Reference implementation element | Example |
| --- | --- |
| Pattern | Draft variance explanation or management narrative from approved data lineage |
| Mandatory controls | data lineage, maker-checker, metric contract, attestation boundary |
| Eval baseline | variance explanation, source-to-report mapping, late adjustment, restatement scenario |
| Threat model | unsupported explanation, stale metric, hallucinated cause, audit trail gap |
| Deviation example | external filing language requires compliance and authorized signer review |

15. Anti-Patterns

| Anti-pattern | Why it fails | Mature replacement |
| --- | --- | --- |
| Starter repo called reference implementation | It has code but no evidence, boundaries or lifecycle | Runnable skeleton plus eval, threat model, controls and telemetry |
| Pattern library as wiki | Documents are not consumable or testable | Pattern registry with maturity, owner, versions and adoption telemetry |
| Copying controls without context | Local risk may exceed baseline assumptions | Reuse qualification and local evidence requirements |
| Golden path replaces architecture judgment | Fast path may not fit high-risk use cases | Design authority and deviation process |
| Eval baseline never changes | New incidents are not learned from | Regression memory and versioned eval suites |
| Every deviation approved forever | Temporary exceptions become architecture drift | Expiry, compensating controls and quarterly review |
| Reuse measured by downloads | Downloads do not prove governed adoption | Qualified reuse, evidence inheritance and defect reduction |
| Platform owns everything | Use case teams avoid local accountability | Shared responsibility model |
| Local teams fork silently | Pattern assurance becomes untraceable | Versioned adoption record and supported variant model |
| No deprecation discipline | Old unsafe patterns remain in production | Lifecycle state, migration guide and retirement date |

16. Interview Answers

Q1: What is the difference between a template and a reference implementation?

A template is a starting point. A reference implementation is a governed, runnable embodiment of a pattern. It includes architecture rationale, quality baseline, threat model, control mapping, observability, evidence pack, versioning and deviation rules. In AI, that distinction matters because copying code does not copy safety, eval coverage or accountability.

Q2: How would you build an AI pattern library without creating shelfware?

I would start from repeated production demand, not abstract taxonomy. For each candidate pattern, I would define a pattern card, reference implementation, eval baseline, threat model, control mapping, telemetry contract and reuse qualification checklist. I would measure qualified reuse, lead time reduction, evidence inheritance, deviation rate, defect reduction and support burden. If a pattern is not reused or creates too many deviations, it should be redesigned or deprecated.

Q3: How do you decide whether a new use case can reuse an existing reference implementation?

I would run reuse qualification across workflow fit, risk tier, data class, human accountability, source authority, tool action, eval coverage, control inheritance, operating model and telemetry. If the use case stays inside the reference assumptions, it can reuse most assets. If it changes data sensitivity, customer impact or tool actions, it needs local controls or approved deviation. Some use cases should not reuse the pattern at all.

Q4: How can reusable evidence reduce governance effort without weakening control?

Reusable evidence reduces repeated explanation of common controls, such as RAG citation checks, prompt injection tests, tool gateway audit fields or OTel trace schema. It does not remove local accountability. The use case still needs to prove its own data sources, workflows, risk tier, eval samples, approvals and residual risk. The key is explicit evidence inheritance: what is reused, what is local, what version, and who accepted it.

Q5: How do you handle variants and deviations?

I separate configured variants from deviations. A configured variant is expected, such as different source registry or workflow labels. A deviation changes a baseline assumption, such as making an internal copilot customer-facing or adding write actions. Deviations require a record with business reason, risk analysis, compensating controls, evidence, owner and expiry. Hidden deviation is architecture drift.

Q6: What would you include in a reference implementation for a customer-facing RAG assistant?

I would include approved source registry, ACL and freshness checks, citation contract, refusal and handoff rules, prompt injection tests, groundedness eval, high-risk customer scenarios, complaint escalation, OTel trace schema, cost and latency metrics, human review sampling, release gate, evidence pack and deprecation rules. I would also define when the skeleton must not be reused, such as final adverse decisions or regulated advice without separate controls.


17. Portfolio Exercise

Scenario

A financial retail institution has 6 AI initiatives:

| Use case | Current problem |
| --- | --- |
| Customer-facing RAG | Multiple teams build separate policy-answer bots |
| Internal policy copilot | Branch and contact-center teams use different policy search approaches |
| AML investigation workbench | Analysts need faster evidence timeline and narrative preparation |
| KYC evidence extraction | Onboarding teams repeat document extraction logic |
| Dispute evidence assistant | Dispute operations need reusable evidence packet drafting |
| Regulatory reporting narrative draft | Finance and risk teams need controlled narrative drafting from approved metrics |

Required Artifacts

  1. Pattern taxonomy with at least 8 pattern families.
  2. Reference implementation anatomy for two patterns.
  3. Reuse qualification checklist for one proposed adoption.
  4. Evidence pack showing inherited versus local evidence.
  5. Control mapping to NIST AI RMF, ISO/IEC 42001, architecture description and observability concepts.
  6. Eval baseline with golden, high-risk, no-answer and regression sets.
  7. Threat model reuse plan, including local extension points.
  8. Variant and deviation process with an example approved deviation.
  9. Lifecycle/versioning model with deprecation triggers.
  10. Adoption telemetry and reuse ROI model.
  11. Platform/team interface contract.
  12. Scale memo recommending which two patterns should become Level 4 assured reusable patterns first.

Evaluation Rubric

| Criterion | Strong evidence |
| --- | --- |
| Pattern clarity | Separates pattern, template, golden path and reference implementation |
| Assurance depth | Includes eval, threat model, control mapping, observability and evidence |
| Reuse discipline | Defines qualification, inheritance, variants and deviations |
| Financial retail fit | Uses realistic controls for RAG, AML, KYC, dispute and reporting |
| Lifecycle maturity | Has owner, version, support state, deprecation and migration logic |
| Platform interface | Shows what platform owns and what use case teams remain accountable for |
| Value proof | Measures lead time, defect reduction, evidence reuse and support burden |

18. Final Principle

Reference implementation work is mature when a team can say:

We reused the pattern, inherited the right evidence, localized the right controls, recorded the right deviations, and can prove in production that the reused architecture still fits this workflow.

The senior AI PM, AI Architect, Platform PM and CBAP-level BA should treat pattern reuse as an assurance architecture problem, not only a productivity tactic.