AI Enterprise Architecture / TOGAF / ArchiMate / ADM Playbook
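Purpose: Turn TOGAF ADM, ArchiMate and AI governance from methodology language into an executable system for AI enterprise architecture: capability planning, architecture runway, governance gates, portfolio management, release evidence and audit material.
Audience: Practitioners with about 10 years of PM / BA / Developer experience in retail financial services who already have CBAP-level capability. This playbook is not an introduction to TOGAF or ArchiMate; it focuses on how an AI Solutions Architect, Enterprise Architect or AI Product Architect uses EA methods to manage enterprise AI transformation.
Core thesis: The maturity of AI EA does not lie in drawing an app + LLM + RAG technical reference architecture. It lies in placing strategic capabilities, use case portfolio, platform runway, policy controls, eval gates, operating model and evidence repository inside one ADM cycle.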
Source Anchors
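The sources below serve as anchors for terminology and method. This playbook is material for learning, portfolio building and architecture communication practice; it is not certification, legal, compliance, audit or procurement advice. Real projects must be reviewed against institutional policy, regulatory jurisdiction, data types, AI risk tier, model vendor contracts and internal architecture governance processes.
| Anchor | Official Link | How this playbook uses it |
|---|---|---|
| TOGAF | https://www.opengroup.org/togaf | Organizes AI transformation in the language of enterprise architecture, ADM, architecture governance, roadmap and architecture repository. |
| ArchiMate | https://www.opengroup.org/archimate-forum/archimate-overview | Expresses AI capability, platform, controls and transition states through the motivation, strategy, business, application, technology and implementation / migration layers. |
| NIST AI RMF | https://www.nist.gov/itl/ai-risk-management-framework | Uses Govern, Map, Measure, Manage to connect AI risk, eval, monitoring, incident, evidence and continuous improvement to the ADM. |
| ISO/IEC 42001 | https://www.iso.org/standard/81230.html | Uses AI management system language to connect policy, objectives, process, operation, performance evaluation and continual improvement to the Architecture Repository and governance cycle. |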
One-Sentence Positioning
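AI Enterprise Architecture = use TOGAF ADM to manage the enterprise AI lifecycle from strategy to roadmap, use ArchiMate to model the relationships among capability / application / technology / motivation / migration, then use AI RMF and ISO 42001 to turn risk, eval, release gates, monitoring and evidence into an auditable architecture governance system.
A shorter version for interviews:
I would not reduce AI EA to a technical blueprint. I would adapt the ADM into a continuous cycle over AI capability, use case, platform and governance, use ArchiMate to express the relationships from business capability to models, data, tools, controls and migration path, and use an evidence repository to show why each release is fit to go live.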
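1. Why AI EA Cannot Rely On A Technical Reference Architecture Alone
A technical reference architecture explains how a system is composed, but it is not enough to manage the investment, risk and evolution of enterprise AI. In retail financial services, the hard part is usually not "can we connect a model" but whether the following questions are governable:
| EA question | Blind spot of the technical reference architecture | AI EA response |
|---|---|---|
| Which AI capabilities are worth investing in | Component diagrams usually do not express strategic themes, value streams, capability gaps or funding gates | Use capability-based planning to manage the portfolio thesis, maturity gaps and investment increments |
| Which use cases can reuse platform capabilities | A project diagram serves a single solution only | Use the Architecture Repository to manage shared runway, reference building blocks, standards and exceptions |
| Which business actions AI is allowed to influence | Technical diagrams tend to draw a tool call as an ordinary API call | Use action tiers, policy controls, human accountability and approval gates to manage risk |
| Which releases may enter pilot or production | A component diagram does not demonstrate fitness for purpose | Use eval contracts, release memos, residual risk acceptance and monitoring plans as gate evidence |
| Which business areas are affected when a model, prompt, index or tool schema changes | A single-point architecture diagram lacks traceability | Use ArchiMate relationships and repository metadata to connect capability, service, data object, technology service and work package |
| How audit reconstructs a decision | A technical diagram cannot replace the evidence chain | Use an evidence binder to link intake, ADR, design review, eval, approval, trace, incident and review |
| How the organization manages AI on an ongoing basis | A project diagram does not express the operating model | Use governance forums, RACI, management review, policy exceptions and architecture board cadence |
The key shift:
From: drawing an AI technical reference architecture
To: designing an AI enterprise architecture operating system
This operating system manages: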
- capability thesis
- use case portfolio
- platform runway
- reference building blocks
- model / data / tool / policy controls
- eval and release gates
- architecture repository
- governance evidence
- transition roadmap
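For someone with a CBAP background, the step up is not relearning requirements or process work. It is bringing stakeholder concerns, business capabilities, value streams, risk controls and the implementation roadmap into one architecture decision system.
2. AI ADM: Adapting TOGAF ADM Into Four Parallel Cycles
The traditional ADM is easily misused as a linear document process. Enterprise AI is better served by adapting the ADM into four cycles that drive one another: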
AI Capability Cycle
strategy -> value stream -> capability gap -> maturity target -> funding gate
AI Use Case Cycle
intake -> risk tier -> architecture package -> eval gate -> pilot -> production monitoring
AI Platform Cycle
reference building blocks -> runway backlog -> reusable services -> standards -> exception review
AI Governance Cycle
policy -> control -> evidence -> incident learning -> management review -> policy refresh
How the four cycles relate:
| Cycle | Main question | Main owner | Main output |
|---|---|---|---|
| Capability Cycle | Why should the enterprise invest in this set of AI capabilities | Business Architect, Enterprise Architect, AI Product Lead | capability map, heatmap, roadmap, funding memo |
| Use Case Cycle | Can a given AI scenario enter production safely | Product Owner, Solution Architect, Model Risk, Compliance | use case brief, risk tier, architecture package, eval report, release memo |
| Platform Cycle | Which capabilities should be consolidated into enterprise runway | AI Platform Architect, EA, Security, Data Architecture | reference architecture, model gateway, RAG platform, tool gateway, eval service, policy engine |
| Governance Cycle | Does AI in operation stay within objectives and risk appetite | AI Governance Lead, Risk, Compliance, Audit, Operations | policy register, evidence binder, monitoring review, exception log, improvement actions |
The value of the ADM here is not "filling in templates phase by phase" but making every AI investment able to answer:
- Which enterprise capability does it serve?
- Which value stream stage does it change?
- Which architecture building blocks does it reuse?
- Which AI-specific risks does it introduce?
- What are its release gate and residual risk evidence?
- How does it enter continuous governance after go-live?
3. AI ADM Phase Reinterpretation
3.1 Phase Map
| ADM phase | Core task after AI EA adaptation | Key architecture question | Minimum evidence |
|---|---|---|---|
| Preliminary | Establish AI architecture capability, governance mandate, repository structure, principles and decision rights | Who has the authority to approve AI architecture, release, exception and retirement? | AI EA charter, architecture principles, governance RACI, repository taxonomy |
| A. Architecture Vision | Define AI transformation thesis, priority value streams, capability outcomes and risk appetite | Why is this set of AI investments worth making, and which scenarios are out of scope? | vision brief, outcome tree, value stream scope, no-go boundaries |
| B. Business Architecture | Build capability map, value stream map, operating model, human accountability and adoption model | Which business capabilities, decision rights and role responsibilities does AI change? | capability heatmap, value stream to capability matrix, RACI, adoption metrics |
| C. Information Systems Architecture - Data | Define data / knowledge architecture, semantic layer, lineage, entitlement, retention and evidence | Which data and knowledge may AI use, and how are source and entitlement proven? | source registry, knowledge card, data contract, retrieval evidence design |
| C. Information Systems Architecture - Application | Define AI product architecture, application services, orchestration, model route, RAG, tool gateway, workflow integration | How does the AI system embed into business processes and the enterprise application portfolio? | C4 container, application service catalog, API / event contracts, workflow sequence |
| D. Technology Architecture | Define model runtime, cloud / network boundary, observability, security, platform services and deployment pattern | How does the platform support reliable, secure, observable and replaceable AI operation? | platform view, model gateway route, telemetry schema, deployment view, resilience controls |
| E. Opportunities and Solutions | Combine capability increments into solution options and runway backlog | What are the trade-offs among build / buy / partner / platform? | option matrix, ADR set, work package candidates, dependency map |
| F. Migration Planning | Sequence release train, pilot path, transition architectures, funding gates and architecture runway | How do we move from pilot to production to scale, and what does each plateau prove? | transition roadmap, release gate plan, migration backlog, benefit / risk sequencing |
| G. Implementation Governance | Enforce architecture conformance, eval gate, exception handling and release decision during delivery | Has delivery drifted from architecture intent, and has the risk been accepted? | design review record, eval report, policy test, release memo, exception approval |
| H. Architecture Change Management | Use production telemetry, incidents, model changes, regulatory changes and portfolio review to drive architecture refresh | What triggers re-architecture, restriction, rollback or retirement? | monitoring review, incident postmortem, change impact analysis, refreshed roadmap |
| Requirements Management | Manage traceability across strategy -> capability -> use case -> architecture decision -> control -> evidence | Does a requirement change affect model, data, tool, policy, eval and evidence? | traceability matrix, decision log, evidence index |
3.2 Phase Gates For AI
| Gate | ADM entry point | Gate question | Gate decision |
|---|---|---|---|
| Gate 0: Strategic Fit | Phase A | Does the scenario belong to a priority value stream and the capability thesis? | proceed to discovery, merge, park, reject |
| Gate 1: Capability Readiness | Phase B | Are capability owner, process owner, adoption path and human accountability clear? | fund capability increment, refine business architecture, stop |
| Gate 2: Data / Knowledge Readiness | Phase C Data | Are source owner, entitlement, freshness, lineage and retention sufficient to support AI? | approve data route, restrict scope, require remediation |
| Gate 3: Solution Architecture | Phase C App / D Tech | Are AI pattern, platform building blocks, model route, tool boundary and security controls consistent? | approve architecture, approve with conditions, redesign |
| Gate 4: Eval and Risk | Phase E / G | Does the eval cover intended use, prohibited use, critical failures and slices? | release to pilot, limited release, no-go |
| Gate 5: Production | Phase F / G | Are monitoring, runbook, evidence binder, support model and rollback ready? | production go, restricted go, manual fallback |
| Gate 6: Scale / Refresh | Phase H | Does production evidence support scaling, adjustment, downgrade or retirement? | scale, refresh, restrict, retire |
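The gate table can be made machine-checkable so that a review tool refuses an outcome the gate does not define. The following is a minimal sketch, not a prescribed implementation: the `Gate` enum, `ALLOWED_DECISIONS` and `record_gate_decision` are hypothetical names, and the decision strings are copied from the "Gate decision" column.

```python
from enum import Enum


class Gate(Enum):
    STRATEGIC_FIT = 0
    CAPABILITY_READINESS = 1
    DATA_KNOWLEDGE_READINESS = 2
    SOLUTION_ARCHITECTURE = 3
    EVAL_AND_RISK = 4
    PRODUCTION = 5
    SCALE_REFRESH = 6


# Allowed outcomes per gate, taken from the "Gate decision" column.
ALLOWED_DECISIONS = {
    Gate.STRATEGIC_FIT: {"proceed to discovery", "merge", "park", "reject"},
    Gate.CAPABILITY_READINESS: {
        "fund capability increment", "refine business architecture", "stop"},
    Gate.DATA_KNOWLEDGE_READINESS: {
        "approve data route", "restrict scope", "require remediation"},
    Gate.SOLUTION_ARCHITECTURE: {
        "approve architecture", "approve with conditions", "redesign"},
    Gate.EVAL_AND_RISK: {"release to pilot", "limited release", "no-go"},
    Gate.PRODUCTION: {"production go", "restricted go", "manual fallback"},
    Gate.SCALE_REFRESH: {"scale", "refresh", "restrict", "retire"},
}


def record_gate_decision(gate: Gate, decision: str) -> dict:
    """Return a decision record; reject outcomes the gate does not define."""
    if decision not in ALLOWED_DECISIONS[gate]:
        raise ValueError(f"{decision!r} is not a valid outcome for {gate.name}")
    return {"gate": gate.name, "decision": decision}
```

In practice the returned record would also carry approver, date and evidence links; those fields are left out here to keep the sketch short.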
3.3 ADM Deliverables Reframed For AI
| Traditional EA deliverable | AI EA deliverable | Senior-level framing |
|---|---|---|
| Architecture Vision | AI transformation thesis and value stream scope | Use business outcomes and risk appetite to bound AI investment |
| Business Architecture | AI capability map and human accountability model | AI is not a feature; it is a change in organizational capability and in how responsibility is allocated |
| Data Architecture | governed data / knowledge architecture | RAG source, semantic layer, entitlement and evidence are architecture assets |
| Application Architecture | AI product and platform service architecture | Separation of responsibilities among application, orchestration, model gateway, RAG, tool gateway, policy and eval |
| Technology Architecture | AI runtime and control plane architecture | runtime reliability, observability, model route, secrets, egress and kill switch |
| Architecture Roadmap | capability increments plus architecture runway | Build reusable runway first, then expand use cases by risk |
| Implementation Governance | release gate and conformance evidence | Architecture governance must be able to block a release, not merely make recommendations |
| Change Management | telemetry-driven architecture refresh | model drift, policy change, incident and adoption signals trigger another ADM cycle |
4. ArchiMate AI Metamodel Mapping
The value of ArchiMate is not that it draws something more complex than C4, but that it can place motivation, strategy, business, application, technology and implementation / migration in one enterprise model. AI EA should avoid inventing an isolated AI notation. A better approach is to express AI assets in ArchiMate's native layers and flag AI-specific concerns through naming conventions.
4.1 Motivation Layer
| ArchiMate concept | AI EA interpretation | Example |
|---|---|---|
| Stakeholder | A role with decision rights over, or a concern about, AI outcomes, risks or controls | COO, CDAO, Retail Banking Head, Model Risk, Compliance, Audit, Branch Manager |
| Driver | External or internal pressure driving AI transformation | cost-to-serve pressure, fraud complexity, regulatory scrutiny, service inconsistency |
| Assessment | Evaluation of current capability, risk or performance | AML investigation cycle time high, policy answer inconsistency, RAG source freshness weak |
| Goal | The objective to be achieved | reduce case handling time, improve complaint escalation, improve audit reconstruction |
| Outcome | An observable result | first contact resolution improved, analyst rework reduced, release evidence complete |
| Requirement | A requirement the AI architecture must satisfy | all customer-visible AI answers require source citation and human confirmation |
| Constraint | A boundary that must not be crossed | no automated credit decline, no model direct write to core banking, no restricted data to public model |
| Principle | An enduring design principle | human accountability by design, evidence by design, policy before tool execution |
4.2 Strategy Layer
| ArchiMate concept | AI EA interpretation | Example |
|---|---|---|
| Capability | A business or platform capability the enterprise holds stably | AI-assisted servicing, evidence-grounded investigation, governed model routing |
| Resource | A resource that supports a capability | approved model pool, policy knowledge base, eval dataset, AI platform team |
| Course of Action | The strategic route taken to achieve goals | build shared AI control plane, federate domain knowledge, pilot before scale |
| Value Stream | The process of delivering value | Resolve customer complaint, Investigate AML alert, Originate consumer loan |
Do not name AI capabilities after vendors or projects. Recommended naming:
| Bad capability naming | Better capability naming |
|---|---|
| ChatGPT for Service | AI-assisted customer servicing |
| Vector DB platform | Evidence-grounded knowledge retrieval |
| Copilot rollout | Role-specific AI work augmentation |
| Agent framework | Governed tool and action orchestration |
| Dashboard project | AI portfolio and control observability |
4.3 Business Layer
| ArchiMate concept | AI EA interpretation | Example |
|---|---|---|
| Business Actor | A business organization or external party | Customer, Contact Center, AML Operations, Credit Underwriting |
| Business Role | A role that carries responsibility for execution | Service Agent, Supervisor, Analyst, Underwriter, Product Owner |
| Business Process | A process that AI augments or controls | complaint handling, AML alert triage, loan document review |
| Business Service | A business service offered externally or internally | customer support, financial crime investigation, credit decision support |
| Business Object | A business information object | complaint case, policy article, transaction alert, loan application |
| Contract | A business constraint or commitment | customer communication policy, model risk policy, data sharing agreement |
Key points for AI-related business modeling:
- State whether the AI output is a draft of a business object, an evidence pack, a recommendation, an explanation, a classification or an execution request.
- Business roles must show the final decision owner, reviewer, approver, operator and auditor.
- Business processes must show the human checkpoint, escalation, exception and fallback.
- A business service must not describe "AI answer" as a black box; state the business service that the customer or employee actually receives.
4.4 Application Layer
| ArchiMate concept | AI EA interpretation | Example |
|---|---|---|
| Application Component | A deployable or manageable application component | AI orchestration service, RAG service, tool gateway, policy engine, eval service |
| Application Service | A service offered to the business or to other applications | cited answer generation, policy compliance check, tool dry-run service |
| Application Function | Internal behavior of a component | retrieval, reranking, prompt assembly, schema validation, approval routing |
| Data Object | An application-layer data object | prompt record, trace event, source chunk, eval case, action ledger entry |
| Application Interface | An API, event, UI or service interface | model gateway API, retrieval API, policy decision API, eval result event |
AI rules for application-layer modeling:
| Rule | Rationale |
|---|---|
| Model the model gateway as an Application Component; do not draw an external LLM directly into every app | Manages allowlist, route, version, cost, fallback and telemetry |
| Model the RAG service as an Application Component; do not equate a vector DB with the RAG architecture | RAG covers ingestion, entitlement, retrieval, citation, no-answer and eval |
| Model the tool gateway as an Application Component | Controls schema, policy, dry run, approval, idempotency and action ledger |
| Model the eval service as an Application Component | A release gate must be repeatable, versionable and auditable |
| Model the evidence binder as an Application Component or Repository | Evidence is an operational and governance asset, not a project folder |
4.5 Technology Layer
| ArchiMate concept | AI EA interpretation | Example |
|---|---|---|
| Technology Service | An infrastructure or platform service | inference runtime, vector search hosting, telemetry pipeline, secrets management |
| Node | A runtime node | private model cluster, cloud AI service boundary, regional data processing node |
| System Software | Platform software | orchestration runtime, policy-as-code runtime, model serving runtime |
| Technology Interface | A technical interface | private endpoint, service mesh, telemetry collector, egress gateway |
| Artifact | A deployable or versionable asset | prompt template package, policy bundle, model adapter, tool schema package |
Technology-layer modeling should make the following explicit:
- deployment boundary: on-prem, private cloud, public model API, region boundary.
- security boundary: secrets, egress, DLP, service identity, network segmentation.
- operational boundary: SLO, fallback, capacity, latency, cost quota.
- change boundary: model version, prompt version, policy bundle, retrieval index, tool schema.
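The change boundary is easier to govern when each release records the versions of those five items in one manifest. The sketch below assumes a flat dictionary manifest; the key names and the helper names `release_fingerprint` and `changed_components` are hypothetical and not part of any standard.

```python
import hashlib
import json


def release_fingerprint(versions: dict) -> str:
    """Short, order-independent hash of a release's versioned components."""
    canonical = json.dumps(versions, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def changed_components(old: dict, new: dict) -> list:
    """Names of components whose version differs between two manifests."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


# Hypothetical manifests for two consecutive releases.
previous = {
    "model": "v1",
    "prompt": "p3",
    "policy_bundle": "2024.1",
    "retrieval_index": "idx-7",
    "tool_schema": "s2",
}
candidate = dict(previous, model="v2")

print(changed_components(previous, candidate))
```

A non-empty result from `changed_components` is a natural trigger for rerunning the eval gate before release; which components require a full rerun is a policy choice this sketch does not make.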
4.6 Implementation And Migration Layer
| ArchiMate concept | AI EA interpretation | Example |
|---|---|---|
| Work Package | A set of deliverable transformation work | Build model gateway MVP, launch service RAG pilot, implement tool approval ledger |
| Deliverable | An asset delivered by a work package | architecture package, eval report, policy control pack, release memo |
| Implementation Event | A key milestone | pilot start, production gate, model migration, audit review |
| Plateau | A relatively stable architecture state | Fragmented pilots, Controlled pilots, Governed production, Enterprise reuse |
| Gap | The difference between two plateaus | no shared eval service, no evidence binder, missing entitlement-aware retrieval |
An AI EA roadmap should not be merely a project schedule. It should express plateaus:
Plateau 0: fragmented AI experiments
gap: no inventory, no model gateway, no eval baseline
Plateau 1: controlled AI pilots
new capabilities: risk tier, use case intake, prompt registry, basic eval
Plateau 2: governed production AI
new capabilities: model gateway, RAG entitlement, tool gateway, release gate, observability
Plateau 3: reusable enterprise AI platform
new capabilities: portfolio governance, evidence binder, policy-as-code, continuous evaluation
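The gap between two plateaus can be derived from the capability lists rather than maintained by hand. This is a minimal sketch under one assumption that the text does not state explicitly: capabilities are cumulative, so each plateau keeps everything introduced earlier. The names `PLATEAU_CAPABILITIES`, `capabilities_at` and `gap` are hypothetical.

```python
# New capabilities introduced at each plateau, copied from the list above.
PLATEAU_CAPABILITIES = {
    0: set(),
    1: {"risk tier", "use case intake", "prompt registry", "basic eval"},
    2: {"model gateway", "RAG entitlement", "tool gateway",
        "release gate", "observability"},
    3: {"portfolio governance", "evidence binder", "policy-as-code",
        "continuous evaluation"},
}


def capabilities_at(plateau: int) -> set:
    """Capabilities in place at a plateau, assuming they accumulate."""
    result = set()
    for level in range(plateau + 1):
        result |= PLATEAU_CAPABILITIES[level]
    return result


def gap(current: int, target: int) -> set:
    """Capabilities still to be built to move from current to target."""
    return capabilities_at(target) - capabilities_at(current)


print(sorted(gap(1, 2)))
```

The output of `gap` maps naturally to candidate work packages for the migration backlog.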
4.7 Relationship Pattern
Recommended ArchiMate relationship chain:
Driver
-> Goal
-> Outcome
-> Capability
-> Value Stream
-> Business Process
-> Business Service
-> Application Service
-> Application Component
-> Technology Service
-> Work Package
-> Deliverable
-> Evidence
Example:
Driver: rising complaint handling cost
-> Goal: improve complaint resolution consistency
-> Outcome: reduce repeat contacts and policy errors
-> Capability: AI-assisted customer servicing
-> Value Stream: Resolve customer complaint
-> Business Process: assess complaint and draft response
-> Business Service: customer complaint resolution support
-> Application Service: cited policy answer generation
-> Application Component: governed RAG service
-> Technology Service: hybrid retrieval and inference runtime
-> Work Package: service RAG pilot release
-> Deliverable: eval report, release memo, evidence binder entry
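A repository can check that a trace covers every link in the chain before a release review. The sketch below treats a trace as a dictionary keyed by element type; `CHAIN`, `missing_links` and the dictionary layout are assumptions made for illustration only.

```python
# Element types in the order of the recommended relationship chain.
CHAIN = [
    "Driver", "Goal", "Outcome", "Capability", "Value Stream",
    "Business Process", "Business Service", "Application Service",
    "Application Component", "Technology Service", "Work Package",
    "Deliverable", "Evidence",
]


def missing_links(trace: dict) -> list:
    """Chain element types that the trace leaves empty, in chain order."""
    return [kind for kind in CHAIN if not trace.get(kind, "").strip()]


# The complaint-handling example above, entered as written.
example = {
    "Driver": "rising complaint handling cost",
    "Goal": "improve complaint resolution consistency",
    "Outcome": "reduce repeat contacts and policy errors",
    "Capability": "AI-assisted customer servicing",
    "Value Stream": "Resolve customer complaint",
    "Business Process": "assess complaint and draft response",
    "Business Service": "customer complaint resolution support",
    "Application Service": "cited policy answer generation",
    "Application Component": "governed RAG service",
    "Technology Service": "hybrid retrieval and inference runtime",
    "Work Package": "service RAG pilot release",
    "Deliverable": "eval report, release memo, evidence binder entry",
}

print(missing_links(example))
```

Because the example stops at Deliverable, a strict check of this kind reports Evidence as unlinked. Whether the deliverable list itself counts as evidence is a modeling choice to settle in the repository conventions.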
5. Architecture Repository For AI EA
The Architecture Repository is the operating system of AI EA. A repository that only stores documents cannot govern AI. It must connect standards, building blocks, decisions, requirements, controls, evidence and operational telemetry.
5.1 Repository Domains
| Repository domain | AI EA contents | Purpose |
|---|---|---|
| Architecture Metamodel | AI-specific modeling conventions, naming rules, relationship rules | Unifies how capability, use case, model, data, tool, policy and evidence are expressed |
| Architecture Capability | AI architecture charter, roles, RACI, review cadence, skill matrix | Explains how the organization practices AI EA |
| Architecture Landscape | baseline, target and transition architectures | Describes current AI experiments, the target platform and migration plateaus |
| Standards Information Base | AI principles, model policy, RAG standard, tool action policy, eval standard | Defines what the default practice is |
| Reference Library | reference architectures, pattern catalog, control patterns, sample packages | Enables reuse of solutions and design experience |
| Governance Log | review decisions, waivers, exceptions, risk acceptances, architecture board records | Proves the decision chain and exception management |
| Architecture Requirements Repository | strategy, capability, use case, control and release requirements | Maintains traceability |
| Solutions Landscape | deployed AI products, platform services, integrations, dependencies | Manages production instances and impact scope |
| Evidence Binder | intake, ADR, eval, approval, trace sample, monitoring, incident, management review | Supports audit, model risk, compliance and portfolio review |
5.2 AI Repository Minimum Metadata
| Asset | Required metadata |
|---|---|
| AI use case | use_case_id, owner, business domain, value stream, capability, risk tier, status, release scope |
| AI capability | capability_id, owner, maturity level, target maturity, KPIs, architecture dependencies |
| Model route | model_id, provider, deployment boundary, allowed use, prohibited data, version, fallback |
| Prompt asset | prompt_id, owner, use case, version, eval baseline, approved status, change record |
| Knowledge source | source_id, owner, jurisdiction, effective date, expiry, classification, entitlement, freshness SLA |
| Tool | tool_id, owner, action tier, schema version, allowed workflows, approval rules, rollback pattern |
| Policy control | control_id, requirement, policy version, enforcement point, test evidence, exception rule |
| Eval contract | eval_id, intended use, prohibited use, metrics, thresholds, critical failures, slices, judge method |
| Release memo | release_id, scope, versions, eval result, residual risk, approvals, monitoring plan, rollback trigger |
| Evidence item | evidence_id, source system, artifact type, linked decision, owner, retention, audit export path |
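The minimum metadata can be enforced at registration time instead of discovered at audit time. The sketch below covers three of the asset types; the snake_case field names are an assumed normalization of the column entries above, and `REQUIRED_METADATA` and `missing_metadata` are hypothetical names.

```python
# Required fields per asset type, normalized to snake_case (an assumption).
REQUIRED_METADATA = {
    "model_route": [
        "model_id", "provider", "deployment_boundary", "allowed_use",
        "prohibited_data", "version", "fallback",
    ],
    "tool": [
        "tool_id", "owner", "action_tier", "schema_version",
        "allowed_workflows", "approval_rules", "rollback_pattern",
    ],
    "release_memo": [
        "release_id", "scope", "versions", "eval_result", "residual_risk",
        "approvals", "monitoring_plan", "rollback_trigger",
    ],
}


def missing_metadata(asset_type: str, record: dict) -> list:
    """Required fields that are absent or empty in a repository record."""
    required = REQUIRED_METADATA[asset_type]
    return [f for f in required if record.get(f) in (None, "", [])]


# Hypothetical model route record with no fallback defined.
route = {
    "model_id": "m-001",
    "provider": "example-provider",
    "deployment_boundary": "private cloud",
    "allowed_use": ["summarization"],
    "prohibited_data": ["restricted"],
    "version": "1",
    "fallback": "",
}

print(missing_metadata("model_route", route))
```

A repository could reject registration, or set the asset status to draft, whenever the returned list is non-empty.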
5.3 Repository Views
| View | Question answered |
|---|---|
| Capability to Use Case View | Which use cases support which capabilities, and which capabilities are still only experiments? |
| Use Case to Platform View | Which model, RAG, tool, policy, eval and observability building blocks does a given scenario reuse? |
| Model Impact View | Which use cases, prompts and release gates are affected by a change in a model or vendor? |
| Knowledge Impact View | Which RAG outputs, eval cases and customer journeys are affected when a policy document expires? |
| Tool Action Risk View | Which tools can modify customer, account, funds, case status or regulatory output? |
| Control Coverage View | Do all high-risk use cases have corresponding controls, tests and evidence? |
| Exception Aging View | Which architecture exceptions are about to expire, and which remain open? |
| Evidence Completeness View | Which releases lack eval, approval, monitoring or incident evidence? |
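Each view is in effect a query over repository metadata. As an illustration, the Model Impact View can be sketched as a grouping of prompt assets by use case. The `PromptAsset` record and its `model_id` link are assumptions: the minimum metadata table does not list a model reference on prompt assets, so a real repository would need that relationship recorded somewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptAsset:
    prompt_id: str
    use_case_id: str
    model_id: str  # assumed link from a prompt to the model route it uses


def model_impact(model_id: str, prompts: list) -> dict:
    """Map each affected use case to the prompt ids that depend on a model."""
    impact = {}
    for prompt in prompts:
        if prompt.model_id == model_id:
            impact.setdefault(prompt.use_case_id, []).append(prompt.prompt_id)
    return impact


# Hypothetical repository contents.
prompts = [
    PromptAsset("p1", "uc-service", "model-a"),
    PromptAsset("p2", "uc-credit", "model-b"),
    PromptAsset("p3", "uc-service", "model-a"),
]

print(model_impact("model-a", prompts))
```

The same pattern extends to the Knowledge Impact View by keying on `source_id` instead of `model_id`.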
5.4 Architecture Building Blocks And Solution Building Blocks
| ABB | SBB examples | Governance rule |
|---|---|---|
| Governed Model Access | model gateway, provider adapter, fallback route | All production model calls must go through an approved route |
| Evidence-Grounded Retrieval | source registry, ingestion pipeline, entitlement filter, hybrid retriever | RAG must have source owner, freshness and citation evidence |
| Controlled Tool Execution | tool gateway, schema validator, dry-run executor, approval router, action ledger | The model must not hold business system tokens directly |
| AI Policy Enforcement | policy engine, policy-as-code bundle, decision log | High-risk actions must pass a runtime policy decision |
| EvalOps And Release Gate | eval runner, golden set, red-team pack, release scorecard | A release decision must not rely on demo or UAT alone |
| AI Observability | trace schema, telemetry collector, quality dashboard, replay lab | A trace must link identity, model, prompt, retrieval, tool, policy and output |
| Governance Evidence | evidence binder, release memo, exception log, management review dashboard | Evidence is a product of running the architecture, not material backfilled after go-live |
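The Controlled Tool Execution and AI Policy Enforcement rows imply a decision point between the model and the business system. The following is a minimal sketch of that decision, assuming a numeric action tier where higher means more business impact and an approval threshold of 2; the tier scale, the threshold and all names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolCall:
    tool_id: str
    action_tier: int  # hypothetical scale: 0 = read-only, higher = more impact
    workflow: str
    approved_by: Optional[str] = None


def authorize(call: ToolCall, allowed_workflows: dict, approval_tier: int = 2) -> str:
    """Return 'deny', 'needs_approval' or 'allow' for a proposed tool call."""
    # A tool may only run inside workflows it is registered for.
    if call.workflow not in allowed_workflows.get(call.tool_id, set()):
        return "deny"
    # Calls at or above the approval tier require a named human approver.
    if call.action_tier >= approval_tier and call.approved_by is None:
        return "needs_approval"
    return "allow"


# Hypothetical registrations taken from the tool metadata (allowed workflows).
allowed = {
    "crm_note": {"complaint_handling"},
    "refund_draft": {"complaint_handling"},
}

print(authorize(ToolCall("refund_draft", 2, "complaint_handling"), allowed))
```

In a fuller design the gateway would also validate the call against the tool schema, support a dry run, and write every decision to the action ledger; those steps are omitted here.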
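6. Retail Financial Services Case: AI Customer Intelligence And Control Plane Transformation
6.1 Background
A regional retail financial services group wants to move three high-priority AI scenarios from PoC to controlled production:
| Domain | Use case | Business pressure | AI risk |
|---|---|---|---|
| Customer Service | Customer service policy Q&A and reply drafts | high handling time, inconsistent policy answers | customer-visible incorrect commitments, missed complaint escalation, citation of expired policy |
| Credit | Credit document summarization and policy matching | long approval cycle, duplicated document review | impact on customer rights, fairness, reason code and adverse action risk |
| AML | alert triage summary and investigation narrative drafts | case backlog, analyst rework | compliance evidence, SAR narrative quality, high-risk cases must not be closed automatically |
The EA decision is not "each of the three teams builds its own AI app" but:
- Use the capability portfolio to consolidate common capabilities.
- Use the ADM cycle to establish a risk-tiered release path for each scenario.
- Use ArchiMate to express business capabilities, application services, platform services, work packages and plateaus.
- Use the Architecture Repository to manage all decisions, controls and evidence.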
6.2 Capability Portfolio
| Capability | Customer Service | Credit | AML | Shared runway |
|---|---|---|---|---|
| Evidence-grounded knowledge assistance | policy Q&A, cited answer | credit policy matching | AML typology and procedure lookup | source registry, entitlement-aware RAG, citation eval |
| AI-assisted case summarization | complaint summary | application document summary | alert and transaction pattern summary | summarization eval, prompt registry, trace schema |
| Governed tool and action orchestration | refund draft, CRM note | checklist update, memo draft | case note draft | tool gateway, action tier, approval ledger |
| Human decision augmentation | agent review | underwriter review | analyst disposition | HITL workflow, override reason, QA sampling |
| AI release and evidence management | service release memo | model risk evidence | compliance evidence | eval service, evidence binder, management review |
6.3 ADM Application
| ADM phase | Case-specific work | Output |
|---|---|---|
| Preliminary | Establish AI governance forum and repository taxonomy | AI EA charter, RACI, evidence binder schema |
| A | Define transformation thesis: improve regulated customer and case work with governed AI | architecture vision, risk appetite, no-go boundaries |
| B | Map value streams: Resolve complaint, Originate loan, Investigate alert | capability heatmap, human accountability map |
| C Data | Define source registry for product policy, credit policy, AML policy, case data | knowledge cards, entitlement model, data contracts |
| C Application | Define reusable model gateway, RAG service, tool gateway, policy engine, eval service | application component map, service catalog |
| D | Define runtime boundary, private endpoints, telemetry, DLP, secrets, model routes | technology view, security and observability controls |
| E | Select solution options: shared control plane plus domain-specific workflows | ADR set, option matrix, work package backlog |
| F | Sequence transition plateaus from controlled pilots to production reuse | migration roadmap, funding gates |
| G | Run architecture conformance and eval gates before each release | review record, eval report, release memo |
| H | Use incidents, drift, policy changes and adoption telemetry to refresh roadmap | quarterly portfolio review, change impact analysis |
6.4 ArchiMate View Package
Motivation View
Driver: service inconsistency, credit cycle time, AML backlog
Goal: improve regulated work quality and efficiency with governed AI
Constraint: no autonomous credit decline, no direct model write to core systems
Requirement: all high-risk AI outputs require evidence and human confirmation
Strategy View
Capability: AI-assisted customer servicing
Capability: AI-assisted credit operations
Capability: AI-assisted financial crime operations
Capability: governed AI platform control plane
Business View
Value Stream: Resolve customer complaint
Value Stream: Originate consumer loan
Value Stream: Investigate AML alert
Business Role: Agent, Supervisor, Underwriter, AML Analyst, Model Risk Reviewer
Application View
Application Component: Contact Center AI Assistant
Application Component: Credit Policy Assistant
Application Component: AML Investigation Copilot
Application Component: Model Gateway, RAG Service, Tool Gateway, Policy Engine, Eval Service, Evidence Binder
Technology View
Technology Service: inference runtime, hybrid retrieval hosting, telemetry pipeline, secrets management, DLP, private endpoint
Implementation View
Plateau 1: controlled pilots
Plateau 2: governed production
Plateau 3: reusable AI platform
Work Package: build model gateway, build entitlement-aware RAG, implement eval gate, implement evidence binder
6.5 Release Gate Example
| Gate area | Customer Service | Credit | AML |
|---|---|---|---|
| Intended use | draft and cited answer | document summary and policy support | analyst summary and narrative draft |
| Prohibited use | autonomous refund approval or legal commitment | automated approve / decline | automated close, SAR final submission |
| Critical failures | unsupported claim, stale policy, missing complaint escalation | incorrect policy, protected class leakage, invalid reason code | missing suspicious pattern, fabricated evidence, auto-close attempt |
| Required human role | service agent and supervisor for high-risk | underwriter final decision | analyst final disposition |
| Eval evidence | citation correctness, tone, escalation, refusal | document fidelity, policy match, fairness slice | typology coverage, evidence grounding, compliance narrative quality |
| Monitoring | complaint, recontact, override, QA defects | override, appeal, adverse action review | rework, QA finding, regulatory issue |
| Evidence binder | release memo, eval report, trace sample, policy approval | model risk package, release memo, data evidence | compliance approval, eval report, analyst review evidence |
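The release gate rows can be read as an executable rule. The sketch below uses the three Gate 4 outcomes (release to pilot, limited release, no-go) and makes two assumptions that the table does not state: any critical failure or a missing human role forces no-go, and a metric below threshold downgrades the outcome to limited release. The metric name and threshold values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EvalResult:
    metrics: dict
    critical_failures: list = field(default_factory=list)


def gate_decision(result: EvalResult, thresholds: dict,
                  human_role_confirmed: bool) -> str:
    """Apply a simple Gate 4 rule to one eval result."""
    # Assumption: critical failures and missing human accountability block release.
    if result.critical_failures or not human_role_confirmed:
        return "no-go"
    # Assumption: a metric that is absent counts as 0.0 and fails its threshold.
    below = [m for m, t in thresholds.items() if result.metrics.get(m, 0.0) < t]
    return "limited release" if below else "release to pilot"


thresholds = {"citation_correctness": 0.95}  # hypothetical threshold
run = EvalResult(
    metrics={"citation_correctness": 0.99},
    critical_failures=["stale policy"],
)

print(gate_decision(run, thresholds, human_role_confirmed=True))
```

The example run scores well on its metric but still returns no-go because of the stale policy finding, which is the behavior the critical failures row is meant to guarantee.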
6.6 Architecture Roadmap
| Plateau | Capability state | Platform runway | Governance evidence |
|---|---|---|---|
| Plateau 0: Fragmented PoCs | teams test AI separately | direct model API, local prompts, manual spreadsheets | scattered documents, no traceability |
| Plateau 1: Controlled pilots | three use cases registered and risk-tiered | prompt registry, source registry, basic eval, model gateway MVP | intake, risk tier, pilot eval, limited release memo |
| Plateau 2: Governed production | selected workflows enter production with HITL | entitlement RAG, tool gateway, policy decision log, observability | release memo, trace samples, monitoring dashboard, exception log |
| Plateau 3: Enterprise reuse | shared AI capabilities reused across domains | reusable control plane, evidence binder, policy-as-code, continuous eval | management review, audit export, portfolio evidence, retirement criteria |
7. Templates
7.1 AI ADM Cycle Canvas
# AI ADM Cycle Canvas
## 1. Architecture Mandate
- business_domain:
- architecture_sponsor:
- capability_owner:
- product_owner:
- enterprise_architect:
- risk_owner:
- compliance_owner:
- release_authority:
## 2. Architecture Vision
- strategic_theme:
- value_stream:
- business_outcomes:
- risk_appetite:
- no_go_boundaries:
## 3. Business Architecture
- target_capabilities:
- current_maturity:
- target_maturity:
- value_stream_stages:
- human_accountability:
- operating_model_changes:
- adoption_metrics:
## 4. Data And Knowledge Architecture
- data_sources:
- knowledge_sources:
- source_owners:
- classification:
- entitlement_model:
- freshness_sla:
- lineage_and_retention:
- evidence_requirements:
## 5. Application Architecture
- ai_product_boundary:
- workflow_insertion_point:
- application_services:
- model_route:
- rag_services:
- tool_services:
- policy_services:
- eval_services:
- integration_contracts:
## 6. Technology Architecture
- deployment_boundary:
- inference_runtime:
- retrieval_runtime:
- identity_and_secrets:
- network_and_egress:
- telemetry_schema:
- resilience_and_fallback:
- cost_controls:
## 7. Opportunities And Solutions
- option_a:
- option_b:
- recommended_option:
- rationale:
- architecture_decisions:
- reusable_building_blocks:
- one_off_exceptions:
## 8. Migration Planning
- transition_plateaus:
- work_packages:
- release_sequence:
- dependencies:
- funding_gates:
- pilot_scope:
- production_scope:
- scale_conditions:
## 9. Implementation Governance
- conformance_checks:
- eval_contract:
- critical_failures:
- release_gate:
- exception_process:
- residual_risk_acceptance:
- rollback_trigger:
## 10. Change Management
- monitoring_signals:
- incident_triggers:
- model_change_process:
- policy_change_process:
- repository_update_rules:
- quarterly_review_questions:
7.2 ArchiMate AI Layer Map
# ArchiMate AI Layer Map
| Layer | Model elements | AI-specific content | Evidence |
|---|---|---|---|
| Motivation | stakeholder, driver, assessment, goal, outcome, requirement, constraint, principle | risk appetite, AI boundaries, human accountability, responsible AI principles | stakeholder concern matrix, architecture principles |
| Strategy | capability, resource, course of action, value stream | AI capability portfolio, shared AI runway, value stream impact | capability heatmap, roadmap thesis |
| Business | business actor, role, process, service, object, contract | AI-assisted workflow, final decision owner, review and approval roles | process signoff, RACI, operating procedure |
| Application | component, service, function, interface, data object | orchestration, model gateway, RAG, tool gateway, policy, eval, evidence binder | service catalog, API contract, trace schema |
| Technology | node, technology service, system software, artifact | inference runtime, retrieval runtime, telemetry, DLP, network boundary, policy runtime | deployment view, security control evidence |
| Implementation / Migration | work package, deliverable, event, plateau, gap | pilot, production gate, platform runway, model migration, policy rollout | release memo, work package evidence, transition roadmap |
7.3 Architecture Package
# AI Architecture Package
## Package Identity
- package_id:
- use_case_id:
- capability_id:
- release_id:
- risk_tier:
- status:
## Executive Architecture Summary
- business_problem:
- intended_use:
- prohibited_use:
- recommended_architecture:
- key_tradeoffs:
- residual_risks:
## View Package
- motivation_view:
- capability_view:
- value_stream_view:
- business_process_view:
- application_component_view:
- data_knowledge_view:
- tool_action_view:
- policy_control_view:
- eval_release_view:
- observability_evidence_view:
- migration_view:
## Decision Set
| ADR | Decision | Alternatives | Rationale | Consequence | Evidence |
|---|---|---|---|---|---|
## Control Set
| Control | Enforcement point | Test method | Owner | Evidence |
|---|---|---|---|---|
## Release Set
| Release condition | Required evidence | Approver | Decision |
|---|---|---|---|
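The Control Set table only helps if every row is complete: a control without an enforcement point or test method is policy text, not a control. A minimal sketch of that check follows, assuming each row is held as a dictionary whose keys mirror the table columns; the function name `incomplete_controls` and the sample rows are hypothetical.

```python
# Keys mirror the Control Set columns in the package template.
CONTROL_FIELDS = ("control", "enforcement_point", "test_method", "owner", "evidence")


def incomplete_controls(rows):
    """Return {control name: [missing fields]} for rows that are not fully specified."""
    gaps = {}
    for row in rows:
        # Treat absent, None and whitespace-only values as missing.
        missing = [f for f in CONTROL_FIELDS if not str(row.get(f) or "").strip()]
        if missing:
            gaps[row.get("control") or "<unnamed>"] = missing
    return gaps


# Hypothetical rows for illustration only.
rows = [
    {"control": "PII redaction", "enforcement_point": "model gateway",
     "test_method": "red-team prompt set", "owner": "Security",
     "evidence": "control test report"},
    {"control": "Prohibited wording", "enforcement_point": "",
     "test_method": "policy decision samples", "owner": "Compliance"},
]
print(incomplete_controls(rows))
```

The same pattern could be applied to the Decision Set and Release Set tables by swapping the field tuple.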
7.4 Stakeholder Review Pack
# AI Stakeholder Review Pack
## Review Context
- review_date:
- architecture_package:
- decision_requested:
- release_scope:
- risk_tier:
| Stakeholder | Review concern | View required | Evidence required | Decision right |
|---|---|---|---|---|
| Business Sponsor | value, adoption, operating impact | capability view, value stream view | baseline, benefit model, adoption plan | approve business scope |
| Product Owner | workflow fit, user experience, feedback loop | process view, runtime view | pilot plan, user acceptance evidence | approve product behavior |
| Enterprise Architect | architecture fit, reuse, technical debt | ArchiMate layer map, application view, roadmap | ADR, repository impact analysis | approve architecture conformance |
| Solution Architect | component responsibility, integration, failure modes | application component view, sequence view | API contract, runbook | approve solution design |
| Data Owner | source, quality, entitlement, retention | data and knowledge view | source card, data contract, retrieval test | approve data use |
| Security | identity, secrets, egress, DLP, access | technology view, security view | control test, threat review | approve security controls |
| Model Risk | intended use, eval, monitoring, drift | eval release view | eval report, monitoring plan | approve model risk position |
| Compliance | customer impact, policy compliance, recordkeeping | policy control view, evidence view | policy decision samples, release memo | approve compliance conditions |
| Operations | support, incident, fallback, training | operating view, observability view | runbook, training evidence | accept production operations |
| Audit | traceability, evidence completeness, exception handling | evidence view, repository view | evidence binder export | review audit readiness |
7.5 Portfolio Evidence
# AI Portfolio Evidence
## Portfolio Thesis
- strategic_theme:
- priority_value_streams:
- target_capabilities:
- risk_appetite:
- investment_horizon:
## Capability Evidence
| Capability | Owner | Current maturity | Target maturity | Value stream | KPI | Runway dependency |
|---|---|---:|---:|---|---|---|
## Use Case Evidence
| Use case | Capability | Risk tier | Pattern | Release status | Eval status | Evidence status |
|---|---|---|---|---|---|---|
## Platform Runway Evidence
| Runway item | Consumers | Owner | Status | Evidence | Reuse signal |
|---|---|---|---|---|---|
## Governance Evidence
| Decision | Forum | Date | Conditions | Evidence | Review cadence |
|---|---|---|---|---|---|
## Portfolio Decisions
- continue:
- scale:
- restrict:
- merge:
- retire:
- fund_runway:
7.6 ADM To AI RMF Crosswalk
| ADM concern | NIST AI RMF connection | AI EA evidence |
|---|---|---|
| Architecture mandate | Govern | AI EA charter, RACI, policy register |
| Business context | Map | value stream, impact assessment, stakeholder concern matrix |
| Risk and quality measurement | Measure | eval contract, test set, slices, monitoring metrics |
| Risk response and operation | Manage | release decision, risk treatment, incident runbook, rollback trigger |
| Architecture change | Govern / Manage | management review, policy refresh, repository update |
7.7 ADM To ISO 42001 Crosswalk
| ADM concern | ISO 42001-style management system concern | AI EA evidence |
|---|---|---|
| Organizational context | AI objectives, interested parties, scope | AI transformation thesis, stakeholder map, AI inventory scope |
| Planning | AI risks and opportunities, objectives, actions | risk tier model, capability roadmap, funding gates |
| Support | resources, competence, awareness, communication, documented information | AI EA roles, training plan, repository structure |
| Operation | operational planning and control | release gates, policy controls, tool action controls, runbook |
| Performance evaluation | monitoring, measurement, analysis and evaluation, internal audit, management review | eval reports, monitoring dashboard, management review pack |
| Improvement | nonconformity, corrective action, continual improvement | incident postmortem, regression cases, architecture refresh |
8. Architecture Review Questions
Strategy And Capability
- Which enterprise capability is this AI initiative improving?
- Which value stream stage changes because of AI?
- Which business outcome, risk outcome and adoption outcome define success?
- Is this a capability increment, a one-off feature, or a platform runway item?
- Who owns the capability after production release?
Architecture Fit
- Does the solution reuse approved model, RAG, tool, policy, eval and observability building blocks?
- Which exceptions are being requested, and when do they expire?
- What changes in application portfolio, data portfolio and technology portfolio?
- Which transition plateau does this release move the enterprise toward?
- Does the architecture package link to repository assets and evidence?
AI Risk And Controls
- What is the intended use and what is explicitly prohibited?
- What data enters prompt, retrieval, model provider, logs and evidence store?
- Which tool actions are read-only, draft, low-risk write, customer-impacting or prohibited?
- Where is policy enforced at runtime?
- Which critical failures block release?
- Which human role confirms, approves, overrides or rejects the AI output?
Evidence And Operations
- Can a production trace replay identity, purpose, prompt, model, retrieval, tool, policy, output and approval?
- Does the release memo show eval result, residual risk, approvers and rollback trigger?
- Are monitoring signals tied to the original use case risk?
- Are incidents converted into eval cases, policy changes or architecture changes?
- Can audit export an evidence binder without reconstructing the story manually?
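The trace replay question above can be turned into an automated completeness check. The sketch below assumes a production trace is available as a flat dictionary with one key per replay element; the field names are taken from the question, while the `replay_gaps` function and the sample trace are hypothetical. It also assumes that a step which legitimately did not happen (for example, no tool call) is recorded with an explicit marker such as `"none"` rather than left empty.

```python
# Replay elements named in the Evidence And Operations review question.
REPLAY_FIELDS = (
    "identity", "purpose", "prompt", "model", "retrieval",
    "tool", "policy", "output", "approval",
)


def replay_gaps(trace):
    """Return the replay elements that are absent or empty in a trace record."""
    return [f for f in REPLAY_FIELDS if trace.get(f) in (None, "", [], {})]


# Hypothetical trace record for illustration only.
trace = {
    "identity": "agent-1042",
    "purpose": "complaint reply draft",
    "prompt": "prompt-template-v3",
    "model": "model-route-a",
    "retrieval": [],            # nothing recorded, so this counts as a gap
    "tool": "none",             # explicit marker: no tool action taken
    "policy": "policy-decision-77",
    "output": "draft-reply-889",
    "approval": None,           # approval not captured
}
print(replay_gaps(trace))
```

Running a check like this over a sample of production traces gives audit a quantitative view of evidence completeness instead of a yes or no answer.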
9. Anti-Patterns
| Anti-pattern | Symptom | Better pattern |
|---|---|---|
| ADM as document theater | phases produce documents but do not drive release decisions | use ADM phases as funding, architecture, eval and release gates |
| ArchiMate as drawing exercise | model has many elements but no decision trace | model from stakeholder concerns and link to repository evidence |
| Technology-only AI reference architecture | app, LLM, vector DB and APIs are shown, but capability and governance are absent | add capability, value stream, policy, eval, evidence and migration views |
| Use case zoo | many PoCs, no shared runway | consolidate use cases into capability portfolio and platform ABBs |
| RAG as enterprise strategy | every AI problem becomes document search | choose AI pattern based on decision, workflow, risk and evidence needs |
| HITL as decoration | human clicks approve without authority, evidence or accountability | define final decision role, approval criteria, override reason and QA calibration |
| Eval as QA appendix | eval appears after build and cannot block release | define eval contract during architecture option phase |
| Control as policy text | controls exist in slides but not runtime | enforce policy through gateway, decision log, approval and telemetry |
| Repository as file dump | documents stored, relationships lost | maintain metadata and traceability across capability, use case, model, data, tool, control, release |
| Architecture board as bottleneck | every AI idea waits for manual review | risk-tiered gates, pre-approved patterns and exception management |
| Evidence after release | audit evidence assembled manually after incident | evidence by design from intake through monitoring |
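The "risk-tiered gates, pre-approved patterns and exception management" pattern in the table can be sketched as a routing rule that decides which review path an AI proposal takes. Everything specific here is an assumption for illustration: the tier labels, the pattern names in `PRE_APPROVED_PATTERNS`, the review path names and the precedence order would all be set by the organization's own governance policy.

```python
# Hypothetical catalog of patterns the architecture board has already approved.
PRE_APPROVED_PATTERNS = {"rag_readonly_assistant", "internal_summarization"}


def review_path(risk_tier, pattern, has_exception):
    """Route a proposal to a review path based on risk tier, pattern and exceptions."""
    # High-risk work always goes to the board, whatever pattern it uses.
    if risk_tier == "high":
        return "architecture_board"
    # Any requested exception needs explicit handling, even on a known pattern.
    if has_exception:
        return "exception_review"
    # Lower-risk work on a pre-approved pattern does not wait for manual review.
    if pattern in PRE_APPROVED_PATTERNS:
        return "self_service_checklist"
    # Everything else gets a lightweight review by an architect.
    return "architect_desk_review"


print(review_path("low", "rag_readonly_assistant", has_exception=False))
print(review_path("high", "rag_readonly_assistant", has_exception=False))
```

The point of the sketch is that the board only sees the cases that need judgment, which is what keeps it from becoming the bottleneck described in the anti-pattern.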
10. Interview Expression
10.1 30-Second Version
I would turn the TOGAF ADM into a continuous cycle across AI capability, use case, platform and governance. Phases A and B turn strategic themes into a capability portfolio; Phases C and D build data, application, model, tool, policy, eval and observability into the target architecture; Phases E and F handle options, runway and migration; Phases G and H handle release gates, conformance, monitoring and architecture refresh. ArchiMate expresses the relationships among these elements, and the Architecture Repository stores the decisions and evidence.
10.2 2-Minute Version
I will not use TOGAF to recite the ADM phases again. In AI enterprise architecture, I treat the ADM as a governance loop.
First, in the Architecture Vision and Business Architecture phases, I establish the AI transformation thesis, priority value streams, capability gaps and risk appetite. This step upgrades AI from a use case list to a capability portfolio.
Second, in the Data, Application and Technology Architecture phases, I define the governed data / knowledge architecture, AI application services, model gateway, RAG service, tool gateway, policy engine, eval service, observability and evidence binder. The point here is not to stack components but to draw clear boundaries between models, data, tools and controls.
Third, in the Opportunities, Migration and Implementation Governance phases, I use an option matrix and ADRs to make build / buy / platform trade-offs, use the architecture runway to sequence transition plateaus, and use the eval contract and release memo to decide between pilot, production or restrict.
Fourth, in the Change Management phase, I feed production telemetry, incidents, model changes, policy changes and adoption signals back into the repository to trigger an architecture refresh.
ArchiMate connects motivation, capability, business process, application service, technology service and work package. For example, a customer service AI scenario is not just an assistant; it maps to a driver, goal, capability, value stream, business role, RAG application service, model gateway, telemetry pipeline, work package and release evidence. Only then can EA manage investment, risk, platform reuse and audit evidence.
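10.3 Chief Architect Version
As Chief Architect, I would define the enterprise AI architecture as a governed capability system rather than an AI application stack. My architecture goals have three layers:
- Strategy control: AI investment must map to an enterprise capability, a value stream and a measurable outcome.
- Runtime control: models, data, RAG, tool actions, policy, eval and observability must pass through unified or standardized control points.
- Evidence control: every release must be able to prove its intended use, risk tier, eval result, residual risk acceptance, monitoring and rollback path.
TOGAF gives me the lifecycle and governance structure, ArchiMate gives me a cross-layer modeling language, and NIST AI RMF and ISO 42001 give me the language of risk and management systems. Combined, they let AI be governed from PoC into a scalable, operable and auditable enterprise capability.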
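10.4 AI Product Architect Version
As an AI Product Architect, I use EA methods to keep the product roadmap from turning into a feature list. Every AI product capability must connect to a capability outcome, workflow insertion point, human accountability, eval gate and monitoring signal.
Take customer service AI: I would not just write "generate reply drafts". I would model it as an AI-assisted customer servicing capability embedded in the Resolve complaint value stream, where the Agent owns the final customer reply, the Supervisor approves high-risk commitments, the RAG service provides entitlement-checked policy citations, the policy engine checks complaint escalation and prohibited wording, the eval gate blocks unsupported claims, and the evidence binder stores release and production traces.
That way the product roadmap aligns with the architecture runway, risk controls, adoption telemetry and portfolio funding.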
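10.5 Follow-Up Question Preparation
| Question | Answer signal |
|---|---|
| Why not manage AI directly with a technical reference architecture? | A technical reference architecture describes composition, but it does not manage capability investment, risk tier, funding gate, release evidence or change lifecycle. AI EA must cover everything from strategy to runtime evidence. |
| Isn't the ADM too heavy? | Phases should not exist to manufacture documents. I would use risk-tiered ADM tailoring: low-risk work uses a lightweight package, while high-risk, customer-impacting or regulated scenarios must go through the full gates and evidence. |
| What concrete value does ArchiMate bring to AI? | It connects business capabilities, application services, technology services, work packages and goals, so you can analyze how a model change, an expired data source, a tool action or a platform exception affects a business capability. |
| What should the Architecture Repository build first? | Start with the AI inventory, use case metadata, model route registry, source registry, tool catalog, eval contract registry and release evidence index. There is no need to aim for a complete toolchain on day one. |
| How do you prove an AI architecture is auditable? | Check whether the evidence binder can link intake, risk tier, stakeholder concerns, ADR, data approval, eval result, release decision, production trace, incident and management review. |
| How does the platform runway stay connected to the business? | Every runway item must have named consumers, a capability dependency, a release blocker, an owner, evidence and a reuse signal. Platform capabilities without consumers do not enter priority investment. |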
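The runway rule in the last row (each item needs named consumers, a capability dependency, a release blocker, an owner, evidence and a reuse signal) can be expressed as a simple eligibility filter over the portfolio. This is a minimal sketch under stated assumptions: runway items are held as dictionaries, and the field names, the `runway_funding_view` function and the sample items are hypothetical.

```python
# Attributes every runway item must carry before it competes for priority funding.
REQUIRED = (
    "consumers", "capability_dependency", "release_blocker",
    "owner", "evidence", "reuse_signal",
)


def runway_funding_view(items):
    """Split runway items into funding-eligible names and deferred items with their gaps."""
    eligible = []
    deferred = {}
    for item in items:
        # An empty list, empty string or absent key all count as missing.
        missing = [f for f in REQUIRED if not item.get(f)]
        if missing:
            deferred[item["name"]] = missing
        else:
            eligible.append(item["name"])
    return eligible, deferred


# Hypothetical runway items for illustration only.
items = [
    {"name": "model gateway",
     "consumers": ["customer servicing assistant"],
     "capability_dependency": "AI-assisted customer servicing",
     "release_blocker": "production release of complaint drafting",
     "owner": "AI platform team",
     "evidence": "API contract, control test",
     "reuse_signal": "two use cases onboarded"},
    {"name": "vector store v2",
     "consumers": [],
     "capability_dependency": "knowledge retrieval",
     "release_blocker": "none identified yet",
     "owner": "data platform team",
     "evidence": "design note"},
]
eligible, deferred = runway_funding_view(items)
print(eligible, deferred)
```

Deferred items are not rejected; they are returned with the specific gaps to close, which keeps the conversation with platform teams concrete.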
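11. Final Mental Model
The value of the TOGAF ADM in the AI era is not its phase names but its ability to manage enterprise change as a governable lifecycle.
The value of ArchiMate in the AI era is not its notation but its ability to place strategic motivation, capabilities, processes, applications, technology and migration states in a single analyzable model.
The value of the Architecture Repository in the AI era is not storing documents but storing relationships, decisions, controls, releases and evidence.
One-sentence summary: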
AI EA is the discipline of turning AI ambition into governed capabilities, reusable runway, risk-tiered releases, operating accountability and audit-ready evidence.