AI Adoption Analytics: Behavior Change and Value Realization Architecture

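The sources below are used to organize the language of AI risk management, AI management systems, change management, observability, engineering performance, and value evidence. This article is learning and portfolio material; it does not constitute a legal, compliance, audit, or regulatory conclusion.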

AI Adoption Analytics / Behavior Change / Value Realization Architecture: An Interpretation

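Target audience: Senior AI PM / AI Architect / Business Architect / CBAP-level BA / AI Transformation Lead / AI Value Office Lead / Financial Retail Operations Leader.

Learning objectives: Build an evidence architecture that can show AI is genuinely adopted, changes how work is done, improves process outcomes, and produces durable value, rather than reporting only logins, prompt counts, seat activations, or demo satisfaction.

Core question: After AI ships, how do you show that frontline staff have really folded it into work-as-done, that behavior and process are changing, that risk has not been shifted into manual review or customer harm, and that the value is not a short-lived novelty effect?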

Source Anchors

Source	Link	Idea adopted in this article
NIST AI Risk Management Framework	https://www.nist.gov/itl/ai-risk-management-framework	Use Govern / Map / Measure / Manage to organize AI adoption evidence, risk measurement, continuous monitoring, and the remediation loop
ISO/IEC 42001 AI management system	https://www.iso.org/standard/81230.html	Use the AI management system language of policy, objective, operation, performance evaluation, and improvement to manage adoption and value realization
Prosci ADKAR	https://www.prosci.com/blog/adkar-model	Use Awareness, Desire, Knowledge, Ability, Reinforcement to explain why behavior change is not the same as training completion rate
OpenTelemetry Documentation	https://opentelemetry.io/docs/	Use the ideas behind traces, metrics, logs, and semantic conventions to design AI adoption telemetry and workflow observability
DORA	https://dora.dev/	Use the ideas behind deployment frequency, lead time for changes, change failure rate, and time to restore to connect AI product change, operational learning, and engineering system health
NIST AI RMF Playbook	https://airc.nist.gov/AI_RMF_Knowledge_Base/Playbook	Use the AI RMF's action-oriented language to organize evidence collection, monitoring, and review routines
FFIEC Management IT Handbook	https://ithandbook.ffiec.gov/it-booklets/management.aspx	Use governance, risk identification, monitoring, and reporting language to meet the evidence needs of financial-institution management

In one sentence:

AI adoption analytics is not usage reporting; it is an evidence system that links exposure, trusted use, behavior change, workflow quality, control performance, unit economics and durable outcome realization.

1. Executive Summary

Many AI projects fail not because the model is unusable, but because the enterprise cannot show that the following chain holds:

AI shipped
  -> target users were exposed in real workflow
  -> users trusted it for the right tasks
  -> behavior changed in work-as-done
  -> process outcomes improved without hidden risk transfer
  -> benefits were realized after cost, review load and control overhead
  -> operating loop reinforced the new behavior

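Looking only at usage leads to serious misjudgment:

Vanity signal	Why it is dangerous	Evidence to add
High prompt count	Users may be repeatedly fixing errors, exploring out of novelty, or using the tool under mandate	accepted suggestion rate, task completion, rework, sentiment, quality
High MAU	May reflect logins only, with no use in critical work steps	workflow step coverage, active case penetration, decision influence
High seat activation	Licenses may be assigned without any behavior change	cohort adoption, returning qualified use, manager reinforcement
Time-saved survey	Easily overestimates, and the saved time may reappear as review work or complaints	cycle time, human review load, exception queue, quality and control evidence
Accuracy improvement	Model metrics do not automatically become business value	operational lift, policy compliance, loss reduction, cost-to-serve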

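The job of a senior AI PM / Architect / BA is not to build an adoption dashboard but to define:

What counts as a real adoption event.
What the current work-as-done baseline is.
Which decision, artifact, handoff, or control AI changes in the process.
How value metrics are adjusted for cohort, stage, risk, cost, and quality.
Whether low adoption is a product, process, trust, incentive, manager-cadence, or change-saturation problem.
When to scale, when to redesign, and when to stop.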

2. Target Audience and Role Expectations

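Role	Questions to own	Typical outputs
Senior AI PM	Whether target users fold the AI product into core work persistently, correctly, and willingly	adoption event taxonomy, behavior funnel, scale/stop memo
AI Architect	Whether telemetry, identity, workflow, policy, evidence, and outcome data are traceable	adoption observability architecture, event schema, control trace
CBAP-level BA	Whether real processes, roles, rules, exceptions, resistance, and value leakage are modeled	work-as-done baseline, change impact map, stakeholder adoption analysis
Operations Lead	Whether AI improves queues, cycle time, quality, review burden, and customer experience	operating review pack, manager coaching loop, exception taxonomy
Risk / Control Partner	Whether AI adoption introduces over-reliance, control bypass, incorrect escalation, or audit blind spots	control override log, human review evidence, risk acceptance record
Finance / Value Office	Whether benefits are attributable, reproducible, and scalable	benefits register, unit economics, value leakage analysis

Mature organizations treat adoption as a cross-functional evidence system rather than handing it to the training team or a product analyst alone.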

3. Thesis: Adoption Is Behavior Change, Not Usage Volume

The minimum unit of proof for AI adoption is not:

user clicked AI button

but rather:

in a named workflow step, a target user used an AI output in a governed way that changed or improved the work artifact, decision, handoff, cycle time, quality, control performance or customer outcome.

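This means adoption must answer seven questions at once:

Question	Explanation
Who	Who the target users are: novices, experts, managers, contractors, branches, regional teams, or central operations
Where	In which process, system, queue, case type, customer journey, or risk tier it is used
What	Whether the AI output influenced a summary, recommendation, decision, next action, customer reply, investigation note, or control evidence
How	Whether the user accepted, edited, rejected, overrode, escalated, regenerated, or bypassed the output
Why	The reason for use or rejection, and whether it relates to trust, quality, policy, incentives, or time pressure
So what	Whether the behavior change moved cycle time, quality, risk, cost, or customer experience
For how long	Whether the change persists across cohorts, time, managers, process versions, and model versions

In financial retail, AI adoption is usually workflow adoption:

Whether an AML investigator uses the AI-generated case narrative to shorten evidence gathering before SAR prep, not how many times the copilot was opened.
Whether a contact-center agent adopts suggested answers within controlled scripting boundaries and thereby lowers hold time and repeat contact, not the number of suggestion impressions.
Whether a KYC onboarding analyst uses the AI document completeness check to reduce rework and customer chase, not the volume of OCR calls.
Whether a credit ops analyst uses the AI collateral summary to find missing conditions and raise first-pass approval quality, not the number of summaries generated.
Whether a branch / relationship manager turns a copilot insight into a compliant next customer action, not weekly active use.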

4. Conceptual Model: Adoption-to-Value Evidence Chain

An eight-layer evidence chain is recommended:

1. Exposure
2. Qualified use
3. Trust-calibrated use
4. Behavior change
5. Workflow quality
6. Outcome movement
7. Value realization
8. Reinforced operating loop

Layer	Key question	Example metrics
Exposure	Do target users see AI in their real work	eligible users exposed, case coverage, workflow step availability
Qualified use	Is it used on target tasks rather than in random exploration	qualified action rate, task-matched AI invocation, returning use
Trust-calibrated use	Do users accept in the right situations and reject or escalate in uncertain ones	accept/edit/reject/override mix, escalation appropriateness, trust calibration
Behavior change	Has the way of working changed	artifact reuse, new sequence adoption, manual step removal, handoff change
Workflow quality	Has process quality improved	first-pass quality, rework, exception rate, control defects
Outcome movement	Have business results moved	AHT, cycle time, STP, loss rate, conversion, complaints
Value realization	Is there a benefit after deducting cost and risk	net benefit, cost-to-serve, human review load, value leakage
Reinforcement	Does the organization reinforce the new behavior	manager coaching, SOP updates, performance cadence, feedback loop closure

The chain is only as strong as its weakest link. A contact-center AI can have very high exposure, but if agents merely copy its suggestions and complaints rise as a result, that is not successful adoption; it is over-reliance.
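The weakest-link idea can be sketched in a few lines of Python. This is a minimal illustration, assuming each layer has already been scored on a 0 to 1 scale; the scoring scale, the snake_case layer names, and the `weakest_link` helper are assumptions for illustration and not part of any source framework.

```python
from typing import Dict, Tuple

# Layer names mirror the eight-layer chain; the 0-1 scores are hypothetical.
LAYERS = [
    "exposure", "qualified_use", "trust_calibrated_use", "behavior_change",
    "workflow_quality", "outcome_movement", "value_realization", "reinforcement",
]


def weakest_link(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the lowest-scoring layer and its score."""
    missing = [name for name in LAYERS if name not in scores]
    if missing:
        raise ValueError(f"missing layer scores: {missing}")
    layer = min(LAYERS, key=lambda name: scores[name])
    return layer, scores[layer]


# Very high exposure does not rescue a chain with weak workflow quality.
example_scores = {name: 0.8 for name in LAYERS}
example_scores["exposure"] = 0.95
example_scores["workflow_quality"] = 0.3
print(weakest_link(example_scores))
```

Reporting the minimum rather than an average keeps one strong layer, such as exposure, from masking a weak one.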

5. Work-as-Done Baseline

Adoption analytics without a baseline is just a launch story.

The work-as-done baseline must capture real work, including informal workarounds, human judgment, system switching, asking for help over chat, and manager approvals. The BA adds a great deal of value here, because system logs capture only part of work-as-imagined.

Baseline dimension	What to capture	Financial retail example
Case mix	Case type, risk tier, complexity, channel, region	AML alert type, KYC entity type, credit exception class
Actor map	Roles, experience levels, authority boundaries, manager intervention	investigator L1/L2, branch RM, contact-center specialist
Activity sequence	Actual steps and system switches	CRM -> policy search -> core banking -> notes -> supervisor chat
Decision points	Which judgments determine the next step	escalate, close, request document, approve with condition
Artifacts	Which business records and customer communications are produced	investigation narrative, call note, KYC deficiency notice
Controls	Which steps are control points	sanctions check, suitability disclosure, dual review
Pain points	Waiting, rework, missing information, unclear policy	document chase, duplicated note-taking, uncertain policy
Informal workarounds	What users actually use to cover system gaps	shared spreadsheet, saved templates, peer review chat
Current metrics	Current cycle time, quality, cost, queues, complaints	AHT, alert aging, first-pass yield, re-open rate

A baseline cannot come from interviews alone. A recommended combination:

SME interview and observation.
Process mining or workflow log analysis.
Case note sampling.
Screen-flow / clickstream review, within privacy and authorization boundaries.
Manager review cadence and coaching artifacts.
Exception queue and rework analysis.
Customer complaint and quality assurance sampling.

6. Adoption Event Taxonomy

The adoption event taxonomy is the core of the whole architecture. It breaks "the user used AI" into events that are explainable, controllable, and auditable.

Event class	Event examples	Value meaning
Exposure	`ai_surface_shown`, `suggestion_presented`, `copilot_available_in_case`	The user has an opportunity to adopt
Intent	`open_assistant`, `ask_policy_question`, `request_case_summary`	The user expresses task intent
AI output	`summary_generated`, `recommendation_returned`, `next_best_action_returned`	The system produces a usable output
Human response	`accepted`, `edited`, `rejected`, `ignored`, `regenerated`	Initial trust and usability
Decision influence	`used_in_case_note`, `used_in_disposition`, `used_in_customer_response`	AI enters a business artifact or decision
Control action	`override`, `escalate`, `dual_review_requested`, `policy_boundary_hit`	Risk control and trust calibration
Learning signal	`feedback_positive`, `feedback_negative`, `reason_selected`, `defect_reported`	Product and model learning
Outcome link	`case_closed`, `customer_contact_completed`, `document_deficiency_resolved`	Connects to process outcomes
Reinforcement	`manager_coached`, `sop_updated`, `job_aid_viewed`, `team_review_completed`	The organization reinforces the new behavior

The key is to distinguish:

AI touched the workflow
vs
AI changed the workflow
vs
AI improved the workflow

Many projects can show only the first level yet claim the third to management.
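One way to keep the three levels apart is to derive the claim from the event classes actually observed for a case. The sketch below is illustrative only: it assumes a decision-influence event is the minimum evidence for "changed", and that a separate baseline comparison supplies the `outcome_improved` flag. Both rules and the `evidence_level` helper are hypothetical.

```python
from typing import Set

# Event classes that only show AI was present in the workflow (assumed grouping).
TOUCH_CLASSES = {"exposure", "intent", "output", "response"}


def evidence_level(event_classes: Set[str], outcome_improved: bool) -> str:
    """Map observed event classes to the strongest claim they can support."""
    changed = "influence" in event_classes
    if changed and outcome_improved:
        return "improved"
    if changed:
        return "changed"
    if event_classes & TOUCH_CLASSES:
        return "touched"
    return "none"


print(evidence_level({"exposure", "output", "response"}, outcome_improved=True))
print(evidence_level({"exposure", "influence"}, outcome_improved=False))
```

Note that in this sketch an improved outcome without any influence event still reports only "touched", which guards against crediting AI for a metric that moved for other reasons.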

7. Reference Architecture

Adoption analytics has to connect product, process, identity, model, control, and business outcomes at once.

AI Experience Layer
  -> Workflow Integration Layer
  -> Adoption Telemetry Layer
  -> Identity / Cohort / Entitlement Layer
  -> Process and Outcome Data Layer
  -> Risk / Control Evidence Layer
  -> Analytics and Attribution Layer
  -> Value Realization and Operating Review Layer

7.1 Architecture Components

Component	Responsibility	Key design points
AI experience instrumentation	Capture what users see, ask, accept, edit, reject, give feedback on, and escalate	event taxonomy, low-friction reason capture, privacy minimization
Workflow context resolver	Bind AI events to case, customer journey, process step, queue, and risk tier	workflow id, case id, stage id, control point id
Identity and cohort service	Define eligible users, role, team, region, experience, training, and manager	cohort analysis, manager effect, change saturation
Model and prompt registry	Record model, prompt, policy pack, and tool version	adoption can be correlated with model version, release, and eval results
Outcome data connector	Pull process and business outcomes	AHT, cycle time, rework, STP, quality score, loss, complaint
Human review tracker	Capture review load and control overrides	reviewer time, queue depth, override reason, defect severity
Risk evidence store	Store control hits, escalations, exceptions, and audit trail	does not replace compliance judgment, but provides governance evidence
Analytics workspace	Funnel, cohort, attribution, leakage, and saturation analysis	baseline comparison, matched cohorts, segment drilldown
Operating review pack	Turn metrics into action	manager coaching, product backlog, risk actions, scale/stop decision

7.2 Telemetry Design Principles

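Principle	Explanation
Instrument workflow, not only UI	Record AI's effect on business steps and artifacts
Preserve context	Events without case type, risk tier, role, and stage are hard to interpret
Capture negative signals	Rejection, ignore, override, regenerate, and complaint are all high-value evidence
Minimize sensitive payload	Event records should hold references, classifications, and necessary summaries, and avoid unnecessary customer data
Version everything	Model, prompt, policy, workflow, training, SOP, and feature flag all need versioning
Connect to outcomes	Adoption events must be joinable to process quality and business outcomes
Make evidence reviewable	Metrics must trace back to samples, definitions, calculation rules, and owners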

8. Data / Telemetry Schema

Below is a minimum viable adoption event schema for a financial retail AI copilot. It is not a physical database design but a canonical event contract that aligns BA, PM, and Architect on definitions.

Field	Type	Description
event_id	string	Globally unique event ID
event_time	timestamp	Time the event occurred
event_name	enum	Event name from the taxonomy
event_class	enum	exposure / intent / output / response / influence / control / learning / outcome / reinforcement
user_id_hash	string	De-identified user identifier
role	enum	investigator / agent / analyst / RM / manager / QA / supervisor
team_id	string	Team or branch
manager_id_hash	string	De-identified manager, used for manager effect analysis
cohort_id	string	pilot cohort, region, experience cohort, training cohort
workflow_id	string	AML investigation, KYC onboarding, credit ops, contact center, etc.
workflow_stage	string	triage, review, customer contact, decision, quality check
case_id_hash	string	De-identified case reference
case_type	string	alert type, call reason, KYC entity type, credit exception
risk_tier	enum	low / medium / high / material
customer_segment	string	retail, SMB, wealth, branch, digital
ai_surface	string	embedded panel, inline suggestion, draft generator, policy search
model_id	string	model registry id
prompt_version	string	prompt or policy pack version
tool_ids	array	agent tools or connectors used
output_type	enum	summary / recommendation / draft / classification / next action / risk flag
confidence_band	enum	calibrated band if used; avoid false precision
user_action	enum	accept / edit / reject / ignore / regenerate / override / escalate
edit_distance_band	enum	none / light / material / rewrite
reason_code	enum	useful / inaccurate / incomplete / unsafe / policy_unclear / too_slow / not_relevant
control_point_id	string	linked control or policy boundary
override_reason	string	required when user overrides AI or control suggestion
human_review_required	boolean	Whether human review is required
human_review_minutes	number	Review burden; can be backfilled in aggregate
downstream_artifact_id	string	note, letter, decision record, case narrative
outcome_event_id	string	linked process outcome event
latency_ms	number	User-perceived latency
cost_estimate	number	Estimated token, tool, license, or unit cost
privacy_class	enum	event-only / sensitive-reference / restricted
retention_class	enum	analytics / business-record-link / control-evidence
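A contract like this is easier to keep honest when the enum values and cross-field rules are checked in code. The following is a minimal sketch, assuming events arrive as plain dictionaries. The enum values come from the table, while the choice of required fields and the `validate_event` helper are assumptions for illustration.

```python
from typing import Any, Dict, List

# Enum values copied from the schema table.
ENUM_FIELDS = {
    "event_class": {"exposure", "intent", "output", "response", "influence",
                    "control", "learning", "outcome", "reinforcement"},
    "user_action": {"accept", "edit", "reject", "ignore", "regenerate",
                    "override", "escalate"},
    "risk_tier": {"low", "medium", "high", "material"},
    "edit_distance_band": {"none", "light", "material", "rewrite"},
}

# Hypothetical minimum set of fields every event must carry.
REQUIRED_FIELDS = ("event_id", "event_time", "event_name", "event_class", "workflow_id")


def validate_event(event: Dict[str, Any]) -> List[str]:
    """Return a list of contract violations; an empty list means the event passes."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not event.get(field):
            errors.append(f"missing required field: {field}")
    for field, allowed in ENUM_FIELDS.items():
        value = event.get(field)
        if value is not None and value not in allowed:
            errors.append(f"{field} has unknown value: {value!r}")
    # Cross-field rule from the table: an override must carry a reason.
    if event.get("user_action") == "override" and not event.get("override_reason"):
        errors.append("override_reason is required when user_action is 'override'")
    return errors


sample_event = {
    "event_id": "e-1", "event_time": "2024-01-01T00:00:00Z",
    "event_name": "accepted", "event_class": "response",
    "workflow_id": "kyc_onboarding", "user_action": "accept", "risk_tier": "low",
}
print(validate_event(sample_event))
```

Returning a list of violations instead of raising on the first one fits a telemetry quality report, where the count and type of defects matter more than failing fast.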

8.1 Example Events

{
  "event_name": "ai_summary_edited",
  "event_class": "response",
  "workflow_id": "aml_alert_investigation",
  "workflow_stage": "case_narrative_draft",
  "role": "investigator",
  "case_type": "transaction_monitoring_alert",
  "risk_tier": "high",
  "output_type": "summary",
  "user_action": "edit",
  "edit_distance_band": "material",
  "reason_code": "incomplete",
  "control_point_id": "aml_secondary_review_required",
  "human_review_required": true
}

{
  "event_name": "suggested_answer_accepted",
  "event_class": "influence",
  "workflow_id": "contact_center_agent_assist",
  "workflow_stage": "customer_response",
  "role": "agent",
  "case_type": "card_dispute_status",
  "risk_tier": "medium",
  "output_type": "recommendation",
  "user_action": "accept",
  "edit_distance_band": "none",
  "reason_code": "useful",
  "outcome_event_id": "call_completed"
}

9. Metrics Hierarchy

Adoption analytics needs a metrics hierarchy; otherwise teams treat the easiest-to-collect usage numbers as value.

Telemetry metrics
  -> Adoption metrics
  -> Behavior change metrics
  -> Flow / quality metrics
  -> Risk / control metrics
  -> Value metrics
  -> Durability metrics

Layer	Metrics	Explanation
Telemetry	event completeness, schema coverage, trace join rate	Is the data trustworthy
Adoption	eligible-user active rate, qualified use rate, returning qualified use, case penetration	Is it used by the target population on target tasks
Behavior	accept/edit/reject mix, artifact reuse, manual step reduction, handoff change	Has behavior changed
Flow / quality	cycle time, queue aging, first-pass quality, rework, QA defects	Has the process improved
Risk / control	override rate, escalation appropriateness, over-reliance signals, control defects	Is risk under control
Value	net hours released, cost-to-serve, loss reduction, conversion lift, complaint reduction	Is there a benefit
Durability	cohort retention, post-novelty persistence, manager variance, model version stability	Is it sustainable

9.1 Leading and Lagging Indicators

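Type	Examples	Use
Leading	qualified use rate, accepted-with-light-edit rate, feedback density, manager coaching completion	Judge whether adoption is building early momentum
Intermediate	first-pass quality, rework reduction, review queue depth, policy search time	Judge whether behavior is improving the process
Lagging	cost reduction, loss reduction, revenue lift, complaint reduction, regulatory finding reduction	Judge whether value has been realized

Do not treat a leading indicator as closing the business case. It can only show that the use case deserves continued observation or is a scale candidate; it cannot prove benefit on its own.

9.2 Anti-Metrics

Anti-metric	What it may indicate
Prompt count rises but cycle time does not fall	Users are fighting the system
Accept rate is very high but defects rise	Over-reliance or missing review
Reject rate is high with reason "policy unclear"	Boundaries and scripting are not trusted
Human review load rises by more than the time saved	Value is leaking into the review queue
Manager variance is very large	Adoption depends on local champions and is not institutionalized
Initial gains fade after 4 weeks	Novelty effect or insufficient reinforcement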
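Several of these anti-metrics are simple combinations of two signals, so they can be flagged automatically before a review meeting. The sketch below covers three rows of the table. The field names, the 0.9 accept-rate threshold, and the `anti_metric_flags` helper are hypothetical and would need to be calibrated per workflow.

```python
from dataclasses import dataclass
from typing import List

HIGH_ACCEPT_RATE = 0.9  # hypothetical threshold, not taken from any source


@dataclass
class MetricSnapshot:
    prompt_count_change_pct: float        # change vs baseline period
    cycle_time_change_pct: float          # negative means cycle time fell
    accept_rate: float                    # share of outputs accepted, 0-1
    defect_rate_change_pct: float         # change vs baseline period
    minutes_saved_per_case: float
    review_minutes_added_per_case: float


def anti_metric_flags(m: MetricSnapshot) -> List[str]:
    """Return the anti-metric patterns present in one snapshot."""
    flags = []
    if m.prompt_count_change_pct > 0 and m.cycle_time_change_pct >= 0:
        flags.append("prompt_count_up_cycle_time_flat")
    if m.accept_rate >= HIGH_ACCEPT_RATE and m.defect_rate_change_pct > 0:
        flags.append("high_accept_rising_defects")
    if m.review_minutes_added_per_case > m.minutes_saved_per_case:
        flags.append("review_load_exceeds_savings")
    return flags


risky = MetricSnapshot(20.0, 0.0, 0.95, 5.0, 5.0, 7.0)
print(anti_metric_flags(risky))
```

A flag here is a prompt for diagnosis, not a verdict; the cohort and case-mix context still decides what the pattern means.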

10. Behavior Funnel

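The behavior funnel moves adoption from "seen" to "stably changed".

Funnel step	Definition	Drop-off diagnosis
Eligible	User and case match the target scenario	Wrong cohort definition, incomplete entitlement
Exposed	AI appears at the right work step	UI / workflow integration is incomplete
Engaged	User actively opens or responds to AI	Value is unclear, entry point is unnatural, response is slow
Assisted	AI output is read and enters the task	Output is irrelevant or the format does not fit
Influenced	User accepts it, adopts it after editing, or uses it to decide	Insufficient trust, unclear policy boundary, unstable quality
Completed	Task or artifact is completed	Blocked by downstream system or approval
Improved	Cycle time, quality, or risk metrics improve	AI only moves work around without changing the bottleneck
Reinforced	Managers, SOPs, training, and performance cadence support the new way	Adoption depends on individual enthusiasm

For AML investigator adoption, the funnel might be: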

eligible AML alerts
  -> alerts with copilot panel visible
  -> investigator requests alert summary
  -> summary used in investigation note
  -> note passes QA with no material correction
  -> alert aging reduced
  -> investigator returns to use in next high-risk case
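Once each funnel step has a count, step-to-step conversion and the largest drop-off follow directly. This is a minimal sketch with made-up counts for the first five AML steps; the step names, the numbers, and both helper functions are illustrative assumptions.

```python
from typing import List, Tuple


def step_conversion(counts: List[Tuple[str, int]]) -> List[Tuple[str, float]]:
    """Conversion rate into each step from the step before it."""
    rates = []
    for (_, prev_n), (name, n) in zip(counts, counts[1:]):
        rates.append((name, n / prev_n if prev_n else 0.0))
    return rates


def largest_drop_off(counts: List[Tuple[str, int]]) -> str:
    """Name of the step with the lowest conversion from its predecessor.

    Expects at least two funnel steps.
    """
    return min(step_conversion(counts), key=lambda item: item[1])[0]


# Hypothetical counts for one review period.
aml_funnel = [
    ("eligible_alerts", 1000),
    ("panel_visible", 800),
    ("summary_requested", 400),
    ("summary_used_in_note", 300),
    ("note_passed_qa", 270),
]
print(step_conversion(aml_funnel))
print(largest_drop_off(aml_funnel))
```

In this made-up data the biggest loss is between seeing the panel and requesting a summary, which the drop-off diagnosis column would read as an engagement problem rather than a quality problem.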

11. Cohort Analysis

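Without cohorts, adoption metrics blend together different populations, managers, risk levels, case complexity, and training waves.

Cohort dimension	Why it matters
Role / level	Novices and experts adopt AI for different reasons: novices need guidance, experts need speed and precision
Team / manager	Manager reinforcement often affects sustained adoption more than training does
Region / branch	Local policy, customer base, performance pressure, and capacity constraints differ
Case type / risk tier	High adoption in low-risk scenarios does not mean high-risk scenarios are ready to scale
Tenure / experience	AI may shorten ramp-up for new hires, or feel like a distraction to experts
Training wave	Separates product improvement from enablement improvement
Model / prompt version	Adoption changes may come from quality changes rather than the change program
Feature flag exposure	Enables matched cohort or stepped-wedge rollout

Recommended analyses:

New vs experienced investigator adoption curve.
Manager A vs manager B differences and coaching patterns.
Low-risk vs high-risk KYC case penetration.
Pre-training vs post-training qualified use.
Accept/edit/reject mix before and after a model version change.
Effect of contact-center queue type on AHT, QA, and repeat contact.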
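Most of these analyses reduce to the same operation: split events by a cohort field and compare the mix of user actions. The sketch below shows that operation in plain Python, assuming events are dictionaries with a `user_action` field as in the schema; the sample events and the `action_mix_by_cohort` helper are illustrative.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Mapping


def action_mix_by_cohort(
    events: Iterable[Mapping[str, str]], cohort_field: str
) -> Dict[str, Dict[str, float]]:
    """Share of each user_action within each value of cohort_field."""
    counts = defaultdict(Counter)
    for event in events:
        counts[event[cohort_field]][event["user_action"]] += 1
    mix = {}
    for cohort, actions in counts.items():
        total = sum(actions.values())
        mix[cohort] = {action: n / total for action, n in actions.items()}
    return mix


# Hypothetical events before and after a prompt version change.
sample_events = [
    {"prompt_version": "v1", "user_action": "accept"},
    {"prompt_version": "v1", "user_action": "edit"},
    {"prompt_version": "v1", "user_action": "reject"},
    {"prompt_version": "v1", "user_action": "reject"},
    {"prompt_version": "v2", "user_action": "accept"},
    {"prompt_version": "v2", "user_action": "accept"},
    {"prompt_version": "v2", "user_action": "edit"},
    {"prompt_version": "v2", "user_action": "accept"},
]
print(action_mix_by_cohort(sample_events, "prompt_version"))
```

Passing `"role"`, `"manager_id_hash"`, or `"risk_tier"` as the cohort field gives the other cuts from the same function, provided those fields are present on every event.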

12. Behavior Change Model

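Prosci ADKAR can serve as a diagnostic framework for behavior change, but in AI settings it must be engineered into telemetry and operating cadence.

ADKAR stage	AI adoption interpretation	Evidence
Awareness	Users know why the change is happening and which process problem AI solves	launch narrative recall, manager briefing, problem framing survey
Desire	Users are willing to try it and believe it benefits them without hurting their performance	opt-in demand, champion participation, resistance signal trend
Knowledge	Users know when to use it, when not to, and how to escalate	policy boundary quiz, in-product guidance use, correct escalation
Ability	Users can complete the new way of working on real cases	qualified task completion, light-edit acceptance, rework reduction
Reinforcement	The new behavior is reinforced by managers, SOPs, metrics, and feedback loops	manager coaching log, SOP update, returning use, performance review alignment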

12.1 Resistance Signals

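Signal	Likely cause	Product / BA / operations action
High ignore rate	Intrusive entry point, wrong output timing	Adjust trigger and placement
High regenerate rate	Unstable output, or users do not know how to ask	Revise prompts, template the task, improve retrieval
High reject with "policy unclear"	Boundaries are not trusted	Add policy citations, approval boundaries, and explanations
Shadow use of external AI	The official tool does not meet real work needs	Analyze the unmet need and improve the sanctioned tool
Manager discourages use	Incentives or risk accountability are unclear	Update SOP, RACI, and manager scorecard
Users accept then rewrite	Format does not match the business artifact	Use the business artifact as the output contract
Adoption concentrated in champions	Insufficient organizational reinforcement	Establish peer coaching and a team-level cadence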

12.2 Change Saturation

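Change saturation is a variable that advanced adoption analysis must include. A team may not be resisting AI; it may be absorbing a core system migration, a new product launch, regulatory remediation, a reorganization, and performance pressure all at once.

Saturation factor	Adoption implication
Concurrent process changes	AI adoption drop-off may come from process instability
Staffing shortage	Users have no time to learn or give feedback
High queue backlog	Short-term pressure drives copy/paste or control bypass
Policy changes	Users hesitate to trust AI output
Manager turnover	The reinforcement loop breaks
Incentive conflict	Users are rewarded for speed but bear the quality risk

A scale decision must include a change load review; otherwise an organizational capacity problem is misread as product failure or user resistance.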

13. Outcome Attribution

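The hard part of AI value proof is not "the metric moved" but "why the metric moved". Financial retail usually cannot run a simple A/B test at will, because of constraints around customer fairness, operational capacity, risk tiers, and regulatory sensitivity. A more robust combination of evidence can be used instead.

Method	When it applies	Risk
Matched cohort comparison	Similar teams or cases are available for comparison	Inadequate matching introduces bias
Stepped-wedge rollout	Rollout is phased but eventually covers the whole target population	Requires strong rollout discipline
Difference-in-differences	Pre- and post-launch data exist for both treated and control groups	External changes may interfere
Interrupted time series	A stable long-run metric exists	Concurrent policy or queue changes must be explained
Shadow mode comparison	Evaluation while AI output does not affect production decisions	Cannot show user behavior change
Workflow replay	Compare recommendation quality and handling paths on historical cases	Historical data may not be representative
Manager-level variance analysis	Analyze the effect of reinforcement on adoption	May be confounded by differences in team capability

An attribution report should state:

Baseline period.
Exposure and eligibility logic.
Cohort selection.
Confounders, such as staffing, seasonality, policy changes, and campaigns.
Cost and human review adjustment.
Quality and risk guardrail.
Confidence level in business language, without presenting it as absolute causality.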
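As a worked example of one method in the table, difference-in-differences subtracts the control group's change from the treated group's change. The sketch below uses group means only and made-up cycle-time figures; it assumes the two groups would have moved in parallel without AI, which is the assumption that the named confounders threaten. The function name and numbers are illustrative.

```python
def difference_in_differences(
    treated_pre: float, treated_post: float,
    control_pre: float, control_post: float,
) -> float:
    """(change in treated group) minus (change in control group)."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical mean cycle time in minutes, before and after launch.
effect = difference_in_differences(
    treated_pre=40.0, treated_post=31.0,   # teams with the copilot
    control_pre=40.0, control_post=37.0,   # comparable teams without it
)
print(effect)  # negative means cycle time fell more in the treated group
```

Here the treated teams improved by 9 minutes, but 3 minutes of that also appeared in the control teams, so only the remaining 6 minutes is a candidate AI effect, still subject to the review load and quality adjustments listed above.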

14. Value Leakage

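A common problem in AI projects is that gross benefit looks good while net benefit is eaten away by leakage.

Leakage type	Example	What to measure
Human review load	AI-generated content saves 5 minutes, but QA spends 7 more minutes	reviewer minutes, queue depth, defect rate
Rework	Work is returned after the user accepted a suggestion	first-pass quality, re-open, correction reason
Control override	Users frequently bypass AI, or AI triggers too many false positives	override rate, false alert burden
Support burden	Frontline staff keep asking how to use or interpret the tool	help desk tickets, manager coaching time
Customer harm	Wrong suggestions lead to complaints or mislead customers	complaint linkage, customer correction events
Latency	AI wait time cancels out the manual time saved	latency p95, abandon rate
Model and vendor cost	Token/license cost per case rises	unit cost per completed case
Change cost	Training, SOPs, manager meetings, and dual running during transition	enablement cost, dual-run cost
Trust debt	Early errors lead to long-term non-use	post-incident adoption decay

A mature value realization formula: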

Net realized value
= gross process benefit
- AI run cost
- human review load
- rework and exception cost
- support and change cost
- risk/control remediation cost
- customer harm adjustment
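The formula translates directly into code. The sketch below mirrors its seven terms one for one; the dataclass, field names, and figures are illustrative assumptions, and all inputs are assumed to be expressed in the same currency over the same period.

```python
from dataclasses import dataclass


@dataclass
class ValueInputs:
    gross_process_benefit: float
    ai_run_cost: float
    human_review_load: float
    rework_and_exception_cost: float
    support_and_change_cost: float
    risk_control_remediation_cost: float
    customer_harm_adjustment: float


def net_realized_value(v: ValueInputs) -> float:
    """Gross benefit minus every leakage term in the formula."""
    return (
        v.gross_process_benefit
        - v.ai_run_cost
        - v.human_review_load
        - v.rework_and_exception_cost
        - v.support_and_change_cost
        - v.risk_control_remediation_cost
        - v.customer_harm_adjustment
    )


# Hypothetical monthly figures in thousands.
example = ValueInputs(100.0, 10.0, 30.0, 15.0, 20.0, 5.0, 5.0)
print(net_realized_value(example))
```

In this made-up case a gross benefit of 100 shrinks to 15 once leakage is deducted, with human review load as the largest single deduction.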

15. Risk / Control Architecture

Adoption analytics serves risk control, not only growth.

Risk	Adoption analytics signal	Control response
Over-reliance	Very high accept rate, low edit distance, rising defects	sampling QA, friction in high-risk cases, confidence explanation
Under-reliance	High reject/ignore despite strong quality evidence	workflow placement, trust building, manager coaching
Automation bias	Users accept wrong suggestions, especially in high-risk cases	mandatory rationale, dual review, challenge prompts
Deskilling	New hires only copy AI and independent judgment declines	skill assessment, rotating unaided review, training
Bypass / shadow AI	External AI is used or sensitive content is copied out	sanctioned tool improvement, DLP monitoring, policy communication
Control override abuse	Frequent override without reason	override reason required, manager review, risk sampling
Hidden backlog transfer	AI speeds up the front office while back-office review overflows	end-to-end queue monitoring
Unequal adoption	Some branches or teams are excluded	cohort coverage review, access remediation
Model drift impact	Reject, defect, and complaint rates rise after a new version	version-linked monitoring, rollback trigger

A control override is not inherently bad. A mature system must distinguish:

Healthy override: the user finds that AI does not apply and escalates correctly.
Suspicious override: the user bypasses a required control to gain speed.
Product-caused override: the AI output format or boundary does not fit the work.
Policy-caused override: unclear rules make users afraid to adopt.
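A first-pass triage of overrides into these four types could be rule-based, with anything ambiguous routed to a human reviewer. The sketch below is one hypothetical mapping. It reuses the `reason_code` values from the event schema, but which codes map to which override type, and the `escalated` and `required_control_skipped` inputs, are assumptions rather than an established standard.

```python
from typing import Optional

# Hypothetical grouping of schema reason codes.
PRODUCT_REASONS = {"not_relevant", "incomplete", "too_slow"}
QUALITY_REASONS = {"inaccurate", "unsafe"}


def classify_override(
    reason_code: Optional[str], escalated: bool, required_control_skipped: bool
) -> str:
    """Assign an override to one of the four types, or flag it for review."""
    if required_control_skipped:
        return "suspicious"
    if reason_code == "policy_unclear":
        return "policy_caused"
    if reason_code in PRODUCT_REASONS:
        return "product_caused"
    if reason_code in QUALITY_REASONS and escalated:
        return "healthy"
    return "needs_review"


print(classify_override("inaccurate", escalated=True, required_control_skipped=False))
print(classify_override(None, escalated=False, required_control_skipped=False))
```

Checking the skipped-control condition first is deliberate in this sketch: a bypassed required control is treated as suspicious regardless of the stated reason.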

16. Operating Model

Adoption analytics needs an explicit cadence; otherwise the dashboard changes nothing.

16.1 Forums

Forum	Cadence	Participants	Decision
Daily ops pulse	Daily or every two days	Ops manager, AI PM, support lead	Whether there are usage blockers, queue anomalies, or control alerts
Weekly adoption review	Weekly	AI PM, BA, manager champions, analytics	funnel drop-off, resistance signals, backlog actions
Biweekly risk/control review	Every two weeks	Risk, QA, ops, architect, PM	override, defect, complaint, human review load
Monthly value review	Monthly	Business owner, finance, Value Office, PM	benefit evidence, value leakage, scale/stop
Quarterly architecture review	Quarterly	Architect, platform, data, risk, product	telemetry coverage, platform reuse, model/tool lifecycle

16.2 RACI

Activity	AI PM	BA	Architect	Ops Manager	Risk/Control	Analytics	Finance
Adoption event taxonomy	A/R	R	C	C	C	C	I
Work-as-done baseline	C	A/R	I	R	C	C	I
Telemetry architecture	C	C	A/R	I	C	R	I
Behavior funnel review	A/R	R	I	R	C	R	I
Outcome attribution	A	C	I	C	C	R	C
Risk/control evidence	C	C	C	R	A/R	C	I
Benefits sign-off	R	C	I	C	I	C	A/R
Scale/stop recommendation	A/R	C	C	C	C	C	C

16.3 Operational Learning Loop

Telemetry
  -> analysis
  -> hypothesis
  -> workflow/product/control change
  -> manager reinforcement
  -> monitored rollout
  -> evidence review
  -> scale / redesign / stop

If a review meeting only explains metrics and produces no backlog, SOP, training, control, or release action, it is not an operating model; it is a reporting ceremony.

17. Financial Retail Patterns

17.1 AML Investigator Adoption

Evidence layer	Good signal	Bad signal
Qualified use	Copilot used on eligible alert types and investigation stages	Used mainly for low-risk easy cases
Behavior change	Investigation narratives reuse AI summaries with material analyst edits	Copy/paste without source verification
Quality	QA corrections decrease, missed evidence decreases	QA defects increase after high accept rate
Risk/control	Escalations occur when policy boundary is hit	Overrides without rationale
Value	Alert aging and prep time fall after review load adjustment	Review queue grows and SAR quality drops

17.2 Contact-Center Agent Assist

Evidence layer	Good signal	Bad signal
Qualified use	Agent assist appears in target call reasons	Suggestions shown for irrelevant call types
Behavior change	Agents use policy-grounded response and reduce hold time	Agents read generic text that frustrates customers
Quality	QA score and first contact resolution improve	AHT drops but repeat contact rises
Risk/control	Sensitive topics trigger approved handoff	Agents use AI response outside policy boundary
Value	Net handle time reduction after QA and complaint adjustment	Saved seconds offset by after-call correction

17.3 KYC Onboarding

Evidence layer	Good signal	Bad signal
Qualified use	AI completeness check used before customer chase	Used after analyst already completed manual review
Behavior change	Analysts request fewer unnecessary documents	AI flags too many deficiencies
Quality	First-pass completion and approval quality improve	False deficiency notices increase
Risk/control	High-risk entities still receive required review	AI creates pressure to under-review
Value	Cycle time falls without increased remediation	Customer frustration and rework rise

17.4 Credit Ops

Evidence layer	Good signal	Bad signal
Qualified use	Used for collateral summary and covenant extraction in target products	Used for final credit judgment
Behavior change	Analysts find missing conditions earlier	Analysts stop reading source documents
Quality	Approval package defects decrease	Exception approvals increase without rationale
Risk/control	Human decision rights remain explicit	Control override lacks audit trail
Value	Faster package prep and lower rework	Faster throughput with worse downstream losses

17.5 Branch / Relationship Manager Copilot

Evidence layer	Good signal	Bad signal
Qualified use	Used before client meeting for permitted insight prep	Used to generate unapproved advice
Behavior change	RM records better next actions and follow-ups	Tool becomes a generic note generator
Quality	Follow-up completion and customer relevance improve	Compliance review flags unsuitable content
Risk/control	Advice boundary and disclosure controls fire	RM bypasses prompts to get sales script
Value	Relationship actions improve retention or cross-sell quality	Short-term sales lift creates complaint risk

18. Evidence Pack

A scale decision needs an evidence pack, not a usage chart.

Evidence object	Content
Problem statement	Business problem, target process, and target users, not an AI technology wish
Work-as-done baseline	Current steps, roles, pain points, metrics, exceptions, and informal workarounds
Adoption taxonomy	Event definitions, qualified-use definition, control override definition
Telemetry quality report	event completeness, join rate, missing fields, known limitations
Behavior funnel	eligible -> exposed -> engaged -> assisted -> influenced -> completed -> improved -> reinforced
Cohort analysis	role, manager, region, case type, risk tier, training wave
Outcome attribution	Control groups, phased rollout, time series, or other attribution methods and their limitations
Value leakage analysis	review load, rework, support, latency, cost, risk adjustment
Risk/control report	override, escalation, defect, complaint, over-reliance and under-reliance
User trust signals	feedback, qualitative themes, reason codes, trust calibration
Operating actions	product changes, SOP updates, training, manager coaching, control changes
Scale/stop recommendation	continue, redesign, scale, restrict, stop and why

Quality standards for the evidence pack:

Definitions are explainable.
Metrics are traceable.
Negative evidence is not hidden.
Cost and review burden have been deducted.
Risk and control are not an appendix on the last page.
There are clear next actions and owners.

19. Anti-Patterns

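Anti-pattern	Why it is dangerous	More mature practice
Adoption = login / MAU	Cannot show that work changed	Define a qualified adoption event
Training completion = adoption	Learning and using are two different things	Validate with a work-as-done behavior funnel
Prompt count = value	May reflect friction and rework	Connect to outcome, quality, and review load
Higher accept rate is always better	May reflect automation bias	Also look at defects, overrides, and QA
Looking only at averages	Hides cohort, manager, and case mix differences	Do cohort and segmentation analysis
Reporting only hours saved	Ignores review, risk, support, and change costs	Do net value and leakage analysis
Replacing telemetry with surveys	Subjective feedback is not enough	Combine telemetry, case samples, and outcomes
Ignoring resistance	Reduces user problems to non-cooperation	Diagnose trust, process, incentives, and capacity
No baseline	Cannot show change	Build a work-as-done baseline
No learning loop after launch	Metrics do not turn into improvement on their own	Establish an operating review and action backlog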

20. PM / BA / Architect Implications

20.1 For Senior AI PM

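Define adoption events in the PRD; do not wait until after launch and leave analytics to guess.
Put the behavior funnel and scale/stop rules into the release criteria.
Treat user trust, control override, and value leakage as product metrics.
When adoption is low, do not rush to training; first diagnose workflow fit, output contract, manager incentives, and risk boundary.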

20.2 For CBAP-level BA

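Use the work-as-done baseline to capture real processes and hidden work.
Write adoption requirements as behavior change requirements, for example "investigator can complete narrative with cited evidence and appropriate escalation".
Define the resistance signal taxonomy and reason codes.
Ensure adoption events are linked to business rules, control points, exception paths, and artifacts.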

20.3 For AI Architect

Treat the telemetry schema as an architecture contract, not a front-end tracking list.
Have traces connect user action, model version, workflow stage, control event, and outcome.
Support analysis by cohort, version, risk tier, and case type.
Design retention, privacy class, evidence store, and access control.

21. Interview Answers

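Q1: How do you show that an AI tool is really adopted, rather than merely used?

I would define adoption as qualified workflow adoption, not login or prompt count. First, build a work-as-done baseline that identifies target users, process steps, case types, business artifacts, and current pain points. Then define the adoption event taxonomy: exposure, intent, output, accept/edit/reject, decision influence, control override, outcome link. Next, use the behavior funnel and cohort analysis to see whether users keep using the tool on real tasks, and connect that to cycle time, first-pass quality, rework, complaints, human review load, and cost-to-serve. Finally, use the evidence pack to make the scale/stop decision, stating the benefits, risks, value leakage, and next actions.

Q2: What is the biggest difference between AI adoption metrics and traditional SaaS usage metrics?

Traditional SaaS usage focuses on seat activation, DAU, feature clicks, and retention. AI adoption has to look at trust-calibrated behavior change, because AI output can affect judgment, customer communication, control execution, and business records. High usage may signal value, or it may signal users repeatedly fixing errors or over-relying on the tool. So I would look at accept/edit/reject mix, override, escalation, defects, human review load, workflow outcomes, and durability together. Good AI adoption is not more clicks; it is changed work and better results within the right boundaries.

Q3: If MAU is high after contact-center agent assist launches but AHT has not fallen, how do you analyze it?

I would not start by assuming users are uncooperative. I would break down the behavior funnel: whether eligible calls get the right exposure, whether agents actually use the suggestions, whether suggestions are heavily edited or ignored, and whether time has moved from the call into after-call work or QA correction. I would also look at call reason cohort, agent tenure, manager team, latency, policy boundary hits, repeat contact, QA defects, and customer complaints. Possible causes include an output format that does not suit live calls, policy citations that are not credible enough, suggestions that arrive too late, users needing extra verification, or a more complex case mix. The next step is to use this evidence to decide on product improvement, process adjustment, manager coaching, or scenario restriction.

Q4: How do you keep AI adoption from causing automation bias?

I would treat over-reliance as an adoption risk and not read a high accept rate as success by default. On the design side, that means confidence calibration, source evidence, policy boundaries, high-risk friction, mandatory rationale, dual review, and sampling QA. On the metrics side, I would look at accept rate in combination with defects, complaints, overrides, escalation appropriateness, and edit distance. If accept rate is very high and edit distance very low in high-risk cases while QA defects rise, automation bias is likely. On the governance side, this is handled through the risk/control review and version-linked rollback triggers.

Q5: How do you explain AI value realization to a CFO?

I would separate gross benefit from net realized value. Gross benefit may be saved handling time or higher conversion, but net realized value must deduct AI run cost, license, tokens, human review load, rework, support, training, dual running, control remediation, and customer harm adjustment. Then I would use cohorts or a phased rollout to explain how credible the attribution is, and finance sign-off to lock in the definitions. For a CFO, success is not "users like AI" but value that is repeatable, attributable, and still holds after cost and risk are deducted.

Q6: How would you design adoption evidence for an AML investigator copilot?

I would start from the work-as-done baseline of alert investigation, distinguishing alert type, risk tier, investigator level, and review path. Adoption events include summary requested, evidence viewed, narrative drafted, analyst edited, source checked, escalation triggered, QA correction, and case closed. The core metrics are not the number of summaries generated but eligible case penetration, material edit rate, QA defect reduction, alert aging, SAR prep quality, review load, override reason, and high-risk case control adherence. I would recommend expanding the cohort only when efficiency, quality, and control evidence all hold at once.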

22. Portfolio Exercise

Goal: Design an AI adoption analytics evidence pack for a financial retail enterprise.

Scenario

The enterprise is advancing five AI use cases at the same time:

Use case	Business goal
AML investigator copilot	Shorten alert investigation aging and improve case narrative quality
Contact-center agent assist	Reduce hold time and improve first contact resolution
KYC onboarding assistant	Reduce document rework and customer follow-up requests
Credit ops package reviewer	Improve approval package first-pass quality
Branch / RM copilot	Improve customer follow-up quality and compliant next actions

Required Artifacts

Work-as-done baseline map for one workflow.
Adoption event taxonomy with at least 12 events.
Telemetry schema subset with role, case type, workflow stage, model version, human action, control action and outcome link.
Behavior funnel with drop-off diagnosis.
Cohort plan by role, manager, case type, risk tier and training wave.
Leading and lagging metrics hierarchy.
Value leakage model including human review load.
Risk/control evidence plan for over-reliance and control override.
Monthly operating review agenda。
Scale/stop recommendation memo.

Evaluation Rubric

Criterion	Strong evidence
Baseline quality	Captures real workflow, exceptions and informal workarounds
Adoption definition	Distinguishes exposure, qualified use, decision influence and outcome
Metrics maturity	Combines behavior, quality, risk, value and durability
Attribution discipline	Uses cohort or rollout logic and names confounders
Risk integration	Treats over-reliance, override, review load and customer harm as first-class
Operating loop	Converts evidence into product, process, control and manager actions
Portfolio judgment	Recommends scale, redesign, restrict or stop with reasons

23. Minimum Viable Architecture

If time is limited, build this minimum version:

One canonical adoption event schema.
Work-as-done baseline for the highest-value workflow.
Behavior funnel dashboard with cohort filters.
Human action capture: accept, edit, reject, ignore, regenerate, override, escalate.
Outcome join to cycle time, quality, rework and complaint.
Human review load and control override report.
Monthly evidence pack with scale/stop recommendation.

The mature version adds attribution models, manager reinforcement analysis, model-version drift monitoring, finance sign-off, value leakage automation and portfolio-level adoption heatmaps.

24. Final Principle

AI adoption analytics should make three uncomfortable truths visible:

Users may touch AI without trusting it.
Users may trust AI without changing the process.
The process may speed up without creating durable net value.

The job of senior AI PM, AI Architect and CBAP-level BA is to design the evidence system that separates those cases and turns adoption from a vanity story into a governed operating capability.