AI Adoption Analytics: Behavior Change and Value Realization Architecture

823ai-foundations/papers/151-ai-adoption-analytics-behavior-change-value-realization-architecture.md

AI Adoption Analytics / Behavior Change / Value Realization Architecture Explained

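Target audience: Senior AI PM / AI Architect / Business Architect / CBAP-level BA / AI Transformation Lead / AI Value Office Lead / Financial Retail Operations Leader.

Learning objectives: Build an evidence architecture that can prove AI is genuinely adopted, changes how work is done, improves process outcomes and produces durable value, instead of only reporting logins, prompt counts, seat activations or demo satisfaction.

Core question: After AI goes live, how do you prove that frontline staff have actually made it part of work-as-done, that behavior and process are changing, that risk has not been shifted into human review or customer harm, and that the value is more than a short-term novelty effect?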


Source Anchors

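The sources below are used to organize the language of AI risk management, AI management systems, change management, observability, engineering performance and value evidence. This article is learning and portfolio material and does not constitute a legal, compliance, audit or regulatory conclusion.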

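| Source | Link | Idea adopted in this article |
| --- | --- | --- |
| NIST AI Risk Management Framework | https://www.nist.gov/itl/ai-risk-management-framework | Use Govern / Map / Measure / Manage to organize AI adoption evidence, risk measurement, continuous monitoring and the remediation loop |
| ISO/IEC 42001 AI management system | https://www.iso.org/standard/81230.html | Use the AI management system language of policy, objective, operation, performance evaluation and improvement to manage adoption and value realization |
| Prosci ADKAR | https://www.prosci.com/blog/adkar-model | Use Awareness, Desire, Knowledge, Ability and Reinforcement to explain why behavior change is not the same as training completion rate |
| OpenTelemetry Documentation | https://opentelemetry.io/docs/ | Use the ideas of traces, metrics, logs and semantic conventions to design AI adoption telemetry and workflow observability |
| DORA | https://dora.dev/ | Use the ideas behind deployment frequency, lead time, change fail rate and time to restore to connect AI product change, operational learning and engineering system health |
| NIST AI RMF Playbook | https://airc.nist.gov/AI_RMF_Knowledge_Base/Playbook | Use AI RMF action-oriented language to organize evidence collection, monitoring and review routines |
| FFIEC Management IT Handbook | https://ithandbook.ffiec.gov/it-booklets/management.aspx | Use governance, risk identification, monitoring and reporting language to meet the evidence needs of financial-institution management |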

In one sentence:

AI adoption analytics is not usage reporting; it is an evidence system that links exposure, trusted use, behavior change, workflow quality, control performance, unit economics and durable outcome realization.

1. Executive Summary

Many AI projects fail not because the model is completely unusable, but because the enterprise cannot prove that the following chain holds:

AI shipped
  -> target users were exposed in real workflow
  -> users trusted it for the right tasks
  -> behavior changed in work-as-done
  -> process outcomes improved without hidden risk transfer
  -> benefits were realized after cost, review load and control overhead
  -> operating loop reinforced the new behavior

Looking only at usage leads to serious misjudgment:

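| Vanity signal | Why it is dangerous | Evidence to add |
| --- | --- | --- |
| High prompt count | Users may be repeatedly correcting errors, exploring out of novelty, or being forced to use the tool | accepted suggestion rate, task completion, rework, sentiment, quality |
| High MAU | May be logins only, without reaching the key workflow steps | workflow step coverage, active case penetration, decision influence |
| High seat activation | Licenses may be assigned while behavior is unchanged | cohort adoption, returning qualified use, manager reinforcement |
| Time saved survey | Easily overestimated; the time may also shift into review or complaints | cycle time, human review load, exception queue, quality and control evidence |
| Accuracy improvement | Model metrics do not automatically become business value | operational lift, policy compliance, loss reduction, cost-to-serve |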

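The task of a senior AI PM / Architect / BA is not to build an adoption dashboard, but to define:

  • What counts as a real adoption event.
  • What the current work-as-done baseline is.
  • Which decision, artifact, handoff or control in the process AI has changed.
  • How value metrics are adjusted for cohort, stage, risk, cost and quality.
  • Whether low adoption is a product, process, trust, incentive or manager-cadence problem, or a sign of change saturation.
  • When to scale, when to redesign, and when to stop.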

2. Target Audience and Role Expectations

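| Role | Question to own | Typical outputs |
| --- | --- | --- |
| Senior AI PM | Whether target users durably, correctly and willingly bring the AI product into their core work | adoption event taxonomy, behavior funnel, scale/stop memo |
| AI Architect | Whether telemetry, identity, workflow, policy, evidence and outcome data are traceable | adoption observability architecture, event schema, control trace |
| CBAP-level BA | Whether real processes, roles, rules, exceptions, resistance and value leakage are modeled | work-as-done baseline, change impact map, stakeholder adoption analysis |
| Operations Lead | Whether AI improves queues, cycle time, quality, review burden and customer experience | operating review pack, manager coaching loop, exception taxonomy |
| Risk / Control Partner | Whether AI adoption introduces over-reliance, control bypass, erroneous escalation or audit blind spots | control override log, human review evidence, risk acceptance record |
| Finance / Value Office | Whether benefits are attributable, reproducible and scalable | benefits register, unit economics, value leakage analysis |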

Mature organizations treat adoption as a cross-functional evidence system rather than handing it to the training team or a product analyst alone.


3. Thesis: Adoption Is Behavior Change, Not Usage Volume

The minimum unit of proof for AI adoption is not:

user clicked AI button

but:

in a named workflow step, a target user used an AI output in a governed way that changed or improved the work artifact, decision, handoff, cycle time, quality, control performance or customer outcome.

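This means adoption must answer seven questions at once:

| Question | Explanation |
| --- | --- |
| Who | Who the target users are: novices, experts, managers, contractors, branches, regional teams or central operations |
| Where | In which process, system, queue, case type, customer journey or risk tier it is used |
| What | Whether the AI output influenced a summary, recommendation, decision, next action, customer reply, investigation note or control evidence |
| How | Whether users accept, edit, reject, override, escalate, regenerate or bypass |
| Why | Why users use or reject it, and whether that relates to trust, quality, policy, incentives or time pressure |
| So what | Whether the behavior change moved cycle time, quality, risk, cost or customer experience |
| For how long | Whether the change persists across cohorts, time, managers, process versions and model versions |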

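In financial retail, AI adoption is usually workflow adoption:

  • Whether AML investigators use the AI-generated case narrative to shorten evidence gathering before SAR prep, not just how often they open the copilot.
  • Whether contact-center agents adopt suggested answers within controlled script boundaries and reduce hold time and repeat contact, not just suggestion impressions.
  • Whether KYC onboarding analysts use the AI document completeness check to reduce rework and customer chase, not just OCR call volume.
  • Whether credit ops analysts use the AI collateral summary to spot missing conditions and raise first-pass approval quality, not just the number of summaries generated.
  • Whether branch / relationship managers turn copilot insight into compliant next customer actions, not just weekly active users.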

4. Conceptual Model: Adoption-to-Value Evidence Chain

An 8-layer evidence chain is recommended:

1. Exposure
2. Qualified use
3. Trust-calibrated use
4. Behavior change
5. Workflow quality
6. Outcome movement
7. Value realization
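8. Reinforced operating loop

| Layer | Key question | Example metrics |
| --- | --- | --- |
| Exposure | Do target users see AI in real work | eligible users exposed, case coverage, workflow step availability |
| Qualified use | Is it used in the target task rather than in random exploration | qualified action rate, task-matched AI invocation, returning use |
| Trust-calibrated use | Do users accept in the right scenarios and reject or escalate in uncertain ones | accept/edit/reject/override mix, escalation appropriateness, trust calibration |
| Behavior change | Has the way of working changed | artifact reuse, new sequence adoption, manual step removal, handoff change |
| Workflow quality | Has process quality improved | first-pass quality, rework, exception rate, control defects |
| Outcome movement | Have business outcomes moved | AHT, cycle time, STP, loss rate, conversion, complaints |
| Value realization | Is there benefit after deducting cost and risk | net benefit, cost-to-serve, human review load, value leakage |
| Reinforcement | Does the organization reinforce the new behavior | manager coaching, SOP updates, performance cadence, feedback loop closure |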

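The chain is only as strong as its weakest link. A contact-center AI can have very high exposure, but if agents simply copy suggestions and complaints rise as a result, that is not successful adoption; it is over-reliance.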
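The weakest-link reading can be sketched in a few lines. This is only an illustration: the per-layer scores in [0, 1], the `weakest_link` helper and the contact-center numbers are all hypothetical, and how each layer is actually scored is left open by the text.

```python
# Hypothetical layer names, following the eight layers of the evidence chain.
LAYERS = [
    "exposure", "qualified_use", "trust_calibrated_use", "behavior_change",
    "workflow_quality", "outcome_movement", "value_realization", "reinforcement",
]


def weakest_link(scores: dict[str, float]) -> tuple[str, float]:
    """Return the weakest layer and its score; a missing layer counts as 0."""
    layer = min(LAYERS, key=lambda name: scores.get(name, 0.0))
    return layer, scores.get(layer, 0.0)


# Illustrative scores only: high exposure, poorly calibrated trust.
contact_center = {
    "exposure": 0.9, "qualified_use": 0.7, "trust_calibrated_use": 0.3,
    "behavior_change": 0.6, "workflow_quality": 0.4, "outcome_movement": 0.5,
    "value_realization": 0.5, "reinforcement": 0.6,
}

print(weakest_link(contact_center))
```

Treating a missing layer as 0 is a deliberate choice in this sketch: a chain with no evidence for a layer should not be reported as strong.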


5. Work-as-Done Baseline

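Adoption analytics without a baseline is just a launch story.

A work-as-done baseline must capture real work, including informal workarounds, human judgment, system switching, chat-based help requests and manager approvals. The BA adds a great deal of value here, because system logs record only part of work-as-imagined.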

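| Baseline dimension | What to capture | Financial retail example |
| --- | --- | --- |
| Case mix | Case type, risk tier, complexity, channel, region | AML alert type, KYC entity type, credit exception class |
| Actor map | Roles, experience levels, authority boundaries, manager intervention | investigator L1/L2, branch RM, contact-center specialist |
| Activity sequence | Actual steps and system switches | CRM -> policy search -> core banking -> notes -> supervisor chat |
| Decision points | Which judgments affect the next step | escalate, close, request document, approve with condition |
| Artifacts | Which business records and customer communications are produced | investigation narrative, call note, KYC deficiency notice |
| Controls | Which steps are control points | sanctions check, suitability disclosure, dual review |
| Pain points | Waiting, rework, missing information, unclear policy | document chase, duplicated note-taking, uncertain policy |
| Informal workarounds | What users actually use to cover system gaps | shared spreadsheet, saved templates, peer review chat |
| Current metrics | Current cycle time, quality, cost, queue, complaints | AHT, alert aging, first-pass yield, re-open rate |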

A baseline should not be generated from interviews alone. A recommended combination:

  • SME interview and observation.
  • Process mining or workflow log analysis.
  • Case note sampling.
  • Screen-flow / clickstream review, within privacy and authorization boundaries.
  • Manager review cadence and coaching artifacts.
  • Exception queue and rework analysis.
  • Customer complaint and quality assurance sampling.

6. Adoption Event Taxonomy

The adoption event taxonomy is the core of the whole architecture. It breaks "the user used AI" into explainable, controllable and auditable events.

| Event class | Event examples | Value meaning |
| --- | --- | --- |
| Exposure | ai_surface_shown, suggestion_presented, copilot_available_in_case | The user has the opportunity to adopt |
| Intent | open_assistant, ask_policy_question, request_case_summary | The user expresses task intent |
| AI output | summary_generated, recommendation_returned, next_best_action_returned | The system produces usable output |
| Human response | accepted, edited, rejected, ignored, regenerated | Initial trust and usability |
| Decision influence | used_in_case_note, used_in_disposition, used_in_customer_response | AI enters a business artifact or decision |
| Control action | override, escalate, dual_review_requested, policy_boundary_hit | Risk control and trust calibration |
| Learning signal | feedback_positive, feedback_negative, reason_selected, defect_reported | Product and model learning |
| Outcome link | case_closed, customer_contact_completed, document_deficiency_resolved | Connects to process outcomes |
| Reinforcement | manager_coached, sop_updated, job_aid_viewed, team_review_completed | The organization reinforces the new behavior |

The key is to distinguish:

AI touched the workflow
vs
AI changed the workflow
vs
AI improved the workflow

Many projects can prove only the first level, yet claim the third to management.
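One way to keep the three levels apart is to derive them from the event classes observed for a case. The sketch below is an assumption-laden illustration: the `evidence_level` helper, the mapping from event classes to levels, and the `outcome_improved` flag (meant to come from a separate baseline comparison) are all hypothetical.

```python
def evidence_level(event_classes: set[str], outcome_improved: bool = False) -> str:
    """Classify a case as 'none', 'touched', 'changed' or 'improved'.

    Assumed rules for this sketch:
    - touched: any exposure / intent / output / response event was seen
    - changed: AI output entered an artifact or decision (an 'influence' event)
    - improved: changed, linked to an outcome event, and the outcome was
      measured as better than baseline by a separate analysis
    """
    touched = bool(event_classes & {"exposure", "intent", "output", "response"})
    changed = "influence" in event_classes
    improved = changed and "outcome" in event_classes and outcome_improved
    if improved:
        return "improved"
    if changed:
        return "changed"
    if touched:
        return "touched"
    return "none"


print(evidence_level({"exposure", "response"}))
print(evidence_level({"exposure", "influence", "outcome"}, outcome_improved=True))
```

Note that an outcome event on its own does not earn the "improved" label here; the improvement has to be established outside the event stream.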


7. Reference Architecture

Adoption analytics needs to connect product, process, identity, model, control and business outcome at the same time.

AI Experience Layer
  -> Workflow Integration Layer
  -> Adoption Telemetry Layer
  -> Identity / Cohort / Entitlement Layer
  -> Process and Outcome Data Layer
  -> Risk / Control Evidence Layer
  -> Analytics and Attribution Layer
  -> Value Realization and Operating Review Layer

7.1 Architecture Components

| Component | Responsibility | Key design points |
| --- | --- | --- |
| AI experience instrumentation | Capture what users see, ask, accept, edit, reject, give feedback on and escalate | event taxonomy, low-friction reason capture, privacy minimization |
| Workflow context resolver | Bind AI events to case, customer journey, process step, queue and risk tier | workflow id, case id, stage id, control point id |
| Identity and cohort service | Define eligible users, role, team, region, experience, training and manager | cohort analysis, manager effect, change saturation |
| Model and prompt registry | Record model, prompt, policy pack and tool version | Adoption can be correlated with model version, release and eval results |
| Outcome data connector | Pull process and business outcomes | AHT, cycle time, rework, STP, quality score, loss, complaint |
| Human review tracker | Capture review load and control override | reviewer time, queue depth, override reason, defect severity |
| Risk evidence store | Store control hits, escalations, exceptions and audit trail | Does not replace compliance judgment, but provides governance evidence |
| Analytics workspace | Funnel, cohort, attribution, leakage and saturation analysis | baseline comparison, matched cohorts, segment drilldown |
| Operating review pack | Turn metrics into action | manager coaching, product backlog, risk actions, scale/stop decision |

7.2 Telemetry Design Principles

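| Principle | Explanation |
| --- | --- |
| Instrument workflow, not only UI | Record AI's effect on business steps and artifacts |
| Preserve context | Events without case type, risk tier, role and stage are hard to interpret |
| Capture negative signals | rejection, ignore, override, regenerate and complaint are all high-value evidence |
| Minimize sensitive payload | Event records should store references, classifications and necessary summaries, and avoid unnecessary customer data |
| Version everything | model, prompt, policy, workflow, training, SOP and feature flag should all be versioned |
| Connect to outcomes | Adoption events must be joinable to process quality and business outcomes |
| Make evidence reviewable | Metrics must be traceable to samples, definitions, calculation rules and owners |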
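As an illustration of the "Minimize sensitive payload" principle, the sketch below keeps only an allow-list of event fields and replaces raw identifiers with keyed hashes. The `ALLOWED_FIELDS` set, the `pseudonymize` and `minimize_event` helpers, the raw field names and the demo key are assumptions made for this example; key storage, rotation and access control are out of scope. A keyed hash is pseudonymization rather than anonymization, so the result still needs to be handled as controlled data.

```python
import hashlib
import hmac

# Hypothetical allow-list: only these raw fields are copied into the event.
ALLOWED_FIELDS = {"event_name", "workflow_id", "workflow_stage", "risk_tier", "user_action"}


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Keyed SHA-256 hash so raw IDs never enter the analytics store."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def minimize_event(raw: dict, secret_key: bytes) -> dict:
    """Drop free-text payload and replace user/case IDs with keyed hashes."""
    event = {k: v for k, v in raw.items() if k in ALLOWED_FIELDS}
    event["user_id_hash"] = pseudonymize(raw["user_id"], secret_key)
    event["case_id_hash"] = pseudonymize(raw["case_id"], secret_key)
    return event


demo_key = b"demo-only-key"  # illustration only; a real key would come from a secret store
raw_event = {
    "event_name": "summary_generated",
    "workflow_id": "aml_alert_investigation",
    "workflow_stage": "triage",
    "risk_tier": "high",
    "user_action": "accept",
    "user_id": "emp-1042",
    "case_id": "case-77",
    "customer_note": "free text that should not be stored",
}
print(sorted(minimize_event(raw_event, demo_key)))
```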

8. Data / Telemetry Schema

Below is a minimum viable adoption event schema for a financial retail AI copilot. It is not a physical database design, but a canonical event contract that BA / PM / Architect use to align definitions.

| Field | Type | Description |
| --- | --- | --- |
| event_id | string | Globally unique event ID |
| event_time | timestamp | Time the event occurred |
| event_name | enum | Event name from the taxonomy |
| event_class | enum | exposure / intent / output / response / influence / control / learning / outcome / reinforcement |
| user_id_hash | string | De-identified user identifier |
| role | enum | investigator / agent / analyst / RM / manager / QA / supervisor |
| team_id | string | Team or branch |
| manager_id_hash | string | De-identified manager, used for manager effect analysis |
| cohort_id | string | pilot cohort, region, experience cohort, training cohort |
| workflow_id | string | AML investigation, KYC onboarding, credit ops, contact center, etc. |
| workflow_stage | string | triage, review, customer contact, decision, quality check |
| case_id_hash | string | De-identified case reference |
| case_type | string | alert type, call reason, KYC entity type, credit exception |
| risk_tier | enum | low / medium / high / material |
| customer_segment | string | retail, SMB, wealth, branch, digital |
| ai_surface | string | embedded panel, inline suggestion, draft generator, policy search |
| model_id | string | model registry id |
| prompt_version | string | prompt or policy pack version |
| tool_ids | array | agent tools or connectors used |
| output_type | enum | summary / recommendation / draft / classification / next action / risk flag |
| confidence_band | enum | calibrated band if used; avoid false precision |
| user_action | enum | accept / edit / reject / ignore / regenerate / override / escalate |
| edit_distance_band | enum | none / light / material / rewrite |
| reason_code | enum | useful / inaccurate / incomplete / unsafe / policy_unclear / too_slow / not_relevant |
| control_point_id | string | linked control or policy boundary |
| override_reason | string | required when user overrides AI or control suggestion |
| human_review_required | boolean | Whether human review is required |
| human_review_minutes | number | Review burden; can be back-filled as an aggregate |
| downstream_artifact_id | string | note, letter, decision record, case narrative |
| outcome_event_id | string | linked process outcome event |
| latency_ms | number | User-perceived latency |
| cost_estimate | number | Estimated token, tool, license or unit cost |
| privacy_class | enum | event-only / sensitive-reference / restricted |
| retention_class | enum | analytics / business-record-link / control-evidence |

8.1 Example Events

{
  "event_name": "ai_recommendation_edited",
  "event_class": "response",
  "workflow_id": "aml_alert_investigation",
  "workflow_stage": "case_narrative_draft",
  "role": "investigator",
  "case_type": "transaction_monitoring_alert",
  "risk_tier": "high",
  "output_type": "summary",
  "user_action": "edit",
  "edit_distance_band": "material",
  "reason_code": "incomplete",
  "control_point_id": "aml_secondary_review_required",
  "human_review_required": true
}

{
  "event_name": "suggested_answer_accepted",
  "event_class": "influence",
  "workflow_id": "contact_center_agent_assist",
  "workflow_stage": "customer_response",
  "role": "agent",
  "case_type": "card_dispute_status",
  "risk_tier": "medium",
  "output_type": "draft",
  "user_action": "accept",
  "edit_distance_band": "light",
  "reason_code": "useful",
  "outcome_event_id": "call_completed"
}
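A contract is only useful if events are checked against it. The following is a minimal sketch of such a check, assuming the enum values listed in the schema table and the rule that `override_reason` is required on overrides; the `validate_event` helper and its error format are hypothetical, and a production pipeline would more likely use a schema tool than hand-written checks.

```python
# Enum values copied from the schema table; the helper itself is illustrative.
EVENT_CLASSES = {
    "exposure", "intent", "output", "response", "influence",
    "control", "learning", "outcome", "reinforcement",
}
USER_ACTIONS = {"accept", "edit", "reject", "ignore", "regenerate", "override", "escalate"}
RISK_TIERS = {"low", "medium", "high", "material"}


def validate_event(event: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the event passes."""
    errors = []
    checks = (
        ("event_class", EVENT_CLASSES),
        ("user_action", USER_ACTIONS),
        ("risk_tier", RISK_TIERS),
    )
    for field, allowed in checks:
        value = event.get(field)
        if value is not None and value not in allowed:
            errors.append(f"{field}: unexpected value {value!r}")
    # The schema requires a reason whenever the user overrides.
    if event.get("user_action") == "override" and not event.get("override_reason"):
        errors.append("override_reason: required when user_action is 'override'")
    return errors


sample = {"event_class": "control", "user_action": "override", "risk_tier": "high"}
print(validate_event(sample))
```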

9. Metrics Hierarchy

Adoption analytics needs a metrics hierarchy; otherwise teams will treat the easiest-to-collect usage numbers as value.

Telemetry metrics
  -> Adoption metrics
  -> Behavior change metrics
  -> Flow / quality metrics
  -> Risk / control metrics
  -> Value metrics
  -> Durability metrics

| Layer | Metrics | Explanation |
| --- | --- | --- |
| Telemetry | event completeness, schema coverage, trace join rate | Whether the data can be trusted |
| Adoption | eligible-user active rate, qualified use rate, returning qualified use, case penetration | Whether it is used by the target population on the target tasks |
| Behavior | accept/edit/reject mix, artifact reuse, manual step reduction, handoff change | Whether behavior has changed |
| Flow / quality | cycle time, queue aging, first-pass quality, rework, QA defects | Whether the process has improved |
| Risk / control | override rate, escalation appropriateness, over-reliance signals, control defects | Whether risk is under control |
| Value | net hours released, cost-to-serve, loss reduction, conversion lift, complaint reduction | Whether benefit is produced |
| Durability | cohort retention, post-novelty persistence, manager variance, model version stability | Whether it is sustainable |
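To make the adoption layer concrete, here is a rough sketch of three of its metrics computed from a list of events. The definitions are assumptions for illustration: "qualified use" is taken to mean an `influence` event in a target workflow stage, and the `adoption_metrics` helper and sample data are hypothetical. Real definitions should come from the agreed event taxonomy.

```python
def _rate(numerator: int, denominator: int) -> float:
    """Safe ratio; returns 0.0 when the denominator is empty."""
    return numerator / denominator if denominator else 0.0


def adoption_metrics(events, eligible_users, eligible_cases, target_stages):
    """Compute adoption-layer rates over eligible users and eligible cases."""
    qualified = [
        e for e in events
        if e["event_class"] == "influence" and e["workflow_stage"] in target_stages
    ]
    active_users = {e["user_id_hash"] for e in events} & eligible_users
    qualified_users = {e["user_id_hash"] for e in qualified} & eligible_users
    penetrated_cases = {e["case_id_hash"] for e in qualified} & eligible_cases
    return {
        "eligible_user_active_rate": _rate(len(active_users), len(eligible_users)),
        "qualified_use_rate": _rate(len(qualified_users), len(eligible_users)),
        "case_penetration": _rate(len(penetrated_cases), len(eligible_cases)),
    }


events = [
    {"user_id_hash": "u1", "case_id_hash": "c1", "event_class": "intent", "workflow_stage": "triage"},
    {"user_id_hash": "u1", "case_id_hash": "c1", "event_class": "influence", "workflow_stage": "review"},
    {"user_id_hash": "u2", "case_id_hash": "c2", "event_class": "intent", "workflow_stage": "review"},
    {"user_id_hash": "u3", "case_id_hash": "c3", "event_class": "influence", "workflow_stage": "review"},
    {"user_id_hash": "u9", "case_id_hash": "c9", "event_class": "influence", "workflow_stage": "review"},
]
metrics = adoption_metrics(
    events,
    eligible_users={"u1", "u2", "u3", "u4"},
    eligible_cases={"c1", "c2", "c3", "c4", "c5"},
    target_stages={"review"},
)
print(metrics)
```

In this toy data, user `u9` is active but not eligible, so that activity is excluded from every rate; this is the difference between usage and adoption by the target population.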

9.1 Leading and Lagging Indicators

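| Type | Examples | Use |
| --- | --- | --- |
| Leading | qualified use rate, accepted-with-light-edit rate, feedback density, manager coaching completion | Judge whether adoption is building early momentum |
| Intermediate | first-pass quality, rework reduction, review queue depth, policy search time | Judge whether behavior is improving the process |
| Lagging | cost reduction, loss reduction, revenue lift, complaint reduction, regulatory finding reduction | Judge whether value has been realized |

Do not treat a leading indicator as closing the business case. It can only show that a use case is worth continued observation or is a scale candidate; it cannot prove benefit on its own.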

9.2 Anti-Metrics

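| Anti-metric | What it may indicate |
| --- | --- |
| Prompt count rises but cycle time does not fall | Users are fighting the system |
| Accept rate is very high but defects rise | Over-reliance or missing review |
| Reject rate is high with reason "policy unclear" | Boundaries and scripts are not trusted |
| Human review load rises by more than the time saved | Value leaks into the review queue |
| Manager variance is very large | Adoption depends on local champions and is not institutionalized |
| Initial lift falls back after 4 weeks | Novelty effect or insufficient reinforcement |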
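The last anti-metric, a lift that fades after about four weeks, lends itself to a simple check. The sketch below compares the mean weekly qualified-use rate in the first window with the mean afterwards. The `novelty_decay` helper, the 4-week window default and the 0.8 retention threshold are assumptions for illustration, not calibrated values.

```python
from statistics import mean


def novelty_decay(weekly_rates: list[float], window: int = 4, min_retention: float = 0.8) -> bool:
    """Flag decay when later weeks retain less than `min_retention` of the early level."""
    if len(weekly_rates) <= window:
        return False  # not enough history after the initial window
    early = mean(weekly_rates[:window])
    later = mean(weekly_rates[window:])
    return early > 0 and later / early < min_retention


fading = [0.6, 0.7, 0.7, 0.6, 0.4, 0.3, 0.3]
steady = [0.5, 0.5, 0.5, 0.5, 0.5, 0.48]
print(novelty_decay(fading), novelty_decay(steady))
```

A flag from a check like this is a prompt to investigate reinforcement and product fit, not a verdict; seasonality or staffing changes can produce the same shape.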

10. Behavior Funnel

The behavior funnel moves adoption from "seen" to "stably changed".

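| Funnel step | Definition | Drop-off diagnosis |
| --- | --- | --- |
| Eligible | Users and cases match the target scenario | Wrong cohort definition, incomplete entitlement |
| Exposed | AI appears at the right workflow step | UI / workflow integration not in place |
| Engaged | Users actively open or respond to AI | Value not obvious, unnatural entry point, slow response |
| Assisted | AI output is read and enters the task | Output not relevant, format mismatch |
| Influenced | Users accept, adopt after editing, or use it to decide | Insufficient trust, unclear policy boundary, unstable quality |
| Completed | Task or artifact is completed | Blocked by downstream systems or approvals |
| Improved | Cycle time, quality or risk metrics improve | AI only shifts work and does not change the bottleneck |
| Reinforced | Managers, SOPs, training and performance cadence support the new way | Adoption depends on individual enthusiasm |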
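The countable part of the funnel can be turned into step-to-step conversion rates so that the largest drop-off is easy to locate. This is a minimal sketch: the `funnel_report` and `largest_drop` helpers and the case counts are hypothetical, and only the first six steps are used because "Improved" and "Reinforced" are usually judged from outcome and operating evidence rather than simple counts.

```python
def funnel_report(counts: list[tuple[str, int]]) -> list[dict]:
    """Conversion rate between each pair of consecutive funnel steps."""
    report = []
    for (prev_name, prev_n), (name, n) in zip(counts, counts[1:]):
        conversion = n / prev_n if prev_n else 0.0
        report.append({"from": prev_name, "to": name, "conversion": round(conversion, 3)})
    return report


def largest_drop(counts: list[tuple[str, int]]) -> dict:
    """Step transition with the lowest conversion, i.e. where to diagnose first."""
    return min(funnel_report(counts), key=lambda step: step["conversion"])


# Illustrative case counts only.
steps = [
    ("eligible", 1000), ("exposed", 900), ("engaged", 450),
    ("assisted", 400), ("influenced", 300), ("completed", 280),
]
print(largest_drop(steps))
```

With these numbers the weakest transition is exposed to engaged, which the table associates with unclear value, an unnatural entry point or slow response.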

For AML investigator adoption, the funnel might be:

eligible AML alerts
  -> alerts with copilot panel visible
  -> investigator requests alert summary
  -> summary used in investigation note
  -> note passes QA with no material correction
  -> alert aging reduced
  -> investigator returns to use in next high-risk case

11. Cohort Analysis

Without cohorts, adoption metrics blend together different populations, managers, risk levels, case complexity and training waves.

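| Cohort dimension | Why it matters |
| --- | --- |
| Role / level | Novices and experts adopt AI for opposite reasons: novices need guidance, experts need speed and precision |
| Team / manager | Manager reinforcement often affects sustained adoption more than training does |
| Region / branch | Local policy, customer base, performance pressure and capacity constraints differ |
| Case type / risk tier | High adoption in low-risk scenarios does not mean high-risk scenarios are ready to scale |
| Tenure / experience | AI may help new hires shorten ramp, or may feel like a distraction to experts |
| Training wave | Separates product improvement from enablement improvement |
| Model / prompt version | Adoption changes may come from quality changes rather than from the change program |
| Feature flag exposure | Enables matched cohort or stepped-wedge rollout |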

Recommended analyses:

  • New vs experienced investigator adoption curve.
  • Manager A vs manager B differences and coaching patterns.
  • Low-risk vs high-risk KYC case penetration.
  • Pre-training vs post-training qualified use.
  • Accept/edit/reject mix before and after a model version change.
  • Effect of contact-center queue type on AHT, QA and repeat contact.
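Several of these analyses reduce to the same operation: splitting the human response mix by a cohort field. A minimal sketch, assuming events carry `cohort_id` and `user_action` as in the schema; the `action_mix_by_cohort` helper and the sample events are hypothetical.

```python
from collections import Counter, defaultdict


def action_mix_by_cohort(events: list[dict], cohort_field: str = "cohort_id") -> dict:
    """Share of each user_action within each cohort."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for event in events:
        counts[event[cohort_field]][event["user_action"]] += 1
    return {
        cohort: {action: n / sum(c.values()) for action, n in c.items()}
        for cohort, c in counts.items()
    }


sample_events = [
    {"cohort_id": "new", "user_action": "accept"},
    {"cohort_id": "new", "user_action": "accept"},
    {"cohort_id": "new", "user_action": "accept"},
    {"cohort_id": "new", "user_action": "edit"},
    {"cohort_id": "experienced", "user_action": "accept"},
    {"cohort_id": "experienced", "user_action": "reject"},
]
mix = action_mix_by_cohort(sample_events)
print(mix)
```

Passing a different `cohort_field`, such as `manager_id_hash` or `prompt_version`, gives the manager and model-version views with the same code.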

12. Behavior Change Model

Prosci ADKAR can serve as a diagnostic framework for behavior change, but in AI scenarios it must be engineered into telemetry and operating cadence.

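| ADKAR stage | AI adoption interpretation | Evidence |
| --- | --- | --- |
| Awareness | Users know why the change is happening and which process problem AI solves | launch narrative recall, manager briefing, problem framing survey |
| Desire | Users are willing to try it and believe it benefits them without hurting their performance | opt-in demand, champion participation, resistance signal trend |
| Knowledge | Users know when to use it, when not to, and how to escalate | policy boundary quiz, in-product guidance use, correct escalation |
| Ability | Users can complete the new way of working on real cases | qualified task completion, light-edit acceptance, rework reduction |
| Reinforcement | The new behavior is reinforced by managers, SOPs, metrics and feedback loops | manager coaching log, SOP update, returning use, performance review alignment |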

12.1 Resistance Signals

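| Signal | Possible cause | Product / BA / operations action |
| --- | --- | --- |
| High ignore rate | Intrusive entry point, wrong output timing | Adjust trigger and placement |
| High regenerate rate | Unstable output, or users do not know how to ask | Revise prompts, template the task, improve retrieval |
| High reject with "policy unclear" | Boundaries are not trusted | Add policy citations, approval boundaries and explanations |
| Shadow use of external AI | The official tool does not meet real work needs | Analyze the unmet need, improve the sanctioned tool |
| Manager discourages use | Unclear incentives or risk accountability | Update SOP, RACI and manager scorecard |
| Users accept then rewrite | Format does not fit the business artifact | Use the business artifact as the output contract |
| Adoption concentrated in champions | Insufficient organizational reinforcement | Establish peer coaching and team-level cadence |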

12.2 Change Saturation

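Change saturation is a variable that advanced adoption analysis must include. A team may not be resisting AI; it may be absorbing a core system migration, new product launches, regulatory remediation, reorganization and performance pressure all at once.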

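| Saturation factor | Adoption implication |
| --- | --- |
| Concurrent process changes | AI adoption drop-off may come from process instability |
| Staffing shortage | Users have no time to learn or give feedback |
| High queue backlog | Short-term pressure drives copy/paste or control bypass |
| Policy changes | Users are reluctant to trust AI output |
| Manager turnover | The reinforcement loop breaks |
| Incentive conflict | Users are rewarded for speed but bear the quality risk |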

A scale decision must include a change load review; otherwise an organizational capacity problem will be misread as product failure or user resistance.


13. Outcome Attribution

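The hard part of AI value proof is not "the metric moved" but "why the metric moved". Financial retail usually cannot run simple A/B tests at will, because of constraints around customer fairness, operational capacity, risk tiers and regulatory sensitivity. A more robust combination of evidence can be used instead.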

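| Method | Applicable scenario | Risk |
| --- | --- | --- |
| Matched cohort comparison | Similar teams or cases are available for comparison | Inadequate matching introduces bias |
| Stepped-wedge rollout | Rollout in batches that eventually covers the whole target population | Requires strong rollout discipline |
| Difference-in-differences | Pre- and post-launch data exist for both a treated and a control group | External changes may interfere |
| Interrupted time series | A stable long-run metric exists | Concurrent policy or queue changes need explaining |
| Shadow mode comparison | Evaluation while AI output does not affect production decisions | Cannot prove user behavior change |
| Workflow replay | Compare suggestion quality and handling path on historical cases | Historical data has limited representativeness |
| Manager-level variance analysis | Analyze the effect of reinforcement on adoption | May be confounded by differences in team capability |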

An attribution report should state explicitly:

  • Baseline period.
  • Exposure and eligibility logic.
  • Cohort selection.
  • Confounders, such as staffing, seasonality, policy changes and campaigns.
  • Cost and human review adjustment.
  • Quality and risk guardrail.
  • Confidence level in business language, without dressing it up as absolute causality.
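As a worked example of one method from the table, the difference-in-differences estimate subtracts the control group's change from the treated group's change. The sketch below assumes group means are already computed and that the two groups would have moved in parallel without the AI rollout; the `diff_in_diff` helper and the cycle-time numbers are hypothetical.

```python
def diff_in_diff(treated_pre: float, treated_post: float,
                 control_pre: float, control_post: float) -> float:
    """(treated change) minus (control change); valid only under parallel trends."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Illustrative mean cycle times in hours per case.
effect = diff_in_diff(treated_pre=10.0, treated_post=8.0,
                      control_pre=10.5, control_post=10.0)
print(f"estimated effect on cycle time: {effect:+.1f} hours")
```

Here the treated teams improved by 2.0 hours, but the control teams also improved by 0.5 hours over the same period, so only 1.5 hours is attributed to the rollout. The listed confounders still need to be reviewed before that figure is reported.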

14. Value Leakage

A common problem in AI projects is that gross benefit looks great while net benefit is eaten away by leakage.

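| Leakage type | Example | What to measure |
| --- | --- | --- |
| Human review load | AI-generated content saves 5 minutes, but QA spends 7 more minutes | reviewer minutes, queue depth, defect rate |
| Rework | Work is sent back after the user accepts a suggestion | first-pass quality, re-open, correction reason |
| Control override | Users frequently bypass AI, or AI triggers too many false positives | override rate, false alert burden |
| Support burden | Frontline staff keep asking how to use or interpret the tool | help desk tickets, manager coaching time |
| Customer harm | Wrong suggestions lead to complaints or mislead customers | complaint linkage, customer correction events |
| Latency | AI wait time offsets the manual time saved | latency p95, abandon rate |
| Model and vendor cost | Token/license cost per case rises | unit cost per completed case |
| Change cost | Training, SOPs, manager meetings and dual running during transition | enablement cost, dual-run cost |
| Trust debt | Early errors lead to long-term non-use | post-incident adoption decay |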

A mature value realization formula:

Net realized value
= gross process benefit
- AI run cost
- human review load
- rework and exception cost
- support and change cost
- risk/control remediation cost
- customer harm adjustment
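The formula can be expressed as a small ledger so that every deduction is named explicitly. This is a sketch under the assumption that all terms are expressed in one common unit (minutes per case here, or currency per period); the `ValueLedger` class is hypothetical, and the example figures reuse the 5-minutes-saved, 7-minutes-of-QA case from the leakage table.

```python
from dataclasses import dataclass


@dataclass
class ValueLedger:
    """All fields must share one unit, e.g. minutes per case or currency per month."""
    gross_process_benefit: float
    ai_run_cost: float = 0.0
    human_review_load: float = 0.0
    rework_and_exception_cost: float = 0.0
    support_and_change_cost: float = 0.0
    risk_control_remediation_cost: float = 0.0
    customer_harm_adjustment: float = 0.0

    def net_realized_value(self) -> float:
        """Gross benefit minus every leakage term in the formula."""
        deductions = (
            self.ai_run_cost
            + self.human_review_load
            + self.rework_and_exception_cost
            + self.support_and_change_cost
            + self.risk_control_remediation_cost
            + self.customer_harm_adjustment
        )
        return self.gross_process_benefit - deductions


# 5 minutes saved per case, 7 extra QA minutes per case.
ledger = ValueLedger(gross_process_benefit=5.0, human_review_load=7.0)
print(ledger.net_realized_value())
```

In that example the net value is negative even though the gross saving looks positive, which is exactly the leakage pattern a usage chart would hide.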

15. Risk / Control Architecture

Adoption analytics serves not only growth but also risk control.

| Risk | Adoption analytics signal | Control response |
| --- | --- | --- |
| Over-reliance | Very high accept rate, low edit distance, rising defects | sampling QA, friction in high-risk cases, confidence explanation |
| Under-reliance | High reject/ignore despite strong quality evidence | workflow placement, trust building, manager coaching |
| Automation bias | Users accept wrong suggestions, especially in high-risk cases | mandatory rationale, dual review, challenge prompts |
| Deskilling | New hires only copy AI and independent judgment declines | skill assessment, rotating unaided review, training |
| Bypass / shadow AI | External AI use or copying of sensitive content | sanctioned tool improvement, DLP monitoring, policy communication |
| Control override abuse | Frequent override without reason | override reason required, manager review, risk sampling |
| Hidden backlog transfer | The front office speeds up with AI while the back-office review queue overflows | end-to-end queue monitoring |
| Unequal adoption | Some branches or teams are excluded | cohort coverage review, access remediation |
| Model drift impact | Reject, defect and complaint rise after a new version | version-linked monitoring, rollback trigger |

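A control override is not inherently bad. A mature system must distinguish:

  • Healthy override: the user finds that AI does not apply and escalates correctly.
  • Suspicious override: the user bypasses a necessary control to gain speed.
  • Product-caused override: the AI output format or boundary does not fit the work.
  • Policy-caused override: unclear rules leave users unwilling to adopt.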
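A first-pass triage of override events into these four groups could look like the sketch below. The rules are assumptions made for illustration, not a validated policy: the `classify_override` helper, the `escalated` field and the mapping from `reason_code` values to categories are hypothetical, and anything the rules cannot place is routed to manual review rather than guessed.

```python
def classify_override(event: dict) -> str:
    """Rule-of-thumb triage of an override event; ambiguous cases go to review."""
    reason = event.get("reason_code")
    if not event.get("override_reason"):
        return "suspicious"  # no rationale recorded for bypassing AI or a control
    if reason == "policy_unclear":
        return "policy_caused"
    if reason in {"incomplete", "not_relevant", "too_slow"}:
        return "product_caused"
    if reason in {"inaccurate", "unsafe"} and event.get("escalated", False):
        return "healthy"  # user judged the AI unfit and escalated
    return "needs_review"


print(classify_override({
    "reason_code": "inaccurate",
    "override_reason": "summary omitted a linked account",
    "escalated": True,
}))
```

The counts per category, cut by cohort and risk tier, are more informative than a single override rate, because each category points to a different owner: risk, policy, product or the manager.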

16. Operating Model

Adoption analytics needs an explicit cadence; otherwise the dashboard will not change anything.

16.1 Forums

| Forum | Cadence | Participants | Decision |
| --- | --- | --- | --- |
| Daily ops pulse | Daily or every two days | Ops manager, AI PM, support lead | Whether there are usage blockers, queue anomalies or control alerts |
| Weekly adoption review | Weekly | AI PM, BA, manager champions, analytics | funnel drop-off, resistance signals, backlog actions |
| Biweekly risk/control review | Biweekly | Risk, QA, ops, architect, PM | override, defect, complaint, human review load |
| Monthly value review | Monthly | Business owner, finance, Value Office, PM | benefit evidence, value leakage, scale/stop |
| Quarterly architecture review | Quarterly | Architect, platform, data, risk, product | telemetry coverage, platform reuse, model/tool lifecycle |

16.2 RACI

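| Activity | AI PM | BA | Architect | Ops Manager | Risk/Control | Analytics | Finance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Adoption event taxonomy | A/R | R | C | C | C | C | I |
| Work-as-done baseline | C | A/R | I | R | C | C | I |
| Telemetry architecture | C | C | A/R | I | C | R | I |
| Behavior funnel review | A/R | R | I | R | C | R | I |
| Outcome attribution | A | C | I | C | C | R | C |
| Risk/control evidence | C | C | C | R | A/R | C | I |
| Benefits sign-off | R | C | I | C | I | C | A/R |
| Scale/stop recommendation | A/R | C | C | C | C | C | C |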

16.3 Operational Learning Loop

Telemetry
  -> analysis
  -> hypothesis
  -> workflow/product/control change
  -> manager reinforcement
  -> monitored rollout
  -> evidence review
  -> scale / redesign / stop

If a review meeting only explains metrics and produces no backlog, SOP, training, control or release action, it is not an operating model; it is a reporting ceremony.


17. Financial Retail Patterns

17.1 AML Investigator Adoption

| Evidence layer | Good signal | Bad signal |
| --- | --- | --- |
| Qualified use | Copilot used on eligible alert types and investigation stages | Used mainly for low-risk easy cases |
| Behavior change | Investigation narratives reuse AI summaries with material analyst edits | Copy/paste without source verification |
| Quality | QA corrections decrease, missed evidence decreases | QA defects increase after high accept rate |
| Risk/control | Escalations occur when policy boundary is hit | Overrides without rationale |
| Value | Alert aging and prep time fall after review load adjustment | Review queue grows and SAR quality drops |

17.2 Contact-Center Agent Assist

| Evidence layer | Good signal | Bad signal |
| --- | --- | --- |
| Qualified use | Agent assist appears in target call reasons | Suggestions shown for irrelevant call types |
| Behavior change | Agents use policy-grounded response and reduce hold time | Agents read generic text that frustrates customers |
| Quality | QA score and first contact resolution improve | AHT drops but repeat contact rises |
| Risk/control | Sensitive topics trigger approved handoff | Agents use AI response outside policy boundary |
| Value | Net handle time reduction after QA and complaint adjustment | Saved seconds offset by after-call correction |

17.3 KYC Onboarding

| Evidence layer | Good signal | Bad signal |
| --- | --- | --- |
| Qualified use | AI completeness check used before customer chase | Used after analyst already completed manual review |
| Behavior change | Analysts request fewer unnecessary documents | AI flags too many deficiencies |
| Quality | First-pass completion and approval quality improve | False deficiency notices increase |
| Risk/control | High-risk entities still receive required review | AI creates pressure to under-review |
| Value | Cycle time falls without increased remediation | Customer frustration and rework rise |

17.4 Credit Ops

| Evidence layer | Good signal | Bad signal |
| --- | --- | --- |
| Qualified use | Used for collateral summary and covenant extraction in target products | Used for final credit judgment |
| Behavior change | Analysts find missing conditions earlier | Analysts stop reading source documents |
| Quality | Approval package defects decrease | Exception approvals increase without rationale |
| Risk/control | Human decision rights remain explicit | Control override lacks audit trail |
| Value | Faster package prep and lower rework | Faster throughput with worse downstream losses |

17.5 Branch / Relationship Manager Copilot

| Evidence layer | Good signal | Bad signal |
| --- | --- | --- |
| Qualified use | Used before client meeting for permitted insight prep | Used to generate unapproved advice |
| Behavior change | RM records better next actions and follow-ups | Tool becomes a generic note generator |
| Quality | Follow-up completion and customer relevance improve | Compliance review flags unsuitable content |
| Risk/control | Advice boundary and disclosure controls fire | RM bypasses prompts to get sales script |
| Value | Relationship actions improve retention or cross-sell quality | Short-term sales lift creates complaint risk |

18. Evidence Pack

A scale decision needs an evidence pack, not a usage chart.

| Evidence object | Content |
| --- | --- |
| Problem statement | Business problem, target process and target users; not an AI technology wish |
| Work-as-done baseline | Current steps, roles, pain points, metrics, exceptions and informal workarounds |
| Adoption taxonomy | Event definitions, qualified-use definition, control override definition |
| Telemetry quality report | event completeness, join rate, missing fields, known limitations |
| Behavior funnel | eligible -> exposed -> engaged -> influenced -> completed -> improved |
| Cohort analysis | role, manager, region, case type, risk tier, training wave |
| Outcome attribution | Control-group, phased-rollout, time-series or other attribution methods and their limitations |
| Value leakage analysis | review load, rework, support, latency, cost, risk adjustment |
| Risk/control report | override, escalation, defect, complaint, over-reliance and under-reliance |
| User trust signals | feedback, qualitative themes, reason codes, trust calibration |
| Operating actions | product changes, SOP updates, training, manager coaching, control changes |
| Scale/stop recommendation | continue, redesign, scale, restrict, stop and why |

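Quality standards for the evidence pack:

  • Definitions are explainable.
  • Metrics are traceable.
  • Negative evidence is not hidden.
  • Cost and review burden have been deducted.
  • Risk and control are not an appendix on the last page.
  • There are clear next actions and owners.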

19. Anti-Patterns

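| Anti-pattern | Why it is dangerous | More mature practice |
| --- | --- | --- |
| Adoption = login / MAU | Cannot prove that work changed | Define a qualified adoption event |
| Training completion = adoption | Learning and using are two different things | Validate with a work-as-done behavior funnel |
| Prompt count = value | May reflect friction and rework | Connect to outcome, quality and review load |
| The higher the accept rate, the better | May be automation bias | Look at defects, overrides and QA together |
| Looking only at averages | Masks cohort, manager and case mix differences | Do cohort analysis and segmentation |
| Reporting only hours saved | Ignores review, risk, support and change cost | Do net value and leakage analysis |
| Using surveys in place of telemetry | Subjective feedback is not enough | Combine telemetry, case samples and outcomes |
| Ignoring resistance | User problems get blamed on non-cooperation | Diagnose trust, process, incentives and capacity |
| No baseline | Cannot prove change | Build a work-as-done baseline |
| No learning loop after launch | Metrics do not turn into improvement by themselves | Establish an operating review and action backlog |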

20. PM / BA / Architect Implications

20.1 For Senior AI PM

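  • Define the adoption event in the PRD; do not wait until after launch for analytics to guess.
  • Put the behavior funnel and scale/stop rule into the release criteria.
  • Treat user trust, control override and value leakage as product metrics.
  • When adoption is low, do not rush into training; first diagnose workflow fit, output contract, manager incentives and risk boundary.

20.2 For CBAP-level BA

  • Use the work-as-done baseline to capture real processes and hidden work.
  • Write adoption requirements as behavior change requirements, for example "investigator can complete narrative with cited evidence and appropriate escalation".
  • Define the resistance signal taxonomy and reason codes.
  • Make sure adoption events are linked to business rules, control points, exception paths and artifacts.

20.3 For AI Architect

  • Treat the telemetry schema as an architecture contract, not a front-end tracking checklist.
  • Let traces connect user action, model version, workflow stage, control event and outcome.
  • Support analysis by cohort, version, risk tier and case type.
  • Design retention, privacy class, evidence store and access control.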

21. Interview Answers

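Q1: How do you prove an AI tool is really adopted, rather than merely used?

I would define adoption as qualified workflow adoption, not login or prompt count. First, build a work-as-done baseline that makes the target users, process steps, case types, business artifacts and current pain points explicit. Then define the adoption event taxonomy: exposure, intent, output, accept/edit/reject, decision influence, control override, outcome link. Next, use the behavior funnel and cohort analysis to see whether users keep using it on real tasks, and connect that to cycle time, first-pass quality, rework, complaint, human review load and cost-to-serve. Finally, use the evidence pack to make the scale/stop decision, stating benefits, risks, value leakage and next actions.

Q2: What is the biggest difference between AI adoption metrics and traditional SaaS usage metrics?

Traditional SaaS usage focuses more on seat activation, DAU, feature clicks and retention. AI adoption must look at trust-calibrated behavior change, because AI output can affect judgment, customer communication, control execution and business records. High usage may indicate value, or it may indicate users repeatedly correcting errors or over-relying. So I would look at accept/edit/reject mix, override, escalation, defect, human review load, workflow outcome and durability together. Good AI adoption is not more clicks; it is changing work and improving results within the right boundaries.

Q3: If MAU is high after contact-center agent assist goes live but AHT has not fallen, how do you analyze it?

I would not start by assuming users are uncooperative. I would break down the behavior funnel: whether eligible calls get the right exposure, whether agents actually use the suggestions, whether suggestions are heavily edited or ignored, and whether time has moved from the call into after-call work or QA correction. I would also look at call reason cohort, agent tenure, manager team, latency, policy boundary hit, repeat contact, QA defect and customer complaint. Possible causes include an output format unsuited to live calls, policy citations that are not trusted enough, suggestions that arrive too late, users needing extra verification, or a more complex case mix. The next step is to use this evidence to decide on product improvement, process adjustment, manager coaching or restricting the scenarios.

Q4: How do you keep AI adoption from causing automation bias?

I would treat over-reliance as an adoption risk and not automatically read a high accept rate as success. In design, there should be confidence calibration, source evidence, policy boundary, high-risk friction, mandatory rationale, dual review and sampling QA. In metrics, look at accept rate in combination with defect, complaint, override, escalation appropriateness and edit distance. If, in high-risk cases, accept rate is very high and edit distance very low while QA defects rise, automation bias is likely. In governance, handle it through risk/control review and a version-linked rollback trigger.

Q5: How do you explain AI value realization to the CFO?

I would separate gross benefit from net realized value. Gross benefit may be saved handling time or higher conversion, but net realized value must deduct AI run cost, license, token, human review load, rework, support, training, dual-run, control remediation and customer harm adjustment. Then use cohorts or phased rollout to explain how credible the attribution is, and use finance sign-off to lock the definitions. For the CFO, success is not "users like AI" but value that is repeatable, attributable and still holds after deducting cost and risk.

Q6: How would you design adoption evidence for an AML investigator copilot?

I would start from the work-as-done baseline of alert investigation, distinguishing alert type, risk tier, investigator level and review path. Adoption events include summary requested, evidence viewed, narrative drafted, analyst edited, source checked, escalation triggered, QA correction and case closed. The core metrics are not the number of summaries generated, but eligible case penetration, material edit rate, QA defect reduction, alert aging, SAR prep quality, review load, override reason and high-risk case control adherence. Only when efficiency, quality and control evidence all hold would I recommend expanding the cohort.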


22. Portfolio Exercise

Goal: Design an AI adoption analytics evidence pack for a financial retail enterprise.

Scenario

The enterprise is advancing five AI use cases at the same time:

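| Use case | Business goal |
| --- | --- |
| AML investigator copilot | Shorten alert investigation aging and improve case narrative quality |
| Contact-center agent assist | Reduce hold time and improve first contact resolution |
| KYC onboarding assistant | Reduce document rework and customer follow-up requests |
| Credit ops package reviewer | Improve approval package first-pass quality |
| Branch / RM copilot | Improve customer follow-up quality and compliant next actions |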

Required Artifacts

  1. Work-as-done baseline map for one workflow.
  2. Adoption event taxonomy with at least 12 events.
  3. Telemetry schema subset with role, case type, workflow stage, model version, human action, control action and outcome link.
  4. Behavior funnel with drop-off diagnosis.
  5. Cohort plan by role, manager, case type, risk tier and training wave.
  6. Leading and lagging metrics hierarchy.
  7. Value leakage model including human review load.
  8. Risk/control evidence plan for over-reliance and control override.
  9. Monthly operating review agenda.
  10. Scale/stop recommendation memo.

Evaluation Rubric

| Criterion | Strong evidence |
| --- | --- |
| Baseline quality | Captures real workflow, exceptions and informal workarounds |
| Adoption definition | Distinguishes exposure, qualified use, decision influence and outcome |
| Metrics maturity | Combines behavior, quality, risk, value and durability |
| Attribution discipline | Uses cohort or rollout logic and names confounders |
| Risk integration | Treats over-reliance, override, review load and customer harm as first-class |
| Operating loop | Converts evidence into product, process, control and manager actions |
| Portfolio judgment | Recommends scale, redesign, restrict or stop with reasons |

23. Minimum Viable Architecture

If time is limited, build this minimum version:

  1. One canonical adoption event schema.
  2. Work-as-done baseline for the highest-value workflow.
  3. Behavior funnel dashboard with cohort filters.
  4. Human action capture: accept, edit, reject, ignore, regenerate, override, escalate.
  5. Outcome join to cycle time, quality, rework and complaint.
  6. Human review load and control override report.
  7. Monthly evidence pack with scale/stop recommendation.

The mature version adds attribution models, manager reinforcement analysis, model-version drift monitoring, finance sign-off, value leakage automation and portfolio-level adoption heatmaps.


24. Final Principle

AI adoption analytics should make three uncomfortable truths visible:

Users may touch AI without trusting it.
Users may trust AI without changing the process.
The process may speed up without creating durable net value.

The job of senior AI PM, AI Architect and CBAP-level BA is to design the evidence system that separates those cases and turns adoption from a vanity story into a governed operating capability.