AI Adoption Analytics: Behavior Change and Value Realization Architecture
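AI Adoption Analytics / Behavior Change / Value Realization Architecture: A Walkthrough
Target audience: Senior AI PM / AI Architect / Business Architect / CBAP-level BA / AI Transformation Lead / AI Value Office Lead / Financial Retail Operations Leader.
Learning objectives: Build an evidence architecture that can prove AI is genuinely adopted, changes how work is done, improves process outcomes and produces durable value, rather than reporting only logins, prompt counts, seat activations or demo satisfaction.
Core question: Once AI is live, how do we prove that frontline staff have actually absorbed it into work-as-done, that behavior and process are changing, that risk has not been shifted onto human review or customer harm, and that the value is more than a short-term novelty effect?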
Source Anchors
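The sources below supply the vocabulary used here for AI risk management, AI management systems, change management, observability, engineering performance and value evidence. This document is learning and portfolio material; it does not constitute a legal, compliance, audit or regulatory conclusion.
| Source | Link | Ideas adopted in this document |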
|---|---|---|
| NIST AI Risk Management Framework | https://www.nist.gov/itl/ai-risk-management-framework | Uses Govern / Map / Measure / Manage to organize AI adoption evidence, risk measurement, continuous monitoring and the remediation loop |
| ISO/IEC 42001 AI management system | https://www.iso.org/standard/81230.html | Uses the AI management system vocabulary of policy, objective, operation, performance evaluation and improvement to manage adoption and value realization |
| Prosci ADKAR | https://www.prosci.com/blog/adkar-model | Uses Awareness, Desire, Knowledge, Ability and Reinforcement to explain why behavior change is not the same as training completion rate |
| OpenTelemetry Documentation | https://opentelemetry.io/docs/ | Borrows the thinking behind traces, metrics, logs and semantic conventions to design AI adoption telemetry and workflow observability |
| DORA | https://dora.dev/ | Borrows the thinking behind deployment frequency, lead time, change fail rate and time to restore to connect AI product change, operational learning and engineering system health |
| NIST AI RMF Playbook | https://airc.nist.gov/AI_RMF_Knowledge_Base/Playbook | Uses the AI RMF's action-oriented language to organize evidence collection, monitoring and review routines |
| FFIEC Management IT Handbook | https://ithandbook.ffiec.gov/it-booklets/management.aspx | Uses governance, risk identification, monitoring and reporting language to match the evidence needs of financial-institution management |
In one sentence:
AI adoption analytics is not usage reporting; it is an evidence system that links exposure, trusted use, behavior change, workflow quality, control performance, unit economics and durable outcome realization.
1. Executive Summary
Many AI projects fail not because the model is entirely unusable, but because the enterprise cannot prove that the following chain holds:
AI shipped
-> target users were exposed in real workflow
-> users trusted it for the right tasks
-> behavior changed in work-as-done
-> process outcomes improved without hidden risk transfer
-> benefits were realized after cost, review load and control overhead
-> operating loop reinforced the new behavior
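Looking only at usage leads to serious misjudgment:
| Vanity signal | Why it is dangerous | Evidence to add |
|---|---|---|
| High prompt count | Users may be repeatedly fixing errors, exploring out of novelty, or being forced to use the tool | accepted suggestion rate, task completion, rework, sentiment, quality |
| High MAU | Users may only be logging in without reaching the key work steps | workflow step coverage, active case penetration, decision influence |
| High seat activation | Licenses may have been assigned without any behavior change | cohort adoption, returning qualified use, manager reinforcement |
| Time saved survey | Easy to overestimate; the time may also have shifted into review or complaints | cycle time, human review load, exception queue, quality and control evidence |
| Accuracy improvement | Model metrics do not automatically become business value | operational lift, policy compliance, loss reduction, cost-to-serve |
The job of a senior AI PM / Architect / BA is not to build an adoption dashboard, but to define:
- What counts as a real adoption event.
- What the current work-as-done baseline is.
- Which decision, artifact, handoff or control AI has changed in the process.
- How value metrics are adjusted for cohort, stage, risk, cost and quality.
- Whether low adoption is a product, process, trust, incentive, manager-cadence or change-saturation problem.
- When to scale, when to redesign and when to stop.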
2. Target Audience and Role Expectations
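| Role | Questions to own | Typical outputs |
|---|---|---|
| Senior AI PM | Whether target users fold the AI product into their core work continuously, correctly and willingly | adoption event taxonomy, behavior funnel, scale/stop memo |
| AI Architect | Whether telemetry, identity, workflow, policy, evidence and outcome data are traceable | adoption observability architecture, event schema, control trace |
| CBAP-level BA | Whether the real process, roles, rules, exceptions, resistance and value leakage have been modeled | work-as-done baseline, change impact map, stakeholder adoption analysis |
| Operations Lead | Whether AI improves queues, cycle time, quality, review burden and customer experience | operating review pack, manager coaching loop, exception taxonomy |
| Risk / Control Partner | Whether AI adoption introduces over-reliance, control bypass, erroneous escalation or audit blind spots | control override log, human review evidence, risk acceptance record |
| Finance / Value Office | Whether benefits are attributable, reproducible and scalable | benefits register, unit economics, value leakage analysis |
Mature organizations treat adoption as a cross-functional evidence system rather than handing it to the training team or a product analyst alone.
3. Thesis: Adoption Is Behavior Change, Not Usage Volume
The minimum unit of proof for AI adoption is not: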
user clicked AI button
but rather:
in a named workflow step, a target user used an AI output in a governed way that changed or improved the work artifact, decision, handoff, cycle time, quality, control performance or customer outcome.
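This means adoption must answer seven questions at once:
| Question | Explanation |
|---|---|
| Who | Who the target users are: novices, experts, managers, contractors, branches, regional teams or central operations |
| Where | In which process, system, queue, case type, customer journey and risk tier the AI is used |
| What | Whether the AI output influenced a summary, recommendation, decision, next action, customer reply, investigation note or control evidence |
| How | Whether the user accepted, edited, rejected, overrode, escalated, regenerated or bypassed the output |
| Why | The reason for use or rejection, and whether it relates to trust, quality, policy, incentives or time pressure |
| So what | Whether the behavior change moved cycle time, quality, risk, cost or customer experience |
| For how long | Whether the change persists across cohorts, time, managers, process versions and model versions |
In financial retail, AI adoption is usually workflow adoption:
- Whether an AML investigator uses the AI-generated case narrative to shorten evidence gathering ahead of SAR prep, not how many times the copilot was opened.
- Whether a contact-center agent adopts suggested answers within controlled scripting boundaries and lowers hold time and repeat contact, not how many suggestion impressions occurred.
- Whether a KYC onboarding analyst uses the AI document completeness check to reduce rework and customer chase, not how many OCR calls were made.
- Whether a credit ops analyst uses the AI collateral summary to spot missing conditions and raise first-pass approval quality, not how many summaries were generated.
- Whether a branch / relationship manager turns copilot insight into a compliant next customer action, not how many weekly active users there are.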
4. Conceptual Model: Adoption-to-Value Evidence Chain
An eight-layer evidence chain is recommended:
1. Exposure
2. Qualified use
3. Trust-calibrated use
4. Behavior change
5. Workflow quality
6. Outcome movement
7. Value realization
8. Reinforced operating loop
| Layer | Key question | Example metrics |
|---|---|---|
| Exposure | Do target users see AI in real work | eligible users exposed, case coverage, workflow step availability |
| Qualified use | Is AI used in the target task rather than for random exploration | qualified action rate, task-matched AI invocation, returning use |
| Trust-calibrated use | Do users accept in the right situations and reject or escalate in uncertain ones | accept/edit/reject/override mix, escalation appropriateness, trust calibration |
| Behavior change | Has the way of working changed | artifact reuse, new sequence adoption, manual step removal, handoff change |
| Workflow quality | Has process quality improved | first-pass quality, rework, exception rate, control defects |
| Outcome movement | Have business results moved | AHT, cycle time, STP, loss rate, conversion, complaints |
| Value realization | Is there a benefit after deducting cost and risk | net benefit, cost-to-serve, human review load, value leakage |
| Reinforcement | Does the organization reinforce the new behavior | manager coaching, SOP updates, performance cadence, feedback loop closure |
The chain is only as strong as its weakest link. A contact-center AI can have very high exposure, but if agents simply copy suggestions and complaints then rise, that is over-reliance, not successful adoption.
5. Work-as-Done Baseline
Adoption analytics without a baseline is just a launch story.
A work-as-done baseline must capture real work, including informal workarounds, human judgment, system switching, asking for help over chat and manager approvals. The BA adds a great deal of value here, because system logs capture only part of work-as-imagined.
| Baseline dimension | What to capture | Financial retail example |
|---|---|---|
| Case mix | Case type, risk tier, complexity, channel, region | AML alert type, KYC entity type, credit exception class |
| Actor map | Roles, experience levels, authority boundaries, manager intervention | investigator L1/L2, branch RM, contact-center specialist |
| Activity sequence | Actual steps and system switches | CRM -> policy search -> core banking -> notes -> supervisor chat |
| Decision points | Which judgments determine the next step | escalate, close, request document, approve with condition |
| Artifacts | Which business records and customer communications are produced | investigation narrative, call note, KYC deficiency notice |
| Controls | Which steps are control points | sanctions check, suitability disclosure, dual review |
| Pain points | Waiting, rework, missing information, unclear policy | document chase, duplicated note-taking, uncertain policy |
| Informal workarounds | What users actually use to cover system gaps | shared spreadsheet, saved templates, peer review chat |
| Current metrics | Current cycle time, quality, cost, queue, complaints | AHT, alert aging, first-pass yield, re-open rate |
A baseline cannot be built from interviews alone. A recommended combination:
- SME interview and observation.
- Process mining or workflow log analysis.
- Case note sampling.
- Screen-flow / clickstream review, within privacy and authorization boundaries.
- Manager review cadence and coaching artifacts.
- Exception queue and rework analysis.
- Customer complaint and quality assurance sampling.
6. Adoption Event Taxonomy
The adoption event taxonomy is the core of the whole architecture. It breaks "the user used AI" into events that are explainable, controllable and auditable.
| Event class | Event examples | Value implication |
|---|---|---|
| Exposure | ai_surface_shown, suggestion_presented, copilot_available_in_case | The user has an opportunity to adopt |
| Intent | open_assistant, ask_policy_question, request_case_summary | The user expresses task intent |
| AI output | summary_generated, recommendation_returned, next_best_action_returned | The system produces a usable output |
| Human response | accepted, edited, rejected, ignored, regenerated | Initial trust and usability |
| Decision influence | used_in_case_note, used_in_disposition, used_in_customer_response | AI enters a business artifact or decision |
| Control action | override, escalate, dual_review_requested, policy_boundary_hit | Risk control and trust calibration |
| Learning signal | feedback_positive, feedback_negative, reason_selected, defect_reported | Product and model learning |
| Outcome link | case_closed, customer_contact_completed, document_deficiency_resolved | Links to the process outcome |
| Reinforcement | manager_coached, sop_updated, job_aid_viewed, team_review_completed | The organization reinforces the new behavior |
The key is to distinguish:
AI touched the workflow
vs
AI changed the workflow
vs
AI improved the workflow
Many projects can prove only the first level, yet claim the third to management.
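The touched / changed / improved distinction can be made mechanical. The following is a minimal sketch, assuming the event class names from the taxonomy table; the function `evidence_level`, the grouping of classes into "touch" and "change" sets, and the `outcome_improved` flag (which would come from a separate baseline comparison) are illustrative assumptions, not part of the source.

```python
from enum import IntEnum


class EvidenceLevel(IntEnum):
    NONE = 0
    TOUCHED = 1   # AI touched the workflow
    CHANGED = 2   # AI changed the workflow
    IMPROVED = 3  # AI improved the workflow


# Assumed grouping of taxonomy event classes (hypothetical).
TOUCH_CLASSES = {"exposure", "intent", "output", "response"}
CHANGE_CLASSES = {"influence", "control"}


def evidence_level(event_classes, outcome_improved):
    """Return the strongest claim the observed events for one case can support.

    event_classes: iterable of event_class strings seen on the case.
    outcome_improved: True only if a baseline comparison showed improvement.
    """
    classes = set(event_classes)
    changed = bool(classes & CHANGE_CLASSES)
    if changed and "outcome" in classes and outcome_improved:
        return EvidenceLevel.IMPROVED
    if changed:
        return EvidenceLevel.CHANGED
    if classes & TOUCH_CLASSES:
        return EvidenceLevel.TOUCHED
    return EvidenceLevel.NONE


print(evidence_level(["exposure", "output", "response"], False).name)
```

Under these assumptions, a case cannot be reported as improved unless an influence or control event and a linked outcome event both exist, which prevents an outcome gain from being credited to AI that never entered the work artifact.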
7. Reference Architecture
Adoption analytics must connect product, process, identity, model, control and business outcomes at the same time.
AI Experience Layer
-> Workflow Integration Layer
-> Adoption Telemetry Layer
-> Identity / Cohort / Entitlement Layer
-> Process and Outcome Data Layer
-> Risk / Control Evidence Layer
-> Analytics and Attribution Layer
-> Value Realization and Operating Review Layer
7.1 Architecture Components
| Component | Responsibility | Key design points |
|---|---|---|
| AI experience instrumentation | Capture what users see, ask, accept, edit, reject, give feedback on and escalate | event taxonomy, low-friction reason capture, privacy minimization |
| Workflow context resolver | Bind AI events to case, customer journey, process step, queue and risk tier | workflow id, case id, stage id, control point id |
| Identity and cohort service | Define eligible users, role, team, region, experience, training and manager | cohort analysis, manager effect, change saturation |
| Model and prompt registry | Record model, prompt, policy pack and tool version | Adoption can be linked to model version, release and eval results |
| Outcome data connector | Pull process and business outcomes | AHT, cycle time, rework, STP, quality score, loss, complaint |
| Human review tracker | Capture review load and control overrides | reviewer time, queue depth, override reason, defect severity |
| Risk evidence store | Store control hits, escalations, exceptions and audit trail | Does not replace compliance judgment, but supplies governance evidence |
| Analytics workspace | Funnel, cohort, attribution, leakage and saturation analysis | baseline comparison, matched cohorts, segment drilldown |
| Operating review pack | Turn metrics into action | manager coaching, product backlog, risk actions, scale/stop decision |
7.2 Telemetry Design Principles
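| Principle | Explanation |
|---|---|
| Instrument workflow, not only UI | Record AI's effect on business steps and artifacts |
| Preserve context | Events without case type, risk tier, role and stage are hard to interpret |
| Capture negative signals | Rejection, ignore, override, regenerate and complaint are all high-value evidence |
| Minimize sensitive payload | Event records should hold references, classifications and necessary summaries, and avoid unnecessary customer data |
| Version everything | Model, prompt, policy, workflow, training, SOP and feature flag should all be versioned |
| Connect to outcomes | Adoption events must be joinable to process quality and business outcomes |
| Make evidence reviewable | Metrics must be traceable to samples, definitions, measurement conventions and owners |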
8. Data / Telemetry Schema
Below is a minimum viable adoption event schema for a financial-retail AI copilot. It is not a physical database design but a canonical event contract that BA, PM and Architect use to align definitions.
| Field | Type | Description |
|---|---|---|
| event_id | string | Globally unique event ID |
| event_time | timestamp | Time the event occurred |
| event_name | enum | Event name from the taxonomy |
| event_class | enum | exposure / intent / output / response / influence / control / learning / outcome / reinforcement |
| user_id_hash | string | De-identified user identifier |
| role | enum | investigator / agent / analyst / RM / manager / QA / supervisor |
| team_id | string | Team or branch |
| manager_id_hash | string | De-identified manager, used for manager-effect analysis |
| cohort_id | string | pilot cohort, region, experience cohort, training cohort |
| workflow_id | string | AML investigation, KYC onboarding, credit ops, contact center, etc. |
| workflow_stage | string | triage, review, customer contact, decision, quality check |
| case_id_hash | string | De-identified case reference |
| case_type | string | alert type, call reason, KYC entity type, credit exception |
| risk_tier | enum | low / medium / high / material |
| customer_segment | string | retail, SMB, wealth, branch, digital |
| ai_surface | string | embedded panel, inline suggestion, draft generator, policy search |
| model_id | string | model registry id |
| prompt_version | string | prompt or policy pack version |
| tool_ids | array | agent tools or connectors used |
| output_type | enum | summary / recommendation / draft / classification / next action / risk flag |
| confidence_band | enum | calibrated band if used; avoid false precision |
| user_action | enum | accept / edit / reject / ignore / regenerate / override / escalate |
| edit_distance_band | enum | none / light / material / rewrite |
| reason_code | enum | useful / inaccurate / incomplete / unsafe / policy_unclear / too_slow / not_relevant |
| control_point_id | string | linked control or policy boundary |
| override_reason | string | required when user overrides AI or control suggestion |
| human_review_required | boolean | Whether human review is required |
| human_review_minutes | number | Review burden; may be backfilled later as an aggregate |
| downstream_artifact_id | string | note, letter, decision record, case narrative |
| outcome_event_id | string | linked process outcome event |
| latency_ms | number | User-perceived latency |
| cost_estimate | number | Estimated token, tool, license or unit cost |
| privacy_class | enum | event-only / sensitive-reference / restricted |
| retention_class | enum | analytics / business-record-link / control-evidence |
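To show how the contract could be enforced at ingestion, here is a minimal sketch that models a subset of the fields as a Python dataclass and checks the enum values plus the rule that `override_reason` is required on an override. The class `AdoptionEvent`, the function `validate` and the error messages are hypothetical; only the field names and enum values come from the schema table.

```python
from dataclasses import dataclass
from typing import List, Optional

# Enum values copied from the schema table.
EVENT_CLASSES = {"exposure", "intent", "output", "response", "influence",
                 "control", "learning", "outcome", "reinforcement"}
RISK_TIERS = {"low", "medium", "high", "material"}
USER_ACTIONS = {"accept", "edit", "reject", "ignore", "regenerate",
                "override", "escalate"}

OVERRIDE_MSG = "override_reason is required when user_action is 'override'"


@dataclass(frozen=True)
class AdoptionEvent:
    """Subset of the canonical event contract (illustrative only)."""
    event_id: str
    event_name: str
    event_class: str
    workflow_id: str
    workflow_stage: str
    role: str
    risk_tier: str
    user_action: Optional[str] = None
    override_reason: Optional[str] = None


def validate(event: AdoptionEvent) -> List[str]:
    """Return a list of contract violations; an empty list means valid."""
    problems = []
    if event.event_class not in EVENT_CLASSES:
        problems.append(f"unknown event_class: {event.event_class}")
    if event.risk_tier not in RISK_TIERS:
        problems.append(f"unknown risk_tier: {event.risk_tier}")
    if event.user_action is not None and event.user_action not in USER_ACTIONS:
        problems.append(f"unknown user_action: {event.user_action}")
    if event.user_action == "override" and not event.override_reason:
        problems.append(OVERRIDE_MSG)
    return problems


event = AdoptionEvent("e1", "control_overridden", "control",
                      "aml_alert_investigation", "decision",
                      "investigator", "high", user_action="override")
print(validate(event))
```

Returning a list of problems rather than raising on the first one is a design choice that suits a telemetry quality report, where the aim is to count every kind of contract violation.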
8.1 Example Events
{
"event_name": "ai_recommendation_edited",
"event_class": "response",
"workflow_id": "aml_alert_investigation",
"workflow_stage": "case_narrative_draft",
"role": "investigator",
"case_type": "transaction_monitoring_alert",
"risk_tier": "high",
"output_type": "summary",
"user_action": "edit",
"edit_distance_band": "material",
"reason_code": "incomplete",
"control_point_id": "aml_secondary_review_required",
"human_review_required": true
}
{
"event_name": "suggested_answer_accepted",
"event_class": "influence",
"workflow_id": "contact_center_agent_assist",
"workflow_stage": "customer_response",
"role": "agent",
"case_type": "card_dispute_status",
"risk_tier": "medium",
"output_type": "draft",
"user_action": "accept",
"edit_distance_band": "light",
"reason_code": "useful",
"outcome_event_id": "call_completed"
}
9. Metrics Hierarchy
Adoption analytics needs a metrics hierarchy; otherwise teams will treat usage, the easiest thing to collect, as value.
Telemetry metrics
-> Adoption metrics
-> Behavior change metrics
-> Flow / quality metrics
-> Risk / control metrics
-> Value metrics
-> Durability metrics
| Layer | Metrics | Interpretation |
|---|---|---|
| Telemetry | event completeness, schema coverage, trace join rate | Is the data trustworthy |
| Adoption | eligible-user active rate, qualified use rate, returning qualified use, case penetration | Is AI used by the target population in the target task |
| Behavior | accept/edit/reject mix, artifact reuse, manual step reduction, handoff change | Has behavior changed |
| Flow / quality | cycle time, queue aging, first-pass quality, rework, QA defects | Has the process improved |
| Risk / control | override rate, escalation appropriateness, over-reliance signals, control defects | Is risk under control |
| Value | net hours released, cost-to-serve, loss reduction, conversion lift, complaint reduction | Has benefit been produced |
| Durability | cohort retention, post-novelty persistence, manager variance, model version stability | Is it sustainable |
9.1 Leading and Lagging Indicators
| Type | Examples | Use |
|---|---|---|
| Leading | qualified use rate, accepted-with-light-edit rate, feedback density, manager coaching completion | Judge whether adoption is building early momentum |
| Intermediate | first-pass quality, rework reduction, review queue depth, policy search time | Judge whether behavior is improving the process |
| Lagging | cost reduction, loss reduction, revenue lift, complaint reduction, regulatory finding reduction | Judge whether value has been realized |
Do not treat a leading indicator as closing the business case. It can only show that an initiative is worth continued observation or is a scale candidate; on its own it cannot prove benefit.
9.2 Anti-Metrics
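| Anti-metric | What it may indicate |
|---|---|
| Prompt count rises but cycle time does not fall | Users are fighting the system |
| Accept rate is very high but defects rise | Over-reliance or missing review |
| Reject rate is high with reason "policy unclear" | Boundaries and scripting are not trusted |
| Human review load rises by more than the time saved | Value is leaking into the review queue |
| Manager variance is very large | Adoption depends on local champions and is not institutionalized |
| Initial lift fades after 4 weeks | Novelty effect or insufficient reinforcement |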
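Three of the anti-metric patterns above can be expressed as simple rule checks. This is a sketch only: the function `anti_metric_flags`, the metric key names, the flag labels and the 0.9 accept-rate threshold are all assumptions introduced for illustration, and real thresholds would need to be set per workflow and risk tier.

```python
def anti_metric_flags(m, high_accept_rate=0.9):
    """Flag anti-metric patterns in one period's metrics.

    m is a dict with hypothetical keys:
      prompt_count_change, cycle_time_change, defect_rate_change:
          period-over-period change (positive means an increase)
      accept_rate: share of AI outputs accepted, 0..1
      review_minutes_added, minutes_saved: per-case minutes
    """
    flags = []
    # Prompt count rises but cycle time does not fall.
    if m["prompt_count_change"] > 0 and m["cycle_time_change"] >= 0:
        flags.append("users_fighting_system")
    # Accept rate is very high but defects rise.
    if m["accept_rate"] >= high_accept_rate and m["defect_rate_change"] > 0:
        flags.append("possible_over_reliance")
    # Human review load rises by more than the time saved.
    if m["review_minutes_added"] > m["minutes_saved"]:
        flags.append("value_leaking_to_review_queue")
    return flags


period = {
    "prompt_count_change": 0.3,
    "cycle_time_change": 0.0,
    "accept_rate": 0.95,
    "defect_rate_change": 0.02,
    "review_minutes_added": 7,
    "minutes_saved": 5,
}
print(anti_metric_flags(period))
```

A flag is a prompt for investigation in the weekly review, not a verdict; each one still needs cohort and case-mix context before anyone acts on it.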
10. Behavior Funnel
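The behavior funnel moves adoption from "seen" to "stably changed".
| Funnel step | Definition | Drop-off diagnosis |
|---|---|---|
| Eligible | User and case match the target scenario | Wrong cohort definition, incomplete entitlement |
| Exposed | AI appears at the right work step | UI / workflow integration is not in place |
| Engaged | User actively opens or responds to AI | Value is unclear, entry point is unnatural, response is slow |
| Assisted | AI output is read and enters the task | Output is irrelevant or in the wrong format |
| Influenced | User accepts, adopts after editing, or uses the output to decide | Low trust, unclear policy boundary, unstable quality |
| Completed | Task or artifact is completed | Blocked by a downstream system or approval |
| Improved | Cycle time, quality or risk metrics improve | AI only moved the work and did not change the bottleneck |
| Reinforced | Managers, SOPs, training and performance cadence support the new way | Adoption depends on individual enthusiasm |
For AML investigator adoption, the funnel might be: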
eligible AML alerts
-> alerts with copilot panel visible
-> investigator requests alert summary
-> summary used in investigation note
-> note passes QA with no material correction
-> alert aging reduced
-> investigator returns to use in next high-risk case
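Step-to-step conversion makes the drop-off diagnosis concrete. The sketch below computes conversion between consecutive funnel steps and names the step with the weakest conversion; the functions `funnel_report` and `largest_drop_off`, the step labels and the counts are made up for illustration.

```python
def funnel_report(steps):
    """steps: ordered list of (step_name, count).

    Returns one dict per step with its conversion from the previous step
    (None for the first step).
    """
    report = []
    prev = None
    for name, count in steps:
        if prev is None:
            rate = None
        else:
            rate = count / prev if prev else 0.0
        report.append({"step": name, "count": count, "step_conversion": rate})
        prev = count
    return report


def largest_drop_off(steps):
    """Name of the step with the lowest conversion from its predecessor."""
    rows = [r for r in funnel_report(steps) if r["step_conversion"] is not None]
    worst = min(rows, key=lambda r: r["step_conversion"], default=None)
    return None if worst is None else worst["step"]


# Hypothetical counts for the AML funnel above.
aml_funnel = [
    ("eligible_alerts", 1000),
    ("panel_visible", 800),
    ("summary_requested", 400),
    ("summary_used_in_note", 300),
    ("note_passed_qa", 270),
]
print(largest_drop_off(aml_funnel))
```

With these invented counts the weakest transition is from panel visible to summary requested, which the drop-off table would read as an engagement problem (unclear value, unnatural entry point or slow response) rather than a quality problem.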
11. Cohort Analysis
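Without cohorts, adoption metrics blend different populations, managers, risk levels, case complexity and training waves together.
| Cohort dimension | Why it matters |
|---|---|
| Role / level | Novices and experts adopt AI for different reasons: novices need guidance, experts need speed and precision |
| Team / manager | Manager reinforcement often affects sustained adoption more than training does |
| Region / branch | Local policy, customer base, performance pressure and capacity constraints differ |
| Case type / risk tier | High adoption in low-risk scenarios does not mean high-risk scenarios are ready to scale |
| Tenure / experience | AI may shorten ramp-up for new hires, or feel like interference to experts |
| Training wave | Separates product improvement from enablement improvement |
| Model / prompt version | Adoption changes may come from quality changes rather than the change program |
| Feature flag exposure | Supports matched cohort or stepped-wedge rollout |
Recommended analyses:
- New vs experienced investigator adoption curve.
- Differences between managers (manager A vs manager B) and their coaching patterns.
- Low-risk vs high-risk KYC case penetration.
- Pre-training vs post-training qualified use.
- Accept/edit/reject mix before and after a model version change.
- Effect of contact-center queue type on AHT, QA and repeat contact.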
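A small example shows why a blended average misleads. The sketch below computes qualified use rate per cohort from case-level records; the function `qualified_use_rate_by_cohort`, the record fields and the sample data are hypothetical.

```python
from collections import defaultdict


def qualified_use_rate_by_cohort(cases, cohort_key):
    """Share of cases with qualified AI use, grouped by one cohort field.

    cases: list of dicts, each with the cohort field and a boolean
    "qualified_use" (field names are illustrative).
    """
    totals = defaultdict(int)
    qualified = defaultdict(int)
    for case in cases:
        key = case[cohort_key]
        totals[key] += 1
        if case["qualified_use"]:
            qualified[key] += 1
    return {key: qualified[key] / totals[key] for key in totals}


# Made-up KYC cases: 4 low-risk (3 qualified), 4 high-risk (1 qualified).
cases = (
    [{"risk_tier": "low", "qualified_use": True}] * 3
    + [{"risk_tier": "low", "qualified_use": False}]
    + [{"risk_tier": "high", "qualified_use": True}]
    + [{"risk_tier": "high", "qualified_use": False}] * 3
)
print(qualified_use_rate_by_cohort(cases, "risk_tier"))
```

In this invented sample the overall rate is 50%, but it splits into 75% for low-risk and 25% for high-risk cases, which is exactly the situation where low-risk adoption says little about readiness to scale into high-risk work.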
12. Behavior Change Model
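Prosci ADKAR can serve as a diagnostic framework for behavior change, but in AI settings it must be engineered into telemetry and the operating cadence.
| ADKAR stage | AI adoption interpretation | Evidence |
|---|---|---|
| Awareness | Users know why the change is happening and which process problem AI solves | launch narrative recall, manager briefing, problem framing survey |
| Desire | Users are willing to try it and believe it benefits them without hurting their performance | opt-in demand, champion participation, resistance signal trend |
| Knowledge | Users know when to use it, when not to, and how to escalate | policy boundary quiz, in-product guidance use, correct escalation |
| Ability | Users can complete the new way of working on real cases | qualified task completion, light-edit acceptance, rework reduction |
| Reinforcement | The new behavior is reinforced by managers, SOPs, metrics and feedback loops | manager coaching log, SOP update, returning use, performance review alignment |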
12.1 Resistance Signals
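| Signal | Likely cause | Product / BA / operations action |
|---|---|---|
| High ignore rate | Intrusive entry point, badly timed output | Adjust trigger and placement |
| High regenerate rate | Unstable output, or users do not know how to ask | Revise prompts, template the task, improve retrieval |
| High reject with "policy unclear" | Boundaries are not trusted | Add policy citations, approval boundaries and explanations |
| Shadow use of external AI | The official tool does not meet real work needs | Analyze the unmet need and improve the sanctioned tool |
| Manager discourages use | Unclear incentives or risk accountability | Update SOP, RACI and manager scorecard |
| Users accept then rewrite | Format does not fit the business artifact | Use the business artifact as the output contract |
| Adoption concentrated in champions | Insufficient organizational reinforcement | Establish peer coaching and a team-level cadence |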
12.2 Change Saturation
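Change saturation is a variable that advanced adoption analysis must include. A team may not be resisting AI; it may be absorbing a core system migration, new product launches, regulatory remediation, reorganization and performance pressure all at once.
| Saturation factor | Adoption implication |
|---|---|
| Concurrent process changes | AI adoption drop-off may come from an unstable process |
| Staffing shortage | Users have no time to learn or give feedback |
| High queue backlog | Short-term pressure drives copy/paste or control bypass |
| Policy changes | Users are reluctant to trust AI output |
| Manager turnover | The reinforcement loop breaks |
| Incentive conflict | Users are rewarded for speed but carry the quality risk |
A scale decision must include a change load review; otherwise an organizational capacity problem will be misread as product failure or user resistance.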
13. Outcome Attribution
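The hard part of AI value proof is not that a metric moved but why it moved. Financial retail usually cannot run a simple A/B test at will, because of constraints around customer fairness, operational capacity, risk tier and regulatory sensitivity. A more robust combination of evidence can be used instead.
| Method | When it applies | Risk |
|---|---|---|
| Matched cohort comparison | Similar teams or cases are available for comparison | Inadequate matching introduces bias |
| Stepped-wedge rollout | Rollout is staged but eventually covers the whole target population | Requires strong rollout discipline |
| Difference-in-differences | Pre/post data exist for both a launch group and a control group | External changes may interfere |
| Interrupted time series | A stable long-run metric exists | Concurrent policy or queue changes must be explained |
| Shadow mode comparison | Evaluation while AI output does not affect production decisions | Cannot prove user behavior change |
| Workflow replay | Historical cases are used to compare suggestion quality and handling paths | Historical data may not be representative |
| Manager-level variance analysis | Analyzing the effect of reinforcement on adoption | May be confounded by differences in team capability |
An attribution report should state explicitly:
- Baseline period.
- Exposure and eligibility logic.
- Cohort selection.
- Confounders, such as staffing, seasonality, policy changes and campaigns.
- Cost and human review adjustment.
- Quality and risk guardrail.
- Confidence level stated in business language, without dressing it up as absolute causality.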
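As a worked example of one method from the table, the following sketch computes a basic difference-in-differences estimate from four group means. The function name and the AHT figures are invented for illustration, and the estimate is only meaningful under the usual parallel-trends assumption, which the report should argue for rather than assume.

```python
def difference_in_differences(treated_pre, treated_post,
                              control_pre, control_post):
    """Change in the treated group minus change in the control group.

    Inputs are group means of the same metric (for example AHT in minutes)
    before and after launch. Assumes both groups would have followed
    parallel trends without the AI rollout.
    """
    treated_change = treated_post - treated_pre
    control_change = control_post - control_pre
    return treated_change - control_change


# Invented AHT means in minutes.
effect = difference_in_differences(
    treated_pre=10.0, treated_post=8.5,   # teams with agent assist
    control_pre=10.5, control_post=10.0,  # comparable teams without it
)
print(effect)
```

Here the treated teams improved by 1.5 minutes, but the control teams also improved by 0.5 minutes over the same period, so only 1.0 minute of the reduction is attributed to the rollout. A naive before/after comparison would have claimed the full 1.5.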
14. Value Leakage
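A common problem in AI projects is that gross benefit looks attractive while net benefit is eaten away by leakage.
| Leakage type | Example | What to measure |
|---|---|---|
| Human review load | AI-generated content saves 5 minutes, but QA spends 7 extra minutes | reviewer minutes, queue depth, defect rate |
| Rework | Work is returned after the user accepted a suggestion | first-pass quality, re-open, correction reason |
| Control override | Users frequently bypass AI, or AI triggers too many false positives | override rate, false alert burden |
| Support burden | Frontline staff keep asking how to use or interpret the tool | help desk tickets, manager coaching time |
| Customer harm | Wrong suggestions lead to complaints or mislead customers | complaint linkage, customer correction events |
| Latency | AI wait time offsets the manual time saved | latency p95, abandon rate |
| Model and vendor cost | Token/license cost per case rises | unit cost per completed case |
| Change cost | Training, SOPs, manager meetings and dual running during transition | enablement cost, dual-run cost |
| Trust debt | Early errors lead to long-term disuse | post-incident adoption decay |
A mature value realization formula: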
Net realized value
= gross process benefit
- AI run cost
- human review load
- rework and exception cost
- support and change cost
- risk/control remediation cost
- customer harm adjustment
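The formula can be written as a small calculation so that every deduction is explicit. In this sketch the `ValueInputs` dataclass mirrors the terms of the formula one for one; the class name, function name and all monetary figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ValueInputs:
    """One period's value terms, all in the same currency unit."""
    gross_process_benefit: float
    ai_run_cost: float
    human_review_load: float
    rework_and_exception_cost: float
    support_and_change_cost: float
    risk_control_remediation_cost: float
    customer_harm_adjustment: float


def net_realized_value(v: ValueInputs) -> float:
    """Gross benefit minus every leakage term in the formula above."""
    deductions = (
        v.ai_run_cost
        + v.human_review_load
        + v.rework_and_exception_cost
        + v.support_and_change_cost
        + v.risk_control_remediation_cost
        + v.customer_harm_adjustment
    )
    return v.gross_process_benefit - deductions


# Invented monthly figures.
month = ValueInputs(
    gross_process_benefit=100_000.0,
    ai_run_cost=15_000.0,
    human_review_load=30_000.0,
    rework_and_exception_cost=10_000.0,
    support_and_change_cost=20_000.0,
    risk_control_remediation_cost=5_000.0,
    customer_harm_adjustment=5_000.0,
)
print(net_realized_value(month))
```

With these made-up numbers, a gross benefit of 100,000 shrinks to a net 15,000, and human review load is the largest single leak. Keeping each term as a named field also forces the team to agree on an owner and a measurement convention for every deduction.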
15. Risk / Control Architecture
Adoption analytics serves risk control as well as growth.
| Risk | Adoption analytics signal | Control response |
|---|---|---|
| Over-reliance | Very high accept rate, low edit distance, rising defects | sampling QA, friction in high-risk cases, confidence explanation |
| Under-reliance | High reject/ignore despite strong quality evidence | workflow placement, trust building, manager coaching |
| Automation bias | Users accept wrong suggestions, especially in high-risk cases | mandatory rationale, dual review, challenge prompts |
| Deskilling | New hires only copy AI and independent judgment declines | skill assessment, rotating unaided review, training |
| Bypass / shadow AI | External AI use or copying of sensitive content | sanctioned tool improvement, DLP monitoring, policy communication |
| Control override abuse | Frequent override without reason | override reason required, manager review, risk sampling |
| Hidden backlog transfer | AI speeds up the front office while back-office review overflows | end-to-end queue monitoring |
| Unequal adoption | Some branches or teams are excluded | cohort coverage review, access remediation |
| Model drift impact | Reject, defect and complaint rates rise after a new version | version-linked monitoring, rollback trigger |
A control override is not inherently bad. A mature system must distinguish:
- Healthy override: the user finds that AI does not apply and escalates correctly.
- Suspicious override: the user bypasses a required control to gain speed.
- Product-caused override: the AI output format or boundary does not fit the work.
- Policy-caused override: unclear rules make the user unwilling to adopt.
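The four override types can be given a first-pass triage rule so that sampling effort goes to the suspicious ones. This is a heuristic sketch only: the function `classify_override`, its parameters, and the mapping from `reason_code` values to override types are assumptions introduced here, and any real mapping would need agreement with risk and control partners.

```python
def classify_override(reason_code, escalated, control_point_skipped):
    """First-pass triage of one override event (heuristic, not a ruling).

    reason_code: a value from the schema's reason_code enum, or None.
    escalated: True if the user escalated after overriding.
    control_point_skipped: True if a required control was bypassed.
    """
    # A skipped control, a missing reason or a pure speed motive is suspicious.
    if control_point_skipped or reason_code in (None, "too_slow"):
        return "suspicious"
    if reason_code == "policy_unclear":
        return "policy_caused"
    if reason_code in ("not_relevant", "incomplete"):
        return "product_caused"
    if reason_code in ("inaccurate", "unsafe") and escalated:
        return "healthy"
    # Anything else goes to a human reviewer.
    return "needs_review"


print(classify_override("unsafe", escalated=True, control_point_skipped=False))
```

The fallback value `needs_review` is deliberate: the rule set is meant to route overrides to the right forum, not to replace the manager review and risk sampling listed in the table.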
16. Operating Model
Adoption analytics needs an explicit cadence; otherwise the dashboard changes nothing.
16.1 Forums
| Forum | Cadence | Participants | Decision |
|---|---|---|---|
| Daily ops pulse | Daily or every two days | Ops manager, AI PM, support lead | Whether there are usage blockers, queue anomalies or control alerts |
| Weekly adoption review | Weekly | AI PM, BA, manager champions, analytics | funnel drop-off, resistance signals, backlog actions |
| Biweekly risk/control review | Every two weeks | Risk, QA, ops, architect, PM | override, defect, complaint, human review load |
| Monthly value review | Monthly | Business owner, finance, Value Office, PM | benefit evidence, value leakage, scale/stop |
| Quarterly architecture review | Quarterly | Architect, platform, data, risk, product | telemetry coverage, platform reuse, model/tool lifecycle |
16.2 RACI
| Activity | AI PM | BA | Architect | Ops Manager | Risk/Control | Analytics | Finance |
|---|---|---|---|---|---|---|---|
| Adoption event taxonomy | A/R | R | C | C | C | C | I |
| Work-as-done baseline | C | A/R | I | R | C | C | I |
| Telemetry architecture | C | C | A/R | I | C | R | I |
| Behavior funnel review | A/R | R | I | R | C | R | I |
| Outcome attribution | A | C | I | C | C | R | C |
| Risk/control evidence | C | C | C | R | A/R | C | I |
| Benefits sign-off | R | C | I | C | I | C | A/R |
| Scale/stop recommendation | A/R | C | C | C | C | C | C |
16.3 Operational Learning Loop
Telemetry
-> analysis
-> hypothesis
-> workflow/product/control change
-> manager reinforcement
-> monitored rollout
-> evidence review
-> scale / redesign / stop
If a review meeting only explains the metrics and produces no backlog, SOP, training, control or release action, it is not an operating model, only a reporting ceremony.
17. Financial Retail Patterns
17.1 AML Investigator Adoption
| Evidence layer | Good signal | Bad signal |
|---|---|---|
| Qualified use | Copilot used on eligible alert types and investigation stages | Used mainly for low-risk easy cases |
| Behavior change | Investigation narratives reuse AI summaries with material analyst edits | Copy/paste without source verification |
| Quality | QA corrections decrease, missed evidence decreases | QA defects increase after high accept rate |
| Risk/control | Escalations occur when policy boundary is hit | Overrides without rationale |
| Value | Alert aging and prep time fall after review load adjustment | Review queue grows and SAR quality drops |
17.2 Contact-Center Agent Assist
| Evidence layer | Good signal | Bad signal |
|---|---|---|
| Qualified use | Agent assist appears in target call reasons | Suggestions shown for irrelevant call types |
| Behavior change | Agents use policy-grounded response and reduce hold time | Agents read generic text that frustrates customers |
| Quality | QA score and first contact resolution improve | AHT drops but repeat contact rises |
| Risk/control | Sensitive topics trigger approved handoff | Agents use AI response outside policy boundary |
| Value | Net handle time reduction after QA and complaint adjustment | Saved seconds offset by after-call correction |
17.3 KYC Onboarding
| Evidence layer | Good signal | Bad signal |
|---|---|---|
| Qualified use | AI completeness check used before customer chase | Used after analyst already completed manual review |
| Behavior change | Analysts request fewer unnecessary documents | AI flags too many deficiencies |
| Quality | First-pass completion and approval quality improve | False deficiency notices increase |
| Risk/control | High-risk entities still receive required review | AI creates pressure to under-review |
| Value | Cycle time falls without increased remediation | Customer frustration and rework rise |
17.4 Credit Ops
| Evidence layer | Good signal | Bad signal |
|---|---|---|
| Qualified use | Used for collateral summary and covenant extraction in target products | Used for final credit judgment |
| Behavior change | Analysts find missing conditions earlier | Analysts stop reading source documents |
| Quality | Approval package defects decrease | Exception approvals increase without rationale |
| Risk/control | Human decision rights remain explicit | Control override lacks audit trail |
| Value | Faster package prep and lower rework | Faster throughput with worse downstream losses |
17.5 Branch / Relationship Manager Copilot
| Evidence layer | Good signal | Bad signal |
|---|---|---|
| Qualified use | Used before client meeting for permitted insight prep | Used to generate unapproved advice |
| Behavior change | RM records better next actions and follow-ups | Tool becomes a generic note generator |
| Quality | Follow-up completion and customer relevance improve | Compliance review flags unsuitable content |
| Risk/control | Advice boundary and disclosure controls fire | RM bypasses prompts to get sales script |
| Value | Relationship actions improve retention or cross-sell quality | Short-term sales lift creates complaint risk |
18. Evidence Pack
A scale decision needs an evidence pack, not a usage chart.
| Evidence object | Content |
|---|---|
| Problem statement | The business problem, target process and target users, not an AI technology wish |
| Work-as-done baseline | Current steps, roles, pain points, metrics, exceptions and informal workarounds |
| Adoption taxonomy | Event definitions, the qualified-use definition, control override definition |
| Telemetry quality report | event completeness, join rate, missing fields, known limitations |
| Behavior funnel | eligible -> exposed -> engaged -> influenced -> completed -> improved |
| Cohort analysis | role, manager, region, case type, risk tier, training wave |
| Outcome attribution | Control-group, staged-rollout, time-series or other attribution method, with its limitations |
| Value leakage analysis | review load, rework, support, latency, cost, risk adjustment |
| Risk/control report | override, escalation, defect, complaint, over-reliance and under-reliance |
| User trust signals | feedback, qualitative themes, reason codes, trust calibration |
| Operating actions | product changes, SOP updates, training, manager coaching, control changes |
| Scale/stop recommendation | continue, redesign, scale, restrict, stop and why |
Quality criteria for an evidence pack:
- Definitions are explainable.
- Metrics are traceable.
- Negative evidence is not hidden.
- Cost and review burden have been deducted.
- Risk and control are not an appendix on the last page.
- There are clear next actions and named owners.
19. Anti-Patterns
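| Anti-pattern | Why it is dangerous | More mature practice |
|---|---|---|
| Adoption = login / MAU | Cannot prove that work changed | Define a qualified adoption event |
| Training completion = adoption | Learning and using are two different things | Verify with a work-as-done behavior funnel |
| Prompt count = value | May reflect friction and rework | Connect to outcome, quality and review load |
| Higher accept rate is always better | May be automation bias | Look at defects, overrides and QA together |
| Looking only at averages | Hides cohort, manager and case mix differences | Do cohort analysis and segmentation |
| Reporting only hours saved | Ignores review, risk, support and change cost | Do net value and leakage analysis |
| Using surveys in place of telemetry | Subjective feedback is not enough | Combine telemetry, case samples and outcomes |
| Ignoring resistance | Blames users for simply not cooperating | Diagnose trust, process, incentives and capacity |
| No baseline | Cannot prove change | Establish a work-as-done baseline |
| No learning loop after launch | Metrics do not turn into improvement by themselves | Establish an operating review and action backlog |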
20. PM / BA / Architect Implications
20.1 For Senior AI PM
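- Define adoption events in the PRD; do not wait until after launch and leave analytics to guess.
- Put the behavior funnel and the scale/stop rule into the release criteria.
- Treat user trust, control override and value leakage as product metrics.
- When adoption is low, do not rush into training; first diagnose workflow fit, output contract, manager incentives and risk boundary.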
20.2 For CBAP-level BA
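- Use the work-as-done baseline to capture the real process and hidden work.
- Write adoption requirements as behavior-change requirements, for example "investigator can complete narrative with cited evidence and appropriate escalation".
- Define a resistance signal taxonomy and reason codes.
- Ensure adoption events are linked to business rules, control points, exception paths and artifacts.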
20.3 For AI Architect
- Treat the telemetry schema as an architecture contract, not a front-end tracking checklist.
- Have traces connect user action, model version, workflow stage, control event and outcome.
- Support analysis by cohort, version, risk tier and case type.
- Design retention, privacy class, evidence store and access control.
21. Interview Answers
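Q1: How do you prove that an AI tool has really been adopted, rather than merely used?
I would define adoption as qualified workflow adoption, not login or prompt count. First, establish the work-as-done baseline and pin down the target users, process steps, case types, business artifacts and current pain points. Then define the adoption event taxonomy: exposure, intent, output, accept/edit/reject, decision influence, control override and outcome link. Next, use the behavior funnel and cohort analysis to see whether users keep using AI in real tasks, and connect that to cycle time, first-pass quality, rework, complaints, human review load and cost-to-serve. Finally, use the evidence pack to make the scale/stop decision, stating benefits, risks, value leakage and next actions.
Q2: What is the biggest difference between AI adoption metrics and traditional SaaS usage metrics?
Traditional SaaS usage focuses on seat activation, DAU, feature clicks and retention. AI adoption has to look at trust-calibrated behavior change, because AI output can influence judgment, customer communication, control execution and business records. High usage may signal value, or it may signal users repeatedly fixing errors or over-relying on the tool. So I would look at accept/edit/reject mix, override, escalation, defects, human review load, workflow outcome and durability together. Good AI adoption is not more clicks; it is changed work and better results within the right boundaries.
Q3: If MAU is high after contact-center agent assist goes live but AHT has not fallen, how do you analyze it?
I would not start by assuming users are uncooperative. I would break down the behavior funnel: whether eligible calls get the right exposure, whether agents actually use the suggestions, whether suggestions are heavily edited or ignored, and whether time has moved from the call into after-call work or QA correction. I would also look at call reason cohort, agent tenure, manager team, latency, policy boundary hits, repeat contact, QA defects and customer complaints. Possible causes include an output format that does not suit a live call, policy citations that are not trusted, suggestions that arrive too late, users needing extra verification, or a more complex case mix. The next step is to use this evidence to choose between product improvement, process adjustment, manager coaching or restricting the scenarios.
Q4: How do you keep AI adoption from creating automation bias?
I would treat over-reliance as an adoption risk and not read a high accept rate as success by default. The design should include confidence calibration, source evidence, policy boundaries, high-risk friction, mandatory rationale, dual review and sampling QA. On metrics, look at accept rate in combination with defects, complaints, overrides, escalation appropriateness and edit distance. If accept rate is very high and edit distance very low in high-risk cases while QA defects rise, automation bias is likely. On governance, handle it through the risk/control review and a version-linked rollback trigger.
Q5: How do you explain AI value realization to a CFO?
I would separate gross benefit from net realized value. Gross benefit may be saved handling time or higher conversion, but net realized value must deduct AI run cost, license, tokens, human review load, rework, support, training, dual running, control remediation and customer harm adjustment. Then I would use cohort or staged rollout to show how credible the attribution is, and use finance sign-off to lock the definitions. For a CFO, success is not "users like AI" but value that is repeatable, attributable and still holds after cost and risk are deducted.
Q6: How would you design adoption evidence for an AML investigator copilot?
I would start from the work-as-done baseline of alert investigation and separate alert type, risk tier, investigator level and review path. Adoption events include summary requested, evidence viewed, narrative drafted, analyst edited, source checked, escalation triggered, QA correction and case closed. The core metrics are not the number of summaries generated but eligible case penetration, material edit rate, QA defect reduction, alert aging, SAR prep quality, review load, override reason and high-risk case control adherence. I would recommend expanding the cohort only when efficiency, quality and control evidence all hold at the same time.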
22. Portfolio Exercise
Goal: design an AI adoption analytics evidence pack for a financial-retail enterprise.
Scenario
The enterprise is advancing five AI use cases at the same time:
| Use case | Business goal |
|---|---|
| AML investigator copilot | Shorten alert investigation aging and improve case narrative quality |
| Contact-center agent assist | Lower hold time and improve first contact resolution |
| KYC onboarding assistant | Reduce document rework and customer document chasing |
| Credit ops package reviewer | Improve approval package first-pass quality |
| Branch / RM copilot | Improve customer follow-up quality and compliant next actions |
Required Artifacts
- Work-as-done baseline map for one workflow.
- Adoption event taxonomy with at least 12 events.
- Telemetry schema subset with role, case type, workflow stage, model version, human action, control action and outcome link.
- Behavior funnel with drop-off diagnosis.
- Cohort plan by role, manager, case type, risk tier and training wave.
- Leading and lagging metrics hierarchy.
- Value leakage model including human review load.
- Risk/control evidence plan for over-reliance and control override.
- Monthly operating review agenda.
- Scale/stop recommendation memo.
Evaluation Rubric
| Criterion | Strong evidence |
|---|---|
| Baseline quality | Captures real workflow, exceptions and informal workarounds |
| Adoption definition | Distinguishes exposure, qualified use, decision influence and outcome |
| Metrics maturity | Combines behavior, quality, risk, value and durability |
| Attribution discipline | Uses cohort or rollout logic and names confounders |
| Risk integration | Treats over-reliance, override, review load and customer harm as first-class |
| Operating loop | Converts evidence into product, process, control and manager actions |
| Portfolio judgment | Recommends scale, redesign, restrict or stop with reasons |
23. Minimum Viable Architecture
If time is limited, build this minimum version:
- One canonical adoption event schema.
- Work-as-done baseline for the highest-value workflow.
- Behavior funnel dashboard with cohort filters.
- Human action capture: accept, edit, reject, ignore, regenerate, override, escalate.
- Outcome join to cycle time, quality, rework and complaint.
- Human review load and control override report.
- Monthly evidence pack with scale/stop recommendation.
The mature version adds attribution models, manager reinforcement analysis, model-version drift monitoring, finance sign-off, value leakage automation and portfolio-level adoption heatmaps.
24. Final Principle
AI adoption analytics should make three uncomfortable truths visible:
Users may touch AI without trusting it.
Users may trust AI without changing the process.
The process may speed up without creating durable net value.
The job of senior AI PM, AI Architect and CBAP-level BA is to design the evidence system that separates those cases and turns adoption from a vanity story into a governed operating capability.