AI Adoption Analytics / Behavior Change / Value Realization Playbook
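Goal: give Senior AI PMs, AI Architects and CBAP-level BAs an executable method for proving that AI in financial retail settings is genuinely adopted, changes how work is done, improves process quality and produces sustainable net value.
Core principle: AI adoption is not training completion rate, login count, prompt count or license activation rate; adoption is evidenced change in work behavior and realized value.
Applicable scenarios: AML investigator adoption, contact-center agent-assist, KYC onboarding, credit ops, branch / relationship manager copilot, internal operations copilots and AI workflow automation.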
1. Target Audience
| Audience | What they use this playbook for |
| --- | --- |
| Senior AI PM | Define adoption success criteria, behavior funnel, scale/stop gates and product improvement loop |
| AI Architect | Design telemetry, trace, schema, identity, workflow outcome join and evidence store |
| CBAP-level BA | Build work-as-done baseline, change impact map, resistance taxonomy and process outcome model |
| AI Value Office | Bring adoption evidence into portfolio value realization, funding gates and finance sign-off |
| Operations Leader | Use adoption evidence to manage manager coaching, SOP adjustments, queue load and service quality |
| Risk / Control Partner | Review over-reliance, control override, human review load, complaint and exception evidence |
2. Learning Objectives
After completing this playbook, you should be able to:
Turn AI usage metrics into an adoption event taxonomy.
Use a work-as-done baseline to identify where behavior changes in real work.
Design a telemetry schema that connects user action, workflow context, model version, control action and outcome.
Build a leading / lagging / risk / value / durability metrics hierarchy.
Diagnose adoption with behavior funnel, cohort analysis and resistance signals.
Build a behavior change operating model on ADKAR thinking, without treating training as adoption.
Calculate value leakage, including human review load, rework, support, latency, control overhead and customer harm adjustment.
Build an evidence pack and a scale/stop decision memo for an AI use case.
3. Executive Summary
After an AI project goes live, the typical report reads:
licenses activated: 1,200
weekly active users: 870
prompts submitted: 42,000
average satisfaction: 4.2/5
These numbers do not prove that AI created value. They only prove that people touched the tool. Mature adoption analytics must prove:
Eligible workflow population
-> real exposure
-> qualified task use
-> trust-calibrated human action
-> changed work artifact or decision
-> improved process flow / quality / control
-> realized net value
-> reinforced behavior over time
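This playbook breaks adoption analytics into 12 execution assets:
| Asset | Purpose |
| --- | --- |
| Work-as-done baseline | Capture the real process and the current value baseline |
| Adoption event taxonomy | Define what counts as real adoption |
| Telemetry schema | Make adoption measurable, traceable and reviewable |
| Metrics hierarchy | Prevent usage metrics from posing as business value |
| Behavior funnel | Locate adoption drop-off |
| Cohort analysis | Identify differences by role, manager, case type and risk tier |
| Resistance signal map | Explain why users do not use, misuse or work around the tool |
| Change saturation review | Judge whether the organization has capacity to absorb change |
| Outcome attribution model | Explain how outcome changes relate to AI |
| Value leakage model | Move from gross benefit to net realized value |
| Risk/control pack | Monitor over-reliance, override, review load and customer impact |
| Operating review loop | Turn evidence into product, process, control and management actions |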
4. Source Anchors
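These sources are used to align the language of governance, management systems, change management, observability and engineering performance. This document does not provide legal advice; all governance content is presented as product, architecture and management evidence design.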
5. Conceptual Model
5.1 Adoption-to-Value Chain
Problem and baseline
-> AI intervention
-> exposure
-> qualified use
-> human trust action
-> behavior change
-> process quality change
-> business outcome
-> net value
-> reinforcement
5.2 Definitions
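| Term | Definition |
| --- | --- |
| Exposure | Target users have the opportunity to see or invoke AI within a real work step |
| Qualified use | The user uses AI in the target task, target case type and target workflow stage |
| Trust-calibrated action | The user correctly accepts, edits, rejects, escalates or overrides AI output |
| Behavior change | Observable change in work sequence, artifacts, decisions, handoffs or control execution |
| Workflow outcome | Process results such as cycle time, quality, rework, queue, customer experience and risk control |
| Net realized value | Benefit after deducting run, review, rework, support, risk and change costs |
| Durability | Adoption and outcome persist after the novelty effect |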
5.3 The Senior Test
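If an AI use case cannot answer the following 6 questions, it should not enter scale:
What is the current work-as-done baseline?
Which events prove that users have genuinely adopted AI in the target workflow?
Which artifact, judgment, handoff or control did the adoption behavior change?
Which leading and lagging indicators prove process improvement?
Do human review load, override, rework and cost-to-serve consume the value?
How does the organization reinforce the new behavior through manager cadence, SOP, training and product improvement?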
6. Architecture Components
| Component | Owner | Execution details |
| --- | --- | --- |
| Workflow map | BA / Ops | AS-IS, work-as-done, exception path, control point, artifact map |
| Event taxonomy | AI PM / BA | Define exposure, intent, output, response, influence, control, outcome events |
| Instrumentation SDK | Architect / Engineering | Emit events with workflow context, model version and user action |
| Identity and cohort layer | Architect / Analytics | Role, team, manager, training wave, region, risk entitlement |
| Model and prompt registry | Platform / Architect | model_id, prompt_version, tool version, policy pack |
| Outcome connector | Data / Analytics | Join events to AHT, cycle time, quality, rework, STP, complaint, loss |
| Control evidence store | Risk / Architect | Overrides, escalations, QA defects, dual review, policy boundary hits |
| Adoption mart | Analytics | Curated tables for funnel, cohort, attribution and value leakage |
| Dashboard and evidence pack | PM / Value Office | Monthly operating pack and scale/stop memo |
| Operational learning loop | PM / Ops | Product backlog, SOP update, coaching, training, control tuning |
6.1 Reference Data Flow
AI surface
-> adoption event stream
-> workflow context resolver
-> event validation and privacy filtering
-> adoption mart
-> outcome and control joins
-> behavior funnel / cohort / value analytics
-> operating review and action backlog
6.2 Architecture Principles
| Principle | Design implication |
| --- | --- |
| Context first | Every event carries workflow_id, stage, role, case type and risk tier |
| Version everything | model_id, prompt_version, policy_pack_version, SOP_version and feature_flag |
| Capture human judgment | accept/edit/reject/ignore/regenerate/override/escalate are first-class |
| Do not over-collect payload | Store event facts and references, not unnecessary customer content |
| Link to outcomes | Adoption metrics without outcome join are not value evidence |
| Preserve negative evidence | Rejection, complaint, defect and bypass signals drive learning |
| Reviewability | Metric definitions, lineage and sample drilldown must be inspectable |
7. Work-as-Done Baseline Template
Use this before instrumentation. Do not start with the AI tool; start with how work happens today.
| Field | Questions | Example: KYC onboarding |
| --- | --- | --- |
| Workflow | Which end-to-end process? | New SMB account onboarding |
| Trigger | What starts the work? | Application submitted with documents |
| Actor | Who does the work? | KYC analyst, RM, onboarding ops, QA |
| Case mix | What types and complexity? | Sole proprietor, LLC, high-risk country exposure |
| Systems | Which systems are used? | CRM, document store, screening, core banking |
| Artifacts | What records are created? | Deficiency notice, review note, approval record |
| Controls | Which control points matter? | Sanctions, beneficial ownership, risk rating |
| Pain points | Where is work slow or poor? | Repeated customer document chase |
| Informal work | What unofficial workarounds exist? | Analyst checklist spreadsheet |
| Current metrics | What is baseline? | Cycle time, first-pass completeness, rework |
| Failure modes | What causes defects? | Outdated policy, missing doc, unclear ownership |
| Change capacity | What else is changing? | New onboarding policy and CRM migration |
7.1 Baseline Evidence Sources
| Source | What it proves |
| --- | --- |
| SME observation | Actual sequence, friction and judgment |
| Process logs | Timing, queue, handoff and rework |
| Case notes | Artifact quality and evidence gaps |
| QA samples | Defect type and severity |
| Manager coaching logs | Behavioral patterns and recurring issues |
| Complaint records | Customer harm or confusion |
| Policy and SOP | Expected controls and business rules |
| Informal tools review | Workarounds not visible in system logs |
8. Adoption Event Taxonomy
8.1 Event Classes
| Class | Required events | Why it matters |
| --- | --- | --- |
| Exposure | AI panel shown, suggestion presented, feature available in eligible case | Proves opportunity to use |
| Intent | User opens assistant, asks task-specific question, requests summary | Proves user pull |
| Output | Summary, classification, recommendation, draft, next action generated | Proves AI response existed |
| Human response | Accept, edit, reject, ignore, regenerate | Proves trust and fit |
| Decision influence | Used in note, customer response, disposition, package, handoff | Proves workflow impact |
| Control action | Override, escalation, dual review, policy boundary hit | Proves governed use |
| Learning signal | Feedback reason, defect report, manager comment | Proves improvement signal |
| Outcome | Case closed, call completed, application approved, package passed QA | Proves process link |
| Reinforcement | Manager coaching, SOP update, training wave, team review | Proves behavior support |
8.2 Event Naming Convention
Use this pattern:
<workflow>.<stage>.<ai_surface>.<event_action>
Examples:
| Event name | Meaning |
| --- | --- |
| aml.triage.case_summary.generated | AML summary produced during triage |
| aml.investigation.narrative.accepted_with_edit | Investigator used AI narrative with edits |
| contact_center.customer_response.suggestion.rejected | Agent rejected suggested response |
| kyc.document_review.completeness_flag.overridden | Analyst overrode AI document flag |
| credit_ops.package_review.condition.extracted | Credit condition extracted into review package |
| branch.rm_prep.next_action.escalated | RM escalated AI next action due to policy boundary |
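The naming pattern can be checked mechanically before events reach the adoption mart. The following is a minimal sketch, assuming every segment is lowercase snake_case as in the examples; `parse_event_name` and `EventName` are illustrative names, not part of any existing SDK.

```python
import re
from typing import NamedTuple

# Assumption: each of the four segments is lowercase snake_case.
_SEGMENT = r"[a-z][a-z0-9_]*"
_PATTERN = re.compile(rf"^({_SEGMENT})\.({_SEGMENT})\.({_SEGMENT})\.({_SEGMENT})$")


class EventName(NamedTuple):
    workflow: str
    stage: str
    ai_surface: str
    event_action: str


def parse_event_name(name: str) -> EventName:
    """Split <workflow>.<stage>.<ai_surface>.<event_action> or raise ValueError."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"event name does not follow the convention: {name!r}")
    return EventName(*match.groups())
```

Rejecting malformed names at ingestion keeps the taxonomy joinable by workflow and stage without string cleanup later.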
8.3 Qualified Adoption Event
Recommended definition:
A qualified adoption event occurs when an eligible user, in an eligible workflow stage and case type, uses an AI output to influence a governed work artifact, decision, handoff or customer interaction, with human action and control outcome captured.
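The definition can be read as a conjunction of clauses, which makes it straightforward to test. The sketch below is one possible reading, not a definitive rule: the `AdoptionCandidate` shape is hypothetical, and treating only `accept` and `edit` as "using" the output is an assumption each team should set for itself.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: which user actions count as "using" the output is a local policy choice.
INFLUENCING_ACTIONS = {"accept", "edit"}


@dataclass
class AdoptionCandidate:
    user_eligible: bool
    stage_eligible: bool
    case_type_eligible: bool
    user_action: Optional[str]             # accept, edit, reject, ignore, ...
    downstream_artifact_id: Optional[str]  # note, decision, handoff or response
    control_outcome_captured: bool


def is_qualified_adoption(c: AdoptionCandidate) -> bool:
    """Apply each clause of the recommended definition in turn."""
    eligible = c.user_eligible and c.stage_eligible and c.case_type_eligible
    influenced = (
        c.user_action in INFLUENCING_ACTIONS
        and c.downstream_artifact_id is not None
    )
    return eligible and influenced and c.control_outcome_captured
```

An event that fails any single clause, for example an accepted suggestion with no linked artifact, stays in the usage count but not in the qualified adoption count.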
9. Data / Telemetry Schema
9.1 Canonical Event Contract
| Field | Required | Description |
| --- | --- | --- |
| event_id | Yes | Unique event id |
| event_time | Yes | Event timestamp |
| event_name | Yes | Taxonomy event name |
| event_class | Yes | Exposure, intent, output, response, influence, control, learning, outcome, reinforcement |
| user_id_hash | Yes | Pseudonymous worker id |
| role | Yes | Agent, investigator, analyst, RM, manager, QA |
| team_id | Yes | Team, branch, region or operations unit |
| manager_id_hash | Recommended | Enables manager effect analysis |
| cohort_id | Yes | Pilot wave, training wave or feature flag cohort |
| workflow_id | Yes | AML, KYC, contact center, credit ops, branch RM |
| workflow_stage | Yes | Triage, document review, customer response, QA, decision |
| case_id_hash | Yes | Pseudonymous case id |
| case_type | Yes | Alert type, call reason, product, onboarding type |
| case_complexity | Recommended | Low, medium, high or scoring band |
| risk_tier | Yes | Business risk tier |
| ai_surface | Yes | Panel, inline suggestion, draft generator, policy search |
| model_id | Yes | Model registry id |
| prompt_version | Yes | Prompt or policy pack version |
| tool_ids | Recommended | Tools or connectors invoked |
| output_type | Yes | Summary, recommendation, draft, classification, next action |
| user_action | Yes | Accept, edit, reject, ignore, regenerate, override, escalate |
| edit_distance_band | Recommended | None, light, material, rewrite |
| reason_code | Recommended | Useful, inaccurate, incomplete, unsafe, policy unclear, slow, irrelevant |
| control_point_id | Recommended | Link to control or policy boundary |
| override_reason | Conditional | Required when override occurs |
| human_review_required | Yes | True or false |
| human_review_minutes | Recommended | Review load |
| downstream_artifact_id | Recommended | Note, letter, case record or decision package |
| outcome_event_id | Recommended | Link to process outcome |
| latency_ms | Recommended | Response latency |
| cost_estimate | Recommended | Unit cost estimate |
| privacy_class | Yes | Event-only, sensitive-reference, restricted |
| retention_class | Yes | Analytics, business-record-link, control-evidence |
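A contract like this is only useful if it is enforced at ingestion. The sketch below shows one way to do that, assuming events arrive as plain dictionaries; `validate_event` is a hypothetical helper, and `REQUIRED_FIELDS` lists only a subset of the fields marked "Yes" to keep the example short.

```python
from typing import List

# Subset of the "Yes" fields from the contract, shortened for illustration.
REQUIRED_FIELDS = (
    "event_id", "event_time", "event_name", "event_class", "user_id_hash",
    "workflow_id", "workflow_stage", "case_id_hash", "risk_tier",
    "model_id", "prompt_version", "user_action", "human_review_required",
)


def validate_event(event: dict) -> List[str]:
    """Return contract violations; an empty list means the event passes."""
    # `is None` is used so that a legitimate False value is not flagged as missing.
    errors = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if event.get(field) is None
    ]
    # Conditional rule from the contract: override_reason is required on override.
    if event.get("user_action") == "override" and not event.get("override_reason"):
        errors.append("override_reason is required when user_action is 'override'")
    return errors
```

The share of events that return a non-empty list is a natural input to the event completeness and missing context measures in the telemetry quality layer.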
9.2 OpenTelemetry Mapping
| Adoption concept | Observability mapping |
| --- | --- |
| Case journey | Trace |
| Workflow step | Span |
| AI call | Span with model and prompt attributes |
| User action | Event on span |
| Control override | Event with control attributes |
| Outcome | Linked span or downstream event |
| Aggregate adoption | Metric |
| Defect or complaint | Log/event with trace link |
Example trace structure:
trace: kyc_onboarding_case_abc
  span: document_review
    event: ai_completeness_check.generated
    event: analyst_flag.accepted
  span: customer_deficiency_notice
    event: ai_notice_draft.edited
  span: qa_review
    event: qa_passed
10. Metrics Hierarchy
10.1 Metric Stack
| Layer | Metrics | Owner |
| --- | --- | --- |
| Telemetry quality | Event completeness, missing context, join rate, schema drift | Architect / Analytics |
| Exposure | Eligible users exposed, eligible case exposure, workflow placement coverage | PM / Analytics |
| Qualified adoption | Qualified use rate, returning qualified use, case penetration | PM |
| Trust and behavior | Accept/edit/reject mix, edit distance, override, escalation, artifact reuse | PM / BA |
| Flow and quality | Cycle time, AHT, queue aging, first-pass quality, rework, QA defects | Ops |
| Risk and control | Over-reliance, under-reliance, policy boundary hits, complaint linkage | Risk / QA |
| Value | Net hours released, cost-to-serve, loss reduction, conversion, complaint reduction | Value Office / Finance |
| Durability | 4/8/12-week retention, manager variance, post-release stability | PM / Ops |
10.2 Leading and Lagging Indicators
| Indicator type | Example | Decision use |
| --- | --- | --- |
| Leading | Eligible exposure, qualified use, light-edit acceptance, feedback density | Improve product and enablement |
| Behavior | Artifact reuse, reduced manual search, correct escalation, SOP-aligned action | Confirm work is changing |
| Flow | AHT, cycle time, queue depth, first-pass quality, rework | Confirm process improvement |
| Risk | Defect rate, control override, complaint linkage, review burden | Confirm governed adoption |
| Lagging | Cost reduction, revenue lift, loss reduction, customer retention | Confirm value realization |
| Durability | Cohort retention after 8-12 weeks, manager variance, version stability | Confirm scale readiness |
10.3 Metric Guardrails
| Metric | Why it must not be interpreted alone |
| --- | --- |
| High prompt count | Could mean confusion or poor output |
| High accept rate | Could mean automation bias |
| Low override rate | Could mean users do not understand controls |
| AHT reduction | Could hide repeat contact or QA rework |
| Time saved survey | Could ignore review load and support cost |
| High NPS from users | Could coexist with customer harm |
11. Behavior Change Model
11.1 ADKAR-to-Evidence Map
| ADKAR stage | Execution evidence | Analytics signal |
| --- | --- | --- |
| Awareness | Managers communicate why workflow changes | Awareness pulse, team briefing completion |
| Desire | Users believe AI helps their work and does not punish them | Opt-in demand, low resistance, champion pull |
| Knowledge | Users know when to use, avoid, escalate and override | Correct reason codes, policy quiz, guidance views |
| Ability | Users perform the new workflow in real cases | Qualified completion, light-edit acceptance, reduced rework |
| Reinforcement | Managers, SOP and metrics reinforce new behavior | Returning use, coaching logs, SOP_version adoption |
11.2 Resistance Signal Taxonomy
| Signal | Diagnostic question | Response |
| --- | --- | --- |
| Ignore | Is AI shown at the wrong time? | Move trigger closer to decision point |
| Reject | Is output inaccurate, irrelevant or untrusted? | Improve retrieval, prompt, source evidence |
| Regenerate | Is user trying to force a better answer? | Add structured task templates |
| Heavy edit | Is output format mismatched to artifact? | Redesign output contract |
| Override | Is user bypassing control or correcting AI? | Require reason and review patterns |
| Shadow AI | Is sanctioned tool missing a real need? | Bring unmet need into roadmap |
| Low returning use | Was initial experience poor or reinforcement absent? | Fix first-run quality and manager coaching |
| Team variance | Is adoption manager-led? | Add manager enablement and peer learning |
| Complaint rise | Is AI improving internal speed at customer expense? | Stop or restrict affected scenario |
11.3 Change Saturation Review
Score each factor 1-5 before scale:
| Factor | Evidence |
| --- | --- |
| Concurrent initiatives | Core migration, CRM change, policy update, org redesign |
| Queue pressure | Backlog, staffing shortage, overtime |
| Risk pressure | Active audit issue, recent incident, regulatory remediation |
| Manager capacity | Manager span, turnover, coaching time |
| Training load | Other mandatory training, certification, release fatigue |
| Process stability | SOP changes, exception volume, policy ambiguity |
Decision rule:
High value + high change saturation = controlled rollout with manager capacity plan, not broad launch.
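The scoring and decision rule above can be expressed as a small check. This is a sketch under stated assumptions: a higher score is taken to mean more saturation pressure, "high" saturation is taken to mean a mean score of 3.5 or more, and both the threshold and the `needs_controlled_rollout` helper are illustrative rather than prescribed by this playbook.

```python
FACTORS = (
    "concurrent_initiatives", "queue_pressure", "risk_pressure",
    "manager_capacity", "training_load", "process_stability",
)

# Assumption: a mean score at or above this value counts as high saturation.
HIGH_SATURATION_THRESHOLD = 3.5


def needs_controlled_rollout(scores: dict, high_value: bool) -> bool:
    """True when the use case is high value and the organization is saturated."""
    for name in FACTORS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be scored 1-5, got {scores[name]}")
    mean_score = sum(scores[name] for name in FACTORS) / len(FACTORS)
    return high_value and mean_score >= HIGH_SATURATION_THRESHOLD
```

A mean hides single extreme factors, so a team may prefer to also flag any individual factor scored 5.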
12. Behavior Funnel
12.1 Standard Funnel
| Step | Measure | Drop-off interpretation |
| --- | --- | --- |
| Eligible | Users and cases in target population | Scope and entitlement |
| Exposed | AI shown in target workflow | Integration and placement |
| Engaged | User opens or responds | Initial relevance and desire |
| Assisted | AI returns usable output | Model and retrieval quality |
| Influenced | User accepts, edits, uses in artifact or decision | Trust and workflow fit |
| Completed | Task completed in governed process | Downstream process compatibility |
| Improved | Flow, quality or control improves | Business value potential |
| Repeated | User returns in next eligible case | Durability |
| Reinforced | Manager or SOP supports behavior | Organizational adoption |
12.2 Contact-Center Agent-Assist Example
| Step | Metric |
| --- | --- |
| Eligible | Calls in target call reasons |
| Exposed | Percent of eligible calls where suggestion panel appears before response point |
| Engaged | Agent views or expands suggested response |
| Assisted | Suggestion returned with policy citation within latency SLO |
| Influenced | Agent accepts or edits suggestion into response |
| Completed | Call completes without transfer caused by AI confusion |
| Improved | AHT and after-call work decrease, FCR and QA do not decline |
| Repeated | Agent uses assist in next 5 eligible calls |
| Reinforced | Team manager reviews quality and coaching signals weekly |
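Funnel diagnosis comes down to step-to-step conversion and finding the weakest transition. The sketch below assumes funnel counts are already aggregated per step; `funnel_conversion` and `biggest_drop_off` are hypothetical helper names, and the numbers in the usage example are invented for illustration only.

```python
from typing import List, Tuple

Step = Tuple[str, int]


def funnel_conversion(counts: List[Step]) -> List[Tuple[str, float]]:
    """Return (step, conversion rate from the previous step) for each transition."""
    rates = []
    for (_, prev_n), (step, n) in zip(counts, counts[1:]):
        rates.append((step, n / prev_n if prev_n else 0.0))
    return rates


def biggest_drop_off(counts: List[Step]) -> str:
    """Name the step with the lowest conversion from its predecessor."""
    return min(funnel_conversion(counts), key=lambda pair: pair[1])[0]


# Illustrative counts only, not real data.
example = [
    ("eligible", 1000),
    ("exposed", 800),
    ("engaged", 600),
    ("assisted", 540),
    ("influenced", 270),
]
print(biggest_drop_off(example))  # influenced
```

In this made-up case the weakest transition is assisted to influenced, which the standard funnel interprets as a trust and workflow fit problem.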
13. Cohort Analysis
13.1 Required Cohorts
| Cohort | Why it matters |
| --- | --- |
| Role | Agent, investigator, analyst, RM and manager have different adoption paths |
| Experience | New users may need guidance; experts need precision and speed |
| Manager/team | Reinforcement is often manager-mediated |
| Region/branch | Local processes, customers and capacity differ |
| Case type | Adoption in easy cases does not prove hard-case adoption |
| Risk tier | High-risk adoption needs stronger control evidence |
| Training wave | Separates enablement from product quality |
| Feature flag | Supports staged rollout and attribution |
| Model/prompt version | Detects quality changes and trust debt |
13.2 Cohort Dashboard Rows
| Row | Columns |
| --- | --- |
| Team | Eligible cases, exposed cases, qualified use, influence rate, quality, risk, value |
| Manager | Returning use, resistance reasons, coaching completion, defect trend |
| Experience | Time to first qualified use, edit mix, rework, confidence |
| Case type | Penetration, accept/edit/reject, cycle time, QA defect, complaint |
| Risk tier | Adoption, override, escalation, dual review, defect severity |
14. Outcome Attribution
14.1 Evidence Options
| Method | Use when | Required evidence |
| --- | --- | --- |
| Matched cohort | Similar teams or cases exist | Matching variables, baseline balance |
| Stepped rollout | Rollout can be phased | Rollout schedule, no major confounder |
| Difference-in-differences | Pre/post and comparison group exist | Trend check, external change log |
| Interrupted time series | Long stable history exists | Seasonality, policy and staffing changes |
| Shadow mode | Pre-production quality evidence is needed | Historical case set, expert judgment |
| Workflow replay | The AI-assisted path must be compared with the current path | Replay rules, representative samples |
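Of the methods listed, difference-in-differences has the simplest arithmetic: the change in the AI-exposed group minus the change in the comparison group over the same period. The sketch below shows only that point estimate; the function name and the AHT figures are illustrative assumptions, and a real analysis would also need the trend check and change log named in the table.

```python
def difference_in_differences(
    treated_pre: float,
    treated_post: float,
    control_pre: float,
    control_post: float,
) -> float:
    """Point estimate: treated group change minus comparison group change."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Illustrative AHT in seconds, not real data.
effect = difference_in_differences(
    treated_pre=420, treated_post=380,   # teams with agent assist
    control_pre=425, control_post=415,   # comparison teams
)
print(effect)  # -30
```

Here the exposed teams improved by 40 seconds, but 10 seconds of that also appeared in the comparison teams, so only 30 seconds is attributed to the intervention.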
14.2 Attribution Checklist
| Item | Question |
| --- | --- |
| Baseline period | Is baseline long enough and representative? |
| Eligibility | Who and what cases were included? |
| Exposure | Did eligible users actually see AI? |
| Confounders | Staffing, campaign, policy, queue, seasonality, system changes? |
| Quality guardrail | Did quality, complaint or risk degrade? |
| Cost adjustment | Are review load and AI run costs included? |
| Durability | Does effect persist after initial novelty? |
| Decision | Does evidence support scale, redesign, restrict or stop? |
15. Value Realization and Leakage
15.1 Benefits Register
| Field | Definition |
| --- | --- |
| use_case_id | Unique AI use case |
| workflow_id | Target workflow |
| baseline_metric | Current performance |
| target_metric | Expected improvement |
| benefit_type | Cost, revenue, risk reduction, quality, capacity, customer experience |
| attribution_method | Cohort, rollout, time series, expert-reviewed |
| gross_benefit | Benefit before cost and leakage |
| AI_run_cost | Model, infra, license, vendor |
| review_load_cost | Human review and QA time |
| rework_cost | Corrections, reopened cases, repeat contact |
| support_change_cost | Training, manager coaching, support tickets |
| risk_adjustment | Control remediation, complaint, incident |
| net_realized_value | Gross benefit minus costs and leakage |
| finance_status | Draft, challenged, accepted, rejected |
| scale_decision | Scale, continue pilot, redesign, restrict, stop |
15.2 Value Leakage Patterns
| Leakage | Detection | Action |
| --- | --- | --- |
| Review load | Reviewer minutes per AI-assisted case rise | Improve confidence, sampling strategy, output quality |
| Rework | Reopen or correction rate rises | Fix artifact contract and source evidence |
| Support burden | Help desk or manager questions rise | Improve in-product guidance |
| Latency | User abandons or works around AI | Optimize model route or precompute |
| Control friction | Overrides and false positives rise | Tune policy boundary and escalation |
| Customer harm | Complaint or repeat contact rises | Restrict scenario and review content |
| Cost creep | Unit cost grows with scale | Cost guardrails and routing |
| Trust debt | Adoption drops after incident | Recovery plan and quality evidence |
net_realized_value
= gross_benefit
- AI_run_cost
- human_review_load_cost
- rework_cost
- support_and_change_cost
- risk_and_control_cost
- customer_harm_adjustment
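The formula translates directly into code. The sketch below mirrors its terms one for one; the `ValueInputs` container, the `leakage_ratio` helper and all figures are illustrative assumptions, not values from any real use case.

```python
from dataclasses import dataclass


@dataclass
class ValueInputs:
    gross_benefit: float
    ai_run_cost: float
    human_review_load_cost: float
    rework_cost: float
    support_and_change_cost: float
    risk_and_control_cost: float
    customer_harm_adjustment: float


def total_leakage(v: ValueInputs) -> float:
    """Sum of every deduction in the net realized value formula."""
    return (
        v.ai_run_cost
        + v.human_review_load_cost
        + v.rework_cost
        + v.support_and_change_cost
        + v.risk_and_control_cost
        + v.customer_harm_adjustment
    )


def net_realized_value(v: ValueInputs) -> float:
    return v.gross_benefit - total_leakage(v)


def leakage_ratio(v: ValueInputs) -> float:
    """Share of gross benefit consumed by costs and leakage (hypothetical metric)."""
    return total_leakage(v) / v.gross_benefit


# Illustrative annual figures only.
example = ValueInputs(500_000, 80_000, 120_000, 40_000, 30_000, 20_000, 10_000)
print(net_realized_value(example))  # 200000
print(leakage_ratio(example))       # 0.6
```

In this invented case human review load is the largest single deduction, which would point the operating review at sampling strategy and output quality first.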
16. Risk / Control Framework
16.1 Risk Signals
| Risk | Signal | Control |
| --- | --- | --- |
| Over-reliance | High accept, low edit, rising defects | Sampling QA, rationale, high-risk friction |
| Under-reliance | High reject despite good quality | Trust evidence, workflow placement, coaching |
| Control bypass | Override without reason | Mandatory reason, manager review |
| Hidden review burden | Review queue grows | End-to-end capacity dashboard |
| Policy boundary drift | AI answers outside allowed domain | Policy engine, refusal, escalation |
| Customer harm | Complaint, repeat contact, correction | Scenario restriction, content review |
| Uneven access | Low exposure in certain branches | Entitlement and training remediation |
| Model version trust decay | Adoption drops after release | Version rollback and communication |
16.2 Control Override Classification
| Classification | Meaning | Review action |
| --- | --- | --- |
| Corrective override | User corrected AI error | Feed defect into model/product backlog |
| Risk override | User bypassed control | Manager/risk review |
| Policy ambiguity | User could not determine boundary | Clarify SOP and policy evidence |
| Workflow mismatch | AI suggestion did not fit actual process | Redesign output or trigger |
| Emergency override | Used due to service or customer urgency | Review exception governance |
16.3 Human Review Load
Human-in-the-loop is not free. Track:
| Metric | Why |
| --- | --- |
| review_minutes_per_case | Measures hidden labor |
| reviewer_queue_depth | Detects backlog transfer |
| review_defect_yield | Shows whether review finds real issues |
| review_sampling_rate | Controls auditability and cost |
| review_escalation_rate | Shows uncertainty and boundary issues |
| review_reversal_rate | Shows AI or user judgment quality |
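Several of these metrics can be derived from a simple list of review records. The sketch below is illustrative only: the record shape (`minutes`, `defect_found`, `reversed`) and the exact formulas are assumptions, since this playbook names the metrics but does not define how to compute them.

```python
from typing import Dict, List


def review_load_metrics(reviews: List[dict], assisted_cases: int) -> Dict[str, float]:
    """Compute review load metrics from per-review records (hypothetical shape)."""
    n = len(reviews)
    if n == 0 or assisted_cases == 0:
        return {
            "review_minutes_per_case": 0.0,
            "review_sampling_rate": 0.0,
            "review_defect_yield": 0.0,
            "review_reversal_rate": 0.0,
        }
    return {
        # Review minutes spread over all AI-assisted cases, reviewed or not.
        "review_minutes_per_case": sum(r["minutes"] for r in reviews) / assisted_cases,
        # Share of AI-assisted cases that received a human review.
        "review_sampling_rate": n / assisted_cases,
        # Share of reviews that found a real defect.
        "review_defect_yield": sum(1 for r in reviews if r["defect_found"]) / n,
        # Share of reviews that reversed the original outcome.
        "review_reversal_rate": sum(1 for r in reviews if r["reversed"]) / n,
    }
```

A low defect yield combined with high minutes per case suggests review effort that costs more than it protects, which is a candidate for a lighter sampling strategy.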
17. Operating Model
17.1 Review Cadence
| Cadence | Forum | Decision |
| --- | --- | --- |
| Daily | Ops pulse | Blockers, latency, incidents, support questions |
| Weekly | Adoption working session | Funnel drop-off, resistance, product fixes |
| Biweekly | Risk/control review | Overrides, defects, complaints, review load |
| Monthly | Value realization review | Benefit, leakage, finance challenge, scale/stop |
| Quarterly | Architecture and portfolio review | Platform reuse, telemetry maturity, lifecycle |
17.2 RACI
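| Activity | AI PM | BA | Architect | Ops | Risk | Analytics | Finance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Define adoption taxonomy | A/R | R | C | C | C | C | I |
| Build work-as-done baseline | C | A/R | I | R | C | C | I |
| Implement telemetry | C | C | A/R | I | C | R | I |
| Validate data quality | C | C | R | I | C | A/R | I |
| Run behavior funnel review | A/R | R | I | R | C | R | I |
| Manage resistance actions | A/R | R | I | R | I | C | I |
| Review risk/control | C | C | C | R | A/R | C | I |
| Calculate value | R | C | I | C | C | C | A/R |
| Decide scale/stop | A/R | C | C | C | C | C | C |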
17.3 Operational Learning Loop
| Step | Output |
| --- | --- |
| Observe | Adoption telemetry, quality, review and outcome evidence |
| Diagnose | Funnel drop-off, cohort variance, resistance signal |
| Decide | Product, workflow, control, training or manager action |
| Change | Release, SOP update, coaching, control tuning |
| Reinforce | Team cadence and performance conversation |
| Measure | Same metrics after change |
| Record | Decision log and evidence pack update |
18. Evidence Pack
18.1 Evidence Pack Structure
| Section | Content |
| --- | --- |
| Executive summary | Adoption, behavior, risk, value, decision |
| Problem and baseline | Work-as-done, pain points, baseline metrics |
| Intervention | AI capability, workflow integration, model/prompt version |
| Event taxonomy | Qualified adoption definition and events |
| Telemetry quality | Completeness, join rate, known limitations |
| Behavior funnel | Step conversion and drop-off |
| Cohort analysis | Role, manager, team, case type, risk tier |
| Outcome attribution | Method, baseline, confounders, confidence |
| Value realization | Gross benefit, leakage, net value |
| Risk/control | Over-reliance, override, review load, defects, complaints |
| User trust | Reason codes, qualitative themes, sentiment |
| Operating actions | Backlog, SOP, training, manager coaching |
| Decision | Scale, continue pilot, redesign, restrict or stop |
18.2 Evidence Quality Rubric
| Level | Meaning |
| --- | --- |
| Weak | Usage-only, no baseline, no outcome join |
| Developing | Baseline and adoption funnel exist, limited cohort analysis |
| Strong | Cohort, outcome, risk, review load and value leakage included |
| Executive-ready | Finance-challenged value, risk-reviewed controls, clear scale/stop action |
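The rubric reads as a cumulative ladder, so it can be applied as a sequence of checks. The sketch below is one interpretation, assuming each level requires everything below it; the flag names and the `evidence_level` helper are hypothetical.

```python
def evidence_level(evidence: dict) -> str:
    """Grade an evidence pack against the rubric, treating levels as cumulative."""

    def has(*keys: str) -> bool:
        return all(evidence.get(key, False) for key in keys)

    if not has("baseline", "behavior_funnel"):
        return "Weak"
    if not has("cohort_analysis", "outcome_join", "risk_evidence",
               "review_load", "value_leakage"):
        return "Developing"
    if not has("finance_challenged", "risk_reviewed_controls", "scale_stop_action"):
        return "Strong"
    return "Executive-ready"
```

Grading the pack before the monthly value realization review shows which missing element blocks the next level.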
19. Execution Roadmap
Days 1-15: Baseline and Taxonomy
| Day range | Work |
| --- | --- |
| 1-3 | Select one high-value workflow and define business owner |
| 4-6 | Build work-as-done baseline from observation, logs, QA and SME review |
| 7-9 | Define adoption event taxonomy and qualified adoption event |
| 10-12 | Define metrics hierarchy and risk/control signals |
| 13-15 | Review with ops, risk, architecture and finance |
Days 16-35: Instrumentation and First Dashboard
| Day range | Work |
| --- | --- |
| 16-20 | Implement event contract with workflow context and model version |
| 21-24 | Connect identity, cohort, feature flag and training wave |
| 25-28 | Join process outcome and QA/control data |
| 29-32 | Build behavior funnel and cohort dashboard |
| 33-35 | Validate telemetry completeness and metric definitions |
Days 36-60: Pilot Evidence
| Day range | Work |
| --- | --- |
| 36-42 | Run pilot with manager reinforcement and support path |
| 43-48 | Analyze resistance signals, edit/reject/override reasons |
| 49-53 | Calculate review load, rework, cost and early value leakage |
| 54-57 | Run risk/control review |
| 58-60 | Produce pilot evidence pack and decision recommendation |
Days 61-90: Scale, Redesign or Stop
| Day range | Work |
| --- | --- |
| 61-70 | Expand only if risk and value evidence meet gate |
| 71-78 | Add attribution method and durability monitoring |
| 79-84 | Update SOP, training, manager coaching and product backlog |
| 85-88 | Finance challenge net value and unit economics |
| 89-90 | Finalize scale/stop memo and portfolio recommendation |
20. Financial Retail Examples
20.1 AML Investigator Copilot
| Area | Execution |
| --- | --- |
| Qualified adoption | Investigation summary used in narrative or evidence review for eligible alert |
| Leading indicators | Summary request rate, source citation view, light-edit narrative acceptance |
| Lagging indicators | Alert aging, QA correction, SAR prep quality, re-open rate |
| Risk controls | High-risk alert dual review, source verification, override reason |
| Value leakage | Senior reviewer time, false comfort, rework due to incomplete narrative |
| Scale rule | Scale only if aging improves and QA/control defects do not rise |
20.2 Contact-Center Agent Assist
| Area | Execution |
| --- | --- |
| Qualified adoption | Suggested response used in target call reason with policy citation |
| Leading indicators | Suggestion view, accept-with-edit, latency under SLO |
| Lagging indicators | AHT, FCR, repeat contact, QA score, complaint |
| Risk controls | Sensitive topic handoff, script boundary, supervisor sampling |
| Value leakage | After-call correction, customer repeat contact, QA review burden |
| Scale rule | Scale by call reason, not by whole contact center |
20.3 KYC Onboarding Assistant
| Area | Execution |
| --- | --- |
| Qualified adoption | AI document completeness flag used before customer chase |
| Leading indicators | Completeness check rate, deficiency draft edit rate |
| Lagging indicators | First-pass completeness, cycle time, customer chase count |
| Risk controls | High-risk entity review, sanctions and beneficial ownership controls |
| Value leakage | False deficiency notices, analyst re-review, customer frustration |
| Scale rule | Scale only where false deficiency rate and remediation do not rise |
20.4 Credit Ops Reviewer
| Area | Execution |
| --- | --- |
| Qualified adoption | AI extraction used in package review, not final credit judgment |
| Leading indicators | Extracted condition reviewed, collateral summary edited |
| Lagging indicators | First-pass package quality, approval rework, condition miss rate |
| Risk controls | Human decision owner, exception rationale, QA sampling |
| Value leakage | Analyst over-review, downstream correction, risk review escalation |
| Scale rule | Scale after package quality improves across product cohorts |
20.5 Branch / Relationship Manager Copilot
| Area | Execution |
| --- | --- |
| Qualified adoption | RM uses permitted insight to prepare client follow-up |
| Leading indicators | Meeting prep use, next-action capture, follow-up completion |
| Lagging indicators | Retention, qualified referral, customer satisfaction, complaint |
| Risk controls | Advice boundary, disclosure, approved product language |
| Value leakage | Compliance review, unsuitable suggestion correction, trust damage |
| Scale rule | Scale by relationship segment and product boundary |
21. Templates
21.1 Adoption Event Card
| Field | Fill with concrete value |
| --- | --- |
| Event name | workflow.stage.surface.action |
| Event class | Exposure / intent / output / response / influence / control / learning / outcome |
| Workflow | Named workflow |
| Stage | Exact process step |
| Eligible users | Roles and cohorts |
| Eligible cases | Case types and risk tiers |
| Human action | Accept, edit, reject, ignore, regenerate, override, escalate |
| Business artifact | Note, decision, response, package, handoff |
| Control point | Policy or control id |
| Outcome link | Downstream result |
| Misinterpretation risk | How this event could be over-read |
21.2 Behavior Funnel Dashboard
| Section | Metrics |
| --- | --- |
| Population | Eligible users, eligible cases, cohort mix |
| Exposure | Surface shown, workflow availability |
| Engagement | Open, view, ask, response |
| Influence | Accept, edit, reject, artifact use |
| Completion | Task completed, handoff completed |
| Improvement | Cycle, quality, rework, customer result |
| Risk | Override, escalation, defect, complaint |
| Durability | 4/8/12-week repeat use |
21.3 Monthly Operating Review Agenda
| Agenda item | Decision |
| --- | --- |
| Telemetry quality | Can we trust the data? |
| Funnel drop-off | What is the biggest adoption bottleneck? |
| Cohort variance | Which manager/team/case type needs action? |
| Resistance signals | Product, workflow, trust, control or incentive issue? |
| Risk/control | Any over-reliance, override or complaint trend? |
| Value leakage | Is review load or rework consuming benefit? |
| Product backlog | What changes ship next? |
| Ops and manager actions | What coaching, SOP or process changes happen? |
| Scale/stop | Continue, scale, redesign, restrict or stop? |
21.4 Scale / Stop Memo
# AI Adoption Scale / Stop Memo
## Decision requested
Scale / continue pilot / redesign / restrict / stop.
## Workflow and target population
Named workflow, users, case types, risk tiers and rollout cohort.
## Baseline
Work-as-done summary and baseline metrics.
## Adoption evidence
Qualified adoption, behavior funnel, cohort findings and durability.
## Outcome evidence
Flow, quality, customer and risk/control outcomes.
## Value evidence
Gross benefit, cost, human review load, rework, support, risk adjustment and net realized value.
## Risk/control evidence
Override, escalation, defects, complaints, over-reliance and under-reliance.
## Recommendation
Decision, rationale, constraints and next review date.
22. Anti-Patterns
| Anti-pattern | Consequence | Replacement |
| --- | --- | --- |
| Reporting MAU as adoption | Hides whether work changed | Qualified adoption event |
| Counting prompts as value | Rewards friction | Outcome-linked behavior metrics |
| Treating training as adoption | Ignores real workflow | Work-as-done and behavior funnel |
| Celebrating high accept rate | Encourages automation bias | Accept/edit/reject with quality and defects |
| Ignoring rejection reasons | Misses product and trust issues | Structured reason codes |
| Using averages only | Hides manager and case mix effects | Cohort analysis |
| Not measuring review load | Overstates benefit | Human review load and value leakage |
| No control override taxonomy | Confuses healthy challenge with bypass | Override classification |
| No change saturation view | Overloads teams | Rollout capacity review |
| Dashboard without action | Creates reporting theater | Operating learning loop and decision log |
23. Interview Answers
Q1: 你如何设计 AI adoption analytics?
我会先建立 work-as-done baseline, 明确目标流程、目标用户、case type、当前周期、质量、返工、控制点和非正式绕行。然后定义 adoption event taxonomy, 把 exposure、qualified use、human action、decision influence、control override 和 outcome link 分开。架构上用 telemetry schema 连接 user、workflow、model version、prompt version、case risk、artifact 和 outcome。指标上分 leading、behavior、flow、risk、value、durability。最后通过 behavior funnel、cohort analysis、value leakage 和 monthly operating review 把证据转化成 scale、redesign、restrict 或 stop 决策。
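Q2: Why can't prompt count prove AI value?
Prompt count only shows the number of interactions. It does not show whether AI was adopted in the right tasks, nor whether the process improved. A high prompt count may mean users are repeatedly correcting errors, output is unstable, policy is unclear or the system creates friction. Real value requires linking qualified adoption, work artifact influence, cycle time, quality, risk/control, human review load and net realized value. AI adoption is about behavior and outcomes, not chat volume.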
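Q3: How do you tell whether low adoption is user resistance or a product problem?
I would look at the behavior funnel and resistance signals. If exposure is low, the cause may be workflow placement or entitlement. If engagement is low, the value may be unclear or the entry point may be intrusive. If reject, regenerate or heavy-edit rates are high, the cause is most likely output quality, format or trust. If adoption is high in some teams and low in others, the cause may be manager reinforcement or change saturation. I would also check reason codes, interviews, QA and support tickets. Do not blame users first; diagnose product, process, controls, incentives and organizational capacity together.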
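The diagnostic logic in this answer can be sketched as a small rule set. This is a minimal illustration, not a calibrated model: the `FunnelSnapshot` fields and the three thresholds are hypothetical names and values introduced here, and each returned string is a hypothesis to check against reason codes, interviews, QA and support tickets.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical thresholds (assumptions); tune per use case and baseline.
LOW_CONVERSION = 0.5   # stage-to-stage conversion below this is "low"
HIGH_FRICTION = 0.4    # reject/regenerate/heavy-edit share above this is "high"
TEAM_SPREAD = 0.3      # gap between best and worst team adoption rate


@dataclass
class FunnelSnapshot:
    eligible: int                     # users eligible for the workflow
    exposed: int                      # users who actually saw the AI entry point
    engaged: int                      # users who invoked the AI at least once
    friction_rate: float              # share of outputs rejected, regenerated or heavily edited
    team_adoption: Dict[str, float]   # qualified adoption rate per team


def diagnose(s: FunnelSnapshot) -> List[str]:
    """Return candidate causes of low adoption, in funnel order."""
    hypotheses: List[str] = []
    if s.eligible and s.exposed / s.eligible < LOW_CONVERSION:
        hypotheses.append("workflow placement or entitlement")
    if s.exposed and s.engaged / s.exposed < LOW_CONVERSION:
        hypotheses.append("unclear value or intrusive entry point")
    if s.friction_rate > HIGH_FRICTION:
        hypotheses.append("output quality, format or trust")
    if s.team_adoption:
        rates = list(s.team_adoption.values())
        if max(rates) - min(rates) > TEAM_SPREAD:
            hypotheses.append("manager reinforcement or change saturation")
    return hypotheses


snapshot = FunnelSnapshot(
    eligible=100, exposed=90, engaged=80, friction_rate=0.55,
    team_adoption={"team_a": 0.7, "team_b": 0.2},
)
print(diagnose(snapshot))
```

None of the rules points at the user. Each maps a funnel symptom to a product, process or organizational cause that someone can investigate.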
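Q4: How do you show that the benefits of AI are attributable?
I would prefer a staged rollout, matched cohort, difference-in-differences or interrupted time series design, chosen according to business and risk constraints. The report states the baseline period, eligible population, exposure, case mix, manager/team, model version and any concurrent policy, staffing or system changes. Benefits are also net of AI run cost, human review load, rework, support and risk adjustment. Finally I state a business confidence level and do not dress correlation up as absolute causation.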
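Of the designs named above, difference-in-differences is the easiest to show in a few lines. The sketch below is illustrative only: the cycle-time figures are made-up numbers, the function name is hypothetical, and the estimate is meaningful only if the treated and control cohorts would have followed parallel trends without the AI rollout.

```python
from statistics import mean
from typing import Sequence


def diff_in_diff(
    treated_pre: Sequence[float],
    treated_post: Sequence[float],
    control_pre: Sequence[float],
    control_post: Sequence[float],
) -> float:
    """Change in the treated cohort minus change in the control cohort."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Illustrative average cycle times in minutes per case (made-up data).
treated_pre = [42, 40, 44]    # rollout cohort, baseline period
treated_post = [33, 35, 34]   # rollout cohort, after AI exposure
control_pre = [41, 43, 42]    # not-yet-rolled-out cohort, baseline period
control_post = [40, 39, 41]   # not-yet-rolled-out cohort, same later period

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(effect)  # negative means cycle time fell more in the treated cohort
```

The raw before/after change in the treated cohort is 8 minutes, but the control cohort also improved by 2 minutes over the same period, so only 6 minutes is a candidate for attribution to AI. Case mix, team and model version still need to be reported alongside the estimate.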
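Q5: Why does AI adoption analytics need to look at control override?
Control override is key to judging whether AI is trusted appropriately. When a user overrides AI, it may be healthy professional judgment, it may be a control bypass, or it may signal that the AI output does not fit the real process. Without an override reason and review, management cannot tell these cases apart. In retail financial services, override also bears directly on customer impact, audit evidence, QA and risk boundaries. So I treat override as a first-class event linked to role, case risk, workflow stage, reason, defect and manager review.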
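One way to make override a first-class event is a typed record plus a reason-code mapping. The sketch below is an assumption-laden illustration: the reason codes, the `OverrideClass` values and the review rule are hypothetical and would need to come from the organization's own control taxonomy.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OverrideClass(Enum):
    HEALTHY_CHALLENGE = "healthy_challenge"   # sound professional judgment
    CONTROL_BYPASS = "control_bypass"         # a required control was skipped
    POOR_FIT = "ai_output_poor_fit"           # output does not fit the real process
    UNCLASSIFIED = "unclassified"             # no usable reason captured


# Hypothetical reason codes (assumptions), mapped to the three cases in the text.
REASON_TO_CLASS = {
    "evidence_contradicts_ai": OverrideClass.HEALTHY_CHALLENGE,
    "skipped_required_check": OverrideClass.CONTROL_BYPASS,
    "output_wrong_format": OverrideClass.POOR_FIT,
}


@dataclass(frozen=True)
class OverrideEvent:
    role: str
    case_risk: str            # e.g. "low", "medium", "high"
    workflow_stage: str
    reason: Optional[str]     # structured reason code, None if not captured
    defect_found: bool
    manager_reviewed: bool


def classify(event: OverrideEvent) -> OverrideClass:
    """Map the structured reason code to an override class."""
    if event.reason is None:
        return OverrideClass.UNCLASSIFIED
    return REASON_TO_CLASS.get(event.reason, OverrideClass.UNCLASSIFIED)


def needs_manager_review(event: OverrideEvent) -> bool:
    """Flag unreviewed overrides that are risky or cannot be explained."""
    if event.manager_reviewed:
        return False
    risky = classify(event) in (OverrideClass.CONTROL_BYPASS, OverrideClass.UNCLASSIFIED)
    return risky or event.case_risk == "high"


event = OverrideEvent(
    role="investigator", case_risk="high", workflow_stage="narrative_draft",
    reason="evidence_contradicts_ai", defect_found=False, manager_reviewed=False,
)
print(classify(event), needs_manager_review(event))
```

An override with no reason code is deliberately routed to review, because an unexplained override is exactly the case management cannot interpret.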
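Q6: How do you explain value leakage to a business owner?
I would say that AI's gross saving is not the final benefit. For example, agent assist may save 30 seconds per call, but if repeat contact rises, QA review increases, complaints increase or after-call correction increases, net value falls. Value leakage is these shifted or newly created costs. Mature value realization must include review load, rework, support, latency, model cost, control cost and customer impact in net realized value.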
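The agent-assist example can be worked through as arithmetic. Only the 30 seconds per call comes from the text; the call volume, hourly cost and every leakage amount below are made-up figures chosen to keep the numbers round, and the function name is hypothetical.

```python
from typing import Dict


def net_realized_value(gross_saving: float, leakage: Dict[str, float]) -> float:
    """Gross saving minus all shifted or newly created costs."""
    return gross_saving - sum(leakage.values())


# Illustrative monthly inputs (assumptions, not benchmarks).
calls = 100_000
seconds_saved_per_call = 30      # from the example in the text
cost_per_agent_hour = 36.0       # hypothetical fully loaded cost

gross = calls * seconds_saved_per_call * cost_per_agent_hour / 3600

# Hypothetical leakage items, same currency and period as the gross saving.
leakage = {
    "repeat_contact": 6_000.0,
    "qa_review_load": 4_000.0,
    "after_call_correction": 3_000.0,
    "complaint_handling": 2_000.0,
    "model_run_cost": 5_000.0,
}

net = net_realized_value(gross, leakage)
leakage_ratio = sum(leakage.values()) / gross
print(gross, net, round(leakage_ratio, 2))
```

With these inputs, two thirds of the gross saving leaks away before it reaches the business case. That is the conversation to have with the business owner: which leakage line can be reduced, not how many seconds were saved.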
Q7: How would you set the scale gate for an AML investigator copilot?
I would scale in stages by alert type and risk tier. The gate covers qualified case penetration, narrative quality, source verification, QA defects, review load, override reasons, alert aging and high-risk escalation appropriateness. I would expand to more complex alert cohorts only when aging or prep time improves, QA/control defects do not rise, human review load is manageable, and investigators are still using the copilot after 8-12 weeks.
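The expansion condition in this answer is a conjunction of checks, which can be sketched as a gate function that also reports why a cohort failed. The field names and the review-load limit are hypothetical; the 8-week minimum takes the lower bound of the 8-12 week window mentioned above.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical limits (assumptions); set per alert type and risk tier.
MAX_REVIEW_HOURS_PER_CASE = 0.5
MIN_SUSTAINED_WEEKS = 8


@dataclass
class CohortEvidence:
    alert_aging_delta_pct: float        # negative means aging improved vs baseline
    prep_time_delta_pct: float          # negative means prep time improved vs baseline
    qa_defect_delta_pct: float          # positive means more QA/control defects
    review_hours_per_case: float        # human review load
    sustained_use_weeks: int            # weeks of continued qualified use


def scale_gate(e: CohortEvidence) -> Tuple[bool, List[str]]:
    """Return (passed, reasons for failure) for one alert cohort."""
    failures: List[str] = []
    if e.alert_aging_delta_pct >= 0 and e.prep_time_delta_pct >= 0:
        failures.append("no improvement in alert aging or prep time")
    if e.qa_defect_delta_pct > 0:
        failures.append("QA/control defects increased")
    if e.review_hours_per_case > MAX_REVIEW_HOURS_PER_CASE:
        failures.append("human review load above limit")
    if e.sustained_use_weeks < MIN_SUSTAINED_WEEKS:
        failures.append("usage not yet sustained long enough")
    return (not failures, failures)


evidence = CohortEvidence(
    alert_aging_delta_pct=-12.0, prep_time_delta_pct=-20.0,
    qa_defect_delta_pct=1.5, review_hours_per_case=0.3, sustained_use_weeks=10,
)
passed, reasons = scale_gate(evidence)
print(passed, reasons)
```

Returning the failure reasons alongside the boolean keeps the gate usable in a decision memo: a failed cohort comes with the specific evidence gap to close before the next review.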
24. Portfolio Exercise
24.1 Assignment
Build an AI Adoption Analytics Portfolio Pack for a retail financial services firm. Choose 3 use cases:
| Use case | Suggested focus |
| --- | --- |
| AML investigator copilot | Risk, quality and review load |
| Contact-center agent assist | AHT, FCR, QA and customer impact |
| KYC onboarding assistant | Cycle time, rework and document completeness |
| Credit ops reviewer | First-pass package quality and decision support boundary |
| Branch / RM copilot | Trust, compliance boundary and relationship action quality |
24.2 Deliverables
| Deliverable | Required content |
| --- | --- |
| Portfolio adoption map | Use cases, users, workflows, risk tiers, value hypotheses |
| Work-as-done baseline | One detailed baseline and two lighter baselines |
| Event taxonomy | At least 15 events across exposure, response, influence, control and outcome |
| Telemetry schema | Required fields and privacy/retention class |
| Metrics hierarchy | Leading, behavior, flow, risk, value and durability |
| Behavior funnel | Funnel and drop-off diagnosis per use case |
| Cohort plan | Role, team, manager, case type, risk tier, training wave |
| Value leakage model | Human review load and net realized value formula |
| Risk/control pack | Override, over-reliance, complaint, QA and escalation |
| Operating model | Review cadence, RACI and decision log |
| Scale/stop memo | Recommendation for each use case |
24.3 Scoring Rubric
| Criterion | Strong answer |
| --- | --- |
| Senior framing | Treats adoption as behavior and operating model change |
| BA rigor | Captures work-as-done, exceptions, controls and artifacts |
| Architecture rigor | Provides event schema, traceability and outcome joins |
| Product rigor | Defines funnel, cohorts, resistance signals and backlog actions |
| Risk rigor | Includes override, review load, over-reliance and customer impact |
| Value rigor | Calculates net realized value and value leakage |
| Executive readiness | Produces scale/stop decisions with evidence |
25. Quality Bar
An AI adoption analytics pack is ready for senior review only if:
Qualified adoption is defined separately from exposure and usage.
Work-as-done baseline includes real exceptions and informal work.
Telemetry joins workflow context, model version, human action, control and outcome.
Metrics include leading, lagging, risk, value and durability.
Cohort analysis can explain manager, role, case type and risk-tier differences.
Value realization subtracts human review load, rework, support and AI run cost.
Risk/control evidence includes override, escalation, defect and complaint.
Operating review produces actions, not just reporting.
Scale/stop recommendation is explicit and evidence-backed.
Final principle:
Do not ask whether users used AI.
Ask whether AI changed governed work in a way that improved durable net outcomes.