AI Requirements Mining / Process Knowledge Extraction Playbook
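Positioning: A hands-on playbook for AI requirements mining and process knowledge extraction in retail financial services, written for CBAP+ Senior BAs, AI PMs, Product Architects, Enterprise Architects, Process Owners, EvalOps and Risk/Control Partners.
Core goal: Turn the knowledge scattered across PRDs, BRDs, SOPs, policies, tickets, Jira/Azure DevOps, call transcripts, meeting notes, process maps, code/API specs, test cases, production logs and control evidence into requirement and process knowledge assets that are traceable, evaluable, verifiable, governable and reusable.
Core thesis: AI requirements mining is not a BA replacement; it is an evidence-grade BA operating system. AI widens the evidence base, surfaces conflicts and generates candidate assets; humans own interpretation, tradeoffs, authorization, governance and accountability for release.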
0. Disclaimer
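This document is material for learning, portfolio building, architecture training and internal governance discussion. It does not constitute legal advice, a compliance conclusion, a regulatory interpretation, an audit opinion, a records retention conclusion, a model validation report or a vendor recommendation. Any formal project must be jointly confirmed by Legal, Compliance, Privacy, Records Management, Information Security, Model Risk, Operational Risk, Internal Audit, Business Owner, Data Owner, Architecture, Engineering and the relevant process owners.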
1. Source Anchors
| Anchor | Official link | How this playbook uses it |
|---|---|---|
| ISO/IEC/IEEE 29148 Requirements Engineering | https://www.iso.org/standard/72089.html and https://standards.ieee.org/ieee/29148/6937/ | Standards anchor for requirement quality, life cycle, traceability and stakeholder-need management. |
| FFIEC Development, Acquisition, and Maintenance IT Handbook | https://ithandbook.ffiec.gov/it-booklets/development-acquisition-and-maintenance/ | Constrains how AI requirements mining outputs enter development, acquisition, testing, implementation, maintenance and change control. |
| FFIEC Management IT Handbook | https://ithandbook.ffiec.gov/it-booklets/management/ | Organizational governance, risk management, architecture, resources, third parties and oversight responsibility. |
| NIST SP 800-160 Vol. 1 | https://csrc.nist.gov/pubs/sp/800/160/v1/upd2/final | Brings stakeholder protection needs, security, resilience and assurance into requirements extraction and architecture review. |
| NIST SP 800-218 SSDF | https://csrc.nist.gov/pubs/sp/800/218/final | Connects mining of code/API/test assets to secure software development, vulnerability response and release evidence. |
| NIST AI Risk Management Framework | https://www.nist.gov/itl/ai-risk-management-framework | Uses Govern / Map / Measure / Manage to design AI risk identification, assessment, gating and continuous improvement. |
| ISO/IEC 42001 AI Management System | https://www.iso.org/standard/81230.html | Uses AI management system language to design policy, role, operation, performance evaluation, internal audit and improvement. |
| IIBA / BABOK public professional page | https://www.iiba.org/professional-development/knowledge-centre/business-analysis-body-of-knowledge/ | Used only as an anchor to the business analysis professional body of knowledge; does not reproduce or replace copyrighted BABOK content. |
Source-to-artifact pattern:
official source anchor
-> governance principle
-> mining requirement
-> architecture control
-> evidence artifact
-> owner and metric
2. Executive Framing
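Common asks from executives or business stakeholders:
We have lots of PRDs, SOPs and tickets. Can AI generate the requirements automatically?
We want to turn meeting notes and contact center calls into a backlog.
Can AI read the code and test cases and reverse-engineer the system requirements?
Senior reframing: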
Build an evidence-grade requirements and process knowledge extraction capability
that mines candidate needs, rules, events, controls, acceptance criteria and impact links
from governed artifacts,
filters by source authority and permissions,
routes ambiguity to SMEs,
and learns from production feedback without turning AI drafts into approved requirements.
Steering committee questions:
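- Which artifacts are authoritative sources, and which are only pain signals or discussion evidence?
- Before AI output enters the baseline, who validates it, against what rubric, and what evidence is retained?
- How do we prevent over-privileged retrieval, citation of expired policy, PII leakage and loss of control over records disposition?
- How do mined requirements connect to process, API, data, test, control, eval and release?
- How do production logs, QA, complaints, incidents and human overrides feed back into portfolio learning?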
3. Use Case Boundary
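Tasks where AI requirements mining fits:
| Use case | Good fit | Boundary |
|---|---|---|
| Requirements discovery | Find candidate requirements, conflicts, gaps and duplicates across multi-source material | Never enters the approved baseline automatically |
| Process knowledge extraction | Extract activity, event, role, handoff and variant from SOPs, BPMN, logs and tickets | Logged behavior is not treated as the intended (to-be) process |
| Acceptance criteria drafting | Generate candidate acceptance criteria from requirements and test assets | High-impact scenarios require SME/QA validation |
| Change impact analysis | Find the impact surface after a policy/API/process/control change | Impact conclusions must be confirmed by the owner |
| Control linkage | Link policy/control/test evidence to requirements | Gives no conclusion on legal or compliance applicability |
| Portfolio learning | Capture reusable vocabulary, patterns, eval cases and anti-patterns | No unbounded training on unauthorized records or PII |
Tasks that should not be handed directly to AI:
| Task | Reason |
|---|---|
| Final scope tradeoff | Involves commercial priority, resources, risk acceptance and strategic choice |
| Legal/regulatory interpretation | Requires an authorized function to judge against the specific facts and jurisdiction |
| High-impact customer decisions | Require authorization, controls, explanation, appeal and human accountability |
| Records retention / legal hold conclusions | Require a Records/Legal/Compliance decision |
| Model validation conclusions | Require independent model risk and validation procedures |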
4. Target Operating Model
Business / Product Owner
owns outcome, priority, scope, baseline decision
Senior BA / Requirements Architect
owns mining taxonomy, ambiguity workflow, quality rubric, traceability graph
Process Owner / SME
validates process activities, exceptions, variants and operating feasibility
Architecture / Engineering
validates API, data, system, security, performance and integration impact
Risk / Compliance / Control Owner
validates policy/control linkage, risk tier, approval boundary and evidence need
Privacy / Records / Legal
validates data use, records class, hold propagation, access and retention controls
QA / EvalOps
converts mined requirements to tests, eval contracts, thresholds and regression gates
AI Platform / Data Engineering
operates ingestion, retrieval, permission filter, graph, model versioning and monitoring
RACI snapshot:
| Activity | PM | BA | Architect | SME | Risk/Control | Privacy/Records | QA/EvalOps |
|---|---|---|---|---|---|---|---|
| Source inventory | A | R | C | C | C | C | C |
| Authority classification | A | R | C | C | C | C | C |
| Extraction rubric | C | A/R | C | C | C | C | R |
| Requirement validation | A | R | C | R | C | C | C |
| Risk tiering | A | R | C | C | A/R | C | C |
| Eval contract | C | R | C | C | C | C | A/R |
| Release gate evidence | A | R | R | C | C | C | R |
| Portfolio learning | A | R | C | C | C | C | R |
5. Implementation Architecture
Connectors
Confluence / SharePoint / Docs / Jira / Azure DevOps / Git / API gateway
Contact center / CRM / case management / log platform / GRC / test management
|
v
Governed ingestion
artifact id | source type | owner | approval status | version | hash
effective date | data class | record class | permission tag | legal hold flag
|
v
Pre-processing
parsing | OCR/layout | transcript diarization | code/API parsing | test extraction
log event mapping | PII redaction | chunking | metadata enrichment
|
v
Knowledge layer
domain vocabulary | process ontology | authority ladder | policy/control map
requirement graph | event graph | system/API graph | test/eval graph
|
v
AI extraction and reasoning services
candidate requirements | ambiguity | conflict | duplicate | process events
stakeholder concerns | evidence standards | acceptance criteria | impact links
|
v
Human validation workbench
side-by-side source evidence | approve/reject/merge/split/escalate
reason codes | SME comments | decision record | audit trail
|
v
Delivery and governance
backlog sync | requirement baseline | eval contract | test generation
architecture review | control evidence | release gate | learning loop
Architecture non-negotiables:
| Non-negotiable | Why |
|---|---|
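| Permission before retrieval | Prevents users from seeing material they are not entitled to through AI summaries |
| artifact hash and version | Supports reproducibility, audit and change impact analysis |
| authority metadata | Prevents tickets, meeting notes and AI drafts from overriding formal policy |
| structured output schema | Prevents fluent prose from masking conflict and uncertainty |
| SME decision log | AI only generates candidates; humans authorize |
| eval contract | The mining capability itself must be evaluated and gated |
| graph traceability | Supports impact analysis across requirements, process, systems, tests, controls and releases |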
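The "permission before retrieval" rule can be illustrated with a minimal sketch. It assumes a hypothetical `Chunk` record and a toy keyword score standing in for a real retriever; the names `Chunk`, `retrieve` and `user_entitlements` are illustrative and not part of any specific platform. The point is only the ordering: the entitlement filter runs before any ranking or summarization sees the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """Hypothetical indexed chunk carrying permission and authority metadata."""
    chunk_id: str
    text: str
    permission_tag: str
    authority_level: str  # "A1".."A6", with A1 assumed to be the highest authority


def retrieve(query: str, chunks: list[Chunk], user_entitlements: set[str]) -> list[Chunk]:
    """Filter by entitlement first, then rank only what the user may see."""
    # Step 1: the permission filter runs BEFORE any scoring or summarization.
    visible = [c for c in chunks if c.permission_tag in user_entitlements]
    # Step 2: toy relevance score (number of query terms present in the text).
    terms = query.lower().split()
    scored = [(sum(t in c.text.lower() for t in terms), c) for c in visible]
    hits = [(s, c) for s, c in scored if s > 0]
    # Higher score first; ties are broken by authority level (A1 before A6).
    hits.sort(key=lambda sc: (-sc[0], sc[1].authority_level))
    return [c for _, c in hits]


chunks = [
    Chunk("c1", "Dispute refund policy threshold", "policy_public", "A1"),
    Chunk("c2", "Dispute refund fraud investigation notes", "fraud_restricted", "A2"),
    Chunk("c3", "Ticket: customer asked about dispute refund", "ops_internal", "A5"),
]
result = retrieve("dispute refund", chunks, {"policy_public", "ops_internal"})
print([c.chunk_id for c in result])
```

Because the restricted chunk is dropped before scoring, it cannot influence ranking or leak into a downstream summary, which is the behavior the red-team tests later in this playbook are meant to probe.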
6. Source Intake Checklist
| Source | Required metadata | Extraction focus | Key risk |
|---|---|---|---|
| PRD | owner, version, status, target release, approval | feature, persona, metric, scope, assumption | solution bias |
| BRD | business owner, benefit baseline, decision date | outcome, stakeholder need, policy constraint | vague benefit |
| SOP | process owner, effective date, retired status | activity, role, SLA, exception, evidence | stale process |
| Policy/control | policy owner, effective date, scope, control id | obligation, allowed/prohibited action, approval | misinterpretation |
| Tickets | severity, product, status, linked incident, resolution | pain, defect, workaround, frequency | duplicate noise |
| Jira/Azure DevOps | workflow status, links, sprint/release, acceptance criteria | backlog, dependency, test/release links | weak traceability |
| Transcripts | consent/notice, channel, QA score, redaction | intent, friction, agent action, complaint signal | PII and transcription error |
| Meeting notes | attendees, roles, decision status, follow-up | decision, assumption, open issue | non-authoritative |
| Process maps | version, notation, owner, scope | intended flow, roles, controls, SLA | idealized flow |
| Code/API specs | repo/version, endpoint, owner, deployment | actual behavior, contract, validation, error | code as false policy |
| Test cases | test owner, result, requirement link, coverage | expected behavior, edge case, regression | happy-path bias |
| Logs | event schema, retention, sampling, data class | variant, latency, failure, handoff, outcome | missing business semantics |
| Controls | control owner, frequency, test result, issue link | control objective, evidence, remediation | control/product disconnect |
Intake gate:
No artifact enters the mining corpus without owner, version, permission tag and source class.
No restricted source enters AI processing without approved purpose and redaction path.
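The two intake gate rules above can be sketched as a small validation function. This is a minimal sketch under stated assumptions: the `Artifact` record and field names such as `approved_purpose` and `redaction_path` are hypothetical and would map to whatever the governed ingestion layer actually stores.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Artifact:
    """Hypothetical intake record; field names are illustrative assumptions."""
    artifact_id: str
    owner: Optional[str] = None
    version: Optional[str] = None
    permission_tag: Optional[str] = None
    source_class: Optional[str] = None
    restricted: bool = False
    approved_purpose: Optional[str] = None
    redaction_path: Optional[str] = None


def intake_gate(artifact: Artifact) -> list[str]:
    """Return gate violations; an empty list means the artifact may enter the corpus."""
    violations = []
    # Rule 1: owner, version, permission tag and source class are mandatory.
    for field in ("owner", "version", "permission_tag", "source_class"):
        if not getattr(artifact, field):
            violations.append(f"missing {field}")
    # Rule 2: restricted sources need an approved purpose and a redaction path.
    if artifact.restricted:
        if not artifact.approved_purpose:
            violations.append("restricted source without approved purpose")
        if not artifact.redaction_path:
            violations.append("restricted source without redaction path")
    return violations


ok = Artifact("a1", owner="ops", version="1.2",
              permission_tag="internal", source_class="SOP")
bad = Artifact("a2", owner="ops", version="1.0", permission_tag="restricted",
               source_class="transcript", restricted=True)
print(intake_gate(ok), intake_gate(bad))
```

Returning the full list of violations, rather than failing on the first one, gives the source owner a single remediation list per artifact.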
7. Decision Tables
7.1 Should this source be used?
| Condition | Decision |
|---|---|
| Approved source, current version, clear owner, permission scoped | Use for extraction and baseline evidence |
| Approved source but expired or superseded | Use only for historical change impact |
| Operational source with high frequency pain signal | Use for discovery, not baseline |
| Meeting note with unapproved decision | Use as clarification prompt and decision candidate |
| Artifact contains restricted data beyond purpose | Exclude or redact before indexing |
| Source owner unknown | Quarantine until ownership is established |
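Table 7.1 can be encoded as an ordered rule function. The table does not state which row wins when several conditions hold, so the precedence below (ownership and data-protection checks first) is an assumption, as are the `SourceFacts` field names and the final "escalate" fallback.

```python
from dataclasses import dataclass


@dataclass
class SourceFacts:
    """Hypothetical flags describing one source; names are illustrative."""
    owner_known: bool
    approved: bool
    current: bool
    permission_scoped: bool
    restricted_beyond_purpose: bool = False
    kind: str = "document"  # e.g. "document", "operational", "meeting_note"


def source_use_decision(s: SourceFacts) -> str:
    # Assumed precedence: ownership and data-protection checks run first.
    if not s.owner_known:
        return "quarantine until ownership is established"
    if s.restricted_beyond_purpose:
        return "exclude or redact before indexing"
    if s.approved and s.current and s.permission_scoped:
        return "use for extraction and baseline evidence"
    if s.approved and not s.current:
        return "use only for historical change impact"
    if s.kind == "operational":
        return "use for discovery, not baseline"
    if s.kind == "meeting_note":
        return "use as clarification prompt and decision candidate"
    # Fallback is an assumption: anything unmatched goes to a human.
    return "escalate: no rule matched"


retired_sop = SourceFacts(owner_known=True, approved=True, current=False,
                          permission_scoped=True)
print(source_use_decision(retired_sop))
```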
7.2 Can AI output move to backlog?
| Requirement candidate condition | Backlog action |
|---|---|
| Grounded, no conflict, quality score >= 4, SME approved | Create backlog item with source links |
| Grounded but ambiguous | Create clarification task, not delivery story |
| Conflict between policy and operational practice | Create issue/decision item, not feature story |
| High-impact AI behavior without eval contract | Block from release backlog |
| Source is only ticket/transcript | Convert to problem statement or pain cluster |
| Derived from code/test only | Mark as actual behavior candidate and request owner decision |
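Table 7.2 can likewise be expressed as a routing function over a candidate record. This is a sketch, not a prescribed implementation: the dictionary keys are hypothetical flags assumed to be set upstream, the rule order (blockers and conflicts before any "create item" path) is an assumption, and so is the default of leaving unmatched candidates in the discovery workspace.

```python
def backlog_action(c: dict) -> str:
    """Map one requirement candidate to a backlog action (keys are hypothetical)."""
    sources = set(c.get("source_types", []))
    # Assumed precedence: blockers and conflicts are checked first.
    if c.get("high_impact_ai") and not c.get("has_eval_contract"):
        return "block from release backlog"
    if c.get("policy_practice_conflict"):
        return "create issue/decision item"
    # Candidates backed only by low-authority operational sources.
    if sources and sources <= {"ticket", "transcript"}:
        return "convert to problem statement or pain cluster"
    # Candidates reverse-engineered only from code or tests.
    if sources and sources <= {"code", "test"}:
        return "mark as actual behavior candidate and request owner decision"
    if c.get("grounded") and c.get("ambiguous"):
        return "create clarification task"
    if c.get("grounded") and c.get("quality_score", 0) >= 4 and c.get("sme_approved"):
        return "create backlog item with source links"
    # Assumed default: nothing moves to the backlog without a matching rule.
    return "keep in discovery workspace"


ready = {"source_types": ["policy", "sop"], "grounded": True,
         "quality_score": 4, "sme_approved": True}
print(backlog_action(ready))
```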
7.3 What level of human review is required?
| Risk tier | Example | Review requirement |
|---|---|---|
| Low | internal UI label, non-material routing hint | BA review and sampling |
| Medium | employee workflow recommendation, non-customer-impact field | SME approval and QA test |
| High | customer money/access/eligibility, complaint/dispute, regulated communication | Product, Risk/Control, SME, QA/EvalOps approval |
| Restricted | legal hold, sensitive identity, fraud, vulnerability, privileged tool action | Specialized owner review and documented gate |
7.4 Which intervention is correct?
| Discovery | Likely action |
|---|---|
| Repeated missing information in transcripts and tickets | Improve intake form, validation and customer guidance |
| High variant count from production logs | Process governance before AI automation |
| Conflict between SOP and API behavior | Architecture impact assessment and remediation backlog |
| Policy ambiguity across teams | Policy interpretation workflow and vocabulary update |
| Manual copy/paste between systems | Integration/API/RPA/agent tool opportunity |
| Low requirement quality from PRDs | Product operating model improvement and template change |
| High hallucination in evidence citation | Retrieval authority filter and citation eval redesign |
8. Extraction Prompt Contract
AI extraction prompts should be treated as controlled product artifacts. A good extraction contract says what to extract, what not to infer, and how to expose uncertainty.
Required output schema:
{
  "candidate_id": "string",
  "candidate_type": "business_requirement | stakeholder_requirement | solution_requirement | transition_requirement | control_requirement | data_requirement | process_rule | acceptance_criterion | risk_issue",
  "statement": "string",
  "source_refs": [
    {
      "artifact_id": "string",
      "location": "section/page/span/event_id",
      "authority_level": "A1|A2|A3|A4|A5|A6",
      "effective_date": "YYYY-MM-DD"
    }
  ],
  "known_facts": ["string"],
  "unknowns": ["string"],
  "ambiguity_flags": ["actor_unknown", "decision_boundary_unknown", "data_scope_unknown", "control_owner_missing"],
  "conflicts": ["string"],
  "quality_score": 0,
  "recommended_acceptance_criteria": ["string"],
  "validation_owner": "role",
  "risk_tier": "low|medium|high|restricted"
}
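A schema like the one above is only useful if outputs are checked against it before they reach the validation workbench. The following is a minimal validation sketch using only the standard library; the function name `validate_candidate` is hypothetical, and the 0-5 range for `quality_score` is taken from the scoring table in the quality gate section rather than from the schema itself.

```python
import re

ALLOWED_TYPES = {
    "business_requirement", "stakeholder_requirement", "solution_requirement",
    "transition_requirement", "control_requirement", "data_requirement",
    "process_rule", "acceptance_criterion", "risk_issue",
}
ALLOWED_AUTHORITY = {"A1", "A2", "A3", "A4", "A5", "A6"}
ALLOWED_RISK = {"low", "medium", "high", "restricted"}
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def validate_candidate(c: dict) -> list[str]:
    """Return schema problems for one extraction candidate (empty list = valid)."""
    errors = []
    if c.get("candidate_type") not in ALLOWED_TYPES:
        errors.append("invalid candidate_type")
    if not c.get("statement"):
        errors.append("empty statement")
    refs = c.get("source_refs") or []
    if not refs:
        errors.append("no source_refs")  # ungrounded candidates are rejected
    for i, ref in enumerate(refs):
        if ref.get("authority_level") not in ALLOWED_AUTHORITY:
            errors.append(f"source_refs[{i}]: invalid authority_level")
        if not DATE_RE.match(str(ref.get("effective_date", ""))):
            errors.append(f"source_refs[{i}]: effective_date not YYYY-MM-DD")
    score = c.get("quality_score")
    if not isinstance(score, int) or not 0 <= score <= 5:
        errors.append("quality_score must be an integer 0-5")
    if c.get("risk_tier") not in ALLOWED_RISK:
        errors.append("invalid risk_tier")
    return errors


candidate = {
    "candidate_id": "C-001",
    "candidate_type": "control_requirement",
    "statement": "Refunds above the threshold require dual approval.",
    "source_refs": [{"artifact_id": "POL-12", "location": "section 4.2",
                     "authority_level": "A1", "effective_date": "2024-01-15"}],
    "quality_score": 3,
    "risk_tier": "high",
}
print(validate_candidate(candidate))
```

The example values (`C-001`, `POL-12`, the statement text) are invented for illustration. A production setup would more likely use a schema library, but the checks would be the same: enumerated types, at least one source reference, and bounded scores.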
Prompt rules:
| Rule | Rationale |
|---|---|
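| Do not invent missing business rules | Missing evidence must be marked unknown |
| Preserve source authority | Sources of different authority must not be averaged together |
| Separate current behavior from desired behavior | Code and logs show actual behavior, not intended behavior |
| Produce questions, not false certainty | Ambiguous requirements need clarification |
| Cite exact source spans | Lets SMEs validate quickly |
| Flag policy/control conflicts | A senior BA's value lies in surfacing conflicts |
| Avoid customer commitments | Mined output must not become a customer-visible commitment |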
9. Requirement Quality Gate
Gate checklist
| Gate | Pass signal |
|---|---|
| Source grounded | Every statement has artifact, location, version and owner |
| Authority clear | Source level and conflict policy are visible |
| Actor clear | customer, employee, system, team and approver are not conflated |
| Decision boundary clear | read/summarize/recommend/draft/decide/act are distinguished |
| Data boundary clear | source fields, purpose, permission and retention are defined |
| Control linkage clear | approval, dual control, review and audit evidence are linked |
| Acceptance testable | positive, negative and edge cases and the evidence requirement are written |
| Eval ready | dataset, rubric, threshold, critical failure and slice are defined |
| Change impact traceable | process, API, test, control and release links exist |
| Owner accountable | business owner, SME, architect and QA/EvalOps owner are clear |
Scoring interpretation:
| Score | Meaning | Allowed action |
|---|---|---|
| 0 | wrong or unsupported | reject and log reason |
| 1 | discovery note | keep in evidence cluster |
| 2 | grounded but incomplete | send to clarification |
| 3 | clear but not testable/control-linked | improve before backlog |
| 4 | backlog-ready candidate | create item with source links |
| 5 | baseline-ready for high-impact use | release gate eligible after eval |
10. Traceability Graph Play
Build the graph in layers
| Layer | Nodes | Edges |
|---|---|---|
| Strategy | outcome, KPI, benefit hypothesis, risk appetite | justifies, constrains |
| Stakeholder | role, need, concern, decision right | owns, approves, challenges |
| Requirement | candidate, baseline, acceptance criteria | derives_from, verifies |
| Process | activity, event, variant, handoff, exception | precedes, deviates_from, controls |
| System | API, data object, service, UI, code rule | implements, depends_on |
| Quality | test case, eval case, rubric, threshold | verifies, blocks |
| Control | policy, control objective, evidence, issue | constrains, monitors |
| Delivery | backlog, release, change request, incident | delivers, remediates |
Minimum queries
| Query | Why it matters |
|---|---|
| Show all requirements derived from retired SOP sections | Stops expired sources from continuing to drive the backlog |
| Show requirements without acceptance criteria | Finds requirements that cannot be accepted |
| Show high-risk requirements without eval contract | Finds release blockers |
| Show policy changes impacting AI prompts or RAG corpus | Prevents output based on expired policy |
| Show API schema changes impacting controls and tests | Supports release impact review |
| Show tickets repeatedly linked to rejected requirements | Identifies real pain where the proposed solution was wrong |
| Show production variants not covered by SOP | Identifies process governance opportunities |
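Two of the minimum queries can be sketched over a toy in-memory graph. This is an illustration only: the node ids, attribute names and the plain dict/triple representation are assumptions, and a real deployment would more likely run equivalent queries in a graph database. Treating the `restricted` tier as high-risk for the second query is also an assumption.

```python
# Nodes keyed by id; edges as (source, relation, target) triples.
nodes = {
    "REQ-1": {"type": "requirement", "risk_tier": "high"},
    "REQ-2": {"type": "requirement", "risk_tier": "high"},
    "REQ-3": {"type": "requirement", "risk_tier": "low"},
    "SOP-7": {"type": "sop_section", "status": "retired"},
    "SOP-9": {"type": "sop_section", "status": "current"},
    "EVAL-1": {"type": "eval_contract"},
}
edges = [
    ("REQ-1", "derives_from", "SOP-7"),
    ("REQ-2", "derives_from", "SOP-9"),
    ("REQ-3", "derives_from", "SOP-7"),
    ("EVAL-1", "verifies", "REQ-2"),
]


def requirements_from_retired_sops(nodes, edges):
    """Requirements whose derives_from edge points at a retired source."""
    return sorted(
        src for src, rel, dst in edges
        if rel == "derives_from"
        and nodes[src]["type"] == "requirement"
        and nodes[dst].get("status") == "retired"
    )


def high_risk_without_eval(nodes, edges):
    """High-risk requirements that no eval contract verifies."""
    covered = {dst for src, rel, dst in edges
               if rel == "verifies" and nodes[src]["type"] == "eval_contract"}
    return sorted(
        nid for nid, attrs in nodes.items()
        if attrs["type"] == "requirement"
        and attrs.get("risk_tier") in {"high", "restricted"}
        and nid not in covered
    )


print(requirements_from_retired_sops(nodes, edges))
print(high_risk_without_eval(nodes, edges))
```

The edge names `derives_from` and `verifies` follow the layer table above; everything else in the sample data is invented for the example.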
11. Process Variant Discovery Play
Input:
case_id, activity, timestamp, resource, lifecycle, channel, product,
risk_tier, amount_band, status, outcome, source_system
Steps:
- Define the case granularity: application, dispute, alert, ticket, complaint, service request.
- Standardize activities: do not treat status codes directly as business activities.
- Generate top variants: find the main paths that cover 80% of volume, plus the high-risk long tail.
- Flag rework, waiting, handoff, skip, loop and override.
- Align with the SOP/BPMN/control path, and separate acceptable exceptions, control gaps and data noise.
- Generate AI opportunity candidates: summarize, route, draft, validate, retrieve, recommend, tool action.
- Link each opportunity to a requirement, acceptance criteria, a control and an eval.
Variant interpretation table:
| Finding | Product implication | Control implication |
|---|---|---|
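| Low main-path coverage | Not suitable for direct automation; govern the process and taxonomy first | Exception handling and control paths need to be completed |
| High rework | Improve document collection, validation and policy interpretation | Monitor rework causes and customer harm |
| Multi-team handoff | AI handoff summary or queue routing | Responsibility and evidence transfer must be clear |
| Control step skipped | Block release; fix process/permissions first | control issue and remediation |
| High waiting caused by external documents | Customer/third-party reminders and SLA management | Record notifications and clock-pause logic |
| Overrides concentrated in one team | policy ambiguity or training gap | dual control / QA sampling |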
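The variant-generation and rework-flagging steps can be sketched with the standard library alone. This is a simplified illustration, not a process mining tool: it assumes events arrive as `(case_id, activity, timestamp)` tuples with already-standardized activity names, uses only three of the input fields listed above, and treats any repeated activity inside a case as rework. The function name and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def discover_variants(events, coverage_target=0.8):
    """Group events into per-case traces and rank the distinct paths (variants)."""
    traces = defaultdict(list)
    # Order events by case, then by timestamp, to rebuild each case's path.
    for case_id, activity, _ts in sorted(events, key=lambda e: (e[0], e[2])):
        traces[case_id].append(activity)
    counts = Counter(tuple(t) for t in traces.values())
    total = sum(counts.values())
    if total == 0:
        return {"variants": counts, "top": [], "coverage": 0.0, "rework_cases": []}
    # Take the most frequent variants until the coverage target is reached.
    top, covered = [], 0
    for variant, n in counts.most_common():
        if covered / total >= coverage_target:
            break
        top.append(variant)
        covered += n
    # Rework flag (assumption): any activity repeated within the same case.
    rework_cases = sorted(c for c, t in traces.items() if len(set(t)) < len(t))
    return {"variants": counts, "top": top,
            "coverage": covered / total, "rework_cases": rework_cases}


events = []
for case in ("c1", "c2", "c3", "c4"):
    events += [(case, "intake", 1), (case, "review", 2), (case, "close", 3)]
events += [("c5", "intake", 1), ("c5", "review", 2),
           ("c5", "review", 3), ("c5", "close", 4)]

report = discover_variants(events)
print(report["top"], report["coverage"], report["rework_cases"])
```

In this sample the main path covers four of five cases, so one variant reaches the 80% target, and the remaining case is surfaced as rework for comparison against the SOP/BPMN path.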
12. Eval Contract for Mining System
The mining system itself must be evaluated before teams rely on it.
| Eval area | Metric | Release threshold idea |
|---|---|---|
| Requirement extraction precision | AI candidates accepted as valid by SME | high enough by source class, no critical false positives |
| Critical recall | must-have policy/control/exception requirements found | zero missed critical control in golden set |
| Groundedness | statements fully supported by source refs | unsupported material claim = release blocker |
| Authority classification | source authority correctly ranked | no A4/A5 source overriding A1/A2 |
| Ambiguity detection | required clarifications correctly flagged | high-risk ambiguity miss = blocker |
| Conflict detection | known conflicts identified | policy/SOP/log conflict misses reviewed |
| Permission safety | no unauthorized source leakage | zero leakage in red-team tests |
| Output schema validity | machine-readable structured output | near-perfect schema compliance |
| SME efficiency | review time per candidate | improves without lowering quality |
| Change impact quality | impacted systems/tests/controls found | validated against known changes |
Critical failures:
- hallucinated source citation
- unauthorized PII or restricted source in output
- policy/control requirement missed in high-impact workflow
- low-authority source treated as approved baseline
- AI-generated customer commitment
- hidden conflict between source materials
- output enters backlog without validation evidence
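A few of the eval areas and critical failures above can be turned into a small scoring routine. This is a sketch under assumptions: each candidate is a dict with hypothetical keys `matched_golden_id`, `sme_accepted` and `grounded`, precision is approximated as the SME acceptance share, and every ungrounded candidate is treated as a material unsupported claim. Real thresholds would be set per source class by the eval owner.

```python
def evaluate_mining_run(candidates, golden_critical_ids):
    """Compute precision, critical recall and release blockers for one eval run."""
    # Precision proxy: share of AI candidates the SME accepted as valid.
    accepted = sum(1 for c in candidates if c["sme_accepted"])
    precision = accepted / len(candidates) if candidates else 0.0

    # Critical recall: share of must-find golden items that were matched.
    found = {c["matched_golden_id"] for c in candidates if c["matched_golden_id"]}
    missed = sorted(set(golden_critical_ids) - found)
    if golden_critical_ids:
        critical_recall = 1 - len(missed) / len(golden_critical_ids)
    else:
        critical_recall = 1.0

    # Release blockers, following the threshold ideas in the table above.
    blockers = []
    if missed:
        blockers.append(f"missed critical items: {missed}")
    unsupported = sum(1 for c in candidates if not c["grounded"])
    if unsupported:
        blockers.append(f"unsupported claims: {unsupported}")
    return {"precision": precision, "critical_recall": critical_recall,
            "blockers": blockers}


run = [
    {"matched_golden_id": "G1", "sme_accepted": True, "grounded": True},
    {"matched_golden_id": None, "sme_accepted": True, "grounded": True},
    {"matched_golden_id": None, "sme_accepted": True, "grounded": True},
    {"matched_golden_id": None, "sme_accepted": False, "grounded": False},
]
result = evaluate_mining_run(run, ["G1", "G2"])
print(result)
```

Keeping blockers as a separate list, rather than folding them into an average score, reflects the idea that a single critical failure stops the release regardless of how good the aggregate metrics look.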
13. Evidence and Control Checklist
Pre-launch
| Control area | Evidence |
|---|---|
| Source governance | inventory, owner, version, permission, retention, record class |
| Data protection | DPIA/privacy review where applicable, redaction rules, access matrix |
| Records | record class, legal hold propagation, derived artifact retention |
| Security | connector entitlement, secrets handling, audit logging, vendor boundary |
| Model governance | model card, prompt version, eval results, limitations |
| EvalOps | golden set, rubric, thresholds, critical failures, independent review |
| SME operations | reviewer guide, decision codes, escalation path |
| Traceability | graph schema, source-to-requirement links, impact queries |
| Release | go/no-go memo, exceptions, risk acceptance record |
Production
| Control area | Evidence |
|---|---|
| Usage monitoring | who mined what, source classes, exports, backlog sync |
| Quality monitoring | acceptance rate, reject reasons, ambiguity density, conflict misses |
| Permission monitoring | denied retrievals, redaction events, suspicious access |
| Drift monitoring | source freshness, vocabulary drift, new ticket clusters |
| Change monitoring | policy/API/SOP/model/prompt changes and regression eval |
| Incident handling | leakage, hallucination, wrong baseline, control miss, remediation |
| Portfolio learning | reusable patterns, updated rubrics, added eval cases |
14. 30 / 60 / 90 Roadmap
First 30 days: controlled discovery
| Workstream | Output |
|---|---|
| Select domain | one workflow, e.g., payment dispute, KYC onboarding, fee servicing, AML alert triage |
| Inventory sources | PRD/BRD/SOP/policy/tickets/transcripts/process maps/tests/logs/control evidence |
| Define authority ladder | source classes, approval status, conflict rules |
| Define vocabulary | key terms, role names, activity taxonomy, forbidden ambiguous terms |
| Build small corpus | permission-filtered, redacted, versioned artifact set |
| Design rubric | quality score, ambiguity flags, conflict categories |
| Create golden set | SME-labeled requirements, controls, events, conflicts |
| Run pilot extraction | candidates, source refs, ambiguity questions, initial graph |
Exit criteria:
The team can show source-backed candidates, rejected examples, ambiguity log,
and at least one requirement-to-test-to-control trace for the selected workflow.
Days 31-60: graph, eval and SME workflow
| Workstream | Output |
|---|---|
| Traceability graph | outcome -> requirement -> process -> API/data -> test -> control |
| SME workbench | approve/reject/merge/split/escalate with reason codes |
| Eval contract | dataset, rubric, slices, thresholds, critical failures |
| Process mining link | top variants, rework, handoff, waiting, control gap |
| Backlog integration | only approved candidates sync to Jira/Azure DevOps |
| Change impact queries | policy/API/SOP/test changes show impacted assets |
| Control pack | permission audit, evidence pack, release gate memo |
Exit criteria:
The mining system can pass golden-set eval, route ambiguous outputs to SMEs,
and create backlog items only with source refs, quality score and validation evidence.
Days 61-90: production pilot and portfolio learning
| Workstream | Output |
|---|---|
| Production pilot | limited users, limited corpus, high audit logging |
| Monitoring | quality, permission, source freshness, SME decisions, backlog conversion |
| Regression eval | triggered by policy/SOP/API/model/prompt/corpus changes |
| Incident drill | hallucinated source, permission leak, wrong baseline, missed control |
| Portfolio pattern library | reusable requirement patterns, acceptance criteria, eval cases |
| Operating model | RACI, governance cadence, funding and scaling decision |
| Executive review | value evidence, risk issues, expansion roadmap |
Exit criteria:
The organization can demonstrate faster discovery, better traceability,
measurable SME productivity, controlled risk, and reusable portfolio assets.
15. Implementation Guardrails
| Guardrail | Practical rule |
|---|---|
| Start with one workflow | Enterprise-wide mining amplifies permission, vocabulary and review problems |
| Keep source status visible | Every answer shows approved/current/expired/draft/source class |
| Use negative examples | The eval set must include expired SOPs, conflicting policies, incorrect tickets and transcription errors |
| Separate discover vs decide | Keep the discovery workspace separate from the baseline repository |
| Red-team permission | Try to get users to infer unauthorized content through summaries |
| Calibrate by source class | Track precision/recall separately for ticket, policy, test and log |
| Monitor reviewer burden | Too many low-quality AI candidates erode SME trust |
| Keep records artifacts governed | Embeddings, summaries, graph nodes and review notes can all become derived artifacts |
| Make rollback possible | An erroneous sync to backlog or baseline can be traced and reverted |
| Treat prompts as changeable code | prompt/corpus/model changes trigger regression eval |
16. Metrics
Business and delivery metrics
| Metric | Meaning |
|---|---|
| discovery cycle time reduction | Time from source intake to validated candidate |
| validated candidate yield | High-quality requirements produced per 100 artifacts |
| duplicate reduction | Share of duplicate tickets/stories/requirements merged |
| clarification throughput | Time from discovering an ambiguity to closing it |
| backlog quality lift | Improvement in completeness of source refs, AC and test links on approved stories |
| change impact lead time | Time taken for change impact analysis |
| reuse rate | Share of patterns, AC, eval cases and vocabulary reused |
Risk and quality metrics
| Metric | Meaning |
|---|---|
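| unsupported claim rate | Share of AI output not supported by a source |
| wrong authority rate | Authority level misclassified, or low authority overriding high authority |
| critical recall miss | Missed high-impact policy/control/exception |
| permission leakage rate | Unauthorized information appearing in output |
| ambiguity miss rate | Key ambiguities found by humans that AI did not flag |
| conflict miss rate | Known conflicts not identified |
| SME disagreement rate | SMEs interpret the output inconsistently |
| production feedback incorporation | Speed at which incidents/QA/tickets flow back into eval |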
17. Anti-Patterns and Repairs
| Anti-pattern | Symptom | Repair |
|---|---|---|
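| “AI user story factory” | Backlog grows, quality drops | Enforce source_refs, quality score and SME approval |
| “One vector store for every document” | Permission, version and records boundaries get out of control | source registry + retrieval-time ACL + corpus partition |
| “Documents only, no logs” | Automates the idealized flow and ignores real variants | connect event logs and process mining |
| “Prioritize by tickets alone” | High-frequency noise drowns out high-risk requirements | scoring weighted by severity, journey, control and value |
| “Let AI resolve the conflicts” | Output is fluent but wrong | show conflicts and route to owner |
| “The code is the requirement” | Historical defects get productized | Separate actual behavior from desired requirement |
| “Eval only tests summary quality” | Mining errors reach the backlog | test extraction, authority, conflict, permission and impact |
| “Unstructured SME review” | Reviewed but not reusable | reason codes and decision log |
| “Ungoverned derived artifacts” | Summaries, embeddings and graph nodes carry records risk | records/privacy controls for all derived artifacts |
| “No learning after the pilot” | Every team repeats the same mistakes | portfolio pattern library and eval expansion |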
18. Artifact Templates
18.1 Mining intake brief
workflow:
business owner:
process owner:
primary outcome:
risk tier:
source classes included:
source classes excluded:
permission model:
record classes:
SME reviewers:
quality rubric:
eval dataset:
release decision owner:
18.2 Requirement candidate review card
candidate statement:
source refs:
authority level:
known facts:
unknowns:
ambiguity flags:
conflicts:
risk tier:
quality score:
suggested acceptance criteria:
linked process activity:
linked API/data/test/control:
review decision:
reason code:
owner:
18.3 Change impact memo
changed artifact:
change type:
effective date:
impacted requirements:
impacted workflows:
impacted APIs/data objects:
impacted tests/evals:
impacted controls/records:
required approvals:
release implication:
monitoring update:
19. Interview Answers
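Q1: How would you design the end-to-end architecture for AI requirements mining?
30-second answer:
I would start with governed ingestion: source inventory, authority classification, permission filtering and versioning across PRDs, SOPs, policies, tickets, transcripts, Jira, code/API, tests, logs and controls. Then I would use a domain vocabulary and process ontology for structured extraction, producing requirement candidates, process events, stakeholder concerns, control links, acceptance criteria and impact links. All output goes into the traceability graph and can enter the backlog or baseline only after passing the quality rubric, the eval contract and SME validation.
Q2: How do you show this is more than simple RAG summarization?
RAG summarization only answers "what do these documents say". Requirements mining has to answer which content can become a requirement, what the basis is, what the authority level is, where sources conflict, where they are ambiguous, who validates, how it is tested, which systems and controls are affected, and how it is monitored after release. The core asset is not a summary but the source-backed graph, quality score, eval contract, SME decision log and change impact evidence.
Q3: How do you keep AI from automatically replacing the BA?
In the process, I fix AI output as a candidate. In the system, I separate the discovery workspace from the baseline repository. In governance, I require source_refs, a quality score, SME approval and a decision record. The BA's value moves up from writing first drafts to evidence orchestration, ambiguity resolution, stakeholder negotiation, scope tradeoff, traceability design and release evidence governance.
Q4: What is the biggest risk when mining requirements from call transcripts and meeting notes?
The biggest risk is turning language that is non-authoritative, noisy, PII-bearing or missing context directly into requirements. Transcripts can be wrong, what a customer says may be emotion or complaint, and meeting notes may not record an approved decision. So by default these sources serve only as pain signals, clarification input or decision candidates. Entering the baseline requires a higher-authority source or ratification by the owner.
Q5: How do you handle conflicts between policy, SOPs and production logs?
I would not let AI smooth the conflict away. Policy and controls are constraints, the SOP is the intended process, and production logs are actual behavior. The conflict itself is a high-value finding: it may indicate a process workaround, a control gap, an outdated SOP, a system defect or unclear policy interpretation. The system should display a conflict graph, assign an owner, and produce a decision or remediation item.
Q6: How do you connect mined requirements to eval?
Every AI-related requirement must have acceptance criteria and an eval contract. The eval contract defines the dataset, source scope, rubric, threshold, critical failures, slices, evaluator and go/no-go rules. For a policy Q&A requirement, for example, accuracy alone is not enough; you also test groundedness, authority awareness, unsupported claims, permission leakage and high-risk ambiguity detection.
Q7: What role do production logs play in this playbook?
Production logs are a key source of process truth. They help uncover real variants, handoffs, waiting, rework, overrides and control execution. Document mining tells us the design intent; log mining tells us the actual behavior. Only by combining the two can we decide whether to build an AI assistant, redesign the process, improve data quality, integrate an API or repair a control.
Q8: How do you control privacy and records risk in a financial institution?
I would implement permission-before-retrieval, with permission filtering at the source, chunk, graph edge and output layers; apply redaction or purpose-bound access to PII; and design retention, legal hold propagation and audit logging for raw artifacts, chunks, embeddings, summaries, review notes and graph nodes. AI output must not become a side channel around records and privacy controls.
Q9: What does a good 90-day rollout path look like?
In the first 30 days, run controlled discovery on a single workflow and build the source inventory, authority ladder, vocabulary, golden set and extraction rubric. In days 31-60, build the traceability graph, SME workbench, eval contract, process variant link and backlog integration. In days 61-90, run a limited production pilot and add monitoring, regression eval, an incident drill and portfolio learning. Before expanding, you must show that quality, permissions and SME efficiency are all under control.
Q10: How does this capability become a portfolio highlight?
I would present five kinds of assets: the source authority model, requirement quality rubric, traceability graph schema, eval contract and 30/60/90 rollout plan. Then I would use one retail financial services case to show how requirements are extracted from SOPs, tickets, transcripts, APIs, tests and logs, how conflicts are found, how candidates go through SME validation, and how results connect to backlog, eval, control and release gate. This demonstrates senior PM/BA/Architect capability far better than showing "AI writes user stories automatically".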
20. Portfolio Deliverables
Suggested portfolio package:
| Deliverable | Demonstrates |
|---|---|
| Source authority and permission model | governance-grade thinking |
| Domain vocabulary and ambiguity taxonomy | CBAP+ semantic discipline |
| Requirement quality rubric | quality and baseline control |
| Traceability graph schema | architecture and impact analysis maturity |
| Eval contract for mining | AI assurance and EvalOps ability |
| SME validation workflow | operating model design |
| Process variant discovery report | process intelligence and evidence-based prioritization |
| Change impact memo | release and control governance |
| Privacy/records checklist | regulated environment awareness |
| 30/60/90 roadmap | execution leadership |
Final operating principle:
Mine broadly, trust narrowly, validate explicitly, trace everything,
and let production evidence improve the portfolio.