ABPA Template 04: Requirements-to-Eval Matrix
Requirements-to-Eval Matrix
AI requirements are weak until they are connected to eval data, graders, thresholds, and production signals.
1. Requirement Inventory
| Req ID | Requirement | Type | Business outcome | Priority | Owner |
|---|---|---|---|---|---|
| R-001 | | functional / NFR / governance | | must / should / could | |
2. Eval Mapping
| Req ID | Acceptance criteria | Eval data | Grader | Threshold | Production signal | Review cadence |
|---|---|---|---|---|---|---|
| R-001 | | | code / LLM judge / human / metric | | | |
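
One way to make an Eval Mapping row executable is to store it as a record and compare a grader's score against its threshold. The sketch below is illustrative only: the `EvalMapping` class, the `meets_threshold` helper, the 0 to 1 score scale, and the sample values are assumptions, not part of the template.

```python
from dataclasses import dataclass

# Grader options listed in the Eval Mapping table.
ALLOWED_GRADERS = {"code", "LLM judge", "human", "metric"}


@dataclass
class EvalMapping:
    """Hypothetical record whose fields mirror the Eval Mapping columns."""
    req_id: str
    acceptance_criteria: str
    eval_data: str
    grader: str
    threshold: float  # assumed: minimum acceptable score on a 0-1 scale
    production_signal: str
    review_cadence: str


def meets_threshold(mapping: EvalMapping, score: float) -> bool:
    """Return True when the score reaches the row's threshold."""
    if mapping.grader not in ALLOWED_GRADERS:
        raise ValueError(f"unknown grader: {mapping.grader!r}")
    return score >= mapping.threshold


# Sample values are placeholders for illustration.
row = EvalMapping(
    req_id="R-001",
    acceptance_criteria="Answers cite a source document",
    eval_data="golden-set-v1",
    grader="LLM judge",
    threshold=0.9,
    production_signal="citation rate",
    review_cadence="monthly",
)

print(meets_threshold(row, 0.95))
print(meets_threshold(row, 0.50))
```

Rejecting unknown grader names early keeps a typo in the matrix from silently passing as a valid row.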
3. Guardrail Mapping
| Req ID | Failure mode | Guardrail | Stop / escalation rule |
|---|---|---|---|
| R-001 | | | |
4. Requirement Cards
R-001
| Field | Answer |
|---|---|
| User / stakeholder | |
| Problem | |
| Requirement | |
| Non-goal | |
| Acceptance criteria | |
| Eval data | |
| Grader | |
| Threshold | |
| Manual review | |
| Observability signal | |
| Owner | |
5. Coverage Check
| Coverage question | Status | Gap |
|---|---|---|
| Does every must-have requirement have an eval? | pass / fail | |
| Does every high-risk requirement have human review or control? | pass / fail | |
| Are any requirements impossible to evaluate today? | pass / fail | |
| Are production signals defined before launch? | pass / fail | |
| Are stop rules defined? | pass / fail | |
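
Several of the coverage questions can be answered mechanically once the matrix is kept as data. The following is a minimal sketch under assumptions: the `Requirement` fields and the `coverage_check` function are hypothetical names, and each boolean flag stands in for "the corresponding cell in the tables above is filled in". The question about requirements that are impossible to evaluate today is left out because it needs human judgment.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    """Hypothetical summary of one requirement across the tables."""
    req_id: str
    priority: str  # must / should / could
    high_risk: bool = False
    has_eval: bool = False
    has_human_review: bool = False
    has_production_signal: bool = False
    has_stop_rule: bool = False


def coverage_check(reqs):
    """Map each coverage question to a (status, gap) pair.

    The gap lists the Req IDs that cause a failure, or is empty on pass.
    """
    def result(missing):
        return ("fail", ", ".join(missing)) if missing else ("pass", "")

    return {
        "Does every must-have requirement have an eval?": result(
            [r.req_id for r in reqs if r.priority == "must" and not r.has_eval]
        ),
        "Does every high-risk requirement have human review or control?": result(
            [r.req_id for r in reqs if r.high_risk and not r.has_human_review]
        ),
        "Are production signals defined before launch?": result(
            [r.req_id for r in reqs if not r.has_production_signal]
        ),
        "Are stop rules defined?": result(
            [r.req_id for r in reqs if not r.has_stop_rule]
        ),
    }


# Placeholder data: R-001 is fully covered, R-002 has nothing filled in yet.
reqs = [
    Requirement("R-001", "must", high_risk=True, has_eval=True,
                has_human_review=True, has_production_signal=True,
                has_stop_rule=True),
    Requirement("R-002", "must"),
]

for question, (status, gap) in coverage_check(reqs).items():
    print(f"| {question} | {status} | {gap} |")
```

Printing the results as table rows lets the output be pasted straight into the Coverage Check table.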