solution-erp/docs/governance/error-ledger.md
pqhuy1987 009dd94f22 [CLAUDE] Docs: S48 adap-* verify closure post-restart + Gov-v2 error-ledger + §L.b
- store_memory strip VERIFIED-runtime (registry 0/8 subs) — adap-report updated
- frontend-designer FD2 loop VERIFIED-RAN (first spawn) — adap-report updated
- Gov-v2 delta CLOSED: NEW docs/governance/error-ledger.md (blameless RCA + Active-Guards
  index + AS-1..AS-9 deterministic-detect + 3-ledger triad) + session-end.md Phase 1.5 §L.b 6-step
- STATUS/HANDOFF S48 + session log + frontend-designer MEMORY flush (FD2 rig + Tailwind-v4 fact)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 00:05:39 +07:00

# Error-Ledger — SOLUTION_ERP (Gov-v2 §L keystone)
> **Living artifact.** Blameless RCA + Active-Guards index for SE. Closes the open delta from adap-report `2026-06-02-Governance-gov-v2-session-cmd-framework` (the only Gov-v2 floor item SE had distributed-but-not-formalized).
> **Maintained at `/session-end` §L.b** (deterministic step, not a daemon — G-015). Blameless = root-cause + guard, NOT blame.
## 📐 The 3-ledger triad (Gov-v2 §L.b / §G3: merged form, function intact)
SE maps the mandated 3 living ledgers onto existing + new artifacts (§F4 form-freedom):
| Ledger (function) | SE artifact | Role |
|---|---|---|
| **(i) error-ledger** | **this file** (`docs/governance/error-ledger.md`) | RCA blameless · Active-Guards index · 3-axis tag · 2-strike promote |
| **(ii) comms-ledger** | `docs/governance/README.md` "Cross-Project Adoption Ledger" + `docs/governance/adap-reports/` | 2-way cross-project OUT→ACK / IN→decided, link-not-copy |
| **(iii) summary-index** | `docs/STATUS.md` "Recently Done" + `docs/changelog/sessions/` | timeline spine, pointer-not-log, reverse-chron |
## 🔍 §L.a — Deterministic detect (action-signature scan @ session-end)
Detect by **action-signature**, NOT by asking the AI to judge for itself whether a violation occurred. Scan the session for the signatures below; each hit → an RCA entry below. The list is **open**: extend it when a new class appears. (G-015: this catches only the signatures in this list, NOT "every violation".)
| # | Action-signature (grep/observe) | Rule it violates | On hit |
|---|---|---|---|
| AS-1 | `git add -A` / `git add .` | add-specific-files (concurrency safety, `feedback_rag_mcp_recovery_concurrency`) | RCA + re-stage specific |
| AS-2 | `--no-verify` / `--no-gpg-sign` / `commit.gpgsign=false` | no hook/sign bypass unless asked | RCA, justify or revert |
| AS-3 | sub-agent invokes `store_memory` | lead = sole RAG-writer (S47, mechanized) | should be impossible (allowlist-stripped); if chunk-count jumps w/o lead write → investigate |
| AS-4 | EF Mig adds UNIQUE/composite index on a soft-delete (`IsDeleted`) entity **without** `.HasFilter("[IsDeleted]=0")` | gotcha #57 (recreate-on-soft-deleted-slot → 500) | RCA + test-before + filter |
| AS-5 | heavy/long agent spawn in **foreground** | `feedback_background_spawn_visibility` (looks-frozen) | note; prefer `run_in_background` |
| AS-6 | docs-only commit that triggers a CI run | gotcha #41 path-filter (`paths-ignore`) | verify path-filter intact |
| AS-7 | model downgrade (haiku/sonnet) on codegen/guard/financial/security | critical-algo needs Max tier | RCA, re-run on Max |
| AS-8 | session-end memory `.md` Write leaving **0 bytes** | `feedback_session_end_memory_write_verify` (S46) | re-write + verify byte>0 |
| AS-9 | A/B/C choice handed to the user **without** the decision-brief axes | Gov-v2 §G2 | reframe as full brief |
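
The scan in the table above can be sketched as a small pattern matcher over the commands run during a session. This is a minimal sketch, not the ledger's actual scanner: the `SIGNATURES` table and the `scan` function are hypothetical names, the regular expressions are illustrative assumptions, and only AS-1 and AS-2 are covered because they are plain command-line signatures.

```python
import re

# Hypothetical signature table covering only AS-1 and AS-2.
# The patterns are illustrative assumptions, not the ledger's real scanner.
SIGNATURES = {
    # `git add -A`, `git add --all`, or a bare `git add .`
    "AS-1": re.compile(r"\bgit\s+add\s+(-A\b|--all\b|\.(\s|$))"),
    # hook or signing bypass flags
    "AS-2": re.compile(r"--no-verify\b|--no-gpg-sign\b|commit\.gpgsign=false"),
}


def scan(commands):
    """Return (signature_id, command) pairs for every command matching a signature."""
    hits = []
    for command in commands:
        for sig_id, pattern in SIGNATURES.items():
            if pattern.search(command):
                hits.append((sig_id, command))
    return hits


if __name__ == "__main__":
    session = ["git add -A", "git add src/app.py", "git commit --no-verify -m wip"]
    for sig_id, command in scan(session):
        print(f"{sig_id}: {command}")
```

Each hit would then become one RCA entry. Because matching is purely textual, the scan stays deterministic and reports only what the table lists.
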
## 🛡️ Active-Guards index (2-strike promote: episodic → procedural)
> **net-effect rule:** a guard that costs more than it saves (harm > benefit) → **retire**. `verified` = ran ≥1× and held. `strikes` = times the underlying error recurred before the guard.
| Guard | Counters | Tier | Strikes | Verified | Net |
|---|---|---|---|---|---|
| CI `paths-ignore` docs-only skip | gotcha #41 (AS-6) | procedural | 2 | ✅ (every docs commit 0s) | +++ |
| em-main verify-on-disk + proxy-append after agent return | gotcha #53 truncation | procedural | 5× (S35-S42) | ✅ | +++ |
| test-before bug-fix + soft-delete-UNIQUE `.HasFilter` | gotcha #57 (AS-4) | procedural | 2 (Holiday S45 + latent LeaveType/Shift) | ✅ Mig 43 | ++ |
| authz regression test per-action policy | gotcha #44 silent-403 | procedural | 1 (promoted S45 +10 test) | ✅ | ++ |
| agent frontmatter `model: inherit` (not `[1m]`) | gotcha #37 | procedural | — | ✅ (FD agent loaded S48) | ++ |
| **lead = sole RAG-writer** (`store_memory` stripped, mechanized) | store_memory rebootstrap-loss (S41) + AS-3 | procedural | 2 (NamGroup + SE S41) | ✅ runtime S48 (0/8 subs) | +++ (failure-safe) |
| session-end verify memory byte>0 | S46 0-byte (AS-8) | **episodic→promote** | 1 (S46) | ⏳ wired §L.b S48, verify next run | ++ |
| heavy spawn → `run_in_background` | looks-frozen | episodic | 2 (S45, S48) | ✅ S48 (FD bg) | + |
| RAG glob `**/`-anchored (not root) | gotcha #10 node_modules leak | procedural | 1 (S41) | ✅ (2406 clean) | ++ |
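
The promote and retire rules for the guards above can be sketched as a tiny state model. This is a literal reading of the 2-strike and net-effect rules under stated assumptions: the `Guard` class, `record_strike`, and `apply_net_effect` are hypothetical names, and the numeric `cost` and `benefit` are placeholders for what the ledger records qualitatively (`+`, `++`, `+++`). The ledger itself may promote earlier or later by judgment.

```python
from dataclasses import dataclass


@dataclass
class Guard:
    name: str
    tier: str = "episodic"  # "episodic", "procedural", or "retired"
    strikes: int = 0
    verified: bool = False


def record_strike(guard: Guard) -> Guard:
    """Count one recurrence; promote an episodic guard on its second strike."""
    guard.strikes += 1
    if guard.tier == "episodic" and guard.strikes >= 2:
        guard.tier = "procedural"
    return guard


def apply_net_effect(guard: Guard, cost: float, benefit: float) -> Guard:
    """Retire a guard whose cost exceeds its benefit (units are hypothetical)."""
    if cost > benefit:
        guard.tier = "retired"
    return guard


if __name__ == "__main__":
    guard = Guard(name="session-end verify memory byte>0")
    record_strike(guard)
    record_strike(guard)
    print(guard.tier)
```
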
## 📋 RCA entries (blameless — newest on top)
> Format: `E-NNN | date | rule | what | 5-why root | fix (prod-bug = 2-fix: code + guard) | prevention | tags[TYPE/ACTOR/COMPONENT]`
### E-004 — gotcha #53 agent truncation mid-MEMORY (recurring S35-S42)
- **rule:** agent must flush MEMORY before return; em main (the lead agent) must receive complete work.
- **what:** heavy WRITE-agent (implementer/test-specialist) output truncates mid-MEMORY-update; return looks complete but isn't.
- **5-why:** brief too heavy → spawn output cap hit → truncation at the tail → MEMORY update is last step → silent partial.
- **fix:** (code/process) em main grep-verify-on-disk after return + proxy-append the agent's MEMORY next session (Strategy B, `feedback_implementer_truncation_mitigation`). (guard) brief ≤8K + Tiered Memory L1 ~30KB cap.
- **prevention/guard:** Active-Guard "verify-on-disk + proxy-append" (promoted, 5 strikes). 529 → em main solo fallback, no retry-loop.
- **tags:** [process-truncation / sub-agent / agent-memory]
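
The verify-on-disk half of the guard above can be sketched as a check of the agent's claims against the filesystem. This is a minimal sketch under assumptions: `verify_on_disk` is a hypothetical helper, and a "claim" is modeled as a `(path, marker)` pair, where `marker` is a string the agent reported writing into that file.

```python
from pathlib import Path


def verify_on_disk(claims):
    """Return the claims that fail: file missing, or marker text absent from the file."""
    failures = []
    for path, marker in claims:
        file = Path(path)
        if not file.is_file() or marker not in file.read_text(encoding="utf-8"):
            failures.append((path, marker))
    return failures
```

Under this model, an empty result means every reported write is present on disk. Any failure is a candidate for the proxy-append step.
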
### E-003 — gotcha #44 silent 403 (S18, regression-tested S45)
- **rule:** authorization must fail loud, not silently break UX.
- **what:** class-level `[Authorize(Policy="Workflows.Read")]` → non-admin 403 → TanStack Query catch silent → Drafter saw empty Workspace dropdown, no error.
- **5-why:** broad class-level policy → GET blocked for non-admin → FE swallowed 403 → no surfaced error → looked like "no data".
- **fix:** (code) class-level `[Authorize]` only; GET for any-authenticated; POST/DELETE keep admin policy. (guard) test-specialist authz regression test +10 (S45) reflection-scan per-action policy.
- **prevention/guard:** Active-Guard "authz regression test per-action policy" (promoted S45).
- **tags:** [authz-regression / backend+frontend / ApprovalWorkflowsV2Controller]
### E-002 — gotcha #57 Holiday UNIQUE unfiltered → 500 (S45, fixed Mig 43)
- **rule (AS-4):** soft-delete entity + UNIQUE index MUST `.HasFilter("[IsDeleted]=0")`.
- **what:** `Holidays` DB UNIQUE (Year,Date) unfiltered vs handler `!IsDeleted` → admin delete + re-add same-date holiday = reachable 500.
- **5-why:** UNIQUE created unfiltered → soft-deleted row keeps the slot → handler allows logical re-create → INSERT hits dead UNIQUE → 500.
- **fix:** (code) Mig 43 `.HasFilter("[IsDeleted]=0")` (matches 13× existing pattern). (guard) Gap1 test-before reproduced the 500 first.
- **prevention/guard:** Active-Guard AS-4 + test-before. ⚠️ **OPEN latent:** `LeaveType.Code` + `ShiftPattern.Code` same class, still unfiltered → backlog test-before (2nd strike of this guard).
- **tags:** [soft-delete-invariant / em-main+test-specialist / Holidays,LeaveType,ShiftPattern]
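
The failure mode in this entry can be reproduced outside the application. The sketch below swaps in SQLite (via Python's standard `sqlite3` module) for the project's real database and EF Core migration, so it is an analogy and not the actual Mig 43. The table layout, index name, and the function `readd_after_soft_delete` are assumptions. A partial unique index (`WHERE IsDeleted = 0`) plays the role of `.HasFilter("[IsDeleted]=0")`.

```python
import sqlite3


def readd_after_soft_delete(filtered: bool) -> bool:
    """Soft-delete a holiday, re-add the same date, and report whether the INSERT succeeds."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Holidays (Id INTEGER PRIMARY KEY, Year INTEGER, Date TEXT, "
        "IsDeleted INTEGER NOT NULL DEFAULT 0)"
    )
    # The filtered variant indexes only live rows, so a soft-deleted row frees its slot.
    where = " WHERE IsDeleted = 0" if filtered else ""
    conn.execute(f"CREATE UNIQUE INDEX IX_Holidays_Year_Date ON Holidays (Year, Date){where}")
    conn.execute("INSERT INTO Holidays (Year, Date) VALUES (2026, '2026-01-01')")
    conn.execute("UPDATE Holidays SET IsDeleted = 1 WHERE Date = '2026-01-01'")  # soft delete
    try:
        conn.execute("INSERT INTO Holidays (Year, Date) VALUES (2026, '2026-01-01')")
        return True
    except sqlite3.IntegrityError:
        return False
    finally:
        conn.close()


if __name__ == "__main__":
    print("unfiltered:", readd_after_soft_delete(filtered=False))
    print("filtered:", readd_after_soft_delete(filtered=True))
```

With the unfiltered index the re-add raises an integrity error, which is the condition that surfaced as a 500. With the filtered index the re-add succeeds.
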
### E-001 — S46 user-memory 0-byte (close-out truncation)
- **rule (AS-8):** memory `.md` writes must persist (byte>0); index must not be empty.
- **what:** S45 close-out left `MEMORY.md` index + 1 entry at 0 bytes → S46 bootstrap ran with NO memory auto-inject (silent degrade).
- **5-why:** session-end Write created stub → body Write truncated (gotcha #53) → 0-byte file → not git-tracked (outside repo) → undetected until next bootstrap audit.
- **fix:** (process) rebuilt index + repopulated entry (S46). (guard) `feedback_session_end_memory_write_verify` + now session-end §L.b step (e)/(c) byte-check.
- **prevention/guard:** Active-Guard "session-end verify byte>0" (episodic→promoted S48, wired §L.b). `/session-start` audit also re-checks 0-byte (caught it S46, re-ran clean S48).
- **tags:** [memory-integrity / em-main / user-memory]
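
The byte>0 check named in this entry can be sketched as a directory walk. This is a minimal sketch under assumptions: `find_empty_markdown` is a hypothetical helper, and the memory directory path is supplied by the caller because the real location is outside the repo and is not stated here.

```python
from pathlib import Path


def find_empty_markdown(memory_dir):
    """Return the sorted names of .md files under memory_dir that are 0 bytes."""
    root = Path(memory_dir)
    return sorted(
        p.name for p in root.rglob("*.md") if p.is_file() and p.stat().st_size == 0
    )
```

A non-empty result at session-end would mean a write left a stub behind and needs a re-write before close-out.
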
---
> **Maintenance:** append RCA on each AS-hit; promote a guard to `procedural` on its 2nd strike; mark `verified` once it holds through a session; retire by net-effect. Pointer entries only — full narrative lives in session-logs (summary-index).