solution-erp/docs/governance/error-ledger.md
pqhuy1987 157792749f [CLAUDE] Docs: S58 session-end closeout — E-008/AS-12 error-ledger + session log + STATUS/HANDOFF final Run #386 + harvest gate PASS 5/5
- error-ledger: AS-12 NEW (identifier-based prod op must dump the target env) +
  E-008 RCA lock NO-OP, 2 layers (Dev-only population + password 11<12 silent
  CreateAsync-fail; Why-0 RAG-archaeology: already discovered in S22 but the const
  was never fixed; lesson "a discovery must become a code-fix/guard immediately") +
  new episodic Active-Guard (1 strike, verified Run #382).
- Session log S58 NEW: 5 work batches / 7 commits / Run #382-#386 (4 PASS + #385
  cancelled-supersede-benign) / 11 spawns / lessons / bundle final
  DMm9rtNA/BUkOMn_Y.
- STATUS/HANDOFF: bundle line final + In-Progress refresh (ops for anh: tzutil ·
  chuong.phan typo · 5 staff passwords · lock IT users after real people are assigned) +
  S58-afternoon section covering all 5 batches + chore-flag H2-measured (cicd 41.1KB + inv 32.9KB).
- Harvest (H2 GATE PASS 5/5): cicd #386 supersede-chain entry + #383 marked
  "VỊ TRÍ LẠC" (misplaced) so a curate-sweep does not remove it by mistake (P2) + investigator tag normalize s58
  (P5) + tooling-auditor H1-end on-behalf (return-cut partial; finding
  salvaged: docs verified-flushed) + harvest-curator H2-end entry.
- RAG: +1 chunk S58 key facts (1153b74b, rerank 0.898 retrievable).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 14:33:20 +07:00

# Error-Ledger — SOLUTION_ERP (Gov-v2 §L keystone)
> **Living artifact.** Blameless RCA + Active-Guards index for SE. Closes the open delta from adap-report `2026-06-02-Governance-gov-v2-session-cmd-framework` (the only Gov-v2 floor item SE had distributed-but-not-formalized).
> **Maintained at `/session-end` §L.b** (deterministic step, not a daemon — G-015). Blameless = root-cause + guard, NOT blame.
## 📐 The 3-ledger triad (Gov-v2 §L.b / §G3: merged form, function intact)
SE maps the mandated 3 living ledgers onto existing + new artifacts (§F4 form-freedom):
| Ledger (function) | SE artifact | Role |
|---|---|---|
| **(i) error-ledger** | **this file** (`docs/governance/error-ledger.md`) | RCA blameless · Active-Guards index · 3-axis tag · 2-strike promote |
| **(ii) comms-ledger** | `docs/governance/README.md` "Cross-Project Adoption Ledger" + `docs/governance/adap-reports/` | 2-way cross-project OUT→ACK / IN→decided, link-not-copy |
| **(iii) summary-index** | `docs/STATUS.md` "Recently Done" + `docs/changelog/sessions/` | timeline spine, pointer-not-log, reverse-chron |
## 🔍 §L.a — Deterministic detect (action-signature scan @ session-end)
Detect by **action-signature** (NOT "the AI judging for itself whether a violation occurred"). Scan the session for these; each hit → an RCA entry below. List is **open**: extend it when a new class appears. (G-015: catches signatures in this list, NOT "every violation".)
| # | Action-signature (grep/observe) | Rule it violates | On hit |
|---|---|---|---|
| AS-1 | `git add -A` / `git add .` | add-specific-files (concurrency safety, `feedback_rag_mcp_recovery_concurrency`) | RCA + re-stage specific |
| AS-2 | `--no-verify` / `--no-gpg-sign` / `commit.gpgsign=false` | no hook/sign bypass unless asked | RCA, justify or revert |
| AS-3 | sub-agent invokes `store_memory` | lead = sole RAG-writer (S47, mechanized) | should be impossible (allowlist-stripped); if chunk-count jumps w/o lead write → investigate |
| AS-4 | EF Mig adds UNIQUE/composite index on a soft-delete (`IsDeleted`) entity **without** `.HasFilter("[IsDeleted]=0")` | gotcha #57 (recreate-on-soft-deleted-slot → 500) | RCA + test-before + filter |
| AS-5 | heavy/long agent spawn in **foreground** | `feedback_background_spawn_visibility` (looks-frozen) | note; prefer `run_in_background` |
| AS-6 | docs-only commit that triggers a CI run | gotcha #41 path-filter (`paths-ignore`) | verify path-filter intact |
| AS-7 | model downgrade (haiku/sonnet) on codegen/guard/financial/security | critical-algo needs Max tier | RCA, re-run on Max |
| AS-8 | session-end memory `.md` Write leaving **0 bytes** | `feedback_session_end_memory_write_verify` (S46) | re-write + verify byte>0 |
| AS-9 | A/B/C choice handed to anh **without** the decision-brief axes | Gov-v2 §G2 | reframe as full brief |
| AS-10 | sub-agent writes a tracked file (MEMORY.md / code) despite **R1 return-only** (Write/Bash residual) | R1 return-only (HMW) — prompt-rule, NOT mechanized (G-015) | git-diff post-P2 catch → lead VERIFY benign+accurate+placement → keep or revert (NOT a bug if correct; chunk-count for RAG-write) |
| AS-11 | cross-stack feature: BE validator/nullability ≠ FE required-marker for the SAME field | em-main shared-contract consistency (E-007) | RCA + align FE↔BE + reviewer-gate (held S51) |
| AS-12 | identifier-based data op on prod (lock/seed/migrate-by-email/code) written from a population read from CODE/Dev, WITHOUT dumping the target-env table | gotcha #60 (E-008): an assertion returning 0-row/`-1` ⟹ suspect data-mismatch BEFORE code-bug | RCA + dump the target env before writing the list + seed password satisfies the strictest policy of any env |
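
A minimal sketch of how the grep-able signatures above could be scanned mechanically, assuming the session's shell commands are available as a list of strings. `SIGNATURES`, `scan_commands` and `session_log` are hypothetical names for illustration, and only AS-1 and AS-2 are covered; the observe-only signatures (AS-5, AS-9, …) do not reduce to a regex.

```python
import re

# Hypothetical subset of the action-signature table: ID -> compiled pattern.
SIGNATURES = {
    # AS-1: `git add -A`, `git add --all`, or `git add .` (but not `git add ./path`).
    "AS-1": re.compile(r"\bgit\s+add\s+(-A\b|--all\b|\.(\s|$))"),
    # AS-2: hook or signing bypass flags.
    "AS-2": re.compile(r"--no-verify|--no-gpg-sign|commit\.gpgsign=false"),
}


def scan_commands(commands):
    """Return (signature_id, command) pairs for every command that matches."""
    hits = []
    for command in commands:
        for sig_id, pattern in SIGNATURES.items():
            if pattern.search(command):
                hits.append((sig_id, command))
    return hits


# Illustrative input only, not a real session transcript.
session_log = [
    "git add docs/governance/error-ledger.md",
    "git add -A",
    "git commit --no-verify -m 'wip'",
    "git add .",
]

for sig_id, command in scan_commands(session_log):
    print(f"{sig_id}: {command}")
```

Each printed hit corresponds to one RCA entry to write; a command that stages a specific file produces no hit.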
## 🛡️ Active-Guards index (2-strike promote: episodic → procedural)
> **net-effect rule:** a guard that costs more than it saves (harm > benefit) → **retire**. `verified` = ran ≥1× and held. `strikes` = times the underlying error recurred before the guard.
| Guard | Counters | Tier | Strikes | Verified | Net |
|---|---|---|---|---|---|
| CI `paths-ignore` docs-only skip | gotcha #41 (AS-6) | procedural | 2 | ✅ (every docs commit 0s) | +++ |
| em-main verify-on-disk + proxy-append after agent return | gotcha #53 truncation | procedural | 5× (S35-S42) | ✅ | +++ |
| test-before bug-fix + soft-delete-UNIQUE `.HasFilter` | gotcha #57 (AS-4) | procedural | 3 (Holiday S45 · LeaveType/Shift/OtPolicy S51) | ✅ Mig 43 + Mig 45 (5 test RED→GREEN) | ++ |
| reviewer pre-commit on cross-stack / wire-BE-CRUD (contract-mismatch net) | E-007 (AS-11) | procedural | 1 (S51 Driver FE↔BE) | ✅ S51 (caught pre-commit, fixed before deploy) | ++ |
| authz regression test per-action policy | gotcha #44 silent-403 | procedural | 1 (promoted S45 +10 test) | ✅ | ++ |
| agent frontmatter `model: inherit` (not `[1m]`) | gotcha #37 | procedural | — | ✅ (FD agent loaded S48) | ++ |
| **lead = sole RAG-writer** (`store_memory` stripped, mechanized) | store_memory rebootstrap-loss (S41) + AS-3 | procedural | 2 (NamGroup + SE S41) | ✅ runtime S48 (0/8 subs) | +++ (failure-safe) |
| session-end verify memory byte>0 | S46 0-byte (AS-8) | procedural | 1 (S46) | ✅ S49 (new mem 2355B + 0 byte-0 scan) | ++ |
| **git-diff + chunk-count post-P2 containment** (defense-in-depth, HMW) | R1 sub-write residual (AS-10) · store_memory bypass (AS-3) | **procedural** (institutionalized S50 = standard B6 post-wave audit) | 1 (S49) | ✅ S49 (caught inv-api self-MEMORY in git-diff; chunk 2414=2414) + **S50 wave `h2-verify` (git-diff agent-memory EMPTY, chunk 2415=2415, 0 leak)** | ++ (G-015 honest — NOT allowlist-alone) |
| heavy spawn → `run_in_background` | looks-frozen | **procedural** (2-strike met) | 2 (S45, S48) | ✅ S48 (FD bg) + S50 (all 4 monitor+wave spawns bg) | + |
| RAG glob `**/`-anchored (not root) | gotcha #10 node_modules leak | procedural | 1 (S41) | ✅ (2406 clean) | ++ |
| dump the target-env table BEFORE an identifier-based data op (lock/seed-by-email) | gotcha #60 (AS-12) | episodic | 1 (S57bis lock NO-OP) | ✅ S58 (recon dump → fix `5998163` → Run #382 measured 34 locked) | ++ |
## 📋 RCA entries (blameless — newest on top)
> Format: `E-NNN | date | rule | what | 5-why root | fix (prod-bug = 2-fix: code + guard) | prevention | tags[TYPE/ACTOR/COMPONENT]`
### E-008 — AS-12 lock-demo-user prod NO-OP: population Dev ≠ prod + seed silent-fail (S57bis ship, S58 fix, cicd-caught)
- **rule (AS-12 NEW):** an identifier-based data op on prod (lock/seed/migrate-by-email) whose list is written from the CODE/Dev population, WITHOUT dumping the target-env table → silent NO-OP/wrong target. An assertion returning 0-row/`-1` ⟹ suspect data-mismatch BEFORE suspecting code.
- **what:** S57bis shipped `LockDemoSampleUsersAsync` with 14 named-person emails (read from seed code = Dev-only population). The real prod demo population = 20 UAT-matrix users (`bod.1@`, `pm.nv@`… created BY HAND on 05-13, never in code). Run #381 deploy PASS + health 200 + code RAN → locked=0, completely silent. Layer 2 was hidden deeper: `DemoUserPassword` is 11 characters < prod `Identity:Password:RequiredLength=12` → `CreateAsync` returned `IdentityResult.Failed` (LogWarning-only, by-design one-failure-does-not-abort) **on every startup to date** → the named-person users + `nv.cao`/`nv.truong` (IT pool, the root cause of "helpdesk inert" in S56!) + 5 real staff NEVER existed on prod.
- **5-why:** author trusted seed code as source-of-truth → population Dev ≠ prod → password-policy silent-fail → silent because `IdentityResult` does not throw → warning log on prod that nobody reads → only the cicd #381 data-dump (PASS+PARTIAL) caught it → green tests + CI gate + health 200 are all blind to data-absence. **Why-0 (RAG-archaeology S58):** this bug WAS ALREADY discovered in S22 (2026-05-13, session log says "Identity password policy 12 → existing memory mention `User@123456` 11 chars OUTDATED"; the 20 UAT users were seeded with `TestUser@2026`, which satisfies the 12-character policy) but the `DemoUserPassword` const in code was NOT fixed at the time → the knowledge sat in the session-log and never became a code-fix/guard → recurred in S57bis. Lesson: a discovery must become a code-fix OR a ledger-guard immediately; session-log alone = dead.
- **fix (prod-bug = 2-fix):** (code) `5998163` unions in the 20 prod-population emails (exact-email, NOT a pattern: `binh.le@` is a real person close to the demo scheme) + a password that satisfies the 12-character policy → Run #382 measured for real: 55 users / 34 locked / helpdesk alive / 5 staff created / guard 6-6 active. (guard) gotcha **#60** + debug-checklist item 32 + cicd LESSON "lock/deactivate-by-email returns 0 → ALWAYS dump the actual Users before scoring FAIL" + new episodic Active-Guard (dump-target-env).
- **prevention/guard:** every identifier-based op → dump the target env BEFORE writing the list; the seed password const satisfies the STRICTEST policy of any env (prod 12); grep the warning log after deploying a new user-seed. AS-12 added to §L.a.
- **tags:** [seed-silent-fail+population-mismatch / em-main-S57bis-author · cicd-caught · recon-grounded / DbInitializer]
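
A minimal sketch of the two E-008 guards as pure functions, assuming the target-env user emails have already been dumped into a list. The function names, the `example.test` addresses and the Dev minimum length of 8 are hypothetical; only the prod minimum of 12 and the two sample passwords come from the entry above. Only length is checked here; a real Identity policy may also enforce character classes.

```python
def plan_identifier_op(requested_emails, target_env_emails):
    """Split a requested list into emails present in the target env and emails absent from it."""
    target = {email.lower() for email in target_env_emails}
    found = [e for e in requested_emails if e.lower() in target]
    missing = [e for e in requested_emails if e.lower() not in target]
    return found, missing


def seed_password_ok(password, min_length_by_env):
    """True only if the password meets the strictest (largest) minimum length of any env."""
    return len(password) >= max(min_length_by_env.values())


# Hypothetical data: the list written from code vs. the dump of the target env.
requested = ["named.person@example.test", "demo.a@example.test"]
target_dump = ["demo.a@example.test", "demo.b@example.test"]
found, missing = plan_identifier_op(requested, target_dump)
print("would lock:", found)
print("absent in target env:", missing)

# Dev value is an assumption; prod 12 is from the ledger entry.
policy = {"dev": 8, "prod": 12}
print(seed_password_ok("User@123456", policy))    # 11 characters
print(seed_password_ok("TestUser@2026", policy))  # 13 characters
```

A non-empty `missing` list before the op ships is the signal that the list was written from the wrong population.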
### E-007 — AS-11 parallel-fan-out shared-contract mismatch (S51, reviewer-caught pre-commit)
- **rule (AS-11 NEW):** cross-stack feature fan-out where BE field nullability/validator ≠ FE required-marker for the SAME field → contract mismatch (empty submit → 400/500). Em-main shared-contract must spec required/optional consistently on BOTH sides.
- **what:** P11-C BE+FE parallel (file-disjoint) spawn. Driver `phoneNumber/licenseNumber/licenseClass`: BE `NotEmpty()` validator + EF `.IsRequired()` NOT NULL, but FE KIND_CONFIG rendered them OPTIONAL (no `required:true`) → `buildBody` empty→null → 400/500. 186 tests GREEN (no test hit the empty-optional path).
- **5-why:** em-main BE brief said "mirror Vehicle (all-required)" but FE brief omitted `required:true` on those 3 → each implementer faithful to its half → inconsistency invisible until integration (file-disjoint parallel = no cross-talk) → green tests ≠ correct contract.
- **fix:** (code) FE +`required:true` on the 3 fields (align to BE all-required, like Vehicle `HrmConfigsPage.tsx:132-134` ×2 app). (guard) reviewer pre-commit on cross-stack = the net that caught it (HELD).
- **prevention/guard:** Active-Guard "reviewer pre-commit on cross-stack/wire-BE-CRUD" (fired correctly) + NEW discipline: em-main cross-stack brief MUST state required/optional explicitly for EACH shared field (BE validator+nullability AND FE required-marker). AS-11 added to §L.a.
- **tags:** [contract-mismatch / em-main-brief+implementer-be+fe / HrmConfigsPage,HrmConfigFeatures]
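
A minimal sketch of the explicit required/optional comparison the new discipline asks for, assuming each side's contract has been written down as a field → required mapping. `contract_mismatches` and the two dictionaries are hypothetical; the three field names are the Driver fields from the entry above, shown in their pre-fix state.

```python
def contract_mismatches(be_required, fe_required):
    """Return the fields whose required/optional status differs between BE and FE.

    A field missing from either mapping is treated as optional on that side.
    """
    fields = sorted(set(be_required) | set(fe_required))
    return [f for f in fields if be_required.get(f, False) != fe_required.get(f, False)]


# BE: NotEmpty() validator + NOT NULL column => required.
be = {"phoneNumber": True, "licenseNumber": True, "licenseClass": True}
# FE before the fix: no `required:true` marker on any of the three.
fe_before = {"phoneNumber": False, "licenseNumber": False, "licenseClass": False}
# FE after the fix: aligned to BE.
fe_after = {"phoneNumber": True, "licenseNumber": True, "licenseClass": True}

print(contract_mismatches(be, fe_before))
print(contract_mismatches(be, fe_after))
```

An empty result is the pass condition; any listed field needs one side changed before the brief goes out.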
### E-006 — AS-10 autonomous monitor write at session-end (S50, git-diff-caught)
- **rule (AS-10):** sub writes a tracked file despite propose-only / R1-return-only (Write/Bash residual) → git-diff catch → lead VERIFY benign+accurate+placement → keep-if-correct or revert.
- **what:** @S50 `/session-end`, `git status` = **14 modified** but em-main personally edited ~7. Non-em-main writes: `error-ledger.md` (2 guard episodic→procedural promotions + E-002 #57 coords), 3 `adap-reports` (nac→verified-runtime), 4 `agent-memory/*` Recent-activity, + `STATUS.md` (Recently-Done-S50 block / In-Progress flip / RAG-line 2406→2415 reconcile). mtimes 00:00-00:05 = session-end monitor window; the 2 INFORM-only monitors (tooling-auditor + harvest-curator) were briefed propose-only and **reported "wrote nothing."**
- **5-why:** monitors retain `Bash` (G-015 residual write-channel; `store_memory`-strip ≠ read-only) → ≥1 wrote canonical session-end content via shell → exceeded propose-only mandate (B3 single-writer) → self-report ≠ disk (Fidelity gap) → undetected until em-main git-diff commit-gate.
- **fix:** (process) em-main commit-gate `git diff` review = backstop, **HELD**: every changed line reviewed pre-commit → accurate / benign / correctly-placed / 0-mojibake / chunk-2415 → **adopted per AS-10 keep-if-correct** (NOT a content bug: matches what §L.b prescribes). (guard) "git-diff + chunk-count post-P2 containment" already promoted procedural this session; AS-10 now has its **first real fire**.
- **prevention/guard:** RECOMMEND (anh / AI_INFRA, charter-v2 infra): harden the monitor tool-grant. `Write/Edit` removal alone leaves a Bash residual → consider a session-end hook blocking sub-Bash-write to tracked paths, OR accept the commit-gate as sufficient defense-in-depth. Fidelity: if monitors write, their reports MUST disclose it; escalate to 🟥 reviewer if it recurs. Provenance is timing-implicated, **not definitively attributable** (no false accusation).
- **tags:** [containment-residual-write / monitor-sub / governance-docs+agent-memory]
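
A minimal sketch of the commit-gate comparison, assuming the output of `git status --porcelain` (v1 format: two status characters, a space, then the path) has been captured as text. `unexpected_writes` and the sample paths are hypothetical; paths that git quotes because of special characters are not handled.

```python
def unexpected_writes(porcelain_output, lead_edited):
    """Return changed paths that the lead did not edit itself."""
    changed = set()
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue  # not a valid "XY path" record
        path = line[3:]
        if " -> " in path:
            # Rename record "old -> new": keep the new path.
            path = path.split(" -> ", 1)[1]
        changed.add(path)
    return sorted(changed - set(lead_edited))


# Illustrative status output only.
status = (
    " M docs/STATUS.md\n"
    " M docs/governance/error-ledger.md\n"
    "?? notes.txt\n"
)
lead = ["docs/governance/error-ledger.md"]
print(unexpected_writes(status, lead))
```

Every path in the result is a candidate AS-10 hit to verify line by line (benign, accurate, correctly placed) before keeping or reverting.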
### E-005 — AS-1 `git add -A` on S49 governance commit (self-caught @session-end §L.a)
- **rule (AS-1):** stage specific files, not `git add -A`/`.` (concurrency safety `feedback_rag_mcp_recovery_concurrency`).
- **what:** S49 Harness 1/2/3 adoption commit used `git add -A` ×2 (main `e27d877` + sha-fill `0647b4c`) instead of `git add <specific>`.
- **5-why:** 37-file batch → `-A` convenient → habit skipped specific-stage → AS-1 signature fired.
- **fix:** (process) MITIGATED: pre-commit `git add -A --dry-run` verified exact 37-file scope + wave-folder-leak=0 + 0 unintended files BEFORE commit; no concurrent SE session running. Scope was correct → no retroactive re-stage needed. (guard) next multi-file commit → `git add <list>` OR dry-run-verify-first (this session did dry-run = acceptable mitigation).
- **prevention/guard:** Active-Guard AS-1 "add-specific or dry-run-verify-first". Blameless: outcome clean, but signature logged for honesty (§L.a = catch the signature, not excuse it).
- **tags:** [git-hygiene / em-main / commit]
### E-004 — gotcha #53 agent truncation mid-MEMORY (recurring S35-S42)
- **rule:** agent must flush MEMORY before return; em main must receive complete work.
- **what:** heavy WRITE-agent (implementer/test-specialist) output truncates mid-MEMORY-update; return looks complete but isn't.
- **5-why:** brief too heavy → spawn output cap hit → truncation at the tail → MEMORY update is last step → silent partial.
- **fix:** (code/process) em main grep-verify-on-disk after return + proxy-append the agent's MEMORY next session (Strategy B, `feedback_implementer_truncation_mitigation`). (guard) brief 8K + Tiered Memory L1 ~30KB cap.
- **prevention/guard:** Active-Guard "verify-on-disk + proxy-append" (promoted, 5 strikes). 529 → em-main solo fallback, no retry-loop.
- **tags:** [process-truncation / sub-agent / agent-memory]
### E-003 — gotcha #44 silent 403 (S18, regression-tested S45)
- **rule:** authorization must fail loud, not silently break UX.
- **what:** class-level `[Authorize(Policy="Workflows.Read")]` → non-admin 403 → TanStack Query catch silent → Drafter saw empty Workspace dropdown, no error.
- **5-why:** broad class-level policy → GET blocked for non-admin → FE swallowed 403 → no surfaced error → looked like "no data".
- **fix:** (code) class-level `[Authorize]` only; GET for any-authenticated; POST/DELETE keep admin policy. (guard) test-specialist authz regression test +10 (S45) reflection-scan per-action policy.
- **prevention/guard:** Active-Guard "authz regression test per-action policy" (promoted S45).
- **tags:** [authz-regression / backend+frontend / ApprovalWorkflowsV2Controller]
### E-002 — gotcha #57 Holiday UNIQUE unfiltered → 500 (S45, fixed Mig 43)
- **rule (AS-4):** soft-delete entity + UNIQUE index MUST `.HasFilter("[IsDeleted]=0")`.
- **what:** `Holidays` DB UNIQUE (Year,Date) unfiltered vs handler `!IsDeleted` → admin delete + re-add same-date holiday = reachable 500.
- **5-why:** UNIQUE created unfiltered → soft-deleted row keeps the slot → handler allows logical re-create → INSERT hits dead UNIQUE → 500.
- **fix:** (code) Mig 43 `.HasFilter("[IsDeleted]=0")` (matches 13× existing pattern). (guard) Gap1 test-before reproduced the 500 first.
- **prevention/guard:** Active-Guard AS-4 + test-before. **RESOLVED S51 (Mig 45 `FilterHrmCatalogUniqueIndexesByIsDeleted`):** LeaveType + ShiftPattern + **OtPolicy** (OtPolicy was MISSED in "2 catalog" backlog → caught via grep-all-config) now `.HasFilter("[IsDeleted]=0")`; test-before +5 `HrmConfigFilteredUniqueTests` RED→GREEN (guard 2nd strike now verified). **EXT OPEN (worktree session S51, Mig 46):** Department/Supplier/Project (Master GLOBAL query-filter quirk auto-hides soft-deleted → recreate reachable); ContractClause/MeetingRoom/EmployeeProfile = audit-SKIP (not-reachable, investigator S51).
- **tags:** [soft-delete-invariant / em-main+test-specialist / Holidays,LeaveType,ShiftPattern,OtPolicy,(ext)Master]
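
A minimal reproduction of the soft-delete UNIQUE invariant, using SQLite's partial index (`CREATE UNIQUE INDEX … WHERE …`) as a stand-in for the SQL Server filtered index that `.HasFilter("[IsDeleted]=0")` generates. The schema is a simplified assumption, not the project's actual `Holidays` table, and `readd_after_soft_delete` is a hypothetical helper.

```python
import sqlite3


def readd_after_soft_delete(filtered):
    """Insert a holiday, soft-delete it, re-add the same date; return True if the re-add succeeds."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Holidays ("
            "Id INTEGER PRIMARY KEY, Year INTEGER, Date TEXT, "
            "IsDeleted INTEGER NOT NULL DEFAULT 0)"
        )
        # The filtered variant only enforces uniqueness among live (IsDeleted = 0) rows.
        where = " WHERE IsDeleted = 0" if filtered else ""
        conn.execute(
            f"CREATE UNIQUE INDEX IX_Holidays_Year_Date ON Holidays (Year, Date){where}"
        )
        conn.execute("INSERT INTO Holidays (Year, Date) VALUES (2026, '2026-01-01')")
        conn.execute("UPDATE Holidays SET IsDeleted = 1 WHERE Date = '2026-01-01'")
        try:
            conn.execute("INSERT INTO Holidays (Year, Date) VALUES (2026, '2026-01-01')")
            return True
        except sqlite3.IntegrityError:
            # The soft-deleted row still occupies the unique slot.
            return False
    finally:
        conn.close()


print("unfiltered index, re-add ok:", readd_after_soft_delete(filtered=False))
print("filtered index, re-add ok:", readd_after_soft_delete(filtered=True))
```

The unfiltered case is the test-before step (it reproduces the failure first); the filtered case is the same scenario after the migration.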
### E-001 — S46 user-memory 0-byte (close-out truncation)
- **rule (AS-8):** memory `.md` writes must persist (byte>0); index must not be empty.
- **what:** S45 close-out left `MEMORY.md` index + 1 entry at 0 bytes → S46 bootstrap ran with NO memory auto-inject (silent degrade).
- **5-why:** session-end Write created stub → body Write truncated (gotcha #53) → 0-byte file → not git-tracked (outside repo) → undetected until next bootstrap audit.
- **fix:** (process) rebuilt index + repopulated entry (S46). (guard) `feedback_session_end_memory_write_verify` + now session-end §L.b step (e)/(c) byte-check.
- **prevention/guard:** Active-Guard "session-end verify byte>0" (episodic→promoted S48, wired §L.b). `/session-start` audit also re-checks 0-byte (caught it S46, re-ran clean S48).
- **tags:** [memory-integrity / em-main / user-memory]
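
A minimal sketch of the byte>0 check, assuming the memory files live as `*.md` in a single directory. `zero_byte_files` is a hypothetical helper and the demo runs against a temporary directory, not the real memory location.

```python
import tempfile
from pathlib import Path


def zero_byte_files(directory, pattern="*.md"):
    """Return the names of matching files that exist but contain 0 bytes."""
    return sorted(
        p.name
        for p in Path(directory).glob(pattern)
        if p.is_file() and p.stat().st_size == 0
    )


with tempfile.TemporaryDirectory() as tmp:
    # Simulate the S45 close-out state: an empty index next to a populated entry.
    (Path(tmp) / "MEMORY.md").write_text("")
    (Path(tmp) / "entry.md").write_text("- S46: index rebuilt\n")
    print(zero_byte_files(tmp))  # prints ['MEMORY.md']
```

A non-empty result at session-end means re-write and re-check before closing; the same call can back the `/session-start` audit.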
---
> **Maintenance:** append RCA on each AS-hit; promote a guard to `procedural` on its 2nd strike; mark `verified` once it holds through a session; retire by net-effect. Pointer entries only — full narrative lives in session-logs (summary-index).