solution-erp/docs/governance/adap-reports/2026-06-10-Governance-harness-4-model-tier-promotion.md
pqhuy1987 17b23a418a
[CLAUDE] Docs: Harness-4 two-tier runtime-VERIFIED (two-way spawn-test post-restart) + email-back AI_INFRA
- Two-way spawn-test, S57bis: H1 tooling-auditor (demote pin) self-reports claude-opus-4-8[1m] + H2 harvest-curator (promote inherit) self-reports claude-fable-5[1m] → rung moves from executed-file/PENDING-RESTART to RUNTIME-VERIFIED (adap-report §2/§5 + STATUS row). The [1m] 1M-resolve is now verified by SE itself.
- Email update 2026-06-11-se-to-ai_infra-harness-4-runtime-verified (nac sent, sha ecf1d587, honest n=1 per direction, hmw.js stays at executed-file) + _index OUTBOUND.
- Env lesson: the CCD harness caches agent frontmatter; changes take effect only after a CLI restart (2 data points, 06-10/06-11).
- Carried from the 06-10 bundle: 7 agents pinned to opus-4-8 + 4 inherit + hmw.js tier-map H4.5 + agents/README two-tier + 2 adap-reports + 06-10 email + agent-memory delta (KEEP-ALL-5, H2-verified) + investigator L1→L2 archive curation.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 12:12:22 +07:00


# adap-report — 2026-06-10-Governance-harness-4-model-tier-promotion
> SISTER = SOLUTION_ERP. Report format LOCKED (5 fields). Generated 2026-06-10 (S57-resume), applied solo by the main agent (governance file-work, single-writer; 2 demoted spawn-tests + 1 reviewer-gate spawn). Refines/supersedes the SUB part of `model-fable-5-max` (AM, same day).
## 1. id-broadcast
`2026-06-10-Governance-harness-4-model-tier-promotion` (from: ai_infra · category: **Governance** · reviewer_gate: **PASS_WITH_FIXES ×2 applied** · nac: published · targets: **all-fit** · content_sha256 `3e9b9461…`). HARNESS-4 two-tier: the lead stays on Fable 5 (1M) Max · sub-agents are demoted by default and pinned to `claude-opus-4-8` (Max) · ONLY promote-list roles justified under criteria H4.3 (a-d) keep `inherit` · workflow tier-map H4.5 · email-back to AI_INFRA H4.7 is MANDATORY · honest-claim H4.8.
## 2. nac G-011
**RUNTIME-VERIFIED (2026-06-11: two-way spawn-test post-restart, S57bis).** H1 tooling-auditor (demote-list) self-reported verbatim "Opus 4.8 (1M context) · `claude-opus-4-8[1m]`" → **the pin TAKES EFFECT at runtime**, and the `[1m]` 1M-resolve is now verified by SE ITSELF (no longer dependent on the AI_INFRA s20 claim). H2 harvest-curator (promote `inherit`) self-reported `claude-fable-5[1m]` → **inherit keeps Fable 5 UNCHANGED**. Lead Fable 5 (1M) Max is live. *History trail:* **EXECUTED-FILE (2026-06-10) · runtime-PENDING-RESTART.** 11 files edited and grep-verified. **The n=2 demoted spawn-test (H4.8) was run IMMEDIATELY and exposed an SE-env divergence:** both agents self-reported `claude-fable-5[1m]` (NOT opus-4-8) → **the SE env (CCD harness) does NOT fresh-read agent frontmatter on each spawn**; the registry caches definitions at session start (consistent with the known SE facts from S47/S52: agent `.md` files do not hot-reload and take effect only after a CLI restart). This differs from AI_INFRA s20 (fresh-read IMMEDIATELY). → The demote pin was therefore executed-file, effective at runtime only AFTER a restart; only a **post-restart re-spawn-test** could raise it to runtime-verified. **Lead Fable 5 = verified-runtime** (live session). **Promote-tier inherit→Fable = verified-live** (n=2 spawns on 06-10 via the inherit chain; the promote definition did not change, so post-restart behavior stays the same).
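
The rung decision above can be sketched roughly as follows. This is only an illustration: the names `baseModel` and `nacRung`, the `{ tier, selfReport }` record shape, and the rule "at least one matching spawn per direction" are assumptions made for the example, not part of the harness or of G-011.

```javascript
// Hypothetical sketch: derive the G-011 rung from spawn-test self-reports.
// Record shape and function names are assumptions, not harness APIs.

// Strip an optional context-variant suffix such as "[1m]" from a model id.
function baseModel(selfReport) {
  return selfReport.replace(/\[[^\]]*\]$/, '');
}

// spawns: array of { tier: 'demote' | 'promote', selfReport: string }
function nacRung(spawns, pin = 'claude-opus-4-8', lead = 'claude-fable-5') {
  const demoted = spawns.filter((s) => s.tier === 'demote');
  const promoted = spawns.filter((s) => s.tier === 'promote');
  // A direction counts only if it has at least one spawn and all of them match.
  const ok = (list, expected) =>
    list.length > 0 && list.every((s) => baseModel(s.selfReport) === expected);
  return ok(demoted, pin) && ok(promoted, lead)
    ? 'RUNTIME-VERIFIED'
    : 'EXECUTED-FILE / PENDING-RESTART';
}

// 06-10: both demoted spawns still reported the cached lead model.
const day0610 = [
  { tier: 'demote', selfReport: 'claude-fable-5[1m]' },
  { tier: 'demote', selfReport: 'claude-fable-5[1m]' },
];
// 06-11 (post-restart): one spawn per direction.
const day0611 = [
  { tier: 'demote', selfReport: 'claude-opus-4-8[1m]' },
  { tier: 'promote', selfReport: 'claude-fable-5[1m]' },
];

console.log(nacRung(day0610)); // EXECUTED-FILE / PENDING-RESTART
console.log(nacRung(day0611)); // RUNTIME-VERIFIED
```

The suffix is stripped before comparison because the frontmatter pins `claude-opus-4-8` while the agent self-reports the context variant `claude-opus-4-8[1m]`.
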
## 3. evidence
**PROMOTION-LIST (4, keeping `model: inherit` = Fable 5 1M), justification + evidence per position:**
| Position | Criterion | Justification | Track-record evidence |
|---|---|---|---|
| 🟥 reviewer | **(a) gate≥writer** | final adversarial pre-commit check after the lead; a gate weaker than the writer lets the writer slip past it unintentionally | S51 caught 1 MAJOR pre-commit (Driver FE↔BE required-mismatch) · S54 chain-verified the real role-string "Admin" (disarmed the QTV decoy) · S49 gated 3 adap-reports |
| 🟦 investigator-codebase | **(b) verdict-feeds-decisions** | recon/audit verdicts that the main agent and the owner act on; one wrong verdict propagates down the whole chain | S56 pre-go-live verdict (caught dept-IT with 0 active users in prod, invisible to green tests) · S57 flagged "master-write-open" → became a real authz fix across 3 controllers |
| 🔵 database-agent | **(b) verdict-class DB-lens** | read-advisory verdicts on schema/migration/concurrency feed migration-design decisions | S53 first spawn caught Mig 46 committed-but-unapplied-local (203 SQLite tests + CI-prod both MISSED it) · S56 review MAJOR → bumped to Serializable (gotcha #58 fix) |
| ⬜ harvest-curator | **(c) anti-rubber-stamp** | the fidelity gate must DARE to refuse to confirm the LEAD'S OWN mistakes | S56 GATE scored 4.5/5 (did NOT rubber-stamp 5/5), forcing the main agent to append the Serializable correction to impl/test MEMORY before closing |
**DEMOTE-LIST (7, pinned `model: claude-opus-4-8`, H4.4):** 🟨 implementer-backend + 🟧 implementer-frontend (deterministic scaffolding from a clean spec, backed by a reviewer+test+cicd double gate) · 🟪 test-specialist (observable RED→GREEN artifacts, reviewer gate behind it) · 🟩 cicd-monitor (checklist-verifies deploys, deterministic curl/sqlcmd evidence; AI_INFRA also demotes its equivalent agent) · 🟫 tooling-auditor (checklist-class, propose-only; AI_INFRA demotes this same agent) · 🟦 investigator-api (external research gathering, few high-stakes verdicts; re-assess if a verdict track record emerges) · 🩷 frontend-designer (execute-layer design, with FD-rubric + screenshot-loop + reviewer gate behind it; keeps its existing `effort: max`).
**WORKFLOW tier-map H4.5 (`hmw.js`):** `resolveModel(role, tier, i)`: promote-roles (investigator-codebase·reviewer·database-agent ∈ VALID_ROLES) → `undefined` (inherit Fable via frontmatter) · demoted-roles → `undefined` (the frontmatter pin handles it) · **role-less → `'opus'`** (DELIBERATELY sweep-class: the lead authors the taskList for each run, so tasks are already classified at authoring time, mirroring the reasoning of the AI_INFRA worked-example) + a log line reminding to declare `tier:'fable'` for high-stakes tasks · **per-task `tier:'fable'|'opus'` override** (unknown tier → WARN + default). meta.description is synced.
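
A minimal sketch of how such a tier-map could behave, for illustration only. It is not the actual `hmw.js` source. Assumptions: `VALID_ROLES` is taken to be the three promote-roles named above plus the seven demoted roles, `tier:'fable'` is modelled as returning `undefined` (inherit the lead model), the override is assumed to win over the role, and the `log` parameter is added only to make the example testable.

```javascript
// Hypothetical sketch of the H4.5 tier-map; NOT the real hmw.js code.
const PROMOTE_ROLES = ['investigator-codebase', 'reviewer', 'database-agent'];
const DEMOTED_ROLES = [
  'implementer-backend', 'implementer-frontend', 'test-specialist',
  'cicd-monitor', 'tooling-auditor', 'investigator-api', 'frontend-designer',
];
// Assumption: VALID_ROLES is the union of both lists.
const VALID_ROLES = new Set([...PROMOTE_ROLES, ...DEMOTED_ROLES]);
const VALID_TIERS = new Set(['fable', 'opus']);

// Returns a model name for the spawn call, or undefined to let the agent's
// frontmatter decide (inherit for promote-roles, pin for demoted roles).
function resolveModel(role, tier, i, log = console.warn) {
  // 1. Explicit per-task override (assumed to win over the role).
  if (tier !== undefined) {
    if (VALID_TIERS.has(tier)) {
      // Assumption: 'fable' means "inherit the lead model".
      return tier === 'opus' ? 'opus' : undefined;
    }
    log(`[task ${i}] WARN unknown tier "${tier}", using default`);
  }
  // 2. Task with a role: the frontmatter handles the model.
  if (role) {
    if (!VALID_ROLES.has(role)) {
      // A typo fails UP to inherit instead of silently dropping to 'opus'.
      log(`[task ${i}] WARN unknown role "${role}", failing up to inherit`);
    }
    return undefined;
  }
  // 3. Role-less task: deliberately sweep-class.
  log(`[task ${i}] role-less task uses 'opus'; declare tier:'fable' if high-stakes`);
  return 'opus';
}

console.log(resolveModel('reviewer', undefined, 0));  // undefined
console.log(resolveModel(undefined, undefined, 1));   // opus
console.log(resolveModel(undefined, 'fable', 2));     // undefined
```

Returning `undefined` for a mistyped role is the safer failure mode here: the worst case is spending lead-tier quota on a task, not running a high-stakes gate on the weaker tier.
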
**Files edited (11 core + 3 bookkeeping):** 7× `.claude/agents/{implementer-backend,implementer-frontend,test-specialist,cicd-monitor,tooling-auditor,investigator-api,frontend-designer}.md` (`model: inherit` → `claude-opus-4-8`, effort/tools KEPT) · `hmw.js` (tier-map + meta + args-comment + agent-call; invalid-role typo → fail-UP to inherit per reviewer #7) · `agents/README.md` (header tier + upgrade-note 06-10 + ASCII main-agent Fable + spawn-line) · `STATUS.md` Sub-agents row (attribution per reviewer #2) · `ultra-on.md` P2 row. Bookkeeping: 2 adap-reports + 1 email outbox.
**SELF-CHECK broadcast:** lead Fable 5 (1M) Max live ✓ · justification per position (table above, NOT a bare list) ✓ · grep `model:` = exactly 7 `claude-opus-4-8` + 4 `inherit`, 0 files with `[1m]` in frontmatter ✓ · workflow gate=inherit/sweep='opus' ✓ · spawn-test run (n=2 demoted; result = env-divergence finding, see §2) ✓ · email `/send-email ai_infra` SENT (`broadcasts/outbox/ai_infra/2026-06-10-se-to-ai_infra-harness-4-adopt-report.md`, hash-verified) ✓ · G-011 rung stated correctly (executed-file, NO runtime-verified claim) ✓.
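
The `model:` grep in the self-check could be automated along these lines. This is a sketch under assumptions: `frontmatterModel` and `auditModelTiers` are hypothetical helper names, agent files are assumed to start with a `---` delimited YAML frontmatter block, and file contents are passed in as strings so the example stays self-contained.

```javascript
// Hypothetical sketch of the frontmatter audit; helper names are illustrative.

// Extract the `model:` value from a leading `---` frontmatter block, or null.
function frontmatterModel(text) {
  const fm = /^---\r?\n([\s\S]*?)\r?\n---/.exec(text);
  if (!fm) return null;
  const line = /^model:\s*(\S+)\s*$/m.exec(fm[1]);
  return line ? line[1] : null;
}

// files: object mapping file name -> file content.
function auditModelTiers(files, pin = 'claude-opus-4-8') {
  const report = { pinned: [], inherit: [], contextSuffix: [], other: [] };
  for (const [name, text] of Object.entries(files)) {
    const model = frontmatterModel(text);
    // A "[1m]" suffix in frontmatter is flagged: the report expects 0 of these.
    if (model !== null && model.includes('[1m]')) report.contextSuffix.push(name);
    if (model === pin) report.pinned.push(name);
    else if (model === 'inherit') report.inherit.push(name);
    else report.other.push(name);
  }
  return report;
}

// Two-file sample; the real roster would be 11 files (7 pinned + 4 inherit).
const sample = {
  'tooling-auditor.md':
    '---\nname: tooling-auditor\nmodel: claude-opus-4-8\neffort: max\n---\nbody',
  'reviewer.md':
    '---\nname: reviewer\nmodel: inherit\n---\nmentions claude-opus-4-8[1m] in body',
};
const report = auditModelTiers(sample);
console.log(report.pinned.length, report.inherit.length, report.contextSuffix.length);
```

Only the frontmatter block is inspected, so a `[1m]` mention in an agent's body text does not count against the "0 files with `[1m]` in frontmatter" check.
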
**🟥 reviewer-gate (pre-send + pre-commit): PASS-with-fixes, ALL applied.** 0 CRITICAL / 0 MAJOR floor-violations · 1 REQUIRED (sequencing: real hash before `nac: sent`; enforced) + 6 MINOR (STATUS attribution · hmw.js log banner "same-model"→two-tier · meta 8→9-agent · files-count made consistent · `[1m]` body-match count updated · email ④ cites direct evidence) + #7 design-fix (invalid-role typo → fail-UP to inherit, does NOT fall to 'opus'). **Reviewer self-report `claude-fable-5[1m]` = DIRECT promote-tier evidence, n=1** (a real promote-list member, stronger than the n=2 inherit-chain from demoted agents served from cache). Independent cross-verification: 8/8 evidence claims match HANDOFF/STATUS/memory/working-tree, 0 fabricated claims.
## 4. what-was-tailored + what-was-skipped-and-why
- **FUNCTION-floor adopted FULLY:** H4.1→H4.8 all kept (lead not demoted · criteria a-d verbatim · email-back · honest-claim).
- **FORM tailored for SE:** (a) roles mapped by function-class onto the SE roster of 11 (AI_INFRA names not copied): promote 4/11, of which database-agent is an SE-specific position (absent from the AI_INFRA worked-example); (b) harvest-curator promoted under criterion (c) even though AI_INFRA only notes "gray-zone-leaning-keep", because the SE evidence from S56 (4.5/5) is strong enough; (c) hmw.js role-less default `'opus'` = exactly the "taskList authored by the lead per run" branch of the worked-example (NOT the "default leans inherit" branch); (d) frontend-designer keeps its own `effort: max` frontmatter (demoting the model does NOT touch effort).
- **SKIP:** no by-design Sonnet helper (n/a) · no project-pinned model to remove (n/a, unlike VIPIX).
## 5. honest-caveat
- **Rung = RUNTIME-VERIFIED 06-11** (G-011). ✅ RESOLVED post-restart: spawn-test demote n=1 (tooling-auditor `claude-opus-4-8[1m]`) + promote n=1 (harvest-curator `claude-fable-5[1m]`), S57bis. History 06-10: executed-file · runtime-PENDING-RESTART. SE env ≠ AI_INFRA env: the n=2 spawn-test PROVED that frontmatter is NOT fresh-read on the CCD harness (both demoted agents still self-reported `claude-fable-5[1m]`). Only after a CLI restart and a re-spawn-test with ≥2 demoted agents self-reporting `claude-opus-4-8*` would runtime-verified be claimed. **This finding contradicts AI_INFRA s20 "fresh-read, no restart needed"**; it was reported in email H4.7 item ⑤ (surface-the-need §K; the broadcast itself warns "do not inherit conclusions blindly", and rightly so).
- **Context-variant for demoted agents: ✅ VERIFIED 06-11.** tooling-auditor self-reported "Opus 4.8 (1M context)" → `[1m]` resolving to 1M at runtime is confirmed on SE (previously only an AI_INFRA s20 claim).
- **Effort, both tiers:** env is machine-wide `max`; it CANNOT be introspected from inside an agent (H4.8), so no "verified effort" is claimed.
- **Foreseeable demote risk:** if verdict quality from tooling-auditor/cicd-monitor drops noticeably after a few sessions → file an adap-request proposing re-promotion (monitored via their own H1/H2 + the main agent).
- **Quota note: ✅ RESOLVED 06-11.** Restart done, the demote pin takes effect at runtime → two-tier savings begin from S57bis.