solution-erp/docs/governance/adap-reports/2026-06-10-Governance-harness-4-model-tier-promotion.md
pqhuy1987 17b23a418a
[CLAUDE] Docs: Harness-4 two-tier runtime-VERIFIED (two-way spawn-test post-restart) + email-back AI_INFRA
- Two-way spawn-test S57bis: H1 tooling-auditor (demote pin) self-reports claude-opus-4-8[1m] + H2 harvest-curator (promote inherit) self-reports claude-fable-5[1m] → claim level executed-file/PENDING-RESTART → RUNTIME-VERIFIED (adap-report §2/§5 + STATUS row). [1m] 1M-resolve verified by SE itself.
- Email update 2026-06-11-se-to-ai_infra-harness-4-runtime-verified (nac sent, sha ecf1d587, honest n=1 per direction, hmw.js stays at executed-file) + _index OUTBOUND.
- Env lesson: the CCD harness caches agent frontmatter; changes take effect only after a CLI restart (2 data points, 06-10/06-11).
- Bundle carried from 06-10: 7 agents pinned to opus-4-8 + 4 inherit + hmw.js tier-map H4.5 + agents/README two-tier + 2 adap-reports + email 06-10 + agent-memory delta (KEEP-ALL-5 H2-verified) + investigator L1→L2 archive curation.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 12:12:22 +07:00


adap-report — 2026-06-10-Governance-harness-4-model-tier-promotion

SISTER = SOLUTION_ERP. Report-format LOCK (5 fields). Generated 2026-06-10 (S57-resume), applied by the main agent solo (governance file-work, single-writer; 2 demoted spawn-tests + 1 reviewer-gate spawn). Refines/supersedes the SUB part of model-fable-5-max (morning of the same day).

1. id-broadcast

2026-06-10-Governance-harness-4-model-tier-promotion (from: ai_infra · category: Governance · reviewer_gate: PASS_WITH_FIXES ×2 applied · nac: published · targets: all-fit · content_sha256 3e9b9461…). HARNESS-4 two-tier: the lead stays on Fable 5 (1M) Max · subs default to a demote pin of claude-opus-4-8 (Max) · ONLY promote-list members justified under criteria H4.3 (a-d) keep inherit · workflow tier-map H4.5 · email-back to AI_INFRA H4.7 is MANDATORY · honest-claim H4.8.

2. nac G-011

RUNTIME-VERIFIED (2026-06-11, two-way spawn-test post-restart, S57bis). H1 tooling-auditor (demote-list) self-reported verbatim "Opus 4.8 (1M context) · claude-opus-4-8[1m]" → the pin takes effect at runtime, and the [1m] 1M-resolve is verified by SE itself (no longer dependent on the AI_INFRA s20 claim). H2 harvest-curator (promote inherit) self-reported claude-fable-5[1m] → inherit of Fable 5 is preserved. Lead Fable 5 (1M) Max live. History trail: EXECUTED-FILE (2026-06-10) · runtime-PENDING-RESTART. 11 files edited + verified by grep. The spawn-test n=2 demoted (H4.8) was run IMMEDIATELY → it exposed an SE-env divergence: both agents self-reported claude-fable-5[1m] (NOT opus-4-8) → the SE env (CCD harness) does NOT fresh-read agent frontmatter on each spawn; the registry caches definitions at session start (consistent with the SE fact known from S47/S52: agent .md files have no hot reload, changes take effect only after a CLI restart). This differs from AI_INFRA s20 (fresh-read IMMEDIATELY). → The demote pin = executed-file, effective at runtime only AFTER a restart; only a post-restart re-spawn-test raises it to runtime-verified. Lead Fable 5 = verified-runtime (session live). Promote-tier inherit→Fable = verified-live (n=2 spawns on 06-10 via the inherit chain; the promote definition did not change, so post-restart behavior stays the same).
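The claim-level reasoning above can be sketched as a small classifier. This is only an illustration under stated assumptions: the function name `classifyDemotePin`, the lower-case level strings, and the prefix match that ignores context suffixes such as `[1m]` are hypothetical and do not come from hmw.js. The threshold is left as a parameter because this report names both a "≥2 demoted" re-test rule (06-10) and an honest n=1 per direction (06-11).

```javascript
// Hypothetical helper, not part of hmw.js: classify a demote pin's claim level
// from the model IDs that demoted agents self-report in a spawn-test.
// Assumptions: function name, level strings, and prefix matching on the model ID.
function classifyDemotePin(selfReports, pinnedModel = "claude-opus-4-8", minMatches = 1) {
  // Files were edited but nothing has been observed at runtime yet.
  if (selfReports.length === 0) return "executed-file";
  const matches = selfReports.filter((id) => id.startsWith(pinnedModel)).length;
  const allMatch = matches === selfReports.length;
  // Every observed agent must report the pinned model, and enough of them must be observed.
  return allMatch && matches >= minMatches ? "runtime-verified" : "runtime-pending-restart";
}

// 06-10: both demoted agents still reported the lead's model (cached definitions).
console.log(classifyDemotePin(["claude-fable-5[1m]", "claude-fable-5[1m]"]));
// 06-11, post-restart: tooling-auditor reported the pinned model.
console.log(classifyDemotePin(["claude-opus-4-8[1m]"]));
```

Requiring every observed self-report to match, rather than a majority, keeps the sketch on the conservative side of the honest-claim rule (H4.8).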

3. evidence

PROMOTION-LIST (4, keep model: inherit = Fable 5 1M), justification + evidence per position:

| Position | Criterion | Justification | Track-record evidence |
| --- | --- | --- | --- |
| 🟥 reviewer | (a) gate≥writer | Final adversarial pre-commit gate after the lead; a gate weaker than the writer lets the writer slip past it unintentionally | S51 caught 1 MAJOR pre-commit (Driver FE↔BE required-mismatch) · S54 chain-verified the real role-string "Admin" (disarming the QTV decoy) · S49 gated 3 adap-reports |
| 🟦 investigator-codebase | (b) verdict feeds decisions | Recon/audit verdicts that the main agent and the human owner act on; one wrong verdict propagates through the whole chain | S56 pre-go-live verdict (caught dept-IT with 0 active users in prod, which green tests did not reveal) · S57 flag "master-write-open" → became a real authz fix across 3 controllers |
| 🔵 database-agent | (b) verdict-class | DB-lens read-advisory verdicts on schema/migration/concurrency feed migration-design decisions | S53 first spawn caught Mig 46 committed-but-unapplied-local (203 SQLite tests + CI-prod both MISSED it) · S56 review MAJOR → bump to Serializable (gotcha #58 fix) |
| harvest-curator | (c) anti-rubber-stamp | The fidelity gate must DARE to refuse to confirm errors made by the LEAD ITSELF | S56 GATE scored 4.5/5 (NOT a rubber-stamped 5/5), forcing the main agent to append the Serializable correction to impl/test MEMORY before closing |

DEMOTE-LIST (7, pin model: claude-opus-4-8, H4.4): 🟨 implementer-backend + 🟧 implementer-frontend (deterministic scaffolding from clean specs, backed by a double gate of reviewer + test + cicd) · 🟪 test-specialist (observable RED→GREEN artifacts, reviewer gate afterwards) · 🟩 cicd-monitor (checklist-verify of deploys, deterministic curl/sqlcmd evidence; AI_INFRA also demotes its equivalent agent) · 🟫 tooling-auditor (checklist-class, propose-only; AI_INFRA demotes this very agent) · 🟦 investigator-api (external research-gathering, few critical verdicts; re-assess if a verdict track record emerges) · 🩷 frontend-designer (execute-layer design, FD-rubric + screenshot-loop + reviewer gate afterwards; keeps its own existing effort: max).

WORKFLOW tier-map H4.5 (hmw.js): resolveModel(role, tier, i): promote-roles (investigator-codebase · reviewer · database-agent ∈ VALID_ROLES) → undefined (inherit Fable via frontmatter) · demoted-roles → undefined (the frontmatter pin handles it) · role-less → 'opus' (sweep-class, INTENTIONAL: the taskList is authored by the lead for each run, so it is already classified at authoring time, mirroring the reasoning of the AI_INFRA worked example) + a log reminder to declare tier:'fable' for critical tasks · per-task tier:'fable'|'opus' override (unknown tier → WARN + default). meta.description is synced.
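The tier-map rules above, together with the fail-UP rule for mistyped roles recorded under reviewer fix #7, can be sketched as follows. This is a hedged reconstruction, not the real hmw.js: the contents of `VALID_ROLES` beyond the three promote roles, the `'fable'`/`'opus'` return aliases for the per-task override, the injectable `log` sink, and the reading of `i` as a task index used only in log messages are all assumptions.

```javascript
// Hypothetical sketch of the H4.5 tier-map; names and values marked below are assumptions.
const VALID_ROLES = new Set([
  "investigator-codebase", "reviewer", "database-agent",       // promote roles (inherit)
  "implementer-backend", "test-specialist", "tooling-auditor", // demoted roles (illustrative subset)
]);
const VALID_TIERS = new Set(["fable", "opus"]);

// Returns a model alias, or undefined to let the agent's frontmatter decide.
function resolveModel(role, tier, i, log = console.warn) {
  // 1. An explicit per-task tier wins; an unknown tier warns and falls through.
  if (tier !== undefined) {
    if (VALID_TIERS.has(tier)) return tier; // assumption: the alias equals the tier name
    log(`[task ${i}] WARN: unknown tier '${tier}', using default`);
  }
  // 2. Role-less tasks are sweep-class by design and default to 'opus'.
  if (role === undefined || role === null || role === "") {
    log(`[task ${i}] role-less task defaults to 'opus'; declare tier:'fable' if it is critical`);
    return "opus";
  }
  // 3. A mistyped role fails UP to inherit instead of dropping to 'opus'.
  if (!VALID_ROLES.has(role)) {
    log(`[task ${i}] WARN: unknown role '${role}', failing up to inherit`);
  }
  // 4. Known roles return undefined: frontmatter (inherit or pin) handles the rest.
  return undefined;
}
```

Returning `undefined` for every named role keeps a single source of truth in the agent frontmatter, so the workflow never contradicts a pin or an inherit declared there.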

Files edited (11 core + 3 bookkeeping): 7× .claude/agents/{implementer-backend,implementer-frontend,test-specialist,cicd-monitor,tooling-auditor,investigator-api,frontend-designer}.md (model: inherit → claude-opus-4-8, effort/tools KEPT) · hmw.js (tier-map + meta + args-comment + agent-call; invalid-role typo → fail-UP to inherit per reviewer #7) · agents/README.md (header tier + upgrade-note 06-10 + ASCII EM-main Fable + spawn-line) · STATUS.md Sub-agents row (attribution per reviewer #2) · ultra-on.md P2 row. Bookkeeping: 2 adap-reports + 1 email outbox.

SELF-CHECK broadcast: lead Fable 5 (1M) Max live ✓ · per-position justification (table above, NOT a bare list) ✓ · grep model: = exactly 7 claude-opus-4-8 + 4 inherit, 0 files with [1m] in frontmatter ✓ · workflow gate=inherit/sweep='opus' ✓ · spawn-test run (n=2 demoted; result = env-divergence finding, see §2) ✓ · email /send-email ai_infra SENT (broadcasts/outbox/ai_infra/2026-06-10-se-to-ai_infra-harness-4-adopt-report.md, hash-verified) ✓ · G-011 level correct (executed-file, NO runtime-verified claim) ✓.
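The grep-style frontmatter check in the self-check can be approximated with a small tally. This is a sketch under assumptions: the helper names `frontmatterModel` and `tallyModels` are hypothetical, the frontmatter is assumed to be a leading `---` delimited block with a plain `model:` line, and file contents are passed in as strings so the example stays independent of the real `.claude/agents` directory.

```javascript
// Hypothetical helpers: extract `model:` from a leading frontmatter block and tally the values.
function frontmatterModel(text) {
  const fm = /^---\r?\n([\s\S]*?)\r?\n---/.exec(text);
  if (!fm) return null; // no frontmatter block at the top of the file
  const m = /^model:[ \t]*(.+?)[ \t]*$/m.exec(fm[1]);
  return m ? m[1] : null;
}

// `files` maps a file name to its full text (assumption: the caller has read the files).
function tallyModels(files) {
  const tally = { pinned: 0, inherit: 0, oneM: 0, other: [] };
  for (const [name, text] of Object.entries(files)) {
    const model = frontmatterModel(text);
    if (model !== null && model.includes("[1m]")) tally.oneM += 1; // context suffix must not appear in frontmatter
    if (model === "claude-opus-4-8") tally.pinned += 1;
    else if (model === "inherit") tally.inherit += 1;
    else tally.other.push(name); // anything unexpected is surfaced by file name
  }
  return tally;
}

// Illustrative input with two agents only; the real roster has 11.
const sample = {
  "tooling-auditor.md": "---\nmodel: claude-opus-4-8\neffort: max\n---\nbody",
  "reviewer.md": "---\nmodel: inherit\n---\nbody",
};
console.log(tallyModels(sample));
```

For the full roster, the self-check would pass when `pinned` is 7, `inherit` is 4, `oneM` is 0, and `other` is empty.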

🟥 reviewer-gate (pre-send + pre-commit): PASS-with-fixes, ALL applied. 0 CRITICAL / 0 MAJOR floor-violation · 1 REQUIRED (sequencing: real hash before nac: sent, enforced) + 6 MINOR (STATUS attribution · hmw.js log banner "same-model"→two-tier · meta 8→9-agent · consistent files-count · [1m] body-match count updated · email ④ cites direct evidence) + #7 design-fix (invalid-role typo → fail-UP to inherit, NOT falling to 'opus'). Reviewer self-report claude-fable-5[1m] = DIRECT promote-tier evidence n=1 (a real promote-list member, stronger than the inherit-chain n=2 demoted-on-cache). Independent cross-verify: 8/8 evidence claims match HANDOFF/STATUS/memory/working-tree, 0 fabricated claims.

4. what-tailored + what-skipped-and-why

  • FUNCTION-floor adopted FULLY: H4.1→H4.8 all kept (lead not demoted · criteria a-d verbatim · email-back · honest-claim).
  • FORM tailored to SE: (a) roles mapped by function-class onto the SE roster of 11 (AI_INFRA names not copied): promote 4/11, of which database-agent is an SE-specific position (absent from the AI_INFRA worked example); (b) harvest-curator promoted under (c) even though AI_INFRA only notes "gray zone, leaning keep", because the SE evidence S56 4.5/5 is strong enough; (c) hmw.js role-less default 'opus' = the "taskList authored by the lead each run" branch of the worked example (NOT the "default leans inherit" branch); (d) frontend-designer keeps its own effort: max frontmatter (demoting the model does NOT touch effort).
  • SKIP: no by-design Sonnet helper (n/a) · no project-pin model to remove (n/a, unlike VIPIX).

5. honest-caveat

  • Level = RUNTIME-VERIFIED 06-11 (G-011), RESOLVED post-restart: spawn-test demote n=1 (tooling-auditor claude-opus-4-8[1m]) + promote n=1 (harvest-curator claude-fable-5[1m]), S57bis. History 06-10: executed-file · runtime-PENDING-RESTART. SE env ≠ AI_INFRA env: spawn-test n=2 PROVED that frontmatter is NOT fresh-read on the CCD harness (both demoted agents still self-reported claude-fable-5[1m]). After a CLI restart → a re-spawn-test with ≥2 demoted agents self-reporting claude-opus-4-8* is required before claiming runtime-verified. This finding contradicts AI_INFRA s20 "fresh-read, no restart needed"; it was reported in email H4.7 item ⑤ (surface-the-need §K; the broadcast itself warns "do not inherit conclusions blindly", and rightly so).
  • Context variant for demoted agents: VERIFIED 06-11. tooling-auditor self-reported "Opus 4.8 (1M context)" → [1m] runtime-resolves to 1M on SE, confirmed (previously only an AI_INFRA s20 claim).
  • Effort for both tiers: env machine-wide max. It CANNOT be introspected from inside an agent (H4.8), so no "verified effort" claim is made.
  • Visible demote risk: if the verdict quality of tooling-auditor/cicd-monitor clearly drops after a few sessions → an adap-request will propose re-promotion (monitored via their own H1/H2 + the main agent).
  • Quota note: RESOLVED 06-11. Restart done, the demote pin takes effect at runtime → two-tier savings start from S57bis.