[CLAUDE] Docs: H-15 v3 memory-budget full-parity (lead 220K/sub 60K/wf 50K) + spawn-fill-to-budget

Owner directive (anh): bring the memory budget to full AI_INFRA parity, superseding the S82 self-shrink (the AI shrinking its own budget is forbidden by mark RC-...01-58-01).

New spawn_fill_directive: fill each sub-agent's context to its token budget via a RICH spawn prompt. The MEMORY.md byte-cap covers only one slice; the prompt fills the rest.

Also changed: engine adds G.5 and adopt-delta S83; ACTIVE-MARKS records the H-15 v3-delta RC-pqhuy1987-22-06-2026-16-35-37; agents/README retires the <=8K-brief heuristic.

Checks: Detector 26 baseline (0 new), A7 217/217.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 17:06:34 +07:00
parent dfde1acbd0
commit 2c7fd635b9
4 changed files with 28 additions and 13 deletions
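
The arithmetic behind "the MEMORY.md byte-cap is only one slice" can be sketched as follows. This is an illustrative sketch, not part of the commit: the function names are hypothetical, and the 3.3 bytes-per-token ratio is an assumption chosen because it reproduces the "30720B ~9.3K tok" figure quoted in the diff below (the config itself only states a range of roughly 3.0 to 3.5).

```python
# Hypothetical sketch: function names and the bytes-per-token ratio are
# illustrative assumptions, not taken from the repository.
BYTES_PER_TOKEN = 3.3  # assumed; reproduces "30720B ~9.3K tok" from the config note

# Tier-1 budgets as set in this commit (tokens).
TIER1_BUDGET_TOKENS = {
    "lead": 220_000,
    "memory_sub": 60_000,
    "workflow_sub": 50_000,
}
MEMORY_MD_BYTE_CAP = 30_720   # byte soft-cap on a sub's MEMORY.md
MODEL_WINDOW_TOKENS = 1_000_000  # the "1M window" mentioned in lead_note


def estimate_tokens(num_bytes, bytes_per_token=BYTES_PER_TOKEN):
    """Rough token estimate for a file of the given size on disk."""
    return round(num_bytes / bytes_per_token)


def spawn_prompt_room(role, fixed_slice_bytes):
    """Tokens left for the spawn prompt after the byte-capped slices load."""
    used = sum(estimate_tokens(b) for b in fixed_slice_bytes)
    return max(TIER1_BUDGET_TOKENS[role] - used, 0)


def percent_of(part, whole):
    """Share of `whole` taken by `part`, as a percentage with one decimal."""
    return round(100 * part / whole, 1)


memory_md_tokens = estimate_tokens(MEMORY_MD_BYTE_CAP)
room = spawn_prompt_room("memory_sub", [MEMORY_MD_BYTE_CAP])

print("MEMORY.md slice:", memory_md_tokens, "tokens")
print("share of mem-sub budget:",
      percent_of(memory_md_tokens, TIER1_BUDGET_TOKENS["memory_sub"]), "%")
print("room for spawn prompt:", room, "tokens")
print("lead share of window:",
      percent_of(TIER1_BUDGET_TOKENS["lead"], MODEL_WINDOW_TOKENS), "%")
```

Under these assumptions a full MEMORY.md takes only about a sixth of the 60K mem-sub budget, which is why the directive treats the spawn prompt, not the byte-cap, as the part that determines how full a sub's context is.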
--- a/.claude/agent-memory/memory-budget.json
+++ b/.claude/agent-memory/memory-budget.json
@@ -50,13 +50,18 @@
    "_note": "Harness-15-v2 (S82, 2026-06-21): UPDATED by delta broadcast 2026-06-20-Governance-harness-15-v2-hot-feed-update (supersedes_scope = tier-1-sizing + L2/L3-caps ONLY; rest of H15 unchanged). TWO CHANGES vs S81: (1) Tier-1 = HOT-FEED LARGE per-role (was flat 12K -- too thin, caused lead to forget work across sessions); (2) L2/L3 caps REMOVED (on-demand, no artificial tier-limit, bounded only by model context window). Still the SECOND governor (token) ORTHOGONAL to the BYTE governor (tiers/archive_gate above) -- keep BOTH (B(e)); byte measures file-size-on-disk, token measures context-loaded; VN text ~3.0-3.5 byte/tok so byte/4 = upper bound => real headroom LARGER. Budget = MINIMUM-to-USE floor (FILL Tier-1 with real work-state up to the number; under-fill ONLY when high-value content exhausted; NEVER garbage-stuff -- token-saving = forgetting work).",
    "role_boundary_note": "v2 §6 ROLE BOUNDARY (🔴): the budget numbers (Tier-1 per-role cap + per-bucket allocation) are ANH's (project-owner / chu-du-an) RIGHT to set -- NOT the AI-lead's. em-main's job is exactly two parts: (1) EXECUTE the config faithfully (load Tier-1 to the number, no-truncate, pull each bucket to target) + (2) REPORT %-composition at session-start (§2.1.6) and session-end (§L.b(c)) so anh decides. em-main self-measures + proposes numbers; em-main does NOT auto-tune them down. This corrects the S81 'LEAD-AUTHORITY' framing which conflated AI-lead with project-owner.",
    "tier1_hotfeed_tokens": {
-      "_note": "Tier-1 always-loaded HOT-FEED, PER-ROLE (v2: large/generous, do NOT keep thin). FILL with the 4 work buckets: (1) WIP work-state, (2) recurring-bugs/anti-patterns/gotcha (value_protect, kept regardless of age), (3) backlog, (4) pending-decisions. Numbers below = SE self-measured-ESTIMATE per SE scale (Opus 4.8 1M context window + multi-module ERP workload); %-print at the two session ends shows the REAL composition. AI_INFRA reference (lead 220K / mem-sub 60K / wf-sub 50K) is THEIR measure on THEIR model+federation-scale -- NOT hard-applied; SE numbers are SMALLER (single project; sub MEMORY.md byte-capped at 30720B by design).",
-      "lead_tokens": 60000,
-      "lead_note": "em-main hot-feed: STATUS current-state + 4-bucket work-state block + ACTIVE-MARKS + recent-3-session HANDOFF slice + roster-slice + task-relevant gotchas. Opus 4.8 1M window => ample headroom above this. anh-adjustable.",
-      "memory_sub_tokens": 20000,
-      "memory_sub_note": "memory-bearing sub: own MEMORY.md (<=30720B ~9.3K tok) + archive _INDEX map (<=20480B ~6.2K tok) + work-state slice (~3-4K). Upper-bounded by the BYTE soft-cap on MEMORY.md.",
-      "workflow_sub_tokens": 16000,
-      "workflow_sub_note": "agent-in-workflow: MEMORY-PACK slice (hmw.js:124 args inject) + task context."
+      "_note": "Tier-1 always-loaded HOT-FEED, PER-ROLE (v3 S83 2026-06-22: FULL AI_INFRA parity, owner-set). FILL with the 4 work buckets: (1) WIP work-state, (2) recurring-bugs/anti-patterns/gotcha (value_protect, kept regardless of age), (3) backlog, (4) pending-decisions. anh-set: lead 220K / mem-sub 60K / wf-sub 50K = EXACT AI_INFRA parity. This SUPERSEDES the S82 'SE numbers are SMALLER (lead 60K; subs stay 20K/16K because the byte-cap binds first)' self-justification -- that was the AI under-shrinking, forbidden by role_boundary_note + mark RC-...01-58-01 (token-saving = forgetting work). KEY CORRECTION on subs: the sub MEMORY.md byte-cap (30720B ~9.3K tok) is ONLY ONE SLICE of a sub's Tier-1 -- the SPAWN PROMPT (relevant gotchas + state + full task-context + related docs/memory, written by em-main) fills the REST up to the token budget. So 60K/50K is NOT 'unusable headroom'; it is the target em-main fills via a RICH spawn prompt (see spawn_fill_directive). %-print at the two session ends shows the REAL composition.",
+      "lead_tokens": 220000,
+      "lead_note": "ANH-SET 220K (S83 2026-06-22 owner-directive, full AI_INFRA parity; raised from the 200K interim earlier in S83 and the 60K S82 self-shrink). DO NOT auto-reduce (role_boundary_note + mark RC-...01-58-01). Hot-feed = STATUS full current-state + 4-bucket work-state block + ACTIVE-MARKS + recent-3-session HANDOFF + active roadmap (migration-todos) + roster-slice + task-relevant gotchas + active-task files + task-relevant docs/code -- read GENEROUSLY ('dau phien nap them tier1_lead cho du'), NOT 'on-demand-deferred', but HIGHEST-VALUE distilled only, NEVER garbage-stuff. Opus 4.8 1M window => 220K is ~22% of window, ample. anh-adjustable (owner authority).",
+      "memory_sub_tokens": 60000,
+      "memory_sub_note": "memory-bearing sub (agent-chinh), anh-set 60K (S83 full parity): own MEMORY.md (<=30720B ~9.3K tok auto-inject) + archive _INDEX map + work-state slice + THE RICH SPAWN PROMPT em-main writes (relevant gotchas + current state + full task-context + related docs/memory) = fills toward 60K. The byte-cap on MEMORY.md is NOT the binding limit on the sub's Tier-1; the spawn prompt is. em-main MUST write a context-rich brief, NOT a thin 8K brief (that old anti-truncation heuristic guarded RETURN-truncation #53, mitigated by lean memoryDelta RETURN -- not by starving INPUT).",
+      "workflow_sub_tokens": 50000,
+      "workflow_sub_note": "agent-in-workflow, anh-set 50K (S83 full parity): MEMORY-PACK slice (hmw.js:124 args inject) + RICH task context (relevant gotchas + state + full task-context + related docs/memory passed via the workflow agent() prompt). Same rule: fill to budget with high-value content, lean structured return."
+    },
+    "spawn_fill_directive": {
+      "_note": "Harness-15-v3 (S83 2026-06-22 owner-directive): when em-main SPAWNS any sub-agent or workflow-agent, FILL its context toward its token budget (mem-sub 60K / wf-sub 50K) via a RICH prompt -- relevant gotchas + current state + full task-context + related docs/memory. The sub's MEMORY.md byte-cap is only one slice; the spawn prompt supplies the rest. RULE: highest-value distilled content ONLY (the hot-load tokens must be the most valuable, filtered through many stages), never garbage-stuff to hit the number. RECONCILES the agents/README anti-truncation '<=8K brief' heuristic: that guarded against RETURN-truncation (#53), now mitigated by return-delta-only (memoryDelta) + em-main recover-disk -- so the INPUT prompt is no longer starved; it is filled rich.",
+      "applies_to": ["Agent tool spawn", "Workflow hmw.js agent() calls"],
+      "quality_gate": "highest-value distilled, NOT padding"
    },
    "l2_ondemand": "NO-CAP (v2: removed the 6K cap). On-demand: archive verbatim/gist sections + skill sections; pulled per-need, no artificial tier-limit; bounded only by model context window. On-demand => no permanent context-cost when unused.",
    "l3_rag": "NO-CAP (v2: removed the 4K cap). On-demand: RAG search_memory/search_code per query; bounded only by model context window.",
--- a/.claude/agents/README.md
+++ b/.claude/agents/README.md
@@ -62,7 +62,7 @@
 **Em main solo CHỈ khi:** schema/UX/architecture decision · cross-stack tight coupling · bug fix reasoning chain · gotcha #53 fallback (spawn truncate/529 → em main solo reliable, proven S37 BE 700 LOC + FE 4 file).

 **Anti-truncation rules (gotcha #53 — 5× occurrence S35-S37):**
- Brief WRITE agent ≤ 8K (heavy spec ~10K → truncate risk). FE tight brief proven 0 truncation S36.
+- **Spawn prompt = RICH, fill-to-budget (H15-v3 S83 — supersedes old "≤8K brief"):** nạp INPUT tới token-budget của sub (mem-sub **60K** / wf-sub **50K** per `memory-budget.json:spawn_fill_directive`) bằng context GIÁ-TRỊ cao (relevant gotchas + state + full task-context + docs/memory liên-quan) — KHÔNG padding. Truncation #53 = rủi-ro RETURN của agent (vá bằng lean `memoryDelta` return + em-main recover-disk), **KHÔNG** vá bằng starve INPUT. *(Heuristic "≤8K brief" cũ = guard thời context-nhỏ S36, nay nghỉ.)*
 - Tiered Memory v1: L1 HOT soft-cap ~30KB + L2 archive on-demand + L3 RAG just-in-time (per AI_INFRA policy). Investigator 32KB S37 truncate = lesson; soft-cap ~30KB tránh tái diễn.
 - Agent keep entry ≤ 1.5K chars (frontmatter rule mỗi agent).
 - Em main grep verify manual nếu agent return truncated mid-task.
--- a/.claude/governance/ACTIVE-MARKS.md
+++ b/.claude/governance/ACTIVE-MARKS.md
@@ -15,7 +15,7 @@
 | `RC-pqhuy1987-20-06-2026-10-29-09` | `rules.md §6.6` + `engine §E.4` (≙ §F4.2) | Quyết-định kiến-trúc/chức-năng = tiêu-chí KHÁCH-QUAN (điểm-đau · khối-lượng · chất-lượng) **KHÔNG quy-mô-đội**; "overkill/quá-mức-solo-dev/cảm-tính" = **BÁC**; thẩm-quyền cần-vs-thừa = AI_INFRA cross-project; AI = neo lý-tính | **pain:** lập-luận "quá mức solo-dev" đã khiến 1 dự-án từ-chối chức-năng chống-lách-engine (sự-cố thật) — SE = solo-dev, đúng đối-tượng · **volume:** 6 dự-án federated + SE 11-agent cần neo nhất-quán · **quality:** quyết-định-cảm-tính trôi chất-lượng âm-thầm; neo-lý-tính giữ rigor | null | 🔖 **Active-High** (anh-confirm S79 · P4 DACI report-before-stamp) |
 | `RC-pqhuy1987-20-06-2026-10-29-10` | `engine §E` (≙ §P) | Codify **User-Mark + chữ-ký RC** (Harness-12/13) — chữ-ký quyết-định governance + 4 cấp + no-cảm-tính deterministic + report-before-stamp | **pain:** quyết-định governance bị quên / giảm-bằng-cảm-tính (không chữ-ký+tier) · **volume:** SE tích-lũy Harness 1-14 cần audit-trail nhất-quán · **quality:** RC-sig = minh-oan + tranh-luận-bằng-bằng-chứng + trách-nhiệm-2-chiều | null | 🔖 **Active-High** (anh-confirm S79 · P4 dogfood: invest-wf→review-wf→báo-cáo→confirm→stamp) |
 | `RC-pqhuy1987-20-06-2026-10-29-11` | `rules.md §6.6 DM-time/age` (≙ §F4.2-ext / H-14) | **Mở-rộng mark-1** — time/age/recency-decay = **false-proxy** (cùng-họ team-size); kiến-trúc KHÔNG dựa cũ / lâu-chưa-dùng / auto-decay; trần budget=(dung-lượng÷tốc-độ-thay-mới) KHÔNG núm-decay-tuổi, drift=đường-nền-cuộn KHÔNG cửa-sổ-tuổi; **additive** (mark-1 GIỮ) | **pain:** cap∝chunk_count = Goodhart-vanity + age-window drift = alarm-spam (sự-cố thật H-14); SE memory-budget từng dễ mắc "giảm-theo-độ-cũ" · **volume:** 6 dự-án áp budget/drift/eval + SE L1/L2/L3 + archive-gate · **quality:** age-decay cắt memory-tốt = false-economy (DM-004 Goodhart §6.6) | null (additive) | 🔖 **Active-High** (anh-confirm S79 via `/user-mark-active-high` · P4 DACI · supersedes:null) |
-| `RC-pqhuy1987-20-06-2026-23-07-37` | `harness-11-engine.md §G` + `memory-budget.json` + `rules.md §6.6` (≙ AI_INFRA H-15) | **H-15 memory-budget per-agent (token-based):** token-governor (**v2 §G.4 S82:** Tầng-1 hot-feed-LỚN per-role lead~60K/sub~20K/wf~16K + L2/L3 **bỏ-trần** on-demand; con-số = **quyền CHỦ-DỰ-ÁN/anh** + AI thực-thi-đúng-số + báo-% 2-đầu-phiên — refine framing "lead-hard-cap" S81) + **value-gated archival** (giữ recurring-bug/anti-pattern/gotcha bất-kể tuổi) + fill-L1-full (budget = sàn-tận-dụng KHÔNG trần-tiết-kiệm) + work-state-block @session-start + %-print 2-đầu-phiên + 2-governor (byte⟂token) | **pain:** tiết-kiệm-token → quên-việc/rơi-trạng-thái giữa phiên (sự-cố thật — Tầng-1 mỏng = lead quên việc nhiều phiên, bằng-chứng v2); SE keep-floor age-based có thể archive gotcha cũ ra khỏi L1 — value-gate FLAG đã bắt thật (test-specialist gotcha#/guard) · **volume:** 11-agent × (SÀN 30K + Tầng-1 hot-feed per-role lead 60K/sub 20K/wf 16K, v2 §G.4) always-load/spawn + L2 archive ~240KB (cicd 194KB) cần value-protect · **quality:** quên-việc → làm-lại → tốn-hơn (Goodhart); giữ hard-won lessons | null (additive — mark `…11` GIỮ; H-15 mở-rộng value-axis xuống tầng-bộ-nhớ) | 🔖 **Active-High** (anh-confirm S81 · **v2-delta anh-confirm S82** `RC-pqhuy1987-21-06-2026-01-58-01` — hot-feed-lớn/bỏ-trần/anh-authority, P4 DACI report-before-stamp ✓) |
+| `RC-pqhuy1987-20-06-2026-23-07-37` | `harness-11-engine.md §G` + `memory-budget.json` + `rules.md §6.6` (≙ AI_INFRA H-15) | **H-15 memory-budget per-agent (token-based):** token-governor (**v3 §G.5 S83:** Tầng-1 hot-feed per-role **lead 220K/mem-sub 60K/wf-sub 50K** full AI_INFRA parity (sửa self-shrink S82) + **spawn-fill-to-budget** (sub Tầng-1 = MEMORY.md byte-cap + spawn-prompt giàu, `spawn_fill_directive`); **v2 §G.4 S82:** L2/L3 **bỏ-trần** on-demand + báo-% 2-đầu-phiên; con-số = **quyền CHỦ-DỰ-ÁN/anh** + AI thực-thi-đúng-số + báo-%) + **value-gated archival** (giữ recurring-bug/anti-pattern/gotcha bất-kể tuổi) + fill-L1-full (budget = sàn-tận-dụng KHÔNG trần-tiết-kiệm) + work-state-block @session-start + %-print 2-đầu-phiên + 2-governor (byte⟂token) | **pain:** tiết-kiệm-token → quên-việc/rơi-trạng-thái giữa phiên (sự-cố thật — Tầng-1 mỏng = lead quên việc nhiều phiên, bằng-chứng v2); SE keep-floor age-based có thể archive gotcha cũ ra khỏi L1 — value-gate FLAG đã bắt thật (test-specialist gotcha#/guard) · **volume:** 11-agent × (SÀN 30K + Tầng-1 hot-feed per-role lead 220K/mem-sub 60K/wf-sub 50K, v3 §G.5 full-parity) always-load/spawn + L2 archive ~240KB (cicd 194KB) cần value-protect · **quality:** quên-việc → làm-lại → tốn-hơn (Goodhart); giữ hard-won lessons | null (additive — mark `…11` GIỮ; H-15 mở-rộng value-axis xuống tầng-bộ-nhớ) | 🔖 **Active-High** (anh-confirm S81 · **v2-delta anh-confirm S82** `RC-pqhuy1987-21-06-2026-01-58-01` — hot-feed-lớn/bỏ-trần/anh-authority · **v3-delta anh-directed S83** `RC-pqhuy1987-22-06-2026-16-35-37` — full-parity 220/60/50 + spawn-fill-to-budget, P4 DACI report-before-stamp ✓) |

 ## 🟢 ACTIVE (follow + nhắc-lại xuyên-suốt — HIỆN @session)
 _(trống)_