[CLAUDE] Docs: H-15 v3 memory-budget full-parity (lead 220K/sub 60K/wf 50K) + spawn-fill-to-budget

anh owner-directive: full AI_INFRA parity, supersede S82 self-shrink (AI under-shrinking forbidden by mark RC-...01-58-01). New spawn_fill_directive: fill sub context to budget via RICH spawn prompt (MEMORY.md byte-cap = 1 slice only, prompt fills rest). engine +G.5 + adopt-delta S83, ACTIVE-MARKS H-15 v3-delta RC-pqhuy1987-22-06-2026-16-35-37, agents/README retire <=8K-brief. Detector 26 baseline (0 new), A7 217/217.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -50,13 +50,18 @@
"_note": "Harness-15-v2 (S82, 2026-06-21): UPDATED by delta broadcast 2026-06-20-Governance-harness-15-v2-hot-feed-update (supersedes_scope = tier-1-sizing + L2/L3-caps ONLY; rest of H15 unchanged). TWO CHANGES vs S81: (1) Tier-1 = HOT-FEED LARGE per-role (was flat 12K -- too thin, caused lead to forget work across sessions); (2) L2/L3 caps REMOVED (on-demand, no artificial tier-limit, bounded only by model context window). Still the SECOND governor (token) ORTHOGONAL to the BYTE governor (tiers/archive_gate above) -- keep BOTH (B(e)); byte measures file-size-on-disk, token measures context-loaded; VN text ~3.0-3.5 byte/tok so byte/4 = upper bound => real headroom LARGER. Budget = MINIMUM-to-USE floor (FILL Tier-1 with real work-state up to the number; under-fill ONLY when high-value content exhausted; NEVER garbage-stuff -- token-saving = forgetting work).",
"role_boundary_note": "v2 §6 ROLE BOUNDARY (🔴): the budget numbers (Tier-1 per-role cap + per-bucket allocation) are ANH's (project-owner / chu-du-an) RIGHT to set -- NOT the AI-lead's. em-main's job is exactly two parts: (1) EXECUTE the config faithfully (load Tier-1 to the number, no-truncate, pull each bucket to target) + (2) REPORT %-composition at session-start (§2.1.6) and session-end (§L.b(c)) so anh decides. em-main self-measures + proposes numbers; em-main does NOT auto-tune them down. This corrects the S81 'LEAD-AUTHORITY' framing which conflated AI-lead with project-owner.",
"tier1_hotfeed_tokens": {
-"_note": "Tier-1 always-loaded HOT-FEED, PER-ROLE (v2: large/generous, do NOT keep thin). FILL with the 4 work buckets: (1) WIP work-state, (2) recurring-bugs/anti-patterns/gotcha (value_protect, kept regardless of age), (3) backlog, (4) pending-decisions. Numbers below = SE self-measured-ESTIMATE per SE scale (Opus 4.8 1M context window + multi-module ERP workload); %-print at the two session ends shows the REAL composition. AI_INFRA reference (lead 220K / mem-sub 60K / wf-sub 50K) is THEIR measure on THEIR model+federation-scale -- NOT hard-applied; SE numbers are SMALLER (single project; sub MEMORY.md byte-capped at 30720B by design).",
-"lead_tokens": 60000,
-"lead_note": "em-main hot-feed: STATUS current-state + 4-bucket work-state block + ACTIVE-MARKS + recent-3-session HANDOFF slice + roster-slice + task-relevant gotchas. Opus 4.8 1M window => ample headroom above this. anh-adjustable.",
-"memory_sub_tokens": 20000,
-"memory_sub_note": "memory-bearing sub: own MEMORY.md (<=30720B ~9.3K tok) + archive _INDEX map (<=20480B ~6.2K tok) + work-state slice (~3-4K). Upper-bounded by the BYTE soft-cap on MEMORY.md.",
-"workflow_sub_tokens": 16000,
-"workflow_sub_note": "agent-in-workflow: MEMORY-PACK slice (hmw.js:124 args inject) + task context."
+"_note": "Tier-1 always-loaded HOT-FEED, PER-ROLE (v3 S83 2026-06-22: FULL AI_INFRA parity, owner-set). FILL with the 4 work buckets: (1) WIP work-state, (2) recurring-bugs/anti-patterns/gotcha (value_protect, kept regardless of age), (3) backlog, (4) pending-decisions. anh-set: lead 220K / mem-sub 60K / wf-sub 50K = EXACT AI_INFRA parity. This SUPERSEDES the S82 'SE numbers are SMALLER (lead 60K; subs stay 20K/16K because the byte-cap binds first)' self-justification -- that was the AI under-shrinking, forbidden by role_boundary_note + mark RC-...01-58-01 (token-saving = forgetting work). KEY CORRECTION on subs: the sub MEMORY.md byte-cap (30720B ~9.3K tok) is ONLY ONE SLICE of a sub's Tier-1 -- the SPAWN PROMPT (relevant gotchas + state + full task-context + related docs/memory, written by em-main) fills the REST up to the token budget. So 60K/50K is NOT 'unusable headroom'; it is the target em-main fills via a RICH spawn prompt (see spawn_fill_directive). %-print at the two session ends shows the REAL composition.",
+"lead_tokens": 220000,
+"lead_note": "ANH-SET 220K (S83 2026-06-22 owner-directive, full AI_INFRA parity; raised from the 200K interim earlier in S83 and the 60K S82 self-shrink). DO NOT auto-reduce (role_boundary_note + mark RC-...01-58-01). Hot-feed = STATUS full current-state + 4-bucket work-state block + ACTIVE-MARKS + recent-3-session HANDOFF + active roadmap (migration-todos) + roster-slice + task-relevant gotchas + active-task files + task-relevant docs/code -- read GENEROUSLY ('dau phien nap them tier1_lead cho du'), NOT 'on-demand-deferred', but HIGHEST-VALUE distilled only, NEVER garbage-stuff. Opus 4.8 1M window => 220K is ~22% of window, ample. anh-adjustable (owner authority).",
+"memory_sub_tokens": 60000,
+"memory_sub_note": "memory-bearing sub (agent-chinh), anh-set 60K (S83 full parity): own MEMORY.md (<=30720B ~9.3K tok auto-inject) + archive _INDEX map + work-state slice + THE RICH SPAWN PROMPT em-main writes (relevant gotchas + current state + full task-context + related docs/memory) = fills toward 60K. The byte-cap on MEMORY.md is NOT the binding limit on the sub's Tier-1; the spawn prompt is. em-main MUST write a context-rich brief, NOT a thin 8K brief (that old anti-truncation heuristic guarded RETURN-truncation #53, mitigated by lean memoryDelta RETURN -- not by starving INPUT).",
+"workflow_sub_tokens": 50000,
+"workflow_sub_note": "agent-in-workflow, anh-set 50K (S83 full parity): MEMORY-PACK slice (hmw.js:124 args inject) + RICH task context (relevant gotchas + state + full task-context + related docs/memory passed via the workflow agent() prompt). Same rule: fill to budget with high-value content, lean structured return."
},
+"spawn_fill_directive": {
+"_note": "Harness-15-v3 (S83 2026-06-22 owner-directive): when em-main SPAWNS any sub-agent or workflow-agent, FILL its context toward its token budget (mem-sub 60K / wf-sub 50K) via a RICH prompt -- relevant gotchas + current state + full task-context + related docs/memory. The sub's MEMORY.md byte-cap is only one slice; the spawn prompt supplies the rest. RULE: highest-value distilled content ONLY (the hot-load tokens must be the most valuable, filtered through many stages), never garbage-stuff to hit the number. RECONCILES the agents/README anti-truncation '<=8K brief' heuristic: that guarded against RETURN-truncation (#53), now mitigated by return-delta-only (memoryDelta) + em-main recover-disk -- so the INPUT prompt is no longer starved; it is filled rich.",
+"applies_to": ["Agent tool spawn", "Workflow hmw.js agent() calls"],
+"quality_gate": "highest-value distilled, NOT padding"
+},
"l2_ondemand": "NO-CAP (v2: removed the 6K cap). On-demand: archive verbatim/gist sections + skill sections; pulled per-need, no artificial tier-limit; bounded only by model context window. On-demand => no permanent context-cost when unused.",
"l3_rag": "NO-CAP (v2: removed the 4K cap). On-demand: RAG search_memory/search_code per query; bounded only by model context window.",
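The spawn_fill_directive and the per-role budgets in this hunk can be illustrated with a short Python sketch. It is not taken from the repository: `Slice`, `estimate_tokens`, and `build_spawn_prompt` are hypothetical names, and the 3.3 bytes-per-token constant is an assumption picked from the 3.0-3.5 range quoted in the config note, not a real tokenizer. The sketch admits slices in priority order up to the role's token budget, skips anything that would overflow, never pads, and returns the %-composition that the config asks em-main to report.

```python
from __future__ import annotations

from dataclasses import dataclass

# Budgets copied from the v3 values in the hunk above.
TIER1_BUDGET_TOKENS = {"lead": 220_000, "memory_sub": 60_000, "workflow_sub": 50_000}

# Assumption: ~3.3 bytes per token (inside the 3.0-3.5 range quoted in the config note).
BYTES_PER_TOKEN = 3.3


def estimate_tokens(text: str) -> int:
    """Rough token estimate from UTF-8 byte length (not a real tokenizer)."""
    return round(len(text.encode("utf-8")) / BYTES_PER_TOKEN)


@dataclass
class Slice:
    """One candidate piece of a spawn prompt (hypothetical structure)."""
    name: str      # e.g. "MEMORY.md", "gotchas", "task-context"
    text: str
    priority: int  # lower number = higher value, admitted first


def build_spawn_prompt(role: str, slices: list[Slice]) -> tuple[str, dict[str, float]]:
    """Admit slices in priority order until the role's budget is reached.

    Slices that would overflow the budget are skipped and nothing is padded in,
    so an under-filled prompt only means high-value content ran out.
    """
    budget = TIER1_BUDGET_TOKENS[role]
    used = 0
    admitted: list[str] = []
    tokens_by_slice: dict[str, int] = {}
    for s in sorted(slices, key=lambda item: item.priority):
        cost = estimate_tokens(s.text)
        if cost == 0 or used + cost > budget:
            continue  # empty or too large for the remaining budget
        admitted.append(f"## {s.name}\n{s.text}")
        tokens_by_slice[s.name] = cost
        used += cost
    # Percent of the budget taken by each admitted slice, plus what is left over.
    composition = {name: round(100 * cost / budget, 1) for name, cost in tokens_by_slice.items()}
    composition["(unfilled)"] = round(100 * (budget - used) / budget, 1)
    return "\n\n".join(admitted), composition


if __name__ == "__main__":
    demo = [
        Slice("MEMORY.md", "m" * 30720, priority=1),      # byte-capped slice
        Slice("task-context", "t" * 66000, priority=2),
        Slice("related-docs", "d" * 200000, priority=3),  # too large, gets skipped
    ]
    _, report = build_spawn_prompt("memory_sub", demo)
    for name, pct in report.items():
        print(f"{name}: {pct}%")
```

With a 30720-byte MEMORY.md slice the estimate comes to roughly 9.3K tokens, matching the figure in memory_sub_note; the rest of the 60K budget is what the spawn prompt is expected to fill.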