solution-erp/scripts/memory-archive-gate.ps1
pqhuy1987 e70c0462d7 [CLAUDE] Docs: adopt Harness-11 self-maintaining engine (3-workflow audit→implement→review)
- engine-doc canonical docs/governance/harness-11-engine.md (PART A/B/C/D + 3-tier D5/D6/D7 + one-direction-lock D8 + CAVEAT honest)
- scripts/governance-detectors.ps1 (C1 broken-pointer + C2/B3 staleness + C3 vocab-fork + C4 self-exclusion + C5 resolve, NO-API DETECT+FLAG-only, runtime-proven, FP-refined 59→27)
- scripts/memory-archive-gate.ps1 (PART A: hysteresis 0.85 + keep-floor 5 + 2-strike + A7 NO-API L1-eval) + budget.json archive_gate
- B1 ×11 count→canonical-pointer (root CLAUDE.md, ef-core/dep-audit SKILL, skills/README, docs/CLAUDE.md) — drift mig53→55/test306→339/gotcha68→69 RESOLVED + ef-core +Mig 54/55 rows
- cadence-wire D1 session-start §2.1.3 + D2 session-end §L.b(c) + agents/README Upgrade S75
- run-trace TRACKED: audit wf_7fdc3bd5-930 / implement wf_c5e5844e-7c1 / review wf_d7ca1ff8-942 (REVIEW PASS, completeness-gate PASSED)
- check-email AI_INFRA harness-11 (verify whole-file 318ff9f6 + body b2a2fc1c) + adap-report + outbox report (body 7fa1b53a)
- 0 production code; REAL state unchanged (Mig 55 · 88 tables · 339 tests · gotcha 69 · menu 54 · bundle BYF5vIMJ/CB-tiRxd)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 20:44:26 +07:00

# memory-archive-gate.ps1 - Harness-11 PART-A (S73, 2026-06-18)
# Mechanized standing-gate for agent-memory hot-tier (L1 MEMORY.md).
#
# NON-NEGOTIABLES (Harness-11):
# (1) NO-API : grep/Select-String + byte/file-exist ONLY. NEVER calls a model.
# (2) FLAG-ONLY: DRY-RUN by default. Prints a PLAN + FLAGS. Does NOT move/edit
# any MEMORY.md or archive file. (auto-WRITE of rules = top hazard.)
# (3) PS 5.1 : ASCII-only output (gotcha #30). powershell.exe -ExecutionPolicy Bypass -File
#
# WHAT IT DOES (two independent passes):
# PLANNER (A1/A4/A5/A6) : for each <sub>/MEMORY.md, measure bytes; if over cap,
# plan how many oldest entries to MOVE to get BELOW the
# low-watermark (hysteresis), never draining below a
# keep-floor of newest entries; gate the *proposal* behind
# a 2-strike counter persisted to .archive-strikes.json.
# A7 L1-GATE (NO-API) : for each <sub>/archive/_INDEX.md, verify every
# substring:"..." pointer resolves (SimpleMatch) inside the
# sub's archive/*.md, and that every archive/*.md content
# file found there has size>0. Prints PASS/FAIL.
#
# TAILORING NOTE (Harness-11 PART-A = 'tailorable'): see the comment block at the
# bottom marked [TAILOR] for the simplifications made vs the maximal spec.
#
# Usage:
# powershell.exe -ExecutionPolicy Bypass -File scripts\memory-archive-gate.ps1
# powershell.exe -ExecutionPolicy Bypass -File scripts\memory-archive-gate.ps1 -Apply (records strikes; STILL no file moves)
param(
[string]$RepoRoot = "$PSScriptRoot\..",
[switch]$Apply = $false
)
$ErrorActionPreference = 'Stop'
# ---- resolve paths --------------------------------------------------------
$memRoot = Join-Path $RepoRoot '.claude\agent-memory'
$budgetPath = Join-Path $memRoot 'memory-budget.json'
$strikePath = Join-Path $memRoot '.archive-strikes.json'
if (-not (Test-Path $memRoot)) { Write-Error "agent-memory root not found: $memRoot"; exit 1 }
if (-not (Test-Path $budgetPath)){ Write-Error "memory-budget.json not found: $budgetPath"; exit 1 }
# ---- load tunables from budget.json (archive_gate block) ------------------
$budget = Get-Content $budgetPath -Raw | ConvertFrom-Json
$gate = $budget.archive_gate
if ($null -eq $gate) { Write-Error "memory-budget.json missing 'archive_gate' block (Harness-11 PART-A)"; exit 1 }
$cap = [int]$gate.autoinject_cap_bytes # A1: over this => over-cap
$lowMark = [int]([math]::Floor($cap * [double]$gate.low_watermark_ratio)) # A4: drain target
$keepFloor = [int]$gate.keep_floor_entries # A5: never auto-drain below N newest
$strikeNeed = [int]$gate.strike_threshold # A6: consecutive over-cap runs before proposing
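# Illustrative archive_gate block (a sketch, not the repo's actual file). The ratio 0.85,
# keep-floor 5 and 2-strike values follow the commit description; the cap of 20000 bytes
# is an ASSUMED number chosen only to make the arithmetic concrete:
#   "archive_gate": {
#     "autoinject_cap_bytes": 20000,
#     "low_watermark_ratio": 0.85,
#     "keep_floor_entries": 5,
#     "strike_threshold": 2
#   }
# With those values: $lowMark = floor(20000 * 0.85) = 17000 bytes.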
# ---- strike-counter state (A6) -------------------------------------------
# Stateless script => persist a tiny counter file (additive, NOT a memory file).
# Only mutated under -Apply so DRY-RUN is side-effect-free.
$strikes = @{}
if (Test-Path $strikePath) {
try {
$raw = Get-Content $strikePath -Raw | ConvertFrom-Json
foreach ($p in $raw.PSObject.Properties) { $strikes[$p.Name] = [int]$p.Value }
} catch { $strikes = @{} }
}
# ---- helpers --------------------------------------------------------------
# Entry boundaries in a hot MEMORY.md = lines matching one of:
# ^## (h2) | ^### (h3) | ^--- (separator)
# Count of such markers approximates entry count (A5 keep-floor uses this).
function Get-EntryMarkerLineNumbers([string[]]$lines) {
$idx = @()
for ($i = 0; $i -lt $lines.Count; $i++) {
if ($lines[$i] -match '^(#{2,3}\s|---\s*$)') { $idx += $i }
}
return ,$idx
}
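# Illustrative check of the marker regex above (sample lines are made up for this note):
#   '## 2026-06-18 entry'  -> marker     (h2 followed by whitespace)
#   '### detail'           -> marker     (h3 followed by whitespace)
#   '---'                  -> marker     (separator; trailing spaces allowed)
#   '#### deep heading'    -> NOT marker (more than three # characters)
#   '##no-space'           -> NOT marker (no whitespace after the # characters)
#   '----'                 -> NOT marker (four dashes)
# Consequence: a file that mixes h2 headings with --- separators may count one
# entry more than once, so the entry count is only an approximation there.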
# ---- header ---------------------------------------------------------------
$mode = if ($Apply) { "APPLY (records strikes; NO file moves)" } else { "DRY-RUN (no writes at all)" }
Write-Output "============================================================"
Write-Output " memory-archive-gate.ps1 - Harness-11 PART-A"
Write-Output " mode : $mode"
Write-Output " cap : $cap bytes (autoinject_cap)"
Write-Output " low-water : $lowMark bytes (A4 hysteresis drain target = ratio $($gate.low_watermark_ratio))"
Write-Output " keep-floor : $keepFloor newest entries (A5)"
Write-Output " strike-need : $strikeNeed consecutive over-cap runs to PROPOSE (A6)"
Write-Output "============================================================"
# ==========================================================================
# PASS 1 - PLANNER (A1 measure / A4 hysteresis / A5 keep-floor / A6 strike)
# ==========================================================================
Write-Output ""
Write-Output "### PASS 1 - hot-tier over-cap planner (FLAG ONLY, no moves)"
Write-Output ""
$dash = [string]([char]45) # '-' as a value, never a bare token (PS 5.1 treats '--' runs as decrement op)
Write-Output ("{0,-24} {1,9} {2,5} {3,10} {4,7} {5,12} {6}" -f 'sub','bytes','over?','entries','strike','after-est','resolve')
Write-Output ("{0,-24} {1,9} {2,5} {3,10} {4,7} {5,12} {6}" -f ($dash*24),($dash*9),($dash*5),($dash*10),($dash*7),($dash*12),($dash*7))
$subDirs = Get-ChildItem -Path $memRoot -Directory | Sort-Object Name
$anyOver = $false
foreach ($d in $subDirs) {
$sub = $d.Name
$mem = Join-Path $d.FullName 'MEMORY.md'
if (-not (Test-Path $mem)) { continue }
$bytes = (Get-Item $mem).Length # A1
$isOver = $bytes -gt $cap
$lines = @(Get-Content $mem) # @() keeps a one-line file as an array, so $lines[$li] indexes lines, not characters
$markers = Get-EntryMarkerLineNumbers $lines
$entryCount = $markers.Count
# --- A6 strike bookkeeping ---
$prev = if ($strikes.ContainsKey($sub)) { [int]$strikes[$sub] } else { 0 }
if ($isOver) {
$cur = $prev + 1
} else {
$cur = 0 # reset on a clean run (consecutive-only)
}
if ($Apply) { $strikes[$sub] = $cur }
if (-not $isOver) {
# Under cap: one tidy line, nothing to plan.
Write-Output ("{0,-24} {1,9} {2,5} {3,10} {4,7} {5,12} {6}" -f $sub, $bytes, 'no', $entryCount, $cur, '-', 'ok')
continue
}
$anyOver = $true
# --- A4 hysteresis + A5 keep-floor : how many OLDEST entries to move? ---
# Move oldest entries one-by-one; estimate bytes-after by cutting at the
# marker line of the FIRST entry we keep. Stop when est < lowMark, but never
# let kept-entries drop below keepFloor.
$moveCount = 0
$afterEst = $bytes
$warnFloor = $false
if ($entryCount -le $keepFloor) {
# Already at/under floor but still over cap => cannot auto-drain.
$warnFloor = $true
$afterEst = $bytes
} else {
# markers[k] = line index where entry (k) starts. Keeping entries
# [k..end] means the kept region begins at byte offset of markers[k].
# Bytes-after = total - (bytes before markers[k]).
for ($move = 1; $move -le ($entryCount - $keepFloor); $move++) {
$cutLine = $markers[$move] # first KEPT entry starts here (0-based line idx)
# bytes of the moved prefix = sum of (line length + 2, an assumed CRLF newline) for lines [0..cutLine-1]
$prefixBytes = 0
for ($li = 0; $li -lt $cutLine; $li++) { $prefixBytes += ($lines[$li].Length + 2) } # +2 ~ CRLF est
$est = $bytes - $prefixBytes
$moveCount = $move
$afterEst = $est
if ($est -lt $lowMark) { break }
}
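# If we exhausted the movable range and the estimate is still at or over the cap, the floor was hit.
# (An estimate between lowMark and cap still gets a plan; it only misses the hysteresis target.)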
if ($afterEst -ge $cap -and $moveCount -eq ($entryCount - $keepFloor)) { $warnFloor = $true }
}
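# Worked example of the planner loop above. All numbers are ASSUMED for illustration:
# cap = 20000, lowMark = 17000, keepFloor = 5, a 21000-byte file with 8 entries,
# so at most 8 - 5 = 3 oldest entries are movable.
#   move 1: prefix up to markers[1] ~ 1500 bytes -> est 19500 (>= 17000, keep going)
#   move 2: prefix up to markers[2] ~ 3200 bytes -> est 17800 (>= 17000, keep going)
#   move 3: prefix up to markers[3] ~ 4600 bytes -> est 16400 (<  17000, break)
# Result: moveCount = 3, afterEst = 16400, warnFloor stays false.
# Had move 3 still left the estimate at 20000 or more, warnFloor would be set instead.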
# --- A6 gate the resolution wording on the strike count ---
if ($warnFloor) {
$resolve = "WARN keep-floor hit ($keepFloor); cannot auto-drain - SPLIT/condense entries by hand"
} elseif ($cur -ge $strikeNeed) {
$resolve = "PROPOSE archive (strike $cur>=$strikeNeed): move $moveCount oldest -> curate L1->L2 by hand"
} else {
$resolve = "WATCH (strike $cur<$strikeNeed): re-run; propose only after $strikeNeed consecutive over-cap"
}
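# Illustrative strike progression for one sub, assuming strike_threshold = 2, every run
# uses -Apply, and the keep-floor WARN does not apply (DRY-RUN prints prev+1 but never
# stores it):
#   run 1 over cap  : prev 0 -> cur 1 -> WATCH
#   run 2 over cap  : prev 1 -> cur 2 -> PROPOSE
#   run 3 under cap : cur reset to 0  -> ok
#   run 4 over cap  : prev 0 -> cur 1 -> WATCH again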
Write-Output ("{0,-24} {1,9} {2,5} {3,10} {4,7} {5,12} {6}" -f $sub, $bytes, 'YES', $entryCount, $cur, "~$afterEst", $resolve)
}
if (-not $anyOver) {
Write-Output ""
Write-Output " (no sub over cap - hot tier within auto-inject budget)"
}
# Persist strikes under -Apply (additive counter file, NOT a memory file).
if ($Apply) {
($strikes | ConvertTo-Json) | Set-Content -Path $strikePath -Encoding ASCII
Write-Output ""
Write-Output " [A6] strikes persisted -> $strikePath"
} else {
Write-Output ""
Write-Output " [A6] DRY-RUN: strike counters NOT persisted (run with -Apply to advance strikes)"
}
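# Illustrative shape of .archive-strikes.json after an -Apply run, shown compactly.
# The sub names are hypothetical placeholders; the real keys are the directory names
# under .claude\agent-memory that contain a MEMORY.md:
#   { "sub-a": 2, "sub-b": 0 }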
# ==========================================================================
# PASS 2 - A7 NO-API L1-GATE : pointer-resolve + byte-sanity on EXISTING archive
# ==========================================================================
Write-Output ""
Write-Output "### PASS 2 - A7 archive-integrity gate (NO-API: grep + measure only)"
Write-Output ""
$gateTotalPtr = 0
$gateOkPtr = 0
$gateFailPtr = 0
$anyArchive = $false
foreach ($d in $subDirs) {
$sub = $d.Name
$archDir = Join-Path $d.FullName 'archive'
$indexPath = Join-Path $archDir '_INDEX.md'
if (-not (Test-Path $indexPath)) { continue } # only subs with a built index
$anyArchive = $true
# all archive content files (exclude the index itself)
$contentFiles = Get-ChildItem -Path $archDir -Filter *.md | Where-Object { $_.Name -ne '_INDEX.md' }
Write-Output " [$sub] _INDEX.md + $($contentFiles.Count) archive file(s)"
# (ii) byte-sanity: every archive content file enumerated above must have size>0
# (a 0-byte file counts toward the A7 failure total).
foreach ($cf in $contentFiles) {
if ($cf.Length -le 0) {
Write-Output (" BYTE-FAIL {0} is 0 bytes" -f $cf.Name)
$gateFailPtr++
}
}
# Pre-load every archive content file as UTF-8 (gotcha #30: PS 5.1 Get-Content
# defaults to ANSI codepage and MANGLES Vietnamese diacritics / em-dash / arrows,
# which made byte-identical pointers falsely FAIL. Force UTF-8 on BOTH sides.)
$utf8 = New-Object System.Text.UTF8Encoding($false)
$haystacks = @{}
foreach ($cf in $contentFiles) {
$haystacks[$cf.Name] = [System.IO.File]::ReadAllText($cf.FullName, $utf8)
}
# (i) pointer-resolve: extract every substring token (substring:QUOTE...QUOTE) and
# locate it literally in ANY archive file. String.Contains is an ordinal,
# case-sensitive match, i.e. the equivalent of Select-String -SimpleMatch -CaseSensitive.
$indexText = [System.IO.File]::ReadAllText($indexPath, $utf8)
$indexLines = $indexText -split "`r?`n"
$subPtrCount = 0; $subOk = 0; $subFail = 0
foreach ($line in $indexLines) {
# skip blockquote legend/convention lines (e.g. the '> Pointer style ...
# substring:"<unique-string>"' template); those document the format, they
# are not real record pointers.
if ($line -match '^\s*>') { continue }
# robust across all 3 formats (bullet / table / arrow) - just grab the quoted payload
$m = [regex]::Matches($line, 'substring:"([^"]+)"')
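# Illustrative _INDEX.md lines (hypothetical; the real index files may differ) and
# what the regex above captures from each:
#   - 2026-05 retro -> 2026-05.md substring:"retro: flaky test root cause"
#       captures: retro: flaky test root cause
#   | q3 | substring:"bundle hash mismatch" |
#       captures: bundle hash mismatch
#   > Pointer style: substring:"<unique-string>"
#       never reaches this point; skipped above as a blockquote legend line
# A payload that itself contains a double quote is cut at that quote, because [^"]+ stops there.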
foreach ($match in $m) {
$needle = $match.Groups[1].Value
$subPtrCount++; $gateTotalPtr++
# literal substring search across ALL archive content files for this sub
$found = $false
foreach ($k in $haystacks.Keys) {
if ($haystacks[$k].Contains($needle)) { $found = $true; break }
}
if ($found) {
$subOk++; $gateOkPtr++
} else {
$subFail++; $gateFailPtr++
$q = [char]34
Write-Output (" PTR-FAIL substring not found in archive/*.md : {0}{1}{0}" -f $q, $needle)
}
}
}
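# A 0-byte archive file also fails this sub, so the per-sub verdict agrees with the overall A7 gate.
$zeroByteCount = @($contentFiles | Where-Object { $_.Length -le 0 }).Count
$verdict = if ($subFail -eq 0 -and $zeroByteCount -eq 0) { "PASS" } else { "FAIL" }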
Write-Output (" -> {0} pointers {1} resolved {2} failed {3}" -f $verdict, $subPtrCount, $subOk, $subFail)
}
if (-not $anyArchive) {
Write-Output " (no sub has archive/_INDEX.md yet - nothing to gate)"
}
Write-Output ""
Write-Output "------------------------------------------------------------"
$overallA7 = if ($gateFailPtr -eq 0) { "PASS" } else { "FAIL" }
Write-Output (" A7 GATE {0} - total pointers {1}, resolved {2}, failed {3}" -f $overallA7, $gateTotalPtr, $gateOkPtr, $gateFailPtr)
Write-Output "------------------------------------------------------------"
# Exit non-zero only on A7 integrity failure (broken pointer / 0-byte archive).
# Over-cap is a FLAG (not an error) - the gate reports, a human curates.
if ($gateFailPtr -gt 0) { exit 2 } else { exit 0 }
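# Illustrative caller, e.g. from a session-start hook (the wrapper is an assumption,
# not part of this script). $LASTEXITCODE carries the exit code set above:
#   powershell.exe -ExecutionPolicy Bypass -File scripts\memory-archive-gate.ps1
#   if ($LASTEXITCODE -eq 2) { Write-Output 'A7 FAIL: fix broken pointers or 0-byte archive files' }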
# ==========================================================================
# [TAILOR] Harness-11 PART-A simplifications (honest record):
# * bytes-after-est uses (line.Length + 2) as a CRLF-aware estimate per moved
# line; it is an ESTIMATE for the plan, not a real cut. The "~" prefix in the
# after-est column flags it as approximate. (Real bytes only known post-move,
# which the gate deliberately never performs.)
# * Entry boundary = any line matching ^##, ^### or ^---. MEMORY.md files here are h2-only
# today (verified S73), so marker-count == entry-count in practice; the regex
# also tolerates h3/HR-delimited files.
# * A6 strike state is a flat {sub: int} JSON. Reset-to-0 on any clean run =>
# "consecutive over-cap" semantics. Requires 2 real runs (-Apply) to reach the
# PROPOSE nudge by design (the spec "runtime needs 2 runs" note).
# * A7 resolves the substring against ALL archive/*.md for the sub (not just the
# arrow-named file) because the 3 _INDEX formats name the target differently
# (reviewer arrow / cicd arrow-then-substr / inv-codebase q-shorthand table).
# A unique substring landing anywhere in the sub's frozen archive == resolved.
# ==========================================================================