solution-erp/docs/governance/RAG-AUDIT-RESPONSE-2026-05-29.md
pqhuy1987 282cbd0c7b
[CLAUDE] Docs: S41 RAG audit response — exclude **/-anchor fix + retire stale _decision_log + AI_INFRA signal
- rag.json exclude_paths root-anchored -> **/-anchored (defeats gotcha #10:
  node_modules/** + docs/_archive/** were not matching nested paths)
- _decision_log: retire stale "+321% / LIVE 11,922" -> real status
  (LIVE ~3080 ~= registry 3076, drift closed 2026-05-28)
- New docs/governance/RAG-AUDIT-RESPONSE-2026-05-29.md: SE-side prep done +
  corrections (store_memory at-risk = 3 disk-backed broadcasts, NOT ~27) +
  re-bootstrap ask for AI_INFRA + post-bootstrap verify checklist

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-29 22:18:33 +07:00


📤 SOLUTION_ERP → AI_INFRA — RAG Audit Response (2026-05-29, S41)

Re: AI_INFRA RAG audit 2026-05-29 (Qdrant LIVE verify). SE-side prep DONE; re-bootstrap = AI_INFRA op (charter v2). Persistent + corpus-backed record.

SE-side DONE (this session)

  1. Exclude fix (.claude/rag.json) — root-anchored → **/-anchored, defeats gotcha #10:
    • node_modules/** → **/node_modules/**
    • docs/_archive/** → **/_archive/** (bin/obj/.git also **/-anchored for consistency)
    • JSON validated. Takes effect on next re-bootstrap.
  2. _decision_log stale numbers retired: registry_drift_note "+321% / LIVE 11,922" was a stale pre-bootstrap figure → rewritten to real status (LIVE ~3080 ≈ registry 3076, drift closed 2026-05-28). anatomy_threshold_chosen "11,922 mature" → "SE collection ~3080".
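
The difference between the two anchoring styles in item 1 can be illustrated with a small matcher. This is only a sketch: the glob semantics actually used by bootstrap.py are not documented here, and `glob_to_regex` and `is_excluded` are hypothetical helpers that assume gitignore-like `**` handling with the pattern matched against the full relative path.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob with ** support into a regex (assumed semantics, for illustration)."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more leading directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")  # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # anything within a single path segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def is_excluded(path: str, patterns: list[str]) -> bool:
    """True if the whole path matches at least one exclude pattern."""
    return any(glob_to_regex(p).fullmatch(path) for p in patterns)


nested = "docs/_user-guide/node_modules/pkg/README.md"
print(is_excluded(nested, ["node_modules/**"]))     # False: pattern is anchored at the root
print(is_excluded(nested, ["**/node_modules/**"]))  # True: matches at any depth
```

Under these assumed semantics the root-anchored pattern only catches a top-level node_modules/ directory, which is consistent with the nested docs/_user-guide/node_modules/ files slipping into the index.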

🔶 SE-side findings (corrections to audit estimates — verified on disk + Qdrant)

  • store_memory at-risk ≪ "~27". True store_memory chunks (heading_path="(manual)") = only the 3 S40 broadcasts, ALL disk-backed (docs/governance/BROADCAST-OUT-*.md confirmed on disk). Replace-mode recreates them from corpus files → NOT at-risk, no export-reinsert needed. The "~27" appears to have been conflated with the 27 user-memory feedback entries; those are extra_corpus FILE-based, so they need a re-index to be added, not protection.
  • node_modules junk confirmed: docs/_user-guide/node_modules/ = 30 .md files on disk (≈237 chunks plausible).
  • _archive risk is now WORSE, not stable: docs/_archive/ now holds the 170KB+ pre-S40 STATUS/HANDOFF archives (created S40, after the 05-28 bootstrap). A re-bootstrap WITHOUT the exclude fix would index hundreds of archive chunks. Exclude fix must land before re-bootstrap.
  • ⚠️ Slug anomaly for AI_INFRA to confirm: feedback chunks currently index under the OLD slug path ...projects\D--Dropbox-CONG-VIEC-SOLUTION\memory\ (missing -SOLUTION-ERP). Confirms the slug bug; replace-mode should wipe old-path chunks + re-add from the corrected extra_corpus path (rag.json:18, fixed S40).
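
The on-disk count behind the node_modules finding can be reproduced with a few lines of Python. This is a minimal sketch, not the script used for the audit: `count_markdown_under` is a hypothetical helper, and it only counts files, not chunks.

```python
from pathlib import Path


def count_markdown_under(root: str, dirname: str) -> int:
    """Count .md files below root that sit inside a directory named dirname."""
    root_path = Path(root)
    count = 0
    for md_file in root_path.rglob("*.md"):
        # Inspect only the directory components, relative to root (not the filename).
        dirs = md_file.relative_to(root_path).parts[:-1]
        if md_file.is_file() and dirname in dirs:
            count += 1
    return count


# Example call from the repository root (result depends on the working tree):
# count_markdown_under("docs", "node_modules")
```

The same helper can be pointed at "_archive" to size the archive risk before a re-bootstrap.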

🟢 ASK — AI_INFRA re-bootstrap (1 run, gathers everything)

Run python AI_INFRA/claude-rag/bootstrap.py --project solution_erp. It picks up: (a) exclude fix → 0 node_modules + 0 _archive chunks; (b) corrected extra_corpus slug → 27 feedback entries indexed; (c) S38-S41 content (Proposal/WorkflowApps/consolidated docs).

Repeat of prior standing items (broadcast 2026-05-29): bootstrap.py corpus-path validation (warn on glob→0 files), verify auto_reindex hook actually fires (last_indexed lagged), search_code corpus gap (.cs files under src/ and .tsx files under fe/ are not in the corpus), registry sync.
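
As an illustration of the requested corpus-path validation (warn when a glob resolves to 0 files), here is a minimal sketch. It is not bootstrap.py code: `find_empty_globs` is a hypothetical name, and it assumes corpus patterns are relative globs that `pathlib.Path.glob` can resolve against the project root.

```python
import warnings
from pathlib import Path


def find_empty_globs(root: str, patterns: list[str]) -> list[str]:
    """Return the patterns that match no files under root, emitting a warning for each."""
    empty = []
    for pattern in patterns:
        # Path.glob is lazy, so any() stops at the first matching file.
        if not any(p.is_file() for p in Path(root).glob(pattern)):
            warnings.warn(f"corpus pattern {pattern!r} matched 0 files under {root}")
            empty.append(pattern)
    return empty
```

A check like this, run before indexing, would have surfaced a mistyped slug path as a warning instead of a silently empty corpus entry.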

🔍 SE post-bootstrap verify (after AI_INFRA confirms run)

  1. node_modules chunks = 0 · _archive chunks = 0 (search a known junk term → expect miss)
  2. 27 feedback entries discoverable under corrected slug
  3. 3 broadcasts still present · chunk_count sane (no bloat)
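
The three checks above could be tallied over exported chunk payloads with something like the following. This is a sketch under assumptions: the payload key "path" is hypothetical (the real Qdrant payload schema is owned by AI_INFRA), `summarize_chunks` is an illustrative name, the corrected slug is passed in rather than guessed, and each feedback entry is assumed to be one file.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

JUNK_DIRS = {"node_modules", "_archive"}


def summarize_chunks(payloads: list[dict], feedback_slug: str) -> dict:
    """Tally junk chunks, distinct feedback files, and distinct broadcast files."""
    junk_chunks = 0
    feedback_files = set()
    broadcast_files = set()
    for payload in payloads:
        # Normalize Windows separators so path components compare consistently.
        path = payload.get("path", "").replace("\\", "/")
        parsed = PurePosixPath(path)
        if JUNK_DIRS.intersection(parsed.parts):
            junk_chunks += 1
        if feedback_slug in parsed.parts:
            feedback_files.add(path)
        if fnmatchcase(parsed.name, "BROADCAST-OUT-*.md"):
            broadcast_files.add(path)
    return {
        "junk_chunks": junk_chunks,
        "feedback_files": len(feedback_files),
        "broadcast_files": len(broadcast_files),
    }
```

After a clean re-bootstrap the expected summary would be junk_chunks = 0, feedback_files = 27, and broadcast_files = 3.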

Stance (charter v2)

SE = USER of infra. SE handled its own config declaration (corpus/exclude) + content; RAG mechanism (bootstrap/chunk/path-resolution) stays AI_INFRA. Conflict → pqhuy decides.