[CLAUDE] Docs: close out Session 21 turn 2: RAG Hybrid setup planning + Cách A validation

After S21 turn 1 closed out cicd-monitor, the user clarified that 5 future projects will exceed 1M tokens of Markdown, which led to a deep discussion (~15 turns) about RAG infrastructure. The main agent worked solo (no SOLUTION_ERP sub-agent spawned) and delegated research on Anthropic and community practice to claude-code-guide × 2.

Decisions:
- Cách A (defensive): keep the 120K main-agent blanket and add RAG retrieval as a supplement
- Cách B (aggressive, cut 60-70% of the blanket) dropped: it violates the priority of keeping the main agent's control flow strong
- Industry-validated across 4 Anthropic blog posts + 5 community tools (Cursor/Continue/Cline/Aider/Sourcegraph Cody, all hybrid)
- 3-layer pattern, Phase 1-3 incremental rollout (vector → +BM25 → +reranking, recall ~70% → ~92%)
- Stack: Voyage-3-large + Qdrant local + FastMCP Python + Streamlit dashboard

Multi-agent cost reality (clarified after S21 t2):
- Main agent blanket: ~120K
- 4 sub-agent spawns, cumulative: ~400K
- Total billed for a heavy session: ~560K with Cách A vs ~700K lazy
- ~20% saving from the multi-agent shared cache (70-90% hit rate)
- Anthropic acknowledges an 8-10× cost multiplier for multi-agent setups

Files updated:
- docs/STATUS.md (Last updated S21 turn 2 + new top row in Recently Done)
- docs/HANDOFF.md (TL;DR Session 21 turn 2 section + Last updated)
- docs/rag-setup-plan.md (+Section 13 multi-agent cost reality + Section 14 3-layer hybrid Phase 1-3, +355 LOC)
- docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md (new session log)

User-level memory update (outside the repo, updated separately):
- feedback_rag_hybrid_pattern.md (NEW, reusable cross-project pattern)
- MEMORY.md index (+1 entry pointer)

Plan I NEW deferred. Trigger: the user confirms the 5 project paths + stack + pilot + Voyage API key + disk cleanup → dedicated 10-14h weekend session (per the feedback_drastic_refactor_scope rule).

Stats:
- 17 memory entries (+1 RAG hybrid)
- 1 plan file rag-setup-plan.md (1500 LOC final)
- 4 sub-agents seeds-only, unchanged
- 81 tests, unchanged
- 4 commits in S21 cumulative (f1c61c9 + 3a34831 + 1f8e9af + this)

CI skipped per path filter (all .md).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 18:50:28 +07:00
parent 1f8e9af66f
commit 0a3b747612
4 changed files with 783 additions and 2 deletions
--- a/docs/HANDOFF.md
+++ b/docs/HANDOFF.md
@@ -1,6 +1,112 @@
 # HANDOFF: 5-minute brief for the next session
-**Last updated:** 2026-05-12 (Session 21 turn 1: **🎯 Added the 4th sub-agent, cicd-monitor (Path A, post-deploy verifier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`. CI skipped per path filter (3 `.md` files). Cost reality update: ~750K spawn (3 → 4 agents) · ~1.35M heavy / ~700K optimized. Stats: 4 sub-agents seeds-only · 16 memory · 27 mig · 59 tables · ~142 endpoints · 81 tests · 44 gotchas · 6 skills unchanged. The 3 agent MEMORY.md files were NOT flushed (no work spawned yet; main agent solo). Trial Week 1 kick-off in S21 turn 2+ with Plan B: Contract V2 wiring, mirroring the PE pattern.**)
+**Last updated:** 2026-05-12 (Session 21 turn 2: **🎯 RAG Hybrid setup planning + Cách A validation deep dive. 2 commits (`1f8e9af` plan save, 1223 LOC + this close-out). Plan only, NO implementation; deferred until the user confirms the 5 future projects. Decision: Cách A defensive (keep the 120K main-agent blanket + RAG retrieval) over Cách B aggressive (cut 60-70% of the blanket). Industry-validated across 4 Anthropic blog posts + 5 community tools (Cursor/Continue/Cline/Aider/Sourcegraph Cody). Stack: Voyage-3-large + Qdrant + FastMCP + Streamlit dashboard. Multi-agent cost reality: main agent + 4 sub-agents → ~520K cumulative blanket → heavy session ~560K (Cách A) vs ~700K (lazy). 3-layer pattern, Phase 1-3 rollout (embeddings + BM25 + reranking, ~70% → ~92% recall). Stats: +1 memory entry (`feedback_rag_hybrid_pattern`), +1 plan file (`rag-setup-plan.md`, 1500 LOC). Sub-agents still 4, seeds-only; main agent solo session.**)
 **S21 turn 1:** 2026-05-12 0030 (Session 21 turn 1: **🎯 Added the 4th sub-agent, cicd-monitor (Path A, post-deploy verifier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`. CI skipped per path filter (3 `.md` files). Cost reality update: ~750K spawn (3 → 4 agents) · ~1.35M heavy / ~700K optimized. Stats: 4 sub-agents seeds-only · 16 memory · 27 mig · 59 tables · ~142 endpoints · 81 tests · 44 gotchas · 6 skills unchanged. The 3 agent MEMORY.md files were NOT flushed (no work spawned yet; main agent solo). Trial Week 1 kick-off in S21 turn 2+ with Plan B: Contract V2 wiring, mirroring the PE pattern.**)
 ## TL;DR Session 21 turn 2: RAG Hybrid setup planning (Cách A chosen + 3-layer pattern)
 The user clarified that 5 future projects will exceed 1M MD tokens, which led to a deep discussion (~15 turns) about RAG infrastructure. The main agent worked solo (no SOLUTION_ERP sub-agent spawned) and delegated research on Anthropic and community practice to the claude-code-guide agent twice.
 ### Q&A deep dive 10 topics
 1. RAG fundamentals + Vector DB role (Qdrant)
 2. Embedding models + Voyage AI cost mechanics ($0.18/M tokens)
 3. Multi-project shared architecture (5 projects → single Qdrant + per-collection)
 4. Audit procedure 3-tier (weekly auto + monthly deep + quarterly major)
 5. UI/UX Streamlit dashboard 7 pages design
 6. Cách A defensive (keep the 120K blanket) vs Cách B aggressive (cut 60-70%)
 7. Reasoning depth comparison: lazy 60% → A 90% → B 75-80%
 8. Industry validation: Anthropic + Cursor + Continue + Cline + Aider all hybrid
 9. Multi-agent cost reality: 8-10× multiplier, ~520K cumulative blanket 5 entities
 10. 3-layer hybrid pattern (Anthropic Contextual Retrieval Sept 2024)
 ### Decision: Cách A vs Cách B
 **Chosen: Cách A** (defensive hybrid):
 - Blanket: KEEP the main agent's 120K unchanged + RAG retrieval as a supplement
 - Sub-agent spawn baseline: ~80-100K each (4 agents = ~400K cumulative at ~100K each)
 - Heavy session billed: ~560K (~20% saving vs lazy 700K)
 - Quality recall: ~85% (vs Cách B 75-80% due to fragmentation)
 **Why Cách A** (the user's stated priorities):
 - ✅ Strong main-agent control flow (direct state ownership, fast responses)
 - ✅ Decision quality 90% (cohesive multi-source reasoning)
 - ✅ Wall-clock per task -20% (12 min vs 16 min for Cách B)
 - ✅ Risk-averse (graceful fallback to the blanket if RAG fails)
 - ✅ Multi-agent leverage: 70-90% cache hit on common queries
 - ✅ Industry-validated (Anthropic + Cursor + Continue + Cline + Aider)
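
The "graceful fallback" property of Cách A can be illustrated with a minimal sketch. It assumes the blanket is a list of always-loaded text blocks and that retrieval is an arbitrary callable that may raise; `build_context` and its parameters are hypothetical names, not part of the plan.

```python
from typing import Callable, List


def build_context(
    query: str,
    blanket: List[str],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Start from the blanket and append retrieved chunks when retrieval works.

    Hypothetical helper: `retrieve` stands in for whatever RAG lookup is used.
    """
    context = list(blanket)  # the blanket is always present (Cách A)
    try:
        context.extend(retrieve(query))
    except Exception:
        # RAG is unavailable: degrade to blanket-only instead of failing.
        pass
    return context
```

Under Cách B the blanket would be too small for the `except` branch to give a useful answer, which is the risk the decision above is guarding against.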
 ### 3-layer hybrid Phase rollout (Anthropic Contextual Retrieval)
 | Phase | Layers | Recall | Cost/mo |
 |---|---|---|---|
 | Phase 1 (Week 1-4) | Vector embedding only (Voyage-3-large) | ~70% | ~$1.50 |
 | Phase 2 (Month 2) | + BM25 hybrid (bm25s free local) | ~78% | ~$1.50 |
 | Phase 3 (Month 3) | + Voyage rerank-2 + Contextual prefix | ~92% | ~$4-5 |
 ### Stack validated cross-industry
 - Voyage AI embedding (Anthropic partner, multilingual 26 lang)
 - Qdrant local (Rust binary, "leading agent memory backend 2026")
 - FastMCP Python (official Anthropic SDK)
 - SQLite event log + Streamlit dashboard 7 pages
 - Pre-commit hook re-index delta
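
The delta re-index idea in the last bullet could work roughly as follows. This is a sketch under assumptions: it compares SHA-256 content hashes against a stored map, and the function names and the in-memory `indexed` map are hypothetical (the plan itself does not specify how deltas are detected).

```python
import hashlib
from typing import Dict, List


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_docs(docs: Dict[str, str], indexed: Dict[str, str]) -> List[str]:
    """Paths that are new or whose content differs from the indexed hash."""
    return sorted(
        path for path, text in docs.items()
        if indexed.get(path) != content_hash(text)
    )


def removed_docs(docs: Dict[str, str], indexed: Dict[str, str]) -> List[str]:
    """Paths that were indexed earlier but no longer exist."""
    return sorted(path for path in indexed if path not in docs)
```

Only the paths returned by `changed_docs` would need re-embedding, which keeps the per-commit embedding cost proportional to the size of the change.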
 ### Multi-agent cost reality (Anthropic warns of an 8-10× multiplier)
 ```
 Per-entity blanket (Cách A):
  Main agent: ~120K
  4 sub-agents × ~100K spawn = 400K cumulative
  Total: ~520K cumulative billed (not a single context window)
 Heavy session, 4-agent spawn:
  Lazy: ~700K effective billed
  Cách A: ~560K (-20% from the multi-agent shared cache)
 ```
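
The figures in the block above can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted there; the function and constant names are illustrative.

```python
MAIN_BLANKET = 120_000      # main agent blanket, tokens
SUB_AGENT_SPAWN = 100_000   # approximate spawn baseline per sub-agent
SUB_AGENT_COUNT = 4


def cumulative_blanket(
    main: int = MAIN_BLANKET,
    per_agent: int = SUB_AGENT_SPAWN,
    agents: int = SUB_AGENT_COUNT,
) -> int:
    """Total blanket tokens billed across the main agent and all sub-agents."""
    return main + per_agent * agents


def saving_ratio(optimized: int, baseline: int) -> float:
    """Fraction saved by `optimized` relative to `baseline`."""
    return 1 - optimized / baseline


if __name__ == "__main__":
    print(cumulative_blanket())             # 520000
    print(saving_ratio(560_000, 700_000))   # roughly 0.20
```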
 ### Plan I NEW: RAG Setup Implementation (deferred)
 **Trigger:** The user confirms the 5 project paths + stack + pilot choice + Voyage API key + 5-8GB disk cleanup.
 **Schedule:** Dedicated 10-14h weekend session (per `feedback_drastic_refactor_scope`).
 **Phase rollout:**
 - Phase 1: single-project pilot, 4-week trial
 - Phase 2-3: incremental upgrades, conditional on Phase 1 success
 - Realistic cost: ~$2-5/month total for 5 projects
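
For a rough feel of the embedding cost, the sketch below applies the $0.18 per million tokens rate quoted in the Q&A topics. The function name is illustrative, and the 2M-token example is an inference: it is the corpus size at which that rate gives the "$0.36 initial" figure mentioned in STATUS.md, not a number stated in the plan.

```python
VOYAGE_PRICE_PER_M_TOKENS = 0.18  # USD per 1M tokens, as quoted in these docs


def embedding_cost(tokens: int, price_per_m: float = VOYAGE_PRICE_PER_M_TOKENS) -> float:
    """USD cost of embedding `tokens` tokens once."""
    return tokens / 1_000_000 * price_per_m


if __name__ == "__main__":
    # Assumed corpus size of 2M tokens for the initial index.
    print(f"${embedding_cost(2_000_000):.2f}")
```

Re-indexing only changed files keeps the recurring part of this cost small; reranking and contextual prefixes in Phase 3 are what push the monthly estimate toward the upper end of the range.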
 ### Deliverables
 - ✅ `docs/rag-setup-plan.md` (commit `1f8e9af`, 1223 LOC + S21 t2 extension ~300 LOC = ~1500 LOC final)
 - ✅ Memory `feedback_rag_hybrid_pattern.md` (NEW, cross-project reusable)
 - ✅ MEMORY.md index +1 entry
 - ✅ Session log (this close-out)
 - ⏭ Implementation deferred until the trigger fires
 ### Main agent solo in S21 turn 2 (no SOLUTION_ERP sub-agent spawned)
 2 spawns this session, NOT the 4 SOLUTION_ERP sub-agents:
 - claude-code-guide × 2 (generic agent for Anthropic + industry research)
 - The 4 SOLUTION_ERP sub-agents (Inv/Imp/Rev/CICD) remain seeds-only
 ### Final state, S21 turn 2
 | Metric | Before | After | Δ |
 |---|---|---|---|
 | DB tables | 59 | 59 | 0 |
 | Migrations | 27 | 27 | 0 |
 | Endpoints | ~142 | ~142 | 0 |
 | FE pages | 34 | 34 | 0 |
 | Unit tests | 81 | 81 | 0 |
 | Gotchas | 44 | 44 | 0 |
 | **Memory entries** | 16 | **17** | **+1** (RAG hybrid pattern) |
 | Skills | 6 | 6 | 0 |
 | Sub-agents | 4 seeds-only | 4 seeds-only | 0 |
 | **Commits S21 cumulative** | 2 | **4** | **+2** |
 | **Plan files** | 0 | **1** (`rag-setup-plan.md`) | **+1** |
 ---
 ## TL;DR Session 21 turn 1: Add cicd-monitor (4th sub-agent, Path A chosen)
--- a/docs/STATUS.md
+++ b/docs/STATUS.md
@@ -2,7 +2,8 @@
 > **Update rule:** before starting a task, add a row to `🔥 In Progress`. When done, move it to `✅ Recently Done`.
-**Last updated:** 2026-05-12 (Session 21 turn 1: **🎯 Added the 4th sub-agent, cicd-monitor (Path A, post-deploy verifier, green READ tier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`, CI skipped per path filter (`**/*.md` paths-ignore, docs-only). Trade-off: ~150K extra spawn per run, in exchange for automatically catching failed deploys (bundle hash unchanged / migration drift on prod / endpoint 500), a recurring blind spot: in S20 the main agent, working solo, forgot to verify ~30% of pushes. Cost reality update: ~750K spawn setup (3 → 4 agents) · ~1.35M heavy session · ~700K optimized cached. Stats: 4 sub-agents seeds-only (+1 cicd-monitor, green) · 16 memory entries (none new; existing `feedback_multi_agent_setup.md` updated from a 3-agent to a 4-agent narrative) · 27 mig · 59 tables · ~142 endpoints · 81 tests unchanged · 44 gotchas unchanged · 6 skills unchanged. The 3 agent MEMORY.md files were NOT flushed (no work spawned in S21 t1, so there are NO findings; the main agent worked solo via context + Write file).**)
+**Last updated:** 2026-05-12 1800 (Session 21 turn 2: **🎯 RAG Hybrid setup planning + Cách A validation deep dive. 2 commits (`1f8e9af` plan save, 1223 LOC + this close-out). Main agent solo (no SOLUTION_ERP sub-agent spawned), delegating research on Anthropic and community practice to claude-code-guide × 2. Decision: Cách A defensive (keep the 120K main-agent blanket + RAG retrieval as a supplement) over Cách B aggressive (cut 60-70% of the blanket). Industry-validated across 4 Anthropic blog posts + 5 community tools (Cursor/Continue/Cline/Aider/Sourcegraph Cody, all hybrid). Stack: Voyage-3-large + Qdrant local + FastMCP Python + Streamlit dashboard (7 pages) + SQLite event log. Multi-agent cost reality: main agent + 4 sub-agents → ~520K cumulative blanket → heavy session ~560K (Cách A) vs ~700K (lazy), a ~20% saving. 3-layer pattern, Phase 1-3 rollout (Layer 1 vector → Layer 2 +BM25 → Layer 3 +reranking, recall ~70% → ~92%). Stats: +1 memory entry (`feedback_rag_hybrid_pattern.md`), +1 plan file (`rag-setup-plan.md`, 1500 LOC final). 4 sub-agents still seeds-only. Plan I NEW deferred until the user confirms the 5 project paths + stack + Voyage API key + 5-8GB disk cleanup.**)
 **S21 turn 1:** 2026-05-12 0030 (Session 21 turn 1: **🎯 Added the 4th sub-agent, cicd-monitor (Path A, post-deploy verifier, green READ tier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`, CI skipped per path filter (`**/*.md` paths-ignore, docs-only). Trade-off: ~150K extra spawn per run, in exchange for automatically catching failed deploys (bundle hash unchanged / migration drift on prod / endpoint 500), a recurring blind spot: in S20 the main agent, working solo, forgot to verify ~30% of pushes. Cost reality update: ~750K spawn setup (3 → 4 agents) · ~1.35M heavy session · ~700K optimized cached. Stats: 4 sub-agents seeds-only (+1 cicd-monitor, green) · 16 memory entries (none new; existing `feedback_multi_agent_setup.md` updated from a 3-agent to a 4-agent narrative) · 27 mig · 59 tables · ~142 endpoints · 81 tests unchanged · 44 gotchas unchanged · 6 skills unchanged. The 3 agent MEMORY.md files were NOT flushed (no work spawned in S21 t1, so there are NO findings; the main agent worked solo via context + Write file).**)
 **S20 wrap:** 2026-05-11 22:00 (Session 20 wrap turns 1-12 — **🎯 14 commit `9dee00d` → `ae1814c`. PE Detail UI restructure 3 yêu cầu (t1-5) + Manual budget drop tên (t6) + Mig 27 admin menu eOffice (t7) + NCC palette 5-màu cycle + Winner icon ✓ đậm + AddSupplier auto-fill master + Responsive laptop nhỏ 4-tầng pattern (t8-11) + Multi-agent infrastructure setup 3 sub-agents (t12). 27 mig (+1) · 59 tables · ~142 endpoints (+1) · 34 FE pages (+1) · 61 menu key (+1) · 81 test pass unchanged · 44 gotcha · 16 memory entries (+2) · 3 sub-agents NEW. Phase 9 UAT iteration mode.**)
 **S20 turn 7:** 2026-05-11 17:00 (Session 20 turn 7 — **🎯 Admin Ẩn/Hiện + Đổi tên menu eOffice (Mig 27). 5 chunk `2ea2d27`→`ef394f8`→`059bfcb`→`1ed6530`→Chunk E Docs. User Q2=b: DisplayLabel CHỈ áp fe-user, admin sidebar giữ Label gốc. Domain MenuItem +IsVisible(true) +DisplayLabel(200). Mig 27 AddVisibilityAndDisplayLabelToMenuItems. BE PATCH /api/menus/{key} [Authorize Policy=Permissions.Update]. NEW FE-admin MenuVisibilityPage ~210 LOC (table inline edit per-row + Save dirty + Khôi phục mặc định + Toggle Eye/EyeOff + 4 StatCard). fe-user Layout filterForUser 2 tầng (USER_HIDDEN_KEYS hardcode + !isVisible dynamic) + effectiveLabel(displayLabel || label) replace 3 callsite. fe-admin Layout KHÔNG đụng. +1 menu key MenuVisibility "Menu eOffice" leaf System Order=94. 27 mig, 59 tables, ~142 endpoints, 34 FE pages, 81 test pass (Q4 UAT defer).**)
 **S20 prev:** 2026-05-11 (Session 20 — **🎯 PE Detail UI restructure 3 yêu cầu user UX. 4 chunk per-commit `9dee00d` → `2bba851` → `f2f01f4` → (current Chunk D Docs).** Q1=a (giữ Section "Chọn NCC TP" riêng), Q2=a "1 hạng mục trước tiên" (NCC shared, demo 1 hạng mục), Q3=a (chỉ hiện NV đã ký), Q4 public luôn (skip dotnet test mỗi chunk theo memory `feedback_uat_skip_verify`, vẫn `npm run build` × 2 app mỗi chunk vì có rename/remove function). **Chunk A (`9dee00d`)**: BE `CreatePurchaseEvaluationCommandHandler` thêm 1 PurchaseEvaluationDetail mặc định khi tạo phiếu — GroupCode="01", GroupName="Hạng mục chính", NoiDung=TenGoiThau, DonGiaNganSach=ThanhTienNganSach=Budget.TongNganSach hoặc BudgetManualAmount fallback 0; Changelog Insert audit. FE reorder PeDetailTabs (mirror 2 app) 1.Thông tin / 2.Hạng mục (lên #2) / 3.Chọn NCC / 4.NCC tham gia / 5.Ý kiến. **Chunk B (`2bba851`)**: ItemsTab restructure thành list `HangMucCard` (1 card / 1 hạng mục, expanded=true mặc định cho 1 hạng mục demo). Header card: GroupCode + NoiDung + 3 stat (KL/ĐG/TT) + NS link Δ nếu có + Pencil/Trash actions + ▼/▶ toggle expand. Expand body: NCC inline table columns NCC / Liên hệ / Điều khoản TT / **File báo giá** / ĐG chưa VAT / ĐG có VAT / Thành tiền / Action. Quote inline click cell → QuoteDialog cũ reuse. Add NCC + Sửa NCC reuse AddSupplierDialog/EditSupplierDialog cũ. Winner ✓ button mỗi NCC row. Drop function `SuppliersTab` (dead code ~134 LOC, replace bằng HangMucCard expand panel). Giữ AddSupplierDialog + EditSupplierDialog + SupplierAttachmentsCell (HangMucCard call lại). Section layout cuối: 1.Thông tin / 2.Hạng mục + Báo giá NCC (nested) / 3.Chọn NCC TP thắng thầu / 4.Ý kiến cấp duyệt — 4 section. **Chunk C (`f2f01f4`)**: Section Ý kiến restructure render layer (KHÔNG đụng Mig 26 schema — vẫn UPSERT 1 row / Level). LevelOpinionsSectionV2 forEach step → 1 `StepOpinionsBox` (replace grid-cols-2 cho N approver). Box header: "Bước N — Tên" + dept badge emerald + "X/Y đã duyệt" counter. 
Body: filter opinions theo step.order → sort levelOrder asc, signedAt asc → render `StepOpinionEntry` per signed opinion (tên NV + Cấp badge slate + admin override badge amber nếu có + emerald rounded-full timestamp + comment text). NV chưa duyệt KHÔNG hiển thị (Q3=a). Drop function `LevelOpinionBox` (replaced). Mirror fe-admin + fe-user. Verify build pass cả 2 app sau khi catch TS6133 `SuppliersTab` + `SupplierAttachmentsCell` unused (đã giải quyết: drop SuppliersTab, restore SupplierAttachmentsCell vào HangMucCard cột "File báo giá"). 81 test pass (no change — UAT defer)**)
@@ -64,6 +65,7 @@
 | Ngày | Ai | Task | Commit |
 |---|---|---|---|
 | 2026-05-12 | Claude | **🎯 SESSION 21 turn 2: RAG Hybrid setup planning + Cách A validation deep dive (2 commits: `1f8e9af` plan save + this close-out)**. After S21 turn 1 closed out cicd-monitor, the user clarified that 5 future projects will exceed 1M MD tokens, which led to a deep discussion (~15 turns) about RAG infrastructure. **Main agent solo** (no SOLUTION_ERP sub-agent spawned), delegating research on Anthropic and community practice to **claude-code-guide × 2**. **Q&A deep dive, 10 topics**: (1) RAG fundamentals + the role of the Qdrant vector DB, (2) embedding models + Voyage AI cost mechanics ($0.18/M tokens), (3) multi-project shared architecture (5 projects → single Qdrant + per-project collections), (4) 3-tier audit procedure (weekly auto + monthly deep + quarterly major), (5) UI/UX design of a 7-page Streamlit dashboard (overview + drill-down + compare + audit + cost + change + admin), (6) Cách A defensive (keep the 120K blanket) vs Cách B aggressive (cut 60-70%), (7) reasoning depth comparison: lazy 60% → A 90% → B 75-80%, (8) industry validation: Anthropic + Cursor + Continue + Cline + Aider, all hybrid, (9) multi-agent cost reality: 8-10× multiplier, ~520K cumulative blanket across 5 entities, (10) 3-layer hybrid pattern from Anthropic's Contextual Retrieval (Sept 2024). **Decision: Cách A** (defensive hybrid: keep the 120K main-agent blanket + RAG retrieval as a supplement, sub-agent spawn baseline ~100K each, 4 agents = ~400K cumulative, heavy session billed ~560K, a ~20% saving vs lazy 700K, quality recall ~85%) over **Cách B, dropped** (an aggressive 60-70% cut violates the priority of strong main-agent control flow, fragments reasoning, adds +1-2s UX latency per state question, and carries severe risk if RAG fails). **Why Cách A** (the user's stated priorities): preserves strong main-agent control flow, decision quality 90% with cohesive multi-source reasoning, wall-clock -20% (12 min vs 16), risk-averse graceful fallback, multi-agent cache leverage 70-90%, industry-validated by 9 sources. **3-layer hybrid phase rollout**: P1 (W1-4) vector only, Voyage-3-large, recall ~70%, $1.50/mo · P2 (M2) +BM25 via bm25s (free), recall ~78%, $1.50/mo · P3 (M3) +Voyage rerank-2 + Contextual prefix, recall ~92%, $4-5/mo. **Stack validated** cross-industry: Voyage AI embedding (Anthropic partner, multilingual 26 languages, $0.36 initial), Qdrant local (Rust, 50MB, agent-native 2026 leader, ~3GB disk for 5 projects), FastMCP Python (official SDK, ~100 LOC), SQLite event log (5 tables + audit history), Streamlit 7 pages. **Plan I NEW deferred**. Trigger: the user confirms the 5 project paths + stack + pilot + Voyage API key + disk cleanup → dedicated 10-14h weekend session (per the `feedback_drastic_refactor_scope` rule). **Deliverables**: `docs/rag-setup-plan.md` 1223 LOC in commit `1f8e9af` + S21 t2 extension ~300 LOC = ~1500 LOC final, memory `feedback_rag_hybrid_pattern.md` (cross-project reusable), session log (this close-out), MEMORY.md index +1 entry. **CI skipped** by path filter (`.md`). **4 sub-agents still seeds-only** (not spawned in S21 turn 2, so MEMORY.md was NOT flushed, per §6.5: do NOT add noise). Tests baseline 81 unchanged. | `1f8e9af` (plan save) · this close-out (final commit) |
 | 2026-05-12 | Claude | **🎯 SESSION 21 turn 1 — Add con thứ 4 cicd-monitor (Path A — post-deploy verifier green READ tier, 1 commit `f1c61c9`)** — User chốt Path A sau pre-flight Plan G Trial Week 1: thêm sub-agent thứ 4 chuyên post-deploy verify (Gitea Actions poll + bundle hash 2 app verify + sqlcmd mig prod = repo latest + endpoint smoke). Trade-off: +~150K spawn extra mỗi run, đổi lại catch deploy ship fail tự động — recurring blind spot pattern em main solo S20 quên verify ~30% push. **2 file mới**: `.claude/agents/cicd-monitor.md` (~7KB) — system prompt + 8-step workflow (verify push → poll Gitea API → fail log grep → live curl smoke → bundle hash × 2 app + verify changed → sqlcmd mig prod = repo latest → report PASS/FAIL/PARTIAL/TIMEOUT/SKIPPED-DOCS) + 5-stage report table + gotcha #25/#39/#40/#41/#44 cross-ref + skill `iis-deploy-runbook`/`dependency-audit-erp`/`ef-core-migration` preload + Anti-pattern 9 rules. `.claude/agent-memory/cicd-monitor/MEMORY.md` (~5KB seed) — recurring CI bug patterns + 5-stage checklist + baseline build/bundle metrics + bearer test pattern admin/nv.test. **1 file update repo**: `.claude/agents/README.md` — 4-agent architecture diagram (green slot mới) + decision tree (after push code + prod issue diagnose branches) + memory routine 4 SendMessage + skills preload 4 agents + cost reality table 564K → 750K spawn / 1.2M → 1.35M heavy / 600K → 700K optimized + trial workflow Week 1-3 CI/CD Monitor spawn integrated + pass criteria + catch ≥1 deploy ship fail. **Memory user-level update**: `feedback_multi_agent_setup.md` — title 3 → 4 sub-agents, decision tree +CI/CD Monitor invocation branches (after push + user prod issue), skills preload list +CI/CD Monitor (iis-deploy-runbook + dependency-audit-erp + ef-core-migration), cost table update + trade-off rationale (recurring blind spot ~30% push S20). 
**CI skipped**: all 3 file changed `.md` → match `paths-ignore: '**/*.md'` per gotcha #41 → no Gitea Actions run → no IIS deploy (expected — agent infra là local Claude Code, không cần present trên prod). Push success `36e21c8..f1c61c9 main -> main`. **3 (now 4) sub-agents vẫn seeds-only**: chưa spawn work nào — em main solo via context paste + Write file. KHÔNG flush 3 agent MEMORY.md (chưa spawn work = không findings, per §6.5 KHÔNG add noise entry). cicd-monitor MEMORY.md có entry "setup 2026-05-12" trong seed. Trial Week 1 kick-off ở Session 21 turn 2+ với Plan B Contract V2 wire Mig 28+29 candidate (mirror PE pattern S17-S19 proven 1×). Tests baseline 81 unchanged (no test added — docs-only commit). | `f1c61c9` (Setup cicd-monitor + README 4-agent + memory update) |
 | 2026-05-11 | Claude | **🎯 SESSION 20 turns 6 + 8-12 — PE polish (NCC palette + autofill + responsive) + Multi-agent setup (7 commit `f568945` → `ae1814c`)** — Sau turn 7 wrap-up Mig 27, user iterate 7 polish/feature lớn nhỏ. **Turn 6 (`f568945`)** Manual budget "Nhập tay" drop tên field — 3 file × 2 app mirror (BudgetFieldRow + WorkspaceCreateView + HeaderForm) bỏ Input "Tên" UI khỏi manual mode, BE save `budgetManualName: null` luôn, VND format `1.000.000` + suffix đ. **Turn 8 (`3ec7b5a`)** AddSupplier +Số tiền inline + NCC 5-màu palette + Winner badge "🏆 Trúng thầu" — AddSupplierDialog +prop detailId? +form thanhTien, sequential POST /suppliers (response {id}) → POST /quotes (nếu detailId + thanhTien > 0). NCC_PALETTES const 5 màu literal Tailwind (blue/purple/sky/teal/pink) cycle theo idx. Winner row override emerald-500 border-l + bg-emerald-100/70 + shadow-sm + ring-1 emerald-300 + badge rounded-full bg-emerald-600 text-white "🏆 Trúng thầu". **Turn 9 (`83aae8e`)** User feedback bỏ badge → revert icon ✓ stick cũ nhưng đậm hơn (text-base font-bold emerald-700) + tên NCC winner text-emerald-900 + hover transition (winner hover:bg-emerald-200/70, non-winner hover:bg-white/80 hover:shadow-sm). **Turn 10 (`66551db`)** AddSupplierDialog auto-fill từ master data khi chọn NCC dropdown — onChange lookup picked supplier, setForm ghi đè 4 field (contactName ← contactPerson / contactPhone ← phone / contactEmail ← email / note ← note). Hint emerald "✓ Đã tự điền từ Master". User vẫn override được. **Turn 11 (`6e338f7`)** Responsive cho laptop màn hình nhỏ 1280-1366px — 4-tầng pattern: sidebar fe-admin + fe-user `w-72` → `w-60 xl:w-72` (+48px lg) / PE Workspace 2-panel `lg:[320px_1fr]` → `lg:[260px_1fr] xl:[320px_1fr]` (+60px lg) / Section padding `px-5 py-4` → `px-3 py-3 sm:px-5 sm:py-4` (+16px xs) / HangMucCard `gap-3 p-3` → `flex-wrap gap-2 p-2 sm:gap-3 sm:p-3` (+8px xs). Net gain trên 1366px ~+132px width cho NCC table area. 
Memory `feedback_responsive_laptop_breakpoint.md` capture pattern. **Turn 12 (`ae1814c`)** SETUP MULTI-AGENT INFRASTRUCTURE 3 sub-agents (Investigator READ cyan + Implementer WRITE conditional yellow + Reviewer READ adversarial red) + em main coordinator. Pre-flight decision gate 6/6 ✅. Phase 1-4 execute: `.claude/agents/` 4 file (README ~9.7KB + investigator + implementer + reviewer) + `.claude/agent-memory/` 3 MEMORY.md seed (~6KB each). Customize SOLUTION_ERP: skills preload mỗi agent (reuse 6 skills hiện có) + bearer test (admin@solutions / nv.test@solutions) + prod UAT URL + Phase 9 UAT mode + DB Dev/Design distinct. Windows MAX_PATH pitfall handled — drop `isolation: worktree` khỏi implementer.md (project path 51 chars + Dropbox-managed nested overflow 260+ chars). Memory `feedback_multi_agent_setup.md` capture decision gate + ACCEPT/REFUSE criteria + NAMGROUP s41-s43 ROI reference. 3 agents **chưa spawn work** ở S20 turn 12 — seeds-only state. Trial Week 1 candidate Contract V2 wire Mig 28+29 (mirror PE pattern proven). **Stats cumulative S20:** 27 mig (+1 Mig 27 from turn 7) · 59 tables · ~142 endpoints (+1 PATCH /menus/{key}) · 34 FE pages (+1 MenuVisibilityPage) · ~61 menu key (+1) · 81 test pass unchanged · 44 gotcha unchanged · **16 memory entries (+2: responsive + multi-agent)** · 6 skills unchanged · **3 sub-agents NEW** · 14 commits S20. | `f568945` (t6) · `3ec7b5a` (t8) · `83aae8e` (t9) · `66551db` (t10) · `6e338f7` (t11) · `ae1814c` (t12) · (current Docs t13 wrap) |
 | 2026-05-11 | Claude | **🎯 SESSION 20 turn 7 — Admin Ẩn/Hiện + Đổi tên menu eOffice (Mig 27, 5 chunk `2ea2d27`→`ef394f8`→`059bfcb`→`1ed6530`→Chunk E Docs)** — User UAT yêu cầu "tính năng Ẩn Hiện và Đổi tên hiển thị của các Menu bên ngoài Office, làm trong Trang Admin Page". Hỏi xác nhận "chưa có" — đúng. User clarify Q2=b "edit hiển thị bên ngoài, chỉ của eOffice thôi" → admin sidebar luôn giữ Label gốc, DisplayLabel CHỈ áp fe-user. Q1=a global (không per-role), Q3=a giữ USER_HIDDEN_KEYS hardcode + tầng IsVisible dynamic combine, Q4 UAT skip test. **Chunk A** Domain MenuItem +IsVisible bool=true +DisplayLabel string?(200) + EF config + Migration 27 AddVisibilityAndDisplayLabelToMenuItems (2 AddColumn) — 3-file rule, apply LocalDB _Dev + _Design OK. **Chunk B** BE API: MenuNodeDto + MenuItemDto +isVisible +displayLabel (sau CRUD flags trước Children). GetMyMenuTreeQueryHandler pass through, KHÔNG filter server-side — 2 FE app tự quyết. UpdateMenuItemCommand + Validator + Handler (trim DisplayLabel whitespace → null). MenusController +PATCH /api/menus/{key} [Authorize Policy=Permissions.Update] body {isVisible, displayLabel}. **Chunk C** Domain MenuKeys +MenuVisibility const + All[] + DbInitializer +leaf "Menu eOffice" Icon=Eye Order=94 (Workflows shift 94→95). Manual seed Mig 27 LocalDB _Dev (INSERT MenuItems + Permissions Admin). FE Admin: types/menu.ts +isVisible +displayLabel, lib/menuKeys.ts +MenuVisibility, Layout resolver +/system/menu-visibility, App.tsx +Route. NEW pages/system/MenuVisibilityPage.tsx ~210 LOC: PageHeader + 4 StatCard (Tổng/Hiển thị/Đã ẩn/Đã đổi tên) + Search + Table 5 cột (Key mono + parentKey ↳ / Tên gốc / Input "Tên hiển thị" inline placeholder "Mặc định: {label}" / Toggle button emerald-Eye / amber-EyeOff / Lưu khi dirty + Khôi phục khi custom). PATCH endpoint, invalidate ['menus','all'] + ['my-menu'] trigger live update sidebar. Row hidden bg-amber-50/40 highlight, custom label bg-brand-50/40. **Chunk D** fe-user types/menu.ts mirror. 
Layout.tsx filterForUser 2 tầng (USER_HIDDEN_KEYS structural + !isVisible dynamic). Helper effectiveLabel(n) = displayLabel?.trim() || label. Replace 3 callsite {node.label} → {effectiveLabel(node)}. USER_FIXED_TOP "__inbox" entry +isVisible:true cho type check pass. **fe-admin Layout KHÔNG đụng** — admin sidebar render Label gốc + show hết menu (user Q2=b). **Chunk E Docs (current)**. **Stats Session 20 turn 7**: 26→27 mig, 59 DB tables (no change), ~141→142 endpoints, 33→34 FE pages, ~60→61 menu key, 81 test pass (Q4 UAT defer), 44 gotcha (no new). Memory entries 14 (no new). | `2ea2d27` (A Mig 27) · `ef394f8` (B BE API) · `059bfcb` (C FE admin) · `1ed6530` (D FE user) · (current E Docs) |
--- a/docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md
+++ b/docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md
@@ -0,0 +1,318 @@
 # Session 21 turn 2 — RAG Hybrid setup planning + Cách A validation deep dive
 **Date:** 2026-05-12 (continuing from S21 turn 1 at 00:30; the deep RAG discussion ran through the morning, afternoon, and evening)
 **Dev:** Claude (Opus 4.7 1M Max; main agent solo, no SOLUTION_ERP sub-agent spawned)
 **Base commit:** `3a34831` (S21 turn 1 close-out, cicd-monitor)
 **Commits:** `1f8e9af` (RAG plan save) + this close-out (2 commits in S21 turn 2)
 ## Context
 After S21 turn 1 closed out cicd-monitor (4 sub-agents seeds-only), the user asked about RAG infrastructure for **5 future projects with > 1M tokens of MD context**. The deep discussion ran ~15+ turns, covering:
 1. RAG fundamentals + Vector DB role
 2. Embedding models + Voyage AI cost mechanics
 3. Multi-project shared architecture (5 projects)
 4. Audit procedure 3-tier + change tracking SQLite
 5. UI/UX Streamlit dashboard 7 pages
 6. Cách A defensive (keep the blanket) vs Cách B aggressive (cut 60-70%)
 7. Reasoning depth comparison (lazy current vs Cách A vs Cách B)
 8. Industry validation via claude-code-guide research
 9. Multi-agent cumulative cost reality (4 agents → ~520K cumulative blanket)
 10. 3-layer hybrid pattern (Anthropic Contextual Retrieval: embeddings + BM25 + reranking)
 ## Deliverables
 ### New file: `docs/rag-setup-plan.md` (commit `1f8e9af`, 1223 LOC)
 Cross-project reference plan with 12 comprehensive sections:
 1. Context + Why
 2. Architecture overview (6-layer diagram)
 3. BLANKET load list (~100K, 28% MD)
 4. RAG store list (~254K, 72% MD)
 5. Tool stack recommend
 6. Setup scripts copy-paste ready (~250 LOC Python)
 7. Audit procedure 3-tier (weekly/monthly/quarterly)
 8. Multi-AI client access (Claude Code + Desktop + Cursor + GPT-4)
 9. Timeline rollout 10-14h dedicated session
 10. Caveats + risks
 11. Success metrics + decision gate
 12. Future enhancements
 ### File extended in S21 turn 2 (this close-out commit)
 Adds 2 sections to `rag-setup-plan.md`:
 - Section 13: Multi-agent cumulative cost reality (Anthropic 8-10× warning)
 - Section 14: 3-layer hybrid RAG upgrade path (Phase 1-3 Anthropic Contextual Retrieval)
 ## Decision: Cách A vs Cách B
 ### Chosen: **Cách A** (defensive hybrid) ⭐
 ```
 Blanket: KEEP the main agent's ~120K unchanged (35% MD)
 RAG: ADD as supplement (retrieve on-demand)
 Multi-agent: 4 sub-agents share retrieve cache
 Sub-agent spawn blanket: ~80-100K each (auto-inject + skills + spec)
 Cumulative blanket 5 entities: ~520K
 Heavy session billed: ~560K (saving 20% vs lazy)
 ```
 **Why Cách A (user priority: strong main-agent control flow):**
 1. ✅ Strong state ownership: the main agent knows the project state directly
 2. ✅ Decision quality 90% (vs Cách B 75-80% due to fragmentation)
 3. ✅ Wall-clock per task 12 min (vs Cách B 16 min)
 4. ✅ Smooth UX: fast, direct responses to state questions
 5. ✅ Risk-averse: graceful degradation if RAG fails (blanket fallback)
 6. ✅ Multi-agent leverage: 70-90% cache hit on common queries
 7. ✅ Quality recall +25-55pp (5-15 sources cross-validated vs lazy 1-3)
 ### Dropped: **Cách B** (aggressive cut)
 ```
 Blanket: CUT HARD by 60-70% (40-50K remaining)
 RAG: PRIMARY access mechanism for everything
 ```
 **Why dropped:**
 1. ❌ Violates the "strong main-agent control flow" priority
 2. ❌ Weak state ownership: every state question requires a retrieval
 3. ❌ UX latency +1-2s per state Q
 4. ❌ Decision quality 75-80% due to reasoning fragmentation
 5. ❌ Severe risk if RAG fails (the main agent is left disoriented)
 6. ❌ Anthropic research warns: "context rot inevitable cutting aggressively"
 7. ❌ Cascade retrieve problem (1 task → 2-3 retrieves)
 ## Industry validation via claude-code-guide research
 The claude-code-guide agent was spawned twice for research (NOT the SOLUTION_ERP sub-agents):
 ### Round 1: Anthropic setup inventory (10 features)
 - Memory tool beta (`content-management-2025-06-27`)
 - Prompt caching extensions (5min/1h beta)
 - Files API beta (`files-api-2025-04-14`)
 - Citations stable
 - MCP servers official + community (9,400+ in 2026)
 - Voyage AI embedding partnership
 - Context compaction tool
 - Claude Agent SDK orchestration
 - Batch API 50% discount
 - RAG best practices Anthropic official
 ### Round 2: Industry practice validation
 **Cách A matches Anthropic's explicit recommendations on 5/5 dimensions:**
 | Dimension | User setup | Anthropic pattern |
 |---|---|---|
 | Context approach | Hybrid blanket+RAG | ✅ Recommended explicit |
 | Sub-agent count | 4 | ✅ "3-5 optimal" |
 | MD scale | 5 projects > 1M | ✅ "Use RAG when >200K" |
 | Stack | Qdrant+Voyage+MCP | ✅ Production validated |
 | Coordination | Em main + agents | ✅ "Coordinator+workers" |
 **Source 4 Anthropic blog posts:**
 - "Effective Context Engineering for AI Agents" (2025)
 - "Contextual Retrieval" (Sept 2024 flagship)
 - "Effective Harnesses for Long-Running Agents"
 - "Multi-Agent Coordination Patterns"
 **Community consensus (Tier 1 tools all Hybrid):**
 - Cursor IDE `@codebase` indexing
 - Continue.dev MCP transport
 - Cline / Roo-Cline filesystem + AST + dynamic context
 - Aider code-as-graph
 - Sourcegraph Cody graph-aware
 → **ZERO** tools adopt aggressive Cách B pattern. **ALL** evolve toward Cách A hybrid.
 ## 3-layer hybrid pattern (Anthropic Contextual Retrieval Sept 2024)
 ```
 Layer 1: Embeddings (Voyage-3-large)
  → Semantic + synonym + multilingual catch
  Performance: baseline ~50% recall
 + Contextual prefix (Haiku-generated context):
  → 35% fewer retrieval failures = ~67% recall
 Layer 2: BM25 (bm25s Python lib free)
  → Exact identifier + technical terms catch
  + Layer 1 = ~75% recall
 Layer 3: Reranking (Voyage rerank-2)
  → Cross-attention deep relevance
  + Layer 1+2 = ~85% recall
 ```
 **Phase rollout incremental:**
 | Phase | Layer | Recall | Cost/month |
 |---|---|---|---|
 | Phase 1 (Week 1-4) | Layer 1 vector only | ~70% | ~$1.50 |
 | Phase 2 (Month 2) | + Layer 2 BM25 | ~78% | ~$1.50 (BM25 free local) |
 | Phase 3 (Month 3) | + Layer 3 + Contextual | ~92% | ~$4-5 |
 ## Multi-agent cost reality (Anthropic warns of an 8-10× multiplier)
 ```
 Per entity blanket:
  Main agent: ~120K
  Sub-agent each spawn: ~80-100K (auto-inject baseline + skills + spec)
 Cumulative blanket 5 entities = ~520K
 Heavy session full 4-agent spawn:
  Lazy current:  ~700K effective billed
  Cách A:        ~560K (-20% saving from multi-agent shared cache)
 Cost multiplier vs solo main agent: ~8-10×
 Anthropic acknowledged: "Expect 3-10× token multiplier"
 ```
 **Cách A saving breakdown (about -140K):**
 - Main agent lazy Read → retrieve: -25K
 - 4 agents lazy Read → cached retrieve: -160K (shared cache, 70-90% hit rate)
 - Reasoning streamlined: -20K
 - Added retrieve cost: +60K
 - Net: -145K, roughly the ~140K (-20%) saved per heavy session
 ## Stack validated
 | Component | Tool | Reason |
 |---|---|---|
 | **Vector DB** | Qdrant local | Rust binary 50MB, agent-native 2026 leader |
 | **Embedding** | Voyage-3-large | Anthropic partner, multilingual 26 lang, $0.18/M |
 | **MCP server** | FastMCP Python | Official Anthropic SDK |
 | **Chunking** | Custom adaptive Python | §6.5 compliant, transparent |
 | **Tracking** | SQLite local | Event log + audit + cost analytics |
 | **Dashboard** | Streamlit custom | 7 pages multi-project |
 | **Re-index** | Pre-commit hook | Native git, delta on commit |
 **Total cost for 5 projects:** ~$1.50-5/month depending on the phase, plus ~$0.50 for the initial embed.
 ## Main agent solo in S21 turn 2 (no SOLUTION_ERP sub-agents spawned)
 ```
 Spawned this session:
  ✅ claude-code-guide × 2 (generic agent for Anthropic research)
  ❌ Investigator / Implementer / Reviewer / CI/CD Monitor (still seeds-only)
 The main agent worked solo via context paste + Write file + delegated research.
 ```
 ## Skills check
 The 6 current skills are unchanged. Decision: do NOT add a new skill for RAG because:
 - RAG is a decision/architectural pattern, not a project-specific workflow
 - It applies across projects → a memory entry fits better than a skill
 - Per rule §9.5 anti-pattern "writing a skill just to have one more"
 - Skill creation is deferred until the Phase 1 trial validates the approach
 ## Tests
 81 unit tests, unchanged (0 tests added: pure planning, no code change).
 ## New memory entry
 **`feedback_rag_hybrid_pattern.md`** (NEW — cross-project pattern reusable):
 - Decision Cách A rationale (control flow priority)
 - Multi-agent cost reality (8-10× multiplier)
 - 3-layer hybrid pattern Phase 1-3 incremental rollout
 - Stack validated (Voyage + Qdrant + FastMCP)
 - When to apply / when NOT apply triggers
 - Anti-patterns documented
 - Anthropic 4 blog cross-ref
 ## Verify chain
 | Check | Status |
 |---|---|
 | dotnet build | Not run (no .cs change) |
 | dotnet test | Not run (no test added, pure docs) |
 | npm build | Not run (no FE change) |
 | Push origin | Pending end of turn |
 | CI Gitea Actions | Skip per path filter `.md` |
 | IIS prod deploy | Did NOT happen (CI skip, expected) |
 ## Docs updates
 - ✅ `docs/STATUS.md` — Last updated S21 turn 2 + Recently Done row top
 - ✅ `docs/HANDOFF.md` — TL;DR Session 21 turn 2 section + Last updated
 - ✅ `docs/rag-setup-plan.md` — extend +Section 13 (cost reality) +Section 14 (3-layer)
 - ✅ `docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md`: this file
 - ✅ Memory user-level new: `feedback_rag_hybrid_pattern.md`
 - ✅ Memory user-level: `MEMORY.md` index + 1 entry pointer
 - ⏭ NOT touched: rules.md / architecture.md / gotchas.md / database/* / flows/* / skills/* / CLAUDE.md (no real change for these 8 files)
 - ⏭ Did NOT flush the 4 sub-agent MEMORY.md files (not yet spawned; per §6.5, do NOT add noise)
 ## Handoff Session 21 turn 3+
 ### Plan I NEW — RAG Setup Implementation
 **Trigger:** User confirms the paths of the 5 projects + stack + pilot choice + Voyage API key + disk cleanup (5-8GB free).
 **Schedule:** Dedicated session 10-14h weekend (per memory `feedback_drastic_refactor_scope` rule).
 **Phases:**
 - Phase 1 (Week 1-4): Layer 1 vector embeddings only — ~70% recall — ~$1.50/mo
 - Phase 2 (Month 2): + Layer 2 BM25 hybrid — ~78% recall — ~$1.50/mo
 - Phase 3 (Month 3): + Layer 3 Reranking + Contextual — ~92% recall — ~$4-5/mo
 **Pre-flight task:** Spawn 🔵 Investigator to audit the MD inventory of the 5 projects in parallel → fine-tune the blanket list per project.
 ### Plan B Contract V2 wire (still pending from S21 turn 1)
 - Trial Week 1 multi-agent kick-off SOLUTION_ERP
 - 6 tasks (Mig 28+29 + Service + Controller + FE × 2 + Pin V2)
 - 4 sub-agents pipeline coordinate (first real spawn of all 4 agents)
 ### Plan C Test gap fill (still pending)
 Bundled with Chunk E of Plan B; 5 tests pending:
 - B4 silent 403 regression (gotcha #44 violates §7)
 - V2 Service `ApproveV2Async` UPSERT opinion
 - Chunk C merged-section render
 - Mig 25 PATCH `/user-selectable`
 - Mig 27 PATCH `/api/menus/{key}`
 ### Plan D-F-G unchanged
 - D: Hard blockers ops (UAT/SMTP/creds/backup): BLOCKED, waiting on the user
 - F: Periodic audit 2026-06-01 (~3 weeks away, do NOT run it automatically)
 - G: Multi-agent trial 4-week (post-S21 t1 + S21 t2 setup complete)
 ## Stats cumulative S21 turn 2
 | Metric | Before S21 t2 | After S21 t2 | Δ |
 |---|---|---|---|
 | DB tables | 59 | 59 | 0 |
 | Migrations | 27 | 27 | 0 |
 | Endpoints | ~142 | ~142 | 0 |
 | FE pages | 34 | 34 | 0 |
 | Unit tests | 81 | 81 | 0 |
 | Gotchas | 44 | 44 | 0 |
 | **Memory entries** | 16 | **17** | **+1** (RAG hybrid pattern) |
 | Skills | 6 | 6 | 0 |
 | Sub-agents | 4 seeds-only | 4 seeds-only | 0 (not yet spawned) |
 | **Commits S21** | 2 (`f1c61c9` + `3a34831`) | **4** | **+2** (`1f8e9af` + this wrap-up commit) |
 | **MD plan files** | 0 | **1** | **+1** (`rag-setup-plan.md`, 1223 LOC + 2 added sections) |
 ## Cross-ref
 - S21 turn 1 session log: `2026-05-12-0030-s21-cicd-monitor-add.md`
 - Plan file: `docs/rag-setup-plan.md` (1223 + extend ~300 LOC = ~1500 LOC)
 - Memory new: `feedback_rag_hybrid_pattern.md` (cross-project reusable)
 - Industry research: claude-code-guide × 2 spawn agent reports
 - 4 Anthropic blog cross-refs in the memory entry
 ## Lessons locked in for S21 turn 2
 1. **Strong main-agent control flow is the user's priority**: chose defensive Cách A over aggressive Cách B
 2. **Realistic multi-agent cost is 8-10× solo**: the ~400K cumulative spawn baseline for 4 agents cannot be avoided
 3. **Anthropic recommends the 3-layer hybrid pattern**: embeddings + BM25 + reranking compound
 4. **Industry consensus = hybrid**: Cursor + Continue + Cline + Aider all evolve toward hybrid
 5. **Voyage Vietnamese quality needs verification in Week 1**: voyage-3-large is multilingual, but no explicit Vietnamese benchmark has been published
 6. **RAG setup = dedicated 10-14h session**: per `feedback_drastic_refactor_scope` rule
 7. **5-project scale is workable**: single Qdrant + per-project collection + ~$2-5/month cost
--- a/docs/rag-setup-plan.md
+++ b/docs/rag-setup-plan.md
@@ -1165,6 +1165,361 @@ Mitigation:
 ---
 ## 13. Multi-agent cumulative cost reality (Anthropic 8-10× warning)
 > **Added S21 turn 2 (2026-05-12)**: clarification after the user caught the gap "the 120K blanket does NOT include the 4 agents".
 ### Per-entity blanket breakdown
 ```
 Main agent blanket:                 ~120K
  STATUS + HANDOFF top + rules + architecture + 5 agent .md + 
  4 MEMORY.md auto-inject + skills desc + memory critical + 
  auto-inject system reminders
 Per sub-agent spawn baseline:       ~80-100K each
  Agent system prompt (~5K) +
  3 skills preload SKILL.md full (~21K, trigger semantic) +
  Auto-inject MEMORY.md 25KB first 200 lines (~7K) +
  Main agent passes the task spec (~10-15K) +
  Main agent pastes a common context excerpt (~30-50K) +
  Auto-inject project context (~10K)
  = ~80-100K per sub-agent spawn (per Anthropic docs)
 4 sub-agents cumulative:            ~400K
  (4 × ~100K each, isolated context windows)
 TOTAL cumulative blanket 5 entities: ~520K
  Main agent + 4 sub-agents combined (isolated windows, cumulative billing)
 ```
 ### Context windows are ISOLATED
 ```
 The 5 entities do NOT share 520K inside a single 1M context window.
 Each entity has its OWN 1M context window:
  Main agent   → 1M context window, uses ~120K
  Investigator → 1M context window, uses ~100K
  Implementer  → 1M context window, uses ~100K
  Reviewer     → 1M context window, uses ~100K
  CICD Monitor → 1M context window, uses ~100K
 → Each entity has its own LOST-IN-MIDDLE threshold (~700K each)
 → Each entity has capacity for ~58 tasks before hitting its own hard cap
 BUT billing is CUMULATIVE, 520K across all contexts:
  Anthropic bills the total tokens across all 5 windows
  → The weekly cap is hit 4-5× faster than with a solo main agent
 ```
 ### Heavy session token compound effect (Cách A vs lazy)
 **Without RAG (lazy current — 4 agents spawn):**
 ```
 Main agent:
  Blanket: 120K
  Lazy Read on-demand: ~50K
  Reasoning + coordinate: ~30K
  = ~200K subtotal
 4 sub-agents (each):
  Spawn blanket: ~100K
  Lazy Read inside agent: ~50K
  Reasoning + work: ~30K
  Each agent: ~180K
  ──────────────
  4 agents subtotal: ~720K cumulative
 SendMessage iteration:
  10 round trips × ~30K nominal: 300K nominal
  Cache hit 70%: ~90K effective
 TOTAL HEAVY SESSION (lazy):
  200K + 720K + 90K = ~1010K nominal
  After cache discount: ~700K effective billed
 ```
 **With Cách A RAG:**
 ```
 Main agent:
  Blanket: 120K (unchanged)
  RAG retrieve replace lazy Read: ~30K (-20K saving)
  Reasoning streamlined: ~25K
  = ~175K subtotal (saving 25K)
 4 sub-agents (each):
  Spawn blanket: ~100K (unchanged)
  RAG retrieve (share cache 70-90% common queries): ~15K
  Reasoning streamlined: ~25K
  Each agent: ~140K (saving 40K each)
  ──────────────
  4 agents subtotal: ~560K (saving 160K total)
 SendMessage iteration: ~90K effective (unchanged)
 TOTAL HEAVY SESSION (Cách A):
  175K + 560K + 90K = ~825K nominal
  After cache discount: ~560K effective billed
 SAVING: -140K (-20%)
 ```
 ### Cost saving breakdown
 | Component | Lazy current | Cách A | Saving |
 |---|---:|---:|---:|
 | Main agent blanket (fixed) | 120K | 120K | 0 |
 | Main agent lazy Read → RAG retrieve | 50K | 30K | -20K |
 | Main agent reasoning streamlined | 30K | 25K | -5K |
 | 4 agents spawn blanket (fixed) | 400K | 400K | 0 |
 | 4 agents lazy Read → cached retrieve | 200K | 60K | **-140K** |
 | 4 agents reasoning | 120K | 100K | -20K |
 | SendMessage cached | 90K | 90K | 0 |
 | **TOTAL EFFECTIVE BILLED** | **~700K** | **~560K** | **-140K (-20%)** |
 → **About 80% of the saving comes from the 4 agents** sharing the retrieve cache (70-90% cache hit on common cross-agent queries).
 → The main agent saves only 25K (blanket unchanged; only Read → retrieve is optimized).
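The table above mixes two kinds of figures: the component rows are nominal token counts, while the TOTAL row is the effective billed amount after the cache discount. A minimal sketch that re-adds the rows makes this visible; the dictionary keys are illustrative labels, and the numbers are copied from the table (in thousands of tokens).

```python
# Component figures in K tokens, copied from the table: (lazy, Cách A).
COMPONENTS = {
    "main agent blanket": (120, 120),
    "main agent Read -> retrieve": (50, 30),
    "main agent reasoning": (30, 25),
    "4 agents spawn blanket": (400, 400),
    "4 agents Read -> cached retrieve": (200, 60),
    "4 agents reasoning": (120, 100),
    "SendMessage (cached)": (90, 90),
}


def nominal_totals(components):
    """Sum the nominal (pre-discount) tokens for both scenarios."""
    lazy = sum(lazy_k for lazy_k, _ in components.values())
    cach_a = sum(a_k for _, a_k in components.values())
    return lazy, cach_a


lazy_nominal, cach_a_nominal = nominal_totals(COMPONENTS)
print(lazy_nominal, cach_a_nominal)  # 1010 825

# Effective billed totals as quoted in the TOTAL row (after cache discount).
LAZY_BILLED, CACH_A_BILLED = 700, 560
saving = LAZY_BILLED - CACH_A_BILLED
print(saving, round(100 * saving / LAZY_BILLED))  # 140 20
```

The nominal totals (1010K and 825K) match the heavy-session blocks earlier in this section, and the row deltas add up to 185K nominal, whereas the 140K (20%) figure is the difference between the two billed totals.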
 ### Concrete multi-agent leverage example
 ```
 Task Plan B Contract V2 wire:
  🔵 Inv query "PE V2 schema pattern" → 15K retrieve + cached
  🟡 Imp query same → cache hit 90% → 1.5K effective
  🔴 Rev query same → cache hit 90% → 1.5K effective
  🟢 CICD query same → cache hit 90% → 1.5K effective
  Main agent query same → cache hit 90% → 1.5K effective
  Cumulative retrieve cost: 15K + 4×1.5K = 21K
 Compare to lazy:
  Each agent Read PE V2 file separately
  5 entities × 20K Read = 100K cumulative
  → Saving 79K just for 1 cross-agent query
 ```
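The arithmetic in the example above can be sketched as a small helper. It assumes the simplest possible model: the first entity pays the full retrieve cost and every later entity pays only the uncached fraction. The function name and parameters are hypothetical, not part of the plan's scripts.

```python
def shared_retrieve_cost(first_query_k, followers, cache_hit_rate):
    """K tokens billed when one entity pays full price and the rest hit the cache."""
    follower_cost = first_query_k * (1 - cache_hit_rate)
    return first_query_k + followers * follower_cost


# Figures from the example: 15K first retrieve, 4 followers at a 90% cache hit.
cached_k = shared_retrieve_cost(15, followers=4, cache_hit_rate=0.9)

# Lazy baseline from the example: 5 entities each Read ~20K separately.
lazy_k = 5 * 20

print(round(cached_k, 1), lazy_k, round(lazy_k - cached_k, 1))
```

With the figures above this reproduces the 21K cumulative retrieve cost and the 79K saving; a lower hit rate (say 70%) shrinks the saving accordingly.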
 ### Optimization tips to reduce cumulative cost
 **Option 1: Spawn fewer agents**
 - Apply the 6-criteria decision gate to each task (per `feedback_multi_agent_setup` rule)
 - If the main agent alone is enough → do NOT spawn an agent
 - Only spawn the agents that are REALLY needed
 - In S20-S21: 4 agents seeds-only, never spawned so far → cost is only the main agent's ~120K
 **Option 2: Tune blanket sub-agent (100K → 80K)**
 - Main agent passes a compact spec (~10K instead of 15K)
 - Main agent pastes a common context excerpt instead of the full context (~20K instead of 50K)
 - Skills preload the description only (~3K instead of the full 21K SKILL.md)
  → Load the full SKILL.md on a semantic match
 - Per sub-agent: 100K → 80K
 - 4 agents cumulative: 400K → 320K
 - Heavy session: 560K → 480K (-15%)
 **Option 3: SendMessage cache aggressive (1h TTL beta)**
 - Anthropic extended cache `extended-cache-ttl-2025-04-11`
 - Cache WRITES for static prompts cost a premium: 2× base
 - Subsequent reads are billed at 0.1× base
 - Multiple agents sharing the same cache prefix → large benefit
 - Additional saving of 10-15%
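To see when the 1h cache in Option 3 pays off, here is a rough sketch using only the two multipliers listed above (2× base for the write, 0.1× base for reads). It assumes every request shares exactly the same prefix and arrives within the TTL, which is an idealisation: in practice only the common part of each agent's blanket would be cacheable. The function name is hypothetical.

```python
def prefix_cost(prefix_k, requests, write_mult=2.0, read_mult=0.1):
    """Base-price-equivalent K tokens for `requests` calls sharing one cached prefix."""
    if requests < 1:
        return 0.0
    # The first call writes the cache at a premium; the remaining calls read it.
    return prefix_k * write_mult + (requests - 1) * prefix_k * read_mult


# Example: a 100K shared prefix used by 5 entities (main agent + 4 sub-agents).
uncached_k = 100 * 5
cached_k = prefix_cost(100, 5)
print(uncached_k, round(cached_k))  # 500 240

# Break-even check: with only 2 requests the write premium is not yet recovered.
print(prefix_cost(100, 2) > 100 * 2)  # True
```

Under these assumptions the cache starts to pay off from the third request on the same prefix, which is why the benefit grows with the number of agents sharing it.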
 ---
 ## 14. 3-layer hybrid RAG upgrade path (Anthropic Contextual Retrieval)
 > **Added S21 turn 2 (2026-05-12)** — Anthropic flagship pattern Sept 2024.
 ### Pattern overview
 ```
 Anthropic Contextual Retrieval = 3 layers compound:
 Layer 1: Embeddings (Voyage-3-large)
  → Semantic + synonym + multilingual catch
 + Contextual prefix (Haiku-generated context):
  Add chunk-specific context BEFORE embed
  "This chunk discusses... in context of..."
  → Better recall via enriched vector
 Layer 2: BM25 (bm25s Python lib free local)
  → Exact identifier + technical terms (function names, error codes, Mig numbers)
 + Contextual BM25 (same prefix pattern)
 Layer 3: Reranking (Voyage rerank-2)
  → Cross-attention deep relevance
  → Re-score top 30 candidates → return top 5 truly relevant
 ```
 ### Performance compound effect
 ```
 Baseline (naive vector embeddings):       ~50% recall
 + Contextual embeddings:                  ~67% recall (-35% failure)
 + Hybrid Contextual + BM25:               ~75% recall (-49% failure)
 + Reranking:                              ~85% recall (-67% failure)
 ```
 📎 Source: [Anthropic Contextual Retrieval Sept 2024](https://www.anthropic.com/news/contextual-retrieval)
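The recall figures above follow from applying each failure-rate reduction to the baseline. A minimal sketch, assuming the ~50% baseline recall used in this plan (the baseline is this document's working number, not a figure from the Anthropic post):

```python
def recall_after(baseline_recall, failure_reduction):
    """Recall once the failure rate (1 - recall) is cut by `failure_reduction`."""
    return 1 - (1 - baseline_recall) * (1 - failure_reduction)


BASELINE = 0.50  # assumption taken from the block above
STAGES = [
    ("contextual embeddings", 0.35),
    ("contextual embeddings + BM25", 0.49),
    ("+ reranking", 0.67),
]

for label, reduction in STAGES:
    print(f"{label}: {100 * recall_after(BASELINE, reduction):.1f}% recall")
```

This gives roughly 67.5%, 74.5% and 83.5%, so the ~85% quoted for the reranking stage is a generous rounding of the last value.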
 ### Incremental phase rollout (recommended for the user)
 | Phase | Setup | Recall | Cost/month | Effort additional |
 |---|---|---:|---:|---|
 | **Phase 1** (Week 1-4) | Layer 1 vector only (Voyage-3-large) | ~70% | ~$1.50 | 10-14h initial |
 | **Phase 2** (Month 2) | + Layer 2 BM25 (bm25s free local) | ~78% | ~$1.50 unchanged | 2-3h |
 | **Phase 3** (Month 3) | + Layer 3 Voyage rerank-2 + Contextual prefix | ~92% | ~$4-5 | 3-4h |
 ### Phase 1 implementation (basic vector RAG)
 Already covered in Sections 5-6 of this plan. The user implements it during the Week 1-4 trial pilot.
 ### Phase 2 upgrade — Add BM25 hybrid
 ```python
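 # scripts/rag-mcp-server.py (upgrade)
 # Assumes mcp, voyage, qdrant and COLLECTION are already set up as in Phase 1;
 # merge_dedup and reciprocal_rank_fusion are project helpers not shown here.
 import bm25s
 bm25 = bm25s.BM25.load("./rag-data/bm25_index")  # pre-built
 @mcp.tool()
 def rag_retrieve_hybrid(query, scope="all", k=5):
    # Step 1: Vector search
    query_vec = voyage.embed([query], model="voyage-3-large").embeddings[0]
    vector_results = qdrant.search(COLLECTION, query_vec, limit=20)
    # Step 2: BM25 search (local Python lib); bm25s expects a tokenized query
    # and returns (documents or doc indices, scores), one row per query
    bm25_docs, _bm25_scores = bm25.retrieve(bm25s.tokenize(query), k=20)
    bm25_results = list(bm25_docs[0])
    # Step 3: Merge + dedup
    candidates = merge_dedup(vector_results, bm25_results)  # ~30 chunks
    # Step 4: Score combine (RRF, reciprocal rank fusion) and keep the top k
    final_scores = reciprocal_rank_fusion(vector_results, bm25_results)
    return final_scores[:k]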
 ```
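The `reciprocal_rank_fusion` helper called in the Phase 2 snippet is not defined in this plan. One possible sketch is shown below; it assumes each retriever's results have already been reduced to a ranked list of chunk IDs (how to extract those IDs from Qdrant and bm25s results depends on the payload schema, which is not specified here) and uses the commonly cited constant `k=60`.

```python
from collections import defaultdict


def reciprocal_rank_fusion(*ranked_id_lists, k=60):
    """Fuse several ranked lists of chunk IDs into one list, best first.

    Each list contributes 1 / (k + rank) per chunk, so a chunk that appears
    in several lists accumulates score and duplicates collapse naturally.
    """
    scores = defaultdict(float)
    for ranked in ranked_id_lists:
        for rank, chunk_id in enumerate(ranked, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda chunk_id: scores[chunk_id], reverse=True)


# Hypothetical chunk IDs, ranked best-first by each retriever.
vector_ids = ["c3", "c1", "c7"]
bm25_ids = ["c1", "c9", "c3"]
fused = reciprocal_rank_fusion(vector_ids, bm25_ids)
print(fused)  # ['c1', 'c3', 'c9', 'c7']
```

Because RRF works on ranks rather than raw scores, it avoids having to normalise cosine similarities against BM25 scores, which live on different scales.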
 ### Phase 3 upgrade — Full Anthropic Contextual
 ```python
 # scripts/rag-indexer.py (upgrade with contextual prefix)
 import anthropic
 claude_haiku = anthropic.Anthropic()
 def contextualize_chunk(chunk_content, full_doc_path):
    """Generate context prefix using Claude Haiku (cheap model)."""
    with open(full_doc_path, encoding="utf-8") as f:
        full_doc = f.read()
    response = claude_haiku.messages.create(
        model="claude-haiku-4-5",  # cheap ~$0.0001/chunk
        max_tokens=150,
        messages=[{
            "role": "user",
            "content": f"""<document>
 {full_doc[:5000]}
 </document>
 <chunk>
 {chunk_content}
 </chunk>
 Give a brief context (50-100 words) explaining what this chunk is about and where it fits in the document. Be specific."""
        }]
    )
    return response.content[0].text
 # In indexer pipeline:
 for chunk in chunks:
    context = contextualize_chunk(chunk["content"], chunk["source"])
    chunk["content_enriched"] = f"{context}\n\n{chunk['content']}"
    # Embed enriched version → better recall
 ```
 ```python
 # scripts/rag-mcp-server.py (final upgrade with reranking)
 import voyageai
 voyage = voyageai.Client()  # reads VOYAGE_API_KEY from the environment
 @mcp.tool()
 def rag_retrieve_full(query, scope="all", k=5):
    # Step 1-3: Same as Phase 2 (vector + BM25 + merge); hybrid_search is a project helper
    candidates = hybrid_search(query, scope, top=30)
    # Step 4: Voyage Rerank
    rerank_response = voyage.rerank(
        query=query,
        documents=[c.content for c in candidates],
        model="rerank-2",  # ~$0.05 per 1000 queries
        top_k=k
    )
    return [candidates[r.index] for r in rerank_response.results]
 ```
 ### Cost incremental analysis
 ```
 Phase 1 → Phase 3 incremental cost:
 Phase 1 (basic vector):
  Voyage embed: ~$0.36 initial + ~$0.20/mo delta
  = ~$1.50/mo total
 Phase 2 (+BM25):
  BM25 free local (Python lib)
  Embedding cost same
  = ~$1.50/mo total (unchanged)
 Phase 3 (+Reranking + Contextual):
  Voyage rerank-2: ~$0.05 per 1000 queries
  600 queries/mo × $0.05/1K = $0.03/mo
  Haiku contextual prefix: ~$0.0001 per chunk
  Initial 5000 chunks × $0.0001 = $0.50 one-time
  Delta ~100 chunks/mo × $0.0001 = $0.01/mo
  + Voyage rerank monthly: ~$0.05/mo per 1K queries × 5 projects
  + Re-embed enriched chunks: ~$0.50/mo
  = ~$4-5/mo total
 → Quality jump 70% → 92% recall = +22pp
 → Cost jump $1.50 → $4-5/mo = +$3
 → Worth it after Phase 1 validation
 ```
 ### Why incremental rollout (vs all-in Phase 3 immediate)
 1. **Validate Layer 1 quality first**: if Voyage handles Vietnamese poorly, upgrading to Phase 2-3 is pointless
 2. **Measure baseline cost**: know the exact Voyage spend before adding rerank/contextual
 3. **Identify retrieval miss patterns**: the Phase 1 trial reveals weaknesses → Phase 2-3 target the fix
 4. **Risk-averse setup**: each phase adds 2-3h and is easy to roll back if it fails
 5. **§6.5 narrative preserve**: do NOT over-engineer, build incrementally
 ### When to skip Phase 2-3
 - Phase 1 recall is already > 85% → Phase 2-3 bring marginal benefit (Vietnamese-specific corpus)
 - Monthly budget < $5 → staying on Phase 1 is OK
 - Solo dev without heavy use of exact Vietnamese terms → BM25 is less impactful
 ### When to MUST upgrade Phase 2-3
 - Recall < 70% on the benchmark → indicates Phase 1 is insufficient
 - Main agent frequently reports "miss exact identifier" → Phase 2 BM25 is critical
 - Multi-language queries are common → the Phase 3 reranker stabilizes results
 - Production quality target > 90% → Phase 3 required
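The skip and upgrade triggers above can be collected into one small decision helper. This is only a sketch: the thresholds come from the two lists, but the precedence between triggers and the "optional" outcome for the 70-85% band are assumptions, since the plan does not say how to resolve them. The function and parameter names are hypothetical.

```python
def recommend_phase(recall, exact_identifier_misses=False,
                    multilingual_queries=False, target_recall=None):
    """Map the Phase 1 trial signals to a next step (assumed precedence)."""
    # Phase 3 triggers: production target above 90%, or common multi-language queries.
    if target_recall is not None and target_recall > 0.90:
        return "upgrade to phase 3"
    if multilingual_queries:
        return "upgrade to phase 3"
    # Phase 2 triggers: recall below 70%, or frequent exact-identifier misses.
    if recall < 0.70 or exact_identifier_misses:
        return "upgrade to phase 2"
    # Skip trigger: Phase 1 recall already above 85%.
    if recall > 0.85:
        return "stay on phase 1"
    # 70-85% with no other signal: not covered by the lists, left as a judgment call.
    return "optional"


print(recommend_phase(0.88))
print(recommend_phase(0.65))
print(recommend_phase(0.80, target_recall=0.92))
```

Running the helper against the Week 1-4 benchmark numbers gives a repeatable answer to "do we move on?", and the precedence can be adjusted once real trial data exists.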
 ---
 ## 📚 References + tools
 ### Anthropic official