[CLAUDE] Docs: chốt Session 21 turn 2 — RAG Hybrid setup planning + Cách A validation

Sau S21 turn 1 chốt cicd-monitor, bro clarify 5 dự án future > 1M MD tokens → discussion deep ~15 turn về RAG infrastructure. Em main solo (no SOLUTION_ERP sub-agent spawn), delegate claude-code-guide × 2 research Anthropic + community practice. Quyết định chốt: - Cách A defensive (giữ blanket 120K em main + RAG retrieve supplement) - Bỏ Cách B aggressive (cắt 60-70% blanket) — vi phạm priority em main control flow strong - Industry-validated cross 4 Anthropic blog + 5 community tools (Cursor/Continue/Cline/Aider all hybrid) - 3-layer pattern Phase 1-3 incremental rollout (vector → +BM25 → +reranking, recall ~70% → ~92%) - Stack: Voyage-3-large + Qdrant local + FastMCP Python + Streamlit dashboard Multi-agent cost reality clarify (post-S21 t2): - Em main blanket: ~120K - 4 sub-agents spawn cumulative: ~400K - Total billed heavy session: ~560K Cách A vs ~700K lazy - Saving -20% từ multi-agent shared cache 70-90% - Anthropic acknowledge 8-10× multiplier multi-agent Files updated: - docs/STATUS.md (Last updated S21 turn 2 + Recently Done row top) - docs/HANDOFF.md (TL;DR Session 21 turn 2 section + Last updated) - docs/rag-setup-plan.md (+Section 13 multi-agent cost reality + Section 14 3-layer hybrid Phase 1-3, +355 LOC) - docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md (new session log) Memory user-level update (outside repo, separate update): - feedback_rag_hybrid_pattern.md (NEW cross-project pattern reusable) - MEMORY.md index (+1 entry pointer) Plan I NEW deferred — trigger bro confirm 5 dự án path + stack + pilot + Voyage API + disk cleanup → dedicated session 10-14h weekend (per feedback_drastic_refactor_scope rule). Stats: - 17 memory entries (+1 RAG hybrid) - 1 plan file rag-setup-plan.md (1500 LOC final) - 4 sub-agents seeds-only unchanged - 81 test unchanged - 4 commits S21 cumulative (f1c61c9 + 3a34831 + 1f8e9af + this) CI skip per path filter (all .md). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 18:50:28 +07:00
parent 1f8e9af66f
commit 0a3b747612
4 changed files with 783 additions and 2 deletions
--- a/docs/HANDOFF.md
+++ b/docs/HANDOFF.md
@ -1,6 +1,112 @@
 # HANDOFF — Brief 5 phút cho session tiếp theo

-**Last updated:** 2026-05-12 (Session 21 turn 1 — **🎯 Add con thứ 4 cicd-monitor (Path A — post-deploy verifier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`. CI skipped per path filter (3 file `.md`). Cost reality update: ~750K spawn (3 → 4 agents) · ~1.35M heavy / ~700K optimized. Stats: 4 sub-agents seeds-only · 16 memory · 27 mig · 59 tables · ~142 endpoints · 81 test · 44 gotcha · 6 skills unchanged. KHÔNG flush 3 agent MEMORY.md (chưa spawn work — em main solo). Trial Week 1 kick-off S21 turn 2+ Plan B Contract V2 wire mirror PE pattern.**)
+**Last updated:** 2026-05-12 (Session 21 turn 2 — **🎯 RAG Hybrid setup planning + Cách A validation deep dive. 2 commit (`1f8e9af` plan save 1223 LOC + this chốt). KHÔNG implement, plan only — defer chờ bro confirm 5 dự án future. Decision chốt: Cách A defensive (giữ blanket 120K em main + RAG retrieve) over Cách B aggressive (cắt 60-70% blanket). Industry-validated cross 4 Anthropic blog + 5 community tools (Cursor/Continue/Cline/Aider). Stack: Voyage-3-large + Qdrant + FastMCP + Streamlit dashboard. Multi-agent cost reality: 4 agents → ~520K cumulative blanket → heavy session ~560K (Cách A) vs ~700K (lazy). 3-layer pattern Phase 1-3 rollout (embeddings + BM25 + reranking, ~70% → ~92% recall). Stats: +1 memory entry (`feedback_rag_hybrid_pattern`) +1 plan file (`rag-setup-plan.md` 1500 LOC). Sub-agents vẫn 4 seeds-only, em main solo session.**)
+**S21 turn 1:** 2026-05-12 0030 (Session 21 turn 1 — **🎯 Add con thứ 4 cicd-monitor (Path A — post-deploy verifier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`. CI skipped per path filter (3 file `.md`). Cost reality update: ~750K spawn (3 → 4 agents) · ~1.35M heavy / ~700K optimized. Stats: 4 sub-agents seeds-only · 16 memory · 27 mig · 59 tables · ~142 endpoints · 81 test · 44 gotcha · 6 skills unchanged. KHÔNG flush 3 agent MEMORY.md (chưa spawn work — em main solo). Trial Week 1 kick-off S21 turn 2+ Plan B Contract V2 wire mirror PE pattern.**)
+
+## TL;DR Session 21 turn 2 — RAG Hybrid setup planning (Cách A chốt + 3-layer pattern)
+
+User clarify 5 dự án future > 1M MD tokens → cuộc thảo luận deep ~15 turn về RAG infrastructure. Em main solo (no SOLUTION_ERP sub-agent spawn), delegate 2 lần claude-code-guide agent research Anthropic + community practice.
+
+### Q&A deep dive 10 topics
+
+1. RAG fundamentals + Vector DB role (Qdrant)
+2. Embedding "AI nhúng" + Voyage AI cost mechanics ($0.18/M tokens)
+3. Multi-project shared architecture (5 projects → single Qdrant + per-collection)
+4. Audit procedure 3-tier (weekly auto + monthly deep + quarterly major)
+5. UI/UX Streamlit dashboard 7 pages design
+6. Cách A defensive (giữ blanket 120K) vs Cách B aggressive (cắt 60-70%)
+7. Reasoning depth comparison: lazy 60% → A 90% → B 75-80%
+8. Industry validation: Anthropic + Cursor + Continue + Cline + Aider all hybrid
+9. Multi-agent cost reality: 8-10× multiplier, ~520K cumulative blanket 5 entities
+10. 3-layer hybrid pattern (Anthropic Contextual Retrieval Sept 2024)
+
+### Quyết định chốt — Cách A vs Cách B
+
+**Chọn Cách A** (defensive hybrid):
+- Blanket: GIỮ NGUYÊN 120K em main + RAG retrieve supplement
+- Sub-agent spawn baseline: ~80-100K each (4 agents = ~400K cumulative)
+- Heavy session billed: ~560K (saving -20% vs lazy 700K)
+- Quality recall: ~85% (vs Cách B 75-80% do fragmentation)
+
+**Why Cách A** (bro priority chốt):
+- ✅ Em main control flow strong (state ownership direct, response fast)
+- ✅ Decision quality 90% (multi-source cohesive reasoning)
+- ✅ Wall-clock per task -20% (12 phút vs Cách B 16 phút)
+- ✅ Risk-averse (graceful fallback blanket nếu RAG fail)
+- ✅ Multi-agent leverage cache 70-90% hit common queries
+- ✅ Industry-validated (Anthropic + Cursor + Continue + Cline + Aider)
+
+### 3-layer hybrid Phase rollout (Anthropic Contextual Retrieval)
+
+| Phase | Layers | Recall | Cost/mo |
+|---|---|---|---|
+| Phase 1 (Week 1-4) | Vector embedding only (Voyage-3-large) | ~70% | ~$1.50 |
+| Phase 2 (Month 2) | + BM25 hybrid (bm25s free local) | ~78% | ~$1.50 |
+| Phase 3 (Month 3) | + Voyage rerank-2 + Contextual prefix | ~92% | ~$4-5 |
+
+### Stack validated cross-industry
+
+- Voyage AI embedding (Anthropic partner, multilingual 26 lang)
+- Qdrant local (Rust binary, "leading agent memory backend 2026")
+- FastMCP Python (official Anthropic SDK)
+- SQLite event log + Streamlit dashboard 7 pages
+- Pre-commit hook re-index delta
+
+### Multi-agent cost reality (Anthropic warn 8-10× multiplier)
+
+```
+Per entity blanket Cách A:
+  Em main: ~120K
+  4 sub-agents × ~100K spawn = 400K cumulative
+  Total: ~520K cumulative billed (not single context window)
+
+Heavy session 4-agent spawn:
+  Lazy: ~700K effective billed
+  Cách A: ~560K (-20% from multi-agent shared cache)
+```
+
+### Plan I NEW — RAG Setup Implementation (defer)
+
+**Trigger:** Bro confirm 5 dự án path + stack + pilot choice + Voyage API key + disk cleanup 5-8GB.
+
+**Schedule:** Dedicated session 10-14h weekend (per `feedback_drastic_refactor_scope`).
+
+**Phase rollout:**
+- Phase 1 single project pilot 4-week trial
+- Phase 2-3 upgrade incremental conditional on Phase 1 success
+- Cost realistic: ~$2-5/month total cho 5 projects
+
+### Deliverables
+
+- ✅ `docs/rag-setup-plan.md` (commit `1f8e9af` 1223 LOC + extend S21 t2 ~300 LOC = ~1500 LOC final)
+- ✅ Memory `feedback_rag_hybrid_pattern.md` (NEW cross-project reusable)
+- ✅ MEMORY.md index +1 entry
+- ✅ Session log this chốt
+- ⏭ Implementation defer chờ trigger
+
+### Em main solo S21 turn 2 (no SOLUTION_ERP sub-agent spawn)
+
+3 spawn này session — KHÔNG phải 4 SOLUTION_ERP sub-agents:
+- claude-code-guide × 2 (generic agent for Anthropic + industry research)
+- 4 SOLUTION_ERP sub-agents (Inv/Imp/Rev/CICD) vẫn seeds-only
+
+### State chốt S21 turn 2
+
+| Metric | Trước | Sau | Δ |
+|---|---|---|---|
+| DB tables | 59 | 59 | 0 |
+| Migrations | 27 | 27 | 0 |
+| Endpoints | ~142 | ~142 | 0 |
+| FE pages | 34 | 34 | 0 |
+| Unit tests | 81 | 81 | 0 |
+| Gotchas | 44 | 44 | 0 |
+| **Memory entries** | 16 | **17** | **+1** (RAG hybrid pattern) |
+| Skills | 6 | 6 | 0 |
+| Sub-agents | 4 seeds-only | 4 seeds-only | 0 |
+| **Commits S21 cumulative** | 2 | **4** | **+2** |
+| **Plan files** | 0 | **1** (`rag-setup-plan.md`) | **+1** |
+
+---

 ## TL;DR Session 21 turn 1 — Add cicd-monitor (4th sub-agent, Path A chốt)

--- a/docs/STATUS.md
+++ b/docs/STATUS.md
@ -2,7 +2,8 @@

 > **Update rule:** trước khi bắt đầu 1 task → ghi row vào `🔥 In Progress`. Xong → chuyển sang `✅ Recently Done`.

-**Last updated:** 2026-05-12 (Session 21 turn 1 — **🎯 Add con thứ 4 cicd-monitor (Path A — post-deploy verifier green READ tier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`, CI skipped per path filter (`**/*.md` paths-ignore docs-only). Trade-off: +~150K spawn extra mỗi run, đổi lại catch deploy ship fail tự động (bundle hash unchanged / mig drift prod / endpoint 500) — recurring blind spot pattern em main solo S20 quên verify ~30% push. Cost reality update: ~750K spawn setup (3 → 4 agents) · ~1.35M heavy session · ~700K optimized cached. Stats: 4 sub-agents seeds-only (+1 cicd-monitor green) · 16 memory entries (no new, update existing `feedback_multi_agent_setup.md` 3 → 4 agents narrative) · 27 mig · 59 tables · ~142 endpoints · 81 test unchanged · 44 gotcha unchanged · 6 skills unchanged. KHÔNG flush 3 agent MEMORY.md (chưa spawn work S21 t1 nên KHÔNG có findings — em main solo via context + Write file).**)
+**Last updated:** 2026-05-12 1800 (Session 21 turn 2 — **🎯 RAG Hybrid setup planning + Cách A validation deep dive. 2 commit (`1f8e9af` plan save 1223 LOC + this chốt). Em main solo (no SOLUTION_ERP sub-agent spawn), delegate claude-code-guide × 2 research Anthropic + community practice. Decision chốt: Cách A defensive (giữ blanket 120K em main + RAG retrieve supplement) over Cách B aggressive (cắt 60-70% blanket). Industry-validated cross 4 Anthropic blog + 5 community tools (Cursor/Continue/Cline/Aider all hybrid). Stack: Voyage-3-large + Qdrant local + FastMCP Python + Streamlit dashboard 7 pages + SQLite event log. Multi-agent cost reality: 4 agents → ~520K cumulative blanket → heavy session ~560K (Cách A) vs ~700K (lazy), saving -20%. 3-layer pattern Phase 1-3 rollout (Layer 1 vector → Layer 2 +BM25 → Layer 3 +reranking, recall ~70% → ~92%). Stats: +1 memory entry (`feedback_rag_hybrid_pattern.md`) +1 plan file (`rag-setup-plan.md` 1500 LOC final). 4 sub-agents vẫn seeds-only. Plan I NEW deferred chờ bro confirm 5 dự án path + stack + Voyage API key + disk cleanup 5-8GB.**)
+**S21 turn 1:** 2026-05-12 0030 (Session 21 turn 1 — **🎯 Add con thứ 4 cicd-monitor (Path A — post-deploy verifier green READ tier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`, CI skipped per path filter (`**/*.md` paths-ignore docs-only). Trade-off: +~150K spawn extra mỗi run, đổi lại catch deploy ship fail tự động (bundle hash unchanged / mig drift prod / endpoint 500) — recurring blind spot pattern em main solo S20 quên verify ~30% push. Cost reality update: ~750K spawn setup (3 → 4 agents) · ~1.35M heavy session · ~700K optimized cached. Stats: 4 sub-agents seeds-only (+1 cicd-monitor green) · 16 memory entries (no new, update existing `feedback_multi_agent_setup.md` 3 → 4 agents narrative) · 27 mig · 59 tables · ~142 endpoints · 81 test unchanged · 44 gotcha unchanged · 6 skills unchanged. KHÔNG flush 3 agent MEMORY.md (chưa spawn work S21 t1 nên KHÔNG có findings — em main solo via context + Write file).**)
 **S20 wrap:** 2026-05-11 22:00 (Session 20 wrap turns 1-12 — **🎯 14 commit `9dee00d` → `ae1814c`. PE Detail UI restructure 3 yêu cầu (t1-5) + Manual budget drop tên (t6) + Mig 27 admin menu eOffice (t7) + NCC palette 5-màu cycle + Winner icon ✓ đậm + AddSupplier auto-fill master + Responsive laptop nhỏ 4-tầng pattern (t8-11) + Multi-agent infrastructure setup 3 sub-agents (t12). 27 mig (+1) · 59 tables · ~142 endpoints (+1) · 34 FE pages (+1) · 61 menu key (+1) · 81 test pass unchanged · 44 gotcha · 16 memory entries (+2) · 3 sub-agents NEW. Phase 9 UAT iteration mode.**)
 **S20 turn 7:** 2026-05-11 17:00 (Session 20 turn 7 — **🎯 Admin Ẩn/Hiện + Đổi tên menu eOffice (Mig 27). 5 chunk `2ea2d27`→`ef394f8`→`059bfcb`→`1ed6530`→Chunk E Docs. User Q2=b: DisplayLabel CHỈ áp fe-user, admin sidebar giữ Label gốc. Domain MenuItem +IsVisible(true) +DisplayLabel(200). Mig 27 AddVisibilityAndDisplayLabelToMenuItems. BE PATCH /api/menus/{key} [Authorize Policy=Permissions.Update]. NEW FE-admin MenuVisibilityPage ~210 LOC (table inline edit per-row + Save dirty + Khôi phục mặc định + Toggle Eye/EyeOff + 4 StatCard). fe-user Layout filterForUser 2 tầng (USER_HIDDEN_KEYS hardcode + !isVisible dynamic) + effectiveLabel(displayLabel || label) replace 3 callsite. fe-admin Layout KHÔNG đụng. +1 menu key MenuVisibility "Menu eOffice" leaf System Order=94. 27 mig, 59 tables, ~142 endpoints, 34 FE pages, 81 test pass (Q4 UAT defer).**)
 **S20 prev:** 2026-05-11 (Session 20 — **🎯 PE Detail UI restructure 3 yêu cầu user UX. 4 chunk per-commit `9dee00d` → `2bba851` → `f2f01f4` → (current Chunk D Docs).** Q1=a (giữ Section "Chọn NCC TP" riêng), Q2=a "1 hạng mục trước tiên" (NCC shared, demo 1 hạng mục), Q3=a (chỉ hiện NV đã ký), Q4 public luôn (skip dotnet test mỗi chunk theo memory `feedback_uat_skip_verify`, vẫn `npm run build` × 2 app mỗi chunk vì có rename/remove function). **Chunk A (`9dee00d`)**: BE `CreatePurchaseEvaluationCommandHandler` thêm 1 PurchaseEvaluationDetail mặc định khi tạo phiếu — GroupCode="01", GroupName="Hạng mục chính", NoiDung=TenGoiThau, DonGiaNganSach=ThanhTienNganSach=Budget.TongNganSach hoặc BudgetManualAmount fallback 0; Changelog Insert audit. FE reorder PeDetailTabs (mirror 2 app) 1.Thông tin / 2.Hạng mục (lên #2) / 3.Chọn NCC / 4.NCC tham gia / 5.Ý kiến. **Chunk B (`2bba851`)**: ItemsTab restructure thành list `HangMucCard` (1 card / 1 hạng mục, expanded=true mặc định cho 1 hạng mục demo). Header card: GroupCode + NoiDung + 3 stat (KL/ĐG/TT) + NS link Δ nếu có + Pencil/Trash actions + ▼/▶ toggle expand. Expand body: NCC inline table columns NCC / Liên hệ / Điều khoản TT / **File báo giá** / ĐG chưa VAT / ĐG có VAT / Thành tiền / Action. Quote inline click cell → QuoteDialog cũ reuse. Add NCC + Sửa NCC reuse AddSupplierDialog/EditSupplierDialog cũ. Winner ✓ button mỗi NCC row. Drop function `SuppliersTab` (dead code ~134 LOC, replace bằng HangMucCard expand panel). Giữ AddSupplierDialog + EditSupplierDialog + SupplierAttachmentsCell (HangMucCard call lại). Section layout cuối: 1.Thông tin / 2.Hạng mục + Báo giá NCC (nested) / 3.Chọn NCC TP thắng thầu / 4.Ý kiến cấp duyệt — 4 section. **Chunk C (`f2f01f4`)**: Section Ý kiến restructure render layer (KHÔNG đụng Mig 26 schema — vẫn UPSERT 1 row / Level). LevelOpinionsSectionV2 forEach step → 1 `StepOpinionsBox` (replace grid-cols-2 cho N approver). Box header: "Bước N — Tên" + dept badge emerald + "X/Y đã duyệt" counter. Body: filter opinions theo step.order → sort levelOrder asc, signedAt asc → render `StepOpinionEntry` per signed opinion (tên NV + Cấp badge slate + admin override badge amber nếu có + emerald rounded-full timestamp + comment text). NV chưa duyệt KHÔNG hiển thị (Q3=a). Drop function `LevelOpinionBox` (replaced). Mirror fe-admin + fe-user. Verify build pass cả 2 app sau khi catch TS6133 `SuppliersTab` + `SupplierAttachmentsCell` unused (đã giải quyết: drop SuppliersTab, restore SupplierAttachmentsCell vào HangMucCard cột "File báo giá"). 81 test pass (no change — UAT defer)**)
@ -64,6 +65,7 @@

 | Ngày | Ai | Task | Commit |
 |---|---|---|---|
+| 2026-05-12 | Claude | **🎯 SESSION 21 turn 2 — RAG Hybrid setup planning + Cách A validation deep dive (2 commit `1f8e9af` plan save + this chốt)** — Sau S21 turn 1 chốt cicd-monitor, user clarify 5 dự án future > 1M MD tokens → cuộc thảo luận deep ~15 turn về RAG infrastructure. **Em main solo** (no SOLUTION_ERP sub-agent spawn), delegate **claude-code-guide × 2** spawn agent research Anthropic + community practice. **Q&A deep dive 10 topics**: (1) RAG fundamentals + Vector DB Qdrant role, (2) Embedding "AI nhúng" + Voyage AI cost mechanics ($0.18/M tokens), (3) Multi-project shared architecture (5 projects → single Qdrant + per-collection), (4) Audit procedure 3-tier (weekly auto + monthly deep + quarterly major), (5) UI/UX Streamlit dashboard 7 pages design (overview + drill-down + compare + audit + cost + change + admin), (6) Cách A defensive (giữ blanket 120K) vs Cách B aggressive (cắt 60-70%), (7) Reasoning depth comparison lazy 60% → A 90% → B 75-80%, (8) Industry validation Anthropic + Cursor + Continue + Cline + Aider all hybrid, (9) Multi-agent cost reality 8-10× multiplier ~520K cumulative blanket 5 entities, (10) 3-layer hybrid pattern Anthropic Contextual Retrieval Sept 2024. **Quyết định chốt Cách A** (defensive hybrid: giữ blanket 120K em main + RAG retrieve supplement, sub-agent spawn baseline ~100K each, 4 agents = ~400K cumulative, heavy session billed ~560K saving -20% vs lazy 700K, quality recall ~85%) over **Cách B bỏ** (aggressive cut 60-70% vi phạm priority em main control flow strong + reasoning fragmented + UX latency +1-2s/state Q + risk severe RAG fail). **Why Cách A** (bro priority chốt): em main control flow strong preserve, decision quality 90% multi-source cohesive, wall-clock -20% (12 phút vs 16), risk-averse graceful fallback, multi-agent leverage cache 70-90%, industry-validated 9 sources. **3-layer hybrid Phase rollout**: P1 (W1-4) vector only Voyage-3-large recall ~70% $1.50/mo · P2 (M2) +BM25 bm25s free recall ~78% $1.50/mo · P3 (M3) +Voyage rerank-2 + Contextual prefix recall ~92% $4-5/mo. **Stack validated** cross-industry: Voyage AI embedding (Anthropic partner, multilingual 26 lang, $0.36 initial), Qdrant local (Rust 50MB, agent-native 2026 leader, ~3GB disk 5 project), FastMCP Python (official SDK, ~100 LOC), SQLite event log (5 tables + audit history), Streamlit 7 pages. **Plan I NEW deferred** — trigger bro confirm 5 dự án path + stack + pilot + Voyage API key + disk cleanup → dedicated session 10-14h weekend (per `feedback_drastic_refactor_scope` rule). **Deliverables**: `docs/rag-setup-plan.md` 1223 LOC commit `1f8e9af` + extend S21 t2 ~300 LOC = ~1500 LOC final, memory `feedback_rag_hybrid_pattern.md` cross-project reusable, session log this chốt, MEMORY.md index +1 entry. **CI skipped** path filter (`.md`). **4 sub-agents vẫn seeds-only** (KHÔNG spawn S21 turn 2 nên KHÔNG flush MEMORY.md per §6.5 KHÔNG add noise). Tests baseline 81 unchanged. | `1f8e9af` (plan save) · this chốt (commit final) |
 | 2026-05-12 | Claude | **🎯 SESSION 21 turn 1 — Add con thứ 4 cicd-monitor (Path A — post-deploy verifier green READ tier, 1 commit `f1c61c9`)** — User chốt Path A sau pre-flight Plan G Trial Week 1: thêm sub-agent thứ 4 chuyên post-deploy verify (Gitea Actions poll + bundle hash 2 app verify + sqlcmd mig prod = repo latest + endpoint smoke). Trade-off: +~150K spawn extra mỗi run, đổi lại catch deploy ship fail tự động — recurring blind spot pattern em main solo S20 quên verify ~30% push. **2 file mới**: `.claude/agents/cicd-monitor.md` (~7KB) — system prompt + 8-step workflow (verify push → poll Gitea API → fail log grep → live curl smoke → bundle hash × 2 app + verify changed → sqlcmd mig prod = repo latest → report PASS/FAIL/PARTIAL/TIMEOUT/SKIPPED-DOCS) + 5-stage report table + gotcha #25/#39/#40/#41/#44 cross-ref + skill `iis-deploy-runbook`/`dependency-audit-erp`/`ef-core-migration` preload + Anti-pattern 9 rules. `.claude/agent-memory/cicd-monitor/MEMORY.md` (~5KB seed) — recurring CI bug patterns + 5-stage checklist + baseline build/bundle metrics + bearer test pattern admin/nv.test. **1 file update repo**: `.claude/agents/README.md` — 4-agent architecture diagram (green slot mới) + decision tree (after push code + prod issue diagnose branches) + memory routine 4 SendMessage + skills preload 4 agents + cost reality table 564K → 750K spawn / 1.2M → 1.35M heavy / 600K → 700K optimized + trial workflow Week 1-3 CI/CD Monitor spawn integrated + pass criteria + catch ≥1 deploy ship fail. **Memory user-level update**: `feedback_multi_agent_setup.md` — title 3 → 4 sub-agents, decision tree +CI/CD Monitor invocation branches (after push + user prod issue), skills preload list +CI/CD Monitor (iis-deploy-runbook + dependency-audit-erp + ef-core-migration), cost table update + trade-off rationale (recurring blind spot ~30% push S20). **CI skipped**: all 3 file changed `.md` → match `paths-ignore: '**/*.md'` per gotcha #41 → no Gitea Actions run → no IIS deploy (expected — agent infra là local Claude Code, không cần present trên prod). Push success `36e21c8..f1c61c9 main -> main`. **3 (now 4) sub-agents vẫn seeds-only**: chưa spawn work nào — em main solo via context paste + Write file. KHÔNG flush 3 agent MEMORY.md (chưa spawn work = không findings, per §6.5 KHÔNG add noise entry). cicd-monitor MEMORY.md có entry "setup 2026-05-12" trong seed. Trial Week 1 kick-off ở Session 21 turn 2+ với Plan B Contract V2 wire Mig 28+29 candidate (mirror PE pattern S17-S19 proven 1×). Tests baseline 81 unchanged (no test added — docs-only commit). | `f1c61c9` (Setup cicd-monitor + README 4-agent + memory update) |
 | 2026-05-11 | Claude | **🎯 SESSION 20 turns 6 + 8-12 — PE polish (NCC palette + autofill + responsive) + Multi-agent setup (7 commit `f568945` → `ae1814c`)** — Sau turn 7 wrap-up Mig 27, user iterate 7 polish/feature lớn nhỏ. **Turn 6 (`f568945`)** Manual budget "Nhập tay" drop tên field — 3 file × 2 app mirror (BudgetFieldRow + WorkspaceCreateView + HeaderForm) bỏ Input "Tên" UI khỏi manual mode, BE save `budgetManualName: null` luôn, VND format `1.000.000` + suffix đ. **Turn 8 (`3ec7b5a`)** AddSupplier +Số tiền inline + NCC 5-màu palette + Winner badge "🏆 Trúng thầu" — AddSupplierDialog +prop detailId? +form thanhTien, sequential POST /suppliers (response {id}) → POST /quotes (nếu detailId + thanhTien > 0). NCC_PALETTES const 5 màu literal Tailwind (blue/purple/sky/teal/pink) cycle theo idx. Winner row override emerald-500 border-l + bg-emerald-100/70 + shadow-sm + ring-1 emerald-300 + badge rounded-full bg-emerald-600 text-white "🏆 Trúng thầu". **Turn 9 (`83aae8e`)** User feedback bỏ badge → revert icon ✓ stick cũ nhưng đậm hơn (text-base font-bold emerald-700) + tên NCC winner text-emerald-900 + hover transition (winner hover:bg-emerald-200/70, non-winner hover:bg-white/80 hover:shadow-sm). **Turn 10 (`66551db`)** AddSupplierDialog auto-fill từ master data khi chọn NCC dropdown — onChange lookup picked supplier, setForm ghi đè 4 field (contactName ← contactPerson / contactPhone ← phone / contactEmail ← email / note ← note). Hint emerald "✓ Đã tự điền từ Master". User vẫn override được. **Turn 11 (`6e338f7`)** Responsive cho laptop màn hình nhỏ 1280-1366px — 4-tầng pattern: sidebar fe-admin + fe-user `w-72` → `w-60 xl:w-72` (+48px lg) / PE Workspace 2-panel `lg:[320px_1fr]` → `lg:[260px_1fr] xl:[320px_1fr]` (+60px lg) / Section padding `px-5 py-4` → `px-3 py-3 sm:px-5 sm:py-4` (+16px xs) / HangMucCard `gap-3 p-3` → `flex-wrap gap-2 p-2 sm:gap-3 sm:p-3` (+8px xs). Net gain trên 1366px ~+132px width cho NCC table area. Memory `feedback_responsive_laptop_breakpoint.md` capture pattern. **Turn 12 (`ae1814c`)** SETUP MULTI-AGENT INFRASTRUCTURE 3 sub-agents (Investigator READ cyan + Implementer WRITE conditional yellow + Reviewer READ adversarial red) + em main coordinator. Pre-flight decision gate 6/6 ✅. Phase 1-4 execute: `.claude/agents/` 4 file (README ~9.7KB + investigator + implementer + reviewer) + `.claude/agent-memory/` 3 MEMORY.md seed (~6KB each). Customize SOLUTION_ERP: skills preload mỗi agent (reuse 6 skills hiện có) + bearer test (admin@solutions / nv.test@solutions) + prod UAT URL + Phase 9 UAT mode + DB Dev/Design distinct. Windows MAX_PATH pitfall handled — drop `isolation: worktree` khỏi implementer.md (project path 51 chars + Dropbox-managed nested overflow 260+ chars). Memory `feedback_multi_agent_setup.md` capture decision gate + ACCEPT/REFUSE criteria + NAMGROUP s41-s43 ROI reference. 3 agents **chưa spawn work** ở S20 turn 12 — seeds-only state. Trial Week 1 candidate Contract V2 wire Mig 28+29 (mirror PE pattern proven). **Stats cumulative S20:** 27 mig (+1 Mig 27 from turn 7) · 59 tables · ~142 endpoints (+1 PATCH /menus/{key}) · 34 FE pages (+1 MenuVisibilityPage) · ~61 menu key (+1) · 81 test pass unchanged · 44 gotcha unchanged · **16 memory entries (+2: responsive + multi-agent)** · 6 skills unchanged · **3 sub-agents NEW** · 14 commits S20. | `f568945` (t6) · `3ec7b5a` (t8) · `83aae8e` (t9) · `66551db` (t10) · `6e338f7` (t11) · `ae1814c` (t12) · (current Docs t13 wrap) |
 | 2026-05-11 | Claude | **🎯 SESSION 20 turn 7 — Admin Ẩn/Hiện + Đổi tên menu eOffice (Mig 27, 5 chunk `2ea2d27`→`ef394f8`→`059bfcb`→`1ed6530`→Chunk E Docs)** — User UAT yêu cầu "tính năng Ẩn Hiện và Đổi tên hiển thị của các Menu bên ngoài Office, làm trong Trang Admin Page". Hỏi xác nhận "chưa có" — đúng. User clarify Q2=b "edit hiển thị bên ngoài, chỉ của eOffice thôi" → admin sidebar luôn giữ Label gốc, DisplayLabel CHỈ áp fe-user. Q1=a global (không per-role), Q3=a giữ USER_HIDDEN_KEYS hardcode + tầng IsVisible dynamic combine, Q4 UAT skip test. **Chunk A** Domain MenuItem +IsVisible bool=true +DisplayLabel string?(200) + EF config + Migration 27 AddVisibilityAndDisplayLabelToMenuItems (2 AddColumn) — 3-file rule, apply LocalDB _Dev + _Design OK. **Chunk B** BE API: MenuNodeDto + MenuItemDto +isVisible +displayLabel (sau CRUD flags trước Children). GetMyMenuTreeQueryHandler pass through, KHÔNG filter server-side — 2 FE app tự quyết. UpdateMenuItemCommand + Validator + Handler (trim DisplayLabel whitespace → null). MenusController +PATCH /api/menus/{key} [Authorize Policy=Permissions.Update] body {isVisible, displayLabel}. **Chunk C** Domain MenuKeys +MenuVisibility const + All[] + DbInitializer +leaf "Menu eOffice" Icon=Eye Order=94 (Workflows shift 94→95). Manual seed Mig 27 LocalDB _Dev (INSERT MenuItems + Permissions Admin). FE Admin: types/menu.ts +isVisible +displayLabel, lib/menuKeys.ts +MenuVisibility, Layout resolver +/system/menu-visibility, App.tsx +Route. NEW pages/system/MenuVisibilityPage.tsx ~210 LOC: PageHeader + 4 StatCard (Tổng/Hiển thị/Đã ẩn/Đã đổi tên) + Search + Table 5 cột (Key mono + parentKey ↳ / Tên gốc / Input "Tên hiển thị" inline placeholder "Mặc định: {label}" / Toggle button emerald-Eye / amber-EyeOff / Lưu khi dirty + Khôi phục khi custom). PATCH endpoint, invalidate ['menus','all'] + ['my-menu'] trigger live update sidebar. Row hidden bg-amber-50/40 highlight, custom label bg-brand-50/40. **Chunk D** fe-user types/menu.ts mirror. Layout.tsx filterForUser 2 tầng (USER_HIDDEN_KEYS structural + !isVisible dynamic). Helper effectiveLabel(n) = displayLabel?.trim() || label. Replace 3 callsite {node.label} → {effectiveLabel(node)}. USER_FIXED_TOP "__inbox" entry +isVisible:true cho type check pass. **fe-admin Layout KHÔNG đụng** — admin sidebar render Label gốc + show hết menu (user Q2=b). **Chunk E Docs (current)**. **Stats Session 20 turn 7**: 26→27 mig, 59 DB tables (no change), ~141→142 endpoints, 33→34 FE pages, ~60→61 menu key, 81 test pass (Q4 UAT defer), 44 gotcha (no new). Memory entries 14 (no new). | `2ea2d27` (A Mig 27) · `ef394f8` (B BE API) · `059bfcb` (C FE admin) · `1ed6530` (D FE user) · (current E Docs) |
--- a/docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md
+++ b/docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md
@ -0,0 +1,318 @@
+# Session 21 turn 2 — RAG Hybrid setup planning + Cách A validation deep dive
+
+**Date:** 2026-05-12 (tiếp S21 turn 1 từ 0030 sáng — sang sáng-chiều-tối discussion deep RAG)
+**Dev:** Claude (Opus 4.7 1M Max — em main solo, no SOLUTION_ERP sub-agent spawn)
+**Base commit:** `3a34831` (S21 turn 1 chốt cicd-monitor)
+**Commits:** `1f8e9af` (RAG plan save) + this chốt (2 commit S21 turn 2)
+
+## Bối cảnh
+
+Sau S21 turn 1 chốt cicd-monitor (4 sub-agents seeds-only), bro đặt câu hỏi về RAG infrastructure cho **5 dự án future > 1M MD context**. Cuộc thảo luận deep ~15+ turn covering:
+
+1. RAG fundamentals + Vector DB role
+2. Embedding model "AI nhúng" + Voyage AI cost mechanics
+3. Multi-project shared architecture (5 projects)
+4. Audit procedure 3-tier + change tracking SQLite
+5. UI/UX Streamlit dashboard 7 pages
+6. Cách A defensive (giữ blanket) vs Cách B aggressive (cắt 60-70%)
+7. Reasoning depth comparison (lazy current vs Cách A vs Cách B)
+8. Industry validation via claude-code-guide research
+9. Multi-agent cumulative cost reality (4 agents → ~520K cumulative blanket)
+10. 3-layer hybrid pattern (Anthropic Contextual Retrieval: embeddings + BM25 + reranking)
+
+## Deliverables
+
+### File mới — `docs/rag-setup-plan.md` (commit `1f8e9af`, 1223 LOC)
+
+Cross-project reference plan với 12 section comprehensive:
+
+1. Context + Why
+2. Architecture overview (6-layer diagram)
+3. BLANKET load list (~100K, 28% MD)
+4. RAG store list (~254K, 72% MD)
+5. Tool stack recommend
+6. Setup scripts copy-paste ready (~250 LOC Python)
+7. Audit procedure 3-tier (weekly/monthly/quarterly)
+8. Multi-AI client access (Claude Code + Desktop + Cursor + GPT-4)
+9. Timeline rollout 10-14h dedicated session
+10. Caveats + risks
+11. Success metrics + decision gate
+12. Future enhancements
+
+### File extend S21 turn 2 (this chốt commit)
+
+Add 2 sections vào `rag-setup-plan.md`:
+- Section 13: Multi-agent cumulative cost reality (Anthropic 8-10× warning)
+- Section 14: 3-layer hybrid RAG upgrade path (Phase 1-3 Anthropic Contextual Retrieval)
+
+## Quyết định chốt — Cách A vs Cách B
+
+### Chọn **Cách A** (defensive hybrid) ⭐
+
+```
+Blanket: GIỮ NGUYÊN ~120K em main (35% MD)
+RAG: ADD as supplement (retrieve on-demand)
+Multi-agent: 4 sub-agents share retrieve cache
+Sub-agent spawn blanket: ~80-100K each (auto-inject + skills + spec)
+Cumulative blanket 5 entities: ~520K
+Heavy session billed: ~560K (saving 20% vs lazy)
+```
+
+**Why Cách A (priority bro: em main control flow strong):**
+1. ✅ State ownership strong — em main biết direct project state
+2. ✅ Decision quality 90% (vs Cách B 75-80% do fragmentation)
+3. ✅ Wall-clock per task 12 phút (vs Cách B 16 phút)
+4. ✅ UX smooth — em response fast direct cho state question
+5. ✅ Risk-averse — graceful degradation nếu RAG fail (blanket fallback)
+6. ✅ Multi-agent leverage cache hit 70-90% common queries
+7. ✅ Quality recall +25-55pp (5-15 sources cross-validated vs lazy 1-3)
+
+### Bỏ **Cách B** (aggressive cut)
+
+```
+Blanket: CẮT MẠNH 60-70% (40-50K còn lại)
+RAG: PRIMARY access mechanism cho mọi thứ
+```
+
+**Why bỏ:**
+1. ❌ Vi phạm priority "em main control flow strong"
+2. ❌ State ownership weak — phải retrieve mỗi câu state question
+3. ❌ UX latency +1-2s per state Q
+4. ❌ Decision quality 75-80% do reasoning fragmentation
+5. ❌ Risk severe nếu RAG fail (em main ngơ ngác)
+6. ❌ Anthropic research warn: "context rot inevitable cutting aggressively"
+7. ❌ Cascade retrieve problem (1 task → 2-3 retrieves)
+
+## Industry validation via claude-code-guide research
+
+Spawn 2 lần claude-code-guide agent research (NOT SOLUTION_ERP sub-agents):
+
+### Round 1: Anthropic setup inventory (10 features)
+
+- Memory tool beta (`content-management-2025-06-27`)
+- Prompt caching extensions (5min/1h beta)
+- Files API beta (`files-api-2025-04-14`)
+- Citations stable
+- MCP servers official + community (9,400+ in 2026)
+- Voyage AI embedding partnership
+- Context compaction tool
+- Claude Agent SDK orchestration
+- Batch API 50% discount
+- RAG best practices Anthropic official
+
+### Round 2: Industry practice validation
+
+**5/5 dimensions Cách A fit Anthropic explicit recommend:**
+
+| Dimension | Bro setup | Anthropic pattern |
+|---|---|---|
+| Context approach | Hybrid blanket+RAG | ✅ Recommended explicit |
+| Sub-agent count | 4 | ✅ "3-5 optimal" |
+| MD scale | 5 project > 1M | ✅ "Use RAG khi >200K" |
+| Stack | Qdrant+Voyage+MCP | ✅ Production validated |
+| Coordination | Em main + agents | ✅ "Coordinator+workers" |
+
+**Source 4 Anthropic blog posts:**
+- "Effective Context Engineering for AI Agents" (2025)
+- "Contextual Retrieval" (Sept 2024 flagship)
+- "Effective Harnesses for Long-Running Agents"
+- "Multi-Agent Coordination Patterns"
+
+**Community consensus (Tier 1 tools all Hybrid):**
+- Cursor IDE `@codebase` indexing
+- Continue.dev MCP transport
+- Cline / Roo-Cline filesystem + AST + dynamic context
+- Aider code-as-graph
+- Sourcegraph Cody graph-aware
+
+→ **ZERO** tools adopt aggressive Cách B pattern. **ALL** evolve toward Cách A hybrid.
+
+## 3-layer hybrid pattern (Anthropic Contextual Retrieval Sept 2024)
+
+```
+Layer 1: Embeddings (Voyage-3-large)
+  → Semantic + synonym + multilingual catch
+  Performance: baseline ~50% recall
+  
+ Contextual prefix (Haiku-generated context):
+  → +35% improvement = ~67% recall
+
+Layer 2: BM25 (bm25s Python lib free)
+  → Exact identifier + technical terms catch
+  + Layer 1 = ~75% recall
+  
+Layer 3: Reranking (Voyage rerank-2)
+  → Cross-attention deep relevance
+  + Layer 1+2 = ~85% recall
+```
+
+**Phase rollout incremental:**
+
+| Phase | Layer | Recall | Cost/month |
+|---|---|---|---|
+| Phase 1 (Week 1-4) | Layer 1 vector only | ~70% | ~$1.50 |
+| Phase 2 (Month 2) | + Layer 2 BM25 | ~78% | ~$1.50 (BM25 free local) |
+| Phase 3 (Month 3) | + Layer 3 + Contextual | ~92% | ~$4-5 |
+
+## Multi-agent cost reality (Anthropic warn 8-10× multiplier)
+
+```
+Per entity blanket:
+  Em main: ~120K
+  Sub-agent each spawn: ~80-100K (auto-inject baseline + skills + spec)
+  
+Cumulative blanket 5 entities = ~520K
+
+Heavy session full 4-agent spawn:
+  Lazy current:  ~700K effective billed
+  Cách A:        ~560K (-20% saving from multi-agent shared cache)
+  
+Cost multiplier vs solo em main: ~8-10×
+Anthropic acknowledged: "Expect 3-10× token multiplier"
+```
+
+**Saving Cách A breakdown (-140K):**
+- Em main lazy Read → retrieve: -25K
+- 4 agents lazy Read → cached retrieve: -160K (share cache 70-90%)
+- Reasoning streamlined: -20K
+- Plus +60K retrieve cost added
+- Net: -145K ≈ -20% per heavy session
+
+## Stack validated
+
+| Component | Tool | Reason |
+|---|---|---|
+| **Vector DB** | Qdrant local | Rust binary 50MB, agent-native 2026 leader |
+| **Embedding** | Voyage-3-large | Anthropic partner, multilingual 26 lang, $0.18/M |
+| **MCP server** | FastMCP Python | Official Anthropic SDK |
+| **Chunking** | Custom adaptive Python | §6.5 compliant, transparent |
+| **Tracking** | SQLite local | Event log + audit + cost analytics |
+| **Dashboard** | Streamlit custom | 7 pages multi-project |
+| **Re-index** | Pre-commit hook | Native git, delta on commit |
+
+**Total cost 5 projects:** ~$1.50-5/month depending Phase. ~$0.50 initial embed.
+
+## Em main solo S21 turn 2 (no SOLUTION_ERP sub-agent spawn)
+
+```
+Spawn này session:
+  ✅ claude-code-guide × 2 (generic agent for Anthropic research)
+  ❌ Investigator / Implementer / Reviewer / CI/CD Monitor (vẫn seeds-only)
+  
+Em main solo qua context paste + Write file + research delegate.
+```
+
+## Skills check
+
+6 skills hiện tại unchanged. Decision KHÔNG add skill mới cho RAG vì:
+- RAG là decision/architectural pattern, không phải workflow project-specific
+- Cross-project applicable → memory entry phù hợp hơn skill
+- Per rule §9.5 anti-pattern "viết skill chỉ để có thêm"
+- Defer skill creation sau Phase 1 trial validate
+
+## Tests
+
+Unit test 81 unchanged (0 test added — pure planning, không code change).
+
+## Memory entry mới
+
+**`feedback_rag_hybrid_pattern.md`** (NEW — cross-project pattern reusable):
+- Decision Cách A rationale (control flow priority)
+- Multi-agent cost reality (8-10× multiplier)
+- 3-layer hybrid pattern Phase 1-3 incremental rollout
+- Stack validated (Voyage + Qdrant + FastMCP)
+- When to apply / when NOT apply triggers
+- Anti-patterns documented
+- Anthropic 4 blog cross-ref
+
+## Verify chain
+
+| Check | Status |
+|---|---|
+| dotnet build | Không chạy (no .cs change) |
+| dotnet test | Không chạy (no test added — pure docs) |
+| npm build | Không chạy (no FE change) |
+| Push origin | Pending end of turn |
+| CI Gitea Actions | Skip per path filter `.md` |
+| IIS prod deploy | KHÔNG xảy ra (CI skip, expected) |
+
+## Docs updates
+
+- ✅ `docs/STATUS.md` — Last updated S21 turn 2 + Recently Done row top
+- ✅ `docs/HANDOFF.md` — TL;DR Session 21 turn 2 section + Last updated
+- ✅ `docs/rag-setup-plan.md` — extend +Section 13 (cost reality) +Section 14 (3-layer)
+- ✅ `docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md` — file này
+- ✅ Memory user-level new: `feedback_rag_hybrid_pattern.md`
+- ✅ Memory user-level: `MEMORY.md` index + 1 entry pointer
+- ⏭ KHÔNG đụng: rules.md / architecture.md / gotchas.md / database/* / flows/* / skills/* / CLAUDE.md (no real change cho 8 file này)
+- ⏭ KHÔNG flush 4 sub-agent MEMORY.md (chưa spawn, per §6.5 KHÔNG add noise)
+
+## Handoff Session 21 turn 3+
+
+### Plan I NEW — RAG Setup Implementation
+
+**Trigger:** Bro confirm 5 dự án path + stack + pilot choice + Voyage API key + disk cleanup 5-8GB free.
+
+**Schedule:** Dedicated session 10-14h weekend (per memory `feedback_drastic_refactor_scope` rule).
+
+**Phases:**
+- Phase 1 (Week 1-4): Layer 1 vector embeddings only — ~70% recall — ~$1.50/mo
+- Phase 2 (Month 2): + Layer 2 BM25 hybrid — ~78% recall — ~$1.50/mo
+- Phase 3 (Month 3): + Layer 3 Reranking + Contextual — ~92% recall — ~$4-5/mo
+
+**Pre-flight task:** Spawn 🔵 Investigator audit MD inventory 5 dự án parallel → tinh chỉnh blanket list per project.
+
+### Plan B Contract V2 wire (vẫn pending S21 turn 1)
+
+- Trial Week 1 multi-agent kick-off SOLUTION_ERP
+- 6 tasks (Mig 28+29 + Service + Controller + FE × 2 + Pin V2)
+- 4 sub-agents pipeline coordinate (lần đầu spawn 4 agents thật)
+
+### Plan C Test gap fill (vẫn pending)
+
+Bundle Chunk E Plan B — 5 test pending:
+- B4 silent 403 regression (gotcha #44 vi phạm §7)
+- V2 Service `ApproveV2Async` UPSERT opinion
+- Section gộp Chunk C render
+- Mig 25 PATCH `/user-selectable`
+- Mig 27 PATCH `/api/menus/{key}`
+
+### Plan D-F-G unchanged
+
+- D: Hard blockers ops (UAT/SMTP/creds/backup) — BLOCKED chờ user
+- F: Audit định kỳ 2026-06-01 (~3 tuần nữa, KHÔNG tự chạy)
+- G: Multi-agent trial 4-week (post-S21 t1 + S21 t2 setup complete)
+
+## Stats cumulative S21 turn 2
+
+| Metric | Trước S21 t2 | Sau S21 t2 | Δ |
+|---|---|---|---|
+| DB tables | 59 | 59 | 0 |
+| Migrations | 27 | 27 | 0 |
+| Endpoints | ~142 | ~142 | 0 |
+| FE pages | 34 | 34 | 0 |
+| Unit tests | 81 | 81 | 0 |
+| Gotchas | 44 | 44 | 0 |
+| **Memory entries** | 16 | **17** | **+1** (RAG hybrid pattern) |
+| Skills | 6 | 6 | 0 |
+| Sub-agents | 4 seeds-only | 4 seeds-only | 0 (chưa spawn) |
+| **Commits S21** | 2 (`f1c61c9` + `3a34831`) | **4** | **+2** (1f8e9af + this chốt) |
+| **MD plan files** | 0 | **1** | **+1** (`rag-setup-plan.md` 1223 LOC + 2 section extend) |
+
+## Cross-ref
+
+- S21 turn 1 session log: `2026-05-12-0030-s21-cicd-monitor-add.md`
+- Plan file: `docs/rag-setup-plan.md` (1223 + extend ~300 LOC = ~1500 LOC)
+- Memory new: `feedback_rag_hybrid_pattern.md` (cross-project reusable)
+- Industry research: claude-code-guide × 2 spawn agent reports
+- 4 Anthropic blog cross-ref trong memory entry
+
+## Bài học chốt S21 turn 2
+
+1. **Em main control flow strong là priority bro** — quyết định Cách A defensive over Cách B aggressive
+2. **Multi-agent cost realistic 8-10× solo** — KHÔNG tránh được spawn baseline ~400K cumulative 4 agents
+3. **Anthropic recommend 3-layer hybrid pattern** — embeddings + BM25 + reranking compound effect
+4. **Industry consensus = hybrid** — Cursor + Continue + Cline + Aider all evolve toward hybrid
+5. **Voyage Vietnamese quality cần verify Week 1** — voyage-3-large multilingual nhưng explicit Vietnamese benchmark chưa publish
+6. **RAG setup = dedicated session 10-14h** — per `feedback_drastic_refactor_scope` rule
+7. **5 projects scale workable** — single Qdrant + per-project collection + ~$2-5/month cost
--- a/docs/rag-setup-plan.md
+++ b/docs/rag-setup-plan.md
@ -1165,6 +1165,361 @@ Mitigation:

 ---

+## 13. Multi-agent cumulative cost reality (Anthropic 8-10× warning)
+
+> **Added S21 turn 2 (2026-05-12)** — clarification sau khi user catch gap "120K blanket KHÔNG bao gồm 4 agents".
+
+### Per-entity blanket breakdown
+
+```
+Em main blanket:                    ~120K
+  STATUS + HANDOFF top + rules + architecture + 5 agent .md + 
+  4 MEMORY.md auto-inject + skills desc + memory critical + 
+  auto-inject system reminders
+
+Per sub-agent spawn baseline:       ~80-100K each
+  Agent system prompt (~5K) +
+  3 skills preload SKILL.md full (~21K, trigger semantic) +
+  Auto-inject MEMORY.md 25KB first 200 lines (~7K) +
+  Em main pass spec task (~10-15K) +
+  Em main paste common context excerpt (~30-50K) +
+  Auto-inject project context (~10K)
+  = ~80-100K per sub-agent spawn (per Anthropic docs)
+  
+4 sub-agents cumulative:            ~400K
+  (4 × ~100K each, isolated context windows)
+
+TOTAL cumulative blanket 5 entities: ~520K
+  Em main + 4 sub-agents combined (isolated windows, cumulative billing)
+```
+
+### Context windows are ISOLATED
+
+```
+KHÔNG phải 5 entities share 520K trong 1 context window 1M.
+
+Mỗi entity có context window 1M RIÊNG:
+  Em main      → context window 1M, dùng ~120K
+  Investigator → context window 1M, dùng ~100K
+  Implementer  → context window 1M, dùng ~100K
+  Reviewer     → context window 1M, dùng ~100K
+  CICD Monitor → context window 1M, dùng ~100K
+  
+→ Mỗi entity LOST-IN-MIDDLE threshold riêng (~700K each)
+→ Mỗi entity capacity ~58 tasks before hit hard cap riêng
+
+NHƯNG billing là CUMULATIVE 520K across all contexts:
+  Anthropic billing tổng tokens across all 5 windows
+  → Hit weekly cap nhanh hơn solo em main 4-5×
+```
+
+### Heavy session token compound effect (Cách A vs lazy)
+
+**Without RAG (lazy current — 4 agents spawn):**
+
+```
+Em main:
+  Blanket: 120K
+  Lazy Read on-demand: ~50K
+  Reasoning + coordinate: ~30K
+  = ~200K subtotal
+
+4 sub-agents (each):
+  Spawn blanket: ~100K
+  Lazy Read inside agent: ~50K
+  Reasoning + work: ~30K
+  Each agent: ~180K
+  ──────────────
+  4 agents subtotal: ~720K cumulative
+
+SendMessage iteration:
+  10 round trips × ~30K nominal: 300K nominal
+  Cache hit 70%: ~90K effective
+
+TOTAL HEAVY SESSION (lazy):
+  200K + 720K + 90K = ~1010K nominal
+  After cache discount: ~700K effective billed
+```
+
+**With Cách A RAG:**
+
+```
+Em main:
+  Blanket: 120K (unchanged)
+  RAG retrieve replace lazy Read: ~30K (-20K saving)
+  Reasoning streamlined: ~25K
+  = ~175K subtotal (saving 25K)
+
+4 sub-agents (each):
+  Spawn blanket: ~100K (unchanged)
+  RAG retrieve (share cache 70-90% common queries): ~15K
+  Reasoning streamlined: ~25K
+  Each agent: ~140K (saving 40K each)
+  ──────────────
+  4 agents subtotal: ~560K (saving 160K total)
+
+SendMessage iteration: ~90K effective (unchanged)
+
+TOTAL HEAVY SESSION (Cách A):
+  175K + 560K + 90K = ~825K nominal
+  After cache discount: ~560K effective billed
+  
+SAVING: -140K (-20%)
+```
+
+### Cost saving breakdown
+
+| Component | Lazy current | Cách A | Saving |
+|---|---:|---:|---:|
+| Em main blanket (fixed) | 120K | 120K | 0 |
+| Em main lazy Read → RAG retrieve | 50K | 30K | -20K |
+| Em main reasoning streamlined | 30K | 25K | -5K |
+| 4 agents spawn blanket (fixed) | 400K | 400K | 0 |
+| 4 agents lazy Read → cached retrieve | 200K | 60K | **-140K** |
+| 4 agents reasoning | 120K | 100K | -20K |
+| SendMessage cached | 90K | 90K | 0 |
+| **TOTAL EFFECTIVE BILLED** | **~700K** | **~560K** | **-140K (-20%)** |
+
+→ **Saving 80% từ 4 agents** share retrieve cache (cache hit 70-90% common queries cross-agent).
+
+→ Em main saving chỉ 25K (blanket unchanged, chỉ optimize Read → retrieve).
+
+### Multi-agent leverage example concrete
+
+```
+Task Plan B Contract V2 wire:
+  🔵 Inv query "PE V2 schema pattern" → 15K retrieve + cached
+  🟡 Imp query same → cache hit 90% → 1.5K effective
+  🔴 Rev query same → cache hit 90% → 1.5K effective
+  🟢 CICD query same → cache hit 90% → 1.5K effective
+  Em main query same → cache hit 90% → 1.5K effective
+  
+  Cumulative retrieve cost: 15K + 4×1.5K = 21K
+  
+Compare to lazy:
+  Each agent Read PE V2 file separately
+  5 entities × 20K Read = 100K cumulative
+  
+  → Saving 79K just for 1 cross-agent query
+```
+
+### Optimization tips để giảm cumulative
+
+**Option 1: Spawn ít agents hơn**
+- Decision gate 6-criteria mỗi task (per `feedback_multi_agent_setup` rule)
+- Solo em main đủ → KHÔNG spawn agent
+- Chỉ spawn agent nào THẬT cần
+- Trong S20-S21: 4 agents seeds-only, em chưa spawn lần nào → cost ~120K em main thôi
+
+**Option 2: Tune blanket sub-agent (100K → 80K)**
+- Em main pass spec gọn (~10K thay 15K)
+- Em main paste common context excerpt thay full (~20K thay 50K)
+- Skills preload chỉ description (~3K thay 21K full SKILL.md)
+  → Trigger SKILL.md full khi semantic match
+- Per sub-agent: 100K → 80K
+- 4 agents cumulative: 400K → 320K
+- Heavy session: 560K → 480K (-15%)
+
+**Option 3: SendMessage cache aggressive (1h TTL beta)**
+- Anthropic extended cache `extended-cache-ttl-2025-04-11`
+- Static prompts cache premium WRITE 2× base
+- Subsequent reads 0.1× discount
+- Multi-agent cùng cache prefix → benefit lớn
+- Saving 10-15% additional
+
+---
+
+## 14. 3-layer hybrid RAG upgrade path (Anthropic Contextual Retrieval)
+
+> **Added S21 turn 2 (2026-05-12)** — Anthropic flagship pattern Sept 2024.
+
+### Pattern overview
+
+```
+Anthropic Contextual Retrieval = 3 layers compound:
+
+Layer 1: Embeddings (Voyage-3-large)
+  → Semantic + synonym + multilingual catch
+  
+ Contextual prefix (Haiku-generated context):
+  Add chunk-specific context BEFORE embed
+  "This chunk discusses... in context of..."
+  → Better recall via enriched vector
+
+Layer 2: BM25 (bm25s Python lib free local)
+  → Exact identifier + technical terms (function names, error codes, Mig numbers)
+  
+ Contextual BM25 (same prefix pattern)
+
+Layer 3: Reranking (Voyage rerank-2)
+  → Cross-attention deep relevance
+  → Re-score top 30 candidates → return top 5 truly relevant
+```
+
+### Performance compound effect
+
+```
+Baseline (naive vector embeddings):       ~50% recall
+
+ Contextual embeddings:                  ~67% recall (-35% failure)
+
+ Hybrid Contextual + BM25:               ~75% recall (-49% failure)
+
+ Reranking:                              ~85% recall (-67% failure)
+```
+
+📎 Source: [Anthropic Contextual Retrieval Sept 2024](https://www.anthropic.com/news/contextual-retrieval)
+
+### Phase rollout incremental (recommend cho bro)
+
+| Phase | Setup | Recall | Cost/month | Effort additional |
+|---|---|---:|---:|---|
+| **Phase 1** (Week 1-4) | Layer 1 vector only (Voyage-3-large) | ~70% | ~$1.50 | 10-14h initial |
+| **Phase 2** (Month 2) | + Layer 2 BM25 (bm25s free local) | ~78% | ~$1.50 unchanged | 2-3h |
+| **Phase 3** (Month 3) | + Layer 3 Voyage rerank-2 + Contextual prefix | ~92% | ~$4-5 | 3-4h |
+
+### Phase 1 implementation (basic vector RAG)
+
+Đã cover trong Section 5-6 plan. Bro implement Week 1-4 trial pilot.
+
+### Phase 2 upgrade — Add BM25 hybrid
+
+```python
+# scripts/rag-mcp-server.py — upgrade
+from bm25s import BM25
+
+bm25 = BM25.load("./rag-data/bm25_index")  # pre-built
+
+@mcp.tool()
+def rag_retrieve_hybrid(query, scope="all", k=5):
+    # Step 1: Vector search
+    query_vec = voyage.embed([query], model="voyage-3-large").embeddings[0]
+    vector_results = qdrant.search(COLLECTION, query_vec, limit=20)
+    
+    # Step 2: BM25 search (local Python lib)
+    bm25_results = bm25.retrieve(query, k=20)
+    
+    # Step 3: Merge + dedup
+    candidates = merge_dedup(vector_results, bm25_results)  # ~30 chunks
+    
+    # Step 4: Score combine (RRF reciprocal rank fusion)
+    final_scores = reciprocal_rank_fusion(vector_results, bm25_results)
+    
+    return final_scores[:k]
+```
+
+### Phase 3 upgrade — Full Anthropic Contextual
+
+```python
+# scripts/rag-indexer.py — upgrade với contextual prefix
+import anthropic
+
+claude_haiku = anthropic.Anthropic()
+
+def contextualize_chunk(chunk_content, full_doc_path):
+    """Generate context prefix using Claude Haiku (cheap model)."""
+    full_doc = open(full_doc_path).read()
+    
+    response = claude_haiku.messages.create(
+        model="claude-haiku-4-5",  # cheap ~$0.0001/chunk
+        max_tokens=150,
+        messages=[{
+            "role": "user",
+            "content": f"""<document>
+{full_doc[:5000]}
+</document>
+
+<chunk>
+{chunk_content}
+</chunk>
+
+Give a brief context (50-100 words) explaining what this chunk is about and where it fits in the document. Be specific."""
+        }]
+    )
+    
+    return response.content[0].text
+
+# In indexer pipeline:
+for chunk in chunks:
+    context = contextualize_chunk(chunk["content"], chunk["source"])
+    chunk["content_enriched"] = f"{context}\n\n{chunk['content']}"
+    # Embed enriched version → better recall
+```
+
+```python
+# scripts/rag-mcp-server.py — final upgrade với reranking
+import voyageai
+
+@mcp.tool()
+def rag_retrieve_full(query, scope="all", k=5):
+    # Step 1-3: Same as Phase 2 (vector + BM25 + merge)
+    candidates = hybrid_search(query, scope, top=30)
+    
+    # Step 4: Voyage Rerank
+    rerank_response = voyage.rerank(
+        query=query,
+        documents=[c.content for c in candidates],
+        model="voyage-rerank-2",  # ~$0.05 per 1000 queries
+        top_k=k
+    )
+    
+    return [candidates[r.index] for r in rerank_response.results]
+```
+
+### Cost incremental analysis
+
+```
+Phase 1 → Phase 3 incremental cost:
+
+Phase 1 (basic vector):
+  Voyage embed: ~$0.36 initial + ~$0.20/mo delta
+  = ~$1.50/mo total
+  
+Phase 2 (+BM25):
+  BM25 free local (Python lib)
+  Embedding cost same
+  = ~$1.50/mo total (unchanged)
+
+Phase 3 (+Reranking + Contextual):
+  Voyage rerank-2: ~$0.05 per 1000 queries
+  600 queries/mo × $0.05/1K = $0.03/mo
+  
+  Haiku contextual prefix: ~$0.0001 per chunk
+  Initial 5000 chunks × $0.0001 = $0.50 one-time
+  Delta ~100 chunks/mo × $0.0001 = $0.01/mo
+  
+  + Voyage rerank monthly: ~$0.05/mo per 1K queries × 5 projects
+  + Re-embed enriched chunks: ~$0.50/mo
+  = ~$4-5/mo total
+
+→ Quality jump 70% → 92% recall = +22pp
+→ Cost jump $1.50 → $4-5/mo = +$3
+→ Worth it after Phase 1 validation
+```
+
+### Why incremental rollout (vs all-in Phase 3 immediate)
+
+1. **Validate Layer 1 quality first** — nếu Voyage Vietnamese kém → upgrade Phase 2-3 vô ích
+2. **Measure baseline cost** — biết exact Voyage spend trước add rerank/contextual
+3. **Identify retrieval miss patterns** — Phase 1 trial reveal weakness → target Phase 2-3 fix
+4. **Risk-averse setup** — mỗi phase 2-3h add, rollback dễ nếu fail
+5. **§6.5 narrative preserve** — KHÔNG over-engineer, build incremental
+
+### When to skip Phase 2-3
+
+- Phase 1 recall already > 85% → Phase 2-3 marginal benefit (Vietnamese-specific corpus)
+- Cost monthly < $5 budget → stay Phase 1 OK
+- Solo dev no Vietnamese exact terms heavy → BM25 less impactful
+
+### When to MUST upgrade Phase 2-3
+
+- Recall < 70% on benchmark → indicate Phase 1 insufficient
+- Em main report "miss exact identifier" frequently → Phase 2 BM25 critical
+- Multi-language queries common → Phase 3 reranker stabilize
+- Production quality target > 90% → Phase 3 required
+
+---
+
 ## 📚 References + tools

 ### Anthropic official