[CLAUDE] Docs: chốt Session 21 turn 2 — RAG Hybrid setup planning + Cách A validation

Sau S21 turn 1 chốt cicd-monitor, bro clarify 5 dự án future > 1M MD tokens → discussion deep ~15 turn về RAG infrastructure. Em main solo (no SOLUTION_ERP sub-agent spawn), delegate claude-code-guide × 2 research Anthropic + community practice.

Quyết định chốt:
- Cách A defensive (giữ blanket 120K em main + RAG retrieve supplement)
- Bỏ Cách B aggressive (cắt 60-70% blanket) — vi phạm priority em main control flow strong
- Industry-validated cross 4 Anthropic blog + 5 community tools (Cursor/Continue/Cline/Aider all hybrid)
- 3-layer pattern Phase 1-3 incremental rollout (vector → +BM25 → +reranking, recall ~70% → ~92%)
- Stack: Voyage-3-large + Qdrant local + FastMCP Python + Streamlit dashboard

Multi-agent cost reality clarify (post-S21 t2):
- Em main blanket: ~120K
- 4 sub-agents spawn cumulative: ~400K
- Total billed heavy session: ~560K Cách A vs ~700K lazy
- Saving -20% từ multi-agent shared cache 70-90%
- Anthropic acknowledge 8-10× multiplier multi-agent

Files updated:
- docs/STATUS.md (Last updated S21 turn 2 + Recently Done row top)
- docs/HANDOFF.md (TL;DR Session 21 turn 2 section + Last updated)
- docs/rag-setup-plan.md (+Section 13 multi-agent cost reality + Section 14 3-layer hybrid Phase 1-3, +355 LOC)
- docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md (new session log)

Memory user-level update (outside repo, separate update):
- feedback_rag_hybrid_pattern.md (NEW cross-project pattern reusable)
- MEMORY.md index (+1 entry pointer)

Plan I NEW deferred — trigger bro confirm 5 dự án path + stack + pilot + Voyage API + disk cleanup → dedicated session 10-14h weekend (per feedback_drastic_refactor_scope rule).

Stats:
- 17 memory entries (+1 RAG hybrid)
- 1 plan file rag-setup-plan.md (1500 LOC final)
- 4 sub-agents seeds-only unchanged
- 81 test unchanged
- 4 commits S21 cumulative (f1c61c9 + 3a34831 + 1f8e9af + this)

CI skip per path filter (all .md).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
pqhuy1987
2026-05-12 18:50:28 +07:00
parent 1f8e9af66f
commit 0a3b747612
4 changed files with 783 additions and 2 deletions

View File

@ -1,6 +1,112 @@
# HANDOFF — Brief 5 phút cho session tiếp theo
**Last updated:** 2026-05-12 (Session 21 turn 1**🎯 Add con thứ 4 cicd-monitor (Path A — post-deploy verifier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`. CI skipped per path filter (3 file `.md`). Cost reality update: ~750K spawn (3 → 4 agents) · ~1.35M heavy / ~700K optimized. Stats: 4 sub-agents seeds-only · 16 memory · 27 mig · 59 tables · ~142 endpoints · 81 test · 44 gotcha · 6 skills unchanged. KHÔNG flush 3 agent MEMORY.md (chưa spawn work — em main solo). Trial Week 1 kick-off S21 turn 2+ Plan B Contract V2 wire mirror PE pattern.**)
**Last updated:** 2026-05-12 (Session 21 turn 2**🎯 RAG Hybrid setup planning + Cách A validation deep dive. 2 commit (`1f8e9af` plan save 1223 LOC + this chốt). KHÔNG implement, plan only — defer chờ bro confirm 5 dự án future. Decision chốt: Cách A defensive (giữ blanket 120K em main + RAG retrieve) over Cách B aggressive (cắt 60-70% blanket). Industry-validated cross 4 Anthropic blog + 5 community tools (Cursor/Continue/Cline/Aider). Stack: Voyage-3-large + Qdrant + FastMCP + Streamlit dashboard. Multi-agent cost reality: 4 agents → ~520K cumulative blanket → heavy session ~560K (Cách A) vs ~700K (lazy). 3-layer pattern Phase 1-3 rollout (embeddings + BM25 + reranking, ~70% → ~92% recall). Stats: +1 memory entry (`feedback_rag_hybrid_pattern`) +1 plan file (`rag-setup-plan.md` 1500 LOC). Sub-agents vẫn 4 seeds-only, em main solo session.**)
**S21 turn 1:** 2026-05-12 0030 (Session 21 turn 1 — **🎯 Add con thứ 4 cicd-monitor (Path A — post-deploy verifier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`. CI skipped per path filter (3 file `.md`). Cost reality update: ~750K spawn (3 → 4 agents) · ~1.35M heavy / ~700K optimized. Stats: 4 sub-agents seeds-only · 16 memory · 27 mig · 59 tables · ~142 endpoints · 81 test · 44 gotcha · 6 skills unchanged. KHÔNG flush 3 agent MEMORY.md (chưa spawn work — em main solo). Trial Week 1 kick-off S21 turn 2+ Plan B Contract V2 wire mirror PE pattern.**)
## TL;DR Session 21 turn 2 — RAG Hybrid setup planning (Cách A chốt + 3-layer pattern)
User clarify 5 dự án future > 1M MD tokens → cuộc thảo luận deep ~15 turn về RAG infrastructure. Em main solo (no SOLUTION_ERP sub-agent spawn), delegate 2 lần claude-code-guide agent research Anthropic + community practice.
### Q&A deep dive 10 topics
1. RAG fundamentals + Vector DB role (Qdrant)
2. Embedding "AI nhúng" + Voyage AI cost mechanics ($0.18/M tokens)
3. Multi-project shared architecture (5 projects → single Qdrant + per-collection)
4. Audit procedure 3-tier (weekly auto + monthly deep + quarterly major)
5. UI/UX Streamlit dashboard 7 pages design
6. Cách A defensive (giữ blanket 120K) vs Cách B aggressive (cắt 60-70%)
7. Reasoning depth comparison: lazy 60% → A 90% → B 75-80%
8. Industry validation: Anthropic + Cursor + Continue + Cline + Aider all hybrid
9. Multi-agent cost reality: 8-10× multiplier, ~520K cumulative blanket 5 entities
10. 3-layer hybrid pattern (Anthropic Contextual Retrieval Sept 2024)
### Quyết định chốt — Cách A vs Cách B
**Chọn Cách A** (defensive hybrid):
- Blanket: GIỮ NGUYÊN 120K em main + RAG retrieve supplement
- Sub-agent spawn baseline: ~80-100K each (4 agents = ~400K cumulative)
- Heavy session billed: ~560K (saving -20% vs lazy 700K)
- Quality recall: ~85% (vs Cách B 75-80% do fragmentation)
**Why Cách A** (bro priority chốt):
- ✅ Em main control flow strong (state ownership direct, response fast)
- ✅ Decision quality 90% (multi-source cohesive reasoning)
- ✅ Wall-clock per task -20% (12 phút vs Cách B 16 phút)
- ✅ Risk-averse (graceful fallback blanket nếu RAG fail)
- ✅ Multi-agent leverage cache 70-90% hit common queries
- ✅ Industry-validated (Anthropic + Cursor + Continue + Cline + Aider)
### 3-layer hybrid Phase rollout (Anthropic Contextual Retrieval)
| Phase | Layers | Recall | Cost/mo |
|---|---|---|---|
| Phase 1 (Week 1-4) | Vector embedding only (Voyage-3-large) | ~70% | ~$1.50 |
| Phase 2 (Month 2) | + BM25 hybrid (bm25s free local) | ~78% | ~$1.50 |
| Phase 3 (Month 3) | + Voyage rerank-2 + Contextual prefix | ~92% | ~$4-5 |
### Stack validated cross-industry
- Voyage AI embedding (Anthropic partner, multilingual 26 lang)
- Qdrant local (Rust binary, "leading agent memory backend 2026")
- FastMCP Python (official Anthropic SDK)
- SQLite event log + Streamlit dashboard 7 pages
- Pre-commit hook re-index delta
### Multi-agent cost reality (Anthropic warn 8-10× multiplier)
```
Per entity blanket Cách A:
Em main: ~120K
4 sub-agents × ~100K spawn = 400K cumulative
Total: ~520K cumulative billed (not single context window)
Heavy session 4-agent spawn:
Lazy: ~700K effective billed
Cách A: ~560K (-20% from multi-agent shared cache)
```
### Plan I NEW — RAG Setup Implementation (defer)
**Trigger:** Bro confirm 5 dự án path + stack + pilot choice + Voyage API key + disk cleanup 5-8GB.
**Schedule:** Dedicated session 10-14h weekend (per `feedback_drastic_refactor_scope`).
**Phase rollout:**
- Phase 1 single project pilot 4-week trial
- Phase 2-3 upgrade incremental conditional on Phase 1 success
- Cost realistic: ~$2-5/month total cho 5 projects
### Deliverables
-`docs/rag-setup-plan.md` (commit `1f8e9af` 1223 LOC + extend S21 t2 ~300 LOC = ~1500 LOC final)
- ✅ Memory `feedback_rag_hybrid_pattern.md` (NEW cross-project reusable)
- ✅ MEMORY.md index +1 entry
- ✅ Session log this chốt
- ⏭ Implementation defer chờ trigger
### Em main solo S21 turn 2 (no SOLUTION_ERP sub-agent spawn)
3 spawn này session — KHÔNG phải 4 SOLUTION_ERP sub-agents:
- claude-code-guide × 2 (generic agent for Anthropic + industry research)
- 4 SOLUTION_ERP sub-agents (Inv/Imp/Rev/CICD) vẫn seeds-only
### State chốt S21 turn 2
| Metric | Trước | Sau | Δ |
|---|---|---|---|
| DB tables | 59 | 59 | 0 |
| Migrations | 27 | 27 | 0 |
| Endpoints | ~142 | ~142 | 0 |
| FE pages | 34 | 34 | 0 |
| Unit tests | 81 | 81 | 0 |
| Gotchas | 44 | 44 | 0 |
| **Memory entries** | 16 | **17** | **+1** (RAG hybrid pattern) |
| Skills | 6 | 6 | 0 |
| Sub-agents | 4 seeds-only | 4 seeds-only | 0 |
| **Commits S21 cumulative** | 2 | **4** | **+2** |
| **Plan files** | 0 | **1** (`rag-setup-plan.md`) | **+1** |
---
## TL;DR Session 21 turn 1 — Add cicd-monitor (4th sub-agent, Path A chốt)

View File

@ -2,7 +2,8 @@
> **Update rule:** trước khi bắt đầu 1 task → ghi row vào `🔥 In Progress`. Xong → chuyển sang `✅ Recently Done`.
**Last updated:** 2026-05-12 (Session 21 turn 1**🎯 Add con thứ 4 cicd-monitor (Path A — post-deploy verifier green READ tier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`, CI skipped per path filter (`**/*.md` paths-ignore docs-only). Trade-off: +~150K spawn extra mỗi run, đổi lại catch deploy ship fail tự động (bundle hash unchanged / mig drift prod / endpoint 500) — recurring blind spot pattern em main solo S20 quên verify ~30% push. Cost reality update: ~750K spawn setup (3 → 4 agents) · ~1.35M heavy session · ~700K optimized cached. Stats: 4 sub-agents seeds-only (+1 cicd-monitor green) · 16 memory entries (no new, update existing `feedback_multi_agent_setup.md` 3 → 4 agents narrative) · 27 mig · 59 tables · ~142 endpoints · 81 test unchanged · 44 gotcha unchanged · 6 skills unchanged. KHÔNG flush 3 agent MEMORY.md (chưa spawn work S21 t1 nên KHÔNG có findings — em main solo via context + Write file).**)
**Last updated:** 2026-05-12 1800 (Session 21 turn 2**🎯 RAG Hybrid setup planning + Cách A validation deep dive. 2 commit (`1f8e9af` plan save 1223 LOC + this chốt). Em main solo (no SOLUTION_ERP sub-agent spawn), delegate claude-code-guide × 2 research Anthropic + community practice. Decision chốt: Cách A defensive (giữ blanket 120K em main + RAG retrieve supplement) over Cách B aggressive (cắt 60-70% blanket). Industry-validated cross 4 Anthropic blog + 5 community tools (Cursor/Continue/Cline/Aider all hybrid). Stack: Voyage-3-large + Qdrant local + FastMCP Python + Streamlit dashboard 7 pages + SQLite event log. Multi-agent cost reality: 4 agents → ~520K cumulative blanket → heavy session ~560K (Cách A) vs ~700K (lazy), saving -20%. 3-layer pattern Phase 1-3 rollout (Layer 1 vector → Layer 2 +BM25 → Layer 3 +reranking, recall ~70% → ~92%). Stats: +1 memory entry (`feedback_rag_hybrid_pattern.md`) +1 plan file (`rag-setup-plan.md` 1500 LOC final). 4 sub-agents vẫn seeds-only. Plan I NEW deferred chờ bro confirm 5 dự án path + stack + Voyage API key + disk cleanup 5-8GB.**)
**S21 turn 1:** 2026-05-12 0030 (Session 21 turn 1 — **🎯 Add con thứ 4 cicd-monitor (Path A — post-deploy verifier green READ tier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`, CI skipped per path filter (`**/*.md` paths-ignore docs-only). Trade-off: +~150K spawn extra mỗi run, đổi lại catch deploy ship fail tự động (bundle hash unchanged / mig drift prod / endpoint 500) — recurring blind spot pattern em main solo S20 quên verify ~30% push. Cost reality update: ~750K spawn setup (3 → 4 agents) · ~1.35M heavy session · ~700K optimized cached. Stats: 4 sub-agents seeds-only (+1 cicd-monitor green) · 16 memory entries (no new, update existing `feedback_multi_agent_setup.md` 3 → 4 agents narrative) · 27 mig · 59 tables · ~142 endpoints · 81 test unchanged · 44 gotcha unchanged · 6 skills unchanged. KHÔNG flush 3 agent MEMORY.md (chưa spawn work S21 t1 nên KHÔNG có findings — em main solo via context + Write file).**)
**S20 wrap:** 2026-05-11 22:00 (Session 20 wrap turns 1-12 — **🎯 14 commit `9dee00d``ae1814c`. PE Detail UI restructure 3 yêu cầu (t1-5) + Manual budget drop tên (t6) + Mig 27 admin menu eOffice (t7) + NCC palette 5-màu cycle + Winner icon ✓ đậm + AddSupplier auto-fill master + Responsive laptop nhỏ 4-tầng pattern (t8-11) + Multi-agent infrastructure setup 3 sub-agents (t12). 27 mig (+1) · 59 tables · ~142 endpoints (+1) · 34 FE pages (+1) · 61 menu key (+1) · 81 test pass unchanged · 44 gotcha · 16 memory entries (+2) · 3 sub-agents NEW. Phase 9 UAT iteration mode.**)
**S20 turn 7:** 2026-05-11 17:00 (Session 20 turn 7 — **🎯 Admin Ẩn/Hiện + Đổi tên menu eOffice (Mig 27). 5 chunk `2ea2d27``ef394f8``059bfcb``1ed6530`→Chunk E Docs. User Q2=b: DisplayLabel CHỈ áp fe-user, admin sidebar giữ Label gốc. Domain MenuItem +IsVisible(true) +DisplayLabel(200). Mig 27 AddVisibilityAndDisplayLabelToMenuItems. BE PATCH /api/menus/{key} [Authorize Policy=Permissions.Update]. NEW FE-admin MenuVisibilityPage ~210 LOC (table inline edit per-row + Save dirty + Khôi phục mặc định + Toggle Eye/EyeOff + 4 StatCard). fe-user Layout filterForUser 2 tầng (USER_HIDDEN_KEYS hardcode + !isVisible dynamic) + effectiveLabel(displayLabel || label) replace 3 callsite. fe-admin Layout KHÔNG đụng. +1 menu key MenuVisibility "Menu eOffice" leaf System Order=94. 27 mig, 59 tables, ~142 endpoints, 34 FE pages, 81 test pass (Q4 UAT defer).**)
**S20 prev:** 2026-05-11 (Session 20 — **🎯 PE Detail UI restructure 3 yêu cầu user UX. 4 chunk per-commit `9dee00d``2bba851``f2f01f4` → (current Chunk D Docs).** Q1=a (giữ Section "Chọn NCC TP" riêng), Q2=a "1 hạng mục trước tiên" (NCC shared, demo 1 hạng mục), Q3=a (chỉ hiện NV đã ký), Q4 public luôn (skip dotnet test mỗi chunk theo memory `feedback_uat_skip_verify`, vẫn `npm run build` × 2 app mỗi chunk vì có rename/remove function). **Chunk A (`9dee00d`)**: BE `CreatePurchaseEvaluationCommandHandler` thêm 1 PurchaseEvaluationDetail mặc định khi tạo phiếu — GroupCode="01", GroupName="Hạng mục chính", NoiDung=TenGoiThau, DonGiaNganSach=ThanhTienNganSach=Budget.TongNganSach hoặc BudgetManualAmount fallback 0; Changelog Insert audit. FE reorder PeDetailTabs (mirror 2 app) 1.Thông tin / 2.Hạng mục (lên #2) / 3.Chọn NCC / 4.NCC tham gia / 5.Ý kiến. **Chunk B (`2bba851`)**: ItemsTab restructure thành list `HangMucCard` (1 card / 1 hạng mục, expanded=true mặc định cho 1 hạng mục demo). Header card: GroupCode + NoiDung + 3 stat (KL/ĐG/TT) + NS link Δ nếu có + Pencil/Trash actions + ▼/▶ toggle expand. Expand body: NCC inline table columns NCC / Liên hệ / Điều khoản TT / **File báo giá** / ĐG chưa VAT / ĐG có VAT / Thành tiền / Action. Quote inline click cell → QuoteDialog cũ reuse. Add NCC + Sửa NCC reuse AddSupplierDialog/EditSupplierDialog cũ. Winner ✓ button mỗi NCC row. Drop function `SuppliersTab` (dead code ~134 LOC, replace bằng HangMucCard expand panel). Giữ AddSupplierDialog + EditSupplierDialog + SupplierAttachmentsCell (HangMucCard call lại). Section layout cuối: 1.Thông tin / 2.Hạng mục + Báo giá NCC (nested) / 3.Chọn NCC TP thắng thầu / 4.Ý kiến cấp duyệt — 4 section. **Chunk C (`f2f01f4`)**: Section Ý kiến restructure render layer (KHÔNG đụng Mig 26 schema — vẫn UPSERT 1 row / Level). LevelOpinionsSectionV2 forEach step → 1 `StepOpinionsBox` (replace grid-cols-2 cho N approver). Box header: "Bước N — Tên" + dept badge emerald + "X/Y đã duyệt" counter. Body: filter opinions theo step.order → sort levelOrder asc, signedAt asc → render `StepOpinionEntry` per signed opinion (tên NV + Cấp badge slate + admin override badge amber nếu có + emerald rounded-full timestamp + comment text). NV chưa duyệt KHÔNG hiển thị (Q3=a). Drop function `LevelOpinionBox` (replaced). Mirror fe-admin + fe-user. Verify build pass cả 2 app sau khi catch TS6133 `SuppliersTab` + `SupplierAttachmentsCell` unused (đã giải quyết: drop SuppliersTab, restore SupplierAttachmentsCell vào HangMucCard cột "File báo giá"). 81 test pass (no change — UAT defer)**)
@ -64,6 +65,7 @@
| Ngày | Ai | Task | Commit |
|---|---|---|---|
| 2026-05-12 | Claude | **🎯 SESSION 21 turn 2 — RAG Hybrid setup planning + Cách A validation deep dive (2 commit `1f8e9af` plan save + this chốt)** — Sau S21 turn 1 chốt cicd-monitor, user clarify 5 dự án future > 1M MD tokens → cuộc thảo luận deep ~15 turn về RAG infrastructure. **Em main solo** (no SOLUTION_ERP sub-agent spawn), delegate **claude-code-guide × 2** spawn agent research Anthropic + community practice. **Q&A deep dive 10 topics**: (1) RAG fundamentals + Vector DB Qdrant role, (2) Embedding "AI nhúng" + Voyage AI cost mechanics ($0.18/M tokens), (3) Multi-project shared architecture (5 projects → single Qdrant + per-collection), (4) Audit procedure 3-tier (weekly auto + monthly deep + quarterly major), (5) UI/UX Streamlit dashboard 7 pages design (overview + drill-down + compare + audit + cost + change + admin), (6) Cách A defensive (giữ blanket 120K) vs Cách B aggressive (cắt 60-70%), (7) Reasoning depth comparison lazy 60% → A 90% → B 75-80%, (8) Industry validation Anthropic + Cursor + Continue + Cline + Aider all hybrid, (9) Multi-agent cost reality 8-10× multiplier ~520K cumulative blanket 5 entities, (10) 3-layer hybrid pattern Anthropic Contextual Retrieval Sept 2024. **Quyết định chốt Cách A** (defensive hybrid: giữ blanket 120K em main + RAG retrieve supplement, sub-agent spawn baseline ~100K each, 4 agents = ~400K cumulative, heavy session billed ~560K saving -20% vs lazy 700K, quality recall ~85%) over **Cách B bỏ** (aggressive cut 60-70% vi phạm priority em main control flow strong + reasoning fragmented + UX latency +1-2s/state Q + risk severe RAG fail). **Why Cách A** (bro priority chốt): em main control flow strong preserve, decision quality 90% multi-source cohesive, wall-clock -20% (12 phút vs 16), risk-averse graceful fallback, multi-agent leverage cache 70-90%, industry-validated 9 sources. **3-layer hybrid Phase rollout**: P1 (W1-4) vector only Voyage-3-large recall ~70% $1.50/mo · P2 (M2) +BM25 bm25s free recall ~78% $1.50/mo · P3 (M3) +Voyage rerank-2 + Contextual prefix recall ~92% $4-5/mo. **Stack validated** cross-industry: Voyage AI embedding (Anthropic partner, multilingual 26 lang, $0.36 initial), Qdrant local (Rust 50MB, agent-native 2026 leader, ~3GB disk 5 project), FastMCP Python (official SDK, ~100 LOC), SQLite event log (5 tables + audit history), Streamlit 7 pages. **Plan I NEW deferred** — trigger bro confirm 5 dự án path + stack + pilot + Voyage API key + disk cleanup → dedicated session 10-14h weekend (per `feedback_drastic_refactor_scope` rule). **Deliverables**: `docs/rag-setup-plan.md` 1223 LOC commit `1f8e9af` + extend S21 t2 ~300 LOC = ~1500 LOC final, memory `feedback_rag_hybrid_pattern.md` cross-project reusable, session log this chốt, MEMORY.md index +1 entry. **CI skipped** path filter (`.md`). **4 sub-agents vẫn seeds-only** (KHÔNG spawn S21 turn 2 nên KHÔNG flush MEMORY.md per §6.5 KHÔNG add noise). Tests baseline 81 unchanged. | `1f8e9af` (plan save) · this chốt (commit final) |
| 2026-05-12 | Claude | **🎯 SESSION 21 turn 1 — Add con thứ 4 cicd-monitor (Path A — post-deploy verifier green READ tier, 1 commit `f1c61c9`)** — User chốt Path A sau pre-flight Plan G Trial Week 1: thêm sub-agent thứ 4 chuyên post-deploy verify (Gitea Actions poll + bundle hash 2 app verify + sqlcmd mig prod = repo latest + endpoint smoke). Trade-off: +~150K spawn extra mỗi run, đổi lại catch deploy ship fail tự động — recurring blind spot pattern em main solo S20 quên verify ~30% push. **2 file mới**: `.claude/agents/cicd-monitor.md` (~7KB) — system prompt + 8-step workflow (verify push → poll Gitea API → fail log grep → live curl smoke → bundle hash × 2 app + verify changed → sqlcmd mig prod = repo latest → report PASS/FAIL/PARTIAL/TIMEOUT/SKIPPED-DOCS) + 5-stage report table + gotcha #25/#39/#40/#41/#44 cross-ref + skill `iis-deploy-runbook`/`dependency-audit-erp`/`ef-core-migration` preload + Anti-pattern 9 rules. `.claude/agent-memory/cicd-monitor/MEMORY.md` (~5KB seed) — recurring CI bug patterns + 5-stage checklist + baseline build/bundle metrics + bearer test pattern admin/nv.test. **1 file update repo**: `.claude/agents/README.md` — 4-agent architecture diagram (green slot mới) + decision tree (after push code + prod issue diagnose branches) + memory routine 4 SendMessage + skills preload 4 agents + cost reality table 564K → 750K spawn / 1.2M → 1.35M heavy / 600K → 700K optimized + trial workflow Week 1-3 CI/CD Monitor spawn integrated + pass criteria + catch ≥1 deploy ship fail. **Memory user-level update**: `feedback_multi_agent_setup.md` — title 3 → 4 sub-agents, decision tree +CI/CD Monitor invocation branches (after push + user prod issue), skills preload list +CI/CD Monitor (iis-deploy-runbook + dependency-audit-erp + ef-core-migration), cost table update + trade-off rationale (recurring blind spot ~30% push S20). **CI skipped**: all 3 file changed `.md` → match `paths-ignore: '**/*.md'` per gotcha #41 → no Gitea Actions run → no IIS deploy (expected — agent infra là local Claude Code, không cần present trên prod). Push success `36e21c8..f1c61c9 main -> main`. **3 (now 4) sub-agents vẫn seeds-only**: chưa spawn work nào — em main solo via context paste + Write file. KHÔNG flush 3 agent MEMORY.md (chưa spawn work = không findings, per §6.5 KHÔNG add noise entry). cicd-monitor MEMORY.md có entry "setup 2026-05-12" trong seed. Trial Week 1 kick-off ở Session 21 turn 2+ với Plan B Contract V2 wire Mig 28+29 candidate (mirror PE pattern S17-S19 proven 1×). Tests baseline 81 unchanged (no test added — docs-only commit). | `f1c61c9` (Setup cicd-monitor + README 4-agent + memory update) |
| 2026-05-11 | Claude | **🎯 SESSION 20 turns 6 + 8-12 — PE polish (NCC palette + autofill + responsive) + Multi-agent setup (7 commit `f568945``ae1814c`)** — Sau turn 7 wrap-up Mig 27, user iterate 7 polish/feature lớn nhỏ. **Turn 6 (`f568945`)** Manual budget "Nhập tay" drop tên field — 3 file × 2 app mirror (BudgetFieldRow + WorkspaceCreateView + HeaderForm) bỏ Input "Tên" UI khỏi manual mode, BE save `budgetManualName: null` luôn, VND format `1.000.000` + suffix đ. **Turn 8 (`3ec7b5a`)** AddSupplier +Số tiền inline + NCC 5-màu palette + Winner badge "🏆 Trúng thầu" — AddSupplierDialog +prop detailId? +form thanhTien, sequential POST /suppliers (response {id}) → POST /quotes (nếu detailId + thanhTien > 0). NCC_PALETTES const 5 màu literal Tailwind (blue/purple/sky/teal/pink) cycle theo idx. Winner row override emerald-500 border-l + bg-emerald-100/70 + shadow-sm + ring-1 emerald-300 + badge rounded-full bg-emerald-600 text-white "🏆 Trúng thầu". **Turn 9 (`83aae8e`)** User feedback bỏ badge → revert icon ✓ stick cũ nhưng đậm hơn (text-base font-bold emerald-700) + tên NCC winner text-emerald-900 + hover transition (winner hover:bg-emerald-200/70, non-winner hover:bg-white/80 hover:shadow-sm). **Turn 10 (`66551db`)** AddSupplierDialog auto-fill từ master data khi chọn NCC dropdown — onChange lookup picked supplier, setForm ghi đè 4 field (contactName ← contactPerson / contactPhone ← phone / contactEmail ← email / note ← note). Hint emerald "✓ Đã tự điền từ Master". User vẫn override được. **Turn 11 (`6e338f7`)** Responsive cho laptop màn hình nhỏ 1280-1366px — 4-tầng pattern: sidebar fe-admin + fe-user `w-72``w-60 xl:w-72` (+48px lg) / PE Workspace 2-panel `lg:[320px_1fr]``lg:[260px_1fr] xl:[320px_1fr]` (+60px lg) / Section padding `px-5 py-4``px-3 py-3 sm:px-5 sm:py-4` (+16px xs) / HangMucCard `gap-3 p-3``flex-wrap gap-2 p-2 sm:gap-3 sm:p-3` (+8px xs). Net gain trên 1366px ~+132px width cho NCC table area. Memory `feedback_responsive_laptop_breakpoint.md` capture pattern. **Turn 12 (`ae1814c`)** SETUP MULTI-AGENT INFRASTRUCTURE 3 sub-agents (Investigator READ cyan + Implementer WRITE conditional yellow + Reviewer READ adversarial red) + em main coordinator. Pre-flight decision gate 6/6 ✅. Phase 1-4 execute: `.claude/agents/` 4 file (README ~9.7KB + investigator + implementer + reviewer) + `.claude/agent-memory/` 3 MEMORY.md seed (~6KB each). Customize SOLUTION_ERP: skills preload mỗi agent (reuse 6 skills hiện có) + bearer test (admin@solutions / nv.test@solutions) + prod UAT URL + Phase 9 UAT mode + DB Dev/Design distinct. Windows MAX_PATH pitfall handled — drop `isolation: worktree` khỏi implementer.md (project path 51 chars + Dropbox-managed nested overflow 260+ chars). Memory `feedback_multi_agent_setup.md` capture decision gate + ACCEPT/REFUSE criteria + NAMGROUP s41-s43 ROI reference. 3 agents **chưa spawn work** ở S20 turn 12 — seeds-only state. Trial Week 1 candidate Contract V2 wire Mig 28+29 (mirror PE pattern proven). **Stats cumulative S20:** 27 mig (+1 Mig 27 from turn 7) · 59 tables · ~142 endpoints (+1 PATCH /menus/{key}) · 34 FE pages (+1 MenuVisibilityPage) · ~61 menu key (+1) · 81 test pass unchanged · 44 gotcha unchanged · **16 memory entries (+2: responsive + multi-agent)** · 6 skills unchanged · **3 sub-agents NEW** · 14 commits S20. | `f568945` (t6) · `3ec7b5a` (t8) · `83aae8e` (t9) · `66551db` (t10) · `6e338f7` (t11) · `ae1814c` (t12) · (current Docs t13 wrap) |
| 2026-05-11 | Claude | **🎯 SESSION 20 turn 7 — Admin Ẩn/Hiện + Đổi tên menu eOffice (Mig 27, 5 chunk `2ea2d27``ef394f8``059bfcb``1ed6530`→Chunk E Docs)** — User UAT yêu cầu "tính năng Ẩn Hiện và Đổi tên hiển thị của các Menu bên ngoài Office, làm trong Trang Admin Page". Hỏi xác nhận "chưa có" — đúng. User clarify Q2=b "edit hiển thị bên ngoài, chỉ của eOffice thôi" → admin sidebar luôn giữ Label gốc, DisplayLabel CHỈ áp fe-user. Q1=a global (không per-role), Q3=a giữ USER_HIDDEN_KEYS hardcode + tầng IsVisible dynamic combine, Q4 UAT skip test. **Chunk A** Domain MenuItem +IsVisible bool=true +DisplayLabel string?(200) + EF config + Migration 27 AddVisibilityAndDisplayLabelToMenuItems (2 AddColumn) — 3-file rule, apply LocalDB _Dev + _Design OK. **Chunk B** BE API: MenuNodeDto + MenuItemDto +isVisible +displayLabel (sau CRUD flags trước Children). GetMyMenuTreeQueryHandler pass through, KHÔNG filter server-side — 2 FE app tự quyết. UpdateMenuItemCommand + Validator + Handler (trim DisplayLabel whitespace → null). MenusController +PATCH /api/menus/{key} [Authorize Policy=Permissions.Update] body {isVisible, displayLabel}. **Chunk C** Domain MenuKeys +MenuVisibility const + All[] + DbInitializer +leaf "Menu eOffice" Icon=Eye Order=94 (Workflows shift 94→95). Manual seed Mig 27 LocalDB _Dev (INSERT MenuItems + Permissions Admin). FE Admin: types/menu.ts +isVisible +displayLabel, lib/menuKeys.ts +MenuVisibility, Layout resolver +/system/menu-visibility, App.tsx +Route. NEW pages/system/MenuVisibilityPage.tsx ~210 LOC: PageHeader + 4 StatCard (Tổng/Hiển thị/Đã ẩn/Đã đổi tên) + Search + Table 5 cột (Key mono + parentKey ↳ / Tên gốc / Input "Tên hiển thị" inline placeholder "Mặc định: {label}" / Toggle button emerald-Eye / amber-EyeOff / Lưu khi dirty + Khôi phục khi custom). PATCH endpoint, invalidate ['menus','all'] + ['my-menu'] trigger live update sidebar. Row hidden bg-amber-50/40 highlight, custom label bg-brand-50/40. **Chunk D** fe-user types/menu.ts mirror. Layout.tsx filterForUser 2 tầng (USER_HIDDEN_KEYS structural + !isVisible dynamic). Helper effectiveLabel(n) = displayLabel?.trim() || label. Replace 3 callsite {node.label} → {effectiveLabel(node)}. USER_FIXED_TOP "__inbox" entry +isVisible:true cho type check pass. **fe-admin Layout KHÔNG đụng** — admin sidebar render Label gốc + show hết menu (user Q2=b). **Chunk E Docs (current)**. **Stats Session 20 turn 7**: 26→27 mig, 59 DB tables (no change), ~141→142 endpoints, 33→34 FE pages, ~60→61 menu key, 81 test pass (Q4 UAT defer), 44 gotcha (no new). Memory entries 14 (no new). | `2ea2d27` (A Mig 27) · `ef394f8` (B BE API) · `059bfcb` (C FE admin) · `1ed6530` (D FE user) · (current E Docs) |

View File

@ -0,0 +1,318 @@
# Session 21 turn 2 — RAG Hybrid setup planning + Cách A validation deep dive
**Date:** 2026-05-12 (tiếp S21 turn 1 từ 0030 sáng — sang sáng-chiều-tối discussion deep RAG)
**Dev:** Claude (Opus 4.7 1M Max — em main solo, no SOLUTION_ERP sub-agent spawn)
**Base commit:** `3a34831` (S21 turn 1 chốt cicd-monitor)
**Commits:** `1f8e9af` (RAG plan save) + this chốt (2 commit S21 turn 2)
## Bối cảnh
Sau S21 turn 1 chốt cicd-monitor (4 sub-agents seeds-only), bro đặt câu hỏi về RAG infrastructure cho **5 dự án future > 1M MD context**. Cuộc thảo luận deep ~15+ turn covering:
1. RAG fundamentals + Vector DB role
2. Embedding model "AI nhúng" + Voyage AI cost mechanics
3. Multi-project shared architecture (5 projects)
4. Audit procedure 3-tier + change tracking SQLite
5. UI/UX Streamlit dashboard 7 pages
6. Cách A defensive (giữ blanket) vs Cách B aggressive (cắt 60-70%)
7. Reasoning depth comparison (lazy current vs Cách A vs Cách B)
8. Industry validation via claude-code-guide research
9. Multi-agent cumulative cost reality (4 agents → ~520K cumulative blanket)
10. 3-layer hybrid pattern (Anthropic Contextual Retrieval: embeddings + BM25 + reranking)
## Deliverables
### File mới — `docs/rag-setup-plan.md` (commit `1f8e9af`, 1223 LOC)
Cross-project reference plan với 12 section comprehensive:
1. Context + Why
2. Architecture overview (6-layer diagram)
3. BLANKET load list (~100K, 28% MD)
4. RAG store list (~254K, 72% MD)
5. Tool stack recommend
6. Setup scripts copy-paste ready (~250 LOC Python)
7. Audit procedure 3-tier (weekly/monthly/quarterly)
8. Multi-AI client access (Claude Code + Desktop + Cursor + GPT-4)
9. Timeline rollout 10-14h dedicated session
10. Caveats + risks
11. Success metrics + decision gate
12. Future enhancements
### File extend S21 turn 2 (this chốt commit)
Add 2 sections vào `rag-setup-plan.md`:
- Section 13: Multi-agent cumulative cost reality (Anthropic 8-10× warning)
- Section 14: 3-layer hybrid RAG upgrade path (Phase 1-3 Anthropic Contextual Retrieval)
## Quyết định chốt — Cách A vs Cách B
### Chọn **Cách A** (defensive hybrid) ⭐
```
Blanket: GIỮ NGUYÊN ~120K em main (35% MD)
RAG: ADD as supplement (retrieve on-demand)
Multi-agent: 4 sub-agents share retrieve cache
Sub-agent spawn blanket: ~80-100K each (auto-inject + skills + spec)
Cumulative blanket 5 entities: ~520K
Heavy session billed: ~560K (saving 20% vs lazy)
```
**Why Cách A (priority bro: em main control flow strong):**
1. ✅ State ownership strong — em main biết direct project state
2. ✅ Decision quality 90% (vs Cách B 75-80% do fragmentation)
3. ✅ Wall-clock per task 12 phút (vs Cách B 16 phút)
4. ✅ UX smooth — em response fast direct cho state question
5. ✅ Risk-averse — graceful degradation nếu RAG fail (blanket fallback)
6. ✅ Multi-agent leverage cache hit 70-90% common queries
7. ✅ Quality recall +25-55pp (5-15 sources cross-validated vs lazy 1-3)
### Bỏ **Cách B** (aggressive cut)
```
Blanket: CẮT MẠNH 60-70% (40-50K còn lại)
RAG: PRIMARY access mechanism cho mọi thứ
```
**Why bỏ:**
1. ❌ Vi phạm priority "em main control flow strong"
2. ❌ State ownership weak — phải retrieve mỗi câu state question
3. ❌ UX latency +1-2s per state Q
4. ❌ Decision quality 75-80% do reasoning fragmentation
5. ❌ Risk severe nếu RAG fail (em main ngơ ngác)
6. ❌ Anthropic research warn: "context rot inevitable cutting aggressively"
7. ❌ Cascade retrieve problem (1 task → 2-3 retrieves)
## Industry validation via claude-code-guide research
Spawn 2 lần claude-code-guide agent research (NOT SOLUTION_ERP sub-agents):
### Round 1: Anthropic setup inventory (10 features)
- Memory tool beta (`content-management-2025-06-27`)
- Prompt caching extensions (5min/1h beta)
- Files API beta (`files-api-2025-04-14`)
- Citations stable
- MCP servers official + community (9,400+ in 2026)
- Voyage AI embedding partnership
- Context compaction tool
- Claude Agent SDK orchestration
- Batch API 50% discount
- RAG best practices Anthropic official
### Round 2: Industry practice validation
**5/5 dimensions Cách A fit Anthropic explicit recommend:**
| Dimension | Bro setup | Anthropic pattern |
|---|---|---|
| Context approach | Hybrid blanket+RAG | ✅ Recommended explicit |
| Sub-agent count | 4 | ✅ "3-5 optimal" |
| MD scale | 5 project > 1M | ✅ "Use RAG khi >200K" |
| Stack | Qdrant+Voyage+MCP | ✅ Production validated |
| Coordination | Em main + agents | ✅ "Coordinator+workers" |
**Source 4 Anthropic blog posts:**
- "Effective Context Engineering for AI Agents" (2025)
- "Contextual Retrieval" (Sept 2024 flagship)
- "Effective Harnesses for Long-Running Agents"
- "Multi-Agent Coordination Patterns"
**Community consensus (Tier 1 tools all Hybrid):**
- Cursor IDE `@codebase` indexing
- Continue.dev MCP transport
- Cline / Roo-Cline filesystem + AST + dynamic context
- Aider code-as-graph
- Sourcegraph Cody graph-aware
**ZERO** tools adopt aggressive Cách B pattern. **ALL** evolve toward Cách A hybrid.
## 3-layer hybrid pattern (Anthropic Contextual Retrieval Sept 2024)
```
Layer 1: Embeddings (Voyage-3-large)
→ Semantic + synonym + multilingual catch
Performance: baseline ~50% recall
+ Contextual prefix (Haiku-generated context):
→ +35% improvement = ~67% recall
Layer 2: BM25 (bm25s Python lib free)
→ Exact identifier + technical terms catch
+ Layer 1 = ~75% recall
Layer 3: Reranking (Voyage rerank-2)
→ Cross-attention deep relevance
+ Layer 1+2 = ~85% recall
```
**Phase rollout incremental:**
| Phase | Layer | Recall | Cost/month |
|---|---|---|---|
| Phase 1 (Week 1-4) | Layer 1 vector only | ~70% | ~$1.50 |
| Phase 2 (Month 2) | + Layer 2 BM25 | ~78% | ~$1.50 (BM25 free local) |
| Phase 3 (Month 3) | + Layer 3 + Contextual | ~92% | ~$4-5 |
## Multi-agent cost reality (Anthropic warn 8-10× multiplier)
```
Per entity blanket:
Em main: ~120K
Sub-agent each spawn: ~80-100K (auto-inject baseline + skills + spec)
Cumulative blanket 5 entities = ~520K
Heavy session full 4-agent spawn:
Lazy current: ~700K effective billed
Cách A: ~560K (-20% saving from multi-agent shared cache)
Cost multiplier vs solo em main: ~8-10×
Anthropic acknowledged: "Expect 3-10× token multiplier"
```
**Saving Cách A breakdown (-140K):**
- Em main lazy Read → retrieve: -25K
- 4 agents lazy Read → cached retrieve: -160K (share cache 70-90%)
- Reasoning streamlined: -20K
- Plus +60K retrieve cost added
- Net: -145K ≈ -20% per heavy session
## Stack validated
| Component | Tool | Reason |
|---|---|---|
| **Vector DB** | Qdrant local | Rust binary 50MB, agent-native 2026 leader |
| **Embedding** | Voyage-3-large | Anthropic partner, multilingual 26 lang, $0.18/M |
| **MCP server** | FastMCP Python | Official Anthropic SDK |
| **Chunking** | Custom adaptive Python | §6.5 compliant, transparent |
| **Tracking** | SQLite local | Event log + audit + cost analytics |
| **Dashboard** | Streamlit custom | 7 pages multi-project |
| **Re-index** | Pre-commit hook | Native git, delta on commit |
**Total cost 5 projects:** ~$1.50-5/month depending Phase. ~$0.50 initial embed.
## Em main solo S21 turn 2 (no SOLUTION_ERP sub-agent spawn)
```
Spawn này session:
✅ claude-code-guide × 2 (generic agent for Anthropic research)
❌ Investigator / Implementer / Reviewer / CI/CD Monitor (vẫn seeds-only)
Em main solo qua context paste + Write file + research delegate.
```
## Skills check
6 skills hiện tại unchanged. Decision KHÔNG add skill mới cho RAG vì:
- RAG là decision/architectural pattern, không phải workflow project-specific
- Cross-project applicable → memory entry phù hợp hơn skill
- Per rule §9.5 anti-pattern "viết skill chỉ để có thêm"
- Defer skill creation sau Phase 1 trial validate
## Tests
Unit test 81 unchanged (0 test added — pure planning, không code change).
## Memory entry mới
**`feedback_rag_hybrid_pattern.md`** (NEW — cross-project pattern reusable):
- Decision Cách A rationale (control flow priority)
- Multi-agent cost reality (8-10× multiplier)
- 3-layer hybrid pattern Phase 1-3 incremental rollout
- Stack validated (Voyage + Qdrant + FastMCP)
- When to apply / when NOT apply triggers
- Anti-patterns documented
- Anthropic 4 blog cross-ref
## Verify chain
| Check | Status |
|---|---|
| dotnet build | Không chạy (no .cs change) |
| dotnet test | Không chạy (no test added — pure docs) |
| npm build | Không chạy (no FE change) |
| Push origin | Pending end of turn |
| CI Gitea Actions | Skip per path filter `.md` |
| IIS prod deploy | KHÔNG xảy ra (CI skip, expected) |
## Docs updates
-`docs/STATUS.md` — Last updated S21 turn 2 + Recently Done row top
-`docs/HANDOFF.md` — TL;DR Session 21 turn 2 section + Last updated
-`docs/rag-setup-plan.md` — extend +Section 13 (cost reality) +Section 14 (3-layer)
-`docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md` — file này
- ✅ Memory user-level new: `feedback_rag_hybrid_pattern.md`
- ✅ Memory user-level: `MEMORY.md` index + 1 entry pointer
- ⏭ KHÔNG đụng: rules.md / architecture.md / gotchas.md / database/* / flows/* / skills/* / CLAUDE.md (no real change cho 8 file này)
- ⏭ KHÔNG flush 4 sub-agent MEMORY.md (chưa spawn, per §6.5 KHÔNG add noise)
## Handoff Session 21 turn 3+
### Plan I NEW — RAG Setup Implementation
**Trigger:** Bro confirm 5 dự án path + stack + pilot choice + Voyage API key + disk cleanup 5-8GB free.
**Schedule:** Dedicated session 10-14h weekend (per memory `feedback_drastic_refactor_scope` rule).
**Phases:**
- Phase 1 (Week 1-4): Layer 1 vector embeddings only — ~70% recall — ~$1.50/mo
- Phase 2 (Month 2): + Layer 2 BM25 hybrid — ~78% recall — ~$1.50/mo
- Phase 3 (Month 3): + Layer 3 Reranking + Contextual — ~92% recall — ~$4-5/mo
**Pre-flight task:** Spawn 🔵 Investigator audit MD inventory 5 dự án parallel → tinh chỉnh blanket list per project.
### Plan B Contract V2 wire (vẫn pending S21 turn 1)
- Trial Week 1 multi-agent kick-off SOLUTION_ERP
- 6 tasks (Mig 28+29 + Service + Controller + FE × 2 + Pin V2)
- 4 sub-agents pipeline coordinate (lần đầu spawn 4 agents thật)
### Plan C Test gap fill (vẫn pending)
Bundle Chunk E Plan B — 5 test pending:
- B4 silent 403 regression (gotcha #44 vi phạm §7)
- V2 Service `ApproveV2Async` UPSERT opinion
- Section gộp Chunk C render
- Mig 25 PATCH `/user-selectable`
- Mig 27 PATCH `/api/menus/{key}`
### Plan D-F-G unchanged
- D: Hard blockers ops (UAT/SMTP/creds/backup) — BLOCKED chờ user
- F: Audit định kỳ 2026-06-01 (~3 tuần nữa, KHÔNG tự chạy)
- G: Multi-agent trial 4-week (post-S21 t1 + S21 t2 setup complete)
## Stats cumulative S21 turn 2
| Metric | Trước S21 t2 | Sau S21 t2 | Δ |
|---|---|---|---|
| DB tables | 59 | 59 | 0 |
| Migrations | 27 | 27 | 0 |
| Endpoints | ~142 | ~142 | 0 |
| FE pages | 34 | 34 | 0 |
| Unit tests | 81 | 81 | 0 |
| Gotchas | 44 | 44 | 0 |
| **Memory entries** | 16 | **17** | **+1** (RAG hybrid pattern) |
| Skills | 6 | 6 | 0 |
| Sub-agents | 4 seeds-only | 4 seeds-only | 0 (chưa spawn) |
| **Commits S21** | 2 (`f1c61c9` + `3a34831`) | **4** | **+2** (1f8e9af + this chốt) |
| **MD plan files** | 0 | **1** | **+1** (`rag-setup-plan.md` 1223 LOC + 2 section extend) |
## Cross-ref
- S21 turn 1 session log: `2026-05-12-0030-s21-cicd-monitor-add.md`
- Plan file: `docs/rag-setup-plan.md` (1223 + extend ~300 LOC = ~1500 LOC)
- Memory new: `feedback_rag_hybrid_pattern.md` (cross-project reusable)
- Industry research: claude-code-guide × 2 spawn agent reports
- 4 Anthropic blog cross-ref trong memory entry
## Bài học chốt S21 turn 2
1. **Em main control flow strong là priority bro** — quyết định Cách A defensive over Cách B aggressive
2. **Multi-agent cost realistic 8-10× solo** — KHÔNG tránh được spawn baseline ~400K cumulative 4 agents
3. **Anthropic recommend 3-layer hybrid pattern** — embeddings + BM25 + reranking compound effect
4. **Industry consensus = hybrid** — Cursor + Continue + Cline + Aider all evolve toward hybrid
5. **Voyage Vietnamese quality cần verify Week 1** — voyage-3-large multilingual nhưng explicit Vietnamese benchmark chưa publish
6. **RAG setup = dedicated session 10-14h** — per `feedback_drastic_refactor_scope` rule
7. **5 projects scale workable** — single Qdrant + per-project collection + ~$2-5/month cost

View File

@ -1165,6 +1165,361 @@ Mitigation:
---
## 13. Multi-agent cumulative cost reality (Anthropic 8-10× warning)
> **Added S21 turn 2 (2026-05-12)** — clarification sau khi user catch gap "120K blanket KHÔNG bao gồm 4 agents".
### Per-entity blanket breakdown
```
Em main blanket: ~120K
STATUS + HANDOFF top + rules + architecture + 5 agent .md +
4 MEMORY.md auto-inject + skills desc + memory critical +
auto-inject system reminders
Per sub-agent spawn baseline: ~80-100K each
Agent system prompt (~5K) +
3 skills preload SKILL.md full (~21K, trigger semantic) +
Auto-inject MEMORY.md 25KB first 200 lines (~7K) +
Em main pass spec task (~10-15K) +
Em main paste common context excerpt (~30-50K) +
Auto-inject project context (~10K)
= ~80-100K per sub-agent spawn (per Anthropic docs)
4 sub-agents cumulative: ~400K
(4 × ~100K each, isolated context windows)
TOTAL cumulative blanket 5 entities: ~520K
Em main + 4 sub-agents combined (isolated windows, cumulative billing)
```
### Context windows are ISOLATED
```
KHÔNG phải 5 entities share 520K trong 1 context window 1M.
Mỗi entity có context window 1M RIÊNG:
Em main → context window 1M, dùng ~120K
Investigator → context window 1M, dùng ~100K
Implementer → context window 1M, dùng ~100K
Reviewer → context window 1M, dùng ~100K
CICD Monitor → context window 1M, dùng ~100K
→ Mỗi entity LOST-IN-MIDDLE threshold riêng (~700K each)
→ Mỗi entity capacity ~58 tasks before hit hard cap riêng
NHƯNG billing là CUMULATIVE 520K across all contexts:
Anthropic billing tổng tokens across all 5 windows
→ Hit weekly cap nhanh hơn solo em main 4-5×
```
### Heavy session token compound effect (Cách A vs lazy)
**Without RAG (lazy current — 4 agents spawn):**
```
Em main:
Blanket: 120K
Lazy Read on-demand: ~50K
Reasoning + coordinate: ~30K
= ~200K subtotal
4 sub-agents (each):
Spawn blanket: ~100K
Lazy Read inside agent: ~50K
Reasoning + work: ~30K
Each agent: ~180K
──────────────
4 agents subtotal: ~720K cumulative
SendMessage iteration:
10 round trips × ~30K nominal: 300K nominal
Cache hit 70%: ~90K effective
TOTAL HEAVY SESSION (lazy):
200K + 720K + 90K = ~1010K nominal
After cache discount: ~700K effective billed
```
**With Cách A RAG:**
```
Em main:
Blanket: 120K (unchanged)
RAG retrieve replace lazy Read: ~30K (-20K saving)
Reasoning streamlined: ~25K
= ~175K subtotal (saving 25K)
4 sub-agents (each):
Spawn blanket: ~100K (unchanged)
RAG retrieve (share cache 70-90% common queries): ~15K
Reasoning streamlined: ~25K
Each agent: ~140K (saving 40K each)
──────────────
4 agents subtotal: ~560K (saving 160K total)
SendMessage iteration: ~90K effective (unchanged)
TOTAL HEAVY SESSION (Cách A):
175K + 560K + 90K = ~825K nominal
After cache discount: ~560K effective billed
SAVING: -140K (-20%)
```
### Cost saving breakdown
| Component | Lazy current | Cách A | Saving |
|---|---:|---:|---:|
| Em main blanket (fixed) | 120K | 120K | 0 |
| Em main lazy Read → RAG retrieve | 50K | 30K | -20K |
| Em main reasoning streamlined | 30K | 25K | -5K |
| 4 agents spawn blanket (fixed) | 400K | 400K | 0 |
| 4 agents lazy Read → cached retrieve | 200K | 60K | **-140K** |
| 4 agents reasoning | 120K | 100K | -20K |
| SendMessage cached | 90K | 90K | 0 |
| **TOTAL EFFECTIVE BILLED** | **~700K** | **~560K** | **-140K (-20%)** |
**Saving 80% từ 4 agents** share retrieve cache (cache hit 70-90% common queries cross-agent).
→ Em main saving chỉ 25K (blanket unchanged, chỉ optimize Read → retrieve).
### Multi-agent leverage example concrete
```
Task Plan B Contract V2 wire:
🔵 Inv query "PE V2 schema pattern" → 15K retrieve + cached
🟡 Imp query same → cache hit 90% → 1.5K effective
🔴 Rev query same → cache hit 90% → 1.5K effective
🟢 CICD query same → cache hit 90% → 1.5K effective
Em main query same → cache hit 90% → 1.5K effective
Cumulative retrieve cost: 15K + 4×1.5K = 21K
Compare to lazy:
Each agent Read PE V2 file separately
5 entities × 20K Read = 100K cumulative
→ Saving 79K just for 1 cross-agent query
```
### Optimization tips để giảm cumulative
**Option 1: Spawn ít agents hơn**
- Decision gate 6-criteria mỗi task (per `feedback_multi_agent_setup` rule)
- Solo em main đủ → KHÔNG spawn agent
- Chỉ spawn agent nào THẬT cần
- Trong S20-S21: 4 agents seeds-only, em chưa spawn lần nào → cost ~120K em main thôi
**Option 2: Tune blanket sub-agent (100K → 80K)**
- Em main pass spec gọn (~10K thay 15K)
- Em main paste common context excerpt thay full (~20K thay 50K)
- Skills preload chỉ description (~3K thay 21K full SKILL.md)
→ Trigger SKILL.md full khi semantic match
- Per sub-agent: 100K → 80K
- 4 agents cumulative: 400K → 320K
- Heavy session: 560K → 480K (-15%)
**Option 3: SendMessage cache aggressive (1h TTL beta)**
- Anthropic extended cache `extended-cache-ttl-2025-04-11`
- Static prompts cache premium WRITE 2× base
- Subsequent reads 0.1× discount
- Multi-agent cùng cache prefix → benefit lớn
- Saving 10-15% additional
---
## 14. 3-layer hybrid RAG upgrade path (Anthropic Contextual Retrieval)
> **Added S21 turn 2 (2026-05-12)** — Anthropic flagship pattern Sept 2024.
### Pattern overview
```
Anthropic Contextual Retrieval = 3 layers compound:
Layer 1: Embeddings (Voyage-3-large)
→ Semantic + synonym + multilingual catch
+ Contextual prefix (Haiku-generated context):
Add chunk-specific context BEFORE embed
"This chunk discusses... in context of..."
→ Better recall via enriched vector
Layer 2: BM25 (bm25s Python lib free local)
→ Exact identifier + technical terms (function names, error codes, Mig numbers)
+ Contextual BM25 (same prefix pattern)
Layer 3: Reranking (Voyage rerank-2)
→ Cross-attention deep relevance
→ Re-score top 30 candidates → return top 5 truly relevant
```
### Performance compound effect
```
Baseline (naive vector embeddings): ~50% recall
+ Contextual embeddings: ~67% recall (-35% failure)
+ Hybrid Contextual + BM25: ~75% recall (-49% failure)
+ Reranking: ~85% recall (-67% failure)
```
📎 Source: [Anthropic Contextual Retrieval Sept 2024](https://www.anthropic.com/news/contextual-retrieval)
### Phase rollout incremental (recommend cho bro)
| Phase | Setup | Recall | Cost/month | Effort additional |
|---|---|---:|---:|---|
| **Phase 1** (Week 1-4) | Layer 1 vector only (Voyage-3-large) | ~70% | ~$1.50 | 10-14h initial |
| **Phase 2** (Month 2) | + Layer 2 BM25 (bm25s free local) | ~78% | ~$1.50 unchanged | 2-3h |
| **Phase 3** (Month 3) | + Layer 3 Voyage rerank-2 + Contextual prefix | ~92% | ~$4-5 | 3-4h |
### Phase 1 implementation (basic vector RAG)
Đã cover trong Section 5-6 plan. Bro implement Week 1-4 trial pilot.
### Phase 2 upgrade — Add BM25 hybrid
```python
# scripts/rag-mcp-server.py — upgrade
from bm25s import BM25
bm25 = BM25.load("./rag-data/bm25_index") # pre-built
@mcp.tool()
def rag_retrieve_hybrid(query, scope="all", k=5):
# Step 1: Vector search
query_vec = voyage.embed([query], model="voyage-3-large").embeddings[0]
vector_results = qdrant.search(COLLECTION, query_vec, limit=20)
# Step 2: BM25 search (local Python lib)
bm25_results = bm25.retrieve(query, k=20)
# Step 3: Merge + dedup
candidates = merge_dedup(vector_results, bm25_results) # ~30 chunks
# Step 4: Score combine (RRF reciprocal rank fusion)
final_scores = reciprocal_rank_fusion(vector_results, bm25_results)
return final_scores[:k]
```
### Phase 3 upgrade — Full Anthropic Contextual
```python
# scripts/rag-indexer.py — upgrade với contextual prefix
import anthropic
claude_haiku = anthropic.Anthropic()
def contextualize_chunk(chunk_content, full_doc_path):
"""Generate context prefix using Claude Haiku (cheap model)."""
full_doc = open(full_doc_path).read()
response = claude_haiku.messages.create(
model="claude-haiku-4-5", # cheap ~$0.0001/chunk
max_tokens=150,
messages=[{
"role": "user",
"content": f"""<document>
{full_doc[:5000]}
</document>
<chunk>
{chunk_content}
</chunk>
Give a brief context (50-100 words) explaining what this chunk is about and where it fits in the document. Be specific."""
}]
)
return response.content[0].text
# In indexer pipeline:
for chunk in chunks:
context = contextualize_chunk(chunk["content"], chunk["source"])
chunk["content_enriched"] = f"{context}\n\n{chunk['content']}"
# Embed enriched version → better recall
```
```python
# scripts/rag-mcp-server.py — final upgrade với reranking
import voyageai
@mcp.tool()
def rag_retrieve_full(query, scope="all", k=5):
# Step 1-3: Same as Phase 2 (vector + BM25 + merge)
candidates = hybrid_search(query, scope, top=30)
# Step 4: Voyage Rerank
rerank_response = voyage.rerank(
query=query,
documents=[c.content for c in candidates],
model="voyage-rerank-2", # ~$0.05 per 1000 queries
top_k=k
)
return [candidates[r.index] for r in rerank_response.results]
```
### Cost incremental analysis
```
Phase 1 → Phase 3 incremental cost:
Phase 1 (basic vector):
Voyage embed: ~$0.36 initial + ~$0.20/mo delta
= ~$1.50/mo total
Phase 2 (+BM25):
BM25 free local (Python lib)
Embedding cost same
= ~$1.50/mo total (unchanged)
Phase 3 (+Reranking + Contextual):
Voyage rerank-2: ~$0.05 per 1000 queries
600 queries/mo × $0.05/1K = $0.03/mo
Haiku contextual prefix: ~$0.0001 per chunk
Initial 5000 chunks × $0.0001 = $0.50 one-time
Delta ~100 chunks/mo × $0.0001 = $0.01/mo
+ Voyage rerank monthly: ~$0.05/mo per 1K queries × 5 projects
+ Re-embed enriched chunks: ~$0.50/mo
= ~$4-5/mo total
→ Quality jump 70% → 92% recall = +22pp
→ Cost jump $1.50 → $4-5/mo = +$3
→ Worth it after Phase 1 validation
```
### Why incremental rollout (vs all-in Phase 3 immediate)
1. **Validate Layer 1 quality first** — nếu Voyage Vietnamese kém → upgrade Phase 2-3 vô ích
2. **Measure baseline cost** — biết exact Voyage spend trước add rerank/contextual
3. **Identify retrieval miss patterns** — Phase 1 trial reveal weakness → target Phase 2-3 fix
4. **Risk-averse setup** — mỗi phase 2-3h add, rollback dễ nếu fail
5. **§6.5 narrative preserve** — KHÔNG over-engineer, build incremental
### When to skip Phase 2-3
- Phase 1 recall already > 85% → Phase 2-3 marginal benefit (Vietnamese-specific corpus)
- Cost monthly < $5 budget → stay Phase 1 OK
- Solo dev no Vietnamese exact terms heavy → BM25 less impactful
### When to MUST upgrade Phase 2-3
- Recall < 70% on benchmark indicate Phase 1 insufficient
- Em main report "miss exact identifier" frequently Phase 2 BM25 critical
- Multi-language queries common Phase 3 reranker stabilize
- Production quality target > 90% → Phase 3 required
---
## 📚 References + tools
### Anthropic official