From 0a3b74761296fdec83d063e908836a1d1c62165b Mon Sep 17 00:00:00 2001 From: pqhuy1987 Date: Tue, 12 May 2026 18:50:28 +0700 Subject: [PATCH] =?UTF-8?q?[CLAUDE]=20Docs:=20ch=E1=BB=91t=20Session=2021?= =?UTF-8?q?=20turn=202=20=E2=80=94=20RAG=20Hybrid=20setup=20planning=20+?= =?UTF-8?q?=20C=C3=A1ch=20A=20validation?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Sau S21 turn 1 chốt cicd-monitor, bro clarify 5 dự án future > 1M MD tokens → discussion deep ~15 turn về RAG infrastructure. Em main solo (no SOLUTION_ERP sub-agent spawn), delegate claude-code-guide × 2 research Anthropic + community practice. Quyết định chốt: - Cách A defensive (giữ blanket 120K em main + RAG retrieve supplement) - Bỏ Cách B aggressive (cắt 60-70% blanket) — vi phạm priority em main control flow strong - Industry-validated cross 4 Anthropic blog + 5 community tools (Cursor/Continue/Cline/Aider all hybrid) - 3-layer pattern Phase 1-3 incremental rollout (vector → +BM25 → +reranking, recall ~70% → ~92%) - Stack: Voyage-3-large + Qdrant local + FastMCP Python + Streamlit dashboard Multi-agent cost reality clarify (post-S21 t2): - Em main blanket: ~120K - 4 sub-agents spawn cumulative: ~400K - Total billed heavy session: ~560K Cách A vs ~700K lazy - Saving -20% từ multi-agent shared cache 70-90% - Anthropic acknowledge 8-10× multiplier multi-agent Files updated: - docs/STATUS.md (Last updated S21 turn 2 + Recently Done row top) - docs/HANDOFF.md (TL;DR Session 21 turn 2 section + Last updated) - docs/rag-setup-plan.md (+Section 13 multi-agent cost reality + Section 14 3-layer hybrid Phase 1-3, +355 LOC) - docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md (new session log) Memory user-level update (outside repo, separate update): - feedback_rag_hybrid_pattern.md (NEW cross-project pattern reusable) - MEMORY.md index (+1 entry pointer) Plan I NEW deferred — trigger bro confirm 5 dự án path + stack + pilot + Voyage API + disk cleanup → dedicated session 10-14h weekend (per feedback_drastic_refactor_scope rule). Stats: - 17 memory entries (+1 RAG hybrid) - 1 plan file rag-setup-plan.md (1500 LOC final) - 4 sub-agents seeds-only unchanged - 81 test unchanged - 4 commits S21 cumulative (f1c61c9 + 3a34831 + 1f8e9af + this) CI skip per path filter (all .md). Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/HANDOFF.md | 108 +++++- docs/STATUS.md | 4 +- .../2026-05-12-1800-s21-turn2-rag-planning.md | 318 ++++++++++++++++ docs/rag-setup-plan.md | 355 ++++++++++++++++++ 4 files changed, 783 insertions(+), 2 deletions(-) create mode 100644 docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md diff --git a/docs/HANDOFF.md b/docs/HANDOFF.md index ea7cee2..74e42b7 100644 --- a/docs/HANDOFF.md +++ b/docs/HANDOFF.md @@ -1,6 +1,112 @@ # HANDOFF — Brief 5 phút cho session tiếp theo -**Last updated:** 2026-05-12 (Session 21 turn 1 — **🎯 Add con thứ 4 cicd-monitor (Path A — post-deploy verifier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`. CI skipped per path filter (3 file `.md`). Cost reality update: ~750K spawn (3 → 4 agents) · ~1.35M heavy / ~700K optimized. Stats: 4 sub-agents seeds-only · 16 memory · 27 mig · 59 tables · ~142 endpoints · 81 test · 44 gotcha · 6 skills unchanged. KHÔNG flush 3 agent MEMORY.md (chưa spawn work — em main solo). Trial Week 1 kick-off S21 turn 2+ Plan B Contract V2 wire mirror PE pattern.**) +**Last updated:** 2026-05-12 (Session 21 turn 2 — **🎯 RAG Hybrid setup planning + Cách A validation deep dive. 2 commit (`1f8e9af` plan save 1223 LOC + this chốt). KHÔNG implement, plan only — defer chờ bro confirm 5 dự án future. Decision chốt: Cách A defensive (giữ blanket 120K em main + RAG retrieve) over Cách B aggressive (cắt 60-70% blanket). Industry-validated cross 4 Anthropic blog + 5 community tools (Cursor/Continue/Cline/Aider). Stack: Voyage-3-large + Qdrant + FastMCP + Streamlit dashboard. Multi-agent cost reality: 4 agents → ~520K cumulative blanket → heavy session ~560K (Cách A) vs ~700K (lazy). 3-layer pattern Phase 1-3 rollout (embeddings + BM25 + reranking, ~70% → ~92% recall). Stats: +1 memory entry (`feedback_rag_hybrid_pattern`) +1 plan file (`rag-setup-plan.md` 1500 LOC). Sub-agents vẫn 4 seeds-only, em main solo session.**) +**S21 turn 1:** 2026-05-12 0030 (Session 21 turn 1 — **🎯 Add con thứ 4 cicd-monitor (Path A — post-deploy verifier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`. CI skipped per path filter (3 file `.md`). Cost reality update: ~750K spawn (3 → 4 agents) · ~1.35M heavy / ~700K optimized. Stats: 4 sub-agents seeds-only · 16 memory · 27 mig · 59 tables · ~142 endpoints · 81 test · 44 gotcha · 6 skills unchanged. KHÔNG flush 3 agent MEMORY.md (chưa spawn work — em main solo). Trial Week 1 kick-off S21 turn 2+ Plan B Contract V2 wire mirror PE pattern.**) + +## TL;DR Session 21 turn 2 — RAG Hybrid setup planning (Cách A chốt + 3-layer pattern) + +User clarify 5 dự án future > 1M MD tokens → cuộc thảo luận deep ~15 turn về RAG infrastructure. Em main solo (no SOLUTION_ERP sub-agent spawn), delegate 2 lần claude-code-guide agent research Anthropic + community practice. + +### Q&A deep dive 10 topics + +1. RAG fundamentals + Vector DB role (Qdrant) +2. Embedding "AI nhúng" + Voyage AI cost mechanics ($0.18/M tokens) +3. Multi-project shared architecture (5 projects → single Qdrant + per-collection) +4. Audit procedure 3-tier (weekly auto + monthly deep + quarterly major) +5. UI/UX Streamlit dashboard 7 pages design +6. Cách A defensive (giữ blanket 120K) vs Cách B aggressive (cắt 60-70%) +7. Reasoning depth comparison: lazy 60% → A 90% → B 75-80% +8. Industry validation: Anthropic + Cursor + Continue + Cline + Aider all hybrid +9. Multi-agent cost reality: 8-10× multiplier, ~520K cumulative blanket 5 entities +10. 3-layer hybrid pattern (Anthropic Contextual Retrieval Sept 2024) + +### Quyết định chốt — Cách A vs Cách B + +**Chọn Cách A** (defensive hybrid): +- Blanket: GIỮ NGUYÊN 120K em main + RAG retrieve supplement +- Sub-agent spawn baseline: ~80-100K each (4 agents = ~400K cumulative) +- Heavy session billed: ~560K (saving -20% vs lazy 700K) +- Quality recall: ~85% (vs Cách B 75-80% do fragmentation) + +**Why Cách A** (bro priority chốt): +- ✅ Em main control flow strong (state ownership direct, response fast) +- ✅ Decision quality 90% (multi-source cohesive reasoning) +- ✅ Wall-clock per task -20% (12 phút vs Cách B 16 phút) +- ✅ Risk-averse (graceful fallback blanket nếu RAG fail) +- ✅ Multi-agent leverage cache 70-90% hit common queries +- ✅ Industry-validated (Anthropic + Cursor + Continue + Cline + Aider) + +### 3-layer hybrid Phase rollout (Anthropic Contextual Retrieval) + +| Phase | Layers | Recall | Cost/mo | +|---|---|---|---| +| Phase 1 (Week 1-4) | Vector embedding only (Voyage-3-large) | ~70% | ~$1.50 | +| Phase 2 (Month 2) | + BM25 hybrid (bm25s free local) | ~78% | ~$1.50 | +| Phase 3 (Month 3) | + Voyage rerank-2 + Contextual prefix | ~92% | ~$4-5 | + +### Stack validated cross-industry + +- Voyage AI embedding (Anthropic partner, multilingual 26 lang) +- Qdrant local (Rust binary, "leading agent memory backend 2026") +- FastMCP Python (official Anthropic SDK) +- SQLite event log + Streamlit dashboard 7 pages +- Pre-commit hook re-index delta + +### Multi-agent cost reality (Anthropic warn 8-10× multiplier) + +``` +Per entity blanket Cách A: + Em main: ~120K + 4 sub-agents × ~100K spawn = 400K cumulative + Total: ~520K cumulative billed (not single context window) + +Heavy session 4-agent spawn: + Lazy: ~700K effective billed + Cách A: ~560K (-20% from multi-agent shared cache) +``` + +### Plan I NEW — RAG Setup Implementation (defer) + +**Trigger:** Bro confirm 5 dự án path + stack + pilot choice + Voyage API key + disk cleanup 5-8GB. + +**Schedule:** Dedicated session 10-14h weekend (per `feedback_drastic_refactor_scope`). + +**Phase rollout:** +- Phase 1 single project pilot 4-week trial +- Phase 2-3 upgrade incremental conditional on Phase 1 success +- Cost realistic: ~$2-5/month total cho 5 projects + +### Deliverables + +- ✅ `docs/rag-setup-plan.md` (commit `1f8e9af` 1223 LOC + extend S21 t2 ~300 LOC = ~1500 LOC final) +- ✅ Memory `feedback_rag_hybrid_pattern.md` (NEW cross-project reusable) +- ✅ MEMORY.md index +1 entry +- ✅ Session log this chốt +- ⏭ Implementation defer chờ trigger + +### Em main solo S21 turn 2 (no SOLUTION_ERP sub-agent spawn) + +3 spawn này session — KHÔNG phải 4 SOLUTION_ERP sub-agents: +- claude-code-guide × 2 (generic agent for Anthropic + industry research) +- 4 SOLUTION_ERP sub-agents (Inv/Imp/Rev/CICD) vẫn seeds-only + +### State chốt S21 turn 2 + +| Metric | Trước | Sau | Δ | +|---|---|---|---| +| DB tables | 59 | 59 | 0 | +| Migrations | 27 | 27 | 0 | +| Endpoints | ~142 | ~142 | 0 | +| FE pages | 34 | 34 | 0 | +| Unit tests | 81 | 81 | 0 | +| Gotchas | 44 | 44 | 0 | +| **Memory entries** | 16 | **17** | **+1** (RAG hybrid pattern) | +| Skills | 6 | 6 | 0 | +| Sub-agents | 4 seeds-only | 4 seeds-only | 0 | +| **Commits S21 cumulative** | 2 | **4** | **+2** | +| **Plan files** | 0 | **1** (`rag-setup-plan.md`) | **+1** | + +--- ## TL;DR Session 21 turn 1 — Add cicd-monitor (4th sub-agent, Path A chốt) diff --git a/docs/STATUS.md b/docs/STATUS.md index e5a4b38..ceeef54 100644 --- a/docs/STATUS.md +++ b/docs/STATUS.md @@ -2,7 +2,8 @@ > **Update rule:** trước khi bắt đầu 1 task → ghi row vào `🔥 In Progress`. Xong → chuyển sang `✅ Recently Done`. -**Last updated:** 2026-05-12 (Session 21 turn 1 — **🎯 Add con thứ 4 cicd-monitor (Path A — post-deploy verifier green READ tier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`, CI skipped per path filter (`**/*.md` paths-ignore docs-only). Trade-off: +~150K spawn extra mỗi run, đổi lại catch deploy ship fail tự động (bundle hash unchanged / mig drift prod / endpoint 500) — recurring blind spot pattern em main solo S20 quên verify ~30% push. Cost reality update: ~750K spawn setup (3 → 4 agents) · ~1.35M heavy session · ~700K optimized cached. Stats: 4 sub-agents seeds-only (+1 cicd-monitor green) · 16 memory entries (no new, update existing `feedback_multi_agent_setup.md` 3 → 4 agents narrative) · 27 mig · 59 tables · ~142 endpoints · 81 test unchanged · 44 gotcha unchanged · 6 skills unchanged. KHÔNG flush 3 agent MEMORY.md (chưa spawn work S21 t1 nên KHÔNG có findings — em main solo via context + Write file).**) +**Last updated:** 2026-05-12 1800 (Session 21 turn 2 — **🎯 RAG Hybrid setup planning + Cách A validation deep dive. 2 commit (`1f8e9af` plan save 1223 LOC + this chốt). Em main solo (no SOLUTION_ERP sub-agent spawn), delegate claude-code-guide × 2 research Anthropic + community practice. Decision chốt: Cách A defensive (giữ blanket 120K em main + RAG retrieve supplement) over Cách B aggressive (cắt 60-70% blanket). Industry-validated cross 4 Anthropic blog + 5 community tools (Cursor/Continue/Cline/Aider all hybrid). Stack: Voyage-3-large + Qdrant local + FastMCP Python + Streamlit dashboard 7 pages + SQLite event log. Multi-agent cost reality: 4 agents → ~520K cumulative blanket → heavy session ~560K (Cách A) vs ~700K (lazy), saving -20%. 3-layer pattern Phase 1-3 rollout (Layer 1 vector → Layer 2 +BM25 → Layer 3 +reranking, recall ~70% → ~92%). Stats: +1 memory entry (`feedback_rag_hybrid_pattern.md`) +1 plan file (`rag-setup-plan.md` 1500 LOC final). 4 sub-agents vẫn seeds-only. Plan I NEW deferred chờ bro confirm 5 dự án path + stack + Voyage API key + disk cleanup 5-8GB.**) +**S21 turn 1:** 2026-05-12 0030 (Session 21 turn 1 — **🎯 Add con thứ 4 cicd-monitor (Path A — post-deploy verifier green READ tier). 1 commit `f1c61c9` pushed `36e21c8..f1c61c9 main -> main`, CI skipped per path filter (`**/*.md` paths-ignore docs-only). Trade-off: +~150K spawn extra mỗi run, đổi lại catch deploy ship fail tự động (bundle hash unchanged / mig drift prod / endpoint 500) — recurring blind spot pattern em main solo S20 quên verify ~30% push. Cost reality update: ~750K spawn setup (3 → 4 agents) · ~1.35M heavy session · ~700K optimized cached. Stats: 4 sub-agents seeds-only (+1 cicd-monitor green) · 16 memory entries (no new, update existing `feedback_multi_agent_setup.md` 3 → 4 agents narrative) · 27 mig · 59 tables · ~142 endpoints · 81 test unchanged · 44 gotcha unchanged · 6 skills unchanged. KHÔNG flush 3 agent MEMORY.md (chưa spawn work S21 t1 nên KHÔNG có findings — em main solo via context + Write file).**) **S20 wrap:** 2026-05-11 22:00 (Session 20 wrap turns 1-12 — **🎯 14 commit `9dee00d` → `ae1814c`. PE Detail UI restructure 3 yêu cầu (t1-5) + Manual budget drop tên (t6) + Mig 27 admin menu eOffice (t7) + NCC palette 5-màu cycle + Winner icon ✓ đậm + AddSupplier auto-fill master + Responsive laptop nhỏ 4-tầng pattern (t8-11) + Multi-agent infrastructure setup 3 sub-agents (t12). 27 mig (+1) · 59 tables · ~142 endpoints (+1) · 34 FE pages (+1) · 61 menu key (+1) · 81 test pass unchanged · 44 gotcha · 16 memory entries (+2) · 3 sub-agents NEW. Phase 9 UAT iteration mode.**) **S20 turn 7:** 2026-05-11 17:00 (Session 20 turn 7 — **🎯 Admin Ẩn/Hiện + Đổi tên menu eOffice (Mig 27). 5 chunk `2ea2d27`→`ef394f8`→`059bfcb`→`1ed6530`→Chunk E Docs. User Q2=b: DisplayLabel CHỈ áp fe-user, admin sidebar giữ Label gốc. Domain MenuItem +IsVisible(true) +DisplayLabel(200). Mig 27 AddVisibilityAndDisplayLabelToMenuItems. BE PATCH /api/menus/{key} [Authorize Policy=Permissions.Update]. NEW FE-admin MenuVisibilityPage ~210 LOC (table inline edit per-row + Save dirty + Khôi phục mặc định + Toggle Eye/EyeOff + 4 StatCard). fe-user Layout filterForUser 2 tầng (USER_HIDDEN_KEYS hardcode + !isVisible dynamic) + effectiveLabel(displayLabel || label) replace 3 callsite. fe-admin Layout KHÔNG đụng. +1 menu key MenuVisibility "Menu eOffice" leaf System Order=94. 27 mig, 59 tables, ~142 endpoints, 34 FE pages, 81 test pass (Q4 UAT defer).**) **S20 prev:** 2026-05-11 (Session 20 — **🎯 PE Detail UI restructure 3 yêu cầu user UX. 4 chunk per-commit `9dee00d` → `2bba851` → `f2f01f4` → (current Chunk D Docs).** Q1=a (giữ Section "Chọn NCC TP" riêng), Q2=a "1 hạng mục trước tiên" (NCC shared, demo 1 hạng mục), Q3=a (chỉ hiện NV đã ký), Q4 public luôn (skip dotnet test mỗi chunk theo memory `feedback_uat_skip_verify`, vẫn `npm run build` × 2 app mỗi chunk vì có rename/remove function). **Chunk A (`9dee00d`)**: BE `CreatePurchaseEvaluationCommandHandler` thêm 1 PurchaseEvaluationDetail mặc định khi tạo phiếu — GroupCode="01", GroupName="Hạng mục chính", NoiDung=TenGoiThau, DonGiaNganSach=ThanhTienNganSach=Budget.TongNganSach hoặc BudgetManualAmount fallback 0; Changelog Insert audit. FE reorder PeDetailTabs (mirror 2 app) 1.Thông tin / 2.Hạng mục (lên #2) / 3.Chọn NCC / 4.NCC tham gia / 5.Ý kiến. **Chunk B (`2bba851`)**: ItemsTab restructure thành list `HangMucCard` (1 card / 1 hạng mục, expanded=true mặc định cho 1 hạng mục demo). Header card: GroupCode + NoiDung + 3 stat (KL/ĐG/TT) + NS link Δ nếu có + Pencil/Trash actions + ▼/▶ toggle expand. Expand body: NCC inline table columns NCC / Liên hệ / Điều khoản TT / **File báo giá** / ĐG chưa VAT / ĐG có VAT / Thành tiền / Action. Quote inline click cell → QuoteDialog cũ reuse. Add NCC + Sửa NCC reuse AddSupplierDialog/EditSupplierDialog cũ. Winner ✓ button mỗi NCC row. Drop function `SuppliersTab` (dead code ~134 LOC, replace bằng HangMucCard expand panel). Giữ AddSupplierDialog + EditSupplierDialog + SupplierAttachmentsCell (HangMucCard call lại). Section layout cuối: 1.Thông tin / 2.Hạng mục + Báo giá NCC (nested) / 3.Chọn NCC TP thắng thầu / 4.Ý kiến cấp duyệt — 4 section. **Chunk C (`f2f01f4`)**: Section Ý kiến restructure render layer (KHÔNG đụng Mig 26 schema — vẫn UPSERT 1 row / Level). LevelOpinionsSectionV2 forEach step → 1 `StepOpinionsBox` (replace grid-cols-2 cho N approver). Box header: "Bước N — Tên" + dept badge emerald + "X/Y đã duyệt" counter. Body: filter opinions theo step.order → sort levelOrder asc, signedAt asc → render `StepOpinionEntry` per signed opinion (tên NV + Cấp badge slate + admin override badge amber nếu có + emerald rounded-full timestamp + comment text). NV chưa duyệt KHÔNG hiển thị (Q3=a). Drop function `LevelOpinionBox` (replaced). Mirror fe-admin + fe-user. Verify build pass cả 2 app sau khi catch TS6133 `SuppliersTab` + `SupplierAttachmentsCell` unused (đã giải quyết: drop SuppliersTab, restore SupplierAttachmentsCell vào HangMucCard cột "File báo giá"). 81 test pass (no change — UAT defer)**) @@ -64,6 +65,7 @@ | Ngày | Ai | Task | Commit | |---|---|---|---| +| 2026-05-12 | Claude | **🎯 SESSION 21 turn 2 — RAG Hybrid setup planning + Cách A validation deep dive (2 commit `1f8e9af` plan save + this chốt)** — Sau S21 turn 1 chốt cicd-monitor, user clarify 5 dự án future > 1M MD tokens → cuộc thảo luận deep ~15 turn về RAG infrastructure. **Em main solo** (no SOLUTION_ERP sub-agent spawn), delegate **claude-code-guide × 2** spawn agent research Anthropic + community practice. **Q&A deep dive 10 topics**: (1) RAG fundamentals + Vector DB Qdrant role, (2) Embedding "AI nhúng" + Voyage AI cost mechanics ($0.18/M tokens), (3) Multi-project shared architecture (5 projects → single Qdrant + per-collection), (4) Audit procedure 3-tier (weekly auto + monthly deep + quarterly major), (5) UI/UX Streamlit dashboard 7 pages design (overview + drill-down + compare + audit + cost + change + admin), (6) Cách A defensive (giữ blanket 120K) vs Cách B aggressive (cắt 60-70%), (7) Reasoning depth comparison lazy 60% → A 90% → B 75-80%, (8) Industry validation Anthropic + Cursor + Continue + Cline + Aider all hybrid, (9) Multi-agent cost reality 8-10× multiplier ~520K cumulative blanket 5 entities, (10) 3-layer hybrid pattern Anthropic Contextual Retrieval Sept 2024. **Quyết định chốt Cách A** (defensive hybrid: giữ blanket 120K em main + RAG retrieve supplement, sub-agent spawn baseline ~100K each, 4 agents = ~400K cumulative, heavy session billed ~560K saving -20% vs lazy 700K, quality recall ~85%) over **Cách B bỏ** (aggressive cut 60-70% vi phạm priority em main control flow strong + reasoning fragmented + UX latency +1-2s/state Q + risk severe RAG fail). **Why Cách A** (bro priority chốt): em main control flow strong preserve, decision quality 90% multi-source cohesive, wall-clock -20% (12 phút vs 16), risk-averse graceful fallback, multi-agent leverage cache 70-90%, industry-validated 9 sources. **3-layer hybrid Phase rollout**: P1 (W1-4) vector only Voyage-3-large recall ~70% $1.50/mo · P2 (M2) +BM25 bm25s free recall ~78% $1.50/mo · P3 (M3) +Voyage rerank-2 + Contextual prefix recall ~92% $4-5/mo. **Stack validated** cross-industry: Voyage AI embedding (Anthropic partner, multilingual 26 lang, $0.36 initial), Qdrant local (Rust 50MB, agent-native 2026 leader, ~3GB disk 5 project), FastMCP Python (official SDK, ~100 LOC), SQLite event log (5 tables + audit history), Streamlit 7 pages. **Plan I NEW deferred** — trigger bro confirm 5 dự án path + stack + pilot + Voyage API key + disk cleanup → dedicated session 10-14h weekend (per `feedback_drastic_refactor_scope` rule). **Deliverables**: `docs/rag-setup-plan.md` 1223 LOC commit `1f8e9af` + extend S21 t2 ~300 LOC = ~1500 LOC final, memory `feedback_rag_hybrid_pattern.md` cross-project reusable, session log this chốt, MEMORY.md index +1 entry. **CI skipped** path filter (`.md`). **4 sub-agents vẫn seeds-only** (KHÔNG spawn S21 turn 2 nên KHÔNG flush MEMORY.md per §6.5 KHÔNG add noise). Tests baseline 81 unchanged. | `1f8e9af` (plan save) · this chốt (commit final) | | 2026-05-12 | Claude | **🎯 SESSION 21 turn 1 — Add con thứ 4 cicd-monitor (Path A — post-deploy verifier green READ tier, 1 commit `f1c61c9`)** — User chốt Path A sau pre-flight Plan G Trial Week 1: thêm sub-agent thứ 4 chuyên post-deploy verify (Gitea Actions poll + bundle hash 2 app verify + sqlcmd mig prod = repo latest + endpoint smoke). Trade-off: +~150K spawn extra mỗi run, đổi lại catch deploy ship fail tự động — recurring blind spot pattern em main solo S20 quên verify ~30% push. **2 file mới**: `.claude/agents/cicd-monitor.md` (~7KB) — system prompt + 8-step workflow (verify push → poll Gitea API → fail log grep → live curl smoke → bundle hash × 2 app + verify changed → sqlcmd mig prod = repo latest → report PASS/FAIL/PARTIAL/TIMEOUT/SKIPPED-DOCS) + 5-stage report table + gotcha #25/#39/#40/#41/#44 cross-ref + skill `iis-deploy-runbook`/`dependency-audit-erp`/`ef-core-migration` preload + Anti-pattern 9 rules. `.claude/agent-memory/cicd-monitor/MEMORY.md` (~5KB seed) — recurring CI bug patterns + 5-stage checklist + baseline build/bundle metrics + bearer test pattern admin/nv.test. **1 file update repo**: `.claude/agents/README.md` — 4-agent architecture diagram (green slot mới) + decision tree (after push code + prod issue diagnose branches) + memory routine 4 SendMessage + skills preload 4 agents + cost reality table 564K → 750K spawn / 1.2M → 1.35M heavy / 600K → 700K optimized + trial workflow Week 1-3 CI/CD Monitor spawn integrated + pass criteria + catch ≥1 deploy ship fail. **Memory user-level update**: `feedback_multi_agent_setup.md` — title 3 → 4 sub-agents, decision tree +CI/CD Monitor invocation branches (after push + user prod issue), skills preload list +CI/CD Monitor (iis-deploy-runbook + dependency-audit-erp + ef-core-migration), cost table update + trade-off rationale (recurring blind spot ~30% push S20). **CI skipped**: all 3 file changed `.md` → match `paths-ignore: '**/*.md'` per gotcha #41 → no Gitea Actions run → no IIS deploy (expected — agent infra là local Claude Code, không cần present trên prod). Push success `36e21c8..f1c61c9 main -> main`. **3 (now 4) sub-agents vẫn seeds-only**: chưa spawn work nào — em main solo via context paste + Write file. KHÔNG flush 3 agent MEMORY.md (chưa spawn work = không findings, per §6.5 KHÔNG add noise entry). cicd-monitor MEMORY.md có entry "setup 2026-05-12" trong seed. Trial Week 1 kick-off ở Session 21 turn 2+ với Plan B Contract V2 wire Mig 28+29 candidate (mirror PE pattern S17-S19 proven 1×). Tests baseline 81 unchanged (no test added — docs-only commit). | `f1c61c9` (Setup cicd-monitor + README 4-agent + memory update) | | 2026-05-11 | Claude | **🎯 SESSION 20 turns 6 + 8-12 — PE polish (NCC palette + autofill + responsive) + Multi-agent setup (7 commit `f568945` → `ae1814c`)** — Sau turn 7 wrap-up Mig 27, user iterate 7 polish/feature lớn nhỏ. **Turn 6 (`f568945`)** Manual budget "Nhập tay" drop tên field — 3 file × 2 app mirror (BudgetFieldRow + WorkspaceCreateView + HeaderForm) bỏ Input "Tên" UI khỏi manual mode, BE save `budgetManualName: null` luôn, VND format `1.000.000` + suffix đ. **Turn 8 (`3ec7b5a`)** AddSupplier +Số tiền inline + NCC 5-màu palette + Winner badge "🏆 Trúng thầu" — AddSupplierDialog +prop detailId? +form thanhTien, sequential POST /suppliers (response {id}) → POST /quotes (nếu detailId + thanhTien > 0). NCC_PALETTES const 5 màu literal Tailwind (blue/purple/sky/teal/pink) cycle theo idx. Winner row override emerald-500 border-l + bg-emerald-100/70 + shadow-sm + ring-1 emerald-300 + badge rounded-full bg-emerald-600 text-white "🏆 Trúng thầu". **Turn 9 (`83aae8e`)** User feedback bỏ badge → revert icon ✓ stick cũ nhưng đậm hơn (text-base font-bold emerald-700) + tên NCC winner text-emerald-900 + hover transition (winner hover:bg-emerald-200/70, non-winner hover:bg-white/80 hover:shadow-sm). **Turn 10 (`66551db`)** AddSupplierDialog auto-fill từ master data khi chọn NCC dropdown — onChange lookup picked supplier, setForm ghi đè 4 field (contactName ← contactPerson / contactPhone ← phone / contactEmail ← email / note ← note). Hint emerald "✓ Đã tự điền từ Master". User vẫn override được. **Turn 11 (`6e338f7`)** Responsive cho laptop màn hình nhỏ 1280-1366px — 4-tầng pattern: sidebar fe-admin + fe-user `w-72` → `w-60 xl:w-72` (+48px lg) / PE Workspace 2-panel `lg:[320px_1fr]` → `lg:[260px_1fr] xl:[320px_1fr]` (+60px lg) / Section padding `px-5 py-4` → `px-3 py-3 sm:px-5 sm:py-4` (+16px xs) / HangMucCard `gap-3 p-3` → `flex-wrap gap-2 p-2 sm:gap-3 sm:p-3` (+8px xs). Net gain trên 1366px ~+132px width cho NCC table area. Memory `feedback_responsive_laptop_breakpoint.md` capture pattern. **Turn 12 (`ae1814c`)** SETUP MULTI-AGENT INFRASTRUCTURE 3 sub-agents (Investigator READ cyan + Implementer WRITE conditional yellow + Reviewer READ adversarial red) + em main coordinator. Pre-flight decision gate 6/6 ✅. Phase 1-4 execute: `.claude/agents/` 4 file (README ~9.7KB + investigator + implementer + reviewer) + `.claude/agent-memory/` 3 MEMORY.md seed (~6KB each). Customize SOLUTION_ERP: skills preload mỗi agent (reuse 6 skills hiện có) + bearer test (admin@solutions / nv.test@solutions) + prod UAT URL + Phase 9 UAT mode + DB Dev/Design distinct. Windows MAX_PATH pitfall handled — drop `isolation: worktree` khỏi implementer.md (project path 51 chars + Dropbox-managed nested overflow 260+ chars). Memory `feedback_multi_agent_setup.md` capture decision gate + ACCEPT/REFUSE criteria + NAMGROUP s41-s43 ROI reference. 3 agents **chưa spawn work** ở S20 turn 12 — seeds-only state. Trial Week 1 candidate Contract V2 wire Mig 28+29 (mirror PE pattern proven). **Stats cumulative S20:** 27 mig (+1 Mig 27 from turn 7) · 59 tables · ~142 endpoints (+1 PATCH /menus/{key}) · 34 FE pages (+1 MenuVisibilityPage) · ~61 menu key (+1) · 81 test pass unchanged · 44 gotcha unchanged · **16 memory entries (+2: responsive + multi-agent)** · 6 skills unchanged · **3 sub-agents NEW** · 14 commits S20. | `f568945` (t6) · `3ec7b5a` (t8) · `83aae8e` (t9) · `66551db` (t10) · `6e338f7` (t11) · `ae1814c` (t12) · (current Docs t13 wrap) | | 2026-05-11 | Claude | **🎯 SESSION 20 turn 7 — Admin Ẩn/Hiện + Đổi tên menu eOffice (Mig 27, 5 chunk `2ea2d27`→`ef394f8`→`059bfcb`→`1ed6530`→Chunk E Docs)** — User UAT yêu cầu "tính năng Ẩn Hiện và Đổi tên hiển thị của các Menu bên ngoài Office, làm trong Trang Admin Page". Hỏi xác nhận "chưa có" — đúng. User clarify Q2=b "edit hiển thị bên ngoài, chỉ của eOffice thôi" → admin sidebar luôn giữ Label gốc, DisplayLabel CHỈ áp fe-user. Q1=a global (không per-role), Q3=a giữ USER_HIDDEN_KEYS hardcode + tầng IsVisible dynamic combine, Q4 UAT skip test. **Chunk A** Domain MenuItem +IsVisible bool=true +DisplayLabel string?(200) + EF config + Migration 27 AddVisibilityAndDisplayLabelToMenuItems (2 AddColumn) — 3-file rule, apply LocalDB _Dev + _Design OK. **Chunk B** BE API: MenuNodeDto + MenuItemDto +isVisible +displayLabel (sau CRUD flags trước Children). GetMyMenuTreeQueryHandler pass through, KHÔNG filter server-side — 2 FE app tự quyết. UpdateMenuItemCommand + Validator + Handler (trim DisplayLabel whitespace → null). MenusController +PATCH /api/menus/{key} [Authorize Policy=Permissions.Update] body {isVisible, displayLabel}. **Chunk C** Domain MenuKeys +MenuVisibility const + All[] + DbInitializer +leaf "Menu eOffice" Icon=Eye Order=94 (Workflows shift 94→95). Manual seed Mig 27 LocalDB _Dev (INSERT MenuItems + Permissions Admin). FE Admin: types/menu.ts +isVisible +displayLabel, lib/menuKeys.ts +MenuVisibility, Layout resolver +/system/menu-visibility, App.tsx +Route. NEW pages/system/MenuVisibilityPage.tsx ~210 LOC: PageHeader + 4 StatCard (Tổng/Hiển thị/Đã ẩn/Đã đổi tên) + Search + Table 5 cột (Key mono + parentKey ↳ / Tên gốc / Input "Tên hiển thị" inline placeholder "Mặc định: {label}" / Toggle button emerald-Eye / amber-EyeOff / Lưu khi dirty + Khôi phục khi custom). PATCH endpoint, invalidate ['menus','all'] + ['my-menu'] trigger live update sidebar. Row hidden bg-amber-50/40 highlight, custom label bg-brand-50/40. **Chunk D** fe-user types/menu.ts mirror. Layout.tsx filterForUser 2 tầng (USER_HIDDEN_KEYS structural + !isVisible dynamic). Helper effectiveLabel(n) = displayLabel?.trim() || label. Replace 3 callsite {node.label} → {effectiveLabel(node)}. USER_FIXED_TOP "__inbox" entry +isVisible:true cho type check pass. **fe-admin Layout KHÔNG đụng** — admin sidebar render Label gốc + show hết menu (user Q2=b). **Chunk E Docs (current)**. **Stats Session 20 turn 7**: 26→27 mig, 59 DB tables (no change), ~141→142 endpoints, 33→34 FE pages, ~60→61 menu key, 81 test pass (Q4 UAT defer), 44 gotcha (no new). Memory entries 14 (no new). | `2ea2d27` (A Mig 27) · `ef394f8` (B BE API) · `059bfcb` (C FE admin) · `1ed6530` (D FE user) · (current E Docs) | diff --git a/docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md b/docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md new file mode 100644 index 0000000..ce0471e --- /dev/null +++ b/docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md @@ -0,0 +1,318 @@ +# Session 21 turn 2 — RAG Hybrid setup planning + Cách A validation deep dive + +**Date:** 2026-05-12 (tiếp S21 turn 1 từ 0030 sáng — sang sáng-chiều-tối discussion deep RAG) +**Dev:** Claude (Opus 4.7 1M Max — em main solo, no SOLUTION_ERP sub-agent spawn) +**Base commit:** `3a34831` (S21 turn 1 chốt cicd-monitor) +**Commits:** `1f8e9af` (RAG plan save) + this chốt (2 commit S21 turn 2) + +## Bối cảnh + +Sau S21 turn 1 chốt cicd-monitor (4 sub-agents seeds-only), bro đặt câu hỏi về RAG infrastructure cho **5 dự án future > 1M MD context**. Cuộc thảo luận deep ~15+ turn covering: + +1. RAG fundamentals + Vector DB role +2. Embedding model "AI nhúng" + Voyage AI cost mechanics +3. Multi-project shared architecture (5 projects) +4. Audit procedure 3-tier + change tracking SQLite +5. UI/UX Streamlit dashboard 7 pages +6. Cách A defensive (giữ blanket) vs Cách B aggressive (cắt 60-70%) +7. Reasoning depth comparison (lazy current vs Cách A vs Cách B) +8. Industry validation via claude-code-guide research +9. Multi-agent cumulative cost reality (4 agents → ~520K cumulative blanket) +10. 3-layer hybrid pattern (Anthropic Contextual Retrieval: embeddings + BM25 + reranking) + +## Deliverables + +### File mới — `docs/rag-setup-plan.md` (commit `1f8e9af`, 1223 LOC) + +Cross-project reference plan với 12 section comprehensive: + +1. Context + Why +2. Architecture overview (6-layer diagram) +3. BLANKET load list (~100K, 28% MD) +4. RAG store list (~254K, 72% MD) +5. Tool stack recommend +6. Setup scripts copy-paste ready (~250 LOC Python) +7. Audit procedure 3-tier (weekly/monthly/quarterly) +8. Multi-AI client access (Claude Code + Desktop + Cursor + GPT-4) +9. Timeline rollout 10-14h dedicated session +10. Caveats + risks +11. Success metrics + decision gate +12. Future enhancements + +### File extend S21 turn 2 (this chốt commit) + +Add 2 sections vào `rag-setup-plan.md`: +- Section 13: Multi-agent cumulative cost reality (Anthropic 8-10× warning) +- Section 14: 3-layer hybrid RAG upgrade path (Phase 1-3 Anthropic Contextual Retrieval) + +## Quyết định chốt — Cách A vs Cách B + +### Chọn **Cách A** (defensive hybrid) ⭐ + +``` +Blanket: GIỮ NGUYÊN ~120K em main (35% MD) +RAG: ADD as supplement (retrieve on-demand) +Multi-agent: 4 sub-agents share retrieve cache +Sub-agent spawn blanket: ~80-100K each (auto-inject + skills + spec) +Cumulative blanket 5 entities: ~520K +Heavy session billed: ~560K (saving 20% vs lazy) +``` + +**Why Cách A (priority bro: em main control flow strong):** +1. ✅ State ownership strong — em main biết direct project state +2. ✅ Decision quality 90% (vs Cách B 75-80% do fragmentation) +3. ✅ Wall-clock per task 12 phút (vs Cách B 16 phút) +4. ✅ UX smooth — em response fast direct cho state question +5. ✅ Risk-averse — graceful degradation nếu RAG fail (blanket fallback) +6. ✅ Multi-agent leverage cache hit 70-90% common queries +7. ✅ Quality recall +25-55pp (5-15 sources cross-validated vs lazy 1-3) + +### Bỏ **Cách B** (aggressive cut) + +``` +Blanket: CẮT MẠNH 60-70% (40-50K còn lại) +RAG: PRIMARY access mechanism cho mọi thứ +``` + +**Why bỏ:** +1. ❌ Vi phạm priority "em main control flow strong" +2. ❌ State ownership weak — phải retrieve mỗi câu state question +3. ❌ UX latency +1-2s per state Q +4. ❌ Decision quality 75-80% do reasoning fragmentation +5. ❌ Risk severe nếu RAG fail (em main ngơ ngác) +6. ❌ Anthropic research warn: "context rot inevitable cutting aggressively" +7. ❌ Cascade retrieve problem (1 task → 2-3 retrieves) + +## Industry validation via claude-code-guide research + +Spawn 2 lần claude-code-guide agent research (NOT SOLUTION_ERP sub-agents): + +### Round 1: Anthropic setup inventory (10 features) + +- Memory tool beta (`content-management-2025-06-27`) +- Prompt caching extensions (5min/1h beta) +- Files API beta (`files-api-2025-04-14`) +- Citations stable +- MCP servers official + community (9,400+ in 2026) +- Voyage AI embedding partnership +- Context compaction tool +- Claude Agent SDK orchestration +- Batch API 50% discount +- RAG best practices Anthropic official + +### Round 2: Industry practice validation + +**5/5 dimensions Cách A fit Anthropic explicit recommend:** + +| Dimension | Bro setup | Anthropic pattern | +|---|---|---| +| Context approach | Hybrid blanket+RAG | ✅ Recommended explicit | +| Sub-agent count | 4 | ✅ "3-5 optimal" | +| MD scale | 5 project > 1M | ✅ "Use RAG khi >200K" | +| Stack | Qdrant+Voyage+MCP | ✅ Production validated | +| Coordination | Em main + agents | ✅ "Coordinator+workers" | + +**Source 4 Anthropic blog posts:** +- "Effective Context Engineering for AI Agents" (2025) +- "Contextual Retrieval" (Sept 2024 flagship) +- "Effective Harnesses for Long-Running Agents" +- "Multi-Agent Coordination Patterns" + +**Community consensus (Tier 1 tools all Hybrid):** +- Cursor IDE `@codebase` indexing +- Continue.dev MCP transport +- Cline / Roo-Cline filesystem + AST + dynamic context +- Aider code-as-graph +- Sourcegraph Cody graph-aware + +→ **ZERO** tools adopt aggressive Cách B pattern. **ALL** evolve toward Cách A hybrid. + +## 3-layer hybrid pattern (Anthropic Contextual Retrieval Sept 2024) + +``` +Layer 1: Embeddings (Voyage-3-large) + → Semantic + synonym + multilingual catch + Performance: baseline ~50% recall + ++ Contextual prefix (Haiku-generated context): + → +35% improvement = ~67% recall + +Layer 2: BM25 (bm25s Python lib free) + → Exact identifier + technical terms catch + + Layer 1 = ~75% recall + +Layer 3: Reranking (Voyage rerank-2) + → Cross-attention deep relevance + + Layer 1+2 = ~85% recall +``` + +**Phase rollout incremental:** + +| Phase | Layer | Recall | Cost/month | +|---|---|---|---| +| Phase 1 (Week 1-4) | Layer 1 vector only | ~70% | ~$1.50 | +| Phase 2 (Month 2) | + Layer 2 BM25 | ~78% | ~$1.50 (BM25 free local) | +| Phase 3 (Month 3) | + Layer 3 + Contextual | ~92% | ~$4-5 | + +## Multi-agent cost reality (Anthropic warn 8-10× multiplier) + +``` +Per entity blanket: + Em main: ~120K + Sub-agent each spawn: ~80-100K (auto-inject baseline + skills + spec) + +Cumulative blanket 5 entities = ~520K + +Heavy session full 4-agent spawn: + Lazy current: ~700K effective billed + Cách A: ~560K (-20% saving from multi-agent shared cache) + +Cost multiplier vs solo em main: ~8-10× +Anthropic acknowledged: "Expect 3-10× token multiplier" +``` + +**Saving Cách A breakdown (-140K):** +- Em main lazy Read → retrieve: -25K +- 4 agents lazy Read → cached retrieve: -160K (share cache 70-90%) +- Reasoning streamlined: -20K +- Plus +60K retrieve cost added +- Net: -145K ≈ -20% per heavy session + +## Stack validated + +| Component | Tool | Reason | +|---|---|---| +| **Vector DB** | Qdrant local | Rust binary 50MB, agent-native 2026 leader | +| **Embedding** | Voyage-3-large | Anthropic partner, multilingual 26 lang, $0.18/M | +| **MCP server** | FastMCP Python | Official Anthropic SDK | +| **Chunking** | Custom adaptive Python | §6.5 compliant, transparent | +| **Tracking** | SQLite local | Event log + audit + cost analytics | +| **Dashboard** | Streamlit custom | 7 pages multi-project | +| **Re-index** | Pre-commit hook | Native git, delta on commit | + +**Total cost 5 projects:** ~$1.50-5/month depending Phase. ~$0.50 initial embed. + +## Em main solo S21 turn 2 (no SOLUTION_ERP sub-agent spawn) + +``` +Spawn này session: + ✅ claude-code-guide × 2 (generic agent for Anthropic research) + ❌ Investigator / Implementer / Reviewer / CI/CD Monitor (vẫn seeds-only) + +Em main solo qua context paste + Write file + research delegate. +``` + +## Skills check + +6 skills hiện tại unchanged. Decision KHÔNG add skill mới cho RAG vì: +- RAG là decision/architectural pattern, không phải workflow project-specific +- Cross-project applicable → memory entry phù hợp hơn skill +- Per rule §9.5 anti-pattern "viết skill chỉ để có thêm" +- Defer skill creation sau Phase 1 trial validate + +## Tests + +Unit test 81 unchanged (0 test added — pure planning, không code change). + +## Memory entry mới + +**`feedback_rag_hybrid_pattern.md`** (NEW — cross-project pattern reusable): +- Decision Cách A rationale (control flow priority) +- Multi-agent cost reality (8-10× multiplier) +- 3-layer hybrid pattern Phase 1-3 incremental rollout +- Stack validated (Voyage + Qdrant + FastMCP) +- When to apply / when NOT apply triggers +- Anti-patterns documented +- Anthropic 4 blog cross-ref + +## Verify chain + +| Check | Status | +|---|---| +| dotnet build | Không chạy (no .cs change) | +| dotnet test | Không chạy (no test added — pure docs) | +| npm build | Không chạy (no FE change) | +| Push origin | Pending end of turn | +| CI Gitea Actions | Skip per path filter `.md` | +| IIS prod deploy | KHÔNG xảy ra (CI skip, expected) | + +## Docs updates + +- ✅ `docs/STATUS.md` — Last updated S21 turn 2 + Recently Done row top +- ✅ `docs/HANDOFF.md` — TL;DR Session 21 turn 2 section + Last updated +- ✅ `docs/rag-setup-plan.md` — extend +Section 13 (cost reality) +Section 14 (3-layer) +- ✅ `docs/changelog/sessions/2026-05-12-1800-s21-turn2-rag-planning.md` — file này +- ✅ Memory user-level new: `feedback_rag_hybrid_pattern.md` +- ✅ Memory user-level: `MEMORY.md` index + 1 entry pointer +- ⏭ KHÔNG đụng: rules.md / architecture.md / gotchas.md / database/* / flows/* / skills/* / CLAUDE.md (no real change cho 8 file này) +- ⏭ KHÔNG flush 4 sub-agent MEMORY.md (chưa spawn, per §6.5 KHÔNG add noise) + +## Handoff Session 21 turn 3+ + +### Plan I NEW — RAG Setup Implementation + +**Trigger:** Bro confirm 5 dự án path + stack + pilot choice + Voyage API key + disk cleanup 5-8GB free. + +**Schedule:** Dedicated session 10-14h weekend (per memory `feedback_drastic_refactor_scope` rule). + +**Phases:** +- Phase 1 (Week 1-4): Layer 1 vector embeddings only — ~70% recall — ~$1.50/mo +- Phase 2 (Month 2): + Layer 2 BM25 hybrid — ~78% recall — ~$1.50/mo +- Phase 3 (Month 3): + Layer 3 Reranking + Contextual — ~92% recall — ~$4-5/mo + +**Pre-flight task:** Spawn 🔵 Investigator audit MD inventory 5 dự án parallel → tinh chỉnh blanket list per project. + +### Plan B Contract V2 wire (vẫn pending S21 turn 1) + +- Trial Week 1 multi-agent kick-off SOLUTION_ERP +- 6 tasks (Mig 28+29 + Service + Controller + FE × 2 + Pin V2) +- 4 sub-agents pipeline coordinate (lần đầu spawn 4 agents thật) + +### Plan C Test gap fill (vẫn pending) + +Bundle Chunk E Plan B — 5 test pending: +- B4 silent 403 regression (gotcha #44 vi phạm §7) +- V2 Service `ApproveV2Async` UPSERT opinion +- Section gộp Chunk C render +- Mig 25 PATCH `/user-selectable` +- Mig 27 PATCH `/api/menus/{key}` + +### Plan D-F-G unchanged + +- D: Hard blockers ops (UAT/SMTP/creds/backup) — BLOCKED chờ user +- F: Audit định kỳ 2026-06-01 (~3 tuần nữa, KHÔNG tự chạy) +- G: Multi-agent trial 4-week (post-S21 t1 + S21 t2 setup complete) + +## Stats cumulative S21 turn 2 + +| Metric | Trước S21 t2 | Sau S21 t2 | Δ | +|---|---|---|---| +| DB tables | 59 | 59 | 0 | +| Migrations | 27 | 27 | 0 | +| Endpoints | ~142 | ~142 | 0 | +| FE pages | 34 | 34 | 0 | +| Unit tests | 81 | 81 | 0 | +| Gotchas | 44 | 44 | 0 | +| **Memory entries** | 16 | **17** | **+1** (RAG hybrid pattern) | +| Skills | 6 | 6 | 0 | +| Sub-agents | 4 seeds-only | 4 seeds-only | 0 (chưa spawn) | +| **Commits S21** | 2 (`f1c61c9` + `3a34831`) | **4** | **+2** (1f8e9af + this chốt) | +| **MD plan files** | 0 | **1** | **+1** (`rag-setup-plan.md` 1223 LOC + 2 section extend) | + +## Cross-ref + +- S21 turn 1 session log: `2026-05-12-0030-s21-cicd-monitor-add.md` +- Plan file: `docs/rag-setup-plan.md` (1223 + extend ~300 LOC = ~1500 LOC) +- Memory new: `feedback_rag_hybrid_pattern.md` (cross-project reusable) +- Industry research: claude-code-guide × 2 spawn agent reports +- 4 Anthropic blog cross-ref trong memory entry + +## Bài học chốt S21 turn 2 + +1. **Em main control flow strong là priority bro** — quyết định Cách A defensive over Cách B aggressive +2. **Multi-agent cost realistic 8-10× solo** — KHÔNG tránh được spawn baseline ~400K cumulative 4 agents +3. **Anthropic recommend 3-layer hybrid pattern** — embeddings + BM25 + reranking compound effect +4. **Industry consensus = hybrid** — Cursor + Continue + Cline + Aider all evolve toward hybrid +5. **Voyage Vietnamese quality cần verify Week 1** — voyage-3-large multilingual nhưng explicit Vietnamese benchmark chưa publish +6. **RAG setup = dedicated session 10-14h** — per `feedback_drastic_refactor_scope` rule +7. **5 projects scale workable** — single Qdrant + per-project collection + ~$2-5/month cost diff --git a/docs/rag-setup-plan.md b/docs/rag-setup-plan.md index 0481ad5..33930f4 100644 --- a/docs/rag-setup-plan.md +++ b/docs/rag-setup-plan.md @@ -1165,6 +1165,361 @@ Mitigation: --- +## 13. Multi-agent cumulative cost reality (Anthropic 8-10× warning) + +> **Added S21 turn 2 (2026-05-12)** — clarification sau khi user catch gap "120K blanket KHÔNG bao gồm 4 agents". + +### Per-entity blanket breakdown + +``` +Em main blanket: ~120K + STATUS + HANDOFF top + rules + architecture + 5 agent .md + + 4 MEMORY.md auto-inject + skills desc + memory critical + + auto-inject system reminders + +Per sub-agent spawn baseline: ~80-100K each + Agent system prompt (~5K) + + 3 skills preload SKILL.md full (~21K, trigger semantic) + + Auto-inject MEMORY.md 25KB first 200 lines (~7K) + + Em main pass spec task (~10-15K) + + Em main paste common context excerpt (~30-50K) + + Auto-inject project context (~10K) + = ~80-100K per sub-agent spawn (per Anthropic docs) + +4 sub-agents cumulative: ~400K + (4 × ~100K each, isolated context windows) + +TOTAL cumulative blanket 5 entities: ~520K + Em main + 4 sub-agents combined (isolated windows, cumulative billing) +``` + +### Context windows are ISOLATED + +``` +KHÔNG phải 5 entities share 520K trong 1 context window 1M. + +Mỗi entity có context window 1M RIÊNG: + Em main → context window 1M, dùng ~120K + Investigator → context window 1M, dùng ~100K + Implementer → context window 1M, dùng ~100K + Reviewer → context window 1M, dùng ~100K + CICD Monitor → context window 1M, dùng ~100K + +→ Mỗi entity LOST-IN-MIDDLE threshold riêng (~700K each) +→ Mỗi entity capacity ~58 tasks before hit hard cap riêng + +NHƯNG billing là CUMULATIVE 520K across all contexts: + Anthropic billing tổng tokens across all 5 windows + → Hit weekly cap nhanh hơn solo em main 4-5× +``` + +### Heavy session token compound effect (Cách A vs lazy) + +**Without RAG (lazy current — 4 agents spawn):** + +``` +Em main: + Blanket: 120K + Lazy Read on-demand: ~50K + Reasoning + coordinate: ~30K + = ~200K subtotal + +4 sub-agents (each): + Spawn blanket: ~100K + Lazy Read inside agent: ~50K + Reasoning + work: ~30K + Each agent: ~180K + ────────────── + 4 agents subtotal: ~720K cumulative + +SendMessage iteration: + 10 round trips × ~30K nominal: 300K nominal + Cache hit 70%: ~90K effective + +TOTAL HEAVY SESSION (lazy): + 200K + 720K + 90K = ~1010K nominal + After cache discount: ~700K effective billed +``` + +**With Cách A RAG:** + +``` +Em main: + Blanket: 120K (unchanged) + RAG retrieve replace lazy Read: ~30K (-20K saving) + Reasoning streamlined: ~25K + = ~175K subtotal (saving 25K) + +4 sub-agents (each): + Spawn blanket: ~100K (unchanged) + RAG retrieve (share cache 70-90% common queries): ~15K + Reasoning streamlined: ~25K + Each agent: ~140K (saving 40K each) + ────────────── + 4 agents subtotal: ~560K (saving 160K total) + +SendMessage iteration: ~90K effective (unchanged) + +TOTAL HEAVY SESSION (Cách A): + 175K + 560K + 90K = ~825K nominal + After cache discount: ~560K effective billed + +SAVING: -140K (-20%) +``` + +### Cost saving breakdown + +| Component | Lazy current | Cách A | Saving | +|---|---:|---:|---:| +| Em main blanket (fixed) | 120K | 120K | 0 | +| Em main lazy Read → RAG retrieve | 50K | 30K | -20K | +| Em main reasoning streamlined | 30K | 25K | -5K | +| 4 agents spawn blanket (fixed) | 400K | 400K | 0 | +| 4 agents lazy Read → cached retrieve | 200K | 60K | **-140K** | +| 4 agents reasoning | 120K | 100K | -20K | +| SendMessage cached | 90K | 90K | 0 | +| **TOTAL EFFECTIVE BILLED** | **~700K** | **~560K** | **-140K (-20%)** | + +→ **Saving 80% từ 4 agents** share retrieve cache (cache hit 70-90% common queries cross-agent). + +→ Em main saving chỉ 25K (blanket unchanged, chỉ optimize Read → retrieve). + +### Multi-agent leverage example concrete + +``` +Task Plan B Contract V2 wire: + 🔵 Inv query "PE V2 schema pattern" → 15K retrieve + cached + 🟡 Imp query same → cache hit 90% → 1.5K effective + 🔴 Rev query same → cache hit 90% → 1.5K effective + 🟢 CICD query same → cache hit 90% → 1.5K effective + Em main query same → cache hit 90% → 1.5K effective + + Cumulative retrieve cost: 15K + 4×1.5K = 21K + +Compare to lazy: + Each agent Read PE V2 file separately + 5 entities × 20K Read = 100K cumulative + + → Saving 79K just for 1 cross-agent query +``` + +### Optimization tips để giảm cumulative + +**Option 1: Spawn ít agents hơn** +- Decision gate 6-criteria mỗi task (per `feedback_multi_agent_setup` rule) +- Solo em main đủ → KHÔNG spawn agent +- Chỉ spawn agent nào THẬT cần +- Trong S20-S21: 4 agents seeds-only, em chưa spawn lần nào → cost ~120K em main thôi + +**Option 2: Tune blanket sub-agent (100K → 80K)** +- Em main pass spec gọn (~10K thay 15K) +- Em main paste common context excerpt thay full (~20K thay 50K) +- Skills preload chỉ description (~3K thay 21K full SKILL.md) + → Trigger SKILL.md full khi semantic match +- Per sub-agent: 100K → 80K +- 4 agents cumulative: 400K → 320K +- Heavy session: 560K → 480K (-15%) + +**Option 3: SendMessage cache aggressive (1h TTL beta)** +- Anthropic extended cache `extended-cache-ttl-2025-04-11` +- Static prompts cache premium WRITE 2× base +- Subsequent reads 0.1× discount +- Multi-agent cùng cache prefix → benefit lớn +- Saving 10-15% additional + +--- + +## 14. 3-layer hybrid RAG upgrade path (Anthropic Contextual Retrieval) + +> **Added S21 turn 2 (2026-05-12)** — Anthropic flagship pattern Sept 2024. + +### Pattern overview + +``` +Anthropic Contextual Retrieval = 3 layers compound: + +Layer 1: Embeddings (Voyage-3-large) + → Semantic + synonym + multilingual catch + ++ Contextual prefix (Haiku-generated context): + Add chunk-specific context BEFORE embed + "This chunk discusses... in context of..." + → Better recall via enriched vector + +Layer 2: BM25 (bm25s Python lib free local) + → Exact identifier + technical terms (function names, error codes, Mig numbers) + ++ Contextual BM25 (same prefix pattern) + +Layer 3: Reranking (Voyage rerank-2) + → Cross-attention deep relevance + → Re-score top 30 candidates → return top 5 truly relevant +``` + +### Performance compound effect + +``` +Baseline (naive vector embeddings): ~50% recall + ++ Contextual embeddings: ~67% recall (-35% failure) + ++ Hybrid Contextual + BM25: ~75% recall (-49% failure) + ++ Reranking: ~85% recall (-67% failure) +``` + +📎 Source: [Anthropic Contextual Retrieval Sept 2024](https://www.anthropic.com/news/contextual-retrieval) + +### Phase rollout incremental (recommend cho bro) + +| Phase | Setup | Recall | Cost/month | Effort additional | +|---|---|---:|---:|---| +| **Phase 1** (Week 1-4) | Layer 1 vector only (Voyage-3-large) | ~70% | ~$1.50 | 10-14h initial | +| **Phase 2** (Month 2) | + Layer 2 BM25 (bm25s free local) | ~78% | ~$1.50 unchanged | 2-3h | +| **Phase 3** (Month 3) | + Layer 3 Voyage rerank-2 + Contextual prefix | ~92% | ~$4-5 | 3-4h | + +### Phase 1 implementation (basic vector RAG) + +Đã cover trong Section 5-6 plan. Bro implement Week 1-4 trial pilot. + +### Phase 2 upgrade — Add BM25 hybrid + +```python +# scripts/rag-mcp-server.py — upgrade +from bm25s import BM25 + +bm25 = BM25.load("./rag-data/bm25_index") # pre-built + +@mcp.tool() +def rag_retrieve_hybrid(query, scope="all", k=5): + # Step 1: Vector search + query_vec = voyage.embed([query], model="voyage-3-large").embeddings[0] + vector_results = qdrant.search(COLLECTION, query_vec, limit=20) + + # Step 2: BM25 search (local Python lib) + bm25_results = bm25.retrieve(query, k=20) + + # Step 3: Merge + dedup + candidates = merge_dedup(vector_results, bm25_results) # ~30 chunks + + # Step 4: Score combine (RRF reciprocal rank fusion) + final_scores = reciprocal_rank_fusion(vector_results, bm25_results) + + return final_scores[:k] +``` + +### Phase 3 upgrade — Full Anthropic Contextual + +```python +# scripts/rag-indexer.py — upgrade với contextual prefix +import anthropic + +claude_haiku = anthropic.Anthropic() + +def contextualize_chunk(chunk_content, full_doc_path): + """Generate context prefix using Claude Haiku (cheap model).""" + full_doc = open(full_doc_path).read() + + response = claude_haiku.messages.create( + model="claude-haiku-4-5", # cheap ~$0.0001/chunk + max_tokens=150, + messages=[{ + "role": "user", + "content": f""" +{full_doc[:5000]} + + + +{chunk_content} + + +Give a brief context (50-100 words) explaining what this chunk is about and where it fits in the document. Be specific.""" + }] + ) + + return response.content[0].text + +# In indexer pipeline: +for chunk in chunks: + context = contextualize_chunk(chunk["content"], chunk["source"]) + chunk["content_enriched"] = f"{context}\n\n{chunk['content']}" + # Embed enriched version → better recall +``` + +```python +# scripts/rag-mcp-server.py — final upgrade với reranking +import voyageai + +@mcp.tool() +def rag_retrieve_full(query, scope="all", k=5): + # Step 1-3: Same as Phase 2 (vector + BM25 + merge) + candidates = hybrid_search(query, scope, top=30) + + # Step 4: Voyage Rerank + rerank_response = voyage.rerank( + query=query, + documents=[c.content for c in candidates], + model="voyage-rerank-2", # ~$0.05 per 1000 queries + top_k=k + ) + + return [candidates[r.index] for r in rerank_response.results] +``` + +### Cost incremental analysis + +``` +Phase 1 → Phase 3 incremental cost: + +Phase 1 (basic vector): + Voyage embed: ~$0.36 initial + ~$0.20/mo delta + = ~$1.50/mo total + +Phase 2 (+BM25): + BM25 free local (Python lib) + Embedding cost same + = ~$1.50/mo total (unchanged) + +Phase 3 (+Reranking + Contextual): + Voyage rerank-2: ~$0.05 per 1000 queries + 600 queries/mo × $0.05/1K = $0.03/mo + + Haiku contextual prefix: ~$0.0001 per chunk + Initial 5000 chunks × $0.0001 = $0.50 one-time + Delta ~100 chunks/mo × $0.0001 = $0.01/mo + + + Voyage rerank monthly: ~$0.05/mo per 1K queries × 5 projects + + Re-embed enriched chunks: ~$0.50/mo + = ~$4-5/mo total + +→ Quality jump 70% → 92% recall = +22pp +→ Cost jump $1.50 → $4-5/mo = +$3 +→ Worth it after Phase 1 validation +``` + +### Why incremental rollout (vs all-in Phase 3 immediate) + +1. **Validate Layer 1 quality first** — nếu Voyage Vietnamese kém → upgrade Phase 2-3 vô ích +2. **Measure baseline cost** — biết exact Voyage spend trước add rerank/contextual +3. **Identify retrieval miss patterns** — Phase 1 trial reveal weakness → target Phase 2-3 fix +4. **Risk-averse setup** — mỗi phase 2-3h add, rollback dễ nếu fail +5. **§6.5 narrative preserve** — KHÔNG over-engineer, build incremental + +### When to skip Phase 2-3 + +- Phase 1 recall already > 85% → Phase 2-3 marginal benefit (Vietnamese-specific corpus) +- Cost monthly < $5 budget → stay Phase 1 OK +- Solo dev no Vietnamese exact terms heavy → BM25 less impactful + +### When to MUST upgrade Phase 2-3 + +- Recall < 70% on benchmark → indicate Phase 1 insufficient +- Em main report "miss exact identifier" frequently → Phase 2 BM25 critical +- Multi-language queries common → Phase 3 reranker stabilize +- Production quality target > 90% → Phase 3 required + +--- + ## 📚 References + tools ### Anthropic official