# Session 27 - 2026-05-22 - Memory Curate + RAG Manual Control + Multi-agent Setup Fix

**Dev:** Claude as main agent on SOLUTION_ERP (sub-agent registry empty for the whole session, pitfalls #1 and #2 confirmed) + bro pqhuy

**Duration:** ~5h (start ~17:00 → end ~22:00 GMT+7)

**Base commit:** `d99069a` (S26 final wrap, Plan AG6)

**Final commits:** TBD (pqhuy pushes manually after approval)

---

## 🎯 Done

### Plan A.3 - RAG Manual Control + Custom Dashboard

🟧 **5 PS scripts** in `D:\.claude-rag\scripts\` (written solo by the main agent as a cookie-cutter mirror; the task met the Implementer Case 2 ACCEPT criteria, but delegation was missed because the registry was empty):

- `start.ps1` (2.4 KB) - Qdrant in background + health check (initial 3s warmup + TimeoutSec 5)
- `stop.ps1` (1.4 KB) - Graceful stop + verify
- `status.ps1` (6.2 KB) - 6-section colored terminal report
- `dashboard.ps1` (17.4 KB) - Generate custom HTML + open browser
- `boot.ps1` (1.4 KB) - Auto-boot sequence (Startup shortcut)

🟧 **6 `$PROFILE` aliases:** rag-start / rag-stop / rag-status / rag-dashboard / rag-restart / rag-boot

🟧 **Auto-boot Startup folder shortcut** `%APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup\rag-boot.lnk`: runs `boot.ps1` automatically on every Windows startup (at user logon).

🟧 **Custom MCP Dashboard HTML** `D:\.claude-rag\dashboard.html` (13.3 KB, auto-generated) with 7 panels: Qdrant + MCP + Collections + Voyage + System + Agent MEMORY + Recent Logs. Auto-refresh 60s. Anthropic orange theme.

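The warmup-then-probe health check described for `start.ps1` can be sketched as follows. This is an illustrative Python version, not the PowerShell source: the function names, the retry count, and the pause between retries are assumptions, while the 3 s warmup and 5 s request timeout come from the notes above.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout_s: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200 within timeout_s."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return False


def wait_until_healthy(probe: Callable[[], bool],
                       warmup_s: float = 3.0,
                       attempts: int = 3,      # assumed retry count
                       pause_s: float = 1.0,   # assumed pause between retries
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Sleep for the warmup period, then probe up to `attempts` times."""
    sleep(warmup_s)
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(pause_s)
    return False
```

A caller would pass something like `lambda: http_probe("http://localhost:6333/healthz")` as the probe. Keeping the probe and the sleep function injectable lets the logic be exercised without a running Qdrant.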
### Plan A.4 - RAG Onboarding Guide

🟧 **`docs/guides/rag-onboarding-guide.md`** (26.3 KB / ~440 lines / 13 sections + §A2):

- 13 sections: TL;DR, Background, Pre-check, Bootstrap 3 step, 6 MCP tools matrix, 4 sub-agent brief, Permission, Workflow, Troubleshooting, Tunings, Best practices, Status monitoring §12, Distributed Approach B §13
- §A2 Standard structure pioneered by the main agent (replaces the skipped Plan A.2 with a full file tree + 8-step bootstrap checklist + 4 gotcha discoveries from S26-S27)
- Native Qdrant vs Custom MCP dashboard comparison table

### Memory Curate (3 items approved by pqhuy)

🟧 **4 MEMORY files curated**, -60% cumulative size:

- cicd-monitor: 72.4 KB → 15.8 KB (-78%)
- implementer: 38.8 KB → 22.0 KB (-43%)
- investigator: 34.9 KB → 16.3 KB (-53%)
- reviewer: 34.5 KB → 17.9 KB (-48%)
- Archive total 115 KB, preserving the verbose entries (rule §6.5 compliance)

🟧 **Audit drift §6.4 + §9.4** run early (triggered by +8 gotchas exceeding the threshold):

- 5 files patched: 4 `26 migration → 31 migration` + 1 `41 bẫy → 49 bẫy` (bẫy = gotcha)
- Audit log file: `docs/changelog/skill-audit-2026-05-late.md`
- KEEP narrative §6.5: contract-workflow `96→77 test` preserved as historical

### Plan F1 - Qdrant Native Dashboard Fix (caught by pqhuy in UAT)

🟧 **Root cause:** The Qdrant Windows binary zip does not bundle the Web UI static files. The log warns `Static content folder for Web UI './static' does not exist` and `http://localhost:6333/dashboard` returns 404. When pioneering the setup in S26, the main agent downloaded only `qdrant-x86_64-pc-windows-msvc.zip` (28.3 MB binary), unaware that the Web UI is a separate download.

🟧 **Fix:** Download `dist-qdrant.zip` v0.2.12 (6.59 MB, released 2026-05-21, about 0.3 days earlier) from the qdrant-web-ui GitHub releases, extract it into `D:\.claude-rag\qdrant-bin\static\`, flatten the `dist/` subfolder up one level, and restart Qdrant → HTTP 200 "UI | Qdrant" ✓

🟧 **Custom Dashboard panel link "Qdrant Native Dashboard"** now works end-to-end.

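The "flatten `dist/` up one level" step of the F1 fix can be sketched as below. This is a minimal Python illustration, assuming the archive extracts to `static/dist/...`; the function name and the refuse-to-overwrite behaviour are assumptions, and the session itself performed the step by hand.

```python
import shutil
from pathlib import Path


def flatten_subfolder(static_dir: Path, nested_name: str = "dist") -> int:
    """Move everything in static_dir/nested_name up into static_dir.

    Returns the number of entries moved; 0 if the nested folder is absent.
    """
    nested = static_dir / nested_name
    if not nested.is_dir():
        return 0
    moved = 0
    for entry in list(nested.iterdir()):
        target = static_dir / entry.name
        if target.exists():
            # Assumed policy: never silently overwrite existing files.
            raise FileExistsError(f"refusing to overwrite {target}")
        shutil.move(str(entry), str(target))
        moved += 1
    nested.rmdir()  # succeeds only because the folder is now empty
    return moved
```

Calling it a second time is a no-op, which keeps the step safe to re-run if the Web UI download is repeated.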
### Plan F2 - Multi-agent Setup Pitfalls Fix (caught via the VIPIX guide)

🚨 **Critical discovery:** The VIPIX project published `docs/guides/multi-agent-pitfalls.md` on 2026-05-22, and pqhuy shared it so the main agent could audit the SOLUTION_ERP setup. Findings:

- 4 `.claude/agents/*.md` files used `model: claude-opus-4-7` (full ID = 200K fallback, NOT 1M Opus)
- 4 files used the non-standard field `effort: max` → possible silent reject
- Result: the registry never loaded during S27 (Agent error: "Agent type 'investigator' not found")

🟧 **Fix applied to 4 files:** `model: inherit` (inherits 1M Opus from the parent) + remove `effort: max`. Pending a CLI restart so the registry reloads (pitfall #1).

🟧 **Memory user-level NEW:** `feedback_subagent_setup_pitfalls.md` (235 lines), a cross-project pitfall checklist with VIPIX + SOLUTION_ERP evidence. Integrated into the MEMORY.md index.

---

## ⚠️ Anti-patterns hit in S27 (retrospective analysis)

The main agent covered **5 roles** for the whole session because the registry was empty:

| Task S27 | Implementer ACCEPT fit? | Outcome |
|---|---|---|
| C1 Curate cicd-monitor 72KB | ❌ REFUSE #1 (judgment §6.5) | Main agent solo, CORRECT |
| C2-C4 Curate 3 agent MEMORY | ❌ REFUSE #1 | Main agent solo, CORRECT |
| C5 Audit drift §6.4 + §9.4 | ❌ REFUSE #1 | Main agent solo, CORRECT |
| **A3.1 Write 5 PS scripts** | ✅ **ACCEPT Case 2 (5 file cookie-cutter mirror)** | **Delegation missed** |
| A3.2 Dashboard HTML + generator | ❌ REFUSE #2/#7 (first time pattern) | Main agent solo, CORRECT |
| A4 Onboarding guide 440 lines | ❌ REFUSE #2 (docs judgment) | Main agent solo, CORRECT |
| F1 Qdrant Web UI fix | ❌ REFUSE #4 (bug reasoning chain) | Main agent solo, CORRECT |
| **F2 4 agent files fix** | ✅ **ACCEPT Case 1 (4 file mechanical same edit)** | **Delegation missed** |

**Verdict:** 2 of 8 tasks should have been delegated to Implementer (Case 1+2), but with the registry empty the main agent was forced to work solo. Net loss: ~30 minutes plus a lapse in cookie-cutter mirror discipline.


**Critical miss of the Reviewer "Smart Friend" guard:**

- In S27 the main agent wrote in rag-onboarding-guide.md the claim "Qdrant Native Dashboard `http://localhost:6333/dashboard`" WITHOUT actually verifying it
- pqhuy caught the 404 in the browser during UAT → escalated
- A Reviewer pre-commit spawn (had the registry loaded) would have caught this under Cat 1 "Wire claim verify"; the main agent's self-review was compromised

---

## E2E verified

### Plan A.3 verify

| Item | Status | Notes |
|---|---|---|
| 5 PS scripts execute | ✅ | rag-status / rag-start / rag-stop tested live |
| Custom Dashboard HTML | ✅ | 13.3 KB, 7 panels rendered, browser opened |
| `$PROFILE` aliases | ✅ | 6 functions loaded via `. $PROFILE` |
| Startup shortcut | ✅ | `rag-boot.lnk` 1378 bytes created |
| ASCII-only PS source | ✅ | Main agent fixed 3 files (stop/boot/dashboard) after the Unicode mangling issue |

### Plan F1 verify

| Item | Status |
|---|---|
| Qdrant native dashboard HTTP 200 | ✅ "UI \| Qdrant" title |
| Collections API live | ✅ proj_solution_erp returned |
| static/ folder size | 16.07 MB (Web UI v0.2.12) |

### Plan F2 verify (partial)

| Item | Status |
|---|---|
| 4 file edit `model: inherit` | ✅ Applied |
| 4 file remove `effort: max` | ✅ Applied |
| Sub-agent spawn test | ❌ Pitfall #1 - need CLI restart |
| Re-spawn post fix | ❌ Still "Agent type not found" |
| Registry hot-reload | ❌ Pending pqhuy restarting Claude Code CLI |

### Memory curate verify

| Agent | Before | After | Status |
|---|---:|---:|---|
| cicd-monitor | 72.4 KB | 15.8 KB | ✅ -78% |
| implementer | 38.8 KB | 22.0 KB | ✅ -43% |
| investigator | 34.9 KB | 16.3 KB | ✅ -53% |
| reviewer | 34.5 KB | 17.9 KB | ✅ -48% |
| **Total** | **180.6 KB** | **72.0 KB** | **-60%** |

---

## 🐛 Bugs encountered + fixes

| Bug | Fix |
|---|---|
| PowerShell 5.1 mangles Unicode `✓ → âœ"` in .ps1 source | Use ASCII-only replacements `[OK]/[FAIL]/->/*/` in PS code (Unicode in the HTML output is fine) |
| `dashboard.ps1` parser fails on emoji 🧠🗄️🔌 inside the here-string HTML | Rewrite the PS source as ASCII-only, using a CSS-styled `icon-box` div instead of emoji |
| Qdrant native `localhost:6333/dashboard` 404 | Download `dist-qdrant.zip` v0.2.12 + extract `static/` + flatten + restart |
| `start.ps1` health check with a 30s timeout failed as a false alarm | Adjust to 3s warmup + TimeoutSec 5 |
| Qdrant crashed OOM "allocation 8.4MB failed" mid-session | rag-restart recovers (data persisted) |
| 4 sub-agents with `model: claude-opus-4-7` silently fall back to 200K | Fix `model: inherit` per VIPIX pitfall #2 |
| Sub-agent `effort: max` non-standard field silently rejected | Remove field |
| Re-bootstrap fails with a pydantic serialization error | Deferred to S28, debug needed |

---

## 📚 Docs updates

| File | Update |
|---|---|
| `docs/STATUS.md` | Last updated at S27 close (Memory curate + RAG manual control + multi-agent fix) |
| `docs/HANDOFF.md` | TL;DR S27 with the pitfall lesson + Plan B Contract V2 pending S28 |
| `docs/changelog/sessions/2026-05-22-s27-memory-curate-rag-dashboard-multi-agent-fix.md` | This file: the full session log |
| `docs/guides/rag-onboarding-guide.md` | NEW 26.3 KB / 440 lines / 13 sections + §A2 standard structure |
| `docs/changelog/skill-audit-2026-05-late.md` | NEW audit log for the early drift trigger (~100 lines) |
| `.claude/agents/*.md` × 4 | Fix `model: inherit` + remove `effort: max` |
| `.claude/agent-memory/*/MEMORY.md` × 4 | Curate (-60%) + S27 entry proxy flush |
| `.claude/agent-memory/*/archive/2026-05-*.md` × 4 | NEW archive files preserving verbose Q1 entries |
| `.claude/skills/README.md` | `26 migration → 31` + `44 bẫy → 49` |
| `.claude/skills/ef-core-migration/SKILL.md` | 4 count-drift patches |
| `.claude/skills/dependency-audit-erp/SKILL.md` | `41 bẫy → 49 bẫy` |
| Memory user-level `feedback_subagent_setup_pitfalls.md` | NEW 235 lines cross-project pitfall checklist |
| Memory user-level `feedback_rag_hybrid_pattern.md` | +1 lesson #8 (Qdrant Web UI static missing fix) |
| Memory user-level `MEMORY.md` index | +1 entry feedback_subagent_setup_pitfalls |

---

## 🤝 Handoff S28+

### Pending priority HIGH

1. **pqhuy restarts Claude Code CLI** → verify that the 4 sub-agents load after the `model: inherit` fix. Spawn-test Investigator IMMEDIATELY at every session start (per `feedback_subagent_setup_pitfalls.md` §4).
2. **Plan B Contract V2 wire** (carried from S25/S26): kick off with 4 sub-agents ACTIVE: Investigator pre-flight + Implementer Case 2 mirror PE Mig 22-23 + Reviewer pre-commit + CICD Monitor post-push.
3. **Re-bootstrap RAG** for SOLUTION_ERP to index the S27 changes (rag-onboarding-guide.md + skill-audit-late + 4 MEMORY updates + feedback_subagent_setup_pitfalls). The bootstrap currently fails with a pydantic error: debug + retry in S28.

### Pending priority MEDIUM

4. **Audit S20-S26 memory log "Investigator spawn" claims**: verify the history retroactively (were sub-agents really spawned, or was the general-purpose default mistaken for them?). Deferred to a dedicated investigation in S28+.
5. **Plan AI Phase 5** distributed bootstrap of the 4 other projects (NamGroup/DH/Ashico/Vipix): each project's own main agent does it when pqhuy opens Claude Code in that project (Approach B distributed).
6. **Test debt catch-up Plan C bundle** (S22+1 + S25 + S26 bug fixes still lack regression tests; deferred in UAT mode per §7).

### Pending priority LOW

7. Benchmark RAG recall@10 on a golden dataset of 100 queries (optional gap)
8. Disaster recovery: weekly backup of Qdrant data → Dropbox (optional gap)
9. Add gotcha #48 SQLite tie-break + #49 dual-phase UI confusion to `docs/gotchas.md` (carried from S25)

---

## 📊 Cumulative metrics S27

| Metric | S26 close | S27 close | Δ |
|---|---|---|---|
| DB tables | 59 | 59 | 0 |
| Migrations | 31 | 31 | 0 |
| Endpoints | ~146 | ~146 | 0 |
| FE pages | 35 | 35 | 0 |
| Unit tests | 111 | 111 | 0 (deferred in UAT mode per §7) |
| Gotchas | 49 | 49 | 0 (gotchas #48, #49 still pending addition to docs) |
| Memory user-level | 23 | **24** | **+1 (`feedback_subagent_setup_pitfalls.md` NEW)** |
| Skills project-local | 6 | 6 | 0 |
| Sub-agents | 4 (broken registry) | 4 (fixed, pending CLI restart) | 0 count, +1 fix |
| Docs files | - | **+2** | `rag-onboarding-guide.md` + `skill-audit-2026-05-late.md` |
| PS Scripts infra | 0 | **+5** | `D:\.claude-rag\scripts\*.ps1` |
| Custom Dashboard | 0 | **+1** | `dashboard.html` auto-gen |
| Agent MEMORY total | 180.6 KB | 72.0 KB | **-60%** (115 KB archived) |
| Commits remote | `e23f51c..d99069a` | unchanged | **0 push S27** (all local + memory + docs; pqhuy decided to push manually) |

### Multi-agent ROI S27

| Agent | Spawn | Actual outcome |
|---|---|---|
| 🟦 Investigator | 0 (registry empty) | Main agent solo audit + curate |
| 🟨 Implementer | 0 (registry empty) | Main agent solo on 5 PS scripts + 4 file fix (should have been delegated, Case 1+2) |
| 🟥 Reviewer | 0 (registry empty) | Main agent self-review compromised (missed the Qdrant 404) |
| 🟩 CICD Monitor | 0 (no remote push) | N/A |
| 👤 Main agent solo | continuous | ~5h across all 8 tasks + pitfall meta-discovery |

**Net learning S27:** Pitfall discovery + infrastructure fix → from S28 onward multi-agent CAN work properly after the CLI restart. The main-agent-solo anti-pattern was forced, not a choice.

---

## 🎓 Patterns reusable cross-project (S27 NEW)

1. **Multi-agent setup pitfall checklist** - 4 pitfalls with VIPIX + SOLUTION_ERP evidence (`feedback_subagent_setup_pitfalls.md`)
2. **PS scripts ASCII-only discipline** - Unicode in the HTML output is fine, but PS source must be ASCII (CSS-styled badges + icon-box instead of emoji)
3. **Qdrant Windows binary 2-step setup** - binary + Web UI static files are separate downloads (gotcha #8 in `feedback_rag_hybrid_pattern.md`)
4. **Custom Dashboard PS generator pattern** - here-string HTML template + collect data via Invoke-RestMethod + Out-File UTF-8 + Start-Process browser
5. **Memory curate proxy pattern when the registry is empty** - the main agent updates sub-agent MEMORY directly with a retrospective analysis (REFUSE log validation + ACCEPT miss flag)
6. **Separate audit log file when drift triggers early** - `skill-audit-YYYY-MM-late.md` instead of rewriting the existing audit log (preserves the historical trail per §6.5)
7. **Dashboard SNAPSHOT vs LIVE distinction** - static HTML generated by PS is a snapshot of one point in time; a browser meta refresh does NOT re-run the PS script. Anti-pattern committed during pioneering: a 60s meta refresh gave a false impression of live data. Fix: add a prominent timestamp + warning banner + link to the Qdrant native dashboard (LIVE API fetch). The pattern is reusable for any custom PowerShell dashboard.
8. **NSSM Windows Service Option 4b upgrade pattern** - the Qdrant binary has no native `--service` flag → wrap it with NSSM (3 MB binary download from nssm.cc release/2.24, retry on transient 503). The elevated scripts install-service.ps1 + fix-service-start.ps1 are ready. Auto-start at boot + auto-restart on crash + survives logout. The recipe transfers to any database/server binary that needs Windows Service mode without native support.
9. **Hybrid context loading discipline (Approach A, defensive)** - Layer 1 blanket auto-load ~120K (CLAUDE+STATUS+HANDOFF+MEMORY index+skills+sub-agents) + Layer 2 RAG retrieval on demand via 6 MCP tools. Decision gate by project MD size: < 200K skip RAG / 200K-1M lazy / > 1M MANDATORY. Token budget zones 5-tier (green/warning/approach/critical). 7 anti-patterns to avoid (including reading old session logs in full, vague searches, the main agent going solo on tasks that qualify for Implementer Case 1/2, skipping store_memory, heavy sessions > 6h). Onboarding §A3 NEW comprehensive.

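The decision gate in pattern 9 can be written down as a small function. This is a sketch only: the function name and return labels are hypothetical, the unit is taken to be the same "K" the pattern uses, and treating both boundary values (exactly 200K and exactly 1M) as "lazy" is an assumption the notes do not settle.

```python
def rag_mode(project_md_size_k: float) -> str:
    """Map a project's total Markdown size (in K) to a RAG loading mode.

    Thresholds follow the session notes: < 200K skip, 200K-1M lazy, > 1M mandatory.
    Boundary handling (200K and 1M both count as "lazy") is an assumption.
    """
    if project_md_size_k < 0:
        raise ValueError("size must be non-negative")
    if project_md_size_k < 200:
        return "skip"
    if project_md_size_k <= 1000:
        return "lazy"
    return "mandatory"


if __name__ == "__main__":
    for size in (120, 450, 1500):
        print(size, rag_mode(size))
```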
---

## 🔧 Plan A.3+ Upgrade Post-Wrap - Option 4b NSSM Windows Service

> **Trigger:** pqhuy pushed back: "Qdrant should auto-start like a normal database server (PostgreSQL and SQL Server both run as auto-start Windows Services)" → the main agent upgraded Option 4a (manual scripts, settled in S26) to **Option 4b NSSM Windows Service**.
>
> **Done:** After the initial wrap-up, within the same S27 session: pqhuy ran the 2 elevated PS scripts → service Running successfully.

### NSSM install + service registration

🟧 **Download NSSM 2.24** (~351 KB) from `https://nssm.cc/release/nssm-2.24.zip` (retried once after a transient 503) → extract to `D:\.claude-rag\nssm\nssm.exe` (323 KB win64 binary)

🟧 **install-service.ps1** (5.4 KB elevated script):

- Verify Admin privileges
- Remove the old service if it exists (idempotent)
- `nssm install Qdrant qdrant.exe` + AppDirectory `D:\.claude-rag\qdrant-bin`
- Log paths `logs/qdrant-service.log` + `.err` with 10 MB rotation
- `Start: SERVICE_AUTO_START` (auto-start at boot, before login)
- `AppExit Default Restart` + `AppRestartDelay 3000` (auto-respawn 3s after a crash)
- DisplayName "Qdrant Vector DB (RAG Unified)" + friendly Description
- Start the service + verify health over HTTP at `localhost:6333/healthz`

### Bug encountered: WAL lock conflict

🟧 **Symptom:** Step [5/6] Start failed: service in Paused state, log shows `Can't init WAL: Kind(WouldBlock)`.

🟧 **Root cause:** The main agent had started Qdrant manually (PID 9800) right before running install-service.ps1 → the manual process held the Write-Ahead Log file lock → the service could not initialize the WAL → start failed and the service stayed Paused.

🟧 **fix-service-start.ps1** (2.4 KB elevated script) for recovery:

- Stop-Service -Force (clears the Paused state)
- Kill any remaining qdrant.exe (defensive, for orphan PIDs)
- Wait 3s
- Start-Service Qdrant
- Verify HTTP + process PID + RAM

🟧 **Result confirmed:** Service Running, PID 4476, RAM 101.8 MB. Auto-start ENABLED ✓

### 4 PS scripts updated to Service mode

🟧 **start.ps1** → `Start-Service Qdrant` (requires Admin). Detects an already Running service, skips the start, and shows service info.


🟧 **stop.ps1** → `Stop-Service Qdrant -Force` (requires Admin). Defensively kills an orphan qdrant.exe if one is detected.

🟧 **status.ps1** → `Get-Service Qdrant` (no Admin) instead of the process check + Section 1 expanded with service info (Status, StartType, DisplayName) + a defensive `try/catch` guard around process info, because StartTime is null when the process is spawned by the service.

🟧 **boot.ps1** → drops the Qdrant start (the service auto-starts) + verifies the service is Running + regenerates the Dashboard + opens the browser. Used by the Startup folder shortcut `rag-boot.lnk`.

### Dashboard SNAPSHOT vs LIVE clarification

🟧 **Bug discovered:** The Custom Dashboard `dashboard.html` kept showing a stale "Running" badge for 10+ minutes after Qdrant went DOWN. Root cause: the page's meta refresh tag (60s) only reloads the cached HTML; it does NOT re-run the PS script.

🟧 **Fix applied:** Remove the meta refresh + add a prominent yellow banner at the top:

```html
⚠ Dashboard này là STATIC SNAPSHOT tại thời điểm generated (xem timestamp).
Browser auto-refresh chỉ reload HTML cached, KHÔNG poll Qdrant API live.
Muốn data live → mở Qdrant Native Dashboard (fetch API real-time).
```

In English, the banner reads: "This dashboard is a STATIC SNAPSHOT taken at generation time (see the timestamp). Browser auto-refresh only reloads the cached HTML and does NOT poll the Qdrant API live. For live data, open the Qdrant Native Dashboard (real-time API fetch)."

🟧 **Header timestamp** + label "SNAPSHOT (not live)" shown in a visible red color.

🟧 **Service status panel** added with Get-Service info: Service (Running) / Auto-start (Automatic) / DisplayName.

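One way to make a snapshot's age explicit, in the spirit of the header label above, is to compute it from the generation timestamp. The sketch below is Python rather than the PowerShell the generator uses; the function name, the label wording beyond "SNAPSHOT (not live)", and the 10-minute staleness threshold are assumptions for illustration.

```python
from datetime import datetime, timedelta


def snapshot_banner(generated_at: datetime, now: datetime,
                    stale_after: timedelta = timedelta(minutes=10)) -> str:
    """Build a header label stating how old a static snapshot is."""
    age = now - generated_at
    if age < timedelta(0):
        raise ValueError("generated_at is in the future")
    minutes = int(age.total_seconds() // 60)
    # Assumed rule: anything at or past the threshold is flagged as stale.
    state = "STALE" if age >= stale_after else "recent"
    return (f"SNAPSHOT (not live), generated {generated_at:%Y-%m-%d %H:%M:%S}, "
            f"{minutes} min ago [{state}]")
```

Because the HTML is static, a label like this is only accurate at generation time; a live age would need client-side script or a fresh run of the generator.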
### Onboarding guide §A3 NEW comprehensive

🟧 **Section §A3 Context loading Hybrid pattern** (~242 lines added) with 8 sub-sections:

- §A3.1 Why Hybrid: Approach A vs Approach B vs Skip RAG (decision gate)
- §A3.2 Layer 1 Blanket auto-load checklist (~200-280 KB ≈ 50-70K tokens)
- §A3.3 Layer 2 RAG retrieve decision tree (7-branch)
- §A3.4 Token budget guidance 1M Opus (5 utilization zones)
- §A3.5 Concrete examples for the 6 MCP tools (Vietnamese + English queries)
- §A3.6 Best practices for the main agent's daily workflow (morning + in-session + EOS + monthly)
- §A3.7 7 Anti-patterns cross-project (distilled from VIPIX + SOLUTION_ERP)
- §A3.8 Context monitoring + when to end a session (heuristics + protocol)

### Memory user-level updates

🟧 **`feedback_rag_distributed_ownership.md`** Decision 2 upgraded from Option 4a → 4b NSSM, with the S27 install gotchas (nssm.cc transient 503 + WAL lock conflict + Process.StartTime null guard).

### Final state Service mode

| | State |
|---|---|
| Qdrant Service | Running (Automatic) ✓ |
| HTTP /healthz | OK ✓ |
| Collection proj_solution_erp | 3460 chunks (⚠ indexed_vectors=0, HNSW broken; pending re-bootstrap S28) |
| Auto-start on Windows boot | ENABLED ✓ |
| Auto-restart on crash | ENABLED 3s delay ✓ |
| 7 PS scripts total | start/stop/status/dashboard/boot/install-service/fix-service-start |
| Onboarding guide | 42.5 KB / 682 lines / 14 sections + §A2 + §A3 |
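For the S28 re-bootstrap check, the collection row above could be derived from Qdrant's collection-info response. The sketch assumes the JSON returned for a collection carries `points_count` and `indexed_vectors_count` under `result`; the function name and the status labels are hypothetical. Qdrant builds HNSW only once a segment passes its configured indexing threshold, so whether a zero count at this size signals a fault is worth confirming during the debug.

```python
def summarize_collection(info: dict) -> str:
    """Classify a collection from its (already parsed) collection-info JSON.

    Assumed shape: {"result": {"points_count": int, "indexed_vectors_count": int}}
    """
    result = info.get("result", {})
    points = result.get("points_count") or 0
    indexed = result.get("indexed_vectors_count") or 0
    if points == 0:
        return "empty"
    if indexed == 0:
        return "unindexed"
    if indexed < points:
        return "partially indexed"
    return "indexed"


if __name__ == "__main__":
    # Values mirror the final-state table: 3460 chunks, no indexed vectors.
    sample = {"result": {"points_count": 3460, "indexed_vectors_count": 0}}
    print(summarize_collection(sample))
```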